
Store test run results along with report 📝 #39

Closed · hackebrot opened this issue Jun 24, 2020 · 9 comments · Fixed by #62
Labels: bigquery (Tasks related to the bigquery component), capability (New capability for burnham or burnham-bigquery), discussion (Issues for discussing ideas for features)
Milestone: Future work

hackebrot (Collaborator) commented Jun 24, 2020

We need to store the test report and test run information from the burnham-bigquery Docker image somewhere.

Let's discuss a good approach for that.

hackebrot added the capability and bigquery labels on Jun 24, 2020
hackebrot added this to the Future work milestone on Jun 24, 2020
hackebrot added the discussion label on Jun 24, 2020
jklukas (Contributor) commented Jun 24, 2020

Do you have a good sense of what the test output will look like? My go-to solution would be to provision a destination table in BigQuery for test results and make the burnham-bigquery image responsible for writing there. Even if the output is basically a block of text, we can dump it into a STRING field.
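
For illustration, a minimal sketch of that simplest version, using the google-cloud-bigquery Python client; the table ID is hypothetical (the real table was provisioned later via mozilla/bigquery-etl):

```python
# A minimal sketch, assuming the google-cloud-bigquery client library.
# The table ID below is hypothetical.
from google.cloud import bigquery

client = bigquery.Client()
table = bigquery.Table(
    "my-project.burnham_derived.test_reports_v1",  # hypothetical table ID
    schema=[
        bigquery.SchemaField(
            "report", "STRING", description="Raw test output as a block of text"
        )
    ],
)
client.create_table(table)
```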

hackebrot (Collaborator, Author) commented:

  • When we run the tests on Airflow, we will generate a UUID for the test run and pass it to burnham, which stores it in `metrics.uuid.test_run`, and also pass it to burnham-bigquery so it can query the data for this run (see the query sketch below)
  • Every test scenario will generate a test item with a unique name, which is stored in pings under `metrics.string.test_name`; each test item has an outcome (passed, failed, error, etc.) and output
  • It might also be useful to keep track of the Glean SDK version used to produce the data for a test run

We can produce test reports in Markdown using https://github.com/hackebrot/pytest-md.
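
To make that flow concrete, here is a hedged sketch (not the actual burnham-bigquery code) of how querying pings for a single test run might look, assuming the google-cloud-bigquery client; the table name, column layout, and `submission_timestamp` field are assumptions:

```python
# A sketch of querying pings for one test run. The table name and column
# layout are assumptions, not confirmed by this thread.
from google.cloud import bigquery

def fetch_pings_for_run(client: bigquery.Client, test_run: str) -> list:
    """Return all pings tagged with the given test run UUID."""
    query = """
        SELECT metrics.string.test_name, submission_timestamp
        FROM `my-project.burnham_live.discovery_v1`  -- hypothetical table
        WHERE metrics.uuid.test_run = @test_run
        ORDER BY submission_timestamp
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("test_run", "STRING", test_run)
        ]
    )
    return list(client.query(query, job_config=job_config).result())
```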

hackebrot (Collaborator, Author) commented:

What do you think about this table schema?

The table would contain one row for each individual test.

| Field name    | Type    | Description                                             |
| ------------- | ------- | ------------------------------------------------------- |
| test_run      | STRING  | ID of the current test run                              |
| test_name     | STRING  | Name of the current test                                |
| test_outcome  | STRING  | Outcome of the current test (see below)                 |
| test_duration | FLOAT64 | Duration of the current test                            |
| test_report   | STRING  | Report for the current test, for error or failed tests |

Possible outcomes:

  • ERROR
  • FAILED
  • PASSED
  • SKIPPED
  • XFAILED
  • XPASSED

Does that make sense, or do you think it would be better to write one row for the entire test run and use a record to store results for individual tests?
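
For concreteness, a sketch of provisioning that flat schema with the google-cloud-bigquery client; the table ID is hypothetical, and the real table was later defined in mozilla/bigquery-etl:

```python
# A sketch of the proposed flat schema: one row per individual test.
# The table ID is hypothetical.
from google.cloud import bigquery

schema = [
    bigquery.SchemaField("test_run", "STRING", description="ID of the current test run"),
    bigquery.SchemaField("test_name", "STRING", description="Name of the current test"),
    bigquery.SchemaField(
        "test_outcome", "STRING",
        description="ERROR, FAILED, PASSED, SKIPPED, XFAILED, or XPASSED",
    ),
    bigquery.SchemaField("test_duration", "FLOAT64", description="Duration of the current test"),
    bigquery.SchemaField("test_report", "STRING", description="Report for error or failed tests"),
]

client = bigquery.Client()
client.create_table(
    bigquery.Table("my-project.burnham_derived.test_results_v1", schema=schema)
)
```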

jklukas (Contributor) commented Aug 6, 2020

I think the flatter structure you propose will be easier to work with than a nested one. The data size will be small, so we don't have to worry about making the representation efficient. This looks good to me.

hackebrot (Collaborator, Author) commented Aug 6, 2020

@jklukas can you provision a destination table for this in BigQuery, please?

jklukas added a commit to mozilla/bigquery-etl that referenced this issue Aug 6, 2020
jklukas (Contributor) commented Aug 6, 2020

Proposed table structure in mozilla/bigquery-etl#1220

jklukas added a commit to mozilla/bigquery-etl that referenced this issue Aug 6, 2020
hackebrot (Collaborator, Author) commented:

Thank you @jklukas! I now realize that we probably also need to link to the Airflow job or the logs for the burnham operators, so that we can diagnose client-side issues for failed test runs. 🤔

jklukas (Contributor) commented Aug 7, 2020

> we probably also need to link to the Airflow job or the logs for the burnham operators, so that we can diagnose client-side issues for failed test runs

I think it would make sense to include a `test_log_url` field. We should have all the relevant data via Airflow variables. Links look like:

https://workflow.telemetry.mozilla.org/log?task_id=verify_data&dag_id=burnham&execution_date=2020-08-06T00%3A00%3A00%2B00%3A00

We should be able to pull this using Airflow macros as `{{ task_instance.log_url }}` and pass it into the task. We just need to make sure it's passed as a parameter that's going to have templating applied.
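
For illustration, a hedged sketch of wiring that macro into a task (Airflow 2-style imports; the operator, command, and DAG details are placeholders, not the actual burnham DAG):

```python
# A sketch of passing the rendered log URL into a task. bash_command is a
# templated field, so {{ task_instance.log_url }} is rendered at runtime.
# The command itself is a hypothetical entry point.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("burnham", start_date=datetime(2020, 8, 1), schedule_interval="@daily") as dag:
    verify_data = BashOperator(
        task_id="verify_data",
        bash_command=(
            "run-burnham-bigquery "  # hypothetical entry point
            "--log-url '{{ task_instance.log_url }}'"
        ),
    )
```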

hackebrot (Collaborator, Author) commented:

Perfect! 👍

jklukas added a commit to mozilla/bigquery-etl that referenced this issue Aug 7, 2020
jklukas added a commit to mozilla/bigquery-etl that referenced this issue Aug 10, 2020
* Add burnham test report table

For mozilla/burnham#39

* Add test_log_url and test_duration_millis

* Apply suggestions from code review

Co-authored-by: Raphael Pierzina <raphael@hackebrot.de>

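Putting it together, a hedged sketch of how burnham-bigquery might write one result row, including the two fields added in review (`test_log_url` and `test_duration_millis`); the table ID, field types, and sample values are assumptions:

```python
# A sketch of storing one row per test result via the streaming insert API.
# Field names follow the discussion above plus the review additions; the
# table ID and the sample values are hypothetical.
from google.cloud import bigquery

def store_result(client: bigquery.Client, table_id: str, row: dict) -> None:
    errors = client.insert_rows_json(table_id, [row])
    if errors:
        raise RuntimeError(f"Failed to store test result: {errors}")

store_result(
    bigquery.Client(),
    "my-project.burnham_derived.test_results_v1",  # hypothetical table ID
    {
        "test_run": "af5a3a1c-57f2-49d7-8b6c-8e9f0e0c1a2b",  # UUID from Airflow
        "test_name": "test_labeled_counter_metrics",  # hypothetical test name
        "test_outcome": "PASSED",
        "test_duration_millis": 1234,
        "test_report": "",
        "test_log_url": "https://workflow.telemetry.mozilla.org/log?task_id=verify_data&dag_id=burnham&execution_date=2020-08-06T00%3A00%3A00%2B00%3A00",
    },
)
```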