193 append test refactor #217
Conversation
As a step towards "maturing" the astro DAG authoring project, we must rewrite our tests to ensure that every integration test runs against every database. This step will simultaneously reduce the number of tests we need to maintain, make testing much simpler as we add new databases, and make a future refactor much simpler, since we can ensure proper coverage. To do this, we will take advantage of two pytest features: fixtures and parametrization. For this ticket, we will update the `append` function.

**Acceptance criteria**

Have a single test file validating `append` across all databases.

1. Integration tests should be marked with `pytest.mark.integration`:
   * Each test should work across all databases
   * Validate both `Table` and `TempTable` as inputs
   * Use `test_utils.run_dag`
   * Use pytest `fixtures` and `parametrize` to have a single main test that validates `append` across multiple databases

The tests should validate these scenarios:

1. appending two tables against a single column
2. appending two tables against multiple columns
3. appending against all fields by not specifying fields
4. appending with casting
5. appending with some casted fields and some uncasted fields
6. appending tables from two different databases (should fail)
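The fixture-plus-parametrize layout the acceptance criteria ask for can be sketched roughly as below. This is a minimal sketch, not the PR's actual code: `sql_server` appears in this PR's diff, but the database list, mode list, and test body here are assumptions for illustration.

```python
import pytest

# Assumed axis 1: every connection id the suite supports (illustrative).
SUPPORTED_DBS = ["postgres_conn", "sqlite_conn", "snowflake_conn"]

# Assumed axis 2: the append scenarios from the acceptance criteria.
APPEND_MODES = [
    "single_column",
    "multiple_columns",
    "all_fields",
    "with_cast",
    "mixed_cast",
]


@pytest.fixture
def sql_server(request):
    # With indirect=True below, each parametrized value is routed
    # through this fixture via request.param.
    yield request.param


@pytest.mark.integration
@pytest.mark.parametrize("sql_server", SUPPORTED_DBS, indirect=True)
@pytest.mark.parametrize("mode", APPEND_MODES)
def test_append(sql_server, mode):
    # One test body now covers len(SUPPORTED_DBS) * len(APPEND_MODES)
    # collected cases; the real body would load files, call append,
    # and run the mode-specific validator.
    assert sql_server in SUPPORTED_DBS and mode in APPEND_MODES
```

Stacking two `parametrize` decorators yields the cross product of both axes, which is what makes a single main test sufficient.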
Codecov Report
@@ Coverage Diff @@
## main #217 +/- ##
==========================================
- Coverage 88.33% 87.96% -0.38%
==========================================
Files 61 60 -1
Lines 3583 3381 -202
Branches 317 317
==========================================
- Hits 3165 2974 -191
+ Misses 376 365 -11
Partials 42 42
Continue to review full report at Codecov.
@pytest.fixture
def append_params(request):
I think this should be expanded to test cases instead of fixtures?
I think this will make it much easier to add more test cases with less boilerplate. We can see if it becomes a problem, but this ultimately makes it easier to create test grids.
@dimberman I agree with @utkarsharma2 on this one. I don't think we should aim to have a single test per operator. We already have a few dimensions along which we are using parametrization:
- multiple databases, when relevant
- multiple files, when relevant
- multiple file locations, when relevant
- multiple file types, when relevant
I strongly recommend we do not use parametrization for groups of parameters sent to our tasks/operators.
For many operators, I don't think we need to test all the possible configurations of parameters with all the databases.
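A quick back-of-envelope calculation illustrates the concern: parametrization axes multiply, so crossing groups of operator parameters with every database blows up the collected test count. All numbers below are illustrative, not taken from the actual suite.

```python
# Illustrative axis sizes (assumed, not measured from this repo).
databases = 4        # e.g. postgres, sqlite, snowflake, bigquery
file_types = 4
file_locations = 3
param_groups = 6     # groups of kwargs sent to the task/operator

# Stacked parametrize decorators collect the full cross product.
per_operator_grid = databases * file_types * file_locations * param_groups
print(per_operator_grid)  # 288 collected cases for a single operator
```

Restricting parameter-group variations to one representative database keeps the grid proportional to the sum of the axes rather than their product.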
app_param, validate_append = append_params

with sample_dag:
    load_main = aql.load_file(
Can we move loading files into `tmp_table`? Is there any use case with just `tmp_table` without loading it into the DB?
For now, using `load_file` ensures that the actual function is being passed info from a previous task instead of a table object.
    ],
    indirect=True,
)
def test_append_on_tables_on_different_db(sample_dag, sql_server):
Should this test case be renamed, or should we add more DBs?
Why would it need to be renamed? It's testing a piece of code that's not DB dependent
@dimberman +116 −586, nicely done!
    }, validate_basic
if mode == "all_fields":
    return {}, validate_append_all
if mode == "with_caste":
If this test runs against all the databases, we probably do not need to run "basic" and "cast_only" against all the other databases.
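The mode dispatch in the diff above can be sketched as a plain function. This is a hedged reconstruction: `validate_basic`, `validate_append_all`, and the `"all_fields"`/`"with_caste"` mode strings appear in this PR's diff, but the kwargs, the remaining mode names, and the stub validators are assumptions for illustration.

```python
# Stub validators standing in for the real post-append checks.
def validate_basic(table):
    return True


def validate_append_all(table):
    return True


def validate_append_cast(table):
    return True


def append_params_for(mode):
    """Map a test mode to the kwargs passed to append plus the
    validation callback run afterwards (illustrative kwargs)."""
    if mode == "basic":
        return {"columns": ["id"]}, validate_basic
    if mode == "all_fields":
        return {}, validate_append_all
    if mode == "with_caste":
        return {"casted_columns": {"age": "int"}}, validate_append_cast
    raise ValueError(f"unknown append mode: {mode!r}")
```

Keeping the dispatch in one function means adding a scenario is one new branch plus one validator, rather than a new test.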
tmp_table_1 = TempTable(conn_id="postgres_conn")
tmp_table_2 = TempTable(conn_id="sqlite_conn")
@dimberman Are we tearing down these tables?
* merged append tests
* fix invalid test
* fix different db test