193 append test refactor #217

Merged
dimberman merged 4 commits into main from 193_append-test-refactor
Mar 16, 2022

Conversation

dimberman
Collaborator

@dimberman dimberman commented Mar 15, 2022

Description
As a step towards "maturing" the astro DAG authoring project, we must rewrite our tests to ensure that every integration test runs against every database.

This step will reduce the number of tests we need to maintain, make testing much simpler as we add new databases, and make a future refactor much simpler because we can ensure proper coverage.

To do this, we will take advantage of two pytest features: fixtures and parametrize.

For this ticket, we will update the tests for the append function.

Acceptance criteria
Have a single test file validating append across all databases.

Integration tests should be marked with pytest.mark.integration, and should meet the following requirements:
  • Each test should work across all databases
  • Tests should validate both Table and TempTable as inputs
  • Tests should use test_utils.run_dag
  • Use pytest fixtures and parametrize so that a single main test validates append across multiple databases
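
A minimal sketch of the fixture-plus-parametrize pattern these criteria describe. It is not the test file from this PR: the backend names, `open_connection`, and `append_rows` are hypothetical stand-ins, and every "backend" is an in-memory SQLite database so the example stays self-contained. The `sql_server` fixture name is borrowed from the diff, but its body here is an assumption.

```python
import sqlite3

import pytest

# Hypothetical backend names; a real suite would list its supported databases.
DATABASES = ["backend_a", "backend_b"]


def open_connection(database_name):
    """Stand-in factory: every name gets its own in-memory SQLite database."""
    return sqlite3.connect(":memory:")


@pytest.fixture
def sql_server(request):
    # With indirect=True, the parametrize value arrives here as request.param.
    conn = open_connection(request.param)
    yield request.param, conn
    conn.close()


def append_rows(conn, source, target, columns):
    """Illustrative append: copy the chosen columns from source into target."""
    cols = ", ".join(columns)
    conn.execute(f"INSERT INTO {target} ({cols}) SELECT {cols} FROM {source}")


@pytest.mark.integration
@pytest.mark.parametrize("sql_server", DATABASES, indirect=True)
def test_append_single_column(sql_server):
    _name, conn = sql_server
    conn.execute("CREATE TABLE main_table (sell INTEGER)")
    conn.execute("CREATE TABLE append_table (sell INTEGER)")
    conn.executemany("INSERT INTO main_table VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO append_table VALUES (?)", [(3,)])
    append_rows(conn, "append_table", "main_table", ["sell"])
    rows = conn.execute("SELECT sell FROM main_table ORDER BY sell").fetchall()
    assert rows == [(1,), (2,), (3,)]
```

pytest runs the single test function once per entry in `DATABASES`. A custom marker such as `integration` is normally registered under the `markers` option in the pytest configuration so that it does not trigger an unknown-marker warning.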

The tests should validate these scenarios:

  • appending two tables against a single column
  • appending two tables against multiple columns
  • appending against all fields by not specifying fields
  • appending with casting
  • appending with some cast fields and some uncast fields
  • appending tables from two different databases (should fail)
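
To make the same-database scenarios concrete, here is a rough sketch of what each one could mean in SQL terms. It is illustrative only and is not the project's implementation: `build_append_sql`, the `columns` and `casted_columns` parameter names, and the `sell`, `living`, and `age` columns are all assumptions, and SQLite stands in for every database.

```python
import sqlite3


def build_append_sql(main_table, append_table, columns=(), casted_columns=None):
    """Build an INSERT ... SELECT statement for the requested columns."""
    casted_columns = casted_columns or {}
    if not columns and not casted_columns:
        # No fields specified: append every column.
        return f"INSERT INTO {main_table} SELECT * FROM {append_table}"
    targets = list(columns) + list(casted_columns)
    sources = list(columns) + [
        f"CAST({name} AS {sql_type})" for name, sql_type in casted_columns.items()
    ]
    return (
        f"INSERT INTO {main_table} ({', '.join(targets)}) "
        f"SELECT {', '.join(sources)} FROM {append_table}"
    )


# One entry per same-database scenario from the list above (names are hypothetical).
SCENARIOS = {
    "single_column": {"columns": ["sell"]},
    "multiple_columns": {"columns": ["sell", "living"]},
    "all_fields": {},
    "cast_only": {"casted_columns": {"age": "INTEGER"}},
    "cast_and_uncast": {"columns": ["sell"], "casted_columns": {"age": "INTEGER"}},
}


def run_scenario(name):
    """Create two small tables, run one scenario, and return the main table's rows."""
    conn = sqlite3.connect(":memory:")
    for table in ("main_table", "append_table"):
        conn.execute(f"CREATE TABLE {table} (sell INTEGER, living INTEGER, age INTEGER)")
    conn.execute("INSERT INTO main_table VALUES (1, 10, 30)")
    conn.execute("INSERT INTO append_table VALUES (2, 20, 41)")
    conn.execute(build_append_sql("main_table", "append_table", **SCENARIOS[name]))
    rows = conn.execute(
        "SELECT sell, living, age FROM main_table ORDER BY rowid"
    ).fetchall()
    conn.close()
    return rows
```

Columns that a scenario does not name are left as NULL in the appended row, which gives each scenario a distinct, easy-to-assert result.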

@codecov

codecov bot commented Mar 15, 2022

Codecov Report

Merging #217 (3d1395f) into main (9a565e8) will decrease coverage by 0.37%.
The diff coverage is 89.09%.

@@            Coverage Diff             @@
##             main     #217      +/-   ##
==========================================
- Coverage   88.33%   87.96%   -0.38%     
==========================================
  Files          61       60       -1     
  Lines        3583     3381     -202     
  Branches      317      317              
==========================================
- Hits         3165     2974     -191     
+ Misses        376      365      -11     
  Partials       42       42              
Impacted Files                             Coverage Δ
tests/operators/test_agnostic_append.py    89.09% <89.09%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data



@pytest.fixture
def append_params(request):
Collaborator

I think this should be expanded into separate test cases instead of a fixture?

Collaborator Author

I think this will make it much easier to add more test cases with less boilerplate. We can see if it becomes a problem, but this ultimately makes it easier to create test grids.

Collaborator

@tatiana tatiana Mar 17, 2022

@dimberman I agree with @utkarsharma2 on this one. I don't think we should aim to have a single test per operator. We already parametrize along a few dimensions:

  • multiple databases, when relevant
  • multiple files, when relevant
  • multiple file locations, when relevant
  • multiple file types, when relevant

I strongly recommend we do not use parametrization for groups of parameters sent to our tasks/operators.
For many operators, I don't think we need to test all the possible configurations of parameters with all the databases.
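
The trade-off in this thread can be put in numbers with a small sketch. The mode names are the ones that appear in this PR; the database list is an assumption for illustration, and `reduced_grid` is only one possible reading of "not every configuration against every database", not something either reviewer specified.

```python
from itertools import product

# Mode names appear in this PR; the database list is an assumption.
MODES = ["basic", "all_fields", "with_caste", "cast_only"]
DATABASES = ["postgres", "sqlite", "snowflake", "bigquery"]


def full_grid(databases, modes):
    """Every mode against every database (what stacked parametrization produces)."""
    return list(product(databases, modes))


def reduced_grid(databases, modes, everywhere, reference_db):
    """Run one mode on every database and the remaining modes on one database."""
    cases = [(db, everywhere) for db in databases]
    cases += [(reference_db, mode) for mode in modes if mode != everywhere]
    return cases


full = full_grid(DATABASES, MODES)
reduced = reduced_grid(DATABASES, MODES, everywhere="all_fields", reference_db="sqlite")
print(len(full), len(reduced))
```

With four databases and four modes the full grid has 16 cases and the reduced selection has 7. The full grid grows multiplicatively with each new database or mode, while the reduced selection grows additively.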

app_param, validate_append = append_params

with sample_dag:
    load_main = aql.load_file(
Collaborator

Can we move the file loading into tmp_table? Is there any use case for tmp_table on its own, without loading it into the database?

Collaborator Author

For now, using load_file ensures that the function under test receives its input from a previous task rather than from a table object.

    ],
    indirect=True,
)
def test_append_on_tables_on_different_db(sample_dag, sql_server):
Collaborator

Should this test case be renamed, or should we add more DBs?

Collaborator Author

Why would it need to be renamed? It's testing a piece of code that's not DB-dependent.
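
One way a check like this can be independent of the database is to compare connection identifiers before any SQL runs. The sketch below assumes exactly that. `TempTableStub`, `check_same_database`, and the `ValueError` are hypothetical and may differ from what the project actually does; only the `conn_id` values come from this PR.

```python
from dataclasses import dataclass


@dataclass
class TempTableStub:
    """Stand-in for the project's TempTable; only conn_id matters here."""

    conn_id: str


def check_same_database(main_table, append_table):
    """Raise if the two tables use different connections (hypothetical check)."""
    if main_table.conn_id != append_table.conn_id:
        raise ValueError(
            "Cannot append across databases: "
            f"{main_table.conn_id!r} vs {append_table.conn_id!r}"
        )


# The comparison never touches a database, so one pair of conn_ids covers it.
try:
    check_same_database(TempTableStub("postgres_conn"), TempTableStub("sqlite_conn"))
except ValueError as error:
    print(error)
```

Under that assumption, adding more databases to the test would exercise the same comparison again without covering any new behavior.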

@utkarsharma2
Collaborator

utkarsharma2 commented Mar 16, 2022

@dimberman +116 −586, nicely done!

@dimberman dimberman merged commit 2b4bef2 into main Mar 16, 2022
@dimberman dimberman deleted the 193_append-test-refactor branch March 16, 2022 22:06
        }, validate_basic
    if mode == "all_fields":
        return {}, validate_append_all
    if mode == "with_caste":
Collaborator

If this test runs against all the databases, we probably do not need to run "basic" and "cast_only" against all the other databases.

Comment on lines +100 to +101
tmp_table_1 = TempTable(conn_id="postgres_conn")
tmp_table_2 = TempTable(conn_id="sqlite_conn")
Collaborator

@dimberman Are we tearing down these tables?
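
The thread does not record an answer to this question. For reference, a teardown of this kind could be sketched as below, assuming a plain SQLite connection. `temporary_tables` and `table_names` are hypothetical helpers rather than part of the project. The same create, yield, and drop shape also fits a pytest fixture that yields.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def temporary_tables(conn, *names):
    """Create empty tables, hand them to the caller, and always drop them afterwards."""
    try:
        for name in names:
            conn.execute(f"CREATE TABLE {name} (id INTEGER)")
        yield names
    finally:
        # Runs even if the body raises, so failed tests do not leak tables.
        for name in names:
            conn.execute(f"DROP TABLE IF EXISTS {name}")


def table_names(conn):
    """List the tables currently present in the SQLite database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(row[0] for row in rows)
```

Putting the drop in a `finally` block means the cleanup does not depend on the test passing.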

utkarsharma2 pushed a commit that referenced this pull request Mar 30, 2022
* merged append tests

* fix invalid test

* fix different db test