-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate remaining Always & CLI tests to pytest
#28459
Conversation
@@ -30,68 +30,79 @@ | |||
) | |||
|
|||
|
|||
class TestProjectStructure(unittest.TestCase): | |||
class TestProjectStructure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought TestProjectStructure
should be re-worked, I miss the main idea of this test so it would be nice if someone could share the idea about this tests
test_deprecated_packages
- I thought this test not properly worked since actual files replaced by PEP-562 getattr - #26153
test_providers_modules_should_have_tests
- This test broken for a while, generators wrapped into generators and when it actually transformed to set this line broke expected_test_files
calculation and it always set([])
modules_files = set(modules_files) |
Instead of raise an error in this test I print warning until we fix or rewrite this test logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this module come from our past renaming we've doe in AIP-21 - and some of them can be removed or simplified or moved elsewhere as they never caught up as "common provider" approach:
- test_deprecated_packages -> we should indeed remove, they are no problem any more
- Test...ProviderProjectStructure - we should remove them - they bring no value and having then for few providers in here makes littel sense
- ExampleCoverageTest - similarly
- AssetsCoverageTest -> I think this is largely used by Google team. Naybe @bhirsz might tell more about it (though in any case I thik if it is useful for Google it should go to google provider tests.
- test_providers_modules_should_have_tests -> is the only one I find useful from those for now but if it's broken, it might be we are far-off and fixing it would be problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_providers_modules_should_have_tests
after changes by this PR shows this warning
Detect missing tests in providers module.
Modules Files: 566, Current Test Files: 583, Missing Tests Files: 97.
tests/providers/amazon/aws/hooks/test_dms.py
tests/providers/amazon/aws/links/test_base_aws.py
tests/providers/amazon/aws/links/test_batch.py
tests/providers/amazon/aws/links/test_emr.py
tests/providers/amazon/aws/links/test_logs.py
tests/providers/amazon/aws/operators/test_aws_lambda.py
tests/providers/amazon/aws/operators/test_dms.py
tests/providers/amazon/aws/operators/test_emr.py
tests/providers/amazon/aws/operators/test_s3.py
tests/providers/amazon/aws/operators/test_sagemaker.py
tests/providers/amazon/aws/sensors/test_dms.py
tests/providers/amazon/aws/sensors/test_ecs.py
tests/providers/amazon/aws/sensors/test_emr.py
tests/providers/amazon/aws/sensors/test_s3.py
tests/providers/amazon/aws/sensors/test_sagemaker.py
tests/providers/amazon/aws/test_exceptions.py
tests/providers/amazon/aws/utils/test_rds.py
tests/providers/amazon/aws/waiters/test_base_waiter.py
tests/providers/apache/cassandra/hooks/test_cassandra.py
tests/providers/apache/druid/operators/test_druid_check.py
tests/providers/cncf/kubernetes/backcompat/test_backwards_compat_converters.py
tests/providers/cncf/kubernetes/test_python_kubernetes_script.py
tests/providers/cncf/kubernetes/utils/test_xcom_sidecar.py
tests/providers/databricks/hooks/test_databricks_base.py
tests/providers/databricks/utils/test_databricks.py
tests/providers/dbt/cloud/hooks/test_dbt.py
tests/providers/dbt/cloud/operators/test_dbt.py
tests/providers/dbt/cloud/sensors/test_dbt.py
tests/providers/elasticsearch/log/test_es_json_formatter.py
tests/providers/google/cloud/links/test_base.py
tests/providers/google/cloud/links/test_bigquery.py
tests/providers/google/cloud/links/test_bigquery_dts.py
tests/providers/google/cloud/links/test_bigtable.py
tests/providers/google/cloud/links/test_cloud_build.py
tests/providers/google/cloud/links/test_cloud_functions.py
tests/providers/google/cloud/links/test_cloud_memorystore.py
tests/providers/google/cloud/links/test_cloud_sql.py
tests/providers/google/cloud/links/test_cloud_storage_transfer.py
tests/providers/google/cloud/links/test_cloud_tasks.py
tests/providers/google/cloud/links/test_compute.py
tests/providers/google/cloud/links/test_data_loss_prevention.py
tests/providers/google/cloud/links/test_datacatalog.py
tests/providers/google/cloud/links/test_dataflow.py
tests/providers/google/cloud/links/test_dataform.py
tests/providers/google/cloud/links/test_dataplex.py
tests/providers/google/cloud/links/test_dataprep.py
tests/providers/google/cloud/links/test_dataproc.py
tests/providers/google/cloud/links/test_datastore.py
tests/providers/google/cloud/links/test_kubernetes_engine.py
tests/providers/google/cloud/links/test_life_sciences.py
tests/providers/google/cloud/links/test_mlengine.py
tests/providers/google/cloud/links/test_pubsub.py
tests/providers/google/cloud/links/test_spanner.py
tests/providers/google/cloud/links/test_stackdriver.py
tests/providers/google/cloud/links/test_vertex_ai.py
tests/providers/google/cloud/links/test_workflows.py
tests/providers/google/cloud/operators/vertex_ai/test_auto_ml.py
tests/providers/google/cloud/operators/vertex_ai/test_batch_prediction_job.py
tests/providers/google/cloud/operators/vertex_ai/test_custom_job.py
tests/providers/google/cloud/operators/vertex_ai/test_dataset.py
tests/providers/google/cloud/operators/vertex_ai/test_endpoint_service.py
tests/providers/google/cloud/operators/vertex_ai/test_hyperparameter_tuning_job.py
tests/providers/google/cloud/operators/vertex_ai/test_model_service.py
tests/providers/google/cloud/sensors/test_dataform.py
tests/providers/google/cloud/transfers/test_presto_to_gcs.py
tests/providers/google/cloud/transfers/test_trino_to_gcs.py
tests/providers/google/cloud/triggers/test_cloud_composer.py
tests/providers/google/cloud/triggers/test_dataproc.py
tests/providers/google/cloud/utils/test_bigquery.py
tests/providers/google/cloud/utils/test_bigquery_get_data.py
tests/providers/google/cloud/utils/test_dataform.py
tests/providers/google/common/links/test_storage.py
tests/providers/google/common/test_consts.py
tests/providers/google/test_go_module_utils.py
tests/providers/microsoft/azure/hooks/test_batch.py
tests/providers/microsoft/azure/hooks/test_container_instance.py
tests/providers/microsoft/azure/hooks/test_container_registry.py
tests/providers/microsoft/azure/hooks/test_container_volume.py
tests/providers/microsoft/azure/hooks/test_cosmos.py
tests/providers/microsoft/azure/hooks/test_data_factory.py
tests/providers/microsoft/azure/hooks/test_data_lake.py
tests/providers/microsoft/azure/hooks/test_fileshare.py
tests/providers/microsoft/azure/hooks/test_synapse.py
tests/providers/microsoft/azure/operators/test_adls.py
tests/providers/microsoft/azure/operators/test_batch.py
tests/providers/microsoft/azure/operators/test_container_instances.py
tests/providers/microsoft/azure/operators/test_cosmos.py
tests/providers/microsoft/azure/operators/test_data_factory.py
tests/providers/microsoft/azure/operators/test_synapse.py
tests/providers/microsoft/azure/secrets/test_key_vault.py
tests/providers/microsoft/azure/sensors/test_cosmos.py
tests/providers/microsoft/azure/sensors/test_data_factory.py
tests/providers/mongo/sensors/test_mongo.py
tests/providers/presto/transfers/test_gcs_to_presto.py
tests/providers/redis/operators/test_redis_publish.py
tests/providers/redis/sensors/test_redis_key.py
tests/providers/trino/transfers/test_gcs_to_trino.py
Some of them about deprecated modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- some of tests moved into tests.integrations
- Amazon links tested in one go in same module
- Some of amazon tests split across different test modules, for example:
test_s3_bucket.py
+test_s3_bucket_tagging.py
+test_s3_file_transform.py
+test_s3_list.py
+test_s3_list_prefixes.py
+test_s3_object.py
but module named ass3.py
(merged version)
So in theory there is no problem to fix some issue, add tests.integration as search path and add temporary exclusion for remaining and turn on this test again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Sounds like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- test_deprecated_packages -> we should indeed remove, they are no problem any more
Done
- Test...ProviderProjectStructure - we should remove them - they bring no value and having then for few providers in here makes littel sense
Done, keep only ProjectStructureTest
because ProjectStructureTest
depends on it
- ExampleCoverageTest - similarly
Removed
- AssetsCoverageTest -> I think this is largely used by Google team. Naybe @bhirsz might tell more about it (though in any case I thik if it is useful for Google it should go to google provider tests.
Keep for a while
- test_providers_modules_should_have_tests -> is the only one I find useful from those for now but if it's broken, it might be we are far-off and fixing it would be problematic.
Will try to fix (make it run again) as follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see if we could write this in a way that's more generatable. Pytest has some pretty powerful features that should prevent us from having to write out our parameters one by one.
"scheme://user:password@host%2Flocation:1234/schema?extra1=a%20value&extra2=%2Fpath%2F", | ||
{ | ||
"conn_type": "scheme", | ||
"host": "host/location", | ||
"schema": "schema", | ||
"login": "user", | ||
"password": "password", | ||
"port": 1234, | ||
"extra_dejson": {"extra1": "a value", "extra2": "/path/"}, | ||
}, | ||
id="with extras", | ||
), | ||
pytest.param( | ||
"scheme://user:password@host%2Flocation:1234/schema?__extra__=single+value", | ||
{ | ||
"conn_type": "scheme", | ||
"host": "host/location", | ||
"schema": "schema", | ||
"login": "user", | ||
"password": "password", | ||
"port": 1234, | ||
"extra": "single value", | ||
}, | ||
id="with extras single value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a lot of repeat code here. Is there any way of compressing this into a generator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is any problem to generate tests cases:
- Functions
- Generators
- Fixtures
- pytest-cases
Right now general idea migrate all tests to pytest
it is sill remaining about 100-120 modules (most of them sits into Google Provider).
Also all code generation could contain some bugs and complicated logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can always reorganize and refactor those tests later - l;et's not do both changing the type of tests and refactoring the way how they are defined in one PR
This is also a question of DRY vs DAMP - I think in many cases for tests DRY does not work that well. if you have to look at several places (inevitable with DRY) to understand what is the expected test input and output, it's worse than if you have the inputs and outputs together (and no common code that you also have to look-up). It's really nice to see all the cases with input/output clearly grouped together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if we can find a good way also having similar properties and less code - that's cool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dimberman - BTW. Do you have some propoal in mind ? Maybe some example that you can show how it could look like ? I would be interested to see how this could be done nicer ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About DRY, right now a lot of providers tests usually do the the same things:
- Mock Airflow Connections
- Mock their clients
- Create task and execute them
Most of this stuff could be achived by couple fixtures or context managers. Unfortunetly right now structure of internal framework split across tests package. My suggestion move all generic stuff into separate package module and inject them as pytest
plugin.
In this case we could also test our fixtures/test context managers and helpers functions if we want.
If we talk about this part it is a bit tricky generate some stuff because within this test cases we tests our Connection model and how it transform URI -> Connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About DRY, right now a lot of providers tests usually do the the same things:
- Mock Airflow Connections
- Mock their clients
- Create task and execute them
Most of this stuff could be achived by couple fixtures or context managers. Unfortunetly right now structure of internal framework split across tests package. My suggestion move all generic stuff into separate package module and inject them as
pytest
plugin.
Yep. It would be good if we can do some generalisation. But we have to take into account that we plan at some point of time to split providers out - possibly even to a separate repo per provider - and for sure we want to restructure the providers to become independent standalone "sub-folders" as an interim step - https://lists.apache.org/thread/3s5tn1wnvo0cw9vofwmbjl0rkyvhrtbx and there any common code might become challenging. Pytest plugins might actually help with it.
If we talk about this part it is a bit tricky generate some stuff because within this test cases we tests our Connection model and how it transform URI -> Connection.
Yeah I also find it a but challenging how to do it "dry-er" vs loosing some of hte DAMPness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I've also suggest in dev-list about separate generic tests parts into package this should also help split providers into separate sub-folders and even in separate repository because in this case all of them would have access to this package (we just nee install them) and all tests configurations which live inside tests/conftest.py
could be moved into it.
Of course it is in theory and I plan to create PoC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah. I missed that :( ... sorry, been a bit swamed recently :)
2c2f793
to
fc3fe6c
Compare
fc3fe6c
to
7e91f8f
Compare
@dimberman ? What do you think? |
7e91f8f
to
9730195
Compare
9730195
to
41cefe0
Compare
LGTM. @dimberman -> do you still have doubts? Or should we merge this one? |
👀 😢 👀 |
Migrate remaining Always & CLI tests to
pytest
in selected directoriestests/always/*
tests/cli/*
As usual add findings, info about significant changes and potential follow up in comments