-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Use mocking for Docker decorator tests #21323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ashb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a) What's wrong with using an actual docker container for this test?
b) We are now testing the docker_execute function, rather than the code that the docker decorator injects.
So If we don't want to us a container (which I'm not sure of) we at least need to test the real code.
This is just the decorator code. The assumption is that it uses the Docker operator which is tested elsewhere (I hope!) |
|
Test failure seems unrelated. |
93ac918 to
196ef24
Compare
|
@ashb ? |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
|
@ashb fwiw I still think mocking this is much better than calling out to the docker tool – also because this is what we do with all other operators. |
|
The key difference here to my mind is that we can run the actual docker code without calling out to any external service -- so I'm wary of using mocks here as we run the risk that we are then testing the mock, not the actual code. In this case, we are testing the |
|
But isn't that always the assumption when we mock external dependencies – that we're doing it correctly. There is always that risk that the assumptions are wrong. But that would the hopefully be discovered during alpha/beta testing. |
Not really. We rather follow the typical test pyramid: This is just one incarnation of it: https://martinfowler.com/articles/practical-test-pyramid.html
This is rather comprehensively described in https://github.com/apache/airflow/blob/main/TESTING.rst (all the test types we have are described there). I personally think the "test pyramid" approach is far more useful than "only unit tests". I find the latter a bit too dogmatic and I think our integration tests and system tests play a role there (System tests especially for Providers in the future). |
|
@potiuk that sounds reasonable so in this particular case, are you saying that we do want to actually test that we can run a container workload? Just bear in mind that what motivated this was that an external container registry was unable to provide the container image in time so the test timed out. |
Yeah. Those external images fail occassionally - but not often enough to make it a problem. Our tests will anyhow refuse to work if a number of images are not available. We actually have quite a good sytem in-place to prevent from depending on "external images" to break our tests - rather than rely on those external images we either build, or simply push such an image to our Github Registry - this way we are free from dependence of the external images in a location that is not ours and speed up the builds as Github Registry is generally "close" and "more stable" for Github actions. Also it has the nice benefit that we do not fall into DockerHub limits of pulling images (DockerHub images have very aggressive rate limiting for unauthenticated pulls (based on outgoing IP address). Github Actions Public runners are exempted from it agreement of DockerHub and GitHub, similarly as Apache DockerHub Images (ASF has an agreement with DockerHub about it) but our self-hosted runners pulling regular "DockerHub" images are only protected by the limit for authenticated users (we do authenticcate in our self-hosted runners). Those rate limits are better but still we might be rate limited at times. GHCR.IO images have no such limit - you can see example image here, but there are few more: https://github.com/orgs/apache/packages?tab=packages&q=airflow-trino. In this case I think we are using |
|
Couldn't we test against the Airflow docker image we build/run tests in already? |
I think is quite heavy for the tests and have some side effects (for example we have a custom entrypoint)- this image to test here is just a plain python image. I just pushed an alpine based one: https://github.com/orgs/apache/packages/container/package/airflow-python This was the command to prepare it (@malthe you can add it in a comment next to the test that this is the command we used to get it: |
Previously, the tests would actually run a Docker container.
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.