Run helm chart tests in parallel #15706
FYI @jedcunningham -> this is already possible: https://github.com/apache/airflow/blob/master/TESTING.rst#running-full-airflow-test-suite-in-parallel What happens there is that the whole "chart" is copied to a separate directory and the helm tests are run in parallel. This is done in CI, but you can also run the script yourself to run the helm tests in parallel.
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.
@@ -484,6 +484,7 @@ def get_sphinx_theme_version() -> str:
'click~=7.1',
'coverage',
'docutils',
'filelock',
Looks like the py-filelock project is no longer maintained, the last commit to the project was in 2019. Apart from that, this LGTM
Do you think that matters, @ephraimbuddy? This is a rather simple library (300 lines of code) and I do not expect many changes to it. Looks like it is not updated because it simply does its job well :)
You're right @potiuk :)
Correction to my previous comment: what I wrote was correct for the Kubernetes tests, not the Helm tests (the Kubernetes tests are where we copy all the charts and run them in parallel). The Helm tests run well in parallel because they run in separate Docker containers where the sources are baked into the image rather than mounted. I see why you'd want to run them locally though.
LGTM. Great for local iteration speed :)
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
Yeah, this was driven by me getting tired of the tests running slow locally.
(not sure how to deep link further): https://github.com/apache/airflow/pull/15627/checks?check_run_id=2518755958 Looks like even with self hosted runners these are running sequentially, based on the time?
I think we still need to set a flag in the CI script to activate parallelism. See: airflow/scripts/in_container/entrypoint_ci.sh, lines 245 to 249 in 3b4fdd0.
We should add the following change:
if [[ "${TEST_TYPE}" == "Helm" ]]; then
    # Enable parallelism
    EXTRA_PYTEST_ARGS+=(
        "-n" "auto"
    )
else
    EXTRA_PYTEST_ARGS+=(
        "--with-db-init"
    )
fi
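For readers who want to try the same selection logic outside the shell entrypoint, here is a minimal Python sketch of the branch suggested above. The function name `extra_pytest_args` is hypothetical and not part of the Airflow scripts; `-n auto` is the pytest-xdist option that starts one worker per available CPU, and `--with-db-init` is taken as-is from the snippet above.

```python
def extra_pytest_args(test_type):
    """Return the extra pytest arguments for a test type (hypothetical helper)."""
    if test_type == "Helm":
        # Helm chart tests do not share a database, so they can be spread
        # across pytest-xdist workers.
        return ["-n", "auto"]
    # Every other test type keeps the database initialisation flag.
    return ["--with-db-init"]
```

The returned list could be appended to whatever argument list is passed to pytest; how the real CI script wires this up is shown only in the shell snippet above.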
The helm chart tests are pretty slow when run sequentially. Modifying them so they can be run in parallel saves a lot of time, from 10 minutes to 3 minutes on my machine with 8 cores.
The only test that needed modification was `test_pod_template_file.py`, as it temporarily moves a file into the templates directory, which was causing other tests to fail as they weren't expecting any objects from that temporary file. This is resolved by giving the pod_template_file test an isolated chart directory it can modify.
`helm dep update` also doesn't work when it is called in parallel, so the fixture responsible for running it now ensures we only run it one at a time.
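The two ideas in this description (an isolated chart copy for the test that modifies templates, and a lock around the dependency update) can be sketched roughly as follows. This is not the code from the PR: the PR adds the `filelock` package, while this sketch swaps in the standard library's `fcntl.flock` (POSIX only) so that it runs without extra dependencies. The names `exclusive_lock`, `run_serialized` and `isolated_chart_copy` are hypothetical.

```python
import fcntl
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path until the block exits (POSIX only)."""
    with open(lock_path, "w") as handle:
        # Blocks while another process holds the lock on the same file.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def run_serialized(lock_path, action):
    """Call action() while holding the lock, so concurrent workers take turns."""
    with exclusive_lock(lock_path):
        return action()


def isolated_chart_copy(chart_dir, scratch_dir):
    """Copy chart_dir under scratch_dir and return the copy, which a test may modify."""
    target = Path(tempfile.mkdtemp(dir=scratch_dir)) / Path(chart_dir).name
    shutil.copytree(chart_dir, target)
    return target
```

In a session-scoped fixture, `action` could be something like `lambda: subprocess.check_call(["helm", "dep", "update", str(chart_dir)])`, with the lock file placed in a directory shared by all workers; both of those choices are assumptions rather than details taken from the PR.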
b053f72 to 422bd56
@mik-laj, do you want me to do that as part of this PR?
@jedcunningham, yes. We have a limited budget for CI and we try to optimize it. See: https://lists.apache.org/thread.html/r7d712327f985536b33c7791ddb7f443730f0c88a1e2e2c50538411fa%40%3Cdev.airflow.apache.org%3E
You are right. I'd forgotten that the helm chart tests run as a separate process. We used to have far fewer unit tests for the helm chart, and they ran much faster. After your recent additions they started to take much longer. Good call. Yep, the change proposed by @mik-laj should work (we already have pytest-xdist added).
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Some static checks failed, but it looks unrelated to this change. I am merging this change to test on a self-hosted runner.
Looks like it was faster on an actions runner too.
Yep. There are 2 CPUs there. Most of our tests already auto-detect the number of CPUs, and I sped them up this way. Unfortunately, a lot of our regular Airflow tests (unlike the helm tests) cannot be run in parallel with pytest-xdist because they use a shared database and rely on the DB state, so I had to parallelize them 'per test type'.
@mik-laj - I think you merged it too early - there were static-checks and build-docs failing. @jedcunningham - can you please fix them? Otherwise master is broken.
Ah. I see the comment now.
@potiuk it is unrelated to this change. We have this problem on master too. See: https://github.com/apache/airflow/runs/2521146207
So we already have a broken master. Do we know why?
@potiuk It may be related to this PR: #15681. This build doesn't check all files.
Reverting it then: #15707
I reverted it just in case, but I think there were two problems. The change was run on the AMI in Ohio that @ashb is testing (runner name: 'Airflow Runner 85'). @ashb -> FYI: https://github.com/apache/airflow/runs/2521146207 seems that https://github.com/apache/airflow/runs/2521146207#step:10:188
* Allow helm chart tests to run in parallel
The helm chart tests are pretty slow when run sequentially. Modifying them so they can be run in parallel saves a lot of time, from 10 minutes to 3 minutes on my machine with 8 cores.
The only test that needed modification was `test_pod_template_file.py`, as it temporarily moves a file into the templates directory which was causing other tests to fail as they weren't expecting any objects from that temporary file. This is resolved by giving the pod_template_file test an isolated chart directory it can modify.
`helm dep update` also doesn't work when it is called in parallel, so the fixture responsible for running it now ensures we only run it one at a time.
* Enable parallelism for helm unit tests in CI
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Partial Commit Extracted From: https://github.com/apache/airflow