Revamp integration tests, run in parallel #1105
Merged
briantist added labels on Nov 11, 2023: misc (Used as a release-drafter "category"); consul; tests (related to tests, not necessarily CI/CD); maintenance (General technical debt); developer experience (Developer setup and experience)
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1105      +/-   ##
==========================================
- Coverage   87.16%   87.12%   -0.04%
==========================================
  Files          64       64
  Lines        3162     3162
==========================================
- Hits         2756     2755       -1
- Misses        406      407       +1
briantist force-pushed the tests/paralellize branch from 6aa0995 to f51c6f0 on November 22, 2023 22:38
This was referenced Mar 16, 2024
I started looking at improving the performance of tests, especially integration tests. Since they run serially and have to spin up Vault and/or Consul, they tend to take a while to run.
On my local machine they take a little over 1 minute to run; in CI it's more like 1m40s. Not bad, but when you're iterating locally it can be a slog, and in CI we'd like to add more platforms to test against, and possibly more Python versions for integration, so speedier test times will add up.
(unit tests were already very fast, ~3s on my machine, ~10s or less in CI)
pytest-xdist seemed like a good choice for parallelizing the tests and taking advantage of multiple cores.
The challenges came from the Vault and Consul servers that needed to be started, and from assumptions made in the tests themselves: everything was written with hardcoded TCP port numbers and IP addresses, and assumed that a server already existing on those addresses was an error state.
So this PR makes a whole lot of changes to that process, so that we can start multiple Vault and Consul servers simultaneously without them stepping on each other.
For this, the configuration files for each of them are now patched and generated at test run time, with the help of a new class that works as a context manager to find free port numbers.
The context manager usage helps in the case of launching a Vault cluster where we need to launch two instances of Vault and one instance of Consul, all with ports that don't conflict with each other, and we need to know those ports before they start so we can write configs before those servers actually use the ports.
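The port-reservation idea described above could look roughly like the sketch below. `FreePortFinder` and `get_port` are hypothetical names, not the class from this PR, and the sketch assumes binding to port 0 on 127.0.0.1 so the OS picks a free port. Keeping every socket open until the context exits means the ports handed out inside one context can't collide with each other; there is still a small window after exit where another process could take a port before the server binds it.

```python
import socket
from contextlib import AbstractContextManager


class FreePortFinder(AbstractContextManager):
    """Hypothetical helper: reserves distinct free TCP ports while the context is open."""

    def __init__(self, host="127.0.0.1"):
        self.host = host
        self._sockets = []

    def get_port(self):
        # Binding to port 0 asks the OS for any currently unused port.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((self.host, 0))
        # Keep the socket open so later calls cannot be handed the same port.
        self._sockets.append(sock)
        return sock.getsockname()[1]

    def __exit__(self, exc_type, exc, tb):
        # Release every reserved port so the real servers can bind them.
        for sock in self._sockets:
            sock.close()
        self._sockets.clear()
        return False


# Reserve three ports up front (two Vault nodes and one Consul node),
# so configs can be written before any server starts.
with FreePortFinder() as finder:
    ports = {
        "vault_1": finder.get_port(),
        "vault_2": finder.get_port(),
        "consul": finder.get_port(),
    }
    addresses = {name: f"127.0.0.1:{port}" for name, port in ports.items()}
```

Once the `with` block exits, the `addresses` values are what would get patched into each server's config before launch.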
This also means that tests that assumed Vault is always available on 127.0.0.1:8200 needed to be updated to dynamically use the Vault server launched for that test. This required changes in the server manager class and the test case class to ensure that all the right things got to the right places.
The result: integration tests take ~16s on my 6-core machine with 12 test workers, and ~30s in CI.
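As a rough illustration of tests reading the server address instead of hardcoding it, here is a minimal sketch. `ServerAddress`, `VaultIntegrationCase`, and `vault_url` are hypothetical names made up for this example; the real server manager and test case classes in this PR are more involved.

```python
import unittest
from dataclasses import dataclass


@dataclass
class ServerAddress:
    """Hypothetical record of where one launched Vault instance listens."""

    host: str
    port: int

    @property
    def url(self):
        return f"http://{self.host}:{self.port}"


class VaultIntegrationCase(unittest.TestCase):
    """Hypothetical base class: tests ask for the URL rather than assuming 127.0.0.1:8200."""

    # Assigned by whatever launches the server for this test class.
    server = None

    def vault_url(self, path=""):
        if self.server is None:
            raise RuntimeError("no Vault server was launched for this test case")
        return f"{self.server.url}/{path.lstrip('/')}"
```

Each xdist worker would then assign its own `ServerAddress` before its tests run, so no two workers talk to the same instance.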
It's worth noting that GitHub's Ubuntu runners are supposed to have 2 vCPUs. Since xdist is configured by default to auto-detect the number of CPUs or hyperthreads, and set the worker count to that, I noticed that some test runs were choosing 2, and some were choosing 4.
I have not been able to find official word, but it seems like GitHub is slowly increasing the number of vCPUs on its runners (perhaps also the performance per vCPU, but that is harder for me to tell). Some independent research turned up evidence that Ubuntu and Windows runners are going from 2 -> 4 vCPUs, and macOS runners are going from 3 -> 4 vCPUs.
This GitHub action which shows the number of cores available runs itself in its own CI once a week, and you can look at the history of runs to see some evidence: https://github.com/SimenB/github-actions-cpu-cores/actions
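To check what a given runner reports, something like the following can be dropped into a CI step. This is only an approximation of what xdist sees: its own auto-detection has additional cases (for example, it can behave differently when psutil is installed), and `detected_worker_count` is a hypothetical helper name.

```python
import os


def detected_worker_count():
    """Approximate the CPU count a runner exposes; not xdist's exact logic."""
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


print(f"CPUs visible to Python: {detected_worker_count()}")
```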
Anyway, all of the above means this is a great time to parallelize the tests, because we get even more gain from that change!
Unit tests are also set to use xdist now, but the times on both my local machine and in CI are roughly the same. They were already so fast that any parallelized time savings are eaten up by the fixed overhead of setting up the workers and distributing the test load. If we use unit tests more, we'll see the benefit in the future.
Other changes:
Other notes
apt
.