ci: more parallel tests #2938
Conversation
Great work so far! 😀
In the commit history, one diff for `disabled_clamav_spamassassin.bats` had this comment:
# TODO: find a better way to know when we have waited long enough
# for ClamAV to have come up, if it were enabled
Was this relevant to the `clamav.bats` test at all, and its current approach for this?
I thought the commit for … Don't worry about it if you're ok with me rewriting the history afterwards :) (similar will be needed after merging the TLS PR affecting …). Due to all the tests being renamed/moved, and GitHub not being able to properly communicate that kind of diff, I'd like to avoid merge commits from the master branch, and then later, once the PR is approved for merging: …
If your machine is capable, you can increase the number of tests that are run simultaneously by prepending the `make clean all` command with `BATS_PARALLEL_JOBS=X` (i.e. `BATS_PARALLEL_JOBS=X make clean all`). This will speed up the test procedure. You can also run all tests serially by setting `BATS_PARALLEL_JOBS=1` this way.
The default value of `BATS_PARALLEL_JOBS` is 2. Increasing it to `3` requires 6 threads and 6GB of main memory; increasing it to `4` requires 8 threads and at least 8GB of main memory.
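One way to pick a value for your own machine is to derive it from the available cores. This is only an illustrative sketch: `BATS_PARALLEL_JOBS` comes from this PR, but the halving heuristic and the headroom reasoning are assumptions, not project guidance.

```shell
# Hypothetical sketch: derive a parallel job count from the available cores,
# leaving headroom so the containers under test are not starved.
JOBS="${BATS_PARALLEL_JOBS:-$(( $(nproc) / 2 ))}"
[ "${JOBS}" -lt 1 ] && JOBS=1
echo "would run: BATS_PARALLEL_JOBS=${JOBS} make clean all"
```

On a 4-thread machine this would suggest 2 jobs, matching the documented default.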
Where did you get the resource requirements here? On the SSL tests, `docker stats` was only showing about 50MB memory usage for each container 🤔
I removed the opt-out flag to run all test cases in the cipher lists test (6 total) and bumped the jobs (although my CPU is only a quad-core with 4 threads). No major memory usage increase, but it did take much longer (and failed test cases due to scripts in that test connecting to the wrong containers), about an extra minute or two due to the load, I think.
Other tests are faster however with concurrent test cases, or when running multiple test files concurrently 👍
Ran some tests on my system; I have an older i7-4770K with 8 threads / 4 cores. Running with `--jobs 4` was the maximum when it came to CPU utilisation for me. I'd rather set the bar high and avoid unnecessary test failures due to resource problems than set it low and risk them. I used `htop` to look at CPU/RAM, and ran parallel set 1 (higher resource usage due to the activated anti-spam / anti-virus programs).
I used `--jobs 4` (with and without `--no-parallelize-within-files`) on my i5-6500 system (32GB RAM, 16GB in use), and for `parallel/set1` I only observed a peak usage increase of about 2GB via the system monitor, while reaching 100% CPU load.
The only reason I think you'd have experienced higher RAM usage would be if the image under test hadn't been built recently with an updated ClamAV layer. I have run tests where the last clean image build was a week or so old and observed that the ClamAV tests were very memory hungry.
Might that have been the case for you? That, or there's something different with the latest changes I've pushed to this branch 🤷♂️
So I'm still not sure how high of a bar you want to set there. Before parallel tests were added, I've seen tests with ClamAV under some circumstances gobble up 6GB+ of memory 🤷♂️
Do you have a source for each increment of `BATS_PARALLEL_JOBS` / `--jobs` requiring double the number of threads? It doesn't seem to be mentioned in the BATS docs.
My understanding is that it's just a concurrent queue of jobs that get scheduled CPU time on available threads; since N jobs don't necessarily use constant CPU, more CPU time can be available for additional jobs to run.
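That "concurrent queue of jobs" model can be illustrated with plain `xargs` (this is not how BATS is implemented internally, just an analogy): a fixed number of worker slots drain a task list, and a new task starts as soon as a slot frees up.

```shell
# Illustration of a concurrent job queue: 6 tasks, at most 3 running at once.
# xargs starts the next queued task whenever a worker slot becomes free,
# much like jobs being handed CPU time as threads become available.
seq 6 | xargs -P 3 -I{} sh -c 'echo "job {} done"; sleep 0.1'
```

All 6 tasks complete, but never more than 3 run concurrently; the completion order is not guaranteed.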
Seems as if the resource requirements differ a lot. The ClamAV update could be an issue, but I don't think so since I built it directly before running the tests, IIRC.
I have no source when it comes to the `--jobs` flag needing more threads. We should therefore probably refrain from using exact numbers here and be more abstract. Something like: "Depending on the hardware in your system, you can increase the number of parallel jobs. Make sure you do not overload the system, which can lead to failing tests."
Apologies for the delay.
UPDATE: CI now takes 17 minutes, down from 30. Future PRs can improve on this even further.
@casperklein this is a rather big PR, but most changes are the same (container setup routines, etc.). I'd recommend going through each commit one by one.
Documentation preview for this PR is ready! 🎉 Built with commit: 52987e3
Thanks @georglauterbach for tackling this and @casperklein for taking the time to review 🎉 I'll follow up with a few more similar PRs 💪 (updating the docs is being deferred so we can just get this merged)

Commit message history for reference:

tests(chore): Move some serial tests into parallel sets
===
Additionally with the `tls.bash` helper for the `letsencrypt` tests.

tests: Adjusted files not directly related to tests
===
The `tls.bash` helper was adapted to the new helper scripts location. The `setup.bash` helper saw a bugfix (expanding the array properly) and updates the container default config to configure IPv4 explicitly.
The IPv4 default was added after recent Docker pushes, when I saw weird IPv6-related errors in the logs; now we're sure IPv4 is the default during tests.
Added functionality to check if a process is running:
- This change adds a helper function to check whether a program is running inside a container or not.
- This created the need for a function like `_run_in_container`, but one that allows providing an explicit container name.
- Future PRs can use this helper function now to check whether a process is running or not. This was done for the tests of Fail2Ban, but can be used for other tests in the future as well.
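A minimal sketch of what such helpers could look like. The function names, the explicit-container variant, and the `pgrep` approach here are assumptions based on this description, not the repo's actual implementation:

```shell
# Assumed shapes, for illustration only (not the repo's real helpers):
_exec_in_container_explicit() {
  local CONTAINER_NAME=${1:?Container name must be provided}
  shift
  docker exec "${CONTAINER_NAME}" "$@"
}

check_if_process_is_running() {
  local CONTAINER_NAME=${1:?Container name must be provided}
  local PROCESS=${2:?Process name must be provided}
  # pgrep exits non-zero when no matching process exists in the container
  _exec_in_container_explicit "${CONTAINER_NAME}" pgrep --full "${PROCESS}" >/dev/null
}

echo 'helpers defined'
```

A test could then assert e.g. `check_if_process_is_running "${CONTAINER_NAME}" fail2ban-server` instead of grepping `ps` output.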
---
chore: Restructured BATS flags in `Makefile`
The `Makefile` has seen a bit of a restructuring when it comes to flags:
1. The `MAKEFLAGS` variable is used by `make`, and allows for adding additional flags that also apply within recursive calls (via `$(MAKE)`), thus a DRY approach.
2. The flags for calling BATS were adjusted. `--no-parallelize-within-files` has been added as well to ensure tests _inside_ a single file are run sequentially.
`dms-test` prefix matching changed to expect a `_` suffix as a delimiter.
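The two points above could be sketched roughly like this in a `Makefile`. This is an illustrative fragment, not the repo's actual `Makefile`: `--jobs` and `--no-parallelize-within-files` are the real BATS flags named in this PR, but the variable layout, the test path, and the `--no-print-directory` flag are assumptions.

```makefile
# Illustrative sketch only. Flags exported via MAKEFLAGS also apply
# inside recursive $(MAKE) calls, avoiding repetition per target:
export MAKEFLAGS := --no-print-directory

BATS_PARALLEL_JOBS ?= 2
BATS_FLAGS := --jobs $(BATS_PARALLEL_JOBS) --no-parallelize-within-files

tests:
	bats $(BATS_FLAGS) test/tests/parallel/set1
```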
---
docs: Add a note regarding output from running tests in parallel

tests: Adjust parallel tests
===
- The usual serial to parallel test conversion to utilize the `setup.bash` common setup structure, and adding a `TEST_PREFIX` var for each test case to leverage.
- Standardize on parallel test naming conventions for variables / values.
- More consistent use of `bash -c` instead of `/bin/bash -c` or `/bin/sh -c`.
- Using the `_run_in_container` helper instead of `run docker exec ${CONTAINER_NAME}`.
- Updates tests to use the `check_if_process_is_running` helper.
---
chore: Revise inline docs for the `ssl_letsencrypt` test
- Moves the override to be in closer proximity to the `initial_setup` call, and better communicates the intent to override.
- Removes top comment block that is no longer providing value or correct information to maintainers.
- Revised `acme.json` test case inline doc comments.

refactor: Parallel Tests
===
- `disabled_clamav_spamassassin`:
- Just shuffling the test order around, and removing the restart test at the end which doesn't make sense.
- `postscreen`:
- Now uses common helper for getting container IP
- Does not appear to need the `NET_ADMIN` capability?
- Reduced startup time for the 2nd container + additional context about its relevance.
- Test cases are largely the same, but refactored the `nc` alternative that properly waits its turn. This only needs to run once. Added additional commentary and made it into a generic method if needed in other tests.
- `fail2ban`:
- Use the common container IP helper method.
- Postscreen isn't affecting this test; it's not required to do the much slower exchange with the mail server when sending a login failure.
- IP being passed into ENV is no longer necessary.
- `sleep 5` in the related test cases doesn't seem necessary, can better rely on polling with timeout.
- `sleep 10` for `setup.sh` also doesn't appear to be necessary.
- `postgrey`:
- Reduced POSTGREY_DELAY to 3, which shaves a fair amount of wasted time while still verifying the delay works.
- One of the checks in `main.cf` doesn't seem to need to know about the earlier spamhaus portion of the line to work, removed.
- Better test case descriptions.
- Improved log matching via standard method that better documents the expected triplet under test.
- Removed a redundant whitelist file and test that didn't seem to have any relevance. Added a TODO with additional notes about a concern with these tests.
- Reduced test time as 8 second timeouts from `-w 8` don't appear to be required, better to poll with grep instead.
- Replaced `wc -l` commands with a new method to assert expected line count, better enabling assertions on the actual output.
- `undef_spam_subject`:
- Split into two separate test cases, and initialize each container in its own test case instead of `setup_file()`, allowing use of the default `teardown()` method (and a slight benefit when running in parallel).
- `permit_docker`:
- Not a parallel test, but I realized that the repeat helper methods don't necessarily play well with `run` as the command (can cause a false positive of what was successful).

docs: Revise contributing advice for tests
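The "rely on polling with timeout" idea from the `fail2ban` and `postgrey` notes above can be sketched as a generic helper. The function name and the 1-second retry interval are assumptions for illustration, not the repo's actual helper:

```shell
# Hypothetical polling helper: retry a command until it succeeds or the
# timeout (in seconds) is reached, instead of a fixed `sleep 5`/`sleep 10`.
_wait_until_success() {
  local TIMEOUT=${1:?Timeout in seconds must be provided}
  shift
  local WAITED=0
  until "$@"; do
    [ "${WAITED}" -ge "${TIMEOUT}" ] && return 1
    sleep 1
    WAITED=$((WAITED + 1))
  done
}

# Example: the condition holds immediately, so no time is wasted waiting.
_wait_until_success 5 true && echo 'condition met'
```

A test could then wait for a log line (e.g. `_wait_until_success 20 grep -q 'Ban ' /var/log/fail2ban.log`) and only pay the full delay in the failure case.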
Description
More parallel tests -> reduce the time it takes to run CI. Profit.