ci: more parallel tests #2938
Conversation
Great work so far! 😀
In the commit history, one diff for `disabled_clamav_spamassassin.bats` had this comment:
# TODO: find a better way to know when we have waited long enough
# for ClamAV to have come up, if it were enabled
Was this relevant to the `clamav.bats` test at all, and its current approach for this?
I thought the commit for … Don't worry about it if you're ok with me rewriting the history afterwards :) (similar will be needed after merging the TLS PR affecting …). Due to all the tests being renamed/moved, and GitHub not being able to properly communicate that kind of diff, I'd like to avoid merge commits from the master branch, and then later, once the PR is approved for merging: …
If your machine is capable, you can increase the number of tests that are run simultaneously by prepending the `make clean all` command with `BATS_PARALLEL_JOBS=X` (i.e. `BATS_PARALLEL_JOBS=X make clean all`). This will speed up the test procedure. You can also run all tests serially by setting `BATS_PARALLEL_JOBS=1` this way.
The default value of `BATS_PARALLEL_JOBS` is 2. Increasing it to `3` requires 6 threads and 6GB of main memory; increasing it to `4` requires 8 threads and at least 8GB of main memory.
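One way to pick a value for your own machine is to derive it from the available cores. This is only an illustrative sketch: `BATS_PARALLEL_JOBS` comes from this PR, but the halving heuristic and the headroom reasoning are assumptions, not project guidance.

```shell
# Hypothetical sketch: derive a parallel job count from the available cores,
# leaving headroom so the containers under test are not starved.
JOBS="${BATS_PARALLEL_JOBS:-$(( $(nproc) / 2 ))}"
[ "${JOBS}" -lt 1 ] && JOBS=1
echo "would run: BATS_PARALLEL_JOBS=${JOBS} make clean all"
```

On a 4-thread machine this would suggest 2 jobs, matching the documented default.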
Where did you get the resource requirements here? On the SSL tests, `docker stats` was only showing about 50MB memory usage for each container 🤔
I removed the opt-out flag to run all test cases in the cipher lists test (6 total) and bumped the jobs (although my CPU is only a quad-core with 4 threads). No major memory usage increase, but it did take much longer (and failed test cases due to scripts in that test connecting to the wrong containers), about an extra minute or two due to the load, I think.
Other tests are faster however with concurrent test cases, or when running multiple test files concurrently 👍
Ran some tests on my system; I have an older i7-4770K with 8 threads / 4 cores. Running with `--jobs 4` was the maximum when it came to CPU utilisation for me. I'd rather set the bar high and avoid unnecessary test failures due to resource problems than set it low and risk them. I used `htop` to look at CPU/RAM, and ran parallel set 1 (higher resource usage due to the activated anti-spam / anti-virus programs).
I used `--jobs 4` (with and without `--no-parallelize-within-files`) on my i5-6500 system (32GB RAM, 16GB in use), and for `parallel/set1` I only observed a peak usage increase of about 2GB via the system monitor, while reaching 100% CPU load.
The only reason I think you'd have experienced higher RAM usage would be if the image under test hadn't been built recently with an updated ClamAV layer. I have run tests where the last clean image build was a week or so old and observed that the ClamAV tests were very memory hungry.
Might that have been the case for you? That, or there's something different with the latest changes I've pushed to this branch 🤷♂️
So I'm still not sure how high of a bar you want to set there. Before parallel tests were added, I've seen tests with ClamAV under some circumstances gobble up 6GB+ of memory 🤷♂️
Do you have a source for each increment of `BATS_PARALLEL_JOBS` / `--jobs` requiring double the number of threads? It doesn't seem to be mentioned in the BATS docs.
My understanding is that it's just a concurrent queue of jobs that get scheduled CPU time on available threads; since N jobs don't necessarily use constant CPU, more CPU time can be available for additional jobs to run.
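That "concurrent queue of jobs" model can be illustrated with plain `xargs` (this is not how BATS is implemented internally, just an analogy): a fixed number of worker slots drain a task list, and a new task starts as soon as a slot frees up.

```shell
# Illustration of a concurrent job queue: 6 tasks, at most 3 running at once.
# xargs starts the next queued task whenever a worker slot becomes free,
# much like jobs being handed CPU time as threads become available.
seq 6 | xargs -P 3 -I{} sh -c 'echo "job {} done"; sleep 0.1'
```

All 6 tasks complete, but never more than 3 run concurrently; the completion order is not guaranteed.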
Seems as if the resource requirements differ a lot. The ClamAV update could be an issue, but I don't think so since I built it directly before running the tests, IIRC.
I have no source when it comes to the `--jobs` flag needing more threads. We should therefore probably refrain from using exact numbers here and be more abstract. Something like: "Depending on the hardware in your system, you can increase the number of parallel jobs. Make sure you do not overload the system, which can lead to failing tests."
Apologies for the delay.
UPDATE: CI now takes 17 minutes, down from 30. Future PRs can improve on this even further.
@casperklein this is a rather big PR, but most changes are the same (container setup routines, etc.). I'd recommend going through each commit one by one.
Documentation preview for this PR is ready! 🎉 Built with commit: 52987e3
Thanks @georglauterbach for tackling this and @casperklein for taking the time to review 🎉 I'll follow up with a few more similar PRs 💪 (updating the docs is being deferred so we can just get this merged)

Commit message history for reference:

tests(chore): Move some serial tests into parallel sets
===
Additionally with the `tls.bash` helper for the `letsencrypt` tests.

tests: Adjusted files not directly related to tests
===
The `tls.bash` helper was adapted to the new helper scripts location. The `setup.bash` helper saw a bugfix (expanding the array properly) and updates the container default config to configure IPv4 explicitly.
The IPv4 default was added after recent Docker pushes, when I saw weird IPv6-related errors in the logs; now we're sure IPv4 is the default during tests.
Added functionality to check if a process is running:
- This change adds a helper function to check whether a program is running inside a container or not.
- This created the need for a function like `_run_in_container`, but one that allows providing an explicit container name.
- Future PRs can use this helper function now to check whether a process is running or not. This was done for the tests of Fail2Ban, but can be used for other tests in the future as well.
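A minimal sketch of what such helpers could look like. The function names, the explicit-container variant, and the `pgrep` approach here are assumptions based on this description, not the repo's actual implementation:

```shell
# Assumed shapes, for illustration only (not the repo's real helpers):
_exec_in_container_explicit() {
  local CONTAINER_NAME=${1:?Container name must be provided}
  shift
  docker exec "${CONTAINER_NAME}" "$@"
}

check_if_process_is_running() {
  local CONTAINER_NAME=${1:?Container name must be provided}
  local PROCESS=${2:?Process name must be provided}
  # pgrep exits non-zero when no matching process exists in the container
  _exec_in_container_explicit "${CONTAINER_NAME}" pgrep --full "${PROCESS}" >/dev/null
}

echo 'helpers defined'
```

A test could then assert e.g. `check_if_process_is_running "${CONTAINER_NAME}" fail2ban-server` instead of grepping `ps` output.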
---
chore: Restructured BATS flags in `Makefile`
The `Makefile` has seen a bit of a restructuring when it comes to flags:
1. The `MAKEFLAGS` variable is used by `make`, and allows for adding additional flags that also apply within recursive calls (via `$(MAKE)`), thus a DRY approach.
2. The flags for calling BATS were adjusted. `--no-parallelize-within-files` has been added as well to ensure tests _inside_ a single file are run sequentially.
`dms-test` prefix matching changed to expect a `_` suffix as a delimiter.
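The two points above could be sketched roughly like this in a `Makefile`. This is an illustrative fragment, not the repo's actual `Makefile`: `--jobs` and `--no-parallelize-within-files` are the real BATS flags named in this PR, but the variable layout, the test path, and the `--no-print-directory` flag are assumptions.

```makefile
# Illustrative sketch only. Flags exported via MAKEFLAGS also apply
# inside recursive $(MAKE) calls, avoiding repetition per target:
export MAKEFLAGS := --no-print-directory

BATS_PARALLEL_JOBS ?= 2
BATS_FLAGS := --jobs $(BATS_PARALLEL_JOBS) --no-parallelize-within-files

tests:
	bats $(BATS_FLAGS) test/tests/parallel/set1
```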
---
docs: Add a note regarding output from running tests in parallel

tests: Adjust parallel tests
===
- The usual serial to parallel test conversion to utilize the `setup.bash` common setup structure, and adding a `TEST_PREFIX` var for each test case to leverage.
- Standardize on parallel test naming conventions for variables / values.
- More consistent use of `bash -c` instead of `/bin/bash -c` or `/bin/sh -c`.
- Using the `_run_in_container` helper instead of `run docker exec ${CONTAINER_NAME}`.
- Updates tests to use the `check_if_process_is_running` helper.
---
chore: Revise inline docs for the `ssl_letsencrypt` test
- Moves the override to be in closer proximity to the `initial_setup` call, and better communicates the intent to override.
- Removes top comment block that is no longer providing value or correct information to maintainers.
- Revised `acme.json` test case inline doc comments.

refactor: Parallel Tests
===
- `disabled_clamav_spamassassin`:
- Just shuffling the test order around, and removing the restart test at the end which doesn't make sense.
- `postscreen`:
- Now uses common helper for getting container IP
- Does not appear to need the `NET_ADMIN` capability?
- Reduced startup time for the 2nd container + additional context about its relevance.
- Test cases are largely the same, but refactored the `nc` alternative that properly waits its turn. This only needs to run once. Added additional commentary and made it into a generic method if needed in other tests.
- `fail2ban`:
- Use the common container IP helper method.
- Postscreen isn't affecting this test; it's not required to do the much slower exchange with the mail server when sending a login failure.
- IP being passed into ENV is no longer necessary.
- `sleep 5` in the related test cases doesn't seem necessary, can better rely on polling with timeout.
- `sleep 10` for `setup.sh` also doesn't appear to be necessary.
- `postgrey`:
- Reduced POSTGREY_DELAY to 3, which shaves a fair amount of wasted time while still verifying the delay works.
- One of the checks in `main.cf` doesn't seem to need to know about the earlier spamhaus portion of the line to work, removed.
- Better test case descriptions.
- Improved log matching via standard method that better documents the expected triplet under test.
- Removed a redundant whitelist file and test that didn't seem to have any relevance. Added a TODO with additional notes about a concern with these tests.
- Reduced test time as 8 second timeouts from `-w 8` don't appear to be required, better to poll with grep instead.
- Replaced `wc -l` commands with a new method to assert expected line count, better enabling assertions on the actual output.
- `undef_spam_subject`:
- Split into two separate test cases, and initialize each container in its own test case instead of `setup_file()`, allowing use of the default `teardown()` method (and a slight benefit when running in parallel).
- `permit_docker`:
- Not a parallel test, but I realized that the repeat helper methods don't necessarily play well with `run` as the command (can cause a false positive of what was successful).

docs: Revise contributing advice for tests
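The "rely on polling with timeout" idea from the `fail2ban` and `postgrey` notes above can be sketched as a generic helper. The function name and the 1-second retry interval are assumptions for illustration, not the repo's actual helper:

```shell
# Hypothetical polling helper: retry a command until it succeeds or the
# timeout (in seconds) is reached, instead of a fixed `sleep 5`/`sleep 10`.
_wait_until_success() {
  local TIMEOUT=${1:?Timeout in seconds must be provided}
  shift
  local WAITED=0
  until "$@"; do
    [ "${WAITED}" -ge "${TIMEOUT}" ] && return 1
    sleep 1
    WAITED=$((WAITED + 1))
  done
}

# Example: the condition holds immediately, so no time is wasted waiting.
_wait_until_success 5 true && echo 'condition met'
```

A test could then wait for a log line (e.g. `_wait_until_success 20 grep -q 'Ban ' /var/log/fail2ban.log`) and only pay the full delay in the failure case.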
Description
More parallel tests -> reduce the time it takes to run CI. Profit.