
ci: more parallel tests #2938

Merged: 5 commits from ci/more-parallel-tests into master on Jan 6, 2023

Conversation

georglauterbach (Member)

Description

More parallel tests -> reduce time it takes to run CI. Profit.

Type of change

  • Improvement (non-breaking change that does improve existing functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@georglauterbach added the area/ci and kind/improvement (improve an existing feature, configuration file or the documentation) labels on Dec 19, 2022
@georglauterbach georglauterbach added this to the v12.0.0 milestone Dec 19, 2022
@georglauterbach force-pushed the ci/more-parallel-tests branch 7 times, most recently from c625078 to 246f1fd on December 22, 2022 17:41

polarathene (Member) left a review comment:

Great work so far! 😀


In the commit history, one diff for disabled_clamav_spamassassin.bats had this comment:

# TODO: find a better way to know when we have waited long enough
# for ClamAV to have come up, if it were enabled

Was this relevant to the clamav.bats test at all, and to its current approach?

Review threads (resolved):
  • test/tests/parallel/set1/spam_virus/clamav.bats
  • test/tests/parallel/set1/spam_virus/spam_junk_folder.bats (two threads)
  • test/tests/parallel/set1/spam_virus/fail2ban.bats (two threads)
polarathene (Member) commented Dec 23, 2022

I thought the commit for SASL_PASSWD might have resolved the conflict issue 😅 Perhaps I need to rewrite history to tackle that.

Don't worry about it if you're ok with me rewriting the history afterwards :) (similar will be needed after merging the TLS PR affecting letsencrypt test)


Due to all the tests being renamed/moved, and GitHub not being able to properly communicate that kind of diff, I'd like to avoid merge commits from the master branch. Later, once the PR is approved for merging:

  • Rebase onto the latest commit on master.
  • Rewrite / compress the commit history (which seems to be tricky when merge commits from master are involved? 🤷‍♂️)
  • Then make a merge commit into master, instead of the usual squash merge.


Review thread (resolved): Makefile

If your machine is capable, you can increase the number of tests that run simultaneously by prepending the `make clean all` command with `BATS_PARALLEL_JOBS=X` (e.g. `BATS_PARALLEL_JOBS=X make clean all`). This will speed up the test procedure. You can also run all tests serially by setting `BATS_PARALLEL_JOBS=1` this way.

The default value of `BATS_PARALLEL_JOBS` is 2. Increasing it to `3` requires 6 threads and 6GB of main memory; increasing it to `4` requires 8 threads and at least 8GB of main memory.
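The override mechanism described above can be sketched in a few lines; this is illustrative only (the way the repository's `Makefile` actually consumes `BATS_PARALLEL_JOBS` may differ):

```shell
#!/bin/bash
# Illustrative sketch: consume a BATS_PARALLEL_JOBS override from the
# environment, falling back to the documented default of 2.
JOBS="${BATS_PARALLEL_JOBS:-2}"

# The resulting BATS invocation (flags taken from the discussion below):
echo "bats --timing --jobs ${JOBS} --no-parallelize-within-files test/tests/parallel/set1/"
```

Running `BATS_PARALLEL_JOBS=1 make clean all` would therefore fall back to fully serial execution.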
polarathene (Member):

Where did you get the resource requirements here? On the SSL tests I was running, `docker stats` showed only about 50MB of memory usage for each container 🤔

I removed the opt-out flag to run all test cases in the cipher-lists test (6 in total) and bumped the jobs (although my CPU is only a quad core with 4 threads). No major memory-usage increase, but it did take much longer (and failed test cases due to scripts in that test connecting to the wrong containers), about an extra minute or two due to the load, I think.

Other tests are faster however with concurrent test cases, or when running multiple test files concurrently 👍

georglauterbach (Member, Author):

Ran some tests on my system; I have an older i7-4770K with 8 threads / 4 cores. Running with --jobs 4 was the maximum when it came to CPU utilisation for me. I'd rather set the bar high and avoid unnecessary test failures due to resource problems than set it low and risk them. I used htop to watch CPU/RAM and ran parallel set 1 (higher resource usage due to the activated anti-spam / anti-virus programs).

polarathene (Member):

I used --jobs 4 (with and without --no-parallelize-within-files) on my i5-6500 system (32GB RAM, 16GB in use) and for parallel/set1 I only observed near 2GB peak usage increase via system monitor and reaching 100% CPU load.

The only reason I think you'd have experienced higher RAM usage would be if the image under test hadn't been built with a recently updated ClamAV layer. I have run tests where the last clean image build was a week or so old and observed that the ClamAV tests were very memory hungry.

Might have been the case for you? That or there's something different with the latest changes I've pushed for this branch 🤷‍♂️


So I'm still not sure how high a bar you want to set. Before parallel tests were added, I've seen tests with ClamAV under some circumstances gobble up 6GB+ of memory 🤷‍♂️

Do you have a source for each increment of the BATS_PARALLEL_JOBS / --jobs requiring double the number of threads? It doesn't seem to be mentioned on the BATS docs.

My understanding is that it's just a concurrent queue of jobs that get scheduled CPU time on available threads; since N jobs don't necessarily use constant CPU, more CPU time can be available for additional jobs to run?

georglauterbach (Member, Author):

Seems as if the resource requirements differ a lot. The ClamAV update could be an issue, but I don't think so since I built it directly before running the tests, IIRC.

I have no source when it comes to the --jobs flag needing more threads. We should therefore probably refrain from using exact numbers here and be more abstract. Something like: "Depending on the hardware in your system, you can increase the number of parallel jobs. Make sure you do not overload the system, which can lead to failing tests."

polarathene (Member):
Apologies for the delay.

  • I'm happy to rebase / squash the docs commit into tests: Adjusted files not directly related to tests. I've just kept it separate for review feedback, as I revised it a bit (preview page). A portion of this content is still being discussed in an earlier review comment.
  • I've not only rewritten history; other changes have been added and compressed into the commits. The first is minimal; a 2nd commit applies larger refactoring and could optionally be extracted out of this PR as a follow-up. Each commit provides details on the relevant changes.
  • There was a notable gotcha with the _run_in_container method @georglauterbach added in this PR; I've included a comment next to its declaration to keep in mind. I also needed to update check_if_process_is_running to not use that method, as it was leading to tests passing (the restart-when-killed test cases) when they were actually failing 😅

georglauterbach (Member, Author):

UPDATE: CI now takes

  1. 28s to build the AMD64 image
  2. 16min to run the serial tests

From 30 minutes down to 17. Future PRs can improve on this even further.


@casperklein this is a rather big PR, but most changes are the same (container setup routines, etc.). I'd recommend going through the commits one by one.

github-actions bot (Contributor) commented Jan 6, 2023

Documentation preview for this PR is ready! 🎉

Built with commit: 52987e3

@polarathene polarathene merged commit 623d53b into master Jan 6, 2023
@polarathene polarathene deleted the ci/more-parallel-tests branch January 6, 2023 03:53
polarathene (Member) commented Jan 6, 2023

Thanks @georglauterbach for tackling this and @casperklein for taking the time to review 🎉

I'll follow up with a few more similar PRs 💪 (updating the docs is being deferred so we can just get this merged)


Commit message history for reference
tests(chore): Move some serial tests into parallel sets
===
Additionally with the `tls.bash` helper for the `letsencrypt` tests.
tests: Adjusted files not directly related to tests
===
The `tls.bash` helper was adapted to the new helper-scripts location. The `setup.bash` helper saw a bugfix (expanding the array properly) and updates the container default config to explicitly configure for IPv4.

The IPv4 default was added after recent Docker pushes, when I saw weird IPv6-related errors in the logs; now we're sure IPv4 is the default during tests.

Added functionality to check if a process is running:
- This change adds a helper function to check whether a program is running inside a container or not.
- This added the need for a function like `_run_in_container` but allowing for providing an explicit container name.
- Future PRs can use this helper function now to check whether a process is running or not. This was done for the tests of Fail2Ban, but can be used for other tests in the future as well.
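A minimal sketch of what the two helpers described above could look like; the names and signatures here are assumptions based on the commit message (in particular, `_run_in_container_explicit` is an illustrative name, not necessarily the repository's):

```shell
#!/bin/bash
# Hypothetical sketch only — the real helpers live in the repository's
# test helper scripts and may differ.

# Like `_run_in_container`, but with an explicitly provided container name:
_run_in_container_explicit() {
  local CONTAINER_NAME=${1:?Container name must be provided}
  shift
  docker exec "${CONTAINER_NAME}" "${@}"
}

# Check whether a program is running inside the container under test;
# relies on the CONTAINER_NAME variable set by the test's setup:
check_if_process_is_running() {
  local PROCESS=${1:?Process name must be provided}
  _run_in_container_explicit "${CONTAINER_NAME}" pgrep --full "${PROCESS}"
}
```

A Fail2Ban test could then assert `check_if_process_is_running 'fail2ban-server'` instead of grepping `ps` output.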

---

chore: Restructured BATS flags in `Makefile`

The `Makefile` has seen a bit of a restructuring when it comes to flags:

1. The `MAKEFLAGS` variable is used by `make` itself, and allows for adding flags that also apply in recursive calls (via `$(MAKE)`), thus a DRY approach.
2. The flags for calling BATS were adjusted. `--no-parallelize-within-files` has been added as well to ensure tests _inside_ a single file run sequentially.

`dms-test` prefix matching changed to expect a `_` suffix as a delimiter.
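As a hedged sketch of the flag layout this commit describes (the recipe and all variable names other than `MAKEFLAGS` are illustrative, not the repository's actual `Makefile`):

```makefile
# MAKEFLAGS is read by make itself, so flags added here also apply to
# recursive $(MAKE) invocations without repeating them (DRY):
MAKEFLAGS += --no-print-directory

# Flags handed to BATS; --no-parallelize-within-files keeps test cases
# inside a single file sequential while files still run concurrently:
BATS_PARALLEL_JOBS ?= 2
BATS_FLAGS ?= --timing --jobs $(BATS_PARALLEL_JOBS) --no-parallelize-within-files

tests/parallel/set1:
	bats $(BATS_FLAGS) test/tests/parallel/set1/
```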

---

docs: Add a note regarding output from running tests in parallel
tests: Adjust parallel tests
===
- The usual serial to parallel test conversion to utilize the `setup.bash` common setup structure, and adding a `TEST_PREFIX` var for each test case to leverage.
- Standardize on parallel test naming conventions for variables / values.
- More consistent use of `bash -c` instead of `/bin/bash -c` or `/bin/sh -c`.
- Using the `_run_in_container` helper instead of `run docker exec ${CONTAINER_NAME}`.
- Updates tests to use the `check_if_process_is_running` helper.

---

chore: Revise inline docs for the `ssl_letsencrypt` test

- Moves the override to be in closer proximity to the `initial_setup` call, and better communicates the intent to override.
- Removes top comment block that is no longer providing value or correct information to maintainers.
- Revised `acme.json` test case inline doc comments.
refactor: Parallel Tests
===
- `disabled_clamav_spamassassin`:
  - Just shuffling the test order around, and removing the restart test at the end which doesn't make sense.

- `postscreen`:
  - Now uses common helper for getting container IP
  - Does not appear to need the `NET_ADMIN` capability?
  - Reduced startup time for the 2nd container + additional context about its relevance.
  - Test cases are largely the same, but refactored the `nc` alternative that properly waits its turn. This only needs to run once. Added additional commentary and made it into a generic method if needed in other tests.

- `fail2ban`:
  - Use the common container IP helper method.
  - Postscreen isn't affecting this test; it's not required to do the much slower exchange with the mail server when sending a login failure.
  - IP being passed into ENV is no longer necessary.
  - `sleep 5` in the related test cases doesn't seem necessary, can better rely on polling with timeout.
  - `sleep 10` for `setup.sh` also doesn't appear to be necessary.
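The "polling with timeout" replacement for those fixed sleeps could be sketched like this; `_poll_until` is an assumed name for illustration, not a repository helper:

```shell
#!/bin/bash
# Illustrative sketch: retry a condition command once per second until it
# succeeds, giving up (non-zero return) after TIMEOUT_SECONDS.
_poll_until() {
  local TIMEOUT_SECONDS=${1:?Timeout must be provided}
  shift
  local SECONDS_WAITED=0
  until "${@}"; do
    [ "${SECONDS_WAITED}" -lt "${TIMEOUT_SECONDS}" ] || return 1
    sleep 1
    SECONDS_WAITED=$(( SECONDS_WAITED + 1 ))
  done
}
```

Instead of a blind `sleep 5`, a test case could then run something like `_poll_until 20 grep -qs 'Ban' "${LOG_FILE}"`, returning as soon as the condition holds.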

- `postgrey`:
  - Reduced POSTGREY_DELAY to 3, which shaves a fair amount of wasted time while still verifying the delay works.
  - One of the checks in `main.cf` doesn't seem to need to know about the earlier spamhaus portion of the line to work, removed.
  - Better test case descriptions.
  - Improved log matching via standard method that better documents the expected triplet under test.
  - Removed a redundant whitelist file and test that didn't seem to have any relevance. Added a TODO with additional notes about a concern with these tests.
  - Reduced test time as 8 second timeouts from `-w 8` don't appear to be required, better to poll with grep instead.
  - Replaced `wc -l` commands with a new method to assert expected line count, better enabling assertions on the actual output.
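A line-count assertion method like the one described might look as follows; the helper name and the reliance on BATS's `$output` variable are assumptions for illustration:

```shell
#!/bin/bash
# Illustrative sketch: assert that the captured `$output` (as set by BATS's
# `run`) contains an expected number of non-empty lines, rather than piping
# through `wc -l` and losing the actual output for later assertions.
_should_output_number_of_lines() {
  local EXPECTED_COUNT=${1:?Expected line count must be provided}
  local ACTUAL_COUNT
  ACTUAL_COUNT=$(printf '%s\n' "${output}" | grep -c .)
  [ "${ACTUAL_COUNT}" -eq "${EXPECTED_COUNT}" ]
}
```

Because `$output` is left intact, a failing test can still make follow-up assertions (or print the output) instead of only reporting a bad count.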

- `undef_spam_subject`:
  - Split to two separate test cases, and initialize each container in their case instead of `setup_file()`, allowing for using the default `teardown()` method (and slight benefit if running in parallel).

- `permit_docker`:
  - Not a parallel test, but I realized that the repeat helper methods don't necessarily play well with `run` as the command (this can cause a false positive about what was successful).
docs: Revise contributing advice for tests

Labels: area/ci, kind/improvement, priority/medium
3 participants