Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests take a long time to run #10818

Closed
dfandrich opened this issue Mar 22, 2023 · 6 comments
Closed

Tests take a long time to run #10818

dfandrich opened this issue Mar 22, 2023 · 6 comments
Labels

Comments

@dfandrich
Copy link
Contributor

Running the close to 1600 tests in the test suite takes a long time and slows developer productivity. This issue will consolidate efforts to speed up running tests, primarily by allowing them to run in parallel. The roadmap will be along this design proposal.

@kdudka
Copy link
Contributor

kdudka commented Mar 23, 2023

I like the idea! Did you consider also running multiple HTTP(S) servers in parallel? It is probably not something to begin with. On the other hand, it would be good to design the solution in a way that such an extension could be introduced in the future. Some users build (lib)curl with a configuration where HTTPS is the only enabled protocol and they would not gain anything if the parallelization used a single server instance for each protocol.

@bagder
Copy link
Member

bagder commented Mar 23, 2023

I think the individual "runners" should be able to run all test servers, which should make it possible to run all runtests test cases in parallel. HTTP and HTTPS are two of the most tested protocols so it is important that they can be highly parallelized.

@dfandrich
Copy link
Contributor Author

dfandrich commented Mar 23, 2023 via email

dfandrich added a commit that referenced this issue Mar 24, 2023
These functions scan through the entire test file every time to find the
right section, so they can be slow for large test files.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 24, 2023
Unlike some other languages that just copy a pointer, perl copies the
entire array contents which takes time for a large array.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 24, 2023
Namely:
- Verify that this test case should be run
- Start the servers needed to run this test case
- Check that test environment is fine to run this test case
- Prepare the test environment to run this test case
- Run the test command
- Clean up after test command
- Verify test succeeded

Ref: #10818
dfandrich added a commit that referenced this issue Mar 24, 2023
This takes it from a 1200 line behemoth into something more manageable.
The content and order of the functions is taken almost directly from
singletest() so the diff sans whitespace is quite short.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 24, 2023
dfandrich added a commit that referenced this issue Mar 24, 2023
Use the feature map stored in the hash table instead. Most of the
variables were only used once, to set the value in the hash table.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 25, 2023
These functions scan through the entire test file every time to find the
right section, so they can be slow for large test files.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 25, 2023
Unlike some other languages that just copy a pointer, perl copies the
entire array contents which takes time for a large array.

Ref: #10818
dfandrich added a commit that referenced this issue Mar 25, 2023
Namely:
- Verify that this test case should be run
- Start the servers needed to run this test case
- Check that test environment is fine to run this test case
- Prepare the test environment to run this test case
- Run the test command
- Clean up after test command
- Verify test succeeded

Ref: #10818
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
This must be done so variables pick up the runner's unique $LOGDIR.

Ref: curl#10818
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
This allows all messages relating to a single test case to be displayed
together at the end of the test.

Ref: curl#10818
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
Such as what happens with the --repeat option.  Some functions are
changed to pass the runner ID instead of relying on the non-unique test
number.

Ref: curl#10818
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
Parallel testing is enabled by using a nonzero value for the -j option
to runtests.pl. Performant values seem to be about 7*num CPU cores, or
1.3*num CPU cores if Valgrind is in use.

Flaky tests due to improper log locking (bug curl#11231) are exacerbated
while parallel testing, so it is not enabled by default yet.

Fixes curl#10818
Closes curl#11246
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
dfandrich added a commit that referenced this issue Sep 28, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Sep 29, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Sep 30, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Oct 5, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Oct 8, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Oct 11, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Oct 11, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Oct 11, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Nov 13, 2023
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

TODO: completely remove the 2 here:  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
vszakats pushed a commit that referenced this issue May 13, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
vszakats pushed a commit that referenced this issue May 28, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
vszakats pushed a commit that referenced this issue Jun 4, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
dfandrich added a commit that referenced this issue Jul 6, 2024
This was missed during some refactoring more than a year ago and is
causing a warning "Use of uninitialized value $path in pattern match".

Follow-up to 70d2fca

Ref: #10818
Closes #14113
dfandrich added a commit that referenced this issue Jul 7, 2024
This was missed during some refactoring more than a year ago and is
causing a warning "Use of uninitialized value $path in pattern match".

Follow-up to 70d2fca

Ref: #10818
Closes #14113
dfandrich added a commit that referenced this issue Jul 7, 2024
This is the same thing as the previous commit fd194f46 but on the next
line.

Follow-up to 70d2fca

Ref: #10818
vszakats pushed a commit that referenced this issue Jul 8, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
vszakats pushed a commit that referenced this issue Jul 19, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Ref: #10818
Closes #11510
vszakats pushed a commit to vszakats/curl that referenced this issue Jul 20, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Already merged via separate commits:
- 2a7c8b2 curl#14171
- 7234106

Ref: curl#10818
Closes curl#11510
vszakats pushed a commit that referenced this issue Aug 3, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Already merged via separate commits:
- 2a7c8b2 #14171
- 7234106
- efce544 #14244
- c6cf411

Ref: #10818
Closes #11510
vszakats pushed a commit that referenced this issue Aug 3, 2024
The test-ci target now uses 2 processes by default, but the amount of
parallelism is tuned for each CI service and build environment based on
results of a number of test runs.  Some CI services use super-
oversubscribed build machines that can barely run the curl tests
already with no parallelism without frequently failing with
timing-induced failures. These continue to be run without parallelism.
Other services provide two fast, unloaded cores and these run with 14
processes, which is a good default for this kind of environment.

Here's a summary of the number of test processes by CI service:

  Appveyor - 2 (Windows MSVC), 1 (others)
  Azure - 2
  Circle CI - 14
  Cirrus - 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
  GitHub Actions - 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down.

The net result is that the first test results should arrive only
3 minutes after a commit submission.

Changes merged via separate commits:
- 2a7c8b2 #14171
- 7234106
- efce544 #14244
- c6cf411

Ref: #10818
Closes #11510
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants