Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests: random order can lead to test failures. #2825

Closed
Segaja opened this issue Feb 13, 2022 · 9 comments · Fixed by #2830
Closed

Tests: random order can lead to test failures. #2825

Segaja opened this issue Feb 13, 2022 · 9 comments · Fixed by #2830

Comments

@Segaja
Copy link

Segaja commented Feb 13, 2022

The default order of minitest is random which means the order of the tests is not deterministic.

This can lead to issues in some runs:

install -c tmp/x86_64-linux/puma_http11/3.0.3/puma_http11.so lib/puma/puma_http11.so
cp tmp/x86_64-linux/puma_http11/3.0.3/puma_http11.so tmp/x86_64-linux/stage/lib/puma/puma_http11.so

ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-linux]
RUBYOPT: -r/usr/lib/ruby/gems/3.0.0/gems/bundler-2.2.26/lib/bundler/setup
                         Puma::MiniSSL                   OpenSSL
OPENSSL_LIBRARY_VERSION: OpenSSL 1.1.1m  14 Dec 2021     OpenSSL 1.1.1m  14 Dec 2021
        OPENSSL_VERSION: OpenSSL 1.1.1m  14 Dec 2021     OpenSSL 1.1.1l  24 Aug 2021

Run options: --seed 20326

# Running:

.........................................................................................................................................................................................................................................E..............................................................................................................S............................................S.....S......................................................

Finished in 56.937445s, 7.9034 runs/s, 23.3590 assertions/s.
Errors & Failures:

  1) Error:
TestBinder#test_ssl_binder_sets_backlog:
IOError: closed stream
    /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:377:in `setsockopt'
    /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:377:in `add_ssl_listener'
    /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:261:in `block in parse'
    /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:156:in `each'
    /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:156:in `parse'
    /build/ruby-puma/src/puma-5.6.2/test/test_binder.rb:290:in `block in test_ssl_binder_sets_backlog'

450 runs, 1330 assertions, 0 failures, 1 errors, 3 skips

You have skipped tests. Run with --verbose for details.

──────────────────────────────────────────────────────────────────────── Debugging Info ──
TestIntegrationSingle#test_hot_restart_does_not_drop_connections_threads
   restart_count 1, reset 0, success after restart 0
TestIntegrationSingle#test_hot_restart_does_not_drop_connections
   restart_count 1, reset 0, success after restart 146
──────────────────────────────────────────────────────────────────────────────────────────

rake aborted!

If I rerun the tests a few times without changing any code at some point it will succeed.

This makes packaging this for Arch Linux very difficult.

One solution could be to make minitest run the tests in a fixed order and make sure they work then.

@dentarg
Copy link
Member

dentarg commented Feb 13, 2022

The problem here is not the random test order. You can see a run where the tests passed with seed 20326 at https://github.com/dentarg/puma/runs/5174043363?check_suite_focus=true#step:10:43

If you want to run the tests in the same order, use the same seed for every run. You can set it via the TESTOPTS env: TESTOPTS="--seed=12345"

Also ensure you have given your build environment sufficient resources to run the tests, otherwise they may fail in mysterious ways.

If you think you are running the tests with sufficient resources and you have consistent failures, you can report those specific failures in issues here.

@dentarg dentarg closed this as completed Feb 13, 2022
@Segaja
Copy link
Author

Segaja commented Feb 13, 2022

Hey @dentarg ,

thanks for the information. For now I set a fixed seed and increased the soft limit for file descriptors (ulimit -Sn) to 5000.

Unfortunately I still get occasional test failures with "Bad file descriptor". Does that mean I should increase the ulimit even more?

Best regards

@dentarg
Copy link
Member

dentarg commented Feb 13, 2022

Please try raising it, honestly I don't know exactly what resources are needed to run the tests. You could look at the specs for the hardware GitHub Actions use to get a feeling for it.

@Segaja
Copy link
Author

Segaja commented Feb 13, 2022

Looking at the github actions ( https://github.com/puma/puma/blob/master/.github/workflows/non_mri.yml#L71 ) I don't see any resource changes.

@MSP-Greg
Copy link
Member

MSP-Greg commented Feb 13, 2022

@Segaja Locally, I'm seeing the same error and more in the same test file. All are 'IOError: closed stream'

@MSP-Greg
Copy link
Member

@Segaja

Can you check the issue with PR #2830? Many tests run parallel, and there are infrequent intermittent failures/errors. They seem to be with older (EOL) Ruby versions. This issue is not one of those issues, as I could repo it locally with Ruby head...

@Segaja
Copy link
Author

Segaja commented Feb 14, 2022

@MSP-Greg I can try to run the tests against the changes in that PR later this week.

I'm doing this as I'm packaging puma for Arch Linux ( https://archlinux.org/packages/community/x86_64/ruby-puma/ ).

Arch Linux currently works with ruby 3.0.3p157 so it is not the newest but still maintained.

@Segaja
Copy link
Author

Segaja commented Feb 17, 2022

@MSP-Greg I finally had time again to look into this. I can't apply the changes from #2830 directly, since I'm building from the v5.6.2 tag and the mentioned PR goes against the master which already has changes in the patched file.

However when I recreate the patch myself locally I can run the build/test process and so far it feels more stable then without the change.

In 5 runs with the change I only had once this error:

Run options: --seed=21082

# Running:

........................................................................................................................................................................................................S............S.................................................S.............../build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:341:in `initialize': Address already in use - bind(2) for "127.0.0.1" port 35401 (Errno::EADDRINUSE)
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:341:in `new'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:341:in `add_tcp_listener'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:173:in `block in parse'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:156:in `each'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/binder.rb:156:in `parse'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/cluster.rb:380:in `run'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/launcher.rb:182:in `run'
        from /build/ruby-puma/src/puma-5.6.2/lib/puma/cli.rb:81:in `run'
        from /build/ruby-puma/src/puma-5.6.2/bin/puma:10:in `<main>'
.................................................................................................................................................................S..............................................E

Finished in 86.267786s, 5.6568 runs/s, 17.3529 assertions/s.
Errors & Failures:

  1) Error:
TestIntegrationCluster#test_warning_message_not_outputted_when_single_worker_silenced:
TimeoutEveryTestCase::TestTookTooLong: execution expired
    /build/ruby-puma/src/puma-5.6.2/test/helpers/integration.rb:126:in `gets'
    /build/ruby-puma/src/puma-5.6.2/test/helpers/integration.rb:126:in `wait_for_server_to_boot'
    /build/ruby-puma/src/puma-5.6.2/test/helpers/integration.rb:80:in `cli_server'
    /build/ruby-puma/src/puma-5.6.2/test/test_integration_cluster.rb:344:in `test_warning_message_not_outputted_when_single_worker_silenced'
    /build/ruby-puma/src/puma-5.6.2/test/helper.rb:85:in `block (4 levels) in run'
    /usr/lib/ruby/3.0.0/timeout.rb:107:in `timeout'
    /build/ruby-puma/src/puma-5.6.2/test/helper.rb:83:in `block (3 levels) in run'

488 runs, 1497 assertions, 0 failures, 1 errors, 4 skips

You have skipped tests. Run with --verbose for details.

──────────────────────────────────────────────────────────────────────── Debugging Info ──
TestIntegrationSingle#test_hot_restart_does_not_drop_connections
   restart_count 1, reset 0, success after restart 104
TestIntegrationSingle#test_hot_restart_does_not_drop_connections_threads
   restart_count 1, reset 0, success after restart 0
TestIntegrationCluster#test_hot_restart_does_not_drop_connections_threads
   restart_count 1, reset 0, success after restart 680
TestIntegrationCluster#test_hot_restart_does_not_drop_connections
   restart_count 2, reset 0, success after restart 717
TestIntegrationCluster#test_term_closes_listeners_unix
    10 successes, 0 resets, 30 refused, 0 failures, 0 read timeouts
TestIntegrationCluster#test_term_closes_listeners_tcp
    11 successes, 1 resets, 28 refused, 0 failures, 0 read timeouts
──────────────────────────────────────────────────────────────────────────────────────────

rake aborted!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants