Server run thread safety fix [changelog skip] #2435

Merged: 3 commits into puma:master on Oct 20, 2020

@wjordan (Contributor) commented Oct 19, 2020

Description

This PR fixes a race condition in the creation of the @check/@notify pipe in Server, which caused various intermittent test failures on JRuby.

The issue can be triggered by a call to #run (which calls #handle_servers in a background thread) followed immediately by #stop (which calls #notify_safely). This can result in two separate @check/@notify pipes being created concurrently. The signal sent by #stop is then not picked up by the listen loop in #handle_servers, so the loop fails to exit as expected. This PR adds a test, test_run_stop_thread_safety, that demonstrates the race condition and consistently fails on JRuby without this fix (see this test run).
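
To illustrate the failure mode described above, here is a minimal sketch that replays the bad interleaving by hand instead of relying on thread timing. It is not Puma's code: `lost_signal_demo` and all variable names are hypothetical, and the only assumption is that each side lazily creates its own pipe when it finds none.

```ruby
# Illustrative sketch only; not Puma's code. All names are hypothetical.
def lost_signal_demo
  # The listen loop (background thread) finds no pipe and creates one.
  listener_check, listener_notify = IO.pipe
  # Before that assignment is visible, the stopping thread also finds no
  # pipe and creates a second one.
  stopper_check, stopper_notify = IO.pipe

  # The stop signal is written to the write end of the second pipe.
  stopper_notify << "!"

  # The listen loop is still selecting on the read end of the first pipe.
  seen_by_listener = !IO.select([listener_check], nil, nil, 0.1).nil?
  stranded = !IO.select([stopper_check], nil, nil, 0.1).nil?

  { seen_by_listener: seen_by_listener, stranded_in_second_pipe: stranded }
ensure
  [listener_check, listener_notify, stopper_check, stopper_notify].each do |io|
    io&.close
  end
end
```

Calling `lost_signal_demo` reports that the listener never sees the signal, while the byte sits unread in the second pipe. That matches the symptom in the description: the listen loop keeps waiting and the server does not stop.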

The fix moves pipe creation into Server#run (before the background thread is spawned). This makes the behavior more consistent: the server will stop as long as the call to #stop comes after the call to #run returns.
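
The shape of the fix can be sketched as follows. This is a simplified stand-in, not the actual Server implementation: `SketchServer`, `listen_loop`, and the `STOP` constant are hypothetical names, and the real listen loop does far more than wait for one byte.

```ruby
# Simplified stand-in for the fix; not Puma's Server class.
class SketchServer
  STOP = "!" # hypothetical stop command byte

  def run
    # Create the pipe in the calling thread, before the background thread
    # exists, so #stop and the listen loop always share the same pipe.
    @check, @notify = IO.pipe
    @thread = Thread.new { listen_loop }
    self
  end

  # Returns true if the listen loop exited within the timeout.
  def stop
    @notify << STOP
    !@thread.join(5).nil?
  end

  private

  def listen_loop
    loop do
      ready = IO.select([@check]) # block until the pipe is readable
      next unless ready
      break if @check.read(1) == STOP
    end
  ensure
    @check.close
    @notify.close
  end
end
```

With this ordering, `SketchServer.new.run.stop` returns `true` even when `stop` is called immediately after `run` returns, because the pipe already exists by the time either thread touches it.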

This PR includes a related timing change in Single, moving the Use Ctrl-C to stop log message after the call to Server#run. The wait_for_server_to_boot integration-test helper uses this log message to determine when the server is 'booted', and some integration tests (e.g., test_int_refuse) send an INT signal (which calls stop) immediately after the message is printed. That caused intermittent failures when the stop call arrived before the run call had finished. The change ensures that stop always occurs after run, avoiding intermittent failures in this test.
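
The ordering argument above can be sketched with a recording stand-in. Everything here is hypothetical (`RecordingServer`, `boot_in_order`, `stop_after_boot_message`); it only shows why a caller that waits for the log message can no longer stop the server before `run` has returned.

```ruby
require "stringio"

# Hypothetical stand-in that records the order of calls; not Puma's Server.
class RecordingServer
  attr_reader :events

  def initialize
    @events = []
  end

  def run
    @events << :run_returned
  end

  def stop
    @events << :stop
  end
end

BOOT_MESSAGE = "Use Ctrl-C to stop"

# Log only after run has returned, mirroring the change in Single.
def boot_in_order(server, log)
  server.run
  log.puts BOOT_MESSAGE
end

# Mimics a test helper that sends the stop once it has seen the message.
def stop_after_boot_message(server, log)
  server.stop if log.string.include?(BOOT_MESSAGE)
end
```

Because the message is written after `run` returns, any helper that keys off the message necessarily issues its stop afterwards.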

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] or [ci skip] to the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

Log after `server.run` in Single mode since
some integration tests use this message for timing stop signals.
@@ -1141,4 +1141,12 @@ def test_client_quick_close_no_lowlevel_error_handler_call
    sleep 0.5
    assert_empty @events.stdout.string
  end

  def test_run_stop_thread_safety
    100.times do
Member

How long does this take to run?

Member

I believe 0.3 sec on JRuby, less on MRI. I should have reviewed this earlier, but...

I suspect I can revert the band-aids for JRuby (the sleep after run). This is the same issue of Server#run getting messed up.

Contributor Author

0.02s on MRI, 0.28s on JRuby

@notify << message
rescue IOError, NoMethodError, Errno::EPIPE
# The server, in another thread, is shutting down
@notify << message
Member

Nice cleanup here.

@MSP-Greg self-requested a review on October 19, 2020 19:51
@wjordan changed the title from "Server run thread safety fix" to "Server run thread safety fix [changelog skip]" on Oct 19, 2020
@nateberkopec merged commit fdb936d into puma:master on Oct 20, 2020