Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Push signal traps and at_exit blocks to a common thread for exec. #5441

Merged
merged 2 commits into from Feb 11, 2019

Conversation

@headius
Copy link
Member

@headius headius commented Nov 12, 2018

This avoids races due to the signal traps running on the JVM's
signal-handling thread while we run at_exit blocks on JRuby's
main thread.

Fixes #5437.

This avoids races due to the signal traps running on the JVM's
signal-handling thread while we run at_exit blocks on JRuby's
main thread.

Fixes #5437.
@headius headius added this to the JRuby 9.2.4.0 milestone Nov 12, 2018
@headius
Copy link
Member Author

@headius headius commented Nov 12, 2018

@Adithya-copart Can you provide an example script that causes a race and verify that this PR fixes it?

@headius headius requested review from kares and enebo Nov 12, 2018
@Adithya-copart
Copy link

@Adithya-copart Adithya-copart commented Nov 12, 2018

@headius Will the code in the first comment of puma/puma#1675 work?

## Gemfile


ruby '2.5.0', engine: 'jruby', engine_version: '9.2.0.0'

gem 'sinatra'
gem 'sucker_punch'
gem 'puma'


## config.ru

require 'sinatra/base'
require 'sucker_punch'
require './test_app.rb'

run TestApp

## test_app.rb

class TestApp < Sinatra::Base

  class TestJob
    include SuckerPunch::Job

    def perform
      puts "The job #{self} is starting to sleep 5sec in #{Thread.current}"
      hsh = Thread.list.reduce({}) {|hsh,t| hsh[t] = t.status; hsh}
      # puts "Thread list in #{Thread.current}: #{hsh}"
      sleep 5
      puts "The job #{self} woke up from sleep in #{Thread.current}"
    end
  end

  get '/' do
    3.times {TestJob.perform_async}
  end

end

Steps:
bundle exec puma and hit the URL get '/' and Ctrl + C.
The process should exit if the fix works otherwise it just hangs there.

There is a race between the at_exit in sucker_punch and the trap in puma.

I can recreate something later today if it doesn't.

Thanks!

EDIT: Added the code.

@headius
Copy link
Member Author

@headius headius commented Nov 12, 2018

@Adithya-copart Ah yes, that's quite similar to how this PR works, except that we use a JDK "ExecutorService" that manages the native thread and queueing for us.

I will try to reproduce based on your instructions.

There are failures in CI, unfortunately. It seems that at least the CRuby/MRI tests expect at_exit to run on the main thread, affecting the main thread's state, raising new exceptions in the main thread, and so on. And as it turns out, CRuby avoids the races you describe by running the signal handlers on the main thread as well. That's right, it hijacks the main thread to run signal handlers.

$ rvm ruby-2.5 do ruby -e 'Signal.trap("INT") { p Thread.current }; p Thread.current; sleep'
#<Thread:0x00007fafdb87dc40 run>
^C#<Thread:0x00007fafdb87dc40 run>

I'm not sure we want to emulate that behavior.

I have a thought for a modification to my fix that just waits for all scheduled signal handlers to execute before proceeding on to the at_exit blocks. We'll see if that helps CI.

try {
executor.shutdown();
if (!executor.awaitTermination(1, TimeUnit.HOURS)) {
throw newRuntimeError("failed to execute at_exit blocks within 1h timeout");

This comment has been minimized.

@kares

kares Nov 12, 2018
Member

heh :) ... is this an arbitrary 1h?

This comment has been minimized.

@headius

headius Nov 12, 2018
Author Member

I thought about it for at least 30 seconds. But yeah it's pretty much arbitrary.

@enebo enebo modified the milestones: JRuby 9.2.4.0, JRuby 9.2.5.0 Nov 13, 2018
@enebo enebo modified the milestones: JRuby 9.2.5.0, JRuby 9.2.6.0 Dec 6, 2018
@headius headius modified the milestones: JRuby 9.2.6.0, JRuby 9.2.7.0 Dec 18, 2018
@enebo enebo merged commit da55f9e into jruby:master Feb 11, 2019
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@headius
Copy link
Member Author

@headius headius commented Apr 8, 2019

This had to be reverted due to some unexpected behavior changes in at_exit. See #5437 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants