Kernel#sleep hangs on CI #5365

Closed
ahorek opened this Issue Oct 13, 2018 · 5 comments

ahorek commented Oct 13, 2018

Environment

jruby 9.2.1.0-SNAPSHOT (2.5.0) 2018-10-12 b2f694c Java HotSpot(TM) 64-Bit Server VM 25.181-b13 on 1.8.0_181-b13 +jit [linux-x86_64]

script

100.times do
  t = Thread.new do
    Kernel.sleep
    :ok
  end

  JRuby.reference(t).native_thread.interrupt
  t.value
end

JRuby.reference(t).native_thread.interrupt should interrupt the sleep so that the thread finishes:

  t = Thread.new do
    Kernel.sleep
    :ok
  end

  JRuby.reference(t).native_thread.interrupt
  => #<Thread:0x69aabcb0@(irb):31 dead>

Under load (for example, when I run it in a loop), some threads occasionally remain in the sleep state:

=> #<Thread:0x2dfeb141@(irb):39 sleep>

t.value calls Thread#join, and because the thread is still alive, the join waits forever:

if (!threadImpl.isAlive()) {
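
While debugging a hang like this, it can help to bound the wait so that a thread which never wakes shows up as a timeout rather than a stalled script. The following is only a sketch of that idea in plain Ruby; the 0.5 second limit is an arbitrary assumption, and the sleeping thread is deliberately never woken here.

```ruby
# Sketch: a thread that sleeps and is never woken, joined with a time limit.
t = Thread.new do
  Kernel.sleep   # sleeps until something wakes or interrupts the thread
  :ok
end

# Thread#join(limit) returns nil if the thread is still alive after `limit` seconds,
# whereas Thread#value (join without a limit) would block indefinitely here.
finished = t.join(0.5)

if finished
  puts "thread finished with #{t.value.inspect}"
else
  puts "thread still alive after 0.5s, status=#{t.status.inspect}"
  t.kill   # clean up so the script can exit
  t.join
end
```

Replacing t.value with a bounded join in the repro loop would make the stuck iteration visible without waiting for the CI timeout.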

Expected Behavior

It looks like there's a race condition somewhere. If I add a sleep call between Thread.new and the interrupt call, it doesn't hang. Or is it just a bad test?

100.times do
  t = Thread.new do
    Kernel.sleep
    :ok
  end

  sleep 0.25
  JRuby.reference(t).native_thread.interrupt
  t.value
end

Actual Behavior

The script hangs indefinitely, and on CI the build is eventually terminated:

https://travis-ci.org/jruby/jruby/jobs/440880330

....
     [exec] Kernel#sleep
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

headius commented Oct 15, 2018

Have you confirmed that the thread is actually sleeping before you try to interrupt it? The race here is that the thread may or may not have actually called sleep, so the interrupt call might happen when it's just running the preamble code for the block. It's possible the thread hasn't even been scheduled yet by the time you call interrupt.

ahorek commented Oct 15, 2018

> It's possible the thread hasn't even been scheduled yet by the time you call interrupt.

I think this is the case. With the extra sleep, every thread is already sleeping when it is interrupted:

100.times do
  t = Thread.new do
    Kernel.sleep
    :ok
  end

  sleep 0.25
  puts t.status

  JRuby.reference(t).native_thread.interrupt
  t.value
end

=>
sleep
sleep
sleep
sleep
....

100.times do
  t = Thread.new do
    Kernel.sleep
    :ok
  end

  # sleep 0.25
  puts t.status

  JRuby.reference(t).native_thread.interrupt
  t.value
end

=>
run
run
run
run
hangs

headius commented Oct 19, 2018

Since that's the case...is there really anything to fix? You are basically interrupting the sleep before it happens. If you wait for status == sleep it should work better, I believe.

Let us know if you believe there's something more we should be doing in JRuby for this case.
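
A minimal sketch of "wait for status == sleep", written so that it also runs outside JRuby: Thread#wakeup is used here as a portable stand-in for the JRuby-only native_thread.interrupt call from the issue, which is an assumption made purely for illustration and may not behave identically.

```ruby
t = Thread.new do
  Kernel.sleep
  :ok
end

# Yield to other threads until the new thread reports that it is sleeping.
# The `t.status &&` guard stops the loop if the thread has already died
# (status is false or nil in that case).
Thread.pass while t.status && t.status != "sleep"

status_before = t.status

# Portable stand-in for: JRuby.reference(t).native_thread.interrupt
t.wakeup

result = t.value
puts "status before wake: #{status_before}, value: #{result.inspect}"
```

Because the wake-up is only sent once the thread is observed sleeping, it can no longer arrive before Kernel.sleep has been entered.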

headius closed this Oct 19, 2018

headius added this to the Won't Fix milestone Oct 19, 2018

ahorek commented Oct 19, 2018

No, I think the behaviour is OK, but the test is wrong:
https://github.com/jruby/jruby/blob/4e8bb2666a7257f0f5986800f96bb88efdd6acbd/spec/regression/GH-4206_kernel_sleep_interruptedexception_spec.rb

If the thread hasn't been scheduled yet when the interrupt is sent, the test fails randomly, like this:
https://travis-ci.org/jruby/jruby/jobs/440880330

The test should wait until the thread is actually sleeping before calling

JRuby.reference(t).native_thread.interrupt

Otherwise it hangs.

Does that make sense, @headius?
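
One possible shape for such a wait, applied to the 100-iteration loop from the original report. This is a sketch and not the change that was committed: sleeping? is a hypothetical helper, the timeouts are arbitrary, and Thread#wakeup again stands in for the JRuby-only interrupt call so the loop can run on any Ruby.

```ruby
# Hypothetical helper (not part of JRuby or the spec): poll until the thread
# reports "sleep", giving up with an error after `timeout` seconds.
def sleeping?(thread, timeout: 5.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  while thread.alive? && thread.status != "sleep"
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "thread never reached the sleep state"
    end
    sleep 0.001
  end
  thread.status == "sleep"
end

results = Array.new(100) do
  t = Thread.new do
    Kernel.sleep
    :ok
  end

  # Only wake the thread once it is known to be sleeping.
  # On JRuby the issue uses: JRuby.reference(t).native_thread.interrupt
  t.wakeup if sleeping?(t)

  # Bounded join, so a missed wake-up is reported instead of hanging the run.
  t.join(5) ? t.value : :timeout
end

puts results.tally.inspect
```

The deadline in the helper and the bounded join mean that a regression would fail the run with a clear message rather than stall it until the CI system kills the build.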

headius commented Oct 19, 2018

Oh, somehow I missed that you were debugging a test hang. Yes, I agree the test should be fixed.

headius modified the milestones: Won't Fix, JRuby 9.2.1.0 Oct 19, 2018

headius added a commit that referenced this issue Oct 19, 2018
