New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock when tearing down runtime during debugging #4352

Merged
merged 1 commit into from Dec 4, 2016

Conversation

Projects
None yet
2 participants
@ivoanjo
Contributor

ivoanjo commented Dec 1, 2016

When trying to exit with a hard exit (exit!), runtime teardown is
performed by the shutdown hook thread added in Main.internalRun(), which
is run on a different thread.

This poses a problem when using ruby-debug, as it grabs a lock while
processing debug events 1, and thus if you try to exit! while trying to
process debug events (for instance if you're debugging interactively
with pry and you want to exit), you'll get a deadlock looking something
like this:

"Thread-1" - Thread t@11
   java.lang.Thread.State: BLOCKED
  at org.jruby.debug.DebugEventHook.eventHandler(DebugEventHook.java:91)
  - waiting to lock <3b88136f> (a org.jruby.debug.DebugEventHook) owned by "main" t@1
  at org.jruby.runtime.EventHook.event(EventHook.java:30)
  at org.jruby.Ruby.callEventHooks(Ruby.java:3179)
  at org.jruby.runtime.ThreadContext.trace(ThreadContext.java:652)
  (...)
  at org.jruby.RubyProc.call(RubyProc.java:249)
  at org.jruby.Ruby.tearDown(Ruby.java:3281)
  at org.jruby.Ruby.tearDown(Ruby.java:3254)
  at org.jruby.Main$1.run(Main.java:293)

   Locked ownable synchronizers:
  - None

"main" - Thread t@1
   java.lang.Thread.State: WAITING
  at java.lang.Object.wait(Native Method)
  - waiting on <5579bb86> (a org.jruby.Main$1)
  at java.lang.Thread.join(Thread.java:1249)
  at java.lang.Thread.join(Thread.java:1323)
  at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
  at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
  at java.lang.Shutdown.runHooks(Shutdown.java:123)
  at java.lang.Shutdown.sequence(Shutdown.java:167)
  at java.lang.Shutdown.exit(Shutdown.java:212)
  - locked <39d0fc85> (a java.lang.Class)
  at java.lang.Runtime.exit(Runtime.java:109)
  at java.lang.System.exit(System.java:971)
  at org.jruby.RubyKernel.exit(RubyKernel.java:708)
  at org.jruby.RubyKernel.exit_bang(RubyKernel.java:683)
  (...)
  at org.jruby.debug.DebugEventHook.eventHandler(DebugEventHook.java:97)
  - locked <3b88136f> (a org.jruby.debug.DebugEventHook)
  at org.jruby.runtime.EventHook.event(EventHook.java:30)
  at org.jruby.Ruby.callEventHooks(Ruby.java:3179)
  at org.jruby.runtime.ThreadContext.trace(ThreadContext.java:652)
  (...)
  at org.jruby.Main.main(Main.java:204)

To fix this, we disable event hooks during teardown, so that code events
during this period are not sent to the debugger.

Fix deadlock when tearing down runtime during debugging
When trying to exit with a hard exit (exit!), runtime teardown is
performed by the shutdown hook thread added in Main.internalRun(), which
is run on a different thread.

This poses a problem when using ruby-debug, as it grabs a lock while
processing debug events [1], and thus if you try to exit! while trying to
process debug events (for instance if you're debugging interactively
with pry and you want to exit), you'll get a deadlock looking something
like this:

```
"Thread-1" - Thread t@11
   java.lang.Thread.State: BLOCKED
  at org.jruby.debug.DebugEventHook.eventHandler(DebugEventHook.java:91)
  - waiting to lock <3b88136f> (a org.jruby.debug.DebugEventHook) owned by "main" t@1
  at org.jruby.runtime.EventHook.event(EventHook.java:30)
  at org.jruby.Ruby.callEventHooks(Ruby.java:3179)
  at org.jruby.runtime.ThreadContext.trace(ThreadContext.java:652)
  (...)
  at org.jruby.RubyProc.call(RubyProc.java:249)
  at org.jruby.Ruby.tearDown(Ruby.java:3281)
  at org.jruby.Ruby.tearDown(Ruby.java:3254)
  at org.jruby.Main$1.run(Main.java:293)

   Locked ownable synchronizers:
  - None

"main" - Thread t@1
   java.lang.Thread.State: WAITING
  at java.lang.Object.wait(Native Method)
  - waiting on <5579bb86> (a org.jruby.Main$1)
  at java.lang.Thread.join(Thread.java:1249)
  at java.lang.Thread.join(Thread.java:1323)
  at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
  at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
  at java.lang.Shutdown.runHooks(Shutdown.java:123)
  at java.lang.Shutdown.sequence(Shutdown.java:167)
  at java.lang.Shutdown.exit(Shutdown.java:212)
  - locked <39d0fc85> (a java.lang.Class)
  at java.lang.Runtime.exit(Runtime.java:109)
  at java.lang.System.exit(System.java:971)
  at org.jruby.RubyKernel.exit(RubyKernel.java:708)
  at org.jruby.RubyKernel.exit_bang(RubyKernel.java:683)
  (...)
  at org.jruby.debug.DebugEventHook.eventHandler(DebugEventHook.java:97)
  - locked <3b88136f> (a org.jruby.debug.DebugEventHook)
  at org.jruby.runtime.EventHook.event(EventHook.java:30)
  at org.jruby.Ruby.callEventHooks(Ruby.java:3179)
  at org.jruby.runtime.ThreadContext.trace(ThreadContext.java:652)
  (...)
  at org.jruby.Main.main(Main.java:204)
```

To fix this, we disable event hooks during teardown, so that code events
during this period are not sent to the debugger.

[1]: https://github.com/ruby-debug/ruby-debug/blob/master/src/org/jruby/debug/DebugEventHook.java#L91-L101

@kares kares added this to the JRuby 9.1.7.0 milestone Dec 2, 2016

@kares kares merged commit b17bee6 into jruby:master Dec 4, 2016

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jan 4, 2017

Member

Hey Ivo, any chance you could look at ruby-debug whether this could be avoided there?
... seems like the change messes up others such as simplecov (#4404) thus might end up being reverted.

Member

kares commented Jan 4, 2017

Hey Ivo, any chance you could look at ruby-debug whether this could be avoided there?
... seems like the change messes up others such as simplecov (#4404) thus might end up being reverted.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jan 5, 2017

Member

PR was reverted at 57ba4fd (due #4404) ... there's #4419 for tracking the dead-lock issue.

Member

kares commented Jan 5, 2017

PR was reverted at 57ba4fd (due #4404) ... there's #4419 for tracking the dead-lock issue.

@ivoanjo

This comment has been minimized.

Show comment
Hide comment
@ivoanjo

ivoanjo Jan 5, 2017

Contributor

Ah, not good indeed. I will try to find other way to tackle this issue, thanks for the heads up.

Contributor

ivoanjo commented Jan 5, 2017

Ah, not good indeed. I will try to find other way to tackle this issue, thanks for the heads up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment