Deadlock occurs on Thread.kill #1015

Closed
halorgium opened this Issue Sep 15, 2013 · 1 comment

Projects

None yet

2 participants

@halorgium

Encountered a deadlock with @celluloid.

Thread dump

/cc @headius @tarcieri

@headius headius was assigned Sep 15, 2013
@headius
Member
headius commented Sep 15, 2013

Trivial reproduction:

loop { Thread.new{}.kill; p :ok }

Barely runs at all for me. The problem is that the logic that runs when a thread is shutting down normally acquires two locks in the opposite order of kill called against that thread. I have a local patch that just removes several locks and fixes this. Will commit when I have a better connection.

@headius headius added a commit that referenced this issue Sep 15, 2013
@headius headius Remove several locks involved in thread mail and lifecycle.
The deadlock reported in #1015 was caused by RubyThread#dispose
and Thread#kill logic (RubyThread#kill) acquiring the same two
locks in opposite order. dispose first acquired the RubyThread
monitor, in order to read and write volatile state, and then
acquired the ThreadService monitor without releasing RubyThread in
order to unregister the RubyThread object. kill acquired the
ThreadService monitor and then attempted to acquire the RubyThread
monitor in order to deliver the kill event.

As it turned out, the "mail" field on RubyThread was the primary
actor triggering the use of synchronization, and much of that
synchronization was not necessary. The removal of the thread from
ThreadService did not need to be synchronized. The dispose and
receiveMail methods did not need to be synchronized if the mail
slot was made into an atomic reference. The deliverEvent method
did not need to be synchronized for any reason.

I made the appropriate changes to turn mail into an atomic
reference, avoid keeping the RubyThread lock when calling into
ThreadService to remove the RubyThread, and remove locks that
became irrelevant once mail became atomic.

Fixes #1015.
937f414
@headius headius added a commit that closed this issue Sep 15, 2013
@headius headius Remove several locks involved in thread mail and lifecycle.
The deadlock reported in #1015 was caused by RubyThread#dispose
and Thread#kill logic (RubyThread#kill) acquiring the same two
locks in opposite order. dispose first acquired the RubyThread
monitor, in order to read and write volatile state, and then
acquired the ThreadService monitor without releasing RubyThread in
order to unregister the RubyThread object. kill acquired the
ThreadService monitor and then attempted to acquire the RubyThread
monitor in order to deliver the kill event.

As it turned out, the "mail" field on RubyThread was the primary
actor triggering the use of synchronization, and much of that
synchronization was not necessary. The removal of the thread from
ThreadService did not need to be synchronized. The dispose and
receiveMail methods did not need to be synchronized if the mail
slot was made into an atomic reference. The deliverEvent method
did not need to be synchronized for any reason.

I made the appropriate changes to turn mail into an atomic
reference, avoid keeping the RubyThread lock when calling into
ThreadService to remove the RubyThread, and remove locks that
became irrelevant once mail became atomic.

Fixes #1015.
937f414
@headius headius closed this in 937f414 Sep 15, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment