Skip to content

Commit

Permalink
Fix timing issue of Thread#kill and sleep
Browse files Browse the repository at this point in the history
In some cases, a thread isn't correctly killed by Thread#kill. If that is the
case, Thread#join on that thread leads to the deadlock.

Consider the following example:

    thread = Thread.new do
      sleep
    end
    thread.kill # actually, don't kill
    thread.join # thread lives forever; deadlock

Thread#kill does 2 things to kill a thread:
    1. prepares the thread to be terminated and
    2. wakes up the thread.

Most of time, the wake-up is executed after the sleep in the killed thread.

However, it is possible the wake-up is executed BEFORE the sleep in some rare
cases. If that is the case, the wake-up does nothing because the thread is
still waking up at that time. Then, the thread begins to sleep forever. As a
result, Thread#join never returns.

To avoid such a deadlock, Thread#sleep must check to terminate or not before
actually sleeping. In addition, it must be asssured that the wake-up isn't
executed AFTER the check BEFORE the sleep.

The follwing diagrams illustrate the good and bad timings.

Good timings:

                 Time
                 ------------------------------------------------------>
    main thread  prepare - wakeup (no-op) |
    new thread                            | check - no sleep - terminate

                 Time
                 ------------------------------------------------------>
    main thread                | prepare - wake-up |
    new thread   check - sleep |                   | terminate

                 Time
                 ------------------------------------------------------>
    main thread        | prepare |       | wake-up |
    new thread   check |         | sleep |         | terminate

Bad timing:

                 Time
                 ------------------------------------------------------>
    main thread        | prepare - wake-up (no-op) |
    new thread   check |                           | sleep forever

To fix the deadlock, this commit makes check_async be called before sleeping
inside the park mutex.
  • Loading branch information
ryoqun committed Sep 2, 2012
1 parent 149a364 commit 1f9ddd1
Show file tree
Hide file tree
Showing 5 changed files with 18 additions and 14 deletions.
4 changes: 2 additions & 2 deletions vm/builtin/system.cpp
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -757,9 +757,9 @@ namespace rubinius {
ts.tv_sec += tv.tv_sec + nano / NANOSECONDS; ts.tv_sec += tv.tv_sec + nano / NANOSECONDS;
ts.tv_nsec = nano % NANOSECONDS; ts.tv_nsec = nano % NANOSECONDS;


state->park_timed(gct, calling_environment, &ts); if(!state->park_timed(gct, calling_environment, &ts)) return NULL;
} else { } else {
state->park(gct, calling_environment); if(!state->park(gct, calling_environment)) return NULL;
} }


if(!state->check_async(calling_environment)) return NULL; if(!state->check_async(calling_environment)) return NULL;
Expand Down
11 changes: 7 additions & 4 deletions vm/park.cpp
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -3,8 +3,9 @@
#include "builtin/thread.hpp" #include "builtin/thread.hpp"


namespace rubinius { namespace rubinius {
void Park::park(STATE, CallFrame* call_frame) { Object* Park::park(STATE, CallFrame* call_frame) {
utilities::thread::Mutex::LockGuard lg(mutex_); utilities::thread::Mutex::LockGuard lg(mutex_);
if(!state->check_async(call_frame)) return NULL;


wake_ = false; wake_ = false;
sleeping_ = true; sleeping_ = true;
Expand All @@ -18,22 +19,24 @@ namespace rubinius {


sleeping_ = false; sleeping_ = false;
state->vm()->thread->sleep(state, cFalse); state->vm()->thread->sleep(state, cFalse);
return cNil;
} }


bool Park::park_timed(STATE, CallFrame* call_frame, struct timespec* ts) { Object* Park::park_timed(STATE, CallFrame* call_frame, struct timespec* ts) {
utilities::thread::Mutex::LockGuard lg(mutex_); utilities::thread::Mutex::LockGuard lg(mutex_);
if(!state->check_async(call_frame)) return NULL;


wake_ = false; wake_ = false;
sleeping_ = true; sleeping_ = true;
state->vm()->thread->sleep(state, cTrue); state->vm()->thread->sleep(state, cTrue);


bool timeout = false; Object* timeout = cFalse;


while(!wake_) { while(!wake_) {
GCIndependent gc_guard(state, call_frame); GCIndependent gc_guard(state, call_frame);


if(cond_.wait_until(mutex_, ts) == utilities::thread::cTimedOut) { if(cond_.wait_until(mutex_, ts) == utilities::thread::cTimedOut) {
timeout = true; timeout = cTrue;
break; break;
} }
} }
Expand Down
5 changes: 3 additions & 2 deletions vm/park.hpp
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@


namespace rubinius { namespace rubinius {
struct CallFrame; struct CallFrame;
class Object;


class Park { class Park {
utilities::thread::Condition cond_; utilities::thread::Condition cond_;
Expand All @@ -31,8 +32,8 @@ namespace rubinius {
return sleeping_; return sleeping_;
} }


void park(STATE, CallFrame* call_frame); Object* park(STATE, CallFrame* call_frame);
bool park_timed(STATE, CallFrame* call_frame, struct timespec* ts); Object* park_timed(STATE, CallFrame* call_frame, struct timespec* ts);
}; };
} }


Expand Down
8 changes: 4 additions & 4 deletions vm/state.cpp
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -78,11 +78,11 @@ namespace rubinius {
} }




void State::park(GCToken gct, CallFrame* call_frame) { Object* State::park(GCToken gct, CallFrame* call_frame) {
vm_->park_->park(this, call_frame); return vm_->park_->park(this, call_frame);
} }


void State::park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts) { Object* State::park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts) {
vm_->park_->park_timed(this, call_frame, ts); return vm_->park_->park_timed(this, call_frame, ts);
} }
} }
4 changes: 2 additions & 2 deletions vm/state.hpp
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -144,9 +144,9 @@ namespace rubinius {
vm_->unlock(vm_); vm_->unlock(vm_);
} }


void park(GCToken gct, CallFrame* call_frame); Object* park(GCToken gct, CallFrame* call_frame);


void park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts); Object* park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts);
}; };
} }


Expand Down

0 comments on commit 1f9ddd1

Please sign in to comment.