Permalink
Browse files

Fix timing issue of Thread#kill and sleep

In some cases, a thread isn't correctly killed by Thread#kill. If that is the
case, Thread#join on that thread leads to the deadlock.

Consider the following example:

    thread = Thread.new do
      sleep
    end
    thread.kill # actually, don't kill
    thread.join # thread lives forever; deadlock

Thread#kill does 2 things to kill a thread:
    1. prepares the thread to be terminated and
    2. wakes up the thread.

Most of time, the wake-up is executed after the sleep in the killed thread.

However, it is possible the wake-up is executed BEFORE the sleep in some rare
cases. If that is the case, the wake-up does nothing because the thread is
still waking up at that time. Then, the thread begins to sleep forever. As a
result, Thread#join never returns.

To avoid such a deadlock, Thread#sleep must check to terminate or not before
actually sleeping. In addition, it must be asssured that the wake-up isn't
executed AFTER the check BEFORE the sleep.

The follwing diagrams illustrate the good and bad timings.

Good timings:

                 Time
                 ------------------------------------------------------>
    main thread  prepare - wakeup (no-op) |
    new thread                            | check - no sleep - terminate

                 Time
                 ------------------------------------------------------>
    main thread                | prepare - wake-up |
    new thread   check - sleep |                   | terminate

                 Time
                 ------------------------------------------------------>
    main thread        | prepare |       | wake-up |
    new thread   check |         | sleep |         | terminate

Bad timing:

                 Time
                 ------------------------------------------------------>
    main thread        | prepare - wake-up (no-op) |
    new thread   check |                           | sleep forever

To fix the deadlock, this commit makes check_async be called before sleeping
inside the park mutex.
  • Loading branch information...
1 parent 149a364 commit 1f9ddd10c10386f4e12f9a52f41ff77731061612 @ryoqun ryoqun committed Sep 2, 2012
Showing with 18 additions and 14 deletions.
  1. +2 −2 vm/builtin/system.cpp
  2. +7 −4 vm/park.cpp
  3. +3 −2 vm/park.hpp
  4. +4 −4 vm/state.cpp
  5. +2 −2 vm/state.hpp
@@ -757,9 +757,9 @@ namespace rubinius {
ts.tv_sec += tv.tv_sec + nano / NANOSECONDS;
ts.tv_nsec = nano % NANOSECONDS;
- state->park_timed(gct, calling_environment, &ts);
+ if(!state->park_timed(gct, calling_environment, &ts)) return NULL;
} else {
- state->park(gct, calling_environment);
+ if(!state->park(gct, calling_environment)) return NULL;
}
if(!state->check_async(calling_environment)) return NULL;
View
@@ -3,8 +3,9 @@
#include "builtin/thread.hpp"
namespace rubinius {
- void Park::park(STATE, CallFrame* call_frame) {
+ Object* Park::park(STATE, CallFrame* call_frame) {
utilities::thread::Mutex::LockGuard lg(mutex_);
+ if(!state->check_async(call_frame)) return NULL;
wake_ = false;
sleeping_ = true;
@@ -18,22 +19,24 @@ namespace rubinius {
sleeping_ = false;
state->vm()->thread->sleep(state, cFalse);
+ return cNil;
}
- bool Park::park_timed(STATE, CallFrame* call_frame, struct timespec* ts) {
+ Object* Park::park_timed(STATE, CallFrame* call_frame, struct timespec* ts) {
utilities::thread::Mutex::LockGuard lg(mutex_);
+ if(!state->check_async(call_frame)) return NULL;
wake_ = false;
sleeping_ = true;
state->vm()->thread->sleep(state, cTrue);
- bool timeout = false;
+ Object* timeout = cFalse;
while(!wake_) {
GCIndependent gc_guard(state, call_frame);
if(cond_.wait_until(mutex_, ts) == utilities::thread::cTimedOut) {
- timeout = true;
+ timeout = cTrue;
break;
}
}
View
@@ -5,6 +5,7 @@
namespace rubinius {
struct CallFrame;
+ class Object;
class Park {
utilities::thread::Condition cond_;
@@ -31,8 +32,8 @@ namespace rubinius {
return sleeping_;
}
- void park(STATE, CallFrame* call_frame);
- bool park_timed(STATE, CallFrame* call_frame, struct timespec* ts);
+ Object* park(STATE, CallFrame* call_frame);
+ Object* park_timed(STATE, CallFrame* call_frame, struct timespec* ts);
};
}
View
@@ -78,11 +78,11 @@ namespace rubinius {
}
- void State::park(GCToken gct, CallFrame* call_frame) {
- vm_->park_->park(this, call_frame);
+ Object* State::park(GCToken gct, CallFrame* call_frame) {
+ return vm_->park_->park(this, call_frame);
}
- void State::park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts) {
- vm_->park_->park_timed(this, call_frame, ts);
+ Object* State::park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts) {
+ return vm_->park_->park_timed(this, call_frame, ts);
}
}
View
@@ -144,9 +144,9 @@ namespace rubinius {
vm_->unlock(vm_);
}
- void park(GCToken gct, CallFrame* call_frame);
+ Object* park(GCToken gct, CallFrame* call_frame);
- void park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts);
+ Object* park_timed(GCToken gct, CallFrame* call_frame, struct timespec* ts);
};
}

0 comments on commit 1f9ddd1

Please sign in to comment.