deadlock fair::(recursive_)timed_mutex::try_lock_{for,until} #5

Closed
yohhoy opened this Issue Jan 23, 2017 · 3 comments

yohhoy (Owner) commented Jan 23, 2017

commit aee4958

Some tests in test/basic_test cause deadlock:

  • TimedMutexTest/2.TryLockFor
  • TimedMutexTest/2.TryLockUntil
  • TimedMutexTest/3.TryLockFor
  • TimedMutexTest/3.TryLockUntil

yohhoy added the bug label on Jan 23, 2017

yohhoy (Owner) commented Jan 23, 2017

Candidate patch for timed_mutex: nope, this fix leads to another deadlock (a failure scenario follows the diff).

   template <typename Clock, typename Duration>
   bool do_try_lockwait(const std::chrono::time_point<Clock, Duration>& tp)
   {
     std::unique_lock<decltype(mtx_)> lk(mtx_);
-    std::size_t request = next_;
+    std::size_t request = next_++;
     while (request != curr_) {
       if (cv_.wait_until(lk, tp) == std::cv_status::timeout) {
         if (request == curr_)  // re-check predicate
           break;
+        --next_;
+        cv_.notify_all();
         return false;
       }
     }
-    ++next_;
     return true;
   }
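
For the record, one scenario where even this patched version hangs (my reading, assuming unlock() performs ++curr_ and notifies): a holder owns ticket 0 (curr_ == 0), thread A takes ticket 1, thread B takes ticket 2 (next_ == 3). A times out and decrements next_ to 2. When the holder unlocks, curr_ advances to 1, but no thread owns ticket 1 anymore, so B waits forever; worse, the next arriving thread would take ticket 2 again, duplicating B's ticket. Decrementing next_ only works when the timed-out waiter holds the most recently issued ticket.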
yohhoy (Owner) commented Jan 23, 2017

This is another draft patch to resolve this problem. With this algorithm, the order of mutex lock acquisition may not be strictly FIFO. skip_ denotes the number of timed-out try_lock_* requests (excluding the case where the timed-out thread was the next candidate), and it relaxes the condition for lock acquisition from "next candidate only" to "any thread within the next + timed-out-count range". The lock and try_lock member functions shall be changed likewise (a sketch of lock() follows the diff below).

  • Pros: Token-based implementation with no dynamic memory allocation, unlike a trivial FIFO queue-based implementation.
  • Cons: Non-strict fair lock scheduling; under high lock contention with short timeouts, the skip_ count keeps growing.
 class timed_mutex : private detail::mutex_base {
   using base = detail::mutex_base;
 
+  std::size_t skip_ = 0;
+
   template <typename Clock, typename Duration>
   bool do_try_lockwait(const std::chrono::time_point<Clock, Duration>& tp)
   {
     std::unique_lock<decltype(mtx_)> lk(mtx_);
-    std::size_t request = next_;
-    while (request != curr_) {
+    std::size_t request = next_++;
+    // wait condition: request is in [curr_, curr_ + skip_]
+    while (!(curr_ <= request && request <= curr_ + skip_)) {
       if (cv_.wait_until(lk, tp) == std::cv_status::timeout) {
-        if (request == curr_)  // re-check predicate
+        if (curr_ <= request && request <= curr_ + skip_)  // re-check predicate
           break;
+        if (request == curr_ + 1) {
+          ++curr_;  // I was next candidate.
+        } else {
+          ++skip_;  // I was future (not next) candidate
+        }
         return false;
       }
     }
-    ++next_;
+    assert(curr_ <= request && request <= curr_ + skip_);
+    if (0 < skip_ && next_ == curr_ + skip_ + 1) {
+      // No other waiting thread, reset skip counter.
+      skip_ = 0;
+      curr_ = next_ - 1;
+    }
     return true;
   }
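
For concreteness, here is my reading of the "changed likewise" remark: lock() would take its ticket the same way and wait on the same relaxed window, just without the timeout branch. A sketch under that assumption (not the committed code):

   void lock()
   {
     std::unique_lock<decltype(mtx_)> lk(mtx_);
     std::size_t request = next_++;
     // proceed once request falls in the relaxed window [curr_, curr_ + skip_]
     while (!(curr_ <= request && request <= curr_ + skip_)) {
       cv_.wait(lk);
     }
     if (0 < skip_ && next_ == curr_ + skip_ + 1) {
       // no other waiting thread; reset the skip counter
       skip_ = 0;
       curr_ = next_ - 1;
     }
   }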
yohhoy (Owner) commented Jan 24, 2017

Negatively resolved by commit 3e064f1, which removes the timed variants from yamc::fair::*.

I suppose that it's impossible to ensure perfect FIFO scheduling with a ticket-based implementation.

If a fair mutex used a queue of waiting threads (e.g. std::deque<std::thread::id>), we could achieve a reliable FIFO lock order. But its run-time overhead cannot be ignored, and I feel such a queue-based implementation is not worthwhile.
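
For illustration only, a minimal sketch of that queue-based design (not this repository's code; the class and member names are made up, and the recursive variant is ignored). Each waiter enqueues its std::thread::id, the front entry is the owner, and a timed-out waiter erases its own entry, so an abandoned slot can never stall later waiters:

  #include <algorithm>
  #include <chrono>
  #include <condition_variable>
  #include <deque>
  #include <mutex>
  #include <thread>

  class fifo_timed_mutex {
    std::mutex mtx_;
    std::condition_variable cv_;
    std::deque<std::thread::id> queue_;  // front() == current owner

  public:
    void lock()
    {
      const auto id = std::this_thread::get_id();
      std::unique_lock<std::mutex> lk(mtx_);
      queue_.push_back(id);
      cv_.wait(lk, [&]{ return queue_.front() == id; });
    }

    template <typename Clock, typename Duration>
    bool try_lock_until(const std::chrono::time_point<Clock, Duration>& tp)
    {
      const auto id = std::this_thread::get_id();
      std::unique_lock<std::mutex> lk(mtx_);
      queue_.push_back(id);
      if (!cv_.wait_until(lk, tp, [&]{ return queue_.front() == id; })) {
        // timed out: remove our own entry so we leave no hole in the queue
        queue_.erase(std::find(queue_.begin(), queue_.end(), id));
        return false;
      }
      return true;
    }

    void unlock()
    {
      std::lock_guard<std::mutex> lk(mtx_);
      queue_.pop_front();   // we are the owner at the front
      cv_.notify_all();     // wake the new front waiter
    }
  };

This makes the timed-out-waiter problem trivial, which is exactly why it is tempting, but every lock/unlock pays for deque bookkeeping and a notify_all, matching the run-time overhead concern above.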
