Use "semaphore-like" mutexes in the threads library #9863

xavierleroy · 2020-08-28T07:46:54Z

This is a "for reference" pull request, documenting an alternative to #9846.

The problem with mutexes pointed out in #9757 is that Mutex.unlock, as implemented today using POSIX threads mutexes or Win32 critical sections, can fail or crash or misbehave arbitrarily if it is called on a mutex that is held by another thread.

#9846 addresses this issue by reliably raising a Sys_error exception in this case of bad unlocking.
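
For illustration, here is a minimal sketch of the kind of check that makes such an exception possible, assuming the implementation builds on POSIX error-checking mutexes (`PTHREAD_MUTEX_ERRORCHECK`). The helper name `errcheck_mutex_init` is hypothetical and is not taken from #9846.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as an error-checking mutex.
   Returns 0 on success or an errno-style code on failure. */
static int errcheck_mutex_init(pthread_mutex_t *m) {
  pthread_mutexattr_t attr;
  int rc;
  assert(m != NULL);
  rc = pthread_mutexattr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  if (rc == 0) rc = pthread_mutex_init(m, &attr);
  pthread_mutexattr_destroy(&attr);
  return rc;
}
```

With such a mutex, `pthread_mutex_unlock` called by a thread that does not hold the mutex returns `EPERM` instead of misbehaving, and the caller can turn that error code into an exception.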

The present PR addresses this issue by making it safe for any thread to unlock a locked mutex. In effect, this causes OCaml mutexes to behave like 0-1 semaphores. The implementation is relatively simple, but I would expect it to be significantly slower than native POSIX threads error-checking mutexes.
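
The idea can be sketched in portable C as follows. This is only an illustration under stated assumptions, not the code of this PR: the three states and the field names `status` and `free` follow the diff, while the type name `sem_mutex`, the field `lock` and all function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical type: a mutex that behaves like a 0-1 semaphore. */
typedef struct {
  pthread_mutex_t lock;   /* protects status */
  pthread_cond_t free;    /* broadcast when the mutex is released */
  enum { UNLOCKED, LOCKED, LOCKED_WAITED } status;
} sem_mutex;

static int sem_mutex_init(sem_mutex *m) {
  int rc = pthread_mutex_init(&m->lock, NULL);
  if (rc != 0) return rc;
  rc = pthread_cond_init(&m->free, NULL);
  if (rc != 0) { pthread_mutex_destroy(&m->lock); return rc; }
  m->status = UNLOCKED;
  return 0;
}

/* Block until the mutex is free, then take it. */
static int sem_mutex_lock(sem_mutex *m) {
  int rc = pthread_mutex_lock(&m->lock);
  if (rc != 0) return rc;
  while (m->status != UNLOCKED) {
    m->status = LOCKED_WAITED;   /* ask the unlocker to broadcast */
    rc = pthread_cond_wait(&m->free, &m->lock);
    if (rc != 0) { pthread_mutex_unlock(&m->lock); return rc; }
  }
  m->status = LOCKED;
  return pthread_mutex_unlock(&m->lock);
}

/* Release the mutex. Any thread may call this, not only the locker. */
static int sem_mutex_unlock(sem_mutex *m) {
  int rc = pthread_mutex_lock(&m->lock);
  if (rc != 0) return rc;
  if (m->status == LOCKED_WAITED) pthread_cond_broadcast(&m->free);
  m->status = UNLOCKED;          /* no-op if it was already unlocked */
  return pthread_mutex_unlock(&m->lock);
}

/* Thread body used to show unlocking from a different thread. */
static void *unlock_from_other_thread(void *arg) {
  int rc = sem_mutex_unlock((sem_mutex *)arg);
  assert(rc == 0);
  (void)rc;
  return NULL;
}
```

A thread can call `sem_mutex_lock` and then hand the mutex to another thread started with `pthread_create(&t, NULL, unlock_from_other_thread, &m)`, which releases it without error. Every operation takes the inner `lock` and may broadcast on a condition variable, which is presumably where the expected slowdown comes from.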

The only reason to consider this PR is to break less OCaml code than the #9846 approach. (Even though code that relies on unlocking mutexes from other threads is already broken, in my opinion.)

So that it becomes safe to unlock a mutex from a thread other than the one that locked the mutex.

gadmm · 2020-09-08T12:27:13Z

For the record, an example where the unlock happened in another thread was reported at https://discuss.ocaml.org/t/mutex-lock-resource-deadlock-avoided-on-freebsd-12-1-ocaml-4-09-1-lwt-4-2-1/6206. It would indeed be nice to know whether it shows a real use of the semaphore behaviour in libraries.

dra27

I realise it's not intended for merging, but I have "reviewed".

For the Win32 side: Windows XP is long dead, so I think we can risk formally breaking it. Condition variables (see InitializeConditionVariable and SleepConditionVariableCS) would certainly simplify the Win32 version of this, and I think it should also be faster, since it would avoid the heavier event objects (in the same way that the mutexes are made faster by using critical sections rather than actual Windows mutexes).

dra27 · 2020-09-08T09:52:17Z

otherlibs/systhreads/st_posix.h

+  switch (m->status) {
+  case LOCKED_WAITED:
+    pthread_cond_broadcast(&m->free);
+    /* fallthrough */
+  case LOCKED:
+    m->status = UNLOCKED;
+    break;
+  case UNLOCKED:
+    break; /* we could return an error code */
+  }


FWIW, were this merged, this would be better in an inline function st_release_mutex or something so that it can be used...
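
A possible shape for such a helper, as a sketch only: the struct layout and the name `st_mutex_unlock_sketch` are assumptions, and the helper expects the caller to already hold the inner lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical layout: only status and free are visible in the diff. */
struct st_mutex_sketch {
  pthread_mutex_t lock;   /* protects status */
  pthread_cond_t free;    /* broadcast on release when threads wait */
  enum { UNLOCKED, LOCKED, LOCKED_WAITED } status;
};

/* Release m. The caller must already hold m->lock. */
static inline void st_release_mutex(struct st_mutex_sketch *m) {
  assert(m != NULL);
  switch (m->status) {
  case LOCKED_WAITED:
    pthread_cond_broadcast(&m->free);
    /* fallthrough */
  case LOCKED:
    m->status = UNLOCKED;
    break;
  case UNLOCKED:
    break; /* we could return an error code */
  }
}

/* One possible caller: the plain unlock operation. */
static int st_mutex_unlock_sketch(struct st_mutex_sketch *m) {
  int rc = pthread_mutex_lock(&m->lock);
  if (rc != 0) return rc;
  st_release_mutex(m);
  return pthread_mutex_unlock(&m->lock);
}
```

Any other code path that needs to release the mutex while holding `lock` could call `st_release_mutex` directly instead of repeating the `switch`.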

dra27 · 2020-09-08T09:52:26Z

otherlibs/systhreads/st_posix.h

+  /* Start releasing the mutex */
+  switch (m->status) {
+  case LOCKED_WAITED:
+    pthread_cond_broadcast(&m->free);
+    /* fallthrough */
+  case LOCKED:
+    m->status = UNLOCKED;
+    break;
+  case UNLOCKED:
+    break; /* we could return an error code */
+  }


dra27 · 2020-09-08T12:41:39Z

otherlibs/systhreads/st_win32.h

+  while (m->status != UNLOCKED) {
+    ev = st_event_for_current_thread();
+    if (ev == NULL) { 
+      rc = GetLastError();
+      LeaveCriticalSection(&m->lock);
+      return rc;
+    }
+    /* Insert the current thread in the waiting list (atomically) */
+    wait.event = ev;
+    wait.next = m->waiters;
+    m->waiters = &wait;
+    LeaveCriticalSection(&m->lock);
+    /* Wait for our event to be signaled.  There is no risk of lost
+       wakeup, since we inserted ourselves on the waiting list of m
+       before releasing m's critical section */
+    TRACE1("st_mutex_lock: blocking on event", ev);
+    if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
+      return GetLastError();
+    TRACE1("st_mutex_lock: restarted", m);
+    EnterCriticalSection(&m->lock);
+  }


Suggested change

-  while (m->status != UNLOCKED) {
-    ev = st_event_for_current_thread();
-    if (ev == NULL) {
-      rc = GetLastError();
-      LeaveCriticalSection(&m->lock);
-      return rc;
-    }
-    /* Insert the current thread in the waiting list (atomically) */
-    wait.event = ev;
-    wait.next = m->waiters;
-    m->waiters = &wait;
-    LeaveCriticalSection(&m->lock);
-    /* Wait for our event to be signaled.  There is no risk of lost
-       wakeup, since we inserted ourselves on the waiting list of m
-       before releasing m's critical section */
-    TRACE1("st_mutex_lock: blocking on event", ev);
-    if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
-      return GetLastError();
-    TRACE1("st_mutex_lock: restarted", m);
-    EnterCriticalSection(&m->lock);
-  }
+  if (m->status != UNLOCKED) {
+    if ((ev = st_event_for_current_thread()) == NULL) {
+      rc = GetLastError();
+      LeaveCriticalSection(&m->lock);
+      return rc;
+    }
+    while (m->status != UNLOCKED) {
+      /* Insert the current thread in the waiting list (atomically) */
+      wait.event = ev;
+      wait.next = m->waiters;
+      m->waiters = &wait;
+      LeaveCriticalSection(&m->lock);
+      /* Wait for our event to be signaled.  There is no risk of lost
+         wakeup, since we inserted ourselves on the waiting list of m
+         before releasing m's critical section */
+      TRACE1("st_mutex_lock: blocking on event", ev);
+      if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
+        return GetLastError();
+      TRACE1("st_mutex_lock: restarted", m);
+      EnterCriticalSection(&m->lock);
+    }
+  }

(ev needs only to be initialised once)

dra27 · 2020-09-08T12:46:44Z

otherlibs/systhreads/st_win32.h

+    next = curr->next;
+    TRACE1("st_mutex_unlock: waking up", curr->event);
+    if (! SetEvent(curr->event)) rc = GetLastError();
+    curr = next;


I know it's what was there before, but what's wrong with:

Suggested change

-    curr = next;
+    curr = curr->next;

and not have next at all?

dra27 · 2020-09-08T12:47:55Z

otherlibs/systhreads/st_win32.h

+  /* Remove them all from the waiting list */
+  m->waiters = NULL;
+  m->status = UNLOCKED;
+  LeaveCriticalSection(&m->lock);
  return 0;


Suggested change

-  return 0;
+  return rc;

dra27 · 2020-09-08T13:01:32Z

otherlibs/systhreads/st_win32.h

+struct st_mutex_struct {
+  CRITICAL_SECTION lock;        /* protect the data structure */
+  enum { UNLOCKED, LOCKED } status;
+  struct st_wait_list * waiters; /* list of threads waiting to lock it */


I think this is more heavyweight than it needs to be: a single manual-reset event should work for this, as only broadcast semantics are needed. SetEvent and ResetEvent would only be called with the lock held.

As an aside, I think it's more heavyweight than is needed for condition variables as well. I think st_condvar_struct could simply have two Event objects, one auto-reset and one manual-reset. st_condvar_wait would then do a WaitForMultipleObjects with bWaitAll == FALSE. You'd use the auto-reset event for st_condvar_signal (wakes up just one thread and resets the event) and the manual-reset event for st_condvar_broadcast (wakes them all up). But that's off-topic for this PR.

xavierleroy · 2020-09-09T12:50:32Z

Thanks again for the review. I'll probably wait a bit before acting on it, as it occurred to me that a simpler implementation might be possible, taking advantage of the master lock. I'm also still skeptical that this is the direction to follow.

xavierleroy · 2020-10-05T13:06:35Z

Now that #9846 is merged, with the error checking semantics for mutexes, this PR will not be pursued and can be closed.

Reimplement mutexes as 0-1 semaphores

46bceb5

So that it becomes safe to unlock a mutex from a thread other than the one that locked the mutex.

dra27 reviewed Sep 8, 2020

dra27 mentioned this pull request Sep 8, 2020

Reimplement Condition.t on Win32 #9894

Closed

xavierleroy marked this pull request as draft September 14, 2020 14:40

xavierleroy mentioned this pull request Sep 14, 2020

Reimplement mutexes and condition variables in OCaml using atomics + suspend/notify #9915

Closed

xavierleroy closed this Oct 5, 2020
