Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use "semaphore-like" mutexes in the threads library #9863

Closed
wants to merge 1 commit into from

Conversation

xavierleroy
Copy link
Contributor

This is a "for reference" pull request, documenting an alternative to #9846

The problem with mutexes pointed out in #9757 is that Mutex.unlock, as implemented today using POSIX threads mutexes or Win32 critical sections, can fail or crash or misbehave arbitrarily if it is called on a mutex that is held by another thread.

#9846 addresses this issue by reliably raising a Sys_error exception in this case of bad unlocking.

The present PR addresses this issue by making it safe for any thread to unlock a locked mutex. In effect, this causes OCaml mutexes to behave like 0-1 semaphores. The implementation is relatively simple, but I would expect it to be significantly slower than native POSIX threads error-checking mutexes.

The only reason to consider this PR is to break less OCaml code than the #9846 approach. (Even though code that relies on unlocking mutexes from other threads is already broken, in my opinion.)

So that it becomes safe to unlock a mutex from a thread other than
the one that locked the mutex.
@gadmm
Copy link
Contributor

gadmm commented Sep 8, 2020

For the record, an examples where unlock happened in another thread was reported there https://discuss.ocaml.org/t/mutex-lock-resource-deadlock-avoided-on-freebsd-12-1-ocaml-4-09-1-lwt-4-2-1/6206 and it would indeed be nice to know whether this shows a real example of use of the semaphore behaviour in libraries.

Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise it's not intended for merging, but I have "reviewed".

For the Win32 side: Windows XP is long dead, which means I think we can risk formally breaking it - condition variables (see InitializeConditionVariable and SleepConditionVariableCS) would certainly simplify the Win32 version of this and I think it should also be faster, since it would avoid using the heavier event objects (in the same way as the mutexes are made faster by using critical sections not actual Windows mutexes)

Comment on lines +269 to +278
switch (m->status) {
case LOCKED_WAITED:
pthread_cond_broadcast(&m->free);
/* fallthrough */
case LOCKED:
m->status = UNLOCKED;
break;
case UNLOCKED:
break; /* we could return an error code */
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, were this merged, this would be better in an inline function st_release_mutex or something so that it can be used...

Comment on lines +321 to +331
/* Start releasing the mutex */
switch (m->status) {
case LOCKED_WAITED:
pthread_cond_broadcast(&m->free);
/* fallthrough */
case LOCKED:
m->status = UNLOCKED;
break;
case UNLOCKED:
break; /* we could return an error code */
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... here

Comment on lines +220 to +240
while (m->status != UNLOCKED) {
ev = st_event_for_current_thread();
if (ev == NULL) {
rc = GetLastError();
LeaveCriticalSection(&m->lock);
return rc;
}
/* Insert the current thread in the waiting list (atomically) */
wait.event = ev;
wait.next = m->waiters;
m->waiters = &wait;
LeaveCriticalSection(&m->lock);
/* Wait for our event to be signaled. There is no risk of lost
wakeup, since we inserted ourselves on the waiting list of m
before releasing m's critical section */
TRACE1("st_mutex_lock: blocking on event", ev);
if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
return GetLastError();
TRACE1("st_mutex_lock: restarted", m);
EnterCriticalSection(&m->lock);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while (m->status != UNLOCKED) {
ev = st_event_for_current_thread();
if (ev == NULL) {
rc = GetLastError();
LeaveCriticalSection(&m->lock);
return rc;
}
/* Insert the current thread in the waiting list (atomically) */
wait.event = ev;
wait.next = m->waiters;
m->waiters = &wait;
LeaveCriticalSection(&m->lock);
/* Wait for our event to be signaled. There is no risk of lost
wakeup, since we inserted ourselves on the waiting list of m
before releasing m's critical section */
TRACE1("st_mutex_lock: blocking on event", ev);
if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
return GetLastError();
TRACE1("st_mutex_lock: restarted", m);
EnterCriticalSection(&m->lock);
}
if (m->status != UNLOCKED)
if ((ev = st_event_for_current_thread()) == NULL) {
rc = GetLastError();
LeaveCriticalSection(&m->lock);
return rc;
}
while (m->status != UNLOCKED) {
/* Insert the current thread in the waiting list (atomically) */
wait.event = ev;
wait.next = m->waiters;
m->waiters = &wait;
LeaveCriticalSection(&m->lock);
/* Wait for our event to be signaled. There is no risk of lost
wakeup, since we inserted ourselves on the waiting list of m
before releasing m's critical section */
TRACE1("st_mutex_lock: blocking on event", ev);
if (WaitForSingleObject(ev, INFINITE) == WAIT_FAILED)
return GetLastError();
TRACE1("st_mutex_lock: restarted", m);
EnterCriticalSection(&m->lock);
}
}

(ev needs only to be initialised once)

next = curr->next;
TRACE1("st_mutex_unlock: waking up", curr->event);
if (! SetEvent(curr->event)) rc = GetLastError();
curr = next;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's what was there before, but what's wrong with:

Suggested change
curr = next;
curr = curr->next;

and not have next at all?

/* Remove them all from the waiting list */
m->waiters = NULL;
m->status = UNLOCKED;
LeaveCriticalSection(&m->lock);
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return 0;
return rc;

struct st_mutex_struct {
CRITICAL_SECTION lock; /* protect the data structure */
enum { UNLOCKED, LOCKED } status;
struct st_wait_list * waiters; /* list of threads waiting to lock it */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is more heavyweight than it needs to be - a single manual reset event should work for this, as only broadcast semantics are needed - SetEvent and ResetEvent would only be called with the lock.

As an aside, I think it's more heavyweight than is needed for condition variables as well - I think st_condvar_struct could simply have two Event objects, one auto-reset and one manual-reset - st_condvar_wait would then do a WaitForMultipleObjects with bWaitAll == FALSE and you'd use the auto-reset event for st_condvar_signal (wakes up just one thread and resets the event) and the manual-reset event for st_condvar_broadcast (wakes them all up), but that's O/T for this PR.

@xavierleroy
Copy link
Contributor Author

Thanks again for the review. I'll probably wait a bit before acting on it, as it occurred to me that a simpler implementation might be possible, taking advantage of the master lock. I'm also still skeptical that this is the direction to follow.

@xavierleroy
Copy link
Contributor Author

Now that #9846 is merged, with the error checking semantics for mutexes, this PR will not be pursued and can be closed.

@xavierleroy xavierleroy closed this Oct 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants