
Implement an efficient mutex in std::sync #11462

Closed
alexcrichton wants to merge 2 commits

Conversation

@alexcrichton (Member)

These commits introduce Mutex and StaticMutex types in libstd as first-class primitives. One of the major goals of this implementation is to be efficient, with pthread mutexes as the benchmark to meet.

Most of the comments can be found on the mutex module itself, and I won't repeat the details here. The good news is that this provides a different building block for libextra, which means that try_send_deferred can now be completely removed!

In profiling, native performance is on par with pthreads, and green performance is about 3x faster than pthreads, pending fixes to the scheduler so it doesn't attempt to fan out all workloads as quickly as possible.


impl StaticMutex {
    /// Attempts to grab this lock, see `Mutex::trylock`
    pub fn trylock<'a>(&'a self) -> Option<Guard<'a>> {

Contributor

Minor nit: I think it would be more rust-y if this were named try_lock().

@alexcrichton (Member Author)

Good point! I remembered we had things like try_recv and try_send, so I think try_lock is certainly the most appropriate name.
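
For reference, a minimal usage sketch of the renamed API (the module path and the MUTEX_INIT-style static initializer are assumptions about this PR, not verified against its commits):

// Hypothetical usage sketch of StaticMutex::try_lock().
use std::sync::mutex::{StaticMutex, MUTEX_INIT};

static LOCK: StaticMutex = MUTEX_INIT;

fn maybe_do_work() {
    match LOCK.try_lock() {
        Some(_guard) => {
            // Acquired without blocking; released when `_guard` is dropped.
        }
        None => {
            // Currently held elsewhere; skip the work or fall back to LOCK.lock().
        }
    }
}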

Much of the justification can be found in the large implementation comment in the module itself.

@bill-myers (Contributor)

Is it really necessary to have the "held" variable?

Isn't it possible to use the pthread mutex itself to figure out whether it's held, by unconditionally trylocking after the push in lock(), and by unlocking it before (not after, like in the current code!) the pop() in unlock()?

Note that this probably needs an explicit memory fence in both paths.

The invariant is of course that if the pthread lock is locked, everything on the queue will be woken up eventually.

You would run into the issue that "the memory of a mutex must be guaranteed to be valid until all unlocks have returned", but I don't think this is a problem in Rust, where you can't just free memory arbitrarily as in C and there is always a reference count protecting shared memory.

@alexcrichton (Member Author)

If the point of this mutex is to be the mutex, then the clause I mentioned is incredibly important in my opinion. That's the exact problem that a mutex is trying to solve (wait for a time at which it's OK to do something). Most Rust code wouldn't be susceptible to that condition, but any code that is susceptible is really going to need it (and otherwise won't have a solution in libstd).

You also can't pop outside of the lock because it's a single-consumer queue.

I would love to get rid of the held flag, but I thought about this for two days and couldn't come up with anything better.

@bill-myers (Contributor)

The problem that the mutex is trying to solve is mainly making sure that the code protected by a single mutex is globally serialized, not that you can kill the structure as soon as you unlock it and know that nothing else is going to lock it again.

Figuring out when you can kill the structure is generally the job of the reference count in the structure containing the mutex; I'm not even sure that all native OS mutexes provide that guarantee (EDIT: POSIX requires that it is provided, Windows seems to say nothing).

Rust code without unsafe code that breaks Rust semantics cannot be affected, because to call unlock() you need to pass an &mut to the mutex; if that &mut is reborrowed, then the original &mut it was reborrowed from continues to be semantically valid just after the unlock() call, so the memory must continue to be valid after the unlock(), making it irrelevant whether unlock() touches the memory after unlocking.

You also can't pop outside of the lock because it's a single-consumer queue.

That's a serious issue though.

A possible solution is to wrap the q.pop() in an additional pthread mutex used only for that, and also keep an atomic int with the queue size, which would allow skipping the lock and the pop when it is zero.
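
A rough sketch of that shape in terms of today's std types, with a Mutex<VecDeque<T>> standing in for the intrusive MPSC queue (names are hypothetical; only the "skip the lock when the counter is zero" fast path is the point here):

use std::collections::VecDeque;
use std::sync::Mutex;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wait queue: an atomic length lets consumers skip the pop lock
// entirely in the common case where nothing is queued.
struct WaitQueue<T> {
    len: AtomicUsize,
    queue: Mutex<VecDeque<T>>,
}

impl<T> WaitQueue<T> {
    fn new() -> WaitQueue<T> {
        WaitQueue { len: AtomicUsize::new(0), queue: Mutex::new(VecDeque::new()) }
    }

    fn push(&self, item: T) {
        let mut q = self.queue.lock().unwrap();
        q.push_back(item);
        // Update the counter while still holding the lock; the fast-path check
        // in pop() treats it only as a hint.
        self.len.fetch_add(1, Ordering::Release);
    }

    fn pop(&self) -> Option<T> {
        // Fast path: nothing queued, so don't take the lock at all.
        if self.len.load(Ordering::Acquire) == 0 {
            return None;
        }
        let item = self.queue.lock().unwrap().pop_front();
        if item.is_some() {
            self.len.fetch_sub(1, Ordering::Release);
        }
        item
    }
}

In the actual design the push side would stay on the lock-free MPSC push and only pop() would take the extra lock, but the counting trick is the same.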

@bill-myers (Contributor)

BTW, it's possible to create a mutex providing that guarantee out of one that doesn't, by boxing the non-guaranteeing mutex in an UnsafeArc and keeping a clone of the UnsafeArc in the lock guard for the mutex wrapper.

But as already argued no proper Rust code can need that guarantee, and anyway the sensible way of doing things is to put the mutex and the data it protects inside an Arc or RWARC, where the Arc or RWARC reference count determines when to destroy the object independently of the mutex.

So I think that guarantee is unimportant for Rust and should only be provided if it can be provided for free, which doesn't seem to be the case.
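
In terms of the std API as it eventually stabilized, that ownership pattern looks roughly like this (a sketch; Arc<Mutex<T>> here plays the role of the Arc/RWARC being discussed):

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The mutex and the data it protects live together inside the Arc; the
    // Arc's reference count, not the mutex, decides when the whole structure
    // is destroyed.
    let shared = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                *shared.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // The last clone to be dropped frees the mutex and the counter together.
    assert_eq!(*shared.lock().unwrap(), 4);
}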

This removes all usage of `try_send_deferred` and all related functionality.
Primarily, this builds all extra::sync primitives on top of `sync::Mutex`
instead of `unstable::sync::Mutex`.

@bill-myers (Contributor)

I think I have found a lockless algorithm that works.

Note that it requires support for 64-bit atomics even on 32-bit platforms, and either limits tasks to 2^32 or requires 128-bit atomics on 64-bit platforms.

However, while libstd doesn't currently implement those, 64-bit atomics are available on all modern 32-bit x86 and ARM systems (x86 >= Pentium Pro, ARM >= ARMv6).

The idea is that we hold the state in an atomic (u32, u32) called (queue_size, lockers).

Lockers is the number of threads that are about to call lock() on the internal lock or that hold the internal lock; queue_size is the number of queued tasks.

Basically the "held" variable is replaced by lockers != 0, and having queue_size in the same atomic value allows the held check and the enqueuing to be done atomically.

The "atomically" blocks are to be implemented with compare-and-swap, the queue using the intrusive MPSC queue already written.

// try_lock() {
//     if atomically {queue_size == 0 && lockers == 0} && lock.try_lock() {
//         if atomically {let b = queue_size == 0; if b {++lockers; true} else {false}} {
//             ok
//         } else {
//             lock.unlock()
//             fail
//         }
//     } else {
//         fail
//     }
// }
// 
// lock() {
//     if try_lock() {
//         return guard;
//     }
//
//     if can_block && atomically {let b = queue_size == 0; if b {++lockers; true} else {false}} {
//         lock.lock();
//     } else {
//         q.push();
//         if atomically {++queue_size; if(lockers == 0) {++lockers; true} else {false}} {
//             // this never blocks indefinitely
//             // this is because lockers was 0, so we have no one having or trying to get the lock
//             // and we atomically set queue_size to a positive value, so no one will start blocking
//             lock.lock();
//             atomically {--queue_size}
//             t = q.pop();
//             if t != ourselves {
//                 t.wakeup();
//                 go to sleep
//             }
//         } else {
//             go to sleep
//         }
//     }
// }
// 
// unlock() {
//     if atomically {if(queue_size != 0 && lockers == 1) {--queue_size; true} else {--lockers; false}} {
//         t = q.pop();
//         t.wakeup();
//     } else {
//         lock.unlock()
//     }
// }

Transitions:

(0, lockers) => (0, lockers + 1) // task locks with nothing queued
(queue_size, lockers > 1) => (queue_size, lockers - 1) // task unlocks with native task waiting
(0, 1) => (0, 0) // task unlocks with nothing else waiting

(queue_size, lockers) => (queue_size + 1, lockers) // task queues up
(queue_size > 0, 1) => (queue_size - 1, 1) // task unlocks, waking up queued task

EDIT: changed to now also be globally fair

The idea is that if lockers > 1, we don't pop the queue, but rather just unlock, which means that queued native tasks get run.

New native tasks instead queue up due to queue_size != 0

EDIT 2: added try_lock, fixed lock to use it and not have a race that caused blocking on green tasks

EDIT 3: and if we replace the MPSC queue implementation with one using list reversal, we can remove the last yields

EDIT 4: implemented this in pull #11520, suggesting to consider that pull request as an improvement over the code in this one (assuming the algorithm indeed works)
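
For concreteness, here is a minimal sketch of the packed (queue_size, lockers) state and one of the "atomically" blocks above, written against today's AtomicU64 (the type and method names are hypothetical):

use std::sync::atomic::{AtomicU64, Ordering};

// Packed state: queue_size in the high 32 bits, lockers in the low 32 bits.
struct State(AtomicU64);

fn pack(queue_size: u32, lockers: u32) -> u64 {
    ((queue_size as u64) << 32) | lockers as u64
}

fn unpack(v: u64) -> (u32, u32) {
    ((v >> 32) as u32, v as u32)
}

impl State {
    // The "atomically {let b = queue_size == 0; if b {++lockers; true} else {false}}"
    // block from the pseudocode, as a compare-and-swap loop.
    fn try_become_locker(&self) -> bool {
        let mut cur = self.0.load(Ordering::Relaxed);
        loop {
            let (queue_size, lockers) = unpack(cur);
            if queue_size != 0 {
                return false;
            }
            let next = pack(queue_size, lockers + 1);
            match self.0.compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => return true,
                // Another task changed the state concurrently; retry with the observed value.
                Err(observed) => cur = observed,
            }
        }
    }
}

The other "atomically" blocks follow the same load / modify / compare_exchange_weak loop, only with different conditions and updates.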

@alexcrichton (Member Author)

Closing; adding another crucial Thread::yield_now() makes me cringe, and #11520 looks like a much better alternative.

@alexcrichton deleted the mutex branch January 23, 2014 18:22