Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Rewrite the sync module #19274

Merged
merged 2 commits into from
Dec 5, 2014
Merged

Conversation

alexcrichton
Copy link
Member

This commit is a reimplementation of std::sync to be based on the
system-provided primitives wherever possible. The previous implementation was
fundamentally built on top of channels, and as part of the runtime reform it has
become clear that this is not the level of abstraction that the standard level
should be providing. This rewrite aims to provide as thin of a shim as possible
on top of the system primitives in order to make them safe.

The overall interface of the std::sync module has in general not changed, but
there are a few important distinctions, highlighted below:

  • The condition variable type, Condvar, has been separated out of a Mutex.
    A condition variable is now an entirely separate type. This separation
    benefits users who only use one mutex, and provides a clearer distinction of
    who's responsible for managing condition variables (the application).

  • All of Condvar, Mutex, and RWLock are now directly built on top of
    system primitives rather than using a custom implementation. The Once,
    Barrier, and Semaphore types are still built upon these abstractions of
    the system primitives.

  • The Condvar, Mutex, and RWLock types all have a new static type and
    constant initializer corresponding to them. These are provided primarily for C
    FFI interoperation, but are often useful to otherwise simply have a global
    lock. The types, however, will leak memory unless destroy() is called on
    them, which is clearly documented.

  • The fundamental architecture of this design is to provide two separate layers.
    The first layer is that exposed by sys_common which is a cross-platform
    bare-metal abstraction of the system synchronization primitives. No attempt is
    made at making this layer safe, and it is quite unsafe to use! It is currently
    not exported as part of the API of the standard library, but the stabilization
    of the sys module will ensure that these will be exposed in time. The
    purpose of this layer is to provide the core cross-platform abstractions if
    necessary to implementors.

    The second layer is the layer provided by std::sync which is intended to be
    the thinnest possible layer on top of sys_common which is entirely safe to
    use. There are a few concerns which need to be addressed when making these
    system primitives safe:

    • Once used, the OS primitives can never be moved. This means that they
      essentially need to have a stable address. The static primitives use
      &'static self to enforce this, and the non-static primitives all use a
      Box to provide this guarantee.
    • Poisoning is leveraged to ensure that invalid data is not accessible from
      other tasks after one has panicked.

    In addition to these overall blanket safety limitations, each primitive has a
    few restrictions of its own:

    • Mutexes and rwlocks can only be unlocked from the same thread that they
      were locked by. This is achieved through RAII lock guards which cannot be
      sent across threads.
    • Mutexes and rwlocks can only be unlocked if they were previously locked.
      This is achieved by not exposing an unlocking method.
    • A condition variable can only be waited on with a locked mutex. This is
      achieved by requiring a MutexGuard in the wait() method.
    • A condition variable cannot be used concurrently with more than one mutex.
      This is guaranteed by dynamically binding a condition variable to
      precisely one mutex for its entire lifecycle. This restriction may be able
      to be relaxed in the future (a mutex is unbound when no threads are
      waiting on the condvar), but for now it is sufficient to guarantee safety.
  • Condvars support timeouts for their blocking operations. The
    implementation for these operations is provided by the system.

Due to the modification of the Condvar API, removal of the std::sync::mutex
API, and reimplementation, this is a breaking change. Most code should be fairly
easy to port using the examples in the documentation of these primitives.

[breaking-change]

Closes #17094
Closes #18003

@rust-highfive
Copy link
Collaborator

warning Warning warning

  • These commits modify unsafe code. Please review it carefully!

@alexcrichton
Copy link
Member Author

Note that this is duplicating a lot of implementation details of librustrt and libsync as of this red hot minute. Once these two libs are merged into std I'll delete all the duplication and make sure that we've only got one copy of the definitions. This PR is meant to reflect the final state of the stdlib for std::sync and its implementation details.

@sfackler
Copy link
Member

Condvars and RWLocks now support timeouts for their blocking operations. The
implementation for these operations is provided by the system.

Woo!

/// specified duration.
///
/// The semantics of this function are equivalent to `wait()` except that
/// the thread will be blocked for no longer than `dur`. If the wait timed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to qualify this a bit - it's not really possible in general to unblock the thread after exactly the right duration.

@sfackler
Copy link
Member

I didn't see any timeout stuff in the RWLock impl, btw

@alexcrichton alexcrichton force-pushed the rewrite-sync branch 2 times, most recently from 48e3d43 to 5420ea8 Compare November 24, 2014 21:54
@alexcrichton
Copy link
Member Author

Ah yes sorry, I mis-remembered what was added to rwlocks. RWLocks now support try_read and try_write, not timeouts just yet. I'd like to provide a unix extension trait to provide these functions, but Windows at least does not implement locking with a timeout.

@alexcrichton
Copy link
Member Author

One thing I also just remembered, this removes the ability to use a condition variable with a RWLock for now. We can build something such as C++'s condition_variable_any in the future, but this PR is just aimed at exposing the system primitives for now.

@alexcrichton
Copy link
Member Author

Now that #19255 has merged I've rebased on top and removed all implementations of primitives based on channels. At the same time I have also remove all concurrent queues from the public interface as I don't think we're going to be able to stabilize them before 1.0. Two are kept as implementation details of channels, but the chase-lev deque and mpmc queue have been entirely removed.

// We need a while loop to guard against spurious wakeups.
// http://en.wikipedia.org/wiki/Spurious_wakeup
while local_gen == lock.generation_id &&
lock.count < self.num_threads {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this second comparison needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly just copy-pasted from the existing implementation, but I believe the first comparison is to determine whether this was the final thread (the notifier) or a thread which needs to wait. The second comparison is redundant the first iteration of the loop, but all other iterations it's intended to help with spurious wakeups.

}
}

/// Block the current thread until all tasks has rendezvoused here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"tasks have"


#[test]
fn test_sem_runtime_friendly_blocking() {
// Force the runtime to schedule two threads on the same sched_loop.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is out of date :P

@aturon
Copy link
Member

aturon commented Nov 26, 2014

OK, I've finished reading this over and it looks great! (I didn't scrutinize the OS bindings.)

I think there are a lot of questions going forward about the organization of this module, static variants, and additional system bindings, but this looks like a great step forward. I'm happy for it to land for now, and then we can talk through exactly what we want to stabilize a bit later.

r=me after nits and a rebase.

@sfackler
Copy link
Member

Needs a rebase

assert_eq!(*lock2, 1);
tx.send(());
});
rx.recv();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rx.recv() is nice because the old version couldn't catch a possible assertion failure in the spawned proc. Can we perhaps define a macro to standardize this pattern and to make sure there's no longer any instances of meaningless assert_eq!?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be somewhat difficult to refactor out into a standard pattern, but for now I'm going to focus on landing this to unblock some more runtime removal.


// First, figure out what time it currently is
let mut tv = libc::timeval { tv_sec: 0, tv_usec: 0 };
let r = ffi::gettimeofday(&mut tv, 0 as *mut _);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct. You need to use the monotonic clock support. The system time can and will often change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestinger Please remember that Linux CLOCK_MONOTONIC is susceptible to machine hibernation during which TSC doesn't count up. We would have to wait for futex(2) to support CLOCK_BOOTTIME for truly correct timeout operation. So IMHO it's acceptable to live with this code meantime (unless we don't forget that we're repeating the same bugs found in Java bug tracker for long...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't personally want to debate these details on this PR as this is blocking more runtime removal and reform. I would like to prioritize pushing through this rewrite as much as possible so the wait_timeout functions are now non-public with comments pointing at this PR to review the implementation before making them public.

@thestinger
Copy link
Contributor

It's painful (but possible) to do timeout support properly on Windows. It should be left out if it's not going to be done correctly because having incorrect implementations is worse than not having the feature.

This commit is a reimplementation of `std::sync` to be based on the
system-provided primitives wherever possible. The previous implementation was
fundamentally built on top of channels, and as part of the runtime reform it has
become clear that this is not the level of abstraction that the standard level
should be providing. This rewrite aims to provide as thin of a shim as possible
on top of the system primitives in order to make them safe.

The overall interface of the `std::sync` module has in general not changed, but
there are a few important distinctions, highlighted below:

* The condition variable type, `Condvar`, has been separated out of a `Mutex`.
  A condition variable is now an entirely separate type. This separation
  benefits users who only use one mutex, and provides a clearer distinction of
  who's responsible for managing condition variables (the application).

* All of `Condvar`, `Mutex`, and `RWLock` are now directly built on top of
  system primitives rather than using a custom implementation. The `Once`,
  `Barrier`, and `Semaphore` types are still built upon these abstractions of
  the system primitives.

* The `Condvar`, `Mutex`, and `RWLock` types all have a new static type and
  constant initializer corresponding to them. These are provided primarily for C
  FFI interoperation, but are often useful to otherwise simply have a global
  lock. The types, however, will leak memory unless `destroy()` is called on
  them, which is clearly documented.

* The `Condvar` implementation for an `RWLock` write lock has been removed. This
  may be added back in the future with a userspace implementation, but this
  commit is focused on exposing the system primitives first.

* The fundamental architecture of this design is to provide two separate layers.
  The first layer is that exposed by `sys_common` which is a cross-platform
  bare-metal abstraction of the system synchronization primitives. No attempt is
  made at making this layer safe, and it is quite unsafe to use! It is currently
  not exported as part of the API of the standard library, but the stabilization
  of the `sys` module will ensure that these will be exposed in time. The
  purpose of this layer is to provide the core cross-platform abstractions if
  necessary to implementors.

  The second layer is the layer provided by `std::sync` which is intended to be
  the thinnest possible layer on top of `sys_common` which is entirely safe to
  use. There are a few concerns which need to be addressed when making these
  system primitives safe:

    * Once used, the OS primitives can never be **moved**. This means that they
      essentially need to have a stable address. The static primitives use
      `&'static self` to enforce this, and the non-static primitives all use a
      `Box` to provide this guarantee.

    * Poisoning is leveraged to ensure that invalid data is not accessible from
      other tasks after one has panicked.

  In addition to these overall blanket safety limitations, each primitive has a
  few restrictions of its own:

    * Mutexes and rwlocks can only be unlocked from the same thread that they
      were locked by. This is achieved through RAII lock guards which cannot be
      sent across threads.

    * Mutexes and rwlocks can only be unlocked if they were previously locked.
      This is achieved by not exposing an unlocking method.

    * A condition variable can only be waited on with a locked mutex. This is
      achieved by requiring a `MutexGuard` in the `wait()` method.

    * A condition variable cannot be used concurrently with more than one mutex.
      This is guaranteed by dynamically binding a condition variable to
      precisely one mutex for its entire lifecycle. This restriction may be able
      to be relaxed in the future (a mutex is unbound when no threads are
      waiting on the condvar), but for now it is sufficient to guarantee safety.

* Condvars now support timeouts for their blocking operations. The
  implementation for these operations is provided by the system.

Due to the modification of the `Condvar` API, removal of the `std::sync::mutex`
API, and reimplementation, this is a breaking change. Most code should be fairly
easy to port using the examples in the documentation of these primitives.

[breaking-change]

Closes rust-lang#17094
Closes rust-lang#18003
@alexcrichton alexcrichton merged commit c3adbd3 into rust-lang:master Dec 5, 2014
@alexcrichton alexcrichton deleted the rewrite-sync branch December 5, 2014 23:50
rozaliev added a commit to rozaliev/mio that referenced this pull request Dec 8, 2014
mpmc was removed from stdlib, so we just vendor it for now
(rust-lang/rust#19274)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants