
Replace `std::task` with `std::thread`, remove librustrt, make final "runtime" features lazily initialize #19654

Merged · 24 commits · Dec 19, 2014

aturon (Member) commented Dec 9, 2014

This PR substantially narrows the notion of a "runtime" in Rust, and allows calling into Rust code directly without any setup or teardown.

After this PR, the basic "runtime support" in Rust will consist of:

  • Unwinding and backtrace support
  • Stack guards

Other support, such as helper threads for timers or the notion of a "current thread", is initialized automatically upon first use.

When using Rust in an embedded context, it should now be possible to call a Rust function directly as a C function with absolutely no setup, though in that case panics will cause the process to abort. In this regard, the C/Rust interface will look much like the C/C++ interface.

In more detail, this PR:

  • Merges `librustrt` back into `std::rt`, undoing the facade. While doing so, it removes a substantial amount of redundant functionality (such as mutexes defined in the `rt` module). Code using `librustrt` can now call into `std::rt` to e.g. start executing Rust code with unwinding support.
  • Allows all runtime data to be initialized lazily, including the "current thread", the "at_exit" infrastructure, and the "args" storage.
  • Deprecates and largely removes `std::task` along with the widespread requirement that there be a "current task" for many APIs in `std`. The entire task infrastructure is replaced with `std::thread`, which provides a more standard API for manipulating and creating native OS threads. In particular, it's possible to join on a created thread, and to get a handle to the currently-running thread. In addition, threads are equipped with some basic blocking support in the form of `park`/`unpark` operations (following a tradition in some OSes as well as the JVM). See the `std::thread` documentation for more details.
  • Channels are refactored to use a new internal blocking infrastructure that itself sits on top of `park`/`unpark`.

One important change here is that a Rust program ends when its main thread does, following most threading models. On the other hand, threads will often be created with an RAII-style join handle that will re-institute blocking semantics naturally (and with finer control).
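A minimal sketch of the resulting thread model, written against today's `std::thread` (the direct descendant of the API introduced here; some names, such as the join guard type, have shifted since this PR):

```rust
// Illustrative sketch only; not code from this PR.
use std::thread;

fn main() {
    // A handle to the currently-running thread is always available.
    println!("running on {:?}", thread::current().name());

    // Spawn a native OS thread; joining retrieves its result.
    let handle = thread::spawn(|| 40 + 2);
    assert_eq!(handle.join().unwrap(), 42);

    // A thread whose handle is simply dropped runs detached; the process
    // still exits when the main thread returns, as described above.
    let _detached = thread::spawn(|| {
        loop {
            thread::park(); // basic blocking support: park/unpark
        }
    });
}
```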

This is very much a:

[breaking-change]

Closes #18000
r? @alexcrichton

aturon (Member, Author) commented Dec 9, 2014

Note: this PR does not attempt to exhaustively replace the task terminology with thread, nor does it clear out all use of the now-deprecated APIs. Both are left to follow-up work.

aturon (Member, Author) commented Dec 9, 2014

Note also: the implementation of park/unpark is currently pretty silly: it just uses a lock and condvar. In the future these can be implemented in more clever ways, e.g. following the JVM.
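For reference, the lock-and-condvar strategy amounts to something like the following sketch (written with today's std::sync types rather than the internal primitives this PR actually uses):

```rust
// A lock + condvar parker, roughly the strategy described above.
// Illustrative sketch, not the code from this PR.
use std::sync::{Condvar, Mutex};

struct Parker {
    permit: Mutex<bool>, // true once unpark has granted a wakeup
    cvar: Condvar,
}

impl Parker {
    fn new() -> Parker {
        Parker { permit: Mutex::new(false), cvar: Condvar::new() }
    }

    fn park(&self) {
        let mut permit = self.permit.lock().unwrap();
        // Loop to tolerate spurious wakeups from the condvar.
        while !*permit {
            permit = self.cvar.wait(permit).unwrap();
        }
        *permit = false; // consume the permit
    }

    fn unpark(&self) {
        *self.permit.lock().unwrap() = true;
        self.cvar.notify_one();
    }
}
```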

eddyb (Member) commented Dec 9, 2014

@@ -92,7 +92,7 @@ impl<'a, T, Sized? B> BorrowFrom<Cow<'a, T, B>> for B where B: ToOwned<T> {

/// Trait for moving into a `Cow`
pub trait IntoCow<'a, T, Sized? B> {
/// Moves `self` into `Cow`
/// Moves `serlf` into `Cow`

@@ -473,18 +473,22 @@ pub fn monitor(f: proc():Send) {
static STACK_SIZE: uint = 32000000; // 32MB

let (tx, rx) = channel();
let w = io::ChanWriter::new(tx);
let mut w = Some(io::ChanWriter::new(tx)); // option dance

alexcrichton (Member) · Dec 9, 2014

How come this dance is necessary? I would have thought that it could move into the proc below

engstad · Dec 9, 2014

If I am not mistaken, this is due to a limitation of closures not being able to capture `w` directly. I've seen this trick in several places in the code base and hope it will disappear at some point - perhaps with unboxed closures?
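A self-contained sketch of the "option dance" being discussed (`run` here is a hypothetical stand-in for any API that takes an FnMut-style closure):

```rust
// An FnMut-style closure cannot move a captured value out of its
// environment, so the value is wrapped in an Option and taken on the
// (single) call. Hypothetical example, not code from this PR.
fn run<F: FnMut()>(mut f: F) {
    f();
}

fn main() {
    let writer = String::from("some owned resource");
    let mut slot = Some(writer);
    run(|| {
        // Move the resource out of the captured Option on first call.
        let writer = slot.take().expect("closure called more than once");
        println!("using {}", writer);
    });
}
```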

@@ -868,7 +868,11 @@ fn run_work_multithreaded(sess: &Session,
let diag_emitter = diag_emitter.clone();
let remark = sess.opts.cg.remark.clone();

let future = TaskBuilder::new().named(format!("codegen-{}", i)).try_future(proc() {
let (tx, rx) = channel();
let mut tx = Some(tx);

alexcrichton (Member) · Dec 9, 2014

Similar to above, how come this needs an option?

pub struct WaitToken {
inner: Arc<Inner>,
no_send: NoSend,
}

alexcrichton (Member) · Dec 9, 2014

Should this also require NoSync to prevent it from being shared amongst fork-join parallelism? I'm thinking of a function like this, but it may only be safe with Send + Sync anyway, so this is probably enough.

reem (Contributor) · Dec 9, 2014

Right now the only safe bound for that function is Send + Sync, since Sync is overly permissive (it allows stuff like thread-local references or other non-transferable types).

aturon (Member, Author) commented Dec 9, 2014

One piece of feedback I'd like: is `spawn` plus `with_join` the right setup? That makes detached threads the "default". We could alternatively do `spawn` and `spawn_detached`, but I worry this will be surprising to people.

no_send: NoSend,
}

pub fn tokens() -> (WaitToken, SignalToken) {

alexcrichton (Member) · Dec 9, 2014

Out of curiosity, doesn't this abstraction break if the thread is blocked on anything other than the WaitToken while some SignalTokens are signaled? I'm thinking that if you call tokens, signal the token, manually call Thread::park(), and then call wait, you'd sleep forever, right? That seems pretty silly to do, though, and I imagine we could document this example when it's a public-facing API.

aturon (Author, Member) · Dec 9, 2014

It's correct that you won't get unparked by that signal, but you could be unparked by a different signal. But in any case, you're probably doing something wrong.

(Note that the intent of structures like this and park/unpark is for building blocking concurrency abstractions like channels; they aren't a good fit for application-level concurrent programming. See http://gee.cs.oswego.edu/dl/papers/aqs.pdf)
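A self-contained illustration of the kind of one-shot blocking handoff these tokens provide, written with today's public park/unpark API (the WaitToken/SignalToken pair itself is internal to the channel implementation):

```rust
// Sketch of a one-shot "wait until signalled" handoff built on park/unpark,
// the same shape the internal WaitToken/SignalToken pair provides.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let signalled = Arc::new(AtomicBool::new(false));
    let flag = signalled.clone();
    let waiter = thread::current();

    thread::spawn(move || {
        flag.store(true, Ordering::SeqCst); // roughly SignalToken::signal()
        waiter.unpark();
    });

    // Roughly WaitToken::wait(): re-check the flag in a loop, since park()
    // may also wake spuriously or from unrelated unparks.
    while !signalled.load(Ordering::SeqCst) {
        thread::park();
    }
    println!("signalled");
}
```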

/// The state field is protected by this mutex
lock: NativeMutex,
state: UnsafeCell<State<T>>,
lock: Mutex<State<T>>,


aturon (Author, Member) · Dec 9, 2014

You know, I probably could've saved myself a lot of pain by also replacing the queue and buf data structures with something from std. Mutexes around Copy data ... bad news.

_ => unreachable!()
}
});
let (wait_token, signal_token) = blocking::tokens();

alexcrichton (Member) · Dec 9, 2014

I somewhat fear the perf impact of creating a new set of tokens each time a task blocks, but I'd personally rather land the runtime removal and optimize later.

aturon (Author, Member) · Dec 9, 2014

Same here. FWIW, it would be quite easy to reinstate the BlockedTask-style enum.

aturon (Author, Member) · Dec 9, 2014

Although note, this is basically just one allocation per context switch. Probably a drop in the bucket.

alexcrichton (Member) · Dec 9, 2014

agreed, I'd want to measure before taking any action

@@ -531,7 +522,7 @@ mod tests {
assert_eq!(r.read_to_string().unwrap(), "hello!\n");
}

#[test]
//#[test]

alexcrichton (Member) · Dec 9, 2014

accident?

aturon (Author, Member) · Dec 9, 2014

Nope: it's the reason this PR is titled [WIP].

alexcrichton (Member) · Dec 9, 2014

aha! I see :)


pub mod task;
#[allow(missing_docs)]

alexcrichton (Member) · Dec 9, 2014

Could this be removed now?


static mut GLOBAL_ARGS_PTR: uint = 0;
static LOCK: StaticNativeMutex = NATIVE_MUTEX_INIT;
static LOCK: NativeMutex = MUTEX_INIT;

alexcrichton (Member) · Dec 9, 2014

NativeMutex is gone, right?

aturon (Author, Member) · Dec 9, 2014

As you can see, I develop on macos.

}
}

pub fn push(f: proc():Send) {
INIT.doit(init);

unsafe {
// Note that the check against 0 for the queue pointer is not atomic at
// all with respect to `run`, meaning that this could theoretically be a
// use-after-free. There's not much we can do to protect against that,
// however. Let's just assume a well-behaved runtime and go from there!
rtassert!(!RUNNING.load(atomic::SeqCst));

alexcrichton (Member) · Dec 9, 2014

Isn't this an important assertion to keep? If you re-enqueue some cleanup while we're cleaning up we'll just leak it?

alexcrichton (Member) · Dec 9, 2014

Or... you'll hit the assertion below, nevermind!

args::init(argc, argv);
sys::thread::guard::init();
sys::stack_overflow::init();
unwind::register(failure::on_fail);

alexcrichton (Member) · Dec 9, 2014

Do you think this could be moved to a Once?

aturon (Author, Member) · Dec 9, 2014

The whole init function? It's an interesting idea...

alexcrichton (Member) · Dec 9, 2014

Maybe not the whole function, but at least unwind::register. The others may be more difficult to move around.

alexcrichton (Member) · Dec 9, 2014

I'm also fine developing our story here over time and not hammering it out all immediately.
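For illustration, routing just the registration through a Once might look like the sketch below (using today's std::sync::Once spelling; the API at the time of this PR spelled it `INIT.doit(...)`, as in the diff above, and the handler body here is a hypothetical stand-in for `unwind::register(failure::on_fail)`):

```rust
// Hypothetical sketch: registering the failure handler lazily, at most once,
// instead of unconditionally in the startup routine.
use std::sync::Once;

static REGISTER_FAILURE_HANDLER: Once = Once::new();

fn ensure_failure_handler() {
    REGISTER_FAILURE_HANDLER.call_once(|| {
        // Stand-in for the real registration call.
        println!("failure handler registered");
    });
}

fn main() {
    ensure_failure_handler();
    ensure_failure_handler(); // the registration body runs only the first time
}
```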

///
/// This function will only return once *all* native threads in the system have
/// exited.
// FIXME: this should be unsafe

alexcrichton (Member) · Dec 9, 2014

Feel free to make it so :)

alexcrichton (Member) · Dec 9, 2014

Also feel free to merge this upwards into lang_start

aturon (Author, Member) · Dec 9, 2014

I tried to mark it unsafe and ran into a weird issue, so I left it alone. I can try it again.

@@ -162,5 +195,7 @@ pub fn start(argc: int, argv: *const *const u8, main: proc()) -> int {
/// Invoking cleanup while portions of the runtime are still in use may cause
/// undefined behavior.
pub unsafe fn cleanup() {
rustrt::cleanup();
args::cleanup();
sys::stack_overflow::cleanup();

alexcrichton (Member) · Dec 9, 2014

Perhaps both args and stack_overflow could use at_exit to schedule cleanups? That way we only have to get one thing working with atexit to get it all to work.

if ret as uint == 0 {
// be sure to not leak the closure
let _p: Box<proc():Send> = mem::transmute(arg);
panic!("failed to spawn native thread: {}", ret);

sfackler (Member) · Dec 9, 2014

Should there be an option to handle this error rather than panic?

// Indicates whether we should perform expensive sanity checks, including rtassert!
//
// FIXME: Once the runtime matures remove the `true` below to turn off rtassert,
// etc.

alexcrichton (Member) · Dec 9, 2014

This comment makes me chuckle

}
}

impl Drop for Handler {

sfackler (Member) · Dec 9, 2014

Does this need to impl Drop explicitly or can it just contain NoCopy?

thread: Thread,
}

thread_local!(static THREAD_INFO: RefCell<Option<ThreadInfo>> = RefCell::new(None))

alexcrichton (Member) · Dec 9, 2014

Instead of an Option, perhaps this could have the lazy initialization expression here?

aturon (Author, Member) · Dec 14, 2014

The lazy init expression is only used if you access the thread info from a thread not launched via Rust APIs; otherwise it is explicitly initialized with an already-created Thread.

alexcrichton (Member) · Dec 14, 2014

Ah yeah after looking below I see that this is necessary, carry on!
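The pattern under discussion, sketched as a standalone example (names here are illustrative, not the actual std internals):

```rust
// A thread-local RefCell<Option<T>>: threads spawned through the Rust APIs
// fill it in explicitly; a foreign thread gets a default value lazily on
// first access. Illustrative sketch only.
use std::cell::RefCell;

struct ThreadInfo {
    name: Option<String>,
}

thread_local!(static THREAD_INFO: RefCell<Option<ThreadInfo>> = RefCell::new(None));

fn with_info<R>(f: impl FnOnce(&mut ThreadInfo) -> R) -> R {
    THREAD_INFO.with(|slot| {
        let mut slot = slot.borrow_mut();
        // Lazy path for threads not launched via the Rust spawn APIs.
        let info = slot.get_or_insert_with(|| ThreadInfo { name: None });
        f(info)
    })
}

fn main() {
    with_info(|info| info.name = Some("main".to_string()));
    with_info(|info| println!("{:?}", info.name));
}
```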


impl ThreadInfo {
fn with<R>(f: |&mut ThreadInfo| -> R) -> R {
THREAD_INFO.with(|c| {

alexcrichton (Member) · Dec 9, 2014

This borrow worries me a little in that once the THREAD_INFO has its destructor run, all blocking services will abort the process because they won't be able to access the current thread.

I'm not really sure of a great way to get around that, but perhaps this could yield a nicer panic message?

aturon (Author, Member) · Dec 14, 2014

I can imagine a few ways to do better, but they involve bubbling up this possibility in APIs, e.g. by making Thread::current() yield an Option. I'm not sure how much to worry about this edge case.

I'm also not sure how to generate a better panic message here -- isn't with going to cause the panic? Looking at the docs, it doesn't seem like .destroyed covers all the cases that might lead to this panic.

alexcrichton (Member) · Dec 14, 2014

For generating a better panic message you'd want to check `.destroyed()` beforehand, yeah. I think that may be good enough for now; we could possibly add more support later for something like repeated re-construction if necessary.

let hash = msg.bytes().fold(0, |accum, val| accum + (val as uint) );
let quote = match hash % 10 {
0 => "
It was from the artists and poets that the pertinent answers came, and I

sfackler (Member) · Dec 9, 2014

Glad to see this is surviving the transition :)

//! shared-memory data structures. In particular, types that are guaranteed to
//! be threadsafe are easily shared between threads using the
//! atomically-reference-counted container,
//! [`Arc`](../../std/sync/struct.Arc.html).

alexcrichton (Member) · Dec 9, 2014

This is somewhat oddly worded to me: it starts out by saying memory is isolated, but then it explains that I can indeed share memory. Maybe the emphasis at the top could be toned down a bit?

aturon (Author, Member) · Dec 9, 2014

Sure. (This was brought over from the old std::task docs, so I was trying not to change it too much, but I agree with you.)

stdout: Option<Box<Writer + Send>>,
// Thread-local stderr
stderr: Option<Box<Writer + Send>>,
}

alexcrichton (Member) · Dec 9, 2014

The stdout and stderr options here seem pretty sketchy to me; I'd be fine removing them. They're implemented with a trivial wrapper around the provided proc regardless.


/// Thread configuration. Provides detailed control over the properties
/// and behavior of new threads.
pub struct Cfg {

alexcrichton (Member) · Dec 9, 2014

Cfg seems like a somewhat odd name for this; convention-wise it seems like it may want to be ThreadBuilder, but I do agree that's pretty long. We may not have a noun for this the way Command is to Process, but perhaps this could just be called Builder to avoid stuttering?
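For what it's worth, a sketch of builder-style thread configuration as it looks in today's std::thread (where the type did end up named Builder):

```rust
// Builder-style thread configuration via std::thread::Builder.
use std::thread;

fn main() -> std::io::Result<()> {
    let handle = thread::Builder::new()
        .name("worker".to_string())
        .stack_size(512 * 1024)
        .spawn(|| {
            // The configured name is visible from inside the thread.
            thread::current().name().map(str::to_string)
        })?;
    println!("spawned thread named {:?}", handle.join().unwrap());
    Ok(())
}
```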

// suffices. The second is that unwinding while unwinding is not
// defined. We take care of that by having an 'unwinding' flag in
// the thread itself. For these reasons, this unsafety should be ok.
unsafe {

alexcrichton (Member) · Dec 9, 2014

Can this unsafety be contained to just unwind::try?

}

/// Spawn a joinable thread, and return an RAII guard for it.
pub fn with_join<T: Send>(f: proc():Send -> T) -> JoinGuard<T> {

alexcrichton (Member) · Dec 9, 2014

This is a bit of an interesting choice of words here! Perhaps we could move spawn to the constructor new (aka Thread::new) and free up the word spawn for this to give us some more room, but that's still not exactly the best state of affairs.


alexcrichton (Member) · Dec 9, 2014

Could you also expand the documentation to explain where the type parameter T comes into play and how you can use a JoinGuard to get it out? It may also be nice to warn about ignoring the JoinGuard and how it may block.

aturon and others added some commits Dec 7, 2014

Fix the capture_stderr test
It's always a fun time having two sets of standard libraries when testing!
Tweak the startup routine to pass on linux
We need to be sure to init thread_info before we init args, for example, because
args grabs locks, which may entail looking at the local thread eventually.
Revise std::thread API to join by default
This commit is part of a series that introduces a `std::thread` API to
replace `std::task`.

In the new API, `spawn` returns a `JoinGuard`, which by default will
join the spawned thread when dropped. It can also be used to join
explicitly at any time, returning the thread's result. Alternatively,
the spawned thread can be explicitly detached (so no join takes place).

As part of this change, Rust processes now terminate when the main
thread exits, even if other detached threads are still running, moving
Rust closer to standard threading models. This new behavior may break code
that was relying on the previously implicit join-all.

In addition to the above, the new thread API also offers some built-in
support for building blocking abstractions in user space; see the module
doc for details.

Closes #18000

[breaking-change]
std: Lower abstractions for thread_local/at_exit
The current implementations use `std::sync` primitives, but these primitives
currently end up relying on `thread_info` and a local `Thread` being available
(mainly for checking the panicking flag).

To get around this, this commit lowers the abstractions used by the windows
thread_local implementation as well as the at_exit_imp module. Both of these
modules now use a `sys::Mutex` and a `static mut` and manage the
allocation/locking manually.
std: Move the panic flag to its own thread local
This flag is somewhat tied to the `unwind` module rather than the `thread_info`
module, so this commit moves it into that module as well as allowing the same OS
thread to call `unwind::try` multiple times. Previously once a thread panicked
its panic flag was never reset, even after exiting the panic handler.
Delete rest of rustrt
... and address other rebasing fallout.
Disable at_exit handlers
The [final step](#19654) of
runtime removal changes the threading/process model so that the process
shuts down when the main thread exits. But several shared resources,
like the helper thread for timeouts, are shut down when the main thread
exits (but before the process ends); they are not prepared to be used
after shutdown, yet other threads may try to access them during the
shutdown sequence of the main thread.

As an interim solution, the `at_exit` cleanup routine is simply skipped.

Ultimately, these resources should be made to safely handle asynchronous
shutdown, usually by panicking if called from a detached thread when the
main thread is ending.

See issue for details #20012

This is a [breaking-change] for anyone relying on `at_exit`.

aturon force-pushed the aturon:merge-rt branch from 5fca87f to 903c5a8 on Dec 19, 2014

alexcrichton commented on 903c5a8 Dec 19, 2014

r+

alexcrichton replied Dec 19, 2014

p=10

bors (Contributor) commented on 903c5a8 Dec 19, 2014

saw approval from alexcrichton
at aturon@903c5a8

bors (Contributor) replied Dec 19, 2014

merging aturon/rust/merge-rt = 903c5a8 into auto

bors (Contributor) replied Dec 19, 2014

status: {"merge_sha": "0efafac398ff7f28c5f0fe756c15b9008b3e0534"}

bors (Contributor) replied Dec 19, 2014

aturon/rust/merge-rt = 903c5a8 merged ok, testing candidate = 0efafac

bors (Contributor) replied Dec 19, 2014

fast-forwarding master to auto = 0efafac


bors added a commit that referenced this pull request Dec 19, 2014

auto merge of #19654 : aturon/rust/merge-rt, r=alexcrichton
(The merge commit message repeats the PR description above.)

bors merged commit 903c5a8 into rust-lang:master on Dec 19, 2014

2 checks passed

continuous-integration/travis-ci: The Travis CI build passed
default: all tests passed
bluss (Contributor) commented Dec 19, 2014

spawn() and other adjacent features like channel() were in the prelude, saying something to the new Rust user about Rust the language in the process. I think it would be nice to preserve that: the feel of being set up for concurrency by default.

reem (Contributor) commented Dec 19, 2014

It also sort of suggested that they were builtin constructs, which I think their removal from the prelude is supposed to prevent.

aturon (Member, Author) commented Dec 19, 2014

@bluss You might want to leave a comment at rust-lang/rfcs#503, which talks about the future of the prelude.
