
Add experimental support for futures #193

Merged · 25 commits · Jan 16, 2017
Conversation

@nikomatsakis (Member) commented Dec 30, 2016:

This branch adds a new method to a scope: spawn_future(F). The spawn_future API allows Rayon to play the role of an executor. That is, it takes some future F and gives it life, causing it to start executing. The result is another future, called a rayon future, which can be used to check if the result of F is ready.

The role of Rayon and futures

To understand the role Rayon plays here, recall that a future F is basically the plan for an async computation, much like an iterator is a plan for a loop. Thus a future by itself is inert. When you invoke spawn_future(), however, Rayon starts to put that plan into action: it pushes a job to a worker thread which will invoke poll() on the future F. This will trigger various bits of work to be done and may wind up blocked on I/O requests and the like. In the meantime, you get back another future F' that you can use to check on the status of this work or to compose new futures.

The simplest usage pattern, where you just want to push some work to another thread and then block on it and use the result, is like so:

scope(|s| {
    let x = s.spawn_future(future_x).rayon_wait();
    let y = s.spawn_future(future_y).rayon_wait();
    do_computation(x, y);
});

Note the use of the rayon_wait() method instead of wait() -- rayon_wait() will block intelligently, so that even if you are on a Rayon worker thread the system doesn't seize up. However, blocking is not the recommended way to use futures. Instead, it would be better to compose newer and bigger futures that use the result from spawn -- or, better yet, compose the futures before you spawn. If you must block, block at the very end:

scope(|s| {
    s.spawn_future(
        future_x
            .join(future_y)
            .map(|(x, y)| do_computation(x, y))
    ).rayon_wait()
})

cc @alexcrichton @aturon @carllerche -- please double-check my understanding here :)

Comparing spawn() and spawn_future()

So how does spawn_future() compare to the existing scope.spawn()? The rule of thumb is that spawn() is used to launch a computation for side-effects whereas spawn_future() is used to launch a computation for its result. You can observe the difference when it comes to the result type: spawn() takes a closure that returns (), so if you want to get any value out, it must write it somewhere external. In contrast, the future you give to spawn_future() has a result type.

Another place this difference is important is cancellation. If you drop the future that is returned by spawn_future(), that is interpreted as a signal that you no longer care about that result. This will cause the spawned future to stop executing, possibly before it is complete. This is a key mechanism used throughout the futures library to signal when results are no longer needed and hence avoid doing useless work. In contrast, once you spawn a task with s.spawn(), it will always execute; there is no way to cancel it.
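The drop-to-cancel idea can be illustrated with a std-only sketch. This is an analogy, not Rayon's actual implementation; `CancelHandle` and `run_unless_canceled` are made-up names. Dropping the handle flips a shared flag that the worker consults before doing more work:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for the future returned by `spawn_future()`:
// dropping it signals that the result is no longer wanted.
pub struct CancelHandle {
    canceled: Arc<AtomicBool>,
}

impl CancelHandle {
    // Returns the handle plus the flag the worker side would hold.
    pub fn new() -> (CancelHandle, Arc<AtomicBool>) {
        let flag = Arc::new(AtomicBool::new(false));
        (CancelHandle { canceled: Arc::clone(&flag) }, flag)
    }
}

impl Drop for CancelHandle {
    fn drop(&mut self) {
        // Tell the worker that the result is no longer needed.
        self.canceled.store(true, Ordering::SeqCst);
    }
}

// The worker checks the flag before doing (more) work.
pub fn run_unless_canceled(flag: &AtomicBool) -> bool {
    !flag.load(Ordering::SeqCst)
}
```

In the real implementation the cancellation also has to unpark the task (per the work items above), but the flag-plus-check shape is the core of it.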

API questions

API-wise, I've kept this to the bare minimum for the moment, simply adding spawn_future(). My general plan however is to do the following:

  • There is no support here for streams.
  • Add some form of spawn_future_fn() wrapper that, instead of taking a future, takes a closure and creates a future for its result (using future::lazy). We probably want one for closures that return Result (in which case the future is fallible) and one for an "infallible" computation (this would wrap the result of the closure in Ok(), basically).
  • Add corresponding free functions spawn_async() and spawn_future_async(). These are analogous to the scope() methods but they execute outside of any scope, just injecting spawned jobs into the asynchronous thread-pool. The idea is that there is (conceptually) always an outermost scope that you don't have a handle to. This would solve the Servo use-case of wanting to inject work into a parallel thread-pool and query its result later (ideally, you would use spawn_future_async() for that, since it fits into the "inject job for result" use-case, not "inject job for side-effects").

I'd probably pursue all of those in follow-up PRs.

Status

Could use more tests but I think it's good to go.

Work items:

  • More tests (it's kind of hard to write stand-alone tests for this stuff!)
  • Better docs probably
  • Simplify cancellation implementation
    • In particular, my notes suggest there is a race condition, though I don't recall what it is :)
    • But basically we should just set some flag to true and unpark
  • Tests for future cancellation?
  • Account for possibility that unpark() could panic
  • Make a distinct feature for "futures" support?
  • "Drop of spawn could panic" -- not sure what I meant by this just now

use std::sync::mpsc::channel;
let (tx, rx) = channel();
let a = s.spawn_future(lazy(move || Ok::<usize, ()>(rx.recv().unwrap())));
// ^^^^ FIXME: why is this needed?


rx.recv only requires &self, so the closure by default captures &rx, which doesn't live outside scope, which the future is required to outlast, right?

@nikomatsakis (Member, Author):

Ah, I think I thought rx.recv was fn(self)

@alexcrichton:

cc @alexcrichton @aturon @carllerche -- please double-check my understanding here :)

That all sounds great to me!

To double check my own understanding as well, a lot of this is very similar to basically:

fn spawn_rayon<F: Future>(f: F) -> RayonFuture<F::Item, F::Error> {
    let (tx, rx) = oneshot::channel();
    rayon.spawn(move || tx.send(f.wait()));
    return rx;
}

Except that this solves a few crucial problems when literally using a "oneshot"

  • When a future is blocked, rayon can continue to make progress with other work. That is, the blocking happens on a queue, not by taking a thread.
  • The rayon version is slightly more optimized, only requiring one allocation, not two.
  • Rayon handles everything like panics for you.
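The "oneshot" sketch above can be approximated with std primitives alone. Here is a hypothetical `spawn_for_result` using a plain OS thread instead of Rayon's pool, so it illustrates only the "spawn for result" shape, not the work-stealing or intelligent blocking that Rayon provides:

```rust
use std::sync::mpsc;
use std::thread;

// Spawn a computation on another thread and return a handle that yields
// its result. Rayon's spawn_future() plays this role on the shared
// thread-pool, without tying up a whole thread while a future is blocked.
pub fn spawn_for_result<T, F>(f: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error just means the receiver was dropped, i.e. the
        // caller no longer wants the result -- so we ignore it.
        let _ = tx.send(f());
    });
    rx
}
```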

There is no support here for streams.

I'm curious, what's your thinking here? E.g. what would a stream look like? I could imagine that futures::sync would suffice for some bare-bones use, but I suppose that, much as oneshot isn't simply used under the hood here, streams would want to interact differently with rayon. Do you have some API sketches in mind?

Add some form of spawn_future_fn() wrappers that, instead of taking a future, takes a closure and create a future for its result (using future::lazy).

Yeah this is the general trend of "those things that can spawn futures" right now. We're likely to consider a unifying spawn trait (rust-lang/futures-rs#313) in which case this'd come for free. I haven't thought too much about the trait here, but we'd definitely want to consider rayon when designing it!

We probably want one for closures that return Result (in which case the future is fallible) and one for an "infallible" computation (this would wrap the result of the closure in Ok(), basically).

The general trend is to have spawn(Future) and spawn_fn(FnOnce() -> IntoFuture), so that'd work for fallible computations, as Result implements IntoFuture.

Add corresponding free functions spawn_async() and spawn_future_async().

There's some prior art on CpuFuture for this with an inherent forget function on the normal spawn value. Would that suffice for this use case to avoid adding new spawning methods?

@nikomatsakis (Member, Author):

@alexcrichton

Except that this solves a few crucial problems when literally using a "oneshot"

I think that is true.

I'm curious, what's your thinking here? E.g. what would a stream look like?

I don't know! I haven't looked at all at streams really. Maybe I'm wrong and there isn't a role for Rayon here? But that seems surprising to me.

We're likely to consider a unifying spawn trait (rust-lang/futures-rs#313) in which case this'd come for free.

Makes sense. @carllerche also pointed me at this repository. One thing I noticed there is that some of the wrappers seem like they would result in >1 allocation per future, which, yes, I was trying to avoid.

There's some prior art on CpuFuture for this with an inherent forget function on the normal spawn value. Would that suffice for this use case to avoid adding new spawning methods?

No, but that's interesting. I could add such a method. That said, it doesn't suffice really. The goal of the async methods is to be able to spawn a future without creating a scope. The futures you spawn would naturally have a 'static bound, since there is no scope to attach them to. It'd basically be equivalent to using future-cpupool except that you would share the same threadpool with other Rayon users, which is generally desirable.

@alexcrichton:

@nikomatsakis oh that all makes sense to me. Looking forward to see how this turns out!

@nikomatsakis (Member, Author):

@alexcrichton fyi I simplified the cancellation semantics per our discussion. Now it just sets a flag and unparks, basically.

@alexcrichton:

👍

@nikomatsakis (Member, Author):

@alexcrichton so I was looking at making the handling of unpark() more robust and I encountered a question that I didn't remember the answer to. What am I supposed to do if someone calls poll() but there is already a registered unpark handler? Should I just... accumulate them all? Replace the old one with the new one?

@nikomatsakis (Member, Author):

For now I settled with "remember the most recent unpark value supplied". It seems like, if things are proceeding as expected, there should only be one task waiting, and hence any later calls to poll will either be the same unpark() or an updated one that I ought to use instead.
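The "remember only the most recent unpark value" policy can be sketched in isolation with std types. `UnparkSlot` is a made-up name; futures 0.1's real `Unpark` is a trait object, modeled here as a plain boxed callback:

```rust
use std::sync::{Arc, Mutex};

// Stand-in for futures 0.1's unpark handle: a callback to run when the
// future becomes ready.
type Unpark = Arc<dyn Fn() + Send + Sync>;

// Each poll() replaces any previously registered handle; completion
// fires only the latest one.
#[derive(Default)]
pub struct UnparkSlot {
    latest: Mutex<Option<Unpark>>,
}

impl UnparkSlot {
    pub fn register(&self, unpark: Unpark) {
        // A later poll() overwrites the handle from an earlier one.
        *self.latest.lock().unwrap() = Some(unpark);
    }

    pub fn complete(&self) {
        // Wake whoever polled most recently, if anyone.
        if let Some(unpark) = self.latest.lock().unwrap().take() {
            unpark();
        }
    }
}
```

As @aturon confirms below, this single-slot policy matches the assumption made throughout the futures library: at most one task is expected to be waiting on a given future.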

@aturon commented Jan 11, 2017:

For now I settled with "remember the most recent unpark value supplied". It seems like, if things are proceeding as expected, there should only be one task waiting, and hence any later calls to poll will either be the same unpark() or an updated one that I ought to use instead.

Yep, that's the same assumption we make throughout the library.

@nikomatsakis (Member, Author):

So I think I've hardened the code against most sources of user panics. One thing we are not protected against, but I've decided it's a hopeless battle, is if the future's Drop method panics. I say a hopeless battle because once I started down that road I realized that there are so many places where code assumes that Drop will not panic that it seems very unlikely we could ever plug them all -- and panicking in a drop is already highly dubious and likely to yield a "double panic" abort.

An example of what makes this so hard: if the future's drop panics, it actually triggers during the poll method of CatchUnwind; we can catch it there easily enough. But...often you wind up forwarding panic values to a central location. If you have more than one, I've been dropping the others -- but what if the destructor for the panic return value should itself panic? Also, what about user closures in other threads? etc.

I think it makes sense at least for now to accept that if your Drop impl panics, your process may well abort, unless the context in which a drop occurs is fully under your control. If we were going to try and harden against this -- which might make sense, but I'm unpersuaded as of yet -- it would have to be a more comprehensive effort.
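To illustrate the point above, here is a minimal std-only example (`NoisyDrop` and `drop_panics_are_catchable` are hypothetical names): a panic raised inside drop() unwinds like any other panic, so a catch_unwind wrapped around this one drop site observes it -- but only at that one site, which is why a comprehensive defense is so hard.

```rust
use std::panic;

// A value whose destructor panics -- the problematic case discussed above.
struct NoisyDrop;

impl Drop for NoisyDrop {
    fn drop(&mut self) {
        panic!("panic raised inside Drop");
    }
}

pub fn drop_panics_are_catchable() -> bool {
    // catch_unwind can observe the Drop panic at this controlled site,
    // but drops happening anywhere else in the program still unwind freely.
    panic::catch_unwind(|| drop(NoisyDrop)).is_err()
}
```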

@nikomatsakis (Member, Author):

OK, I think this branch is basically ready to go, though it would be good to add a few more tests.

@nikomatsakis (Member, Author):

@cuviper -- would you like to review the logic here? I've gone over it once with @alexcrichton.

I decided to keep this under the unstable feature for now; when we want to move it to stable, I'd probably make a special (on by default maybe?) feature to request the extra dependency on the futures library.

if WorkerThread::current().is_null() {
    executor::spawn(self).wait_future()
} else {
    panic!("using `wait()` in a Rayon thread is unwise; try `rayon_wait()`")
}
Member:

Why not just do the right thing? i.e. make rayon_wait the true RayonFuture::wait.

@nikomatsakis (Member, Author):

Because I don't want people to think that calling wait() in Rayon code is OK. It only works for a RayonFuture -- any other kind of future will do the wrong thing. Therefore, I want them to write rayon_wait() so that, if they happen to invoke it on some random future, it will error out.

@nikomatsakis (Member, Author):

Example how this could go wrong if I were encouraging people to call wait:

let v = scope.spawn_future(f).and_then(|x| x + 1).wait();

Here wait is being called on an AndThen<RayonFuture<F>>. But if they had written .rayon_wait(), then it would have failed to compile. What would work is:

let v = scope.spawn_future(scope.spawn_future(f).and_then(|x| x + 1)).rayon_wait();

Although you'd be better off not spawning twice:

let v = scope.spawn_future(f.and_then(|x| x + 1)).rayon_wait();

Member:

Is there a way to make RayonFuture::wait a compile-time error, rather than a panic? Probably not, since we don't control the trait, but it's ugly that this will only show up at runtime.

@nikomatsakis (Member, Author):

No, there is no way to do that; but I think the right thing is for the futures library to offer some sort of hook (perhaps via a thread-local?) to customize how wait behaves. That said, the truth is that if you are using wait() -- even rayon_wait() -- you are probably using futures wrong. The right thing would be to make a "follow-up" future and schedule that.

@nikomatsakis (Member, Author):

Note that the docs for wait() do warn that using it can lead to deadlock, however.

@nikomatsakis (Member, Author):

that any references are still valid, or else the user shouldn't be
able to call `poll()`. (The same is true at the time of cancellation,
but that's not important, since `cancel()` doesn't do anything of
interest.)
Member:

This makes me nervous, as it probably should since you bothered to document it so carefully. But I don't have any concrete objection, and I trust your intuition on this more than my own, so... 🤷‍♂️

@nikomatsakis (Member, Author) commented Jan 15, 2017:

This is not the most straight-forward bit of reasoning, so being nervous is reasonable. However, I don't really think there's a reason to be nervous about accessing the result (famous last words...). As I wrote, the fundamental premise of Rust's type system is that T and E must either be in scope (i.e., only contain live, valid references) or else no data of that type must actually be reachable (this can occur in corner cases). Since the type of the result contains only T and E, it had better be valid or else Rust is pretty fundamentally broken.

What could make you nervous is that I've transmuted the type to hide the other data in the struct, and hence THOSE fields (if they had references) might not be in scope. An example would be the spawn field, which contains a value of type F (the future type). However, that shouldn't be a problem, both because the code doesn't access those fields and because we set them to None (so there is in fact no data of type F reachable).

@nikomatsakis (Member, Author):

@cuviper -- any objection to me landing this? Naturally we can iterate on the API (including the rayon_wait() business), but I'd like to get it in so we can start basing further changes on it.

(One thing I wouldn't mind talking over, and I may try to write-up an RFC issue or something, is the relationship of spawn() and spawn_future(). I have some thoughts here but I keep going back and forth.)

@cuviper (Member) commented Jan 16, 2017 via email

@nikomatsakis nikomatsakis merged commit 0f5bd6a into master Jan 16, 2017
@nikomatsakis (Member, Author):

Done! Gonna be time for a new release soon, I think.
