
Add experimental support for futures #193

Merged · 25 commits · Jan 16, 2017
Conversation

@nikomatsakis (Member) commented Dec 30, 2016:

This branch adds a new method to a scope: spawn_future(F). The spawn_future API allows Rayon to play the role of an executor. That is, it takes some future F and gives it life, causing it to start executing. The result is another future, called a rayon future, which can be used to check if the result of F is ready.

The role of Rayon and futures

To understand the role Rayon plays here, recall that a future F is basically the plan for an async computation, much like an iterator is a plan for a loop. Thus a future by itself is inert. When you invoke spawn_future(), however, Rayon starts to put that plan into action: it pushes a job to a worker thread which will invoke poll() on the future F. This will trigger various bits of work to be done and may wind up blocked on I/O requests and the like. In the meantime, you get back another future F' that you can use to check on the status of this work or to compose new futures.

The simplest usage pattern, where you just want to push some work to another thread and then block on it and use the result, is like so:

scope(|s| {
    let x = s.spawn_future(future_x).rayon_wait();
    let y = s.spawn_future(future_y).rayon_wait();
    do_computation(x, y);
});

Note the use of the rayon_wait() method instead of wait() -- rayon_wait() will block intelligently, so that even if you are on a Rayon worker thread the system doesn't seize up. However, blocking is not the recommended way to use futures. Instead, it would be better to compose newer and bigger futures that use the result from spawn -- or, better yet, compose the futures before you spawn. If you must block, block at the very end:

scope(|s| {
    s.spawn_future(
        future_x
            .join(future_y)
            .map(|(x, y)| do_computation(x, y))
    ).rayon_wait()
})

cc @alexcrichton @aturon @carllerche -- please double-check my understanding here :)

Comparing spawn() and spawn_future()

So how does spawn_future() compare to the existing scope.spawn()? The rule of thumb is that spawn() is used to launch a computation for side-effects whereas spawn_future() is used to launch a computation for its result. You can observe the difference when it comes to the result type: spawn() takes a closure that returns (), so if you want to get any value out, it must write it somewhere external. In contrast, the future you give to spawn_future() has a result type.

Another place this difference is important is cancellation. If you drop the future that is returned by spawn_future(), that is interpreted as a signal that you no longer care about that result. This will cause the spawned future to stop executing, possibly before it is complete. This is a key mechanism used throughout the futures library to signal when results are no longer needed and hence avoid doing useless work. In contrast, once you spawn a task with s.spawn(), it will always execute; there is no way to cancel it.
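The drop-to-cancel idea can be illustrated with a std-only sketch. This is an analogy, not Rayon's actual implementation; `CancelHandle` and `run_unless_canceled` are made-up names. Dropping the handle flips a shared flag that the worker consults before doing more work:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for the future returned by `spawn_future()`:
// dropping it signals that the result is no longer wanted.
pub struct CancelHandle {
    canceled: Arc<AtomicBool>,
}

impl CancelHandle {
    // Returns the handle plus the flag the worker side would hold.
    pub fn new() -> (CancelHandle, Arc<AtomicBool>) {
        let flag = Arc::new(AtomicBool::new(false));
        (CancelHandle { canceled: Arc::clone(&flag) }, flag)
    }
}

impl Drop for CancelHandle {
    fn drop(&mut self) {
        // Tell the worker that the result is no longer needed.
        self.canceled.store(true, Ordering::SeqCst);
    }
}

// The worker checks the flag before doing (more) work.
pub fn run_unless_canceled(flag: &AtomicBool) -> bool {
    !flag.load(Ordering::SeqCst)
}
```

In the real implementation the cancellation also has to unpark the task (per the work items above), but the flag-plus-check shape is the core of it.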

API questions

API-wise, I've kept this to the bare minimum for the moment, simply adding spawn_future(). My general plan however is to do the following:

  • There is no support here for streams.
  • Add some form of spawn_future_fn() wrapper that, instead of taking a future, takes a closure and creates a future for its result (using future::lazy). We probably want one for closures that return Result (in which case the future is fallible) and one for an "infallible" computation (this would wrap the result of the closure in Ok(), basically).
  • Add corresponding free functions spawn_async() and spawn_future_async(). These are analogous to the scope() methods but they execute outside of any scope, just injecting spawned jobs into the asynchronous thread-pool. The idea is that there is (conceptually) always an outermost scope that you don't have a handle to. This would solve the Servo use-case of wanting to inject work into a parallel thread-pool and query its result later (ideally, you would use spawn_future_async() for that, since it fits into the "inject job for result" use-case, not "inject job for side-effects").

I'd probably pursue all of those in follow-up PRs.

Status

Could use more tests but I think it's good to go.

Work items:

  • More tests (it's kind of hard to write stand-alone tests for this stuff!)
  • Better docs probably
  • Simplify cancellation implementation
    • In particular, my notes suggest there is a race condition, though I don't recall what it is :)
    • But basically we should just set some flag to true and unpark
  • Tests for future cancellation?
  • Account for possibility that unpark() could panic
  • Make a distinct feature for "futures" support?
  • "Drop of spawn could panic" -- not sure what I meant by this just now

use std::sync::mpsc::channel;
let (tx, rx) = channel();
let a = s.spawn_future(lazy(move || Ok::<usize, ()>(rx.recv().unwrap())));
// ^^^^ FIXME: why is this needed?


rx.recv only requires &self, so the closure by default captures &rx, which doesn't live outside scope, which the future is required to outlast, right?

@nikomatsakis (Member, Author):

Ah, I think I thought rx.recv was fn(self)

@alexcrichton:

cc @alexcrichton @aturon @carllerche -- please double-check my understanding here :)

That all sounds great to me!

To double check my own understanding as well, a lot of this is very similar to basically:

fn spawn_rayon<F: Future>(f: F) -> RayonFuture<F::Item, F::Error> {
    let (tx, rx) = oneshot::channel();
    rayon.spawn(move || tx.send(f.wait()));
    return rx;
}

Except that this solves a few crucial problems when literally using a "oneshot"

  • When a future is blocked, rayon can continue to make progress with other work. That is, the blocking happens on a queue, not by taking a thread.
  • The rayon version is slightly more optimized, only requiring one allocation, not two.
  • Rayon handles everything like panics for you.
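The "oneshot" sketch above can be approximated with std primitives alone. Here is a hypothetical `spawn_for_result` using a plain OS thread instead of Rayon's pool, so it illustrates only the "spawn for result" shape, not the work-stealing or intelligent blocking that Rayon provides:

```rust
use std::sync::mpsc;
use std::thread;

// Spawn a computation on another thread and return a handle that yields
// its result. Rayon's spawn_future() plays this role on the shared
// thread-pool, without tying up a whole thread while a future is blocked.
pub fn spawn_for_result<T, F>(f: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error just means the receiver was dropped, i.e. the
        // caller no longer wants the result -- so we ignore it.
        let _ = tx.send(f());
    });
    rx
}
```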

There is no support here for streams.

I'm curious, what's your thinking here? E.g. what would a stream look like? I could imagine that futures::sync would suffice for some bare-bones use, but I suppose that, much as oneshot isn't simply used under the hood here, streams would want to interact differently with rayon. Do you have some API sketches in mind?

Add some form of spawn_future_fn() wrappers that, instead of taking a future, takes a closure and create a future for its result (using future::lazy).

Yeah this is the general trend of "those things that can spawn futures" right now. We're likely to consider a unifying spawn trait (rust-lang/futures-rs#313) in which case this'd come for free. I haven't thought too much about the trait here, but we'd definitely want to consider rayon when designing it!

We probably want one for closures that return Result (in which case the future is fallible) and one for an "infallible" computation (this would wrap the result of the closure in Ok(), basically).

The general trend is to have spawn(Future) and spawn_fn(FnOnce() -> IntoFuture), so that'd work for fallible computations, as Result implements IntoFuture.

Add corresponding free functions spawn_async() and spawn_future_async().

There's some prior art on CpuFuture for this with an inherent forget function on the normal spawn value. Would that suffice for this use case to avoid adding new spawning methods?

@nikomatsakis (Member, Author):

@alexcrichton

Except that this solves a few crucial problems when literally using a "oneshot"

I think that is true.

I'm curious, what's your thinking here? E.g. what would a stream look like?

I don't know! I haven't looked at all at streams really. Maybe I'm wrong and there isn't a role for Rayon here? But that seems surprising to me.

We're likely to consider a unifying spawn trait (rust-lang/futures-rs#313) in which case this'd come for free.

Makes sense. @carllerche also pointed me at this repository. One thing I noticed there is that some of the wrappers seem like they would result in >1 allocation per future, which, yes, I was trying to avoid.

There's some prior art on CpuFuture for this with an inherent forget function on the normal spawn value. Would that suffice for this use case to avoid adding new spawning methods?

No, but that's interesting. I could add such a method. That said, it doesn't suffice really. The goal of the async methods is to be able to spawn a future without creating a scope. The futures you spawn would naturally have a 'static bound, since there is no scope to attach them to. It'd basically be equivalent to using future-cpupool except that you would share the same threadpool with other Rayon users, which is generally desirable.

@alexcrichton:

@nikomatsakis oh that all makes sense to me. Looking forward to see how this turns out!

@nikomatsakis (Member, Author):

@alexcrichton fyi I simplified the cancellation semantics per our discussion. Now it just sets a flag and unparks, basically.

@alexcrichton:

👍

@nikomatsakis (Member, Author):

@alexcrichton so I was looking at making the handling of unpark() more robust and I encountered a question that I didn't remember the answer to. What am I supposed to do if someone calls poll() but there is already a registered unpark handler? Should I just... accumulate them all? Replace the old one with the new one?

@nikomatsakis (Member, Author):

For now I settled with "remember the most recent unpark value supplied". It seems like, if things are proceeding as expected, there should only be one task waiting, and hence any later calls to poll will either be the same unpark() or an updated one that I ought to use instead.
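The "remember only the most recent unpark value" policy can be sketched in isolation with std types. `UnparkSlot` is a made-up name; futures 0.1's real `Unpark` is a trait object, modeled here as a plain boxed callback:

```rust
use std::sync::{Arc, Mutex};

// Stand-in for futures 0.1's unpark handle: a callback to run when the
// future becomes ready.
type Unpark = Arc<dyn Fn() + Send + Sync>;

// Each poll() replaces any previously registered handle; completion
// fires only the latest one.
#[derive(Default)]
pub struct UnparkSlot {
    latest: Mutex<Option<Unpark>>,
}

impl UnparkSlot {
    pub fn register(&self, unpark: Unpark) {
        // A later poll() overwrites the handle from an earlier one.
        *self.latest.lock().unwrap() = Some(unpark);
    }

    pub fn complete(&self) {
        // Wake whoever polled most recently, if anyone.
        if let Some(unpark) = self.latest.lock().unwrap().take() {
            unpark();
        }
    }
}
```

As @aturon confirms below, this single-slot policy matches the assumption made throughout the futures library: at most one task is expected to be waiting on a given future.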

@aturon commented Jan 11, 2017:

For now I settled with "remember the most recent unpark value supplied". It seems like, if things are proceeding as expected, there should only be one task waiting, and hence any later calls to poll will either be the same unpark() or an updated one that I ought to use instead.

Yep, that's the same assumption we make throughout the library.

@nikomatsakis (Member, Author):

So I think I've hardened the code against most sources of user panics. One thing we are not protected against, but I've decided it's a hopeless battle, is if the future's Drop method panics. I say a hopeless battle because once I started down that road I realized that there are so many places where code assumes that Drop will not panic that it seems very unlikely we could ever plug them all -- and panicking in a drop is already highly dubious and likely to yield a "double panic" abort.

An example of what makes this so hard: if the future's drop panics, it actually triggers during the poll method of CatchUnwind; we can catch it there easily enough. But...often you wind up forwarding panic values to a central location. If you have more than one, I've been dropping the others -- but what if the destructor for the panic return value should itself panic? Also, what about user closures in other threads? etc.

I think it makes sense at least for now to accept that if your Drop impl panics, your process may well abort, unless the context in which a drop occurs is fully under your control. If we were going to try and harden against this -- which might make sense, but I'm unpersuaded as of yet -- it would have to be a more comprehensive effort.
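To illustrate the point above, here is a minimal std-only example (`NoisyDrop` and `drop_panics_are_catchable` are hypothetical names): a panic raised inside drop() unwinds like any other panic, so a catch_unwind wrapped around this one drop site observes it -- but only at that one site, which is why a comprehensive defense is so hard.

```rust
use std::panic;

// A value whose destructor panics -- the problematic case discussed above.
struct NoisyDrop;

impl Drop for NoisyDrop {
    fn drop(&mut self) {
        panic!("panic raised inside Drop");
    }
}

pub fn drop_panics_are_catchable() -> bool {
    // catch_unwind can observe the Drop panic at this controlled site,
    // but drops happening anywhere else in the program still unwind freely.
    panic::catch_unwind(|| drop(NoisyDrop)).is_err()
}
```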

@nikomatsakis (Member, Author):

OK, I think this branch is basically ready to go, though it would be good to add a few more tests.

@nikomatsakis (Member, Author):

@cuviper -- would you like to review the logic here? I've gone over it once with @alexcrichton.

I decided to keep this under the unstable feature for now; when we want to move it to stable, I'd probably make a special (on by default maybe?) feature to request the extra dependency on the futures library.

if WorkerThread::current().is_null() {
    executor::spawn(self).wait_future()
} else {
    panic!("using `wait()` in a Rayon thread is unwise; try `rayon_wait()`")
}
Member:

Why not just do the right thing? i.e. make rayon_wait the true RayonFuture::wait.

@nikomatsakis (Member, Author):

Because I don't want people to think that calling wait() in Rayon code is OK. It only works for a RayonFuture -- any other kind of future will do the wrong thing. Therefore, I want them to write rayon_wait() so that, if they happen to invoke it on some random future, it will error out.

@nikomatsakis (Member, Author):

Example how this could go wrong if I were encouraging people to call wait:

let v = scope.spawn_future(f).and_then(|x| x + 1).wait();

Here wait is being called on an AndThen<RayonFuture<F>>. But if they had written .rayon_wait(), then it would have failed to compile. What would work is:

let v = scope.spawn_future(scope.spawn_future(f).and_then(|x| x + 1)).rayon_wait();

Although you'd be better off not spawning twice:

let v = scope.spawn_future(f.and_then(|x| x + 1)).rayon_wait();

Member:

Is there a way to make RayonFuture::wait a compile-time error, rather than a panic? Probably not, since we don't control the trait, but it's ugly that this will only show up at runtime.

@nikomatsakis (Member, Author):

No, there is no way to do that; but I think the right thing is for the futures library to offer some sort of hook (perhaps via a thread-local?) to customize how wait behaves. That said, the truth is that if you are using wait() -- even rayon_wait() -- you are probably using futures wrong. The right thing would be to make a "follow-up" future and schedule that.

@nikomatsakis (Member, Author):

Note that the docs for wait() do warn that using it can lead to deadlock, however.

@nikomatsakis (Member, Author):

that any references are still valid, or else the user shouldn't be
able to call `poll()`. (The same is true at the time of cancellation,
but that's not important, since `cancel()` doesn't do anything of
interest.)
Member:

This makes me nervous, as it probably should since you bothered to document it so carefully. But I don't have any concrete objection, and I trust your intuition on this more than my own, so... 🤷‍♂️

@nikomatsakis (Member, Author) commented Jan 15, 2017:

This is not the most straight-forward bit of reasoning, so being nervous is reasonable. However, I don't really think there's a reason to be nervous about accessing the result (famous last words...). As I wrote, the fundamental premise of Rust's type system is that T and E must either be in scope (i.e., only contain live, valid references) or else no data of that type must actually be reachable (this can occur in corner cases). Since the type of the result contains only T and E, it had better be valid or else Rust is pretty fundamentally broken.

What could make you nervous is that I've transmuted the type to hide the other data in the struct, and hence THOSE fields (if they had references) might not be in scope. An example would be the spawn field, which contains a value of type F (the future type). However, that shouldn't be a problem, both because the code doesn't access those fields and because we set them to None (so there is in fact no data of type F reachable).

@nikomatsakis (Member, Author):

@cuviper -- any objection to me landing this? Naturally we can iterate on the API (including the rayon_wait() business), but I'd like to get it in so we can start basing further changes on it.

(One thing I wouldn't mind talking over, and I may try to write-up an RFC issue or something, is the relationship of spawn() and spawn_future(). I have some thoughts here but I keep going back and forth.)

@cuviper (Member) commented Jan 16, 2017 via email

@nikomatsakis nikomatsakis merged commit 0f5bd6a into master Jan 16, 2017
@nikomatsakis (Member, Author):

Done! Gonna be time for a new release soon, I think.
