Async IO #1081

Open
steveklabnik opened this Issue Apr 21, 2015 · 168 comments

@steveklabnik
Contributor

Rust currently only includes synchronous IO in the standard library. Should we also have async IO as well, or leave that to an external crate? What would the design look like?

Moved from rust-lang/rust#6842

Related: #388

@Valloric

I don't see why this couldn't just remain in third-party external crates.

@steveklabnik
Contributor

It can. Hence the 'community library' and 'libs' tags both.

@Anachron

Depends on who is actually maintaining the crate.

In my opinion it should at least involve some of the core members, because once async and sync are no longer on the same page, it can lead to confusion or, even worse, broken projects.

What I mean is this:
Once the community writes async versions of the std APIs, the community will make decisions on its own about whether something should be done one way or another.

It will be hard to keep up to date, and contributors may not have dug into the core of Rust, so it will either drift in a different direction or need someone to keep it in sync with the synchronous version of the core.

@josephglanville

The problem with "leaving it up to third party crates" is not having a blessed implementation that all libraries can interoperate with.

This problem has already played out a few times: Ruby and Python both have many competing and incompatible asynchronous IO libraries (Twisted/gevent in Python, EventMachine/Celluloid in Ruby).

In and of itself that doesn't sound so bad until you realise the huge amount of stuff that gets built on top of said libraries. When you aren't then able to use powerful libraries with each other because they belong to different "async" camps then things get sad pretty quickly.

Contrast this with C#, a language with built-in async primitives that also ships most of the higher-level async-integrated code (HTTP client, etc.). There is a single blessed solution, every library builds on top of it, they all interoperate, and everyone is happy.

I think it's super important to have async IO in core or blessed in some way to avoid fragmentation.

@m13253
m13253 commented May 9, 2015

I don't see why this couldn't just remain in third-party external crates.

But I think there should be language-level support for some key features that are important to async programming. That includes:

  • Coroutines (we need compiler support to ensure thread safety)
  • Python-styled yield statement (though it originally makes an iterator, in async programming it is used to save current execution point and get back later)
  • Or C#-styled async/await statement instead
  • Green threads (it is possible for 3rd-party libraries, but providing an "official" green threading library avoids different network libraries conflicting with each other because they used different green threading libraries)

Yes, we could already build a Node.js-style callback-based async library -- but that makes little sense. We need language-level support to build a modern one.
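For what it's worth, the callback style being dismissed here can be sketched in Rust with nothing but std threads (`read_async` and its behavior are made up for illustration); the continuation has to live inside a closure, which is exactly the ergonomic cost the comment is pointing at:

```rust
use std::sync::mpsc;
use std::thread;

// A hypothetical callback-style async API: the operation runs on a
// worker thread and invokes the callback when done. Composing several
// such calls quickly nests ("callback hell"), which is the problem
// language-level async support is meant to solve.
fn read_async<F>(input: &'static str, callback: F)
where
    F: FnOnce(Result<String, String>) + Send + 'static,
{
    thread::spawn(move || {
        // Stand-in for a real non-blocking read completing.
        callback(Ok(input.to_uppercase()));
    });
}

fn main() {
    let (tx, rx) = mpsc::channel();
    read_async("hello", move |res| {
        // All continuation logic lives inside the closure.
        tx.send(res.unwrap()).unwrap();
    });
    let value = rx.recv().unwrap();
    assert_eq!(value, "HELLO");
    println!("{}", value);
}
```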

@flaub
flaub commented May 14, 2015

I think it's important to distinguish traditional async I/O from use cases requiring the standard C select() API. I think that the existing synchronous I/O API is incomplete without the ability to cancel blocking calls, and this doesn't seem to be robustly implementable without using select().

I'd like to see the current synchronous I/O API extended to support cancellation without necessarily exposing the underlying select. The goal here is not to provide a highly scalable or particularly efficient way of scheduling I/O requests, but merely a way to cancel blocking calls.
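As a rough illustration of the gap: std's socket read timeouts are the closest thing to cancellation today, letting a caller regain control of a blocking read periodically (and check some cancellation flag) rather than cancelling the call outright. A minimal sketch:

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

// std has no way to cancel an in-flight blocking read, but read
// timeouts let the caller regain control periodically. This sketch
// shows only the timeout mechanism itself.
fn read_with_timeout() -> ErrorKind {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    let mut stream = TcpStream::connect(addr).unwrap();
    let _peer = listener.accept().unwrap(); // keep the connection alive

    // Without a timeout this read would block forever: nobody writes.
    stream
        .set_read_timeout(Some(Duration::from_millis(50)))
        .unwrap();
    let mut buf = [0u8; 16];
    match stream.read(&mut buf) {
        Ok(_) => ErrorKind::Other, // not expected: no data was sent
        // Unix reports WouldBlock here, Windows reports TimedOut.
        Err(e) => e.kind(),
    }
}

fn main() {
    let kind = read_with_timeout();
    assert!(kind == ErrorKind::WouldBlock || kind == ErrorKind::TimedOut);
    println!("blocking read returned control: {:?}", kind);
}
```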

@gotgenes
  • Python-styled yield statement (though it originally makes an iterator, in async programming it is used to save current execution point and get back later)
  • Or C#-styled async/await statement instead

A minor correction to the "Python-styled" comment: it's technically yield from – syntax introduced in Python 3.3. (But yes, it is still based on generators).

Regarding async/await, it's worth noting that this syntax was proposed for Python 3, and has been provisionally accepted for Python 3.5. PEP 492 lists a fair number of languages that have adopted or have proposed the async/await keywords. PEP 492 also does a fair job of describing the weaknesses of the yield from approach, as well as the advantages of async/await.

Not that bandwagons are always the best reason to choose a direction, but async/await will become a very widespread idiom, and supporting it in Rust would provide great accessibility to those of us coming from other languages.

Yes we can already build a Node.JS-styled callback based async library -- that makes no sense. We need language-level support to build a modern one.

Hear, hear!

@ryanhiebert

Being somebody familiar with Python and its async story, but not as familiar with compiled static languages, it would be helpful to me (and perhaps others) if somebody could comment on something that I'm familiar with from Python.

In Python 3.5, async and await will be based on yield and yield from coroutines based on generators under the hood. This seems like a pretty elegant design to me, but I'd love to hear if there are any problems with that kind of approach.

@gotgenes

In Python 3.5, async and await will be based on yield and yield from coroutines based on generators under the hood. This seems like a pretty elegant design to me, but I'd love to hear if there are any problems with that kind of approach.

From @nathanaeljones's comment on rust-lang/rust#6842:

One of the earliest comments asked about .NET's async system. It's a callback system, like most, but there is a lot of syntactic sugar, and the 3 different systems that are exposed have different levels of overhead.

The most popular is C# async/await, which generates a closure and state machine to provide a continuation callback. Stack and thread context restoration is expensive, but you can opt out per-task.

Now, what is "expensive"? I'm not sure. (I'm in the same boat as @ryanhiebert. I program mostly in Python. I was made aware of Rust by @mitsuhiko's blog posts.)

@nathanaeljones

@gotgenes The expense depends on how much thread-local storage you're using. I know that it's low enough now that new APIs are async-only. This really depends on the language runtime (and the operating system); I don't think much can be learned about the performance implications by looking at other languages.

@sleeparrow

From @flaub:

I think it's important to distinguish traditional async I/O from use cases requiring the standard C select() API. I think that the existing synchronous I/O API is incomplete without the ability to cancel blocking calls, and this doesn't seem to be robustly implementable without using select().

I'd like to see the current synchronous I/O API extended to support cancellation without necessarily exposing the underlying select. The goal here is not to provide a highly scalable or particularly efficient way of scheduling I/O requests, but merely a way to cancel blocking calls.

I agree with this. Even trivial programs might suffer from hacks due to the inability to interrupt synchronous calls. (See rust-lang/rust#26446.) I think the language should support asynchronous IO, personally, but if that were really too complex to add to the API, synchronous IO should at least be made interruptible.

@phaux
phaux commented Jun 30, 2015

Would be cool to have something like GJ in the core.

@jimrandomh

Please first put in a straightforward wrapper around select/pselect. I understand you also want to build something better, but my first experience with Rust was hearing that it was 1.0 and trying a project that involved pseudoterminals, very close to something I'd already done in C, which meant waiting for input from the user and from a subprocess simultaneously. It immediately got bogged down in a rabbit hole of reverse-engineering macros from Linux system headers and corner cases of FFI, ending up much more difficult than it had any right to be.

@retep998
Member

@jimrandomh Keep in mind that select on Windows only works on sockets, and you can't really wait on groups of files/terminals/pipes. If someone can create a crate with async that works well on both Windows and non-Windows, then I'm sure a lot more attention will be paid to getting async in Rust.

@gotgenes

Keep in mind that select on Windows only works on sockets, and you can't really wait on groups of files/terminals/pipes.

Lack of complete support in Windows didn't stop Python from using it.

@boazsegev

👍

I doubt we could standardize an Async IO API for all Rust applications without making it part of the Core library... and Async IO seems (to me) to be super important - even if it's just a fallback to a select call (i.e. returning a Result::Err("would block") instead of blocking).

I believe that a blessed / standard Async IO API is essential in order to promote Rust as a network programming alternative to Java, C, and other network languages (even, perhaps, Python or Ruby).

Also, considering that async IO would probably benefit browser development, this would help us keep Mozilla invested.

...

Then again, I'm new to Rust, so I might have missed an existing solution (and no, mio isn't an existing solution, it's a wholesale IO framework).

@chpio
chpio commented Dec 27, 2015 edited

we could standardize the interface in the core and let lib developers do the implementations. That way everyone would use that one "blessed" interface, but we could have multiple competing implementations (it may be a good idea, I don't know :)).

Also there could be multiple async-api-abstraction-levels just like in JS:

async-await:

async function myFunc() {
  const data = await loadStuffAsync();
  return data.a + data.b;
}

promises:

function myFunc() {
  return loadStuffAsync().then(data => data.a + data.b);
}

callbacks (i hate them ;)):

function myFunc(cb) {
  loadStuffAsync((err, data) => {
    if (err) {
      return cb(err);
    }

    cb(null, data.a + data.b);
  });
}

streams:

tcpSocket
  .pipe(decodeRequest) // byte stream ->[decodeRequest]-> Request object stream
  .pipe(handleRequest) // Request object stream ->[handleRequest]-> Response object stream
  .pipe(encodeResponse) // Response object stream ->[encodeResponse]-> byte stream
  .pipe(tcpSocket)
  .on('error', handleErrors);

or is there already a nice stream implementation in rust? capable of...

  • highWaterMark with push back: pauses the previous handler when the queue of the current one is full
  • multitasking: each handler is executed in its own coroutine/thread
  • byte & object streams
  • error handling
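On that last question: the closest std-only analogue today is an iterator chain, which gives pull-based "object streams" (and implicit backpressure, since the consumer pulls) but none of the multitasking or error-channel features in the list. A toy sketch, with stage names borrowed from the JS example above:

```rust
// Each stage transforms items lazily, pulled by the consumer; the
// stage names mirror the JS pipeline and are purely illustrative.
fn pipeline(requests: Vec<&str>) -> Vec<String> {
    requests
        .into_iter()
        .map(|raw| raw.trim())              // decodeRequest
        .filter(|req| !req.is_empty())      // drop empty requests
        .map(|req| format!("OK: {}", req))  // handleRequest -> response
        .collect()                          // encodeResponse + sink
}

fn main() {
    let out = pipeline(vec!["ping ", "", " status"]);
    assert_eq!(out, vec!["OK: ping", "OK: status"]);
    println!("{:?}", out);
}
```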
@boazsegev

Looking over the code for the mio crate, I noticed that some unsafe code was required to implement the epoll/kqueue system calls that allow for evented IO (mio isn't purely Async IO, as it still uses blocking IO methods)...

It seems to me that unsafe code should be limited, as much as possible, to Rust's core and FFI implementations.

The "trust me" paradigm is different when developers are asked to trust Rust's core team vs. when they are asked to trust in third parties.

I doubt that competing implementations, as suggested by @chpio, would do a better job of promoting a performant solution... although the approach could, possibly, be used to select the most performant solution for the underlying core library.

Ruby on Rails is a good example of how a less performant solution (although more comfortably designed) could win in a competitive environment.

@seanmonstar
Contributor

There's nothing wrong with unsafe code. A crate that uses it shouldn't be discouraged. Unsafe is required whenever memory safety is sufficiently complicated that the compiler cannot reason about it.

In this specific case though, unsafe is used because Rust demands all ffi code be marked unsafe. The compiler cannot reason about functions defined by other languages. You will never have code that uses epoll without unsafe (even if that unsafety were eventually tucked into a module in libstd).

@boazsegev

@seanmonstar - On the main part, I agree with your assessment.

However... Rust's main selling point is safety, and I do believe that forcing third parties to write unsafe code hurts the sales pitch. Also, unsafe code written by third parties isn't perceived as being as trustworthy as unsafe code within the core library.

I'm aware that it's impossible to use the epoll and kqueue API without using unsafe code and this is part of the reason I believe that writing an Async IO library would help utilize low level system calls while promoting Rust's main feature (safety).

Having said that, I'm just one voice. Both opinions have their cons and pros and both opinions are legitimate.

@camlorn
camlorn commented Dec 30, 2015

I'm not sure this is the place and maybe I need to open a separate issue somewhere, but since we don't seem to have it, I'd rate some sort of abstraction over at least socket select as super important. I got to this issue by looking for that and finding other issues that linked here indirectly from 2014; since I see at least one other comment here saying the same thing, I figure I'd add my two cents. Before I go on, I should admit that I'm still from the outside looking in; I really, really want to use Rust and plan to do so in the immediate future, but haven't yet. My primary is C++ and my secondary is Python.
While an async I/O library is really a very good idea and definitely gets a +1 from me if only because Python proves that not putting it in the language/standard library will cause epic-level fragmentation, the lack of a standard library select means that I have to essentially opt into some sort of third party crate or write my own abstraction over the calls. I'm considering Rust for the development of a network protocol as a learning project and can spend whatever time I choose, so I have some flexibility in this regard. But the inability to easily find a platform-neutral select in the standard library is leaving a bad taste in my mouth right now.
I'd say that getting select in at least for sockets as soon as possible would close a rather big and critical hole, as the only other alternatives that don't involve third-party libraries that I'm seeing seem to involve either a thread for every connection or fiddling around with timeouts. In the latter case, the documentation says the error depends on the platform--I get to write yet another abstraction layer! I'll probably opt into Mio for my current project, but I still consider this a shortcoming because all I really need is select.
Speaking more generally, having used both Twisted and Gevent some (though admittedly not enough to be called an expert), I like the look of Asyncio and think copying/borrowing from it might be a good starting point. Twisted always degenerated to inlineCallbacks and gevent always became all the difficulties of threads but with the "advantage" that it lies about this, offering mostly false hope. Since other languages seem to be converging on Asyncio, any solution that looks like it would get my admittedly far-from-expert vote. I'd go so far as to say that we should do it by implementing Python-style generators/coroutines, but that's probably a separate issue and I'm certainly nowhere near thinking about starting any RFCs at the moment.

@tailhook

While an async I/O library is really a very good idea and definitely gets a +1 from me if only because Python proves that not putting it in the language/standard library will cause epic-level fragmentation

It turns out that Python proves quite the contrary. There was asyncore in Python, but it was quickly obsoleted by Twisted (and tornado/gevent... much later). And now there is asyncio, which may be obsoleted by curio, as the latter looks much nicer by design (though it's still too early to reason about its success).

At the end of the day, implementing yield-like construct looks promising. But it's too early to put any async IO library code into the stdlib. There are virtually no downsides of using external crates for the task.

@gotgenes

At the end of the day, implementing yield-like construct looks promising.

As mentioned earlier, Python has already moved on from yield from to async/await as the syntax of choice for asynchronous operations, for reasons outlined in PEP-492.

I'd like to second pointing out curio as an interesting new model for async in Python.

@camlorn
camlorn commented Jan 1, 2016

This is interesting. I've never heard of asyncore before now, but it looks like a very complicated way to use a select call. I'm not surprised that it didn't become popular, especially given the 1999 release date (I found one source placing it at Python 1.5.2, but can't find official confirmation). I'm not very convinced that it's good evidence that I'm wrong about Python proving the fragmentation point. I'm not saying that I'm right, just that I need more convincing before dropping my viewpoint as incorrect. In my opinion, something okay with many protocols is better than 5 or 6 options, each more amazing than the last, but each supporting different protocols.
I still think my point about select stands and that it should be put into the standard library as soon as possible. It or something like it is the first step to an async framework. The disadvantage of opting into an external crate for async I/O when all you need is select is that everyone needs to learn the external crate; by contrast, select is a very simple call and can be explained in a few paragraphs. Even if a crate containing only select exists, though, I fail to see any disadvantage to adding it to std::net in some form.
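To make the "thread for every connection" alternative concrete: without a select in std, the usual workaround is one thread per input source, all funneling into an mpsc channel, with the receiver acting as a crude select. A sketch (the two sources here are simulated with sleeps):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// One thread per source, all sending into a single channel; the
// Receiver then plays the role of select(), waking on whichever
// source fires first.
fn first_ready() -> &'static str {
    let (tx, rx) = mpsc::channel();
    let slow = tx.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(250)); // e.g. a subprocess
        let _ = slow.send("subprocess output");
    });
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(5)); // e.g. the user
        let _ = tx.send("user input");
    });
    rx.recv().unwrap()
}

fn main() {
    let winner = first_ready();
    assert_eq!(winner, "user input");
    println!("first ready: {}", winner);
}
```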

@grigio
grigio commented Jan 2, 2016

It seems that async-await style is popular as non-blocking pattern.

@szagi3891

I think that's too much mixing of conventions.
It would suffice for async IO to return a channel.
Suffice to answer the Async IO returns was channel.

@boazsegev

I think we're overthinking it... For now, how about having non-blocking sockets return with an EWOULDBLOCK, and having core wrappers for kqueue, epoll, and select (system dependent, of course), with a preprocessor able to inform us which system is available and which isn't? At least let Rust be independent of C know-how as much as possible.

P.S.

I would probably wrap epoll and kqueue in a single wrapper and select in another.
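For reference, the first half of this suggestion already works with std's non-blocking mode (`set_nonblocking`, stabilized around Rust 1.9): a read on an empty non-blocking socket surfaces EWOULDBLOCK as `ErrorKind::WouldBlock` instead of blocking:

```rust
use std::io::ErrorKind;
use std::net::UdpSocket;

// set_nonblocking(true) makes recv_from return ErrorKind::WouldBlock
// (EWOULDBLOCK/EAGAIN underneath) instead of blocking when no data
// is available.
fn try_recv_kind() -> ErrorKind {
    let socket = UdpSocket::bind("127.0.0.1:0").unwrap();
    socket.set_nonblocking(true).unwrap();
    let mut buf = [0u8; 32];
    match socket.recv_from(&mut buf) {
        Ok(_) => ErrorKind::Other, // not expected: nobody sent a datagram
        Err(e) => e.kind(),
    }
}

fn main() {
    assert_eq!(try_recv_kind(), ErrorKind::WouldBlock);
    println!("non-blocking recv returned WouldBlock");
}
```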

@sconger
sconger commented Jan 31, 2016

I feel it's worth pointing out that some operating systems have started moving away from the poll/select model. There has been a trend toward the system managing IO and threads together. Windows has tied its newer asynchronous IO to windows thread pools, and Darwin has done similarly with grand central dispatch. Instead of waiting for an event, you ask the OS to do something, and it triggers a callback on a worker thread when the work is done.

I don't see Rust being able to support those APIs without something like async/await. They don't fit in with the current safety mechanisms.

@retep998
Member

Pretty much all async on Windows is completion based in that you ask it to do something, and it does it, and then gets back to you. The only options you really have are how it gets back to you. Whether you use the wait functions on overlapped events, or have an APC fire, or a callback in a thread pool, or a completion packet to an IOCP.

@tikue
tikue commented Feb 7, 2016

How would the borrow checker interact with async/await? If I borrow self mutably and then await some computation whose result will be combined with self, presumably other async states won't be able to borrow self? Would you need to drop any borrows before awaiting?

@l0calh05t

If I borrow self mutably and then await some computation whose result will be combined with self, presumably other async states won't be able to borrow self?

That's what I would expect.
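With hindsight (Rust's eventual async design, stabilized years after this exchange), the answer is yes: a borrow held across a suspension point is stored inside the future's state, so the borrow checker forbids any other use of the value until the future completes or is dropped. A minimal poll-based sketch, with a hand-written future and a no-op-waker executor (all names are illustrative):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

struct Counter { n: i32 }

// A future that holds `&mut Counter` across one simulated suspension.
// While it is alive, the borrow checker rejects any other borrow of
// the counter.
struct BumpFuture<'a> { counter: &'a mut Counter, woken: bool }

impl<'a> Future for BumpFuture<'a> {
    type Output = i32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        let this = self.get_mut(); // fine: BumpFuture is Unpin
        if !this.woken {
            this.woken = true; // simulate one "await" suspension point
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        this.counter.n += 1;
        Poll::Ready(this.counter.n)
    }
}

fn noop_raw_waker() -> RawWaker {
    fn no_op(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker { noop_raw_waker() }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

// Minimal executor: just polls in a loop with a waker that does nothing.
fn block_on<F: Future + Unpin>(mut fut: F) -> F::Output {
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = Pin::new(&mut fut).poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    let mut c = Counter { n: 41 };
    let fut = BumpFuture { counter: &mut c, woken: false };
    // While `fut` exists, `c` cannot be borrowed here.
    assert_eq!(block_on(fut), 42);
    assert_eq!(c.n, 42); // the borrow ended when the future completed
}
```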

@akcom
akcom commented Mar 4, 2016

@retep998 I have to disagree. While IO completion ports are certainly a high throughput asynchronous method, event polling is well-supported by both files and sockets.

@retep998
Member
retep998 commented Mar 4, 2016

@akcom Really? Windows provides event polling that isn't completely awful (select doesn't count)? Can you provide some examples of this?

@akcom
akcom commented Mar 4, 2016

@retep998 WaitForMultipleObjectsEx can be used for polling but also for overlapped operations.

@retep998
Member
retep998 commented Mar 4, 2016

@akcom Ah, for a moment I thought you meant Windows provided readiness based async on Windows. What you're actually saying is that Windows provides ways to receive notification of completed overlapped operations other than IOCPs which is actually what I said in my other message, so I'm not entirely sure what your point is.

The only options you really have are how it gets back to you. Whether you use the wait functions on overlapped events, or have an APC fire, or a callback in a thread pool, or a completion packet to an IOCP.

Note that you cannot use the wait functions to determine when you can read/write a file without blocking, you have to first start an operation asynchronously and then use some method to be notified of when the operation is done.

@akcom
akcom commented Mar 4, 2016

@retep998 I can understand how you may have missed my point since I mentioned the overlapped operations in passing. If you do a bit of reading about WaitForMultipleObjects (you may have better luck finding an example which utilizes WSAWaitForMultipleEvents which is a thin wrapper for WaitForMultipleObjects), you'll see that when bAlertable=FALSE, you can use it as a simple polling mechanism which will signal when an object is readable or writeable. This does not require you to start an operation asynchronously.

@retep998
Member
retep998 commented Mar 4, 2016

@akcom Can you link me to such an example which demonstrates it being used to signal when a file handle or socket is readable or writeable?

@dgrunwald
Contributor

Note that WaitForMultipleObjects is limited to MAXIMUM_WAIT_OBJECTS, which is 64. It's easy to run into that limit, and requires complex logic to deal with (e.g. a thread pool that uses one thread for each group of 64 sockets).

It's usually a good idea to avoid WaitForMultipleObjects and use completion based async IO instead.

@retep998
Member
retep998 commented Mar 4, 2016

@akcom Ah, you're referring to WSAEventSelect which provides readiness notifications in the form of an event that can be waited on. However that API is limited exclusively to sockets and is unusable for file handles, so if an async library wants to support files or pipes it would have to support completion based async regardless. And, as @dgrunwald mentioned, the wait API doesn't really scale up too well.

@NeoLegends

Please excuse my naivety, but what about libuv?

@retep998
Member

@NeoLegends Rust used to use libuv years ago when it still had libgreen. That wasn't really async so much as "green" threads (operations were still fundamentally synchronous), and it was abandoned due to the complications that arose (spinning on a green thread could cause deadlock, for example) and for performance reasons. I have heard of some attempts since then to use libuv as a third-party crate for Rust, although do note that libuv is just a cross-platform abstraction, and any Rust library can use the system primitives directly instead of wrapping libuv. One issue with libuv is that it uses callbacks for nearly everything, and having to bend your Rust code to that model is rather complicated due to the restrictions it forces on lifetimes and borrows.

@NeoLegends

@retep998 What if the Rust implementation was a wrapper of libuv? Couldn't one abstract the callbacks away?

@chpio
chpio commented Mar 27, 2016

but how do you want to deal with "continuation passing" if not with callbacks?

@NeoLegends

The Rust wrapper library would obviously have to do that. It doesn't need to expose the callbacks through the public interface, though.

@retep998
Member

What would the Rust interface look like then? The libuv model is to get a callback when the operation is done; the libgreen model is to just block the thread until the operation is done. What sort of system do you propose to notify the user when their operation is complete? The big questions that are blocking async IO are not implementation details; using libuv won't save us anything over using system primitives directly. Rather, the question is what the Rust interface looks like: how can it be designed such that it imposes few restrictions on Rust code while mapping efficiently to the underlying primitives?

@burdges
burdges commented Mar 29, 2016

Any C asynchronous IO solutions like libuv require rather thick wrappers to make them safe in Rust, @NeoLegends, so using a native Rust solution like mio makes more sense.

There are many asynchronous IO interface designs, @retep998. In Rust, there is using mio directly, but indirectly one could use mio through coroutines with mioco, promises with gj, futures with eventual_io, or state machines with rotor.

All have advantages and disadvantages that can get quite subtle. I think one should focus on making the current mio ecosystem play nicely together with zero-ish additional cost, and hope it evolves into something that can be standardized. Just a few questions:

  • What do these different approaches cost? Are state machines like rotor the only "zero cost" solution? Are coroutines or green threads like mioco higher cost than fancy callbacks like promises/futures?
  • Does eventual_io pay any costs beyond gj for its greater abstraction via eventual? It'd be interesting if gj and eventual_io could be merged by making the Cap'n proto RPC use eventual_io or something.
  • Can eventual be run on top of a rotor state machine and/or mioco? Can this be streamlined and done efficiently?
  • Is there an argument that coroutines are dangerous and their use should be minimized? I'm thinking here about restrictions to parallelism primitives like x10 that avoided deadlocks.
@tailhook

I'm not sure the questions are for me, but let me share some thoughts:

What do these different approaches cost? Are state machines like rotor the only "zero cost" solution? Are coroutines or green threads like mioco higher cost than fancy callbacks like promises/futures?

mioco shouldn't be much more expensive in theory, but the problem is stack management. Most Rust programs expect a large stack: file copy allocates a 64kb buffer on the stack, each format/log macro occupies quite a lot, and so on. And the only way to free stack memory after use is to deallocate the whole coroutine/microthread. (Deallocating the stack on each client request is too slow; you need a pool of threads/stacks.)

I have some experience in vagga, which runs on musl libc and so has a small stack limit of 80kb. It turns out that the stack size runs out pretty quickly. (And 80kb per coroutine/microthread is too much for many network applications.)
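The per-task stack cost described here is easy to observe with plain OS threads, where the stack size is at least configurable via `thread::Builder` (the 80 KiB figure below just mirrors the musl limit mentioned):

```rust
use std::thread;

// Each coroutine/green thread needs its own stack; the per-task
// memory cost exists for OS threads too, where the size is at
// least tunable.
fn run_on_small_stack() -> u64 {
    thread::Builder::new()
        .stack_size(80 * 1024) // 80 KiB, like the musl limit above
        .spawn(|| {
            // Keep stack usage modest: a large local such as
            // `[0u8; 64 * 1024]` (the file-copy buffer mentioned
            // above) would risk overflowing this small stack.
            (1u64..=10).sum()
        })
        .unwrap()
        .join()
        .unwrap()
}

fn main() {
    let sum = run_on_small_stack();
    assert_eq!(sum, 55);
    println!("sum computed on an 80 KiB stack: {}", sum);
}
```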

Can eventual be run on top of a rotor state machine and/or mioco? Can this be streamlined and done efficiently?

Yes, I strongly believe so. This is how I think it should be done: raw protocol implementation with state machines, and a simple protocol wrapper with closures/futures on top of that. It hasn't been done yet, though.

@NeoLegends

@retep998 Ah, I see. Thanks for the clarification.

@burdges
burdges commented Mar 29, 2016

I've a partial answer from @dwrensha to:

Does eventual_io pay any costs beyond gj for its greater abstraction via eventual? It'd be interesting if gj and eventual_io could be merged by making the Cap'n proto RPC use eventual_io or something.

It appears gj wanted single-threaded performance with Rcs while eventual wants multi-thread safety with Arcs and Send bounds. At least this particular difference could be resolved with higher-kinded polymorphism.

@dwrensha

@burdges: it's not just performance. Sharing Rc data among tasks on a single thread is less error-prone than sharing Arc data among multiple threads, because the yield points are explicit.
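The tradeoff in one picture, using plain std types as stand-ins for the two libraries' choices: `Rc<RefCell<_>>` for sharing between tasks on a single thread (no locks, not `Send`) versus `Arc<Mutex<_>>` once tasks may hop threads:

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Single-threaded sharing: no locks, and the compiler rejects any
// attempt to send the Rc to another thread (it is not Send).
fn single_threaded() -> i32 {
    let counter = Rc::new(RefCell::new(0));
    let task_handle = Rc::clone(&counter);
    *task_handle.borrow_mut() += 1; // "task" 1
    *counter.borrow_mut() += 1;     // "task" 2, same thread
    let n = *counter.borrow();
    n
}

// Multi-threaded sharing: Arc + Mutex are required as soon as
// another thread touches the state.
fn multi_threaded() -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let worker = Arc::clone(&counter);
    let handle = thread::spawn(move || *worker.lock().unwrap() += 1);
    handle.join().unwrap();
    *counter.lock().unwrap() += 1;
    let n = *counter.lock().unwrap();
    n
}

fn main() {
    assert_eq!(single_threaded(), 2);
    assert_eq!(multi_threaded(), 2);
    println!("both styles reach the same count");
}
```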

@burdges
burdges commented Mar 29, 2016

Interesting. I briefly searched for Rc vs Arc, but found little and didn't try that hard. I probably should've searched the RFCs specifically. Thanks!

@icorderi

I created the crate await as a placeholder for what we would like to have in the core.
I also included a partial survey of what's currently out there (in Rust), and what is built on top of what.

After looking at the abstractions the crates provided and the dependencies between them, it seems we have a crate for each kind of abstraction @chpio enumerated (and there seem to be no arguments against that list).

The first thing that stands out is that mio, or rather, event loops, are definitely an important building block throughout the other crates.

There also seem to be a few "shiny" + "shiny-io" crate pairs going around. Ideally, all those "*-io" crates would become unnecessary with the "proper" core abstractions in place.

contributions are welcome, encouraged and very much needed!

@vinipsmaker

@icorderi: I too believe that an executor (what you call event loop) is an important first step.

Recently my employer/client wanted to build a library that doesn't spawn hundreds of threads to maintain a few connections and asked me to design a new solution.

I started taking a look on current Rust abstractions. Pure mio is horrible because it doesn't compose. You cannot create different crates abstracting different protocols because they need to collaborate with each other (e.g. token clashing, message types to mio...). If you create crates low-level enough, you need to do too much boilerplate to integrate them.

Abstractions wrapping mio were too incomplete, too intrusive or mixed up too many concepts within a single library (which breaks single responsibility principle and makes unclear to reason about what happens as there are no clear boundaries).

Then I started researching executors design (an executor is to function execution as an allocator is to allocation) and I've liked the design of this particular proposal:

An executor that allows you to post functions is important because it lets you guarantee that your asynchronous operations' initiating functions always return immediately. This is a base guarantee for avoiding problems like unfairness, jitter, and starvation. An example of such unfairness is given in the paper http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0114r0.pdf (4.5 When scheduling logic is embedded in the language).
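The "post returns immediately" guarantee described above can be sketched with a toy single-threaded executor (illustrative only, not any crate's actual API): `post` only enqueues, and `run` drains the queue, so an initiating function never runs user code inline:

```rust
use std::collections::VecDeque;

// A minimal single-threaded executor in the sense used above. Not a
// production scheduler; it only shows the interface shape.
struct Executor {
    queue: VecDeque<Box<dyn FnOnce(&mut Executor)>>,
}

impl Executor {
    fn new() -> Self {
        Executor { queue: VecDeque::new() }
    }

    // Initiating function: returns at once and never runs `f` inline,
    // which is the fairness guarantee the quoted paper argues for.
    fn post<F: FnOnce(&mut Executor) + 'static>(&mut self, f: F) {
        self.queue.push_back(Box::new(f));
    }

    // Drain the queue; tasks may post follow-up work (continuations).
    fn run(&mut self) -> usize {
        let mut executed = 0;
        while let Some(task) = self.queue.pop_front() {
            task(self);
            executed += 1;
        }
        executed
    }
}

fn main() {
    let mut ex = Executor::new();
    ex.post(|ex| {
        ex.post(|_| println!("continuation"));
        println!("first task");
    });
    let n = ex.run();
    assert_eq!(n, 2);
}
```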

I started designing/experimenting a mio wrapper that provide an executor, a timer queue and a socket reactor: https://github.com/vinipsmaker/asio-rs/tree/master/examples

My colleague @canndrew was working on a design based on http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4045.pdf to allow a model that can be adapted to any async paradigm (callbacks, futures, coroutines...), but he eventually said he'd need higher-kinded types.

@tikue
tikue commented May 13, 2016 edited

@vinipsmaker (shameless plug) I'm currently working on an async re-write of tarpc built on top of mio. It allows registering multiple different tarpc services on a single event loop, and multiple clients on a single event loop. Here's a contrived example of a client calling a server which subsequently makes 2 calls to another server and waits until both calls are available before sending a reply to the client.

I'd be interested to hear if that suits your needs. It only currently really works on nightly due to some feature flags being flipped on. Hopefully that won't be the case for long.

@vinipsmaker
vinipsmaker commented May 13, 2016 edited

@tikue: The model you propose isn't a large improvement over mio's. I can already use pure mio (if I build a monolithic crate and ignore mio's problem of composing interfaces). There are other mio wrappers, so I don't need to build my own. My comment was more a contribution toward a future standardized abstraction within the Rust core library. Within this discussion, I cannot see a hacky way around mio's current API as a good solution.

First thing about mio is that it uses a reactive/passive model rather than proactor/active model.

An active model puts the user as the active party. This is an elegant solution to answer questions like "how can I defer acceptance of new connections during server high-load scenarios?"

I also must say from my experience that reactive APIs are harder to design tests for (you need workarounds to identify which reaction maps to which action from the test).

You can also look at the situation as mio being based on the readiness model, while the APIs that receive more technical praise are based on the completion notification model.

There are also concerns about coroutine integration. Sure, coroutines are not the solution to everything, but many times blocking code is more readable simply because you can keep using the language constructs (if, while...) for control flow. Coroutines are the only solution to asynchronous programming that keeps this property that I know of. More than one developer has reported to me that Boost.Asio's ability to start an operation within a certain async paradigm (e.g. coroutines) and then move to another model (e.g. callbacks) for the same I/O objects was quite useful.

EDIT:

There's a little inaccuracy in the example of deferring acceptance. mio is capable of that. However, I don't see the same property in your abstraction.

EDIT2:

A little documentation around your design would be good too. And I know it's unfair on my part to ask for documentation of your design when I have not done the same for mine.

@tikue
tikue commented May 16, 2016

@vinipsmaker there's no fundamental reason tarpc can't support deferring acceptance. It's just not a high priority at the moment, and there's a lot of design space to consider. Regarding documentation, feel free to open issues where you believe it's lacking. As the async branch is still undergoing a lot of churn (it hasn't even merged to master yet), writing documentation isn't super-high priority at the moment.

@vinipsmaker
vinipsmaker commented May 16, 2016 edited

@tikue: you're right. There is no fundamental reason why tarpc can't support deferring acceptance. My previous argument wasn't good and didn't explain the problem.

The problem is how much detail and how many customization points you need to put in your abstractions. With a reactive interface you start with a small abstraction and keep building complexity on top over time: "I need a pool to recycle objects, let me add a PoolFactory and a set_factory... now I want to use objects from this array allocated on the stack and no more than X requests handled at once, let me add... now I want deferred acceptance... and now...". You'll add an interface for everything you think of, and the interface will never be enough. With an active style, the problem is solved and you don't need to do anything. You don't need to justify with "it's not a priority at the moment" because it's done from the moment the abstraction is usable.

The active style also maps very naturally to some other paradigms like futures and coroutines (accept which returns a future or suspend the current coroutine).

@burdges
burdges commented May 17, 2016

It's dubious that anyone understands this design space well enough right now. Are there any proofs of deadlock freedom for a subset of coroutine flexibility? Just because I need complex shutdown logic do I need to forgo the advantages of restricting to futures? etc.

It sounds like the first question should be : What language features do projects like eventual, gj, rotor, and mioco need to better integrate with one another? At minimum, the lack of higher-kinded polymorphism contributed to the fork between eventual and gj, and appears to be limiting others as well.

@vinipsmaker

It sounds like the first question should be : What language features do projects like eventual, gj, rotor, and mioco need to better integrate with one another?

This would be the executor: exactly what async IO needs. One thread being shared by multiple tasks. The executor does that; it controls execution within the thread.

At minimum, the lack of higher-kinded polymorphism contributed to the fork between eventual and gj, and appears to be limiting others as well.

This is my understanding as well.

@icorderi

@burdges I'm trying to answer exactly that.

"What are the core features that we need to better integrate all this async crates?"

The gist of the current state is:

/// Abstracting the ability to be resumed when something completes
trait Await<T> {
    fn await(self, ctx: &mut Context) -> T;
}

The Context is an abstraction related to what @vinipsmaker referred to as the "executor".
The "executor" is essentially a Context factory that knows how to create Thread's or mioco coroutines or something else.

/// An execution context
///
/// This is a `Thread` on a threaded model, or a **mioco** `Coroutine`
trait Context { 
   // Currently playing with what should go in here...
   // The primitives to do signaling and cross-context communication should be here 
}

Ideally this would be backed by some sugar and allow us to write code like this:

async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8;1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}

The desugared form can be based on continuations or a state machine, with the Context flowing through.

/// Example of desugaring the **await** syntax using continuations
pub fn foo_async_desugared<C: Context>(ctx: &mut C) -> impl Await<(String, TcpStream)> {
    TcpStream::bind_async("localhost:8123", ctx)
        .and_then(move |mut s, ctx| {
            let mut buff = [0u8;1024];
            s.read_async(&mut buff, ctx)
                .map(move |c| (String::from_utf8(buff[..c].to_vec()).unwrap(), s), ctx) }, ctx)
}

You can find the current state of this here (docs).

@camlorn
camlorn commented May 18, 2016

@icorderi
Is there a good reason that the syntax sugar for making async convenient should be separate from the syntax for something like a Python generator? As I understand it, generators are something at least some people want (including me, kind of).

I know Python ultimately decided that the answer is yes, but I'm not sure what their justifications were or if they apply to Rust. I knew at one point--I think it was mainly that Python is dynamic and you could accidentally return stuff you shouldn't--but haven't looked at the PEP in forever.

Perhaps they need to desugar to different things for zero cost abstraction.

@icorderi

@camlorn, yes, the sugar should not be the same, the two are semantically different.

Think of it without the "asynchronous" bit. [exaggerating] You are essentially saying that because a Vec can hold a single value, we should make all functions just return Vec.

An Await<T> is parallel to returning a T, while a Iterator<T> is parallel to Vec<T>.

On the other hand, the desugaring of yield x is quite "similar" to let x = await foo(), in the sense that both would likely come down to using continuations or state machines.

The compiler's ability to break down a function into continuations / states can be used to support both features.

@vinipsmaker

@icorderi:

/// Abstracting the ability to be resumed when something completes

It's pretty obvious that this is VERY tightly coupled to the concept of coroutines. Therefore, you're not answering the question "What are the core features that we need to better integrate all this async crates?" because you assume coroutine concepts right from the beginning.

While you advocate "awaitable concepts" I advocate "sharing the thread among multiple jobs/tasks/closures/..." (i.e. the executor).

And if you want to propose a coroutine implementation, you do not even need to integrate it with a thread-sharing/executor/... mechanism. The integration should be done by a third piece.

"What are the core features that we need to better integrate all this async crates?"

This is something that Christopher Kohlhoff (Boost.Asio author) has done for C++. He not only provided the foundation, but also provided all these "crates" (callbacks, coroutines, futures and the integration among them). And his model is not too tightly coupled like most would do, but rather extensible and open to new paradigms (you could implement blocking-style if you want or something new you think of). His library is really flexible and you could adopt several models for a project (single-threaded, one executor per thread each processing multiple clients, one thread per client...).

He is now proposing papers to have these abstractions integrated in the core C++ itself and the one thing to notice is that his abstractions are separate abstractions and not a big blob of something that can only be used together. executors, network interfaces, foundation for extensible (i.e. it supports callbacks, futures, coroutines...) async abstractions and stackless coroutines.

If we're going to discuss coroutines, I wish we would discuss coroutines on their own, as these features are building blocks for other abstractions and not a supporting tool you'll only use glued to network code. For instance, in Python you have generators. Some folks in the C++ land are even implementing channels inspired by Go's channels on top of coroutine interfaces.

@icorderi

@vinipsmaker,

It's pretty obvious that this is VERY tightly coupled to the concept of coroutines.

I'll have to strongly disagree with that.

Being able to reason abstractly over a function call that will not return immediately/synchronously does not imply or require coroutines to exist.

We do not need a coroutine to accomplish the following:

async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8;1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}

bind_async and read_async operate on top of select/poll constructs. No coroutines involved.

Now, you are obviously not wrong either. If we had the ability to speak abstractly about asynchronous computations, we might as well give them the possibility of being scheduled on an execution context that is not just pure threads. Otherwise, we would be wasting a lot of potential. This is where coroutines come into play.

But I don't expect the std to have coroutines, just to have an abstraction for an execution context (thread, coroutine) so that external libraries can implement concrete models. The std would also impl Context for Thread.

For this to work, we need to make sure that pretty much everything under std::sync can be implemented on top of this Context abstraction (or have the *_async() versions of the blocking calls), so that RWLocks, Mutex, Barrier, channels don't break other implementations of Context by blocking an entire Thread.

With this in place, asyncio can be an independent crate on the nursery implemented on top of an epoll/kqueue crate (quite probably also in the nursery).

mioco can then be written on top of that to provide a CoroutineContext that allows us to run Await's on them.

futures/eventual/gf/etc... can all be written on top of Await and used on either mioco or Thread's, it all depends if you schedule the initial async call on a Thread or on a Coroutine.

@vinipsmaker

@icorderi:

I quoted "Abstracting the ability to be resumed when something completes" when I stated "it's very tightly coupled to the concept of coroutines". I restate the same phrase again, as you haven't shown me how "resumable functions" aren't very related to the concept of coroutines.

And in your example:

async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8;1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}

You mentioned bind_async and read_async, but what is interesting is the await construct. What does it do? Suspend function to be resumed later? If this is what it does, then this is a coroutine. So yes, coroutines are involved.

Also, I'm not sure I understand your Context proposal. Could you explain it more, please?

@icorderi

@vinipsmaker

Take the following two snippets:

std::thread::Thread::spawn_async(foo_async)
mioco::Coroutine::spawn_async(foo_async)

In the Thread snippet, a new Thread gets created to execute foo_async, the await's inside foo_async will park the Thread, and it will stay there until the IO completes. In this case, the code should behave the same as the blocking version:

fn foo_async() -> String {
    let s = TcpStream::bind("example.com:8080").unwrap(); 
    let mut buff = [0u8;1024];
    let c = s.read(&mut buff).unwrap();
    String::from_utf8(buff[..c].to_vec()).unwrap()
}

In the Coroutine snippet, the await's will park the Coroutine, allowing other coroutines to use the thread that freed up. Whenever the await inside foo_async completes, it will unpark the coroutine. Now a few things can happen here, which are all design choices about how mioco wakes up a coroutine. If we assume a cooperative scheduler, then, whenever the current running code in the Thread that was backing the Coroutine yields/pauses/completes, foo_async will resume (assuming no other work was pending scheduling on that coroutine).


Maybe it's subtle, but abstracting a computation that will return at some later point in time (Await), is not the same as user threads (Coroutines).

Although in reality, a world with just Await and no Coroutine's is kinda boring.
On the other hand, a world with Coroutines an no Await is the mess we live in right now, where we have no common abstraction to describe something that will return at some point in time.

Yes, Await and Coroutine are related in the sense that having one without the other is boring or an API mess. But they are not the same concept, nor does one require the other.


We already have Coroutine's, I'm not proposing a coroutine library, nor that saying that the std should have a default coroutine implementation.

What I'm proposing is to abstract a computation that will return at some later point in time (Await) and to abstract an execution Context. The only execution Context in the std is a Thread, the same, very real 1:1 thread we have today.

We can have a zoo of Coroutine implementations with cooperative scheduling, preemptive scheduling, at w.e. other flavor people want publish on crates.io.


The important thing is that any code written to be async using Await, will run in all the async libraries, and any Await+Send created inside some Coroutine-X crate, can be tossed over to some Coroutine-Y crate and it will just work.

This also allows us to write some async library backed by coroutines that will continue to make forward progress even if someone decides to call a lock on a mutex. Because that lock will freeze the Context which might be a thread or a coroutine. Same thing applies for channels.

This is precisely the goal, to be able to interoperate between all this async libraries.


@vinipsmaker, did it make sense?

@vinipsmaker
vinipsmaker commented May 20, 2016 edited

@icorderi:

Thanks for the explanation. Your proposal is much more clear now.

abstracting a computation that will return at some later point in time (Await), is not the same as user threads (Coroutines).

This is what futures & promises are for. In this case, the await is just syntax sugar for that [possibly blocking code]. There is a crate in Rust just for that: https://en.wikipedia.org/wiki/Futures_and_promises

If your proposal is not about futures & promises, could you explain the differences (aside from naming and other irrelevant differences)?

Yes, Await and Coroutine are related in the sense that, having one without the other is boring or an API mess. But they are not the same concept, nor one, requires the other.

Your Await is your Await (i.e. a new concept). You're the authority on what it is and what it is not. I'm trying to understand what it is, and I'm trying to relate your explanation to things I know using more standardized terms than "new" concepts.

Your example:

async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8;1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}

There are two possibilities here:

  • await TcpStream::bind_async("example.com:8080") blocks. Then we're discussing non-async abstractions and I'm on the wrong thread.
  • await TcpStream::bind_async("example.com:8080") suspends the function (i.e. local variables are saved and this is a coroutine no matter if you're spawning it from another thread or the same thread). Coroutines are subroutines that have two more operations, suspend and resume (and this is what I see here).

Now, your code has await (the keyword) and Await (the "object"/class/...). await is very related to coroutines. Please spend a little more time explaining more if I'm wrong. And I'm aware that this is just syntax sugar to Await and Await should be usable without await.

If you want to pursue this route, you'll need to provide a solution for "futures islands" (futures with multiple implementations cooperating together). I'm aware of none, and I'm aware of endless discussions on the Boost mailing list trying to come up with one. It also seems to me that you're trying to unify async code so threads and coroutines can share the same algorithms, which makes me only more inclined to believe you'll need to solve this problem (futures islands) if you want to follow this route.

All in all, this doesn't solve one single problem: sharing a single thread with multiple jobs/tasks/closures/... . This is the executor. We have mio, but mio alone is a mess to integrate with different crates (token clashing, only one possible implementation of Handler running...).

and to abstract an execution Context

I'm still unsure about what is this "execution Context".

The important thing is that any code written to be async using Await

Not all problems were solved. We still need to "spawn" a few things and know in which threads they'll run. I'm not convinced until these other pieces are described well enough too.

This is precisely the goal, to be able to interoperate between all this async libraries.

Agreed.

A little text of myself covering a little on the topic of asynchronous code (including the Kohlhoff's solution to interoperate multiple async paradigms): https://vinipsmaker.wordpress.com/2015/11/26/multitasking-styles-event-loops-and-asynchronous-programming/

@icorderi

@vinipsmaker thanks for taking the time to go back and forth on this. I appreciate the discussion and you raise very valid questions. I hope the following long post will provide you with answers.

"abstracting a computation that will return at some later point in time (Await), is not the same as user threads (Coroutines)." - @icorderi

This is what futures & promises are for. In this case, the await is just syntax sugar for that [possibly blocking code]. There is a crate in Rust just for that: https://en.wikipedia.org/wiki/Futures_and_promises

I'll have to disagree with you again. Just because it quacks, and with a nudge in the right direction it swims, it doesn't make it a duck.

A (Future, Promise) pair are different abstractions than an Await + Context.

You can hold a Future, check if it's completed, wait for it to complete. Once it is completed, you can access it all you want. A Promise is used to fulfill / set / send the value to the corresponding Future it was created alongside with.

You can think of an Await as an asynchronous FnOnce, once it's consumed, that's the value it produced. You can't really call it again, it doesn't cache the result for you, it doesn't notify you of completion, you can't ask it if it's done (it will just finish when it finishes).

A Future/Promise pair is closer to a channel() than to an Await.

Of course, if someone gives you a Future and you need the value now, you can let x = await the_future.


@vinipsmaker, when I say abstraction x is not y, I don't mean that you couldn't get to y from an x.
I simply mean that semantically, they are not the same concept.

Let's look at some async options, from the user's perspective.

Active

  • If you just want caching, use something like Lazy::new<T, A: Await<T>>(a: A) -> Lazy<T>.
  • If you want tracking of completion (+ caching),
    • You have an Await? Then use something like Future::from_async(..) (this does not produce a Promise; the return of the Await is an implicit "set").
    • Someone else will produce the value? Then use Future::new() -> Promise.
  • If you just need the value of an async computation and you can't make forward progress without it, then let x = await foo() will do the trick.
  • If x is not a one-off and you will be getting more x's, use let (tx, rx) = channel()

Passive

  • if you want to be invoked when something happens / is-ready-for-you, use an Event

@vinipsmaker, I'll try to explain the Context based on the following quote from you:

...it seems to me that you're trying to unify async code so threads and coroutines could have the same algorithms.

That sentence, particularly what I bolded out, is precisely why the Context abstraction exists, and what the Context abstraction is for.

Since we agree that a Thread and a Coroutine are not the same thing. Simply because of that, a Context and a Coroutine cannot be one and the same, because a Thread is also a Context.

Context unifies Threads, Coroutines, etc. under a single abstraction.

By the way, what you state as the "future islands" problem is exactly what gets solved with the Await, Context abstractions.


All in all, this doesn't solve one single problem: sharing a single thread with multiple jobs/tasks/closures/...

Correct, this is not intended to solve that. This is about what basic abstractions should go into the std to support interoperability between different async crates.

I'll maintain my position, I don't believe we need Coroutine's in the std. We just need an abstraction that will allow the community to create different jobs/tasks/closures/Await schedulers. Schedulers that create Contexts to execute this things on.


Not all problems were solved. We still need to "spawn" a few things and now in which threads they'll run. I'm not convinced until these others pieces are described well enough too.

Fair enough.

As far as scheduler choices go, the choice should be made by the main() crate.
That's the guy that is about to consume a big pile of Await stuff and can't really defer the decision any further.

fn main_async() -> impl Await<()> {
    // lots of awaits calling more awaits, spawning even more awaits
}
fn main() {
    ThreadPool::with_capacity(8).run_synchronously(main_async).unwrap();
    // or
    GreenPool::new(8).run_synchronously(main_async).unwrap();
}

In this example, the two, ThreadPool and GreenPool, are very different schedulers, and the Context's they produce are also different things.

A ThreadPool is created with 8 worker threads. A Context is still a very real Thread for that guy. So it could very much run out of Context's if all threads are waiting on something. If you queue 9 things, 1 will wait until another completes.

On the other hand, the GreenPool is created also with 8 backing threads, but their Context would be a GreenThread. You can have a million Await's running in parallel (yes, only 8 will be making progress at any given time). With cooperative scheduling, every time one pauses waiting for something, another one takes a turn.

Note: Libraries (as opposed to binaries) making a choice of a concrete scheduler should be something rare, and they should have a good reason for it (i.e. a library that will have thousands upon thousands of long running Await's would not be happy with the ThreadPool)


@vinipsmaker, I hope the proposed abstractions, how they interact with each other as well as how crates benefit from this to interoperate with each other start making more sense.

@jsgf
jsgf commented May 23, 2016

@vinipsmaker I think @icorderi's proposal could be implemented with a plain state machine, where each "function" maps to a state machine represented by a current state variable and a struct/tuple of state-specific variables. In this case, something that appears syntactically as a call is actually the composition of nested state machines.

I think this is a generalization of a coroutine: in a coroutine the state tuple is a current frame, the nesting is the call stack, and the current state is the program counter. But I wouldn't consider the general state machine concept a "coroutine".

@glaebhoerl
Contributor

@icorderi I'm curious how this might work on a type system level. E.g. (I see you already gave the definition for the Await trait above), what do you think the body of trait Context might look like, even if just as an example?

@icorderi

@glaebhoerl, I haven't been able to play with Context enough to answer that with certainty.

In order for this to work, Context should have the building blocks of what's needed for coding the internals of the std::sync types (at least the awaitable versions).

The design of Context should allow a Coroutine implementation to never truly block, regardless of code being executed by the coroutine (or stuff the code calls) using Locks, Barriers, Receivers, SyncSenders, etc..

For now, I mirrored most of the free functions on std::thread that operate on the current Thread as well as the unpark() from Thread.

/// Execution `Context` for `Await`'ables
pub trait Context {
    /// Atomically makes the handle's token available if it is not already.
    fn unpark(&self);

    /// When _awaited_, blocks unless or until the current context's token is made available.
    /// This method returns immediately.
    fn park(&mut self);

    /// When _awaited_, blocks unless or until the current context's token is made available
    /// or the specified duration has been reached (may wake spuriously).
    /// This method returns immediately.
    fn park_timeout(&mut self, dur: Duration);

    /// When _awaited_, puts the current context to sleep for the specified amount of time.
    /// This method returns immediately.
    fn sleep(&mut self, dur: Duration);

    /// When _awaited_, cooperatively gives up a timeslice to the scheduler.
    /// This method returns immediately.
    fn yield_now(&mut self);
}

I haven't gotten around to reading all the source code from std::sync to get a complete abstraction in my head. An initial dive into the std::sync code shows that it's built on top of sys::common, which are thin wrappers over the OS synchronization structures.

These seem to be the base structures the std::sync module is built on:

Those will block the thread, which means the Context abstraction needs to include them somehow.

It could be as trivial as:

pub trait Context {
    fn mutex(&mut self) -> Mutex;
    fn rwlock(&mut self) -> RwLock;
    fn condvar(&mut self) -> Condvar;
    // ...
}

Perhaps, Mutex/etc.. constructors should be able to get the internal implementation from the Context:

impl<T> Mutex<T> {
    /// Creates a new mutex in an unlocked state ready for use.
    #[stable(feature = "rust1", since = "1.0.0")]
    pub fn new(t: T) -> Mutex<T> {
        Mutex {
            inner: box Context::current().mutex(), // was: StaticMutex::new()
            data: UnsafeCell::new(t),
        }
    }
}

A good exercise to get a tighter definition for Context is to start a sync2 crate with Mutex and Receiver working with Await. The Thread'ed version should have no extra overhead and end up doing the same thing the current implementations do. A Coroutine version should never block the underlying Thread.


@glaebhoerl, does this help you understand how it would work at the type system level?

@glaebhoerl
Contributor

@icorderi That's helpful, thanks.

One thing I suspect would be an issue with that approach is that it would imply distinguishing a closed universe of supported synchronization primitives -- the ones currently in std. Whereas I suspect the attitude of the people involved would be "those are just the ones we chose to include in std, anyone should be able to roll their own and put it up on crates.io", which no longer works so well if they all have to be centrally enumerated up-front in a trait. (This feels like the expression problem.)

But I'll leave that to people who are more knowledgeable about this stuff.

@eternaleye
eternaleye commented May 24, 2016 edited

One thing I'd like to bring up is the C++ proposal N4453.

It defines an exceedingly low-level primitive, "resumable functions," which are sufficient to implement async/await, promises/futures, or cooperative green threading.

The proposal uses the "resumable" keyword in three places, which could be implemented independently if Rust were to take this approach:

  1. resumable T in function return types, like constexpr T. This declares that the function may suspend itself for later resumption.
  2. break resumable;, the actual suspension point.
  3. resumable T in variable declarations. This declares that the variable holds the result/state of a resumable expression.

However, all of the above are syntax sugar. What is actually required to implement the feature is an interface for resumable callables, much as Fn/FnMut/FnOnce are an interface for callables that do not suspend.

The interface would only require one method:

trait Resumable {
    type Output;
    fn resume(&mut self) -> Option<Self::Output>;
}

This Resumable trait describes the execution state of a resumable function. The resume() method produces None until the resumable computation is complete, at which point it produces Some. Interestingly enough, this is a near-perfect dual for Iterator - even up to the lack of guarantees should resume() be called again after it returns Some.

A resumable function would then return some R: Resumable, await would be a trivial expansion using match, and suspension points would be simple syntax sugar in the same way closures are. The last could be left to later (pending compiler support), while the former parts require no special support whatsoever.

@eddyb
Member
eddyb commented May 24, 2016

We had some discussions in Orlando last year, where we agreed on state-machine generators on top of something like mio for high-efficiency network servers.

Every time I see mentions of coroutines (as in, "green threads"; Python sort of muddled the distinction by what I think is terminology misuse) or boxed callbacks in async IO discussions they make me sad because those actively impose limits on what you can actually do at any given time without running out of memory, not to mention slowdowns, although some virtual call costs wouldn't be noticeable.

cc @pcwalton who knows more about the state machines + mio approach.

@eddyb
Member
eddyb commented May 24, 2016

@eternaleye You're describing the other half of a generator, for which a minimal API would be:

pub enum Yield<T, R> {
    Value(T),
    Return(R)
}

pub trait Generator {
    type Value;
    type Return;
    fn next(&mut self) -> Yield<Self::Value, Self::Return>;
}

That is necessary to support both yield expr and return expr, although not enough to actually give input back to the generator, which introduces several complications.

@eternaleye
eternaleye commented May 24, 2016 edited

@eddyb That's a very odd trait for Generator, IMO. I mean, your definition is just a Result-ified Iterator.

Meanwhile, I see real uses for Iterators where Item: Resumable - in particular, if each value is expensive to compute, and latency is a concern.

@eddyb
Member
eddyb commented May 24, 2016

@eternaleye In a way, Yield itself is Result, but it makes more sense when your final return value is io::Result<R>, where Iterator<Item=Result<(), io::Result<R>>> seems very confusing and has three possible after-the-end behaviors (None, Err(Ok(())) or panic).

@eddyb
Member
eddyb commented May 24, 2016 edited

Before hearing @pcwalton explain how simple mio can be with generators, where you never have to pass buffers in and out, you just try to read repeatedly, and effectively yield on "would block", I wrote this crazy contraption as a model for generators with multiple entry points based on the types they expect to get back as input.

In a comment there is this playpen link for an example using the manually desugared implementation to show off "copying a file in chunks, with progress notification".

This model hits a wall when you have some construct that will execute multiple await requests in parallel, because it would need a way to know which one succeeded, whereas in the mio model, you just retry all of them, and at least one will succeed.

Based on what I now know, I would probably implement the mio model with helpers for non-network IO to be fair to multiple network clients, in which case reading from a file could just sometimes result in an artificial "would block", causing different actions to also be attempted.

@eternaleye

@eddyb: My point is that your definition of Generator is quite different from what I see almost anywhere else; the common definition seems to be admirably captured by Iterator.

Rather, I see Iterator and Resumable as duals in a very real, meaningful way:

  • An Iterator expands itself into multiple values by repeatedly invoking the same computation
  • A Resumable collects itself into a single computation by sequentially invoking potentially-distinct computations

A perfect example of something that maps well to Resumable is Iterator::fold(): each step of that maps perfectly to a call to Resumable::resume(), with the final step producing Some.

@tikue
tikue commented May 24, 2016 edited

Resumable as described wouldn't work with generators, whereas @eddyb's Yield would.

@eternaleye

@tikue: I'd suggest reading the linked C++ proposal more deeply - not only can resumables work with generators, the document contains an implementation of generators using resumables.

@eddyb
Member
eddyb commented May 25, 2016

Finally gave the paper a read, and it's pretty much what I expected: a dirty value stashing trick (which @eternaleye called "extending the iterator" - confusing out of context):

fn fib(mut n: usize, out: &mut Option<i32>) -> impl Iterator<Item=()> {
    let mut a = 0;
    let mut b = 1;
    while n > 0 {
        n -= 1;
        *out = Some(a); yield (); // yield(a)
        let next = a + b;
        a = b;
        b = next;
    }
}

fn main() {
    let mut val = None;
    for _ in fib(5, &mut val) {
        println!("{}", val.take().unwrap());
    }
}

While this is uglier than actually having generators in the language (not to mention somewhat less efficient), it also poses borrowing problems: the generator object would borrow val, making it impossible in the current-world borrow-checking to .take() from it in the loop.

Actually solving this problem involves knowledge of the generator object and/or the Iterator::next implementation for it, as in other cases mutating a mutably borrowed value can result in memory unsafety (iterator invalidation, really, if you replace the generator object with Vec iterator).
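
For reference, the stashing trick can be written today as a hand-rolled Iterator; using shared ownership (`Rc<Cell<...>>`) instead of a `&mut` borrow sidesteps the borrow conflict described above. This is only a model of the desugaring, not a proposal:

```rust
use std::cell::Cell;
use std::rc::Rc;

// The yield point becomes an explicit `next` call, and the yielded value
// is stashed into a shared cell instead of being mutably borrowed.
struct Fib {
    n: usize,
    a: i32,
    b: i32,
    out: Rc<Cell<Option<i32>>>,
}

impl Iterator for Fib {
    type Item = ();
    fn next(&mut self) -> Option<()> {
        if self.n == 0 {
            return None;
        }
        self.n -= 1;
        self.out.set(Some(self.a)); // "*out = Some(a); yield ()"
        let next = self.a + self.b;
        self.a = self.b;
        self.b = next;
        Some(())
    }
}

fn main() {
    let val = Rc::new(Cell::new(None));
    let fib = Fib { n: 5, a: 0, b: 1, out: Rc::clone(&val) };
    for _ in fib {
        println!("{}", val.take().unwrap()); // prints 0 1 1 2 3
    }
}
```

The reference-counting indirection is exactly the kind of overhead a real generator feature would avoid, which is the point being made above.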

@glaebhoerl
Contributor

@eternaleye I don't understand how Iterator and Resumable are supposed to be duals when their definitions are alpha-equivalent?

The actual dual to iterator is -- allegedly -- the observer pattern. (Linking the google cache because the original seems to be broken.)

@eternaleye
eternaleye commented May 25, 2016 edited

So @eddyb and I talked further on IRC, and I'm going to try and summarize (if I get anything wrong, just let me know).

  • Once I got what @eddyb meant regarding the lifetime issues, I agreed that it left an N4453-style approach dead in the water
  • However, I felt that generators as posed above had a problem of conflation: All suspension points yield values. This poses problems for a number of cases, such as when producing a value may take a while, and the programmer wishes to insert preemption points - this is not particularly nice to express with the above.
  • Furthermore, there are interesting interactions with "proxy yield" and explicitly driving generators from within other generators.

IMO, the optimal form of asynchronous execution is one in which:

  1. The programmer writes code which, as much as is feasible, appears to be synchronous
  2. When the programmer wishes greater control, they can explicitly handle the asynchrony of functions they call
  3. Synchronous code can easily use asynchronous code without "viral" keywords propagating through

Implicit in this framing is that case 1 is more common than case 2 - most of the time, the programmer should not have to care whether something is synchronous.

Consider the following (strawman syntax):

async fn fib() -> Yield<i64> {
    let mut prev = 1i64;
    yield prev;
    let mut cur = 1i64;
    yield cur;
    while cur > 0 {
        let (old, next) = (cur, prev + cur);
        prev = old;
        cur = next;
        yield cur;
    }
}

async fn fib_prime() -> Yield<i64> {
    for i in fib().filter( is_prime ) {
        yield i;
    }
}

Note that fib() is called as a perfectly normal function in fib_prime() - this is due to point (1) - requiring a keyword here would clutter the majority of code.

fib_prime() may potentially run for quite some time before it yields something; this would cause problems of responsiveness. This can be resolved by having fib_prime() suspend even when it would not yield; specifically, it suspends any time an async fn it calls would suspend. Here, that's fib(); this limits its worst-case execution between suspensions to one iteration of fib() and one call to is_prime(), rather than an unbounded number of the same.

Of course, there are times when one needs to explicitly drive a Generator. Here's such a case:

async fn foo() {
    let x = proc fib();
    for (i, v) in x.into_iter().enumerate() {
        println!("{}", i);
        if i % 2 == 0 {
            suspend;
        }
    }
}

This function has side effects and needs to ensure it's not interrupted between any pair of operations. To do this, it uses the proc keyword to denote that, rather than being interested in the value of the fib() function (regardless of execution semantics), it wishes x to hold "the process by which fib() produces the values". It then converts the Generator into an Iterator and processes it accordingly, suspending itself at a safe time to do so.

This satisfies (2) - the ability to control asynchrony when needed.

Finally, for (3), the tools have already been introduced:

fn baz() {
    let x = fib_prime();
    for i in x {
        println!( "{}", i );
    }
}

As a synchronous function has no suspension points, the behavior from (1) of suspending when fib_prime() suspends would be nonsensical - thus, calling an asynchronous function from a synchronous function returns the generator, in the same manner as using the proc keyword within an asynchronous function. This, together with the default behavior in (1), avoids the need for viral keywords.

In addition, as demonstrated in (2), generators convert cleanly to iterators. As a result, synchronous code that calls asynchronous code has essentially no syntactic overhead whatsoever.

However, in order to support this, the value returned from a Generator needs to support three possible cases, not just two (as @eddyb's suggestion had):

  • Suspend execution without a value
  • Suspend execution, producing a value
  • Finish execution
pub enum Yield<T> {
    Suspend,
    Value(T),
    Done,
}

pub trait Generator {
    type Value;
    fn next(&mut self) -> Yield<Self::Value>;
}

Note: @eddyb's design has Return carry a value R, but it may not be worth it: with suspending, an async function is zero-or-more Suspend followed by exactly one Value, while an async iterator is zero or more sequences of (zero-or-more Suspend followed by zero-or-more Yield). The return value complicates conversion to an Iterator (if R is not ()) or a Fn{,Mut,Once} (if T is not ()).

In the case illustrated with (1), the "outer Generator" (fib_prime()) would drive the "inner Generator" (fib()) in lockstep - but when the inner Generator produces Value, the outer Generator will capture it, and emit Suspend instead. On next iteration, it will feed the captured value into its own update process.

In the case illustrated with (2), the outer Generator is wholly unaware of the inner Generator - the update process handles driving it entirely on its own.

In the case illustrated with (3), conversion to an Iterator simply wraps the Generator in a newtype, which is trivial and loops on receiving Suspend.
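
A sketch of that conversion, with the three-state types repeated so the block is self-contained (`Toy` is an invented demonstration generator):

```rust
pub enum Yield<T> {
    Suspend,
    Value(T),
    Done,
}

pub trait Generator {
    type Value;
    fn next(&mut self) -> Yield<Self::Value>;
}

// Newtype adapter: synchronous code simply spins past `Suspend`,
// since it has nothing better to do while the generator is suspended.
pub struct GenIter<G>(pub G);

impl<G: Generator> Iterator for GenIter<G> {
    type Item = G::Value;
    fn next(&mut self) -> Option<G::Value> {
        loop {
            match self.0.next() {
                Yield::Suspend => continue, // no value yet; keep driving
                Yield::Value(v) => return Some(v),
                Yield::Done => return None,
            }
        }
    }
}

// Toy generator: suspends once before each of the values 1 and 2.
struct Toy {
    state: u8,
}

impl Generator for Toy {
    type Value = u32;
    fn next(&mut self) -> Yield<u32> {
        self.state += 1;
        match self.state {
            1 | 3 => Yield::Suspend,
            2 => Yield::Value(1),
            4 => Yield::Value(2),
            _ => Yield::Done,
        }
    }
}

fn main() {
    let collected: Vec<u32> = GenIter(Toy { state: 0 }).collect();
    println!("{:?}", collected); // prints [1, 2]
}
```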

@bugaevc
bugaevc commented May 29, 2016 edited

May I mention -- coroutines/generators are closely related to the IO monad, and await syntax is just another form of do notation. See #243, in particular this comment by @Eilie.

@eddyb
Member
eddyb commented May 29, 2016

@bugaevc I really wish someone had made a nice blog post about this, because rehashing the argument every time is tiresome and pointless: monads are a poor fit for Rust's low-level and imperative semantics.

Among other things:

  • You have to pick what kind of closure to take - Option wants FnOnce, anything which calls the closure more than once wants FnMut or Fn, depending
  • Monad::bind has to be generic over the closure type, but that means the implementer can't keep the closure around without boxing it (unlike, say, the map adapter on iterators, which gets static dispatch)
  • How do you integrate Rust's imperative control-flow primitives with closures and monads? This is an unsolved problem AFAIK, other than approaches like "TCP" which are usually implemented using exception handling machinery (hard to account for and optimize correctly).
  • A state machine transform, unlike monads or generalized CPS, can guarantee static dispatch and no additional indirection. This is something people building scalable systems want.

@dwrensha tried to explain how generators can be used in their blog post, but there's still lots of work to do to refine the details of the various approaches and then compare them.

Until someone can actually show how to implement useful monad-like abstractions in Rust that can compete with generators + ?/catch, I remain unconvinced about the value of that front.

@bugaevc
bugaevc commented May 29, 2016 edited

@eddyb Sure, I understand monads don't fit into Rust as easy as they do into Haskell.
I really like ?/catch and can't wait for them to reach stable. And yeah, they were relatively easy and straightforward to implement without having to dive into this lambda / state machine nightmare.

I have some experience with async stuff in Python, so I do understand both how generators work and how I/O coroutines can be implemented on top of them.

What I want to emphasize is that implementing await/yield syntax is essentially the same as implementing the do notation (with FnOnces).

? was a trivial special case that could possibly be implemented as do, but it was easy enough to be implemented the straightforward way. This time it is the general case. There is no simple way to do it without plunging into monads and do notation, and it has exactly the same set of problems.

Not really, since it's limited to FnOnces and that gets away with most of the problems -- but does Rust really need the totally general case? It seems to me that FnOnces would be enough.

Sure, the implementor can't keep the closure around without boxing it -- but neither can the I/O loop (or whatever machinery implements async I/O) keep around the implicit closure that implements trait Generator (the coroutine/generator continuation). It's a one-to-one mapping; that's what I'm trying to say.

@eddyb
Member
eddyb commented May 29, 2016 edited

@bugaevc But a generator has N states in a single type, where monads/CPS have one closure type per state.

but neither can the I/O loop (or whatever machinery implements async I/O)

But I can allocate an array of a million instances of a single generator ahead of time, with no indirection.
This is the future of massively scalable services: minimal computing overhead, maximal throughput.

@tikue
tikue commented May 29, 2016 edited

@eddyb can you elaborate? I can't really picture when you'd want to run a million copies of the same service. I actually think running many different services on a single event loop is going to be the more common use case. Any service that delegates to another service for part of its work will need to do this. This is actually something I'm struggling with in tarpc, where I box services on the event loop to enable heterogeneity.

@krstoff
krstoff commented Jun 1, 2016

@eddyb: I don't want to sidetrack the discussion on-hand, but someone did write an excellent blog-post about it (monads and Rust) some time ago.

https://m4rw3r.github.io/rust-and-monad-trait/

@eddyb
Member
eddyb commented Jun 1, 2016

@krstoff That addresses only my first point and pretends that it can eschew the second (which is not true, not while keeping the monad consistently typed).
And it's already pretty complex, sporting a huge feature requirement list.

@sagebind
sagebind commented Jul 2, 2016 edited

First, this thread has been very, very interesting to follow, and I really hope Rust can bring a standard solution to async (or at least something for the many crates out there to play nice together). I can't help but take a stab at this stuff myself (maybe yet another crate will show up from me?).

This thread seems to have wandered around for a while, I think because everyone already has in their own mind how they want the abstraction of async to look, whether it's callbacks, coroutines, or what have you. None of those are strictly required for async -- they are good abstractions over something that can be confusing to work with at times, but are not required for low-level interoperability.

The issue is this: we want some operation to begin without "blocking" the code we're in. There are lots of ways to solve this. Here's (another) summary:

  • Coroutines: Requires compiler support, and/or really, really fancy macros to break the linear flow of execution. When we either yield or await on something, we want to suspend the current coroutine and jump to code elsewhere. When the value is ready, we ought to jump back to where we were and resume execution.

    Question for those already thinking on these: for an expression let x = await y;, what is a valid type for y such that it can be "awaited" on? We can do some trickery to optimize coroutines calling other coroutines (to "flatten" them), but otherwise we need some sort of representation of a future value... which leads to...

  • Futures/promises/awaitables (there are many names used for similar concepts): Basically like a Result<T, E>, but something that has three states:

    pub enum Awaitable<T, E> {
       Pending,
       Ok(T),
       Err(E),
    }

    Of course this implementation won't work because the awaitable must be mutated from a Pending to an Ok or Err once the value becomes available, but this definition is for clarity of the core idea. This concept isn't completely at odds with other paradigms, unless you need to bypass an actual Awaitable for the sake of performance (like in yield from).

  • Callbacks: Everyone seems to agree that callbacks are ugly; I do as well: they quickly become a hot mess to work with and cause code to become really messy. However, callbacks have the advantage of being approach-agnostic (with a few exceptions). For example, Awaitables could use callbacks to handle the completion of the value (adapting a previous example):

    let mut buff = [0u8; 1024];
    TcpStream::bind_async("example.com:8080").result(|s| {
        s.read_async(&mut buff)
    }).map(|c| {
        String::from_utf8(buff[..c].to_vec()).unwrap()
    })

    Not the cleanest, but not all that different from working with a bunch of Results. Not being able to use pattern matching might be annoying, but we could minimize the callbacks with a signature like fn result(then: FnOnce(Result<T, E>) -> Awaitable<T, E>) -> Awaitable<T, E>; and work with the Result directly.
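
The Awaitable sketched in the second bullet can be made workable by putting the state behind shared synchronization, so a producer can flip it from Pending to Ok/Err while a consumer blocks. A minimal illustration (all names here are assumptions for the sketch, not a proposed API):

```rust
use std::mem;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// The three states from the bullet above, kept behind a Mutex so they
// can be mutated after the Awaitable handle has been handed out.
enum State<T, E> {
    Pending,
    Ok(T),
    Err(E),
}

struct Awaitable<T, E> {
    state: Arc<(Mutex<State<T, E>>, Condvar)>,
}

impl<T, E> Clone for Awaitable<T, E> {
    fn clone(&self) -> Self {
        Awaitable { state: Arc::clone(&self.state) }
    }
}

impl<T, E> Awaitable<T, E> {
    fn new() -> Self {
        Awaitable { state: Arc::new((Mutex::new(State::Pending), Condvar::new())) }
    }

    // Producer side: resolve the awaitable and wake any waiters.
    fn complete(&self, result: Result<T, E>) {
        let (lock, cvar) = &*self.state;
        *lock.lock().unwrap() = match result {
            Ok(v) => State::Ok(v),
            Err(e) => State::Err(e),
        };
        cvar.notify_all();
    }

    // Consumer side: block until resolved, then take the result out.
    fn wait(self) -> Result<T, E> {
        let (lock, cvar) = &*self.state;
        let mut guard = lock.lock().unwrap();
        loop {
            match mem::replace(&mut *guard, State::Pending) {
                State::Pending => guard = cvar.wait(guard).unwrap(),
                State::Ok(v) => return Ok(v),
                State::Err(e) => return Err(e),
            }
        }
    }
}

fn main() {
    let fut: Awaitable<i32, String> = Awaitable::new();
    let producer = fut.clone();
    thread::spawn(move || producer.complete(Ok(42)));
    assert_eq!(fut.wait(), Ok(42));
}
```

This version blocks the waiting thread, which is the third option criticized below; a genuinely asynchronous version would instead register a callback or be driven by an event loop, but the state layout is the same.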

As noted by those knowledgeable in the Python world, async/await was introduced because of the ever so slightly different semantics from generators and to prevent muddying coroutines and generators together. We may want to keep them separate right off the bat to prevent this problem before it starts.

I think await would be the best solution, but I'd rather take advantage of Rust's honest typing precedent and avoid the async keyword. Instead of

async fn do_something<R: Read>(r: R) -> Result<String, ?> {
    (await r.read_async(1024)).map(|buf| String::from(buf))
}

why not this?

fn do_something<R: Read>(r: R) -> Awaitable<String, ?> {
    await r.read_async(1024).map(|buf| String::from(buf))
}

Food for thought.


@icorderi's proposal seems almost useless to me, unless I am missing something?

trait Await<T> {
    fn await(ctx: &mut Context) -> T;
}

For this to be useful at all, there must be some sort of hidden semantics capable of suspending the execution context, in a rather preemptive manner. This signature promises no such suspension and is misleading. The only ways for this to work are to have:

  • coroutines that can pick up on this and suspend the coroutine without it appearing in the type signature and without calling a scheduler-specific function. An impressive ability, unless coroutines are implemented by the compiler.
  • a thread scheduler that synchronously waits for the value to arrive in a new thread. Again, not revealed anywhere in the type signature or invocation. Besides, blocking, just somewhere other than the current thread, isn't really async.
  • blocking the current thread at the await() call until the value arrives. Doesn't this defeat the point of introducing async? Just do a regular blocking call and save the overhead.

@eternaleye's trait definitions for generators seem the most realistic to me, but again, do we want to muddy generators and coroutines together? If so, that's an acceptable stance, just not one I particularly agree with.


Sources: My own experience in implementing async over the years and working hard bringing it to PHP.
Sorry for the wall of text.

@glaebhoerl
Contributor

do we want to muddy generators and coroutines together? If so, that's an acceptable stance, just not one I particularly agree with.

It's all just control flow... what's the rationale for segregating them? (Honest question for my own edification.)

@eddyb
Member
eddyb commented Jul 3, 2016 edited

I much prefer the terminology I've seen used during ES Harmony, before ES6 chose a generator design:

  • generator - resumable function, can be stackful or stackless (stackful being generally wasteful here)
    • expression yield (i.e. passing a value into the generator step function) is an optional feature
  • coroutine - yield-using function in a resumable call stack, which can also contain regular functions

This makes "stackless coroutine" either an oxymoron or an interprocedural function -> generator pass.

What Python did, IIUC, was call generators "coroutines" when they also had expression yield.
However, expression yield has no measurable implementation cost in the stackful case, nor in stackless state-machine transforms performed on CFGs (which is what MIR is); only AST transforms pay a cost for it.

We can waste our time using misguided terminology variants, being very explicit every single time (e.g. "stackless resumable functions with inputs/outputs"), agreeing on less-bad terms, or inventing completely new ones. But, you know, naming things is a hard problem and too much of a bikeshed.

@Ericson2314
Contributor
Ericson2314 commented Jul 3, 2016 edited

I was talking to eddyb about this last night. A few things that occurred to me:

@eddyb
Member
eddyb commented Jul 3, 2016 edited

@Ericson2314 I'm not entirely sure how cycling works, I haven't yet looked into that. However:
I should point out that the complexity would be mostly at the type-level, not the state-machine transform.
We can provide both session types and an iterator interface at the same time, using the same MIR-based transformation, but different type-level handling.

@Ericson2314
Contributor

@eddyb you said you don't like https://gist.github.com/eddyb/822c658190ccf18058db so much anymore? What is your latest MIR-transformation idea?

@Ericson2314
Contributor

We can provide both session types and an iterator interface at the same time, using the same MIR-based transformation, but different type-level handling.

Well, our existing iterator is equivalent to Cycle2<(), Item>, where Cycle2 is trait Cycle2<T, U>: FnOnce(T) -> (U, impl Cycle2<T, U>) {}, i.e. you give it a T, it gives you back a U. More complex sessions do not map onto this. But yes, I do agree that the type-level abstractions are fairly orthogonal to the underlying compilation strategy.

@eddyb
Member
eddyb commented Jul 3, 2016

@Ericson2314 AFAICT you're not suggesting multiple input types for the same generator type, but rather different generator types, which is more desirable type-wise.
However, pass-through to such a generator requires inlining all of the steps. Which is actually a good thing: if possible, you want to flatten out the states to minimize nested matches.

In fact, I believe @alexcrichton is hitting a similar problem with his futures library, which uses combinators that produce nested enums.
OTOH, flattening the state machine (which I believe C# also does?) would additionally let you experiment with a "computed goto" implementation (which we get for indirect tail calls, for example).

But at the MIR level, the transformation is straight-forward: take live locals at each yield point and put them in an enum variant, moving the target of the yield into an arm of the switch over that enum.
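
As an illustration of that transform, here is a hand-desugared sketch of a hypothetical two-yield generator, with one enum variant per suspension point holding exactly the locals live across it (all names invented for the example):

```rust
use std::mem;

// Hypothetical surface syntax being desugared:
//
//     fn gen(x: i32) {
//         let a = x + 1;
//         yield a;          // live locals here: a
//         let b = a * 2;
//         yield b;          // live locals here: none
//     }
//
// Each yield point becomes an enum variant; `resume` switches on the
// current state, runs the code up to the next yield, and saves the
// live locals into the next variant.
enum GenState {
    Start { x: i32 },
    AfterFirstYield { a: i32 },
    AfterSecondYield,
    Done,
}

enum Step {
    Yielded(i32),
    Complete,
}

impl GenState {
    fn resume(&mut self) -> Step {
        match mem::replace(self, GenState::Done) {
            GenState::Start { x } => {
                let a = x + 1;
                *self = GenState::AfterFirstYield { a };
                Step::Yielded(a)
            }
            GenState::AfterFirstYield { a } => {
                let b = a * 2;
                *self = GenState::AfterSecondYield;
                Step::Yielded(b)
            }
            GenState::AfterSecondYield | GenState::Done => Step::Complete,
        }
    }
}

fn main() {
    let mut g = GenState::Start { x: 4 };
    while let Step::Yielded(v) = g.resume() {
        println!("{}", v); // prints 5, then 10
    }
}
```

Because the whole generator is a single flat enum, an array of a million of these is just a contiguous allocation with no per-state indirection, which is the scalability property argued for earlier in the thread.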

@Ericson2314
Contributor

AFAICT you're not suggesting multiple input types for the same generator type, but rather different generator types, which is more desirable type-wise.

exactly.

However, pass-through to such a generator requires inlining all of the steps. Which is actually a good thing, if possible you want to flatten out the states to minimize nested matches.

Yeah as I understand it, combinators and/or my do notation would introduce "administrative lambdas" which we would then try to inline away.

But at the MIR level, the transformation is straight-forward: take live locals at each yield point and put them in an enum variant, moving the target of the yield into an arm of the switch over that enum.

Yeah, this is important to make mutually recursive closures work (and thus loops with my do desugar). Rather than creating two closure values, there are two closure types, and in each closure body, the other closure is constructed just for the tail call.

@eddyb
Member
eddyb commented Jul 3, 2016

@Ericson2314 At the MIR level there would be just one function, which greatly simplifies the process.

@jbmikk
jbmikk commented Jul 7, 2016

This is a very interesting thread, I was wondering how all of these possible solutions could fit into Rust's design. Maybe due to my javascript background I'm biased into thinking promises are a good interface for handling asynchronous code.

As @coderstephen pointed out, callbacks are approach-agnostic, in the sense they could fit well with different approaches. I'm new to Rust and I'm still learning about it, however I'm aware that closures let you take ownership of the bindings in a given context, thus you get to decide what keeps on living in the closure and what is freed with the stack.

I think that for the purpose Rust was designed for, this is a very important issue. Using closures, the context that has to be brought back when a promise is fulfilled can be precisely specified by the programmer, down to the minimum necessary to handle the event. Moreover, with closures you don't have a (possibly) large frozen call stack waiting around to return.

Picture a very deep call stack where you create a closure that takes ownership of a single binding, all the functions eventually return back up to the main function, the call stack is freed and after a while the promise/await is fulfilled, invoking the closure which creates a very little call stack with the binding it previously took ownership of.

Regardless of the approach (await, promises, etc), is there any other way this could be implemented (without closures) while still having this level of control over what bindings keep on living and which ones are freed?

I think this is especially important not only for performance reasons, but also because of Rust's ownership model. Having a large stack with many bindings suspended in order to handle other computations sounds like a lot of potential struggling with the ownership system.

@kevincox
kevincox commented Jul 7, 2016 edited

The problem with callbacks for async IO is that you need a dedicated thread of execution to call those callbacks; then you have to worry about blocking that thread, or about having multiple of them for high-throughput systems. Callbacks work really well in JavaScript because there is only ever one thread and it never blocks. However, Rust has a much more flexible threading model, so it will require a more flexible solution.

Currently I am thinking that futures sounds like a good approach. They are similar to promises in a number of ways:

  • An object to represent a async result.
  • Allows easily running multiple tasks in parallel.

However, rather than attaching a callback, futures generally provide a way to block and wait for a result.

let a = do_async_thing_a();
// Execution continues before a is complete.
let b = do_async_thing_b();
// The next line will block.
return a.value() + b.value()

This allows the two operations to be done in parallel. The major advantage of promises over futures is chaining: futures lack it, meaning that you will have to run a separate thread of some sort if you want to process the result and return another future. However, at least this isn't part of the public API.
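
A minimal sketch of such a blocking future, using a spawned thread and a channel (`BlockingFuture` and the `do_async_thing_*` bodies are illustrative stand-ins, not a real API):

```rust
use std::sync::mpsc;
use std::thread;

struct BlockingFuture<T> {
    rx: mpsc::Receiver<T>,
}

impl<T: Send + 'static> BlockingFuture<T> {
    // Start the work on a background thread immediately;
    // the result is sent back over a channel when ready.
    fn spawn(f: impl FnOnce() -> T + Send + 'static) -> BlockingFuture<T> {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            let _ = tx.send(f());
        });
        BlockingFuture { rx }
    }

    // Block until the result is available.
    fn value(self) -> T {
        self.rx.recv().expect("worker thread panicked")
    }
}

fn do_async_thing_a() -> BlockingFuture<i32> {
    BlockingFuture::spawn(|| 1 + 2)
}

fn do_async_thing_b() -> BlockingFuture<i32> {
    BlockingFuture::spawn(|| 3 + 4)
}

fn main() {
    let a = do_async_thing_a(); // execution continues before `a` completes
    let b = do_async_thing_b(); // both run in parallel
    println!("{}", a.value() + b.value()); // blocks here; prints 10
}
```

A thread per future is of course the heavyweight implementation; the point of the abstraction is that the caller can't tell whether a thread or an evented source is doing the work.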

Another feature that I consider essential for async work is a poll/epoll style option. Poll takes a set of futures and returns when one "resolves" and epoll is an optimization on poll that maintains a set of futures separately so that every call doesn't have to iterate over all of them.

With the user-control Rust provides picking a flexible enough model is very difficult, however I hope that we can figure something out because this is something I wish I could use every day.

@jbmikk
jbmikk commented Jul 7, 2016

I know I skipped the threading part. I think that for a language like Rust this is one of the things the user should be able to decide on.

I think promises/futures are pretty much the same thing; what matters is how you wait for them to complete.

In the case of promises with closures, having a separate thread would not be necessary. Whenever you have promises, you always have an application loop somewhere, usually hidden in the environment, although it could be explicit.

When a thread queues up a promise it doesn't block; it continues to return to its caller, eventually reaching the end of the main function. At this point, which would normally be the end of the program, the thread would enter the "event queue loop", be it explicitly by calling a function or via some other built-in language feature.

Promises could be queued up in a thread-local queue, so at the end of your main function you could call, say, thread::processEvents(). So you would have a single-threaded solution, just like browsers or nodejs.

Multi-threaded solutions could be implemented with each thread handling its own event queue if necessary, so multi-threaded applications would still be possible.

As I see it the awaiting solution implies that you either:

  • Have multiple threads, because every time you reach an await instruction each thread is blocked, which is almost the same as what we have right now with the sync std::io functions.
  • You have a green threading mechanism so that each time you reach an await instruction the context is saved (thread goes back to the pool and waits for something else to do, but the call stack remains).

If you have to keep track of multiple call stacks then you have multi-threading of some sort (green or not). There is no non-blocking, single-threaded, single-stack version of the await form; it's just not possible.

This was precisely my point in my original post. Having multiple call stacks takes a higher toll memory- and ownership-wise. You have multiple call stacks using up memory and owning bindings at the same time (even if a single real thread processes them), and let's not forget the context-switch cost.

The other related question I was trying to address is that closures let you take ownership of only the bindings you need; everything else is freed, and the whole call stack vanishes along with the rest of its bindings. The await version does not offer something like that.

let a = do_async_thing_a();
// Execution continues before a is complete.
let b = do_async_thing_b();
// The next line will block.
// What if I want to do something else meanwhile?
// I need another thread. What happens with all other bindings in the call stack?
return a.value() + b.value()
@kevincox
kevincox commented Jul 7, 2016

Yes, I see what you are saying. But the problem is that a language-wide library can't know what event loop, thread pool or whatever you are using. The library would need to be general purpose so it can work with any threading/callback abstraction the user desires.

What I was trying to get at is I don't think that can be done with callbacks. Because if you say callbacks are always called on the same thread you now need some sort of green threads/scheduler/reactor and it won't be compatible with blocking. Promises (as specified in ECMAScript) require callbacks so I don't see them as flexible enough for all Rust code.

let a = do_async_thing_a();
// Execution continues before a is complete.
let b = do_async_thing_b();
// The next line will block.
/// What if I want to do something else meanwhile?
/// I need another thread. What happens with all other bindings in the call stack?
//// No you don't, you just do it here, then only call `.value()` when you need this result.
return a.value() + b.value()
@kevincox
kevincox commented Jul 7, 2016

However Futures abstract the threading/scheduler so that each future implementer is free to use a thread, or an evented source. Also the caller is blissfully unaware of what is being used under the hood. This opens up options for running computationally expensive operations on a background thread while handling IO events using a single (or small number of) epoll thread(s).

I would look at the C# Task framework. While it requires compiler support, I believe it could be built without it if desired, and it works well for single- or multi-threaded concurrency paradigms.

@NeoLegends

C# is beautifully designed IMO, I'd recommend looking at it too. As a language-design newbie, what would prevent us from copying the C# implementation?

@kevincox
kevincox commented Jul 7, 2016

C# is a futures implementation. I agree that it is very nice and that's why I'm recommending something very similar.

@jbmikk
jbmikk commented Jul 7, 2016

I agree with abstracting away the threading model. I didn't say the closures should always run in the same thread nor that the abstractions should be aware of it, that would just be one option for the user setting up the environment.

The environment should always have a threading model setup and the applications should work with whatever the environment provides. The single threaded version would be just one possible configuration.

I think that both closures and awaits could work with any threading model, the only difference between them is still how the call stack and memory are handled in each case. Closures are cheap in terms of stack/ownership, while awaiting is more expensive yet more straightforward for the user programming it.

Having the user be unaware of what thread model is being used is a good thing, but being unaware of the costs of using await vs closures is not so much. Not in the case of Rust at least, which is a systems language. It's like knowing the difference between static and dynamic dispatch.

I think this should be one of the cases where the user has to decide how he is going to wait for something to happen (regardless of the environment threading model).

Should I keep the whole stack around while waiting for this or should I just create a closure with the bare minimum context?

C# is very powerful, but it was designed AFAIK for a different use case than Rust. I'm sure there is a lot that can be learned from it, but we can't forget that whatever idea is taken from it has to work for Rust. Unless we wanted to change Rust's philosophy and turn it into .net, which is just another possibility.

let a = do_async_thing_a();
// Execution continues before a is complete.
let b = do_async_thing_b();
// The next line will block.
/// What if I want to do something else meanwhile?
/// I need another thread. What happens with all other bindings in the call stack?
//// No you don't, you just do it here, then only call `.value()` when you need this result.
return a.value() + b.value()
///// What happens if I want the thread to do something else besides waiting for a and b.
///// Should I put all of the things I want to wait for in an array and await for them once?
///// If so that sounds like the only place I'm going to use await is in an application loop
///// because I want to wait for multiple unrelated things. Unless I use multiple threads.
@kevincox
kevincox commented Jul 7, 2016

You don't have to keep the whole stack around to use a future. Blocking on a future is just one way to use it. You can also put a bunch of them in a select/poll style function and you are at an event-style model. Or you can put them into a class that will call your callback when it "resolves".

fn select(Iter<Future<T>>) -> Future<T>
fn then_call(Future<T>, Fn(T))

The reason I am recommending futures is they seem much more fundamental and it is easy to compose them into other models. I don't think they are less efficient but if that is the case then it would be a strong argument against them.
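The two signatures above can be fleshed out into a minimal sketch of a bare polling trait with both combinators on top (all names here, `SimpleFuture`, `After`, `select`, `then_call`, are hypothetical, not from any existing crate):

```rust
// Minimal polling-style future: poll() either yields a value or asks to be
// polled again later. This is a sketch, not a real crate API.
enum Poll<T> {
    Ready(T),
    Pending,
}

trait SimpleFuture {
    type Output;
    fn poll(&mut self) -> Poll<Self::Output>;
}

// A toy future that becomes ready after `ticks` polls; it stands in for
// real IO readiness.
struct After {
    ticks: u32,
    value: i32,
}

impl SimpleFuture for After {
    type Output = i32;
    fn poll(&mut self) -> Poll<i32> {
        if self.ticks == 0 {
            Poll::Ready(self.value)
        } else {
            self.ticks -= 1;
            Poll::Pending
        }
    }
}

// select: drive a set of futures, returning the first result to arrive.
fn select<F: SimpleFuture>(futures: &mut [F]) -> F::Output {
    loop {
        for f in futures.iter_mut() {
            if let Poll::Ready(v) = f.poll() {
                return v;
            }
        }
    }
}

// then_call: run a callback once the future resolves.
fn then_call<F: SimpleFuture>(mut fut: F, cb: impl FnOnce(F::Output)) {
    loop {
        if let Poll::Ready(v) = fut.poll() {
            cb(v);
            return;
        }
    }
}
```

Blocking is then just one consumption strategy: a busy `then_call` with a callback that stashes the value, while an event-style model is a `select` loop over many futures. A real reactor would of course sleep on epoll instead of spinning.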

@jbmikk
jbmikk commented Jul 7, 2016 edited

I have been reading a bit more about how .net works. I think I had the wrong model in my head of what await does. At least .net's model is different from what I had in mind.

Most importantly when the await is reached the function is not blocked. The function returns at that point just like a return statement would. Afterwards at some point the thread processes the pending futures and resumes execution at the await point, but that function never returns to the original point where it was called from.

So basically, what happens when you call await is something very similar to creating a closure, but it's implicit.

When you call await, the context is saved into this implicit closure/task and placed in a magic invisible queue, and the function returns as if nothing had happened; in .net the function must be marked async and return a Task.

Later at some point, the thread magically takes the queue and processes the closures/tasks, creates the context and executes them, but it doesn't create the whole stack. Only the context from that function is created. So when the await is resumed it doesn't return to the original point where it was invoked (just like a closure).

So yes, .net's implementation of futures would be computationally just as cheap as using closures, but without having to implicitly create the closures, nor call the select() method. I think it is a good solution.

The important part is that to keep this working on a single thread, it has to eventually return to the main function and finish, so the thread can take care of the queue.

All of this is extremely similar to the promises + thread local queue I proposed earlier, but with a much nicer syntax.

Just for reference here's a good SO question. Anri's answer is particularly good.

@NeoLegends

I've been working with .NET for a couple of years now and only recently "switched" to Rust for my personal projects, so I know quite a bit on how .NET does things.

You're right, when the execution hits an await statement, the thread does not block. Instead, it queues up the rest of the method kinda as a callback, which is executed when the asynchronous operation completes.

The callback is invoked through / given to a proxy class called SynchronizationContext (see this). It is an abstract class of which an instance is assigned to a static variable / global. It acts as a sink to queue up the callbacks on the scheduler which is used / needed for the current project. WPF and WinForms applications use a single threaded model, which is why the continuations need to go onto the event loop thread. Console applications on the other hand have no concept of an event loop (they are like raw Rust binaries) and thus schedule on the (global) thread pool.

The context is captured by the compiler in a desugaring operation. async/await-methods are desugared into state machine classes whose members are the method's local variables (remember, C# has a GC and thus can use class methods as closures). Invoking the method then works like an iterator. The class has a MoveNext method which is invoked after each await in the method. It transitions the state machine into the next state, until everything is done.
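The desugaring described above can be sketched by hand in Rust: a hypothetical async function that awaits two values becomes an enum of states (locals promoted to fields) plus a `resume` method playing the role of C#'s MoveNext. All names here are made up for illustration, not real compiler output:

```rust
// Hand-desugaring of a hypothetical
//   async fn sum() { let a = await op_a(); let b = await op_b(); a + b }
// Each `await` point is a state; locals live in the state, not on a stack.
#[derive(Clone, Copy)]
enum SumState {
    AwaitingA,
    AwaitingB { a: i32 },
    Done,
}

struct SumMachine {
    state: SumState,
}

enum Resumed {
    Pending,       // parked again, waiting on the next operation
    Finished(i32), // the whole "async fn" has returned
}

impl SumMachine {
    // `input` is the result the scheduler passes back in when the awaited
    // operation completes (C#'s continuation invocation).
    fn resume(&mut self, input: i32) -> Resumed {
        match self.state {
            SumState::AwaitingA => {
                // First await completed with `input`; save it and park again.
                self.state = SumState::AwaitingB { a: input };
                Resumed::Pending
            }
            SumState::AwaitingB { a } => {
                self.state = SumState::Done;
                Resumed::Finished(a + input)
            }
            SumState::Done => panic!("resumed after completion"),
        }
    }
}
```

Nothing here blocks: between `resume` calls the machine is just a small struct sitting in whatever queue the scheduler (the SynchronizationContext analogue) uses.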

@eternaleye
eternaleye commented Jul 7, 2016 edited

@jbmikk

There is no non-blocking single-threaded single-stack version of the await form, it's just not possible.

This is false. Transforming the asynchronous code to a state machine means that the asynchronous code has no stack at all - its state is captured into an explicit structure. As a result, there is only a single stack: The code driving it. This is, in fact, exactly what is done with async/await in C#.

@jbmikk
jbmikk commented Jul 7, 2016

@eternaleye Yes, I clarified in a later post that I had misunderstood how await works.

I was confused because I hadn't previously worked with it and because many people talked about blocking execution, which never really happens.

Nothing ever gets blocked with await, and that is a good thing.

@kevincox
kevincox commented Jul 7, 2016

Nothing ever gets blocked with await, and that is a good thing.

Well I agree that having a non-blocking option is essential. I would say
that a blocking option is equally important. I would like to see at least
three paradigms easily supported.

  • Non-blocking
  • Blocking
  • Selectable
@eternaleye

@kevincox, @jbmikk: I feel like you're both conflating several different things here, and it's obscuring the issue a bit.

  1. What's the execution model?
    • State machine transforms? (C#)
    • Cooperatively-switched stacks? (goroutines)
    • Preemptively-switched stacks? (some green thread libraries implement this with SIGALRM)
    • Bring your own? (C++)
  2. What's the programming model?
    • Waitable futures? (C++-like)
    • Chaining promises? (JS-like)
    • Callbacks? (older JS-like)
    • Async/await? (C#-like)
    • Invisible at call time?
  3. What's the system runtime interface model?
    • Readiness polling?
    • Completion notification?
    • All blocking, all the time, bring your own threads?

These things interact (mostly in that some execution models make some programming models inefficient), but they're very distinct.

The state machine transform execution model is unique both in how little infrastructure it requires at runtime (without a moving GC, cooperative stacks are very problematic, and even with one they incur overhead; preemptive stacks involve a scheduler) and in how efficiently it supports any/all of the programming models.

Then there's the question of how to hook these things together - for example, in the waitable-futures programming model, under what circumstances does waiting block for real, and under what circumstances does it cause the blocking routine to itself undergo state machine transformation?

@kevincox: In addition, "Non-blocking" and "blocking" are ambiguous. For non-blocking, do you expect:
- Javascript-style, the nonblocking operation injects more code into an asynchronous context?
- C++-style, the nonblocking operation merely checks for whether the value is available?
- Other?
For blocking, do you expect:
- C++-style, the blocking operation blocks the whole real thread?
- C#/goroutine-style, the operation is logically blocking, but actually just transfers execution elsewhere?

Selectable doesn't even belong in the same list as those two - fundamentally, selection is about handling collections of asynchronous operations, and is independent of the above. One can just as easily have something like Javascript's Promise.race() (which returns a new promise that encapsulates "the first to succeed") as one can hypothetically implement a "first" operation on C++-like futures, which given an array of futures would return the index of whichever one becomes ready first.

In addition, under the state machine transform, blocking/nonblocking itself becomes completely independent of the programming model: Whether it's "nonblocking" (returns "I would block") or "blocking" (does not return control flow until done) is simply a matter of whether the caller is also transformed into a state machine.

At that point blocking vs. nonblocking only exists at the system interface level, and is best handled not by the language, but by whatever binding is handling such things - which is going to be a reactor, likely one akin to mio, which drives state machines and performs I/O on their behalf.

@sagebind
sagebind commented Jul 8, 2016

I'm not quite understanding why a number of you continuously bring up threads. Certainly, threading can be integrated with async, and we should embrace that, in fact, since Rust has such a great threading system. But doing so is probably hard, and if we accomplished that Rust would be one of the rare innovative new languages to integrate the two.

When we're talking about asynchronous I/O, we're looking at some sort of reactor or proactor event loop running somewhere. It can be the same thread as the programmer's code (most common), or in a different thread (how C# implements it... I can't think of any other language off the top of my head). I/O is unique in the fact that it is interaction with entities external from the CPU; they are unpredictable and not in sync with the CPU whatsoever. Why am I bringing this up? Because I/O operations are non-blocking down through the kernel and into the hardware. It would then be a waste to use even more than one or two threads to "wait" on I/O events, because you can wait on thousands of them in one thread. Our async model must allow this by some means.

When I say non-blocking, I mean that "no thread of execution is suspended from CPU scheduling because of an ongoing operation". Not making a kernel thread block for you or a background thread, or even a userland/green thread. It isn't worth suspending anybody's thread of execution to wait for something that is already non-blocking and will happen whenever it feels like it.

mio does a fine job of making I/O events manageable in linear code and could be used as a backend for async I/O handling.

As an aside, I don't understand why blocking on a future/promise would even be a common use case. If you're just shoving everything into threads anyway (to block elsewhere) why not just use threads instead of futures? Perhaps I could be convinced of some practical uses of this.

The norm for true non-blocking, asynchronous operations is a single-threaded system, but I've been a strong supporter of trying to come up with good models of using threads and async in the same system.


@glaebhoerl

do we want to muddy generators and coroutines together? If so, that's an acceptable stance, just not one I particularly agree with.

It's all just control flow... what's the rationale for segregating them? (Honest question for my own edification.)

Because they serve different practical purposes, and writing an asynchronous generator becomes really convoluted if they are the same thing.


@kevincox

The problem with callbacks for async IO is that you need a dedicated thread of execution to call those callbacks, then you have to worry about blocking that thread, or having multiple of them for high throughput systems. Callbacks work really well in javascript because there is only ever one thread and it never blocks. However Rust has a much more flexible threading model so it will require a more flexible solution.

A separate thread is not required, but I see the benefit of allowing the programmer to decide where it is invoked.

@bugaevc
bugaevc commented Jul 8, 2016

Is the C#/.NET solution everyone's discussing any different from Python's one?

@sagebind
sagebind commented Jul 8, 2016

@bugaevc .NET's solution is more or less to throw lots of threads at things in a hidden thread pool so everything seems nice and async. In theory, there are dedicated I/O threads in the .NET pool that handle async I/O for everyone else, but I guess they aren't even always used in practice.

Their overall architecture is to abstract the different types of background threads ("synchronization contexts"), each handling different types of operations. awaiting a Task<T> asks the thread pool to find a context that knows how to handle the task and resume the function when it completes.

@lnicola
lnicola commented Jul 8, 2016 edited

To expand on @coderstephen's comment, .NET SynchronizationContexts are a very elegant solution. They are captured when starting an asynchronous operation and are propagated through the continuation chain.

The thread pool is one kind of synchronization context. Another example would be a GUI thread: controls usually have thread affinity and in .NET an async operation started from the interface will complete on the same thread.

They also abstract GUI toolkit differences. Since e.g. Windows Forms and WPF have different APIs for running callbacks on the interface thread, it's easier to just use the SynchronizationContext without caring about these differences.

EDIT: It seems that @NeoLegends had already written pretty much the same explanation.

@glaebhoerl
Contributor

@coderstephen

Because they serve different practical purposes, and writing an asynchronous generator becomes really convoluted if they are the same thing.

Could you elaborate, perhaps with an example? I've seen numerous claims that generators and async/await are "the same thing, on some level", or are both special cases of a slightly-more-powerful abstraction, which all more or less seem to make sense from my inexpert vantage point, and I would like to understand the counterclaim.

@kevincox

@glaebhoerl Async/await can be trivially implemented using generators.

Essentially generators are functions in appearance (in the code) but the compiler turns them into a function that returns an Iterator-like-class. Then each call of that class can receive an input value and return an output value.

Pseudo example.

// Function return a generator with input () output i64
generator fn count_by(from: i64, to: i64, by: i64) {
    for i in from..to {
        yield i * by
    }
}

// Example equivalent code.
struct CountByState {
    from: i64, // Dead variable.
    to: i64, // Dead variable.
    by: i64,
    iter: std::ops::Range<i64>,
}

impl CountByState {
    fn next(&mut self, _input: ()) -> Option<i64> {
        match self.iter.next() {
            Some(i) => Some(i * self.by),
            None => None,
        }
    }
}

fn count_by(from: i64, to: i64, by: i64) -> CountByState {
    CountByState {
        from: from,
        to: to,
        by: by,
        iter: from..to,
    }
}

This transformation isn't trivial, but it can be done by the compiler. In this example it had to break up the loop, and compiler-generated code would probably be more of a state machine, but the next() method would have the same (or a similar) interface. Notably, this transformation is exactly what the C# compiler does for async/await (although its version is less general, that is not really relevant in this context).

To implement async/await using generators you just need to return a promise/future from the generator next() function and the next invocation of the generator passes the result. You can see many examples of this implemented using JavaScript generators.

The one problem I can see with this in rust is because of the static typing. Because if you have two yield points in the function that want to receive different values that would be hard to express (you can no longer have a single next() function).
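The JS-generator pattern described above can be sketched as a hand-written trampoline: the "generator" yields stand-in futures, and a driver resolves each one and feeds the result back in on the next resumption. Everything here (`Step`, `ReadyValue`, `Total`, `drive`) is hypothetical scaffolding for illustration:

```rust
// What the generator hands to the driver at each suspension point.
enum Step {
    Yielded(ReadyValue), // "await this, then resume me with its result"
    Complete(i32),       // final return value
}

// A stand-in for an already-resolved future/promise.
struct ReadyValue(i32);

// A hand-written generator equivalent to the hypothetical
//   async fn total() { let a = await ReadyValue(2); let b = await ReadyValue(40); a + b }
struct Total {
    state: u8,
    a: i32,
}

impl Total {
    fn resume(&mut self, input: i32) -> Step {
        match self.state {
            0 => {
                self.state = 1;
                Step::Yielded(ReadyValue(2))
            }
            1 => {
                self.a = input; // result of the first await
                self.state = 2;
                Step::Yielded(ReadyValue(40))
            }
            _ => Step::Complete(self.a + input),
        }
    }
}

// The driver: resolve each yielded future and pass the result back in.
// A real event loop would suspend here instead of resolving immediately.
fn drive(g: &mut Total) -> i32 {
    let mut input = 0;
    loop {
        match g.resume(input) {
            Step::Yielded(ReadyValue(v)) => input = v,
            Step::Complete(v) => return v,
        }
    }
}
```

The static-typing problem kevincox raises shows up right here: `resume` can only accept one `input` type, so two yield points wanting different resume types don't fit this single-method shape.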

@CaseyLeask

A new example of a community async library is http://aturon.github.io/blog/2016/08/11/futures/ by @aturon and @alexcrichton

@amluto
amluto commented Sep 8, 2016

@icorderi, @vinipsmaker, etc:

I'm a bit late to this discussion, and I'm not (yet?) terribly familiar with Rust's internals or exactly how lifetimes work, but I think there is a difference between async/await and coroutines in terms of safety and lifetimes, and maybe there's an issue with both of them that I haven't seem brought up here. Consider a different example usage:

async fn foo(a: &Thing) -> [whatever] {
    let ret = await ...;
    a.do_something(|b: &OtherThing| { f1(b); f2(b); } );
    ret
}

In a pure async/await model, the closure passed to do_something cannot block, switch threads, or engage in any other shenanigans. This means that do_something can pass any reference to the closure without regard to lifetime constraints -- from do_something's point of view, the closure accepts any lifetime for its reference argument.

A different variant:

async fn foo(a: &Thing) -> [whatever] {
    let ret = await ...;
    a.do_something(|b: &OtherThing| { f1(b); await whatever(); f2(b); } );
    ret
}

doesn't directly cause problems, as the closure would either be ill-formed or would be a different type of object entirely and would constrain the lifetime of its parameter. In contrast, with the type of coroutine that I think is being discussed here, the async and await annotations aren't explicit and, with code like:

fn foo(a: &Thing) -> [whatever] {
    let ret = await ...;
    a.do_something(|b: &OtherThing| { f1(b); f2(b); } );
    ret
}

unless possible blocking / coroutine switching points are part of the function types, there's no way for do_something to know whether the closure will block (because even this code can't tell -- does f1 block?), so the safety considerations are more complicated.

I'm not sure that serious coroutine proposals / libraries are subject to this problem, but it seems to me that avoiding it would, at the very least, require considerable care in exactly how the interface for switching coroutines works.

It also occurs to me that, with a real coroutine library, there might be a need to distinguish between Send/Sync and the related concept of safeness to send between coroutines on the same thread or safeness to share between coroutines on the same thread. For example, Rc should be safe to share between coroutines on the same thread. I think that async/await gets this right for free.

@camlorn
camlorn commented Sep 12, 2016

@amluto
I think at least part of the lifetime problem with both generators and async/await is that they are both effectively streaming iterators.

I don't quite follow how higher-kinded types solve it personally, but there is an RFC floating around for it.

I do agree that it needs to be explicit. Things shouldn't be able to await unless their caller also awaits. This is effectively the problem with Python's Gevent, though there the yield points are "did you call something in any of these 5 magic modules?" which is also part of the problem.

The real issue with switching that doesn't bubble up is that you can't pretend coroutines are actually a single-threaded program. Since any API you might call might add an await in any function you call, you have to deal with locks. This is both less performant and harder to do right conceptually.

@amluto
amluto commented Sep 12, 2016 edited

I'm not sure why higher-kinded types are needed. I can imagine async/await being mostly (or entirely) sugar for futures, kind of like this:

fn do_a_thing<'a, 'b>(short_lived_arg: &'a i32, long_lived_arg: &'b i32)
                      -> Box<Future<...> + 'b> {
    let thing = get_a_future(short_lived_arg);
    let other_thing = thing.and_then(|x| x+long_lived_arg);

    // We cannot do this:
    // other_thing.and_then(|x| x+short_lived_arg)

    some_other_async_function(other_thing)
}

If this got nicely sugared up, then perhaps the lifetime distinction could be preserved.

On an only vaguely related note, IMO it would be quite neat if we had nice syntax for futures (or async functions) that produce streams. This could serve the same purpose as generators.

[async? generator?] fn natural_numbers() -> [what goes here] {
    for (i in 1..) { yield i; }
}

This could almost work in the syntax of the futures crate:

fn natural_numbers() -> Box<Future<Item=(i32, [the same Future]), Err>> {
    [too lazy to actually implement this]
}

But I don't think that the self-referential associated type can actually be expressed.

EDIT: This can be expressed using the future crate's Stream trait.

@eddyb
Member
eddyb commented Sep 12, 2016

@amluto Future can be considered a special case of Iterator and as such, emulating Iterator with Future is going in the wrong direction.

@amluto
amluto commented Sep 12, 2016

@eddyb: What do you mean? Iterator returns values now. Futures return a single value later? Am I missing something here?

@eddyb
Member
eddyb commented Sep 12, 2016 edited

It's possible to do this (but the futures crate doesn't do it):

impl<T, E, I: Iterator<Item=Result<T, E>>> Future for I {
    type Item = T;
    type Error = E;
    fn poll(&mut self) -> Poll<T, E> {
        match self.next() {
            None => Ok(Async::NotReady),
            Some(Ok(x)) => Ok(Async::Ready(x)),
            Some(Err(e)) => Err(e)
        }
    }
}

IOW, the required API surface of Future is equivalent to that of an Iterator over Results.

EDIT: However, this is not a fused iterator and a tad weird.
It'd be cleaner (and enforce "one final value") with Generator<Value=NotReady, Return=Result<T, E>>.

@camlorn
camlorn commented Sep 13, 2016

The equivalence of generators to the streaming iterator problem is demonstrated with something like this:

gen broken() -> &i32 {
    for i in 1..10 {
        yield &i;
    }
}

This is a streaming iterator. This should absolutely work. But it can't. Either the compiler has to emit an infinite number of temporaries or the references that are yielded get modified when the iterator advances. In the latter case, this breaks the assumption that &T means unchanging, and then there goes safety. It is possible that the futures crate has a solution to this problem, but I don't think it does.

The equivalence of a generator to an async/await is mostly straightforward: an async/await is an iterator over awaitable things.

The solution to the streaming iterator problem being higher-kinded types is something I've read in multiple places, and there is an RFC for a limited form of HKT somewhere citing that as a reason for doing what it proposes. Perhaps there is another way; I have not yet taken the time to understand how higher-kinded lifetimes as proposed in that RFC solve this problem. But unless coroutines are streaming iterators, some of the things that should just work don't.

@eddyb
Member
eddyb commented Sep 13, 2016

The "streaming" definition of Iterator would look something like this:

trait Iterator {
    type Item<'a>;
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

It's possible to define this right now by moving <'a> to the trait itself but that can't work for Iterator in a backwards-compatible manner (whereas this definition might) and HRTB has some implementation limitations which restrict its usefulness in describing such iterators, for now.
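A toy implementation of this `type Item<'a>` trait shape can be written out, assuming generic associated types are available; the `LendingIterator`/`CountTo` names are illustrative only (the trait is renamed to avoid clashing with std's `Iterator`):

```rust
// A lending ("streaming") iterator: each item borrows from the iterator
// itself, so the caller must drop one item before asking for the next.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Counts 1..=limit, handing out a reference into its own storage each time;
// this mirrors camlorn's broken() example, made legal by tying the borrow
// to `&mut self`.
struct CountTo {
    i: i32,
    limit: i32,
    slot: i32,
}

impl LendingIterator for CountTo {
    type Item<'a>
        = &'a i32
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        if self.i >= self.limit {
            return None;
        }
        self.i += 1;
        self.slot = self.i;
        // The returned borrow keeps `self` mutably borrowed, so `next`
        // cannot be called again until the caller releases it.
        Some(&self.slot)
    }
}
```

This is exactly the restriction eddyb describes: you get one live item at a time, which is enough for generators but rules out, e.g., `collect`ing the references.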

@amluto
amluto commented Sep 13, 2016

@camlorn: Hmm. I wonder if the HKT trick is that broken() yields a reference like "there exists 'a for which the reference is &'a i32".

But treating an async/await as an iterator over awaitable things seems a bit wrong to me. If I have a generator, I don't think I should be able to call next() twice in a row, thus obtaining two awaitable things, of which the first gives the first element and the second gives the second. Sure, this can be implemented, but it seems overly flexible.

@eddyb
Member
eddyb commented Sep 13, 2016

@amluto Look at Future's API: you can call poll as many times as you want (even if it might, e.g., panic).

@BurntSushi
Member

@eddyb That trait is kind of misleading. I'm pretty sure you can't define any useful adapter methods without some kind of HKT support. (Maybe all you need is abstraction over lifetime parameters.)

@eddyb
Member
eddyb commented Sep 13, 2016

@BurntSushi We already have HRTB, i.e. things like for<'a> T: Trait<'a>.

@BurntSushi
Member

I'm not talking about HRTB. Try to define some adapter methods.

@eddyb
Member
eddyb commented Sep 13, 2016

@BurntSushi Are you referring to how you need a type parameter R to use for Fn{,Mut,Once}() -> R?
Because type Item<'a> = <F as FnMut<(Self::Item<'a>,)>>::Output; would just work.
The -> R sugar also causes other problems, but I agree higher-kinded type params are one solution here.

@BurntSushi
Member

@eddyb Here's one example of me flailing with it: https://is.gd/HED3Eo --- This is just me trying to get something to work today.

As far as Item<'a> goes, consider something that can be done with Iterator today:

fn foo<S: AsRef<[u8]>, I: Iterator<Item=S>>(it: I) -> Foo { ... }

How can I write this when the associated type (S in this case) has kind * -> * (in your definition) instead of * (in the current definition of Iterator) without HKT? As I understand it, the very notion of writing S<T> (or S<'a>) where S is a type variable is precisely the definition of HKT.

@eddyb
Member
eddyb commented Sep 13, 2016 edited

The quick answer for the second question is almost always "don't use a type parameter":

fn foo<I: Iterator>(it: I) -> Foo
where for<'a> I::Item<'a>: AsRef<[u8]>
{ ... }

One could imagine that both your version and mine would compile if we changed the definition of Iterator in a backwards-compatible manner - yours would require that I::Item<'a> does not refer to 'a while mine would be more flexible for the caller (but you couldn't keep elements of it).

Now your other example with count in it is indeed a limitation of moving the lifetime parameter to the trait: the problem is that theoretically Self could contain 'a (unlike the type Item<'a> form).
You can work around that by creating another trait, for example: https://is.gd/rS4yzU.

@camlorn
camlorn commented Sep 13, 2016

@eddyb
I was going to say how I don't understand that trait, but then I got it: it works if we infer the lifetime based on the return value.

@amluto
The implementations of async/await I've seen in, e.g., Python turn stuff like this:

async fn foo() {
    await thing1();
    await thing2();
    await thing3();
}

Into an iterator of things to be executed, which then percolate up to an event loop that executes them and then pumps the iterator again. There may be another way of doing it.

The interesting thing with such an implementation is that you can do a lot if you can yield trait objects, because then you can use &Future or whatever instead of one concrete value per async function. But, as it stands, I don't think we can express the lifetimes. Python deals with this because it doesn't have a static type system, and C# deals with this because it's got a GC and can stick stuff on the heap. The latter could be done in Rust, but it would force usage of Box, which is against a lot of what Rust stands for.

You could avoid this by using green threads, but imho green threads are mostly a bad idea. If you already have a multithreaded app, conversion to green threads is great. Otherwise, you're just keeping all the downsides of threads without the benefits of a possibly better design. But I'm not sure how you design this and avoid the lifetime concern.

One other interesting thing here is that recursion won't compile if you use a stack-frame-to-struct transformation, which may or may not be a benefit of the design. You can get it with Box, but you have to be explicit.

@eddyb
Member
eddyb commented Sep 13, 2016 edited

The interesting thing with such an implementation is that you can do a lot if you can yield trait objects, because then you can use &Future or whatever instead of one concrete value per async function.

But an async function is a future, so I don't see why an async function over futures even makes sense.

@camlorn
camlorn commented Sep 13, 2016

@eddyb
My understanding is that you probably have a trait that encapsulates the idea of being awaitable, plus a bunch of concrete objects that represent things you might want to await on. Making the awaitable trait Future is one possible choice.

If you can't pass out references and/or trait objects, then an async function is limited to awaiting only one type of thing.

Whenever the action completes, the thing doing the awaiting is going to advance, possibly killing the reference it passed out, and as far as I can tell there's no way to make the compiler understand that this is okay without unsafe. I imagine you can do something unsafe behind a safe API, though, so maybe it's not as big a problem as it looked to me when I started this subthread.

Also, I might be way off in my understanding of the issue. I haven't worked extensively with frameworks in other languages. Just some dabbling here and there. If you're not understanding what I'm getting at, maybe my understanding is at fault.

@amluto
amluto commented Sep 13, 2016

@camlorn: What I'm imagining for a very high-quality async/await implementation in Rust is mostly just nice syntactic sugar for a whole bunch of and_then combinators and the like, built on an appropriate Future trait. The lifetimes should just work, with the caveat that the scopes aren't what they look like. For example, if I type:

async fn download(url: &str) -> ... {
  let http_session = await connect(url);
  // <-- here, see below
}

Then the desugared form should be a function that returns a future/promise/whatever and takes url as a parameter. If url is not referenced at here or farther down in download, then this function could have a signature such that url can have a shorter lifetime than the returned future.

I'd hope that a good async/await implementation can also integrate with explicit combinators (like select in the futures crate) without falling apart. I imagine this is straightforward:

let a = [a future];
let b = [a future];
let c = await select([&a, &b]);

I would also hope that the implementation could handle streams nicely too, which would presumably involve adding a yield keyword and a different type of async function that returns a stream. (And maybe this could come in a non-async form to generate a function that returns an Iterator.)

@eddyb
Member
eddyb commented Sep 13, 2016

@camlorn You might be thinking of a completion-based model, where a completely statically dispatched async fn would contain all the types it (or any other async fn it uses) can wait on, because those results would have to be passed into the state machine. I've prototyped this and it's a nightmare.

OTOH, futures-rs is readiness-based, which means that the actual values awaited on are produced from inside the state machine, instead of being passed into it, and if the values are not ready yet, you get told to wait some more. The external interface of such an async fn only needs to mention the final result, but not any of the other values it awaits on, because those values never need to exist outside of the async fn.

Any function that returns impl Future<Item=T, Error=E> is effectively an async fn (with futures-rs).
That is, it returns a value that you can poll to obtain a T value, an E error, or be told to wait some more.
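That readiness-based interface can be sketched with stand-in types. The shapes below (Async, Poll as a Result alias) mirror the futures-rs design of the time, but the names ParseLater and parse_async are illustrative, not a real API:

```rust
// Illustrative stand-ins: polling yields the item, an error, or NotReady.
enum Async<T> {
    Ready(T),
    NotReady,
}

type Poll<T, E> = Result<Async<T>, E>;

trait Future {
    type Item;
    type Error;
    fn poll(&mut self) -> Poll<Self::Item, Self::Error>;
}

// A future that parses on its first poll. Only the final Item/Error
// types appear in the signature; intermediate awaited values stay
// inside the state machine.
struct ParseLater(&'static str);

impl Future for ParseLater {
    type Item = i32;
    type Error = std::num::ParseIntError;
    fn poll(&mut self) -> Poll<i32, Self::Error> {
        self.0.parse().map(Async::Ready)
    }
}

// Effectively an "async fn" in this model: a plain fn returning a future.
fn parse_async(s: &'static str) -> impl Future<Item = i32, Error = std::num::ParseIntError> {
    ParseLater(s)
}

fn main() {
    let mut f = parse_async("42");
    match f.poll() {
        Ok(Async::Ready(n)) => println!("ready: {}", n),
        Ok(Async::NotReady) => println!("not ready"),
        Err(e) => println!("error: {}", e),
    }
}
```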

@camlorn
camlorn commented Sep 13, 2016

@eddyb
Maybe. I'm thinking primarily of a Python model, where we abuse the lack of a static type system to hell and back, so I'm not quite sure what you're getting at. It seems to me that if you can convert it to a struct like we do with closures and make it into an iterator that yields &Awaitable, then you don't have a type-system problem beyond the aforementioned inability to easily express streaming iterators.

Then you adapt things like futures-rs into Awaitable with blanket impls. I imagine that Awaitable would look a lot like Future. In addition, this kind of model is something we'll probably end up with anyway. One very nice thing to have is something like Twisted's inlineCallbacks decorator, and we can do that as soon as we have generators. It seems to me that this will probably fall out as HKT for lifetimes/streaming-iterator trait, generators, and async fn, in that order. But as soon as you have the second, you can already implement this kind of thing in a limited form by hooking the iterator's next method off the then of a future and hacking something potentially unsafe together to send the result in. And once that's happened, it's harder to go back and redo it under a different model, because people are already used to that way.

@eddyb
Member
eddyb commented Sep 13, 2016

@camlorn All of that is much more complicated whereas Future is a very simple polling mechanism.
async fn is not semantically an iterator of awaitables, it's one single awaitable, hence -> impl Future.
No references are needed in the common case, only if you wanted to give a borrow of some internal state.

@lnicola
lnicola commented Sep 13, 2016

@camlorn Take a look at http://aturon.github.io/blog/2016/09/07/futures-design/ . Future doesn't use a completion/callback model where completion triggers off a computation. Instead, it's based on a polling model. Because of this, the idea of advancing an iterator in a completion callback is kind of awkward, as far as I can see.

@camlorn
camlorn commented Sep 13, 2016

@eddyb
Supposing you build this up from futures-rs combinators, how do you translate an infinite loop? The only way I see to do it is to do what I'm suggesting, but hide the fact that you're doing what I'm suggesting inside the async fn.

I suppose maybe you can do something with streams, but streams don't seem like they can represent any loop, just some predefined ones.

@lnicola
I know how it works. But polling can be converted into a callback model just by polling and then calling the callback. What I'm saying is that you hook the iterator of futures onto the first future with then, then poll the entire mega-future, essentially. As long as someone polls it, things advance.

More broadly, lots of these models can be converted into each other if you have enough language features, and Rust almost does.
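The poll-to-callback conversion mentioned above can be sketched directly. The toy Poll/Future pair, the After future, and the drive helper below are all hypothetical, and a real event loop would park instead of spin-polling, but the shape of the conversion is the same:

```rust
// Toy poll-based types, not the real futures-rs API.
enum Poll<T> {
    Ready(T),
    NotReady,
}

trait Future {
    type Item;
    fn poll(&mut self) -> Poll<Self::Item>;
}

// Toy future that becomes ready after a fixed number of polls.
struct After {
    polls: u32,
    value: Option<i32>,
}

impl Future for After {
    type Item = i32;
    fn poll(&mut self) -> Poll<i32> {
        if self.polls == 0 {
            Poll::Ready(self.value.take().expect("polled after completion"))
        } else {
            self.polls -= 1;
            Poll::NotReady
        }
    }
}

// The conversion: keep polling, then invoke the callback on completion.
fn drive<Fut: Future, F: FnOnce(Fut::Item)>(mut fut: Fut, callback: F) {
    loop {
        if let Poll::Ready(v) = fut.poll() {
            callback(v);
            return;
        }
    }
}

fn main() {
    drive(After { polls: 3, value: Some(7) }, |v| println!("got {}", v));
}
```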

@eddyb
Member
eddyb commented Sep 13, 2016 edited

@camlorn You don't want to get anywhere near a callback model because it just doesn't scale.
(You've noticed already how things end up leaking into the signature, unlike with the readiness model).

You also wouldn't build async fn from combinators, this is where state machine transforms come in.
But if you wanted to, you can always write your own infinite loop Future or whatever else you need.

@Ericson2314
Contributor

It seems the completion model isn't so bad if you accept you can't do join but can do fork. I might make a pasts-rs crate for this :D.

@camlorn
camlorn commented Sep 13, 2016

If the goal is "looks like imperative code", then futures aren't a good model. In my ideal world, it would be like threads, but with very explicit points at which the context can switch. I don't see this as too big of a deal to implement. I'm going to lay out all the moving bits I see, and then if people still see major problems I'll go away. But I think some of this is that I'm not being clear. Anyhow:

  • An async fn is a function written async fn foo(params) -> Res, converted roughly into fn foo(params) -> impl Async<Res>.
  • An Async<T> implements a hypothetical trait StreamingIterator. There is an event loop that knows how to deal with the things Async<T> can generate, probably references to something implementing Future or whatever your preferred term is, and probably using a poll-based model. To make this work, Async<T> probably provides an extra method to re-get the most recent reference; this deals with the inability to have a vector of references all with different lifetimes. I anticipate that the event loop will put these in boxes.
  • Every time through the loop, we go through the most recent references from each Async<T> in turn and see who is ready. If one of them is, then we tell that Async<T> to advance and give us the next one.
  • The await keyword works on any Async<T>. await a is roughly for i in a { yield i }. Put another way, await flattens another Async into this one. The result of await is the final result of the Async we awaited. Since we're using trait objects and it's a streaming iterator, this can work out even if they're providing results of different types. Once the Async we awaited is exhausted, we extract the result, probably stored internally in the Async.
  • In order to deal with incoming connections and similar, the event loop provides a method whereby a new top-level Async<T> can be registered at any time. In order to support joining, this function provides something implementing Async<T> in turn that will simply block the current task until the newly-created one is finished. I leave making the event loop available inside tasks to the reader, as this isn't very hard.
  • Finally, to actually pass out the result, a task uses the return keyword. When a task that is being handled directly by the event loop gives a result, the event loop forgets the Async<T>. This lets one continue spawning top-level tasks forever without running out of RAM.

If you have all of these pieces, something like the following pseudocode can work:

async fn server() {
    let listener = await tcp_listener("localhost:10000");
    loop {
        let sock = await listener.incoming_connection();
        spawn do_something_with_socket(sock);
    }
}

Nothing leaks into the signature of an async fn before transformation. This happens after transformation and is done by the compiler. My model should be nearly as fast as a hand-written FSM, supports joining, and allows one to write infinite loops directly. It probably also has a horrible downside; feel free to point that out now. The most immediate one I see is that T in an Async<T> possibly needs to be Copy.

Also, if I'm rehashing earlier discussion, tell me. Perhaps this has all been considered and laid out and dismissed as a bad idea already.

@eddyb
Member
eddyb commented Sep 13, 2016

@camlorn You describe all of that complexity, but a lot of it just goes away with the readiness model.
Each async fn results in a future, which can be seen as an iterator of Wait | Return(x) | Error(e).

That's already in the realm of existing generator implementations (i.e. ES6), which could be implemented in Rust (without needing streaming iterators) and then used as the basis for async fn.

@amluto
amluto commented Sep 13, 2016

@eddyb: I'm still a bit mystified by the idea of using an iterator for this. I agree that, in the readiness model, having a function that returns Wait | Return(x) | Error(e) works, but I think that calling that function next() and letting you iterate over it with for is confusing. Also, Iterator::next() returns Option, so you'd have to change your signature, and you may end up in an awkward situation where:

for &i in some_async { /* do nothing */ }

will panic.

If it could be made to work cleanly and efficiently, I think that:

enum PollResult<A: Async> {
    NotReady(A),
    Ready(A::Item),
    Err(A::Error),
}

trait Async: Sized {
    type Item;
    type Error;
    fn poll(self) -> PollResult<Self>;
}

would be nice because you get rid of the panic case entirely: you can't possibly call poll() too many times.
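A runnable sketch of this by-value poll idea. The Error variant is dropped for brevity, and CountDown and run are illustrative names, not a real API; the point is that poll consumes the future and NotReady hands it back, so a completed future no longer exists to be polled again:

```rust
// By-value polling: NotReady returns ownership of the future.
enum PollResult<A: Async> {
    NotReady(A),
    Ready(A::Item),
}

trait Async: Sized {
    type Item;
    // Consumes self, so "poll after completion" is unrepresentable.
    fn poll(self) -> PollResult<Self>;
}

// Toy future: ready after n polls.
struct CountDown {
    n: u32,
}

impl Async for CountDown {
    type Item = &'static str;
    fn poll(self) -> PollResult<Self> {
        if self.n == 0 {
            PollResult::Ready("done")
        } else {
            PollResult::NotReady(CountDown { n: self.n - 1 })
        }
    }
}

// Driving such a future threads ownership through each poll.
fn run<A: Async>(mut a: A) -> A::Item {
    loop {
        match a.poll() {
            PollResult::Ready(x) => return x,
            PollResult::NotReady(next) => a = next,
        }
    }
}

fn main() {
    println!("{}", run(CountDown { n: 3 }));
}
```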

@eddyb
Member
eddyb commented Sep 14, 2016

@amluto I'm not saying it should be an iterator, just that it has a compatible API.
A generalized Generator API would fit even better:

enum Yield<T, R> {
    Value(T),
    Return(R)
}

trait Generator {
    type Value;
    type Return;
    fn advance(&mut self) -> Yield<Self::Value, Self::Return>;
}

impl<T, G: Generator<Value=T, Return=()>> Iterator for G {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        match self.advance() {
            Yield::Value(x) => Some(x),
            Yield::Return(()) => None
        }
    }
}

impl<T, E, G: Generator<Value=NotReady, Return=Result<T, E>>> Future for G {
    type Item = T;
    type Error = E;
    fn poll(&mut self) -> Poll<T, E> {
        match self.advance() {
            Yield::Value(NotReady) => Ok(Async::NotReady),
            Yield::Return(Ok(x)) => Ok(Async::Ready(x)),
            Yield::Return(Err(e)) => Err(e)
        }
    }
}
@camlorn
camlorn commented Sep 14, 2016

So everyone knows, I now side with @eddyb. This was half a miscommunication and half futures-rs having changed its design near the beginning of this month, after I stopped looking closely at it.

I want generators too. Has anyone given thought to focusing effort there instead? It seems to me that it's much smaller in scope, but has the potential to provide the tools for people to easily prototype these ideas. Also, it would make my life so much easier next time I have to write an iterator, though that's not really the issue at hand.

I've considered doing a generator RFC after the limited HKT one is finished, but I doubt I can. I mostly see what the code transformation is, but I think that sometimes you need it to automatically become a streaming iterator, and I'm not sure how that should work.

@brettcannon

In case this issue starts up again or another one is created somewhere else on the topic of async I/O, as a Python core developer I wanted to offer to answer any Python questions people may have. You can also always email the async-sig mailing list with questions relating to asynchronous programming in Python. And if you want to know how async/await works in Python I wrote http://www.snarky.ca/how-the-heck-does-async-await-work-in-python-3-5 (summary: it's just an API, unlike most other languages, which implicitly give you an event loop).

@steveklabnik
Contributor

as a Python core developer I wanted to offer to answer any Python questions people may have.

Thank you so much @brettcannon !

@njsmith
njsmith commented Nov 10, 2016

One thing that might be worth highlighting about Python's experience so far: we actually started out using async/await as a convenience layer on top of a traditional promise/futures-oriented API, like how async/await look in C# and JS. But in the mean time, people have started exploring other approaches and it's possible we'll actually move away from that approach. I wrote a long blog post making the case for this. I'm sure not all of it carries over to the Rust context, but it might be of interest in any case.

@brettcannon

I just want to second reading the blog post by @njsmith, as it explains why Python might be shifting how we implement event loops without having to change async/await, thanks to async/await being just an API in Python.

@eddyb
Member
eddyb commented Nov 10, 2016

@njsmith The futures-rs effort (with tokio on top) is readiness-based (i.e. poll), not using callbacks.
It seems to me that Python moved from callbacks to something more akin to a state machine?
In which case, I don't believe Rust ever had anything serious built on top of callbacks.

@njsmith
njsmith commented Nov 11, 2016

@eddyb: futures-rs certainly seems to use callbacks, in the sense that I see lots of f: F arguments in the Future trait? But I'm definitely not enough of an expert on rust or futures-rs to say anything definitive about how the Python experience does or doesn't carry over.

@eddyb
Member
eddyb commented Nov 11, 2016 edited

@njsmith Those are adapters, e.g. imagine map(self, f) returning a f(await self) async fn.
The fundamental API is the poll method, everything else deals with building state machines for the readiness model without language-level async fn sugar.

This is akin to how Iterator is "internal" (i.e. you call .next() to get a value instead of being called back) and we have Iterator adapters like map and filter, but we still haven't added generators.
In fact, I expect async fn to be built on top of some more general formulation of generators.
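The adapter shape described here can be sketched over a toy poll-based trait, in the style of Iterator::map. All names below are hypothetical; the real futures-rs map differs in detail:

```rust
// Toy poll-based types, not the real futures-rs API.
enum Poll<T> {
    Ready(T),
    NotReady,
}

trait Future: Sized {
    type Item;
    fn poll(&mut self) -> Poll<Self::Item>;

    // Adapter: stores the closure and applies it only when the inner
    // future completes during a poll; no callback fires before then.
    fn map<U, F: FnOnce(Self::Item) -> U>(self, f: F) -> Map<Self, F> {
        Map { inner: self, f: Some(f) }
    }
}

struct Map<A, F> {
    inner: A,
    f: Option<F>,
}

impl<A: Future, U, F: FnOnce(A::Item) -> U> Future for Map<A, F> {
    type Item = U;
    fn poll(&mut self) -> Poll<U> {
        match self.inner.poll() {
            Poll::Ready(x) => {
                let f = self.f.take().expect("polled after completion");
                Poll::Ready(f(x))
            }
            Poll::NotReady => Poll::NotReady,
        }
    }
}

// Toy leaf future that is immediately ready.
struct Ready(Option<u32>);

impl Future for Ready {
    type Item = u32;
    fn poll(&mut self) -> Poll<u32> {
        Poll::Ready(self.0.take().expect("polled after completion"))
    }
}

fn main() {
    let mut fut = Ready(Some(20)).map(|x| x * 2 + 2);
    if let Poll::Ready(v) = fut.poll() {
        println!("{}", v);
    }
}
```

As with Iterator adapters, the closure passed to map is stored in a struct and driven by poll, which is why these arguments are not callbacks in the completion-model sense.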

@gotgenes

@njsmith I actually came here to provide a link to your async/await blog post to augment discussion, as it was incredibly thoughtful, but you beat me to it!

I'll add that @mitsuhiko also recently wrote a blog post on async in Python. Maybe particularly pertinent to this thread are his thoughts on the overloading of iterators.

@brettcannon

The post by @mitsuhiko is specifically about asyncio and targeted at library authors (to put it in proper context).

@eddyb
Member
eddyb commented Jan 7, 2017 edited

You forgot loop { yield None; } at the end (or yield None; panic!();). Too much boilerplate IMO.
