Stack overflow when using streams #62

Closed
carllerche opened this issue Aug 15, 2016 · 10 comments
@carllerche (Member) commented Aug 15, 2016

The issue is that streams require recursion in order to loop. The current assumption in the futures library is that, at some point, a future / stream will not be ready, forcing a callback to be registered (essentially a defer). However, this assumption is not always true. The following example demonstrates the problem with immediately-ready futures, but the issue exists in any situation where futures become ready faster than they are consumed.

Example:

```rust
extern crate futures;

use futures::Future;
use futures::stream::{channel, Stream, Sender};

fn count_down(i: u32, f: Sender<u32, ()>)
    -> Box<Future<Item = (), Error = ()>>
{
    let busy = f.send(Ok(i));
    if i > 0 {
        busy
            .map_err(|_| ())
            .and_then(move |sender| {
                count_down(i - 1, sender)
            })
            .boxed()
    } else {
        Box::new(futures::finished::<(), ()>(()))
    }
}

fn main() {
    let (tx, rx) = channel::<u32, ()>();

    rx.for_each(move |v| {
        Ok(())
    }).forget();

    count_down(100_000, tx).forget();
}
```
@alexcrichton (Member)

Some recent discussion has led us to the conclusion that we're likely to remove `tailcall` and its associated optimization, since (as this clearly shows) it doesn't always work. That said, the stack overflow here can be avoided by using a "loop" instead of tail recursion (shown below). @carllerche, how do you feel about that? Do you think we should implement a general "yielding" mechanism to solve this as well?

```rust
extern crate futures;

use futures::{Future, BoxFuture};
use futures::stream::{iter, channel, Stream, Sender};

fn count_down(i: u32, f: Sender<u32, ()>) -> BoxFuture<(), ()> {
    iter((0..i).map(Ok)).fold(f, |f, i| {
        f.send(Ok(i)).map_err(|_| ())
    }).map(|_| ()).boxed()
}

fn main() {
    let (tx, rx) = channel::<u32, ()>();

    rx.for_each(move |v| {
        Ok(())
    }).forget();

    count_down(100_000, tx).forget();
}
```

@carllerche (Member, Author)

A loop would work fine in this case, and I think these constructs should be provided. My guess is that something like Clojure's trampoline (https://clojuredocs.org/clojure.core/trampoline) would enable any pattern that doesn't fit an existing looping combinator.
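
For readers unfamiliar with the pattern, a trampoline can be sketched in plain, synchronous Rust (this is an illustrative sketch, not part of futures-rs; all names here are hypothetical): each step returns either a final value or a thunk producing the next step, and an ordinary loop drives the computation, so the "recursion" uses constant stack space.

```rust
// Each step either finishes with a value or hands back the next step.
enum Trampoline<T> {
    Done(T),
    More(Box<dyn FnOnce() -> Trampoline<T>>),
}

// The driver: a plain loop, so no stack frames accumulate.
fn run<T>(mut t: Trampoline<T>) -> T {
    loop {
        match t {
            Trampoline::Done(v) => return v,
            Trampoline::More(f) => t = f(),
        }
    }
}

// "Tail-recursive" countdown expressed as trampoline steps.
fn count_down(i: u32) -> Trampoline<u32> {
    if i == 0 {
        Trampoline::Done(0)
    } else {
        Trampoline::More(Box::new(move || count_down(i - 1)))
    }
}

fn main() {
    // A million logical recursive calls, constant stack depth.
    assert_eq!(run(count_down(1_000_000)), 0);
}
```

The trade-off is one heap allocation per step for the boxed closure, which is essentially the "implicit heap usage" discussed later in this thread.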

@alexcrichton (Member)

OK, I'm going to close this in favor of #78, where we can catalog and track a number of loop combinators.

@ahicks92

This might be worth documenting somewhere.

I'm considering using this library and stumbled on this while looking for other stuff. The problem is that I can't actually tell the difference between the examples here. Why does one work and not the other?

I imagine that this issue will eventually just disappear, but until then it might be worth a note in the tutorial that explains what's going on.

@alexcrichton (Member)

@camlorn good point! I'll add a note to the FAQ, at least for now.

The crucial difference here is how the "loop" of the computation is being expressed. The first example uses recursion, while the second example uses a stream. All streams work in constant stack space, whereas recursion runs the risk of blowing the stack.

@ahicks92

So, is it safe as long as the recursion goes through a future that isn't immediately ready, or is tail recursion like that always a problem?

Both here and in the FAQ, the answer is just "this can happen". As someone looking in from the outside, without a full understanding of the issues surrounding implementing futures, I'd like to understand the conditions under which my code might explode.

@alexcrichton (Member)

Recursion through a future that isn't immediately ready runs less risk of blowing the stack, but unfortunately it still risks a memory leak. With the removal of `tailcall`, the recursive chain is never reclaimed while the future is running, so to avoid that you'd need something like a stream, where futures are destroyed between iterations.

And yeah it's true that this is definitely a tricky topic. I'd be willing to help out with any questions, and if you've got suggestions on docs I'd be more than willing to beef them up!

@ahicks92

So, this looks like an actual problem. Not sure how big of one. I can't write code yet; I'm still looking at this crate broadly, as it were.

Say I schedule a timeout with Mio. I want this timeout to repeat over and over for the duration of the application. The way I would do this is and_then on the timeout future returned by LoopHandle::timeout.

This sounds like it would leak.

I know this issue is closed, and I'm not sure if my points deserve a separate issue. But what I'm seeing here is a lack of zero-cost abstraction, which I kind of thought was the point. My understanding at the moment is that all tailcalling state machines made with this crate use O(transitions) memory, whereas doing one myself with a big enum and a bunch of annoying match statements is only going to be O(1).

This wouldn't be a problem for an HTTP request, I imagine. But for something like I'm doing--network protocols over UDP--it becomes a potential concern, because the chain of futures might effectively be infinite. Also, I can't quickly answer that question, which is concerning in itself but might be a failure on my part.

I suppose this could be fixed by providing streams for more things. I'm looking at the commit that removed tailcall, but I really think this needs to be replaced with something. It seems to me that futures which couldn't be optimized themselves could still call the optimization pass on their parents in the tree, which would at least gain something. Beyond that, I can't make a useful suggestion, not if the constraint is no implicit heap usage. If we are allowed to use the heap, though, Option<Box<Future>> might be a valid solution: as things like select and map and such finish, they can let go of their boxes.

If I'm treating this as a much bigger problem than it is, say so and I'll go away. I'll be the first to admit that my understanding is poor as of yet.

@alexcrichton (Member)

> This sounds like it would leak.

It depends on how you write the iteration. If done via recursion, yes. If done via something like fold on a stream, then no. You can see an example of this here.

> My understanding of this at the moment is that all tailcalling state machines made with this crate use O(transitions) memory,

To clarify, state machines in general don't have this problem. For example if you write a state machine by hand, you're destroying all previous state when you transition states, so there's nothing that's held around (unless you hold it around).

The problem is that if you use Box to create an "infinite" chain of state machines, they never get a chance to optimize away. The fix is to express the loop as a state machine directly rather than a recursive list of state machines.

In that sense your enum/match would still only use O(1) space. If you were to express it via standard and_then combinators, it depends precisely how you wrote it, but most of the time it wouldn't take O(transitions) memory. Only if you start creating a chain of state machines via recursion does the memory footprint become an issue.
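
As a concrete illustration of the hand-written case described above (a hypothetical sketch, not futures-rs code; all names are illustrative): each transition replaces the previous state in place, so memory stays O(1) no matter how many transitions occur.

```rust
// A hand-rolled state machine: the enum holds only the *current* state.
enum State {
    Counting(u32),
    Done,
}

// Each transition consumes the old state and returns the new one,
// so previous states are destroyed rather than chained together.
fn step(state: State) -> State {
    match state {
        State::Counting(0) => State::Done,
        State::Counting(n) => State::Counting(n - 1),
        State::Done => State::Done,
    }
}

fn main() {
    let mut state = State::Counting(100_000);
    let mut transitions = 0u32;
    // Drive the machine to completion with a plain loop: O(1) memory.
    while let State::Counting(_) = state {
        state = step(state);
        transitions += 1;
    }
    assert_eq!(transitions, 100_001);
}
```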

> I'm looking at the commit that removed tailcall, but I really think this needs to be replaced with something.

Our intention is to provide basic loop combinators, but do you think this may not be sufficient?
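
(For context: futures-rs later gained `future::loop_fn` in this spirit. Below is a synchronous sketch of the idea, with a `Loop` enum signalling whether to continue with new state or break with a result; the names here are illustrative rather than the exact futures-rs API.)

```rust
// The body tells the combinator whether to keep looping or stop.
enum Loop<T, S> {
    Break(T),
    Continue(S),
}

// A loop combinator: threads state through repeated calls of `body`
// using an ordinary loop, so iteration takes constant stack space.
fn loop_fn<S, T, F>(initial: S, mut body: F) -> T
where
    F: FnMut(S) -> Loop<T, S>,
{
    let mut state = initial;
    loop {
        match body(state) {
            Loop::Break(v) => return v,
            Loop::Continue(s) => state = s,
        }
    }
}

fn main() {
    // Sum 1..=10 without recursion: state is (next index, accumulator).
    let total = loop_fn((1u32, 0u32), |(i, acc)| {
        if i > 10 {
            Loop::Break(acc)
        } else {
            Loop::Continue((i + 1, acc + i))
        }
    });
    assert_eq!(total, 55);
}
```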

@ahicks92

I have an actual proposal, which I'm going to go put on #78.

I'm not sure how you're supposed to represent a state machine that transitions back into itself recursively. What is the intended approach, something with fold? One of the things drawing me to this crate is that I could use it to avoid giant match arms. But if that isn't the case, a lot of the appeal goes out of it.

Of course, it's possible that my specific use case can easily be rewritten into something friendly. But it seems to me that the most obvious way to do such a thing is to tailcall like in the first comment.

46bit added a commit to sirpent-team/sirpent-rust that referenced this issue Jan 3, 2017
I hacked in `futures-rs` looping behaviour in this form:

```rust
fn f(arg: u64) -> BoxFuture<…> {
    future::done(Ok(()))
        .and_then(move |_| do_important_thing(arg))
        .and_then(move |returned| {
            if returned.loop_is_complete() {
                future::done(Ok(())).boxed()
            } else {
                f(arg)
            }
        })
        .boxed()
}
```

An unconditional recursion like the above rapidly stack overflowed, which is unsurprising
without something like tail recursion. I'm not 100% sure what the stack would look like here.

Need to devise a better option using:
* rust-lang/futures-rs#62
* rust-lang/futures-rs#78
* https://users.rust-lang.org/t/how-to-create-a-loop-with-futures-rs/7011
withoutboats pushed a commit that referenced this issue Mar 25, 2018
Make boxed generators compatible with futures-cpupool