This repository has been archived by the owner on May 6, 2018. It is now read-only.

Consider adding io::read_repeating #82

Closed
cramertj opened this issue Nov 7, 2017 · 9 comments

Comments

@cramertj
Contributor

cramertj commented Nov 7, 2017

The current BytesCodec implementation is convenient, but as discussed it requires allocating a BytesMut for the data. There should be an easy way to get a stream of bytes out of an AsyncRead without allocating.
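For reference, the status-quo path looks roughly like this. This is only a sketch: read is assumed to be some AsyncRead, process_buf a user-defined function, and every chunk yielded by the framed stream is a freshly allocated BytesMut:

use futures::Stream;
use tokio_io::codec::{BytesCodec, FramedRead};

// Each decoded chunk arrives as its own BytesMut allocation.
let value_stream = FramedRead::new(read, BytesCodec::new())
    .map(|bytes| process_buf(&bytes[..]));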

One option is to use io::read and stream::unfold:

// `read` is some AsyncRead; `process_buf` turns the bytes just read into an output value.
let buf = [0u8; 1024];
struct LoopState<R, B> { is_first: bool, read: R, buf: B, num_read: usize }
let init = LoopState { is_first: true, read, buf, num_read: 0 };
let value_stream = stream::unfold(init, |state| {
    if state.num_read == 0 && !state.is_first { return None; }
    Some(io::read(state.read, state.buf).map(|(read, buf, num_read)| {
        let output = process_buf(&buf[..num_read]);
        (output, LoopState { is_first: false, read, buf, num_read })
    }))
});

This is pretty unergonomic, though. Consider adding a function like this:

pub fn read_repeating<R, B, F, Fut, It, E>(read: R, buf: B, f: F) -> RepeatingRead<R, B, F, Fut, It>
where
    R: AsyncRead,
    B: AsMut<[u8]>,
    F: FnMut(&[u8]) -> Option<Fut>,
    Fut: Future<Item = It, Error = E>,
    E: From<io::Error>,

where RepeatingRead<R, B, F, Fut, It> implements Stream<Item = It, Error = E>.

Using it would look like this:

let value_stream = io::repeating_read(read, [0; 1024], |bytes| process_buf(bytes));

This could potentially help with tokio-rs/tokio-core#276.

However, it also feels like this overlaps quite a bit with the existing encoder/decoder functionality. It seems like the main difference is passing in a reference to a user-provided buffer instead of yielding a BytesMut. Comparatively, read_repeating is more limiting for the end user, because they must either be able to yield an item from any incoming chunk of data or save intermediate state in their closure upvars (illustrated below). In return, read_repeating reduces the number of allocations.
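To illustrate the "intermediate state in closure upvars" case, here is a hypothetical use of the proposed function. Nothing here exists in tokio-io yet, read is a placeholder AsyncRead, and it assumes that returning None from the closure means "no complete item yet, keep reading"; the sketch yields one item per newline-terminated line while holding the partial line in an upvar:

use futures::future;
use std::io::Error as IoError;

let mut pending: Vec<u8> = Vec::new();
let value_stream = read_repeating(read, [0u8; 1024], move |bytes| {
    pending.extend_from_slice(bytes);
    match pending.iter().position(|&b| b == b'\n') {
        // A full line is buffered: drain it (newline included) and yield it.
        Some(idx) => {
            let line: Vec<u8> = pending.drain(..idx + 1).collect();
            Some(future::ok::<_, IoError>(line))
        }
        // Not enough data yet: ask the combinator to read again.
        None => None,
    }
});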

@alexcrichton
Contributor

I'm having trouble just grokking the 6 (!) type parameters here... Perhaps this could be implemented externally first to see if it gains traction, and then maybe moved in?

@cramertj
Contributor Author

cramertj commented Nov 7, 2017

Yeah, it's definitely hard to read, and I'm not sure how to make that easier. The basic idea is just that you're reading again and again into a user-provided buffer and passing a reference to the data that was read into a closure, which can handle the data appropriately: either performing some side effect or producing a Future of some data.
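To make the shape concrete, here is a rough sketch of what such a combinator could look like on futures 0.1 / tokio-io 0.1. This is illustrative only, not the prototype mentioned below; it assumes a None return from the closure means "no item yet" and that EOF ends the stream:

extern crate futures;
extern crate tokio_io;

use std::io::{self, Read};
use futures::{Async, Future, Poll, Stream};
use tokio_io::AsyncRead;

pub struct RepeatingRead<R, B, F, Fut> {
    read: R,
    buf: B,
    f: F,
    pending: Option<Fut>, // future returned by the closure, not yet resolved
}

pub fn read_repeating<R, B, F, Fut>(read: R, buf: B, f: F) -> RepeatingRead<R, B, F, Fut> {
    RepeatingRead { read, buf, f, pending: None }
}

impl<R, B, F, Fut, It, E> Stream for RepeatingRead<R, B, F, Fut>
where
    R: AsyncRead,
    B: AsMut<[u8]>,
    F: FnMut(&[u8]) -> Option<Fut>,
    Fut: Future<Item = It, Error = E>,
    E: From<io::Error>,
{
    type Item = It;
    type Error = E;

    fn poll(&mut self) -> Poll<Option<It>, E> {
        loop {
            // If the closure handed back a future on an earlier read, drive it
            // to completion before reading again.
            if let Some(mut fut) = self.pending.take() {
                match fut.poll()? {
                    Async::Ready(item) => return Ok(Async::Ready(Some(item))),
                    Async::NotReady => {
                        self.pending = Some(fut);
                        return Ok(Async::NotReady);
                    }
                }
            }
            // Read into the caller-provided buffer (AsyncRead: Read here, with
            // WouldBlock signalling "not ready").
            let n = match self.read.read(self.buf.as_mut()) {
                Ok(n) => n,
                Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(Async::NotReady),
                Err(e) => return Err(E::from(e)),
            };
            if n == 0 {
                return Ok(Async::Ready(None)); // EOF ends the stream
            }
            // Hand the fresh bytes to the closure; None means "no item yet",
            // so loop around and read more.
            self.pending = (self.f)(&self.buf.as_mut()[..n]);
        }
    }
}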

@cramertj
Contributor Author

cramertj commented Nov 7, 2017

I'd be happy to spin it into a gist/PR/its own crate if you think that would make the intent clearer. I don't think many people would download a crate just for this function, both because of overhead and because of the difficulty of discovering the functionality.

The goal is just to provide an alternative of sorts to Decoder that handles the incoming bytes in a buffer directly, rather than spinning it out into a Bytes.

@alexcrichton
Contributor

I agree that discoverability is a problem, yeah, but it's not really tenable for us to just add every combinator we can think of to tokio-io. I feel like we'd need some threshold for inclusion beyond "a PR was sent", although I'm not quite sure what precisely such a threshold would be...

@cramertj
Contributor Author

cramertj commented Nov 8, 2017

it's not really tenable for us to just add every combinator we can think of to tokio-io

Yeah, I totally get that. I see io::repeating_read(read, [0; 1024], process_buf) as equivalent to read.framed(Bytes).map(process_buf), except that it lets users provide their own buffer and so avoids the unnecessary heap allocation of the intermediate buffer. If that's behavior we want to encourage, then I think repeating_read should be included. If the opinion is that these intermediate allocations don't matter enough to merit the additional method and the ergonomics cost, then I'm happy to close this, as I agree that it's better to provide a single way to do things where possible. However, if we want to encourage zero-alloc parsing of the incoming buffer, then tokio-io needs something like repeating_read.
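Written out, the two formulations being compared look roughly like this. read_a, read_b, and process_buf are placeholders, and io::repeating_read is the proposed, not-yet-existing function under tokio_io::io; FramedRead and BytesCodec are the existing pieces:

use futures::{future, Stream};
use std::io::Error as IoError;
use tokio_io::codec::{BytesCodec, FramedRead};
use tokio_io::io;

// Existing: each decoded chunk is handed over as its own BytesMut.
let framed_values = FramedRead::new(read_a, BytesCodec::new())
    .map(|bytes| process_buf(&bytes[..]));

// Proposed: the caller supplies one buffer up front; no per-chunk allocation.
let repeating_values = io::repeating_read(read_b, [0u8; 1024], |bytes| {
    Some(future::ok::<_, IoError>(process_buf(bytes)))
});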

@cramertj
Contributor Author

cramertj commented Nov 9, 2017

I've written a prototype implementation of this here (FWIW there are only 4 type parameters 😄).

@alexcrichton
Contributor

Ah, so this is sort of where the Decoder and Encoder style traits are intended as "easy mode": a good entry point, but not a one-size-fits-all solution. Most of the time, when we try to push them further toward "one size fits all", they become very difficult to work with, so we opted to have an "easy, but not always production-ready" mode and leave other abstractions up to users.

@carllerche
Member

Yes, tokio-io shouldn't include all possible permutations. It is totally fine to move helpers to external crates.

@carllerche
Member

Closing: this was leaning towards "won't merge", and there has been no follow-up discussion. Please open a new issue if you want to continue this discussion.
