
Incremental input #30

Closed
Marwes opened this issue May 25, 2015 · 5 comments

@Marwes
Owner

Marwes commented May 25, 2015

It would be nice to support reading from streams where the input is produced incrementally, to avoid needing to read the entire input into memory (such as from files). Stream types must be cloneable, which makes it simple to support arbitrary look-ahead, but it also makes it impossible to use an iterator such as ::std::io::Chars.

Without arbitrary look-ahead it would be trivial to support just LL(1) parsers by adding a peek function to Stream, but I don't think it's worth it given how useful the try parser can be (even if it is inefficient).
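For reference, the peek idea could look roughly like this (a hypothetical trait and adapter, not combine's actual API): caching a single item gives an iterator one-token look-ahead without any cloning, which is all an LL(1) parser needs.

```rust
// Hypothetical single-token look-ahead stream; names are illustrative.
trait PeekStream {
    type Item;
    /// Look at the next item without consuming it.
    fn peek(&mut self) -> Option<&Self::Item>;
    /// Consume and return the next item.
    fn uncons(&mut self) -> Option<Self::Item>;
}

/// Adapter giving any iterator LL(1) look-ahead by caching one item.
struct Peekable1<I: Iterator> {
    iter: I,
    peeked: Option<I::Item>,
}

impl<I: Iterator> Peekable1<I> {
    fn new(iter: I) -> Self {
        Peekable1 { iter, peeked: None }
    }
}

impl<I: Iterator> PeekStream for Peekable1<I> {
    type Item = I::Item;

    fn peek(&mut self) -> Option<&I::Item> {
        if self.peeked.is_none() {
            self.peeked = self.iter.next();
        }
        self.peeked.as_ref()
    }

    fn uncons(&mut self) -> Option<I::Item> {
        self.peeked.take().or_else(|| self.iter.next())
    }
}
```

This is essentially what std's Iterator::peekable already provides; the sketch only shows why a peek-based Stream could drop the Clone requirement at the cost of limiting look-ahead to one token.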

Not quite sure how to implement this efficiently yet though.

@michaelsproul

I'm interested in getting this working as I'd like to use parser combinators to parse XML from a serial stream...

One idea I had was to use an intermediate struct in place of the input - something that buffers input items as they're read.

struct BufferedStream<Input, Stream> {
    buffer: Vec<Input>, // items already pulled but not yet consumed
    stream: Stream,     // the underlying incremental source
}

Then parsers could read from the buffer directly if it were non-empty, or else pull a new item from the stream. It prevents the need to "reverse" changes to the stream if a pulled token ends up being unparseable, as it can simply be thrown onto the buffer (no stream cloning required). I will admit I haven't given this a whole lot of thought though... I'll play around with it a bit more.
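A minimal sketch of that idea (names are hypothetical, not a proposed API): take from the buffer if it is non-empty, otherwise pull from the underlying iterator, and push an unparseable item back onto the buffer instead of rewinding the stream.

```rust
// Sketch of the buffering idea: no stream cloning, single push-back buffer.
struct BufferedStream<I: Iterator> {
    buffer: Vec<I::Item>, // pushed-back items, most recently pushed last
    stream: I,            // the underlying incremental source
}

impl<I: Iterator> BufferedStream<I> {
    fn new(stream: I) -> Self {
        BufferedStream { buffer: Vec::new(), stream }
    }

    /// Take the next item: from the buffer if non-empty, else the stream.
    fn next(&mut self) -> Option<I::Item> {
        self.buffer.pop().or_else(|| self.stream.next())
    }

    /// Return an item that turned out to be unparseable.
    fn push_back(&mut self, item: I::Item) {
        self.buffer.push(item);
    }
}
```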

@michaelsproul

Or we could steal a bunch of ideas from nom, which seems to have streaming but isn't generic.

@Marwes
Owner Author

Marwes commented Jun 15, 2015

I might be wrong, but I don't think your BufferedStream handles parsers which can consume multiple tokens before failing. If you only put the failing token back in the buffer, there may still be tokens which have already been accepted and which can't be recovered at the point where parsing is resumed.
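To make the objection concrete, here is a hypothetical sketch of the buffering scheme with a parser that matches 'a', 'b', then fails on the third token. Pushing back only the failing token leaves the already-consumed 'a' and 'b' unrecoverable, so an alternative parser tried afterwards would see input starting at the wrong position.

```rust
// Hypothetical push-back stream, as in the proposal above.
struct PushBack<I: Iterator> {
    buffer: Vec<I::Item>,
    stream: I,
}

impl<I: Iterator> PushBack<I> {
    fn new(stream: I) -> Self {
        PushBack { buffer: Vec::new(), stream }
    }
    fn next(&mut self) -> Option<I::Item> {
        self.buffer.pop().or_else(|| self.stream.next())
    }
    fn push_back(&mut self, item: I::Item) {
        self.buffer.push(item);
    }
}

/// Try to parse 'a', 'b', 'x' in sequence, pushing back only the
/// token that failed to match. Earlier tokens are silently lost.
fn parse_abx(s: &mut PushBack<impl Iterator<Item = char>>) -> bool {
    for expected in ['a', 'b', 'x'] {
        match s.next() {
            Some(c) if c == expected => continue,
            Some(c) => {
                s.push_back(c); // only the failing token is recovered
                return false;
            }
            None => return false,
        }
    }
    true
}
```

Running parse_abx on the input "abc" fails on 'c', and afterwards the stream yields only 'c': the 'a' and 'b' that the failed attempt consumed are gone, which is exactly the multi-token problem described above.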

I am not sure that the way nom handles streaming works well with recursive descent parsers (which is what parser combinators are, at least in this library). If a parser needs more input in nom, I think it basically throws away everything it has parsed up to that point, requests more, and then continues once it gets it (I might be wrong though). If you are deep within a parser at that point, lots of work would be repeated.

@Marwes
Owner Author

Marwes commented Jul 16, 2015

@michaelsproul I have a sketch for how this might be implemented in #37. I think it is rather close to what you had in mind, any comments?

@Marwes Marwes added this to the 1.1 milestone Aug 2, 2015
@Marwes Marwes removed this from the 1.1 milestone Dec 23, 2015
@Marwes Marwes added this to the 2.0 milestone Apr 16, 2016
@Marwes
Owner Author

Marwes commented Oct 22, 2016

Closed by 2.0.0

@Marwes Marwes closed this as completed Oct 22, 2016