Clear documentation on "streaming" meaning #1582
epage added a commit to epage/winnow that referenced this issue Feb 17, 2023
Inspired by rust-bakery/nom#1160, rust-bakery/nom#1582, rust-bakery/nom#1145#issuecomment-678788326
The main bit of documentation on the streaming parsers I found is this one:
https://docs.rs/nom/latest/nom/#streaming--complete
While documenting the behavior of `nom`, it might be a good idea to stress that `nom` is not streaming in the strict sense: it will never be able to process arbitrary input in constant memory, no matter what the parser does (e.g. `many0_count` could, in an ideal world, run in constant memory). This is because the only way of retrying a partial parse is to reparse everything from the beginning, as opposed to resuming where it actually stopped, as Haskell's `scanner` parser combinator does: https://hackage.haskell.org/package/scanner-0.3.1/docs/Scanner.html#t:Result
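To make the restart cost concrete, here is a minimal sketch that does not use nom at all. `Attempt`, `count_until_newline` and `drive_by_restarting` are hypothetical names invented for this illustration; the only assumption is that the parser keeps no state between calls, so the driver has to keep every byte received so far and hand the whole buffer back on each retry.

```rust
/// Outcome of one parse attempt, loosely modelled on a streaming parser.
#[derive(Debug, PartialEq)]
enum Attempt {
    /// A full line was seen; the value is the number of `a` bytes before `\n`.
    Done(usize),
    /// The input ended before the terminating `\n` was seen.
    Incomplete,
}

/// Hypothetical stateless parser: counts `a` bytes up to the first newline.
/// It always starts at byte 0 of `input`.
fn count_until_newline(input: &[u8]) -> Attempt {
    let mut count = 0;
    for &b in input {
        match b {
            b'\n' => return Attempt::Done(count),
            b'a' => count += 1,
            _ => {}
        }
    }
    Attempt::Incomplete
}

/// Feeds chunks one by one, re-running the parser on the whole buffer each time.
/// Returns (result, final buffer length, total bytes handed to the parser).
fn drive_by_restarting(chunks: &[&[u8]]) -> (Option<usize>, usize, usize) {
    let mut buffer: Vec<u8> = Vec::new();
    let mut bytes_offered = 0;
    for chunk in chunks {
        // Nothing can be dropped from the buffer: a retry starts from the beginning.
        buffer.extend_from_slice(chunk);
        bytes_offered += buffer.len();
        if let Attempt::Done(n) = count_until_newline(&buffer) {
            return (Some(n), buffer.len(), bytes_offered);
        }
    }
    (None, buffer.len(), bytes_offered)
}
```

Calling `drive_by_restarting` with the chunks `aa`, `ba`, `a\n` ends with a 6-byte buffer while 12 bytes in total were handed to the parser, even though the only state the count actually needs is a single integer.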
This makes streaming mode in nom much less useful. In addition, the inability to distinguish between EOF and "there might be more input coming" means that any parser ending with an optional streaming sub-parser will always wait for more input, even when there is none:
#271
One suggestion on that thread recommends using `complete()` to ensure any optional parser is complete, but that breaks reuse of the parser where it is expected to be fully streaming. One problematic example is a parser ending with a `separated_list0`. Either `complete()` is applied and the parser can no longer be reused as a streaming parser, or `separated_list0` will simply never parse successfully, as it will always be waiting for more input after the last item.

If I'm right, this basically means the only workable solution is to use a complete parser all the way and provide all the data in one chunk. There is no real disadvantage in doing so in that case, since a streaming parser would eventually require the entirety of the input in memory anyway (possibly memory-mapped).
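The trailing-list problem can be reproduced with a small hand-rolled sketch. This is not nom's implementation: `ParseError`, `PResult`, `digit`, `comma` and `digit_list` are hypothetical stand-ins written in the spirit of streaming parsers and `separated_list0`, assuming that running out of bytes is always reported as "incomplete".

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    /// More input is needed before a decision can be made.
    Incomplete,
    /// The input definitely does not match.
    Mismatch,
}

type PResult<'a, T> = Result<(&'a [u8], T), ParseError>;

/// Streaming-style parser for one ASCII digit.
fn digit(input: &[u8]) -> PResult<'_, u8> {
    match input.split_first() {
        None => Err(ParseError::Incomplete),
        Some((b, rest)) if b.is_ascii_digit() => Ok((rest, *b - b'0')),
        Some(_) => Err(ParseError::Mismatch),
    }
}

/// Streaming-style parser for the `,` separator.
fn comma(input: &[u8]) -> PResult<'_, ()> {
    match input.split_first() {
        None => Err(ParseError::Incomplete),
        Some((&b',', rest)) => Ok((rest, ())),
        Some(_) => Err(ParseError::Mismatch),
    }
}

/// `digit (comma digit)*`, or an empty list if the first digit does not match.
fn digit_list(input: &[u8]) -> PResult<'_, Vec<u8>> {
    let mut items = Vec::new();
    let (mut rest, first) = match digit(input) {
        Ok(ok) => ok,
        Err(ParseError::Mismatch) => return Ok((input, items)),
        Err(ParseError::Incomplete) => return Err(ParseError::Incomplete),
    };
    items.push(first);
    loop {
        let after_sep = match comma(rest) {
            Ok((r, ())) => r,
            // A byte that is not `,` proves the list is over.
            Err(ParseError::Mismatch) => return Ok((rest, items)),
            // End of buffer: a `,` might still arrive, so no decision is possible.
            Err(ParseError::Incomplete) => return Err(ParseError::Incomplete),
        };
        match digit(after_sep) {
            Ok((r, item)) => {
                items.push(item);
                rest = r;
            }
            // Separator without an item: stop before the separator.
            Err(ParseError::Mismatch) => return Ok((rest, items)),
            Err(ParseError::Incomplete) => return Err(ParseError::Incomplete),
        }
    }
}
```

With this sketch, `digit_list(b"1,2,3")` reports `Incomplete` even when no more bytes will ever arrive, while `digit_list(b"1,2,3;")` succeeds because the trailing `;` proves that the list has ended.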
My suggestions are:

- [ ] Add a note to the `separated_list0` doc stating that streaming parsers should not be used for `sep` if `separated_list0` is the last sub-parser.
- [ ] Accept a `(&[u8], bool)` input where the boolean indicates whether the input is complete or not. If it is complete, a streaming parser should never ask for more and simply fail.

Another (challenging) route would be to make nom fully streaming: `Err::Incomplete` could carry a closure that, when called, will resume the parser with extra input. AFAIK the only way to achieve that in Rust would be to either use a macro such as `mdo` to define all parsers, or use async/await style and use the suspended future's `poll` method as a way to resume parsing from a given point in the code.
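As a rough illustration of the closure idea, here is a hand-written sketch that is independent of nom. `Step`, `count_a` and `run` are hypothetical names, and the continuation is written by hand rather than generated by a macro or by async/await. The sketch also assumes that the caller passes `None` to signal that the input is exhausted, which is one possible way to tell EOF apart from "more may come".

```rust
/// Result of feeding one chunk to a resumable parser.
enum Step {
    /// Parsing finished with a value.
    Done(usize),
    /// More input is needed: call the closure with the next chunk,
    /// or with `None` once the input is known to be exhausted.
    More(Box<dyn FnOnce(Option<&[u8]>) -> Step>),
}

/// Counts `a` bytes across chunks. The only state carried between
/// chunks is the running count, so memory use does not grow with the input.
fn count_a(count_so_far: usize) -> Step {
    Step::More(Box::new(move |chunk: Option<&[u8]>| match chunk {
        // End of input: the running count is the final answer.
        None => Step::Done(count_so_far),
        Some(bytes) => {
            let here = bytes.iter().filter(|&&b| b == b'a').count();
            // Resume from where parsing stopped; the chunk can now be dropped.
            count_a(count_so_far + here)
        }
    }))
}

/// Drives the parser over all chunks, then signals end of input.
/// Returns `None` if the parser still asks for more after end of input.
fn run(chunks: &[&[u8]]) -> Option<usize> {
    let mut step = count_a(0);
    for &chunk in chunks {
        step = match step {
            Step::Done(n) => return Some(n),
            Step::More(resume) => resume(Some(chunk)),
        };
    }
    match step {
        Step::Done(n) => Some(n),
        Step::More(resume) => match resume(None) {
            Step::Done(n) => Some(n),
            Step::More(_) => None,
        },
    }
}
```

Each chunk is inspected exactly once and never buffered, which is the property the restart-based approach cannot offer. Writing every combinator in this style by hand would be tedious, hence the macro or async/await routes mentioned above.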