Clear documentation on "streaming" meaning #1582

douglas-raillard-arm · 2022-12-08T14:14:12Z

The main bit of documentation on the streaming parsers I found is this one:
https://docs.rs/nom/latest/nom/#streaming--complete

While documenting the behavior of nom, it might be a good idea to stress that nom is not streaming in the sense of being able to process arbitrary input in constant memory: it never can, no matter what the parser does (e.g. many0_count could, in an ideal world, run in constant memory). This is because the only way to retry a partial parse is to reparse everything from the beginning, as opposed to resuming where the parse actually stopped, as Haskell's scanner parser combinator library does:
https://hackage.haskell.org/package/scanner-0.3.1/docs/Scanner.html#t:Result
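To make the memory cost of restarting concrete, here is a minimal std-only sketch. It is not nom code: `Outcome`, `count_a` and `drive_restart` are hypothetical names, and the parser is a toy that counts leading `a` bytes up to a terminator. It only models the "on incomplete input, append more data and reparse from the start" loop described above.

```rust
/// Outcome of one parse attempt in this toy model (hypothetical, not nom's type).
#[derive(Debug, PartialEq)]
enum Outcome {
    Done(usize),
    Incomplete,
}

/// Counts leading b'a' bytes. The count is only known once a different byte
/// is seen; if the buffer ends first, no progress is kept.
fn count_a(buf: &[u8]) -> Outcome {
    match buf.iter().position(|&b| b != b'a') {
        Some(n) => Outcome::Done(n),
        None => Outcome::Incomplete,
    }
}

/// Restart-style driver: every retry reparses from offset 0, so every byte
/// received so far has to stay in `buf`.
/// Returns (count, peak buffer length, total bytes handed to the parser).
fn drive_restart(chunks: &[&[u8]]) -> Option<(usize, usize, usize)> {
    let mut buf: Vec<u8> = Vec::new();
    let mut bytes_offered = 0;
    for &chunk in chunks {
        buf.extend_from_slice(chunk);
        bytes_offered += buf.len();
        if let Outcome::Done(n) = count_a(&buf) {
            return Some((n, buf.len(), bytes_offered));
        }
    }
    None
}
```

With the chunks `aaa`, `aa`, `ab`, the driver ends up holding all 7 bytes and hands 3 + 5 + 7 = 15 bytes to the parser in total, even though the result is a single integer.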

This makes streaming mode in nom much less useful. In addition, the inability to distinguish between EOF and "there might be more input coming" means that any parser ending with an optional streaming sub-parser will always wait for more input, even when there is none:
#271

One suggestion in that thread recommends using complete() to ensure any optional parser is complete, but that breaks reuse of the parser in places where it is expected to be fully streaming. One problematic example is a parser that ends with a separated_list0. Either:

- the separator is a complete parser: it will correctly detect EOF, but if the input happens to be truncated at a sep boundary, it will wrongly terminate successfully (I guess, untested).
- the separator is a streaming parser: separated_list0 will simply never parse successfully, as it will always be waiting for more input after the last item.
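The two cases can be illustrated with a small std-only model. This is a sketch under assumptions, not nom itself: `ParseError`, `Mode`, `sep`, `item` and `digit_list` are hypothetical names, items are single ASCII digits, the separator is a comma, and the loop only imitates the general shape of a separated-list combinator (a recoverable failure ends the list, an incomplete result propagates).

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    Incomplete, // "need more input"
    Failed,     // recoverable mismatch
}

#[derive(Clone, Copy)]
enum Mode {
    Streaming,
    Complete,
}

/// Separator: one ','. On empty input, Streaming asks for more data,
/// while Complete reports an ordinary failure.
fn sep(input: &[u8], mode: Mode) -> Result<&[u8], ParseError> {
    match input.first().copied() {
        Some(b',') => Ok(&input[1..]),
        Some(_) => Err(ParseError::Failed),
        None => match mode {
            Mode::Streaming => Err(ParseError::Incomplete),
            Mode::Complete => Err(ParseError::Failed),
        },
    }
}

/// Item (always streaming in this model): one ASCII digit.
fn item(input: &[u8]) -> Result<(&[u8], u8), ParseError> {
    match input.first().copied() {
        Some(b) if b.is_ascii_digit() => Ok((&input[1..], b - b'0')),
        Some(_) => Err(ParseError::Failed),
        None => Err(ParseError::Incomplete),
    }
}

/// Separated-list loop: `Failed` ends the list successfully,
/// `Incomplete` is passed up to the caller.
fn digit_list(mut input: &[u8], mode: Mode) -> Result<(&[u8], Vec<u8>), ParseError> {
    let mut out = Vec::new();
    match item(input) {
        Ok((rest, d)) => {
            out.push(d);
            input = rest;
        }
        Err(ParseError::Failed) => return Ok((input, out)),
        Err(e) => return Err(e),
    }
    loop {
        let after_sep = match sep(input, mode) {
            Ok(rest) => rest,
            Err(ParseError::Failed) => return Ok((input, out)),
            Err(e) => return Err(e),
        };
        match item(after_sep) {
            Ok((rest, d)) => {
                out.push(d);
                input = rest;
            }
            Err(ParseError::Failed) => return Ok((input, out)),
            Err(e) => return Err(e),
        }
    }
}
```

In this model, `digit_list(b"1,2", Mode::Streaming)` keeps returning `Incomplete` even if `1,2` is the whole input, while `digit_list(b"1,2", Mode::Complete)` succeeds with `[1, 2]` even if `1,2` is only the first chunk of `1,2,3`.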

If I'm right, this basically means the only workable solution is to use complete parsers all the way down and provide all the data in one chunk. There is no real disadvantage in doing so, since a streaming parser would eventually require the entire input in memory anyway (possibly memory-mapped).

My suggestions are:

- Make it obvious in the documentation what can be achieved using streaming parsers and, critically, what cannot.
- Add a warning in the separated_list0 doc stating that streaming parsers should not be used for sep if separated_list0 is the last sub-parser.
- If possible, detect and forbid the case where a streaming parser ends with an optional streaming sub-parser (at runtime or using type tricks).
- Maybe provide a way to send an EOF marker in the input, e.g. by allowing (&[u8], bool) input where the boolean indicates whether the input is complete or not. If it is complete, a streaming parser should never ask for more and should simply fail.
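The last suggestion could look roughly like the following std-only sketch. Everything here is hypothetical and not part of nom: `Input` stands in for the proposed `(&[u8], bool)` pair, and `tag` is a toy literal matcher written only to show how the flag would change the "ask for more" decision.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    Incomplete, // more input could change the outcome
    Failed,     // definite mismatch
}

/// Hypothetical input type: the bytes available so far, plus a flag telling
/// the parser that no further bytes will ever arrive.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Input<'a> {
    bytes: &'a [u8],
    is_complete: bool,
}

/// Streaming-style literal matcher that honours the end-of-input flag.
fn tag<'a>(expected: &[u8], input: Input<'a>) -> Result<Input<'a>, ParseError> {
    let n = expected.len().min(input.bytes.len());
    // A mismatch in the bytes we already have is final either way.
    if input.bytes[..n] != expected[..n] {
        return Err(ParseError::Failed);
    }
    if n < expected.len() {
        // We ran out of bytes mid-literal: only ask for more if more can come.
        return if input.is_complete {
            Err(ParseError::Failed)
        } else {
            Err(ParseError::Incomplete)
        };
    }
    Ok(Input { bytes: &input.bytes[n..], ..input })
}
```

With this shape, the same parser can be driven chunk by chunk with `is_complete: false` and then given a final call with `is_complete: true`, at which point a trailing optional sub-parser fails cleanly instead of waiting forever.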

Another (challenging) route would be to make nom fully streaming: Err::Incomplete could carry a closure that, when called, resumes the parser with extra input. AFAIK the only ways to achieve that in Rust would be either to use a macro such as mdo to define all parsers, or to use async/await style and use the suspended future's poll method as a way to resume parsing from a given point in the code.
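As a rough illustration of the closure idea, here is a hand-written std-only sketch for a single toy parser. It is not a proposal for nom's actual types: `Step`, `count_a` and `run` are hypothetical names, and the continuation is built manually rather than derived by a macro or by async/await, which is exactly the part that would be hard to generalise.

```rust
/// Result of feeding one chunk: a final value, or a continuation that
/// resumes parsing where it stopped (hypothetical stand-in for an
/// Err::Incomplete carrying a closure).
enum Step {
    Done(usize),
    More(Box<dyn FnOnce(&[u8]) -> Step>),
}

/// Counts leading b'a' bytes across chunks. The continuation captures only
/// the running total, so no earlier chunk needs to be kept in memory.
fn count_a(seen: usize, chunk: &[u8]) -> Step {
    match chunk.iter().position(|&b| b != b'a') {
        Some(n) => Step::Done(seen + n),
        None => {
            let total = seen + chunk.len();
            Step::More(Box::new(move |next: &[u8]| count_a(total, next)))
        }
    }
}

/// Driver: each chunk is handed to the current continuation and then dropped.
/// Returns None if the input ends before a terminating byte is seen.
fn run(chunks: &[&[u8]]) -> Option<usize> {
    let mut step = count_a(0, &[]);
    for &chunk in chunks {
        step = match step {
            Step::Done(n) => return Some(n),
            Step::More(resume) => resume(chunk),
        };
    }
    match step {
        Step::Done(n) => Some(n),
        Step::More(_) => None,
    }
}
```

Here the state carried between chunks is a single `usize`, so memory use stays constant regardless of input length, at the cost of writing the resumption logic by hand for each parser.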



douglas-raillard-arm mentioned this issue Jan 4, 2023

Helper for parsing from a R: Read or R: BufRead #1145

Closed

epage mentioned this issue Feb 1, 2023

Post-fork documentation audit winnow-rs/winnow#100

Closed

2 tasks

epage added a commit to epage/winnow that referenced this issue Feb 17, 2023

doc(cookbook): Include more details on Partial parsing

8608032

Inspired by - rust-bakery/nom#1160 - rust-bakery/nom#1582 - rust-bakery/nom#1145#issuecomment-678788326

epage added a commit to epage/winnow that referenced this issue Feb 17, 2023

doc(cookbook): Include more details on Partial parsing

c278418

Inspired by - rust-bakery/nom#1160 - rust-bakery/nom#1582 - rust-bakery/nom#1145#issuecomment-678788326

epage added a commit to epage/winnow that referenced this issue Feb 17, 2023

doc(cookbook): Include more details on Partial parsing

e67774f

Inspired by - rust-bakery/nom#1160 - rust-bakery/nom#1582 - rust-bakery/nom#1145#issuecomment-678788326

epage added a commit to epage/winnow that referenced this issue Feb 17, 2023

doc(cookbook): Include more details on Partial parsing

b1635b7

Inspired by - rust-bakery/nom#1160 - rust-bakery/nom#1582 - rust-bakery/nom#1145#issuecomment-678788326

reinhrst mentioned this issue Jun 19, 2023

Writing a streaming parser #1160

Open
