Helper for parsing from a `R: Read` or `R: BufRead` (#1145)
#1055 is also relevant (specifically with regard to being able to read no more than necessary from the underlying reader).
It should even be possible to implement some operations on such a reader.
I have tried to make a custom input type. EDIT: I only managed to get past it by using `Rc`.
FYI, it's now possible to use https://crates.io/crates/nom-bufreader
Could we consider re-opening this issue? As described in #1582, streaming parsers are not a silver bullet. One cannot simply change from complete to streaming parsers and expect things to keep working; I had some pretty nasty, subtle bugs coming from streaming parsers. What would really solve the issue is implementing nom's input traits for `Read`/`BufRead` types. As it stands, the only robust solution is to apply sub-parsers to known-length parts of the input. This can be made to work for binary formats that encode the size of each blob prior to the blob itself, so each blob can be extracted into one buffer.
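The "apply sub-parsers on known-length parts of input" approach can be sketched with the standard library alone: read a length prefix, then read exactly that many bytes, and hand the resulting slice to a *complete* parser. This is only a minimal illustration; `read_sized_blob` is a hypothetical helper, not part of nom's API, and a real format may use a different endianness or prefix width.

```rust
use std::io::{self, Read};

// Read one length-prefixed blob: a little-endian u64 size, then that many
// bytes. The returned Vec can then be handed to a complete nom parser,
// since the slice is known to contain the whole blob.
fn read_sized_blob<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 8];
    reader.read_exact(&mut len_buf)?;
    let len = u64::from_le_bytes(len_buf) as usize;
    let mut blob = vec![0u8; len];
    reader.read_exact(&mut blob)?;
    Ok(blob)
}

fn main() -> io::Result<()> {
    // Simulated input: the blob b"hello" preceded by its length as a LE u64.
    let mut input: &[u8] = &[5, 0, 0, 0, 0, 0, 0, 0, b'h', b'e', b'l', b'l', b'o'];
    let blob = read_sized_blob(&mut input)?;
    assert_eq!(blob, b"hello");
    Ok(())
}
```

Because the whole blob is in memory before parsing starts, the sub-parser never sees `Incomplete` and complete-mode combinators behave correctly.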
In my experience, trying to apply the parsers directly on a `Read` source did not work well. Streaming parsers work well when one or more of the following conditions are met:
I am currently testing a design where the input type indicates whether we're in streaming mode, which should help with writing parsers. But if you want to work with streaming parsers, there's a basic level of complexity you need to manage, and decisions you have to make in your format, that nom cannot make for you, because that's the wrong level for it. Can you tell me what kind of format you are working on? Maybe I can provide some pointers to make things easier.
Thanks for your quick response :) Overall, the format I'm writing a parser for is organized as follows, where "sized XXX" means "an XXX blob preceded by e.g. a u64 stating the size of the blob":
I don't need to backtrack in my specific case, but wouldn't
This is unfortunately not an option. The file I'm parsing may be larger than memory. The problem I'm facing is parsing the header of the format, which is not easily sized without decoding it (it's made of multiple reasonably sized blobs, but I need basic `le_u64`/`be_u64`, `take()`, and `tag()` to gather them). The data itself will have the same issue.
Yes, that would be more about enabling new use cases. It's possible to make an API for arbitrarily large files without async, and I'm sure a `Parser` could still be exposed to async with a blocking read on a channel in a separate thread, fed by the async world, or something like that. So to summarize:
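The "blocking read on a channel in a separate thread fed by the async world" idea can be sketched with `std::sync::mpsc` alone. Here the async side is simulated by plain `send` calls, and `spawn_parser_thread` is a hypothetical helper: the dedicated thread blocks on `recv()` and accumulates bytes where a real implementation would run a synchronous nom parser.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

// Spawn a dedicated thread that performs blocking receives on the channel,
// accumulating the bytes a synchronous parser would consume. The channel is
// fed with chunks from the async world.
fn spawn_parser_thread(rx: Receiver<Vec<u8>>) -> JoinHandle<Vec<u8>> {
    thread::spawn(move || {
        let mut buffer = Vec::new();
        // `recv()` blocks until a chunk arrives or the sender is dropped.
        while let Ok(chunk) = rx.recv() {
            buffer.extend_from_slice(&chunk);
        }
        buffer // a real implementation would run the parser over `buffer`
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let handle = spawn_parser_thread(rx);
    tx.send(b"hello ".to_vec()).unwrap();
    tx.send(b"world".to_vec()).unwrap();
    drop(tx); // closing the channel ends the parser thread's loop
    assert_eq!(handle.join().unwrap(), b"hello world");
}
```

The blocking parser never needs to know about async at all; only the feeding side does.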
EDIT: added mentions of `tag()`
Inspired by:
- rust-bakery/nom#1160
- rust-bakery/nom#1582
- rust-bakery/nom#1145 (comment 678788326)
I use `nom` and `cookie-factory` for parsing in I/O. In every instance, I end up with code similar to the following. It would be useful if `nom` had helpers for this general pattern of "read just enough from some `R: Read` to either satisfy the parser or reach an error case".