New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`string` can cause parsing to be slow with `verbose_error` #15

Closed
m4rw3r opened this Issue Nov 27, 2015 · 2 comments

Comments

Projects
None yet
1 participant
@m4rw3r
Owner

m4rw3r commented Nov 27, 2015

Problem

string allocates a copy of the string it was trying to match on error in verbose_error mode. This can cause really bad performance when eg. trying to match multiple different tags using nested or-combinators (used in the mp4-benchmark from nom_benchmarks).

@m4rw3r

This comment has been minimized.

Show comment
Hide comment
@m4rw3r

m4rw3r Feb 28, 2016

Owner

The string parser could be made to just return the Error::Expected on the differing character. This might even allow for removal of the verbose_error feature completely since Error will not contain any heap-allocated data.

This will result in slightly less useful error messages from the string parser, since it will not display the full string. In addition to this the error position returned from the string parser would be set to the unexpected character instead of the start of the match (the current behaviour). This could break some buffer-usage where it is expected to restart the parsing from the start of the attempted string match instead of restarting from the offending character.

Owner

m4rw3r commented Feb 28, 2016

The string parser could be made to just return the Error::Expected on the differing character. This might even allow for removal of the verbose_error feature completely since Error will not contain any heap-allocated data.

This will result in slightly less useful error messages from the string parser, since it will not display the full string. In addition to this the error position returned from the string parser would be set to the unexpected character instead of the start of the match (the current behaviour). This could break some buffer-usage where it is expected to restart the parsing from the start of the attempted string match instead of restarting from the offending character.

@m4rw3r

This comment has been minimized.

Show comment
Hide comment
@m4rw3r

m4rw3r Feb 28, 2016

Owner

Another possibility is to only allow 'static lifetime on the string to match against, but this will probably be too limiting.

Combining #23 with just reporting the offending character (ie. you get a full back-trace pointing to the string parser) will most likely solve the issue where you actually need to see which string it is trying to match.

Owner

m4rw3r commented Feb 28, 2016

Another possibility is to only allow 'static lifetime on the string to match against, but this will probably be too limiting.

Combining #23 with just reporting the offending character (ie. you get a full back-trace pointing to the string parser) will most likely solve the issue where you actually need to see which string it is trying to match.

@m4rw3r m4rw3r referenced this issue Feb 28, 2016

Merged

Simplified error #41

@m4rw3r m4rw3r closed this in #41 Mar 16, 2016

m4rw3r added a commit that referenced this issue Mar 16, 2016

Merge pull request #41 from m4rw3r/simplified_error
Simplified error

Fixes #15 and #23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment