line_col method N times slow in large file. #560
Comments
I have rewritten the way line and column are computed, which improves speed considerably. The new approach simply saves the cursor (line, col) while iterating over the pairs. On a large file (a 5000-line JSON case), the duration dropped from 12595 ms to 400 ms.
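For readers following the thread, the "save the cursor while iterating" idea can be sketched roughly as below. This is a minimal illustration and not the code from the linked patch or from pest itself: `LineColCursor` and `advance_to` are hypothetical names, only `\n` is treated as a line break, and columns are counted in chars.

```rust
/// Hypothetical cursor that remembers the last position it resolved,
/// so each lookup only scans the text between two consecutive positions.
struct LineColCursor<'a> {
    input: &'a str,
    pos: usize,  // byte offset already accounted for
    line: usize, // 1-based line number at `pos`
    col: usize,  // 1-based column at `pos`, counted in chars
}

impl<'a> LineColCursor<'a> {
    fn new(input: &'a str) -> Self {
        Self { input, pos: 0, line: 1, col: 1 }
    }

    /// Moves the cursor forward to byte offset `target` and returns (line, col).
    /// `target` must be on a char boundary and not before the current position.
    fn advance_to(&mut self, target: usize) -> (usize, usize) {
        assert!(target >= self.pos, "cursor only moves forward");
        // Scan only the slice that has not been seen yet.
        for ch in self.input[self.pos..target].chars() {
            if ch == '\n' {
                self.line += 1;
                self.col = 1;
            } else {
                self.col += 1;
            }
        }
        self.pos = target;
        (self.line, self.col)
    }
}
```

Because pairs are visited in increasing start order during a forward iteration, the total work across all lookups is proportional to the input length rather than to input length times the number of pairs.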
@huacnlee Are you planning to turn that into a PR for Pest?
Any update?
It is currently a design choice of This does necessarily make The (I suspect a marginally better approach is to rfind to the newline and count columns in one normal loop (e.g. rather than separate memrchr and num_chars calls), since the column length is usually small, and then a bytecount for the line number.) I would, however, be happy to r+ a patch which incrementally tracked line:col in cursors (e.g.
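The from-scratch lookup described in the parenthetical above (find the previous newline, count the column from there, and count newlines for the line number) might look roughly like this using only the standard library. It is a sketch under assumptions, not pest's implementation: `line_col_at` is a hypothetical name, only `\n` is treated as a line break, and plain iterator counting stands in for the `memrchr` and `bytecount` calls mentioned in the comment.

```rust
/// Returns the 1-based (line, column) of byte offset `pos` in `input`.
/// Columns are counted in chars, not bytes.
/// Panics if `pos` is out of range or not on a char boundary.
fn line_col_at(input: &str, pos: usize) -> (usize, usize) {
    let before = &input[..pos];
    // Line number: one more than the number of newline bytes before `pos`.
    let line = before.bytes().filter(|&b| b == b'\n').count() + 1;
    // Start of the current line: just past the last newline, or 0 if none.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    // Column: number of chars between the line start and `pos`, plus one.
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}
```

Each call still scans everything before `pos`, so calling it once per pair stays linear per call; it only reduces the constant factor, which is why the incremental approach is the one that changes the overall complexity.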
note that this may have extra overhead for small inputs and requires two extra dependencies, hence the faster `line_col` was put under the optional `fast-line-col` feature flag. closes #560 Co-authored-by: Tomas Tauber <me@tomtau.be>
…terator. (#754)
* Improve line, col calculation performance by moving a cursor on the Pairs iterator. ref: #707, #560
* Add benchmark for pair.line_col vs position.line_col
* Fix flat_pairs and pairs.next_back to use position.line_col
* Merge line_col method to use `position::line_col`.
* Fix `pair.line_col` to support skipped characters, and add test for rev iter.
When using pest on a large file (a JSON file of about 700 KB), calling `line_col` on a `Pair<R>` appears to have a performance problem: each call looks like it is O(N).
Source code ref:
https://github.com/huacnlee/autocorrect/blob/v1.4.4/src/lib/src/code.rs#L56
The large file case:
https://github.com/microsoft/vscode-loc/blob/release/1.62.0/i18n/vscode-language-pack-zh-hans/translations/main.i18n.json
The benchmark result: