Avoid allocating in lex_decimal #7252

MichaReiser · 2023-09-09T10:36:14Z

Summary

lex_decimal_number shows up as one of the most expensive operations during lexing. This is partly because it allocates a string before calling into f64::parse or BigInt::parse.

This PR avoids allocating a string for numbers that don't use _ or E.
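
For readers who have not opened the diff, here is a minimal sketch of the borrow-until-you-must-copy idea behind a type like LexedText. It is not the code from this PR: the variant fields, the usize offsets, and the lex_digits helper are assumptions made for illustration, and the sketch only handles the _ separator, not the E case mentioned above.

```rust
/// Token text that borrows from the source until a character has to be dropped.
/// Hypothetical shape; the real LexedText in lexer.rs may differ.
enum LexedText<'a> {
    /// Every character so far is exactly `source[start..end]`, so nothing is copied.
    Source { source: &'a str, start: usize, end: usize },
    /// A character was skipped, so the text had to be copied into a String.
    Owned(String),
}

impl<'a> LexedText<'a> {
    fn new(source: &'a str, start: usize) -> Self {
        LexedText::Source { source, start, end: start }
    }

    /// Appends `c`, which must be the next character in the source.
    fn push(&mut self, c: char) {
        match self {
            LexedText::Source { source, end, .. } => {
                debug_assert!(source[*end..].starts_with(c));
                *end += c.len_utf8();
            }
            LexedText::Owned(owned) => owned.push(c),
        }
    }

    /// Drops the next source character (for example `_`) and switches to an owned copy.
    fn skip_char(&mut self) {
        if let LexedText::Source { source, start, end } = *self {
            *self = LexedText::Owned(source[start..end].to_string());
        }
    }

    fn as_str(&self) -> &str {
        match self {
            LexedText::Source { source, start, end } => &source[*start..*end],
            LexedText::Owned(owned) => owned,
        }
    }
}

/// Hypothetical helper: lexes a run of decimal digits and `_` separators at `start`.
fn lex_digits(source: &str, start: usize) -> LexedText<'_> {
    let mut text = LexedText::new(source, start);
    for c in source[start..].chars() {
        match c {
            '0'..='9' => text.push(c),
            '_' => text.skip_char(),
            _ => break,
        }
    }
    text
}

fn main() {
    // No separator: the text stays a borrowed slice of the source.
    let plain = lex_digits("x = 12345", 4);
    assert!(matches!(plain, LexedText::Source { .. }));
    assert_eq!(plain.as_str().parse::<u64>(), Ok(12345));

    // With separators: the first `_` forces a copy, and later digits are appended to it.
    let separated = lex_digits("1_000_000", 0);
    assert!(matches!(separated, LexedText::Owned(_)));
    assert_eq!(separated.as_str().parse::<u64>(), Ok(1_000_000));
}
```

In this sketch the common case (a number without separators) never touches the allocator, which is the effect the summary describes.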

Test Plan

cargo test

This improves performance for files with many numbers (dataset.py: +3%).

MichaReiser · 2023-09-09T10:36:25Z

Current dependencies on/for this PR:

main
- PR Avoid allocating in lex_decimal #7252 👈

This comment was auto-generated by Graphite.

github-actions · 2023-09-09T10:51:33Z

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

dhruvmanila

Thanks!

I like the LexedText abstraction. I think it could even be used in f-string normalization to normalize {{/}} to {/} using LexedText::skip_char.

konstin · 2023-09-11T06:57:38Z

crates/ruff_python_parser/src/lexer.rs

@@ -1236,6 +1220,49 @@ const fn is_python_whitespace(c: char) -> bool {
    )
 }

+enum LexedText<'a> {


Is that the same abstraction we use for normalizing strings?

The lexer doesn't normalize strings. I haven't looked into what the whole string.rs is doing (or how much of it could be moved into the lexer).

I meant in the formatter. This was just a random thought, because the logic looked familiar from the normalize functions for strings, ints and floats.

It's similar, but we don't use an abstraction in the formatter.

Avoid allocating in lex_decimal

b21518c

MichaReiser added the parser (Related to the parser) label Sep 9, 2023

MichaReiser requested a review from dhruvmanila September 9, 2023 10:46

dhruvmanila approved these changes Sep 9, 2023

Remove push_different and add debug assertion

1a3f51d

MichaReiser enabled auto-merge (squash) September 11, 2023 06:36

MichaReiser merged commit 7440e54 into main Sep 11, 2023
16 checks passed

MichaReiser deleted the lex-decimal-avoid-allocating branch September 11, 2023 06:37

konstin reviewed Sep 11, 2023
