Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary string cloning from the parser #9884

Merged
merged 5 commits into from
Feb 9, 2024

Conversation

charliermarsh
Copy link
Member

Closes #9869.

Copy link
Contributor

github-actions bot commented Feb 7, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@charliermarsh charliermarsh force-pushed the charlie/string-parser branch 3 times, most recently from c9054e1 to efc7d6e Compare February 8, 2024 00:47
@charliermarsh
Copy link
Member Author

Hate this change, didn't even improve benchmarks!!!

@MichaReiser
Copy link
Member

MichaReiser commented Feb 8, 2024

Hate this change, didn't even improve benchmarks!!!

That's because codspeed uses the fixed cost 0 for allocations....

@charliermarsh charliermarsh marked this pull request as ready for review February 8, 2024 18:07
}
content.push(ch);
fn parse_bytes(mut self) -> Result<StringType, LexicalError> {
let index = first_non_ascii_byte(self.source.as_bytes());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BurntSushi - Is this exposed in any crates that we already use? I had to vendor it from bstr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines 219 to 220
// Add the characters before the escape sequence to the string.
let before_with_slash = self.skip_bytes(index + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the index could also include { or } as well, right? The variable name before_with_slash suggests otherwise unless I'm wrong.

@dhruvmanila dhruvmanila added the parser Related to the parser label Feb 8, 2024
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎸

Copy link

codspeed-hq bot commented Feb 8, 2024

CodSpeed Performance Report

Merging #9884 will improve performances by 6.53%

Comparing charlie/string-parser (58178b3) with main (7ca515c)

Summary

⚡ 1 improvements
✅ 29 untouched benchmarks

Benchmarks breakdown

Benchmark main charlie/string-parser Change
linter/default-rules[pydantic/types.py] 8.6 ms 8 ms +6.53%

@charliermarsh
Copy link
Member Author

I'm gonna get to the bottom of this if it's the last thing I do.

@charliermarsh
Copy link
Member Author

I see basically no improvement on local benchmarks either via criterion or hyperfine. I'm not sure why, to be honest. I guess I'll proceed with the change since it's both (1) a better API, and (2) intuitively better to avoid re-allocating here when we don't need to. But I'm a bit confused.

@charliermarsh
Copy link
Member Author

(Ignore current state of the PR, I duplicated the existing and updated code to do some benchmarking. I'll clean up later.)

@charliermarsh
Copy link
Member Author

By removing lexing from the parser benchmark, I'm now able to see a more consistent 1-2% improvement in parse time, so I'll move forward with this. (Benchmarking the individual string-parsing functions (rather than the parser as a whole) shows an enormous speed-up, like 2x.)

@charliermarsh charliermarsh marked this pull request as ready for review February 9, 2024 20:46
@charliermarsh charliermarsh merged commit 6f0e4ad into main Feb 9, 2024
17 checks passed
@charliermarsh charliermarsh deleted the charlie/string-parser branch February 9, 2024 21:03
nkxxll pushed a commit to nkxxll/ruff that referenced this pull request Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parser Related to the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove unnecessary string allocation from the parser
4 participants