Optimize string handling in lit_token(). #50525

Merged
merged 1 commit into from May 9, 2018

@nnethercote (Contributor) commented May 8, 2018

In the common case, the string value in a string literal Token is the
same as the string value in a string literal LitKind. (The exception is
when escapes or \r are involved.) This patch takes advantage of that to
avoid calling str_lit() and re-interning the string in that case. This
speeds up incremental builds for a few of the rustc-benchmarks, the best
by 3%.
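
The fast path described above can be sketched roughly as follows. This is a simplified illustration, not the actual rustc code: `Interner`, `unescape`, and `lit_value` are hypothetical stand-ins for the real symbol interner, `str_lit()`, and the relevant part of `lit_token()`, and the unescaping handles only a few escapes.

```rust
use std::collections::HashMap;

/// Toy stand-in for a symbol interner (hypothetical, for illustration only).
#[derive(Default)]
struct Interner {
    map: HashMap<String, u32>,
    strings: Vec<String>,
    /// Counts calls to `intern` so the saved work is observable.
    intern_calls: usize,
}

impl Interner {
    fn intern(&mut self, s: &str) -> u32 {
        self.intern_calls += 1;
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.map.insert(s.to_string(), id);
        id
    }

    fn get(&self, id: u32) -> &str {
        &self.strings[id as usize]
    }
}

/// Simplified unescaping (assumption): handles only `\n`, `\\` and `\"`,
/// and drops every `\r`. The real str_lit() covers many more cases.
fn unescape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('\\') => out.push('\\'),
                Some('"') => out.push('"'),
                // Unknown escape: keep it verbatim in this sketch.
                Some(other) => {
                    out.push('\\');
                    out.push(other);
                }
                None => out.push('\\'),
            },
            '\r' => {}
            _ => out.push(c),
        }
    }
    out
}

/// Returns the symbol for the literal's value given the token's symbol.
/// Common case: no backslash and no `\r`, so the token symbol is reused
/// and nothing is unescaped or re-interned.
fn lit_value(interner: &mut Interner, token_sym: u32) -> u32 {
    let needs_processing = {
        let s = interner.get(token_sym);
        s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r')
    };
    if needs_processing {
        let unescaped = unescape(interner.get(token_sym));
        interner.intern(&unescaped)
    } else {
        token_sym
    }
}
```

In this sketch, a literal such as `hello` returns the same symbol it was given and `intern_calls` does not change, while a literal containing an escape takes the slow path and yields a new symbol.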

Benchmarks that got a speedup of 1% or more:

coercions
        avg: -1.1%      min: -3.5%      max: 0.4%
regex-check
        avg: -1.2%      min: -1.5%      max: -0.6%
futures-check
        avg: -0.9%      min: -1.4%      max: -0.3%
futures
        avg: -0.8%      min: -1.3%      max: -0.3%
futures-opt
        avg: -0.7%      min: -1.2%      max: -0.1%
regex
        avg: -0.5%      min: -1.2%      max: -0.1%
regex-opt
        avg: -0.5%      min: -1.1%      max: -0.1%
hyper-check
        avg: -0.7%      min: -1.0%      max: -0.3%

@rust-highfive (Collaborator) commented May 8, 2018

r? @pnkfelix

(rust_highfive has picked a reviewer for you, use r? to override)

// new symbol because the string in the LitKind is different to the
// string in the Token.
let s = &sym.as_str();
if s.contains('\\') || s.contains('\r') {

@michaelwoerister (Contributor) commented May 8, 2018

Since we are already micro-optimizing: This will iterate the string twice. Something like the following might be faster:

if s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r') {
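
For comparison, the two forms of the check can be written side by side. This is a small sketch with hypothetical function names; it only shows that the two checks agree, not how much faster either one is.

```rust
/// Two-scan form, as in the snippet under review: each `contains` call
/// walks the string, so a string with neither character is scanned twice.
fn needs_unescape_two_pass(s: &str) -> bool {
    s.contains('\\') || s.contains('\r')
}

/// Single-scan form from the suggestion: one pass over the raw bytes.
/// Comparing bytes is safe here because `\` and `\r` are ASCII, and in
/// UTF-8 an ASCII byte never appears inside a multi-byte sequence.
fn needs_unescape_one_pass(s: &str) -> bool {
    s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r')
}
```

The two-scan form short-circuits when a backslash is found, so the extra pass is paid only when the first `contains` fails. That includes the common case of a plain literal containing neither character, which is the case this patch is trying to make cheap.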

@michaelwoerister (Contributor) commented May 8, 2018

Looks good to me.

@nnethercote force-pushed the nnethercote:lit_token branch from d4d5d08 to b709fb0 on May 8, 2018
@nnethercote force-pushed the nnethercote:lit_token branch from b709fb0 to 65ea0ff on May 8, 2018

@nnethercote (Contributor, Author) commented May 8, 2018

I updated to include @michaelwoerister's suggestion.

@michaelwoerister (Contributor) commented May 9, 2018

@bors r+

Thanks, @nnethercote!

@bors (Contributor) commented May 9, 2018

📌 Commit 65ea0ff has been approved by michaelwoerister

kennytm added a commit to kennytm/rust that referenced this pull request May 9, 2018
bors added a commit that referenced this pull request May 9, 2018
Rollup of 12 pull requests

Successful merges:

 - #49988 (Mention Result<!, E> in never docs.)
 - #50148 (turn `ManuallyDrop::new` into a constant function)
 - #50432 (Fix rustdoc pathes search)
 - #50456 (Update the Cargo submodule)
 - #50460 (Make `String::new()` const)
 - #50464 (Remove some transmutes)
 - #50505 (Added regression function match value test)
 - #50511 (Add some explanations for #[must_use])
 - #50525 (Optimize string handling in lit_token().)
 - #50527 (Cleanup a `use` in a raw_vec test)
 - #50539 (Add more logarithm constants)
 - #49523 (Update RELEASES.md for 1.26.0)

Failed merges:
bors added a commit that referenced this pull request May 9, 2018
Rollup of 11 pull requests

Successful merges:

 - #49988 (Mention Result<!, E> in never docs.)
 - #50148 (turn `ManuallyDrop::new` into a constant function)
 - #50456 (Update the Cargo submodule)
 - #50460 (Make `String::new()` const)
 - #50464 (Remove some transmutes)
 - #50505 (Added regression function match value test)
 - #50511 (Add some explanations for #[must_use])
 - #50525 (Optimize string handling in lit_token().)
 - #50527 (Cleanup a `use` in a raw_vec test)
 - #50539 (Add more logarithm constants)
 - #49523 (Update RELEASES.md for 1.26.0)

Failed merges:
@bors merged commit 65ea0ff into rust-lang:master on May 9, 2018
1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
@nnethercote deleted the nnethercote:lit_token branch on May 9, 2018