Optimize string handling in lit_token(). #50525

Merged
bors merged 1 commit into rust-lang:master from nnethercote:lit_token on May 9, 2018

@nnethercote
Contributor

nnethercote commented May 8, 2018

In the common case, the string value in a string literal Token is the
same as the string value in a string literal LitKind. (The exception is
when escapes or \r are involved.) This patch takes advantage of that to
avoid calling str_lit() and re-interning the string in that case. This
speeds up incremental builds for a few of the rustc-benchmarks, the best
by 3%.
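
The fast path described above can be sketched roughly as follows. This is a minimal illustration, not the actual rustc code: `Interner`, `Symbol`, `unescape` and `lit_symbol` are hypothetical stand-ins, and `unescape` handles only a few escapes and simply drops `\r`, which is a simplification of what `str_lit()` does.

```rust
use std::collections::HashMap;

/// Toy stand-in for a symbol interner (hypothetical, not rustc's API).
#[derive(Default)]
struct Interner {
    map: HashMap<String, u32>,
    strings: Vec<String>,
    lookups: usize, // counts intern() calls so the saving is observable
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Symbol(u32);

impl Interner {
    fn intern(&mut self, s: &str) -> Symbol {
        self.lookups += 1;
        if let Some(&id) = self.map.get(s) {
            return Symbol(id);
        }
        let id = self.strings.len() as u32;
        self.map.insert(s.to_owned(), id);
        self.strings.push(s.to_owned());
        Symbol(id)
    }

    fn get(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}

/// Simplified unescaping: handles `\n` and `\t`, passes any other escaped
/// character through unchanged, and drops carriage returns.
fn unescape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some(other) => out.push(other), // covers \\ and \"
                None => out.push('\\'),
            },
            '\r' => {} // simplification: drop carriage returns
            other => out.push(other),
        }
    }
    out
}

/// Returns the symbol for the literal's value. When the token text has no
/// backslash and no `\r`, the token's symbol is reused as-is, so no
/// unescaping and no second interning happens.
fn lit_symbol(interner: &mut Interner, token_sym: Symbol) -> Symbol {
    let needs_rewrite = {
        let s = interner.get(token_sym);
        s.contains('\\') || s.contains('\r')
    };
    if needs_rewrite {
        let unescaped = unescape(interner.get(token_sym));
        interner.intern(&unescaped)
    } else {
        token_sym
    }
}
```

In this sketch, calling `lit_symbol` on a symbol for `hello` returns the same symbol without touching the interner, while a symbol for text containing a backslash goes through `unescape` and a fresh `intern` call.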

Benchmarks whose best result (the min column) improved by 1% or more:

coercions
        avg: -1.1%      min: -3.5%      max: 0.4%
regex-check
        avg: -1.2%      min: -1.5%      max: -0.6%
futures-check
        avg: -0.9%      min: -1.4%      max: -0.3%
futures
        avg: -0.8%      min: -1.3%      max: -0.3%
futures-opt
        avg: -0.7%      min: -1.2%      max: -0.1%
regex
        avg: -0.5%      min: -1.2%      max: -0.1%
regex-opt
        avg: -0.5%      min: -1.1%      max: -0.1%
hyper-check
        avg: -0.7%      min: -1.0%      max: -0.3%
@rust-highfive

Collaborator

rust-highfive commented May 8, 2018

r? @pnkfelix

(rust_highfive has picked a reviewer for you, use r? to override)

// new symbol because the string in the LitKind is different to the
// string in the Token.
let s = &sym.as_str();
if s.contains('\\') || s.contains('\r') {

@michaelwoerister

michaelwoerister May 8, 2018

Contributor

Since we are already micro-optimizing: This will iterate the string twice. Something like the following might be faster:

if s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r') {
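
To make the comparison concrete, here is a small sketch of the two checks side by side. The function names are made up for illustration. Scanning bytes is safe here because `\` (0x5C) and `\r` (0x0D) are ASCII, and ASCII byte values never appear inside a multi-byte UTF-8 sequence. Whether the single pass is measurably faster is not shown by this sketch.

```rust
/// Two `contains` calls: when the string has no backslash (the common case),
/// the whole string is scanned a second time for '\r'.
fn needs_unescape_two_pass(s: &str) -> bool {
    s.contains('\\') || s.contains('\r')
}

/// One pass over the raw bytes, checking both characters per byte.
fn needs_unescape_one_pass(s: &str) -> bool {
    s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r')
}
```

Both functions return the same answer for any valid `&str`; only the number of traversals differs.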
@michaelwoerister

Contributor

michaelwoerister commented May 8, 2018

Looks good to me.

Optimize string handling in lit_token().
@nnethercote

Contributor

nnethercote commented May 8, 2018

I updated to include @michaelwoerister's suggestion.

@michaelwoerister

Contributor

michaelwoerister commented May 9, 2018

@bors r+

Thanks, @nnethercote!

@bors

Contributor

bors commented May 9, 2018

📌 Commit 65ea0ff has been approved by michaelwoerister

kennytm added a commit to kennytm/rust that referenced this pull request May 9, 2018

Rollup merge of rust-lang#50525 - nnethercote:lit_token, r=michaelwoerister

bors added a commit that referenced this pull request May 9, 2018

Auto merge of #50546 - kennytm:rollup, r=kennytm
Rollup of 12 pull requests

Successful merges:

 - #49988 (Mention Result<!, E> in never docs.)
 - #50148 (turn `ManuallyDrop::new` into a constant function)
 - #50432 (Fix rustdoc pathes search)
 - #50456 (Update the Cargo submodule)
 - #50460 (Make `String::new()` const)
 - #50464 (Remove some transmutes)
 - #50505 (Added regression function match value test)
 - #50511 (Add some explanations for #[must_use])
 - #50525 (Optimize string handling in lit_token().)
 - #50527 (Cleanup a `use` in a raw_vec test)
 - #50539 (Add more logarithm constants)
 - #49523 (Update RELEASES.md for 1.26.0)

Failed merges:

bors added a commit that referenced this pull request May 9, 2018

Auto merge of #50546 - kennytm:rollup, r=kennytm
Rollup of 11 pull requests

Successful merges:

 - #49988 (Mention Result<!, E> in never docs.)
 - #50148 (turn `ManuallyDrop::new` into a constant function)
 - #50456 (Update the Cargo submodule)
 - #50460 (Make `String::new()` const)
 - #50464 (Remove some transmutes)
 - #50505 (Added regression function match value test)
 - #50511 (Add some explanations for #[must_use])
 - #50525 (Optimize string handling in lit_token().)
 - #50527 (Cleanup a `use` in a raw_vec test)
 - #50539 (Add more logarithm constants)
 - #49523 (Update RELEASES.md for 1.26.0)

Failed merges:

@bors bors merged commit 65ea0ff into rust-lang:master May 9, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@nnethercote nnethercote deleted the nnethercote:lit_token branch May 9, 2018
