
core/char: Speed up to_digit() for radix <= 10 #55932

Merged: 5 commits, Nov 15, 2018

@Turbo87 Turbo87 commented Nov 13, 2018

I noticed that `char::to_digit()` seemed to do a bit of extra work for handling `[a-zA-Z]` characters. Since `to_digit(10)` seems to be the most common case (at least in the `rust` codebase) I thought it might be valuable to create a fast path for that case, and according to the benchmarks that I added in one of the commits it seems to pay off. I also created another fast path for the `radix < 10` case, which also seems to have a positive effect.

It is very well possible that I'm measuring something entirely unrelated though, so please verify these numbers and let me know if I missed something!
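
For readers following along, the fast-path idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR's commits: `to_digit_sketch` is a hypothetical free function standing in for `char::to_digit`, and folding the `radix == 10` and `radix < 10` cases into a single branch is an assumption made for brevity.

```rust
/// Hypothetical stand-in for `char::to_digit`; NOT the implementation from this PR.
fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit_sketch: radix is too high (maximum 36)");

    if radix <= 10 {
        // Fast path: for small radices only '0'..='9' can be valid,
        // so the letter ranges never need to be examined.
        return match c {
            '0'..='9' => {
                let val = c as u32 - '0' as u32;
                if val < radix { Some(val) } else { None }
            }
            _ => None,
        };
    }

    // General path: decimal digits first, then letters in either case.
    let val = match c {
        '0'..='9' => c as u32 - '0' as u32,
        'a'..='z' => c as u32 - 'a' as u32 + 10,
        'A'..='Z' => c as u32 - 'A' as u32 + 10,
        _ => return None,
    };
    if val < radix { Some(val) } else { None }
}
```

When the radix is a compile-time constant and the call is inlined, the optimizer can usually drop whichever branch is not taken, which is presumably where the constant-radix gains come from.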

### Before

```
# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      16,265 ns/iter (+/- 1,774)
test char::methods::bench_to_digit_radix_16      ... bench:      13,938 ns/iter (+/- 2,479)
test char::methods::bench_to_digit_radix_2       ... bench:      13,090 ns/iter (+/- 524)
test char::methods::bench_to_digit_radix_36      ... bench:      14,236 ns/iter (+/- 1,949)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      16,176 ns/iter (+/- 1,589)
test char::methods::bench_to_digit_radix_16      ... bench:      13,896 ns/iter (+/- 3,140)
test char::methods::bench_to_digit_radix_2       ... bench:      13,158 ns/iter (+/- 1,112)
test char::methods::bench_to_digit_radix_36      ... bench:      14,206 ns/iter (+/- 1,312)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      16,221 ns/iter (+/- 2,423)
test char::methods::bench_to_digit_radix_16      ... bench:      14,361 ns/iter (+/- 3,926)
test char::methods::bench_to_digit_radix_2       ... bench:      13,097 ns/iter (+/- 671)
test char::methods::bench_to_digit_radix_36      ... bench:      14,388 ns/iter (+/- 1,068)
```

### After

```
# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      11,521 ns/iter (+/- 552)
test char::methods::bench_to_digit_radix_16      ... bench:      12,926 ns/iter (+/- 684)
test char::methods::bench_to_digit_radix_2       ... bench:      11,266 ns/iter (+/- 1,085)
test char::methods::bench_to_digit_radix_36      ... bench:      14,213 ns/iter (+/- 614)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      11,424 ns/iter (+/- 1,042)
test char::methods::bench_to_digit_radix_16      ... bench:      12,854 ns/iter (+/- 1,193)
test char::methods::bench_to_digit_radix_2       ... bench:      11,193 ns/iter (+/- 716)
test char::methods::bench_to_digit_radix_36      ... bench:      14,249 ns/iter (+/- 3,514)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      11,469 ns/iter (+/- 685)
test char::methods::bench_to_digit_radix_16      ... bench:      12,852 ns/iter (+/- 568)
test char::methods::bench_to_digit_radix_2       ... bench:      11,275 ns/iter (+/- 1,356)
test char::methods::bench_to_digit_radix_36      ... bench:      14,188 ns/iter (+/- 1,501)
```

I ran the benchmark using:

```sh
python x.py bench src/libcore --stage 1 --keep-stage 0 --test-args "bench_to_digit"
```

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 13, 2018
@scottmcm
Member

Consider whether there should also be a benchmark for non-constant radix. I assume this will make that path slower, which is probably a good trade-off, but sounds worth quantifying.

@Turbo87
Member Author

Turbo87 commented Nov 13, 2018

@scottmcm yeah, I was wondering about that too. I'm not sure how to write such a benchmark though. I assume if I use the following code it would just inline the variable and use the constant radix path too:

```rust
let radix = 8u32;
c.to_digit(radix);
```

any clues on how I can make sure that the compiler does not consider this a constant? would "32".parse::<u32>() be an option?

@Xanewok
Member

Xanewok commented Nov 13, 2018

@Turbo87 since we can use libtest here, maybe black_box is worth a shot?

@Turbo87
Member Author

Turbo87 commented Nov 13, 2018

@Xanewok good idea. I assume that was just for blackboxing the output, but it seems it might also work for input. I'll try it out and report back 🤔
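
A rough sketch of what black-boxing the input could look like is below. Several things here are assumptions rather than anything shown in this thread: it uses `std::hint::black_box` (stable in current Rust) instead of libtest's `test::black_box`, the `CHARS` input set is made up, and `sum_digits_var_radix` and `time_var_radix` are hypothetical helpers, with the latter only a crude stand-in for `Bencher::iter`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical input set; the PR's actual benchmark data is not shown in this thread.
const CHARS: [char; 9] = ['0', 'x', '2', '5', 'A', 'f', '7', '8', '9'];

/// Sums the digit values of `iterations` characters, using a radix that the
/// optimizer is asked not to see through.
fn sum_digits_var_radix(radix: u32, iterations: usize) -> u32 {
    // `black_box` is a best-effort hint: it asks the compiler to treat the
    // value as opaque, so `to_digit` should not be specialized for one radix.
    let radix = black_box(radix);
    CHARS
        .iter()
        .cycle()
        .take(iterations)
        .filter_map(|c| c.to_digit(radix))
        .sum()
}

/// Times a single run of the loop above.
fn time_var_radix(radix: u32, iterations: usize) -> Duration {
    let start = Instant::now();
    // Black-box the result as well so the computation is not discarded.
    black_box(sum_digits_var_radix(radix, iterations));
    start.elapsed()
}
```

Calling `time_var_radix(8, 10_000)` a few times gives a ballpark figure, though a real measurement would go through the benchmark harness instead.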

@Turbo87
Member Author

Turbo87 commented Nov 13, 2018

@scottmcm @Xanewok these are the results including the new bench_to_digit_radix_var benchmark:

### Before

```
# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      16,087 ns/iter (+/- 846)
test char::methods::bench_to_digit_radix_16        ... bench:      14,161 ns/iter (+/- 721)
test char::methods::bench_to_digit_radix_2         ... bench:      14,269 ns/iter (+/- 4,107)
test char::methods::bench_to_digit_radix_36        ... bench:      14,195 ns/iter (+/- 1,169)
test char::methods::bench_to_digit_radix_var       ... bench:      23,104 ns/iter (+/- 2,296)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      16,122 ns/iter (+/- 2,736)
test char::methods::bench_to_digit_radix_16        ... bench:      14,165 ns/iter (+/- 4,147)
test char::methods::bench_to_digit_radix_2         ... bench:      14,048 ns/iter (+/- 4,400)
test char::methods::bench_to_digit_radix_36        ... bench:      14,136 ns/iter (+/- 608)
test char::methods::bench_to_digit_radix_var       ... bench:      23,045 ns/iter (+/- 1,621)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      16,018 ns/iter (+/- 536)
test char::methods::bench_to_digit_radix_16        ... bench:      14,157 ns/iter (+/- 886)
test char::methods::bench_to_digit_radix_2         ... bench:      14,178 ns/iter (+/- 2,260)
test char::methods::bench_to_digit_radix_36        ... bench:      14,177 ns/iter (+/- 865)
test char::methods::bench_to_digit_radix_var       ... bench:      23,043 ns/iter (+/- 1,983)
```

### After

```
# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      11,518 ns/iter (+/- 497)
test char::methods::bench_to_digit_radix_16        ... bench:      14,623 ns/iter (+/- 1,260)
test char::methods::bench_to_digit_radix_2         ... bench:      11,240 ns/iter (+/- 1,430)
test char::methods::bench_to_digit_radix_36        ... bench:      14,189 ns/iter (+/- 854)
test char::methods::bench_to_digit_radix_var       ... bench:      24,513 ns/iter (+/- 3,337)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      11,536 ns/iter (+/- 1,549)
test char::methods::bench_to_digit_radix_16        ... bench:      14,602 ns/iter (+/- 1,033)
test char::methods::bench_to_digit_radix_2         ... bench:      13,940 ns/iter (+/- 9,252)
test char::methods::bench_to_digit_radix_36        ... bench:      14,303 ns/iter (+/- 749)
test char::methods::bench_to_digit_radix_var       ... bench:      24,298 ns/iter (+/- 1,440)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      11,491 ns/iter (+/- 840)
test char::methods::bench_to_digit_radix_16        ... bench:      14,540 ns/iter (+/- 991)
test char::methods::bench_to_digit_radix_2         ... bench:      11,275 ns/iter (+/- 576)
test char::methods::bench_to_digit_radix_36        ... bench:      14,201 ns/iter (+/- 444)
test char::methods::bench_to_digit_radix_var       ... bench:      24,617 ns/iter (+/- 5,132)
```

As predicted, the variable radix case is slightly slower than before, but that seems like a good tradeoff to me, as a variable radix is probably much rarer than a constant one.

This seems to perform equally well
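
To put the numbers above in relative terms, here is a small sketch that computes the percentage change for a few of the Run 1 figures. The helper names (`percent_change`, `RUN_1`, `run_1_changes`) are made up for illustration, and since the reported `+/-` spreads are fairly wide, single-run comparisons like this are only a rough guide.

```rust
/// Relative change in percent; negative means the "after" run was faster.
fn percent_change(before_ns: f64, after_ns: f64) -> f64 {
    (after_ns - before_ns) / before_ns * 100.0
}

/// Run 1 figures copied from the tables above:
/// (benchmark, before in ns/iter, after in ns/iter).
const RUN_1: [(&str, f64, f64); 3] = [
    ("radix_10", 16_087.0, 11_518.0),
    ("radix_16", 14_161.0, 14_623.0),
    ("radix_var", 23_104.0, 24_513.0),
];

/// Pairs each benchmark name with its relative change in percent.
fn run_1_changes() -> Vec<(&'static str, f64)> {
    RUN_1
        .iter()
        .map(|&(name, before, after)| (name, percent_change(before, after)))
        .collect()
}
```

For Run 1 this works out to roughly 28% less time per iteration for radix 10, and about 3% and 6% more for radix 16 and the variable radix respectively.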
@alexcrichton
Member

@bors: r+

These look like great improvements, thanks @Turbo87!

@bors
Contributor

bors commented Nov 14, 2018

📌 Commit 7843e27 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Nov 15, 2018
bors added a commit that referenced this pull request Nov 15, 2018
Rollup of 16 pull requests

Successful merges:

 - #54906 (Reattach all grandchildren when constructing specialization graph.)
 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55827 (A few tweaks to iterations/collecting)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)
 - #55948 (submodules: update clippy from d8b4269 to 7e0ddef)
 - #55956 (add tests for some fixed ICEs)
pietroalbini added a commit to pietroalbini/rust that referenced this pull request Nov 15, 2018
bors added a commit that referenced this pull request Nov 15, 2018
Rollup of 17 pull requests

Successful merges:

 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55750 (Make `NodeId` and `HirLocalId` `newtype_index`)
 - #55778 (Wrap some query results in `Lrc`.)
 - #55781 (More precise spans for temps and their drops)
 - #55785 (Add mem::forget_unsized() for forgetting unsized values)
 - #55852 (Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint)
 - #55865 (Unix RwLock: avoid racy access to write_locked)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55956 (add tests for some fixed ICEs)

Failed merges:

r? @ghost
@bors bors merged commit 7843e27 into rust-lang:master Nov 15, 2018
@Turbo87 Turbo87 deleted the to_digit branch November 15, 2018 15:46