core/char: Speed up `to_digit()` for `radix <= 10` #55932

Merged
bors merged 5 commits into rust-lang:master from Turbo87:to_digit on Nov 15, 2018

Member

Turbo87 commented Nov 13, 2018

I noticed that `char::to_digit()` seemed to do a bit of extra work for handling `[a-zA-Z]` characters. Since `to_digit(10)` seems to be the most common case (at least in the `rust` codebase) I thought it might be valuable to create a fast path for that case, and according to the benchmarks that I added in one of the commits it seems to pay off. I also created another fast path for the `radix < 10` case, which also seems to have a positive effect.

It is quite possible that I'm measuring something entirely unrelated, though, so please verify these numbers and let me know if I missed something!
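For readers following along, the fast path described above can be sketched roughly as follows. This is a minimal stand-alone sketch, not the code from the PR's commits: the free function `to_digit_sketch` is a hypothetical name, and it assumes a single `radix <= 10` branch, as the PR title suggests.

```rust
/// Hypothetical stand-alone digit conversion with a fast path for small radixes.
/// `to_digit_sketch` is an illustrative name, not part of `core`.
pub fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit_sketch: radix is too high (maximum 36)");

    let val = if radix <= 10 {
        // Fast path: for radix <= 10 only '0'..='9' can be valid digits,
        // so the letter arms are never needed.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            _ => return None,
        }
    } else {
        // General path: digits plus case-insensitive letters.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 10,
            'A'..='Z' => c as u32 - 'A' as u32 + 10,
            _ => return None,
        }
    };

    // A digit is only valid if it is smaller than the radix.
    if val < radix { Some(val) } else { None }
}
```

When `radix` is a compile-time constant, the optimizer can presumably drop whichever branch is unreachable, which would explain why constant-radix calls benefit most.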

Before

# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      16,265 ns/iter (+/- 1,774)
test char::methods::bench_to_digit_radix_16      ... bench:      13,938 ns/iter (+/- 2,479)
test char::methods::bench_to_digit_radix_2       ... bench:      13,090 ns/iter (+/- 524)
test char::methods::bench_to_digit_radix_36      ... bench:      14,236 ns/iter (+/- 1,949)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      16,176 ns/iter (+/- 1,589)
test char::methods::bench_to_digit_radix_16      ... bench:      13,896 ns/iter (+/- 3,140)
test char::methods::bench_to_digit_radix_2       ... bench:      13,158 ns/iter (+/- 1,112)
test char::methods::bench_to_digit_radix_36      ... bench:      14,206 ns/iter (+/- 1,312)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      16,221 ns/iter (+/- 2,423)
test char::methods::bench_to_digit_radix_16      ... bench:      14,361 ns/iter (+/- 3,926)
test char::methods::bench_to_digit_radix_2       ... bench:      13,097 ns/iter (+/- 671)
test char::methods::bench_to_digit_radix_36      ... bench:      14,388 ns/iter (+/- 1,068)

After

# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      11,521 ns/iter (+/- 552)
test char::methods::bench_to_digit_radix_16      ... bench:      12,926 ns/iter (+/- 684)
test char::methods::bench_to_digit_radix_2       ... bench:      11,266 ns/iter (+/- 1,085)
test char::methods::bench_to_digit_radix_36      ... bench:      14,213 ns/iter (+/- 614)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      11,424 ns/iter (+/- 1,042)
test char::methods::bench_to_digit_radix_16      ... bench:      12,854 ns/iter (+/- 1,193)
test char::methods::bench_to_digit_radix_2       ... bench:      11,193 ns/iter (+/- 716)
test char::methods::bench_to_digit_radix_36      ... bench:      14,249 ns/iter (+/- 3,514)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      11,469 ns/iter (+/- 685)
test char::methods::bench_to_digit_radix_16      ... bench:      12,852 ns/iter (+/- 568)
test char::methods::bench_to_digit_radix_2       ... bench:      11,275 ns/iter (+/- 1,356)
test char::methods::bench_to_digit_radix_36      ... bench:      14,188 ns/iter (+/- 1,501)

I ran the benchmark using:

python x.py bench src/libcore --stage 1 --keep-stage 0 --test-args "bench_to_digit"

Collaborator

rust-highfive commented Nov 13, 2018

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

Turbo87 force-pushed the Turbo87:to_digit branch from 50406df to c3ae8ed Nov 13, 2018

Member

scottmcm commented Nov 13, 2018

Consider whether there should also be a benchmark for non-constant radix. I assume this will make that path slower, which is probably a good trade-off, but sounds worth quantifying.

Member

Turbo87 commented Nov 13, 2018

@scottmcm yeah, I was wondering about that too. I'm not sure how to write such a benchmark though. I assume if I use the following code it would just inline the variable and use the constant radix path too:

let radix = 8u32;
c.to_digit(radix);

Any clues on how I can make sure that the compiler does not consider this a constant? Would `"32".parse::<u32>()` be an option?

Member

Xanewok commented Nov 13, 2018

@Turbo87 since we can use libtest here, maybe black_box is worth a shot?

Member

Turbo87 commented Nov 13, 2018

@Xanewok good idea. I assume that was just for blackboxing the output, but it seems it might also work for input. I'll try it out and report back 🤔
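A rough sketch of the idea discussed here: passing the radix through `black_box` so the optimizer cannot treat it as a constant. This is not the benchmark from the PR. It assumes `std::hint::black_box` from current stable Rust rather than the libtest harness, and the helper names `sum_digits` and `time_variable_radix` are made up for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Sums the digit values of all ASCII characters that are valid in `radix`.
fn sum_digits(radix: u32) -> u32 {
    (0u8..=127)
        .map(|b| b as char)
        .filter_map(|c| c.to_digit(radix))
        .sum()
}

/// Runs `sum_digits` `iterations` times with the radix hidden behind
/// `black_box`, returning the accumulated sum and the elapsed time.
fn time_variable_radix(iterations: u32) -> (u32, Duration) {
    let start = Instant::now();
    let mut total = 0u32;
    for _ in 0..iterations {
        // `black_box` on the input asks the optimizer not to assume
        // that the radix is the constant 8 (best effort only).
        let radix = black_box(8u32);
        // `black_box` on the output keeps the call from being optimized away.
        total = total.wrapping_add(black_box(sum_digits(radix)));
    }
    (total, start.elapsed())
}
```

Calling `time_variable_radix(1_000)` and comparing the elapsed time against a variant that passes a literal `8` gives a crude impression of the constant versus variable radix difference. A proper harness would repeat the measurement and report the variance.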

Turbo87 added some commits Nov 13, 2018

core/char: Speed up `to_digit()` for `radix <= 10`

Turbo87 force-pushed the Turbo87:to_digit branch from c3ae8ed to 17f08fe Nov 13, 2018

Member

Turbo87 commented Nov 13, 2018

@scottmcm @Xanewok these are the results including the new bench_to_digit_radix_var benchmark:

Before

# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      16,087 ns/iter (+/- 846)
test char::methods::bench_to_digit_radix_16        ... bench:      14,161 ns/iter (+/- 721)
test char::methods::bench_to_digit_radix_2         ... bench:      14,269 ns/iter (+/- 4,107)
test char::methods::bench_to_digit_radix_36        ... bench:      14,195 ns/iter (+/- 1,169)
test char::methods::bench_to_digit_radix_var       ... bench:      23,104 ns/iter (+/- 2,296)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      16,122 ns/iter (+/- 2,736)
test char::methods::bench_to_digit_radix_16        ... bench:      14,165 ns/iter (+/- 4,147)
test char::methods::bench_to_digit_radix_2         ... bench:      14,048 ns/iter (+/- 4,400)
test char::methods::bench_to_digit_radix_36        ... bench:      14,136 ns/iter (+/- 608)
test char::methods::bench_to_digit_radix_var       ... bench:      23,045 ns/iter (+/- 1,621)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      16,018 ns/iter (+/- 536)
test char::methods::bench_to_digit_radix_16        ... bench:      14,157 ns/iter (+/- 886)
test char::methods::bench_to_digit_radix_2         ... bench:      14,178 ns/iter (+/- 2,260)
test char::methods::bench_to_digit_radix_36        ... bench:      14,177 ns/iter (+/- 865)
test char::methods::bench_to_digit_radix_var       ... bench:      23,043 ns/iter (+/- 1,983)

After

# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      11,518 ns/iter (+/- 497)
test char::methods::bench_to_digit_radix_16        ... bench:      14,623 ns/iter (+/- 1,260)
test char::methods::bench_to_digit_radix_2         ... bench:      11,240 ns/iter (+/- 1,430)
test char::methods::bench_to_digit_radix_36        ... bench:      14,189 ns/iter (+/- 854)
test char::methods::bench_to_digit_radix_var       ... bench:      24,513 ns/iter (+/- 3,337)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      11,536 ns/iter (+/- 1,549)
test char::methods::bench_to_digit_radix_16        ... bench:      14,602 ns/iter (+/- 1,033)
test char::methods::bench_to_digit_radix_2         ... bench:      13,940 ns/iter (+/- 9,252)
test char::methods::bench_to_digit_radix_36        ... bench:      14,303 ns/iter (+/- 749)
test char::methods::bench_to_digit_radix_var       ... bench:      24,298 ns/iter (+/- 1,440)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      11,491 ns/iter (+/- 840)
test char::methods::bench_to_digit_radix_16        ... bench:      14,540 ns/iter (+/- 991)
test char::methods::bench_to_digit_radix_2         ... bench:      11,275 ns/iter (+/- 576)
test char::methods::bench_to_digit_radix_36        ... bench:      14,201 ns/iter (+/- 444)
test char::methods::bench_to_digit_radix_var       ... bench:      24,617 ns/iter (+/- 5,132)

As predicted, the variable-radix case is slightly slower than before, but that seems like a good trade-off to me, since a variable radix is probably much rarer than a constant one.
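To put rough numbers on that trade-off, the relative change can be computed from the "Run 1" rows of the two tables above. This is only an illustrative calculation. The helper names `percent_change` and `report` are hypothetical, and a single run ignores the `+/-` noise reported by the harness.

```rust
/// (benchmark, before ns/iter, after ns/iter), copied from "Run 1" above.
const RUN_1: [(&str, f64, f64); 5] = [
    ("radix_10", 16_087.0, 11_518.0),
    ("radix_16", 14_161.0, 14_623.0),
    ("radix_2", 14_269.0, 11_240.0),
    ("radix_36", 14_195.0, 14_189.0),
    ("radix_var", 23_104.0, 24_513.0),
];

/// Relative change from `before` to `after` in percent (negative = faster).
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats one line per benchmark, e.g. "name: +1.2%".
fn report() -> Vec<String> {
    RUN_1
        .iter()
        .map(|(name, before, after)| {
            format!("{}: {:+.1}%", name, percent_change(*before, *after))
        })
        .collect()
}
```

On these numbers, `radix_10` comes out roughly 28% faster and `radix_var` roughly 6% slower. The differences for `radix_16` and `radix_36` are small enough that they may well be within run-to-run noise.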

core/char: Drop `radix == 10` special case
This seems to perform equally well

Member

alexcrichton commented Nov 14, 2018

@bors: r+

These look like great improvements, thanks @Turbo87!

Contributor

bors commented Nov 14, 2018

📌 Commit 7843e27 has been approved by alexcrichton

kennytm added a commit to kennytm/rust that referenced this pull request Nov 15, 2018

Rollup merge of rust-lang#55932 - Turbo87:to_digit, r=alexcrichton
core/char: Speed up `to_digit()` for `radix <= 10`

bors added a commit that referenced this pull request Nov 15, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 16 pull requests

Successful merges:

 - #54906 (Reattach all grandchildren when constructing specialization graph.)
 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55827 (A few tweaks to iterations/collecting)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)
 - #55948 (submodules: update clippy from d8b42690 to 7e0ddef4)
 - #55956 (add tests for some fixed ICEs)

pietroalbini added a commit to pietroalbini/rust that referenced this pull request Nov 15, 2018

Rollup merge of rust-lang#55932 - Turbo87:to_digit, r=alexcrichton
core/char: Speed up `to_digit()` for `radix <= 10`

bors added a commit that referenced this pull request Nov 15, 2018

Auto merge of #55974 - pietroalbini:rollup, r=pietroalbini
Rollup of 17 pull requests

Successful merges:

 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55750 (Make `NodeId` and `HirLocalId` `newtype_index`)
 - #55778 (Wrap some query results in `Lrc`.)
 - #55781 (More precise spans for temps and their drops)
 - #55785 (Add mem::forget_unsized() for forgetting unsized values)
 - #55852 (Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint)
 - #55865 (Unix RwLock: avoid racy access to write_locked)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55956 (add tests for some fixed ICEs)

Failed merges:

r? @ghost

bors merged commit 7843e27 into rust-lang:master Nov 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

Turbo87 deleted the Turbo87:to_digit branch Nov 15, 2018
