Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve `char::is_ascii_*` codegen #67585

Merged
merged 1 commit into from Feb 12, 2020
Merged

Conversation

@ranma42
Copy link
Contributor

ranma42 commented Dec 24, 2019

This PR is an attempt to fix #65127

A couple of warnings:

  1. the generated code might be further improved (in LLVM and/or MIR) by emitting better comparison sequences; in particular, this would improve the performance of "complex" checks such as those in is_ascii_punctuation
  2. the second commit is currently marked "DO NOT MERGE", because it regresses SIMD on u8 slices; this could likely be fixed by improving the computation/usage of demanded bits in LLVM

An alternative approach to remove the code duplication might be the use of macros, but currently most of the duplication is actually in the doc comments, so maybe just keeping the redundancy could be ok

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Dec 24, 2019

r? @joshtriplett

(rust_highfive has picked a reviewer for you, use r? to override)

@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Dec 24, 2019

Some benchmark numbers:

tartaros:rust ranma42$ rustc -O is_ascii.rs --test
tartaros:rust ranma42$ ./is_ascii --bench

running 22 tests
test is_ascii::byte_bench              ... bench:          65 ns/iter (+/- 6)
test is_ascii::char_bench              ... bench:          63 ns/iter (+/- 5)
test is_ascii_alphabetic::byte_bench   ... bench:          81 ns/iter (+/- 9)
test is_ascii_alphabetic::char_bench   ... bench:         125 ns/iter (+/- 23)
test is_ascii_alphanumeric::byte_bench ... bench:         114 ns/iter (+/- 8)
test is_ascii_alphanumeric::char_bench ... bench:         110 ns/iter (+/- 11)
test is_ascii_control::byte_bench      ... bench:          94 ns/iter (+/- 8)
test is_ascii_control::char_bench      ... bench:         105 ns/iter (+/- 50)
test is_ascii_digit::byte_bench        ... bench:          70 ns/iter (+/- 2)
test is_ascii_digit::char_bench        ... bench:          95 ns/iter (+/- 10)
test is_ascii_graphic::byte_bench      ... bench:          69 ns/iter (+/- 6)
test is_ascii_graphic::char_bench      ... bench:         100 ns/iter (+/- 3)
test is_ascii_hexdigit::byte_bench     ... bench:         118 ns/iter (+/- 10)
test is_ascii_hexdigit::char_bench     ... bench:         137 ns/iter (+/- 5)
test is_ascii_lowercase::byte_bench    ... bench:          70 ns/iter (+/- 10)
test is_ascii_lowercase::char_bench    ... bench:          94 ns/iter (+/- 11)
test is_ascii_punctuation::byte_bench  ... bench:         282 ns/iter (+/- 33)
test is_ascii_punctuation::char_bench  ... bench:         366 ns/iter (+/- 17)
test is_ascii_uppercase::byte_bench    ... bench:          70 ns/iter (+/- 4)
test is_ascii_uppercase::char_bench    ... bench:         111 ns/iter (+/- 16)
test is_ascii_whitespace::byte_bench   ... bench:          85 ns/iter (+/- 6)
test is_ascii_whitespace::char_bench   ... bench:          85 ns/iter (+/- 12)

test result: ok. 0 passed; 0 failed; 0 ignored; 22 measured; 0 filtered out

tartaros:rust ranma42$ ./rust/build/x86_64-apple-darwin/stage2/bin/rustc -O is_ascii.rs --test
tartaros:rust ranma42$ ./is_ascii --bench

running 22 tests
test is_ascii::byte_bench              ... bench:          63 ns/iter (+/- 8)
test is_ascii::char_bench              ... bench:          63 ns/iter (+/- 7)
test is_ascii_alphabetic::byte_bench   ... bench:          81 ns/iter (+/- 8)
test is_ascii_alphabetic::char_bench   ... bench:          81 ns/iter (+/- 7)
test is_ascii_alphanumeric::byte_bench ... bench:         116 ns/iter (+/- 8)
test is_ascii_alphanumeric::char_bench ... bench:         116 ns/iter (+/- 9)
test is_ascii_control::byte_bench      ... bench:          94 ns/iter (+/- 2)
test is_ascii_control::char_bench      ... bench:          93 ns/iter (+/- 7)
test is_ascii_digit::byte_bench        ... bench:          70 ns/iter (+/- 2)
test is_ascii_digit::char_bench        ... bench:          69 ns/iter (+/- 3)
test is_ascii_graphic::byte_bench      ... bench:          69 ns/iter (+/- 2)
test is_ascii_graphic::char_bench      ... bench:          70 ns/iter (+/- 2)
test is_ascii_hexdigit::byte_bench     ... bench:         135 ns/iter (+/- 21)
test is_ascii_hexdigit::char_bench     ... bench:         124 ns/iter (+/- 18)
test is_ascii_lowercase::byte_bench    ... bench:          73 ns/iter (+/- 5)
test is_ascii_lowercase::char_bench    ... bench:          70 ns/iter (+/- 2)
test is_ascii_punctuation::byte_bench  ... bench:         298 ns/iter (+/- 14)
test is_ascii_punctuation::char_bench  ... bench:         274 ns/iter (+/- 44)
test is_ascii_uppercase::byte_bench    ... bench:          69 ns/iter (+/- 7)
test is_ascii_uppercase::char_bench    ... bench:          69 ns/iter (+/- 5)
test is_ascii_whitespace::byte_bench   ... bench:          86 ns/iter (+/- 10)
test is_ascii_whitespace::char_bench   ... bench:          72 ns/iter (+/- 3)

test result: ok. 0 passed; 0 failed; 0 ignored; 22 measured; 0 filtered out

Please do not take these measurement as absolute as they suffer from 2 limitations:

  • are on single elements (and SIMD codegen is affected by this change as shown in https://godbolt.org/z/sK6rKz )
  • are only on valid ASCII codepoints and uniformly distributed on each of them, while in the real world the frequencies are likely to be very skewed
@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Dec 25, 2019

Nice improvement.

Do you have a plan to improve the second commit?

@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Dec 25, 2019

I would love some guidance regarding the second commit. What is the preferred direction?
Accept duplication in the code (just like in the docs)?
Re-use the ranges by means of a macro?

@JohnCSimon

This comment has been minimized.

Copy link
Member

JohnCSimon commented Jan 18, 2020

Ping from triage:
@ranma42 can you post your status on this PR or ping others for additional help? Thanks.

@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Jan 19, 2020

@joshtriplett short term I guess the safest option is to drop the second commit and accept the duplication (the code is trivial, the duplication in the doc comments is actually more heavy-weight than the implementation code). Would that be ok?

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented Jan 20, 2020

Yes, I think the duplication is fine for now.

@ranma42 ranma42 force-pushed the ranma42:fix/char-is-ascii-codegen branch from 7e6e0be to 84435dd Jan 21, 2020
@ranma42 ranma42 marked this pull request as ready for review Jan 21, 2020
@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Jan 24, 2020

This is still marked as S-waiting-on-author.
Should I do something or just ping @joshtriplett ?

@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Feb 1, 2020

I am guessing it is being ignored because of the tags.
@JohnCSimon could you update them?

@mati865

This comment was marked as resolved.

Copy link
Contributor

mati865 commented Feb 1, 2020

@rustbot modify labels: -S-waiting-on-author +S-waiting-on-review

@Dylan-DPC

This comment has been minimized.

Copy link
Member

Dylan-DPC commented Feb 10, 2020

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 11, 2020

☔️ The latest upstream changes (presumably #69030) made this pull request unmergeable. Please resolve the merge conflicts.

These methods explicitly check if a char is in a specific ASCII range,
therefore the `is_ascii()` check is not needed, but LLVM seems to be
unable to remove it.

WARNING: this change improves the performance on ASCII `char`s, but
complex checks such as `is_ascii_punctuation` become slower on
non-ASCII `char`s.
@ranma42 ranma42 force-pushed the ranma42:fix/char-is-ascii-codegen branch from 84435dd to 4e7aeaf Feb 11, 2020
@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Feb 11, 2020

rebased to resolve merge conflicts

@Dylan-DPC

This comment has been minimized.

Copy link
Member

Dylan-DPC commented Feb 12, 2020

@rust-highfive rust-highfive assigned Amanieu and unassigned KodrAus Feb 12, 2020
@Amanieu

This comment has been minimized.

Copy link
Contributor

Amanieu commented Feb 12, 2020

There is a way to improve performance even further, at the cost of uglier code. If you need to check both uppercase and lowercase, you can clear bit 6 (0x20) of the character code to convert to uppercase and then do a single range check on the uppercase value.

So for example, is_ascii_alphabetic could be implemented as 'A'..='Z'.contains(c & !0x20).

@ranma42

This comment has been minimized.

Copy link
Contributor Author

ranma42 commented Feb 12, 2020

Yup, with further tweaking it might be possible to improve the performance.
This PR mainly aims at making char and u8 be on par.

@Amanieu

This comment has been minimized.

Copy link
Contributor

Amanieu commented Feb 12, 2020

Well, let's leave that for a future PR then.

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 12, 2020

📌 Commit 4e7aeaf has been approved by Amanieu

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Feb 12, 2020
…r=Amanieu

Improve `char::is_ascii_*` codegen

This PR is an attempt to fix rust-lang#65127

A couple of warnings:
 1. the generated code might be further improved (in LLVM and/or MIR) by emitting better comparison sequences; in particular, this would improve the performance of "complex" checks such as those in `is_ascii_punctuation`
 2. the second commit is currently marked "DO NOT MERGE", because it regresses SIMD on `u8` slices; this could likely be fixed by improving the computation/usage of demanded bits in LLVM

An alternative approach to remove the code duplication might be the use of macros, but currently most of the duplication is actually in the doc comments, so maybe just keeping the redundancy could be ok
bors added a commit that referenced this pull request Feb 12, 2020
Rollup of 8 pull requests

Successful merges:

 - #67585 (Improve `char::is_ascii_*` codegen)
 - #68914 (Speed up `SipHasher128`.)
 - #68994 (rustbuild: include channel in sanitizers installed name)
 - #69032 (ICE in nightly-2020-02-08: handle TerminatorKind::Yield in librustc_mir::transform::promote_consts::Validator method)
 - #69034 (parser: Remove `Parser::prev_token_kind`)
 - #69042 (Remove backtrace header text)
 - #69059 (Remove a few unused objects)
 - #69089 (Properly use the darwin archive format on Apple targets)

Failed merges:

r? @ghost
@bors bors merged commit 4e7aeaf into rust-lang:master Feb 12, 2020
4 checks passed
4 checks passed
pr Build #20200211.26 succeeded
Details
pr (Linux mingw-check) Linux mingw-check succeeded
Details
pr (Linux x86_64-gnu-llvm-7) Linux x86_64-gnu-llvm-7 succeeded
Details
pr (Linux x86_64-gnu-tools) Linux x86_64-gnu-tools succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

You can’t perform that action at this time.