8280124: Reduce branches decoding latin-1 chars from UTF-8 encoded bytes #7122
On a microbenchmark that zooms in on the logical predicate, the speed-up is closer to 2x. This seems like a transformation a JIT could do automatically. Neither gcc nor clang does it, but icc seems to pull it off (as tested via godbolt.org). It's unclear whether this pattern is common enough to motivate such enhancement work, but it might be of academic interest to attempt it.
LGTM
@cl4es This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 28 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
Thanks for reviewing, everyone! /integrate
Going to push as commit e314a4c.
Your commit was automatically rebased without conflicts.
Mailing list message from John Rose on core-libs-dev:
It's kind of sad that you have to do this by hand. One item on my (very long) wish list for the JVM [...] switch (x) { [...] As you can see, this is again a test for membership [...] If the JVM had such a set-test generator, it could (a) turn [...]
- John
On 18 Jan 2022, at 2:44, Claes Redestad wrote:
This resolves a minor inefficiency in the fast path for decoding latin-1 chars from UTF-8. I also took the opportunity to refactor the StringDecode microbenchmark to align with recent changes to the StringEncode micro.
The inefficiency is that this test is quite branchy:
if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && ...
Since the two constant bytes differ only in the lowest bit, the test can be rewritten as follows, saving a branch:
if ((b1 & 0xfe) == 0xc2 && ...
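The equivalence of the two forms can be checked exhaustively, since a byte has only 256 possible values. The following is a minimal sketch, not code from the PR: the class and method names (`Latin1PredicateCheck`, `branchy`, `masked`, `mismatches`) are hypothetical, and only the first half of each condition (the part before `&& ...`) is modelled. The mask `0xfe` clears the lowest bit, so `0xc2` (binary 11000010) and `0xc3` (binary 11000011) both reduce to `0xc2`; it also discards the sign-extension bits that appear when the byte is promoted to int.

```java
class Latin1PredicateCheck {
    // Original, branchy form of the lead-byte test: two comparisons.
    static boolean branchy(byte b1) {
        return b1 == (byte) 0xc2 || b1 == (byte) 0xc3;
    }

    // Masked form: clear the lowest bit, then compare once.
    // b1 is promoted to int; & 0xfe keeps bits 1..7 only.
    static boolean masked(byte b1) {
        return (b1 & 0xfe) == 0xc2;
    }

    // Counts the byte values for which the two forms disagree.
    static int mismatches() {
        int count = 0;
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (branchy(b) != masked(b)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + mismatches()); // prints "mismatches: 0"
    }
}
```

This only demonstrates that the rewrite preserves behaviour; whether it is faster depends on the JIT and the hardware, which is what the microbenchmarks measure.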
This provides a small speed-up on microbenchmarks where the input can be internally encoded as latin1:
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7122/head:pull/7122
$ git checkout pull/7122
Update a local copy of the PR:
$ git checkout pull/7122
$ git pull https://git.openjdk.java.net/jdk pull/7122/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7122
View PR using the GUI difftool:
$ git pr show -t 7122
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7122.diff