-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8347459: C2: missing transformation for chain of shifts/multiplications by constants #23728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back marc-chevalier! A progress list of the required criteria for merging this PR into |
|
@marc-chevalier This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 20 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@jaskarth, @dafedafe, @eme64) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
|
@marc-chevalier The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
jaskarth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice improvement! I'm glad to see that the limitation of Store/Shift folding requiring both shifts to have the same constant is being fixed. I've left some comments here.
I also noticed that in LShiftINode::Ideal, there is a transform that mentions not breaking i2s and i2b patterns. It might be worth looking separately to see if this condition can be relaxed because of the change to StoreNode::Ideal_sign_extended_input.
So |
|
@dean-long Yes! 1 << 33 was already, and is still, transformed into 1 << 1, while 1 << 30 << 3 is NOT transformed into 1 << 33 but directly into 0. The second part is exhibited by this test: @Test
@IR(failOn = {IRNode.LSHIFT})
// Checks (x << 31) << 1 => 0
public int testDoubleShift3(int x) {
return (x << 31) << 1;
}(and a few similar other). I didn't add a test for the simple |
I don't know if it's needed (we might have that covered in other tests), but adding now seems like a good idea for completeness. |
dafedafe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool improvement @marc-chevalier! Thanks a lot!
test/hotspot/jtreg/compiler/c2/irTests/LShiftINodeIdealizationTests.java
Show resolved
Hide resolved
dafedafe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just left one more small comment. Otherwise it looks good to me. Thanks @marc-chevalier!
test/hotspot/jtreg/compiler/c2/irTests/LShiftINodeIdealizationTests.java
Outdated
Show resolved
Hide resolved
…ifts-multiplications-by-constants
eme64
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-chevalier This is good work, thanks for working on this :)
I have a first batch of comments and suggestions.
src/hotspot/share/opto/memnode.cpp
Outdated
| set_req_X(MemNode::ValueIn, shl->in(1), phase); | ||
| return this; | ||
| // If (conIL > conIR) we are inventing 0 lower bits, and throwing | ||
| // away upper bits, but we are not introducing garbage bits by above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean with by above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean from the upper bits, from the left if written like decimal numbers. Rephrasing that.
src/hotspot/share/opto/memnode.cpp
Outdated
| // This case happens when the store source was itself a left shift, that gets merged | ||
| // into the inner left shift of the sign-extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, above you were talking about a left and a right shift. But now you seem to be talking about some "source" left shift and an "inner" left shift. It's a bit confusing. Are you talking about this case?
(StoreB ... (RShiftI _ (LShiftI _ (LShiftI _ valIn conIL1 ) conIL2 ) conIR) )
Where the two left shifts are already combined by Ideal earlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed. You seem confused about the term "source". I'm happy to change it, but I can't see a better word: a store (or move, or assignment) has a source and a destination, that are often resp. the left-hand side and the right-hand side of the expression/statement/instruction. I'm rephrasing that.
| // +------------------------+---------+ | ||
| // 31 8 7 0 | ||
| // v[0..7] is meaningful, but v[8..31] is not. In this case, num_rejected_bits == 24 | ||
| // If we do the shift left then right by 24 bits, we get: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be nice if you explicitly denoted the 3 cases, maybe even using some indentation for emphasis:
Case 1: conIL == conIR
Case 2: conIL > conIR
Case 3: conIL < conIR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could also be nice to say what you are trying to show / prove here.
| // +------------------+---------+-----+ | ||
| // | sign bit | v[0..5] | 0 | | ||
| // +------------------+---------+-----+ | ||
| // 31 8 7 2 1 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make a statement if this is ok or not?
| // | sign bit | v[0..5] | 0 | | ||
| // +------------------+---------+-----+ | ||
| // 31 10 9 4 3 0 | ||
| // |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also have a case where conIL < conIR? What happens then?
test/hotspot/jtreg/compiler/c2/irTests/LShiftLNodeIdealizationTests.java
Show resolved
Hide resolved
|
Fixed all the comments. You can do the next iteration. |
eme64
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-chevalier Nice work. Thanks for all the updates, and extra time spent on the proof!
dafedafe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reworking on the explanation etc. @marc-chevalier!
Still looks ok to me.
…ifts-multiplications-by-constants
|
/integrate I've merged master in it, ran more tests. Seems all good. |
|
@marc-chevalier |
|
/sponsor |
1 similar comment
|
/sponsor |
|
Going to push as commit bdcac98.
Your commit was automatically rebased without conflicts. |
|
@TobiHartmann @marc-chevalier Pushed as commit bdcac98. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
|
@eme64 The command |
This collapses double shift lefts by constants in a single constant: (x << con1) << con2 => x << (con1 + con2). Care must be taken in the case con1 + con2 is bigger than the number of bits in the integer type. In this case, we must simplify to 0.
Moreover, the simplification logic of the sign extension trick had to be improved. For instance, we use
(x << 16) >> 16to convert a 32 bits into a 16 bits integer, with sign extension. When storing this into a 16-bit field, this can be simplified into simplex. But in the case wherexis itself a left-shift expression, sayy << 3, this PR makes the IR looks like(y << 19) >> 16instead of the old((y << 3) << 16) >> 16. The former logic didn't handle the case where the left and the right shift have different magnitude. In this PR, I generalize this simplification to cases where the left shift has a larger magnitude than the right shift. This improvement was needed not to miss vectorization opportunities: without the simplification, we have a left shift and a right shift instead of a single left shift, which confuses the type inference.This also works for multiplications by powers of 2 since they are already translated into shifts.
Thanks,
Marc
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23728/head:pull/23728$ git checkout pull/23728Update a local copy of the PR:
$ git checkout pull/23728$ git pull https://git.openjdk.org/jdk.git pull/23728/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 23728View PR using the GUI difftool:
$ git pr show -t 23728Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23728.diff
Using Webrev
Link to Webrev Comment