-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8273454: C2: Transform (-a)*(-b) into a*b #5403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back zgu! A progress list of the required criteria for merging this PR into |
|
@zhengyu123 The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
src/hotspot/share/opto/mulnode.cpp
Outdated
| Node *in1 = in(1); | ||
| Node *in2 = in(2); | ||
| if (in1->Opcode() == Op_SubI && in2->Opcode() == Op_SubI) { | ||
| Node* n11 = in1->in(1); | ||
| Node* n21 = in2->in(1); | ||
| if (phase->type(n11)->higher_equal(TypeInt::ZERO) && | ||
| phase->type(n21)->higher_equal(TypeInt::ZERO)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking if it's a good idea to move these code into MulNode, as they were actually much the same with MulLNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder that too, so is the rest of MulINode/MulLNode::Ideal() code (and many other places). I am not sure how to workaround the different types, any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a dogfood, but it works. https://gist.github.com/theRealELiu/328d62157975b1f20e3626b3ef747eb4
Too much abstraction makes the code hard to read. One needs to check the concrete class to identify what the code exactly is, E.g. In my patch, add_id() may be TypeInt::ZERO or TypeLong::Zero, even TypeD::ZERO. So I'm not sure if it's a good idea. Is there any guidelines to this issue, try to abstract them or make the readability in the first place? @TobiHartmann
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would also prefer to move the optimization into MulNode::Ideal. @theRealELiu's patch is good but can be further improved by modifying the node inputs instead of returning a new node (similar to the other optimizations in MulNode::Ideal).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, Type::is_zero_type can be used to detect 0 and instead of checking the opcodes, Node::is_Sub should be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks, I will make changes accordingly.
src/hotspot/share/opto/mulnode.cpp
Outdated
| Node *in1 = in(1); | ||
| Node *in2 = in(2); | ||
| if (in1->Opcode() == Op_SubI && in2->Opcode() == Op_SubI) { | ||
| Node* n11 = in1->in(1); | ||
| Node* n21 = in2->in(1); | ||
| if (phase->type(n11)->higher_equal(TypeInt::ZERO) && | ||
| phase->type(n21)->higher_equal(TypeInt::ZERO)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would also prefer to move the optimization into MulNode::Ideal. @theRealELiu's patch is good but can be further improved by modifying the node inputs instead of returning a new node (similar to the other optimizations in MulNode::Ideal).
src/hotspot/share/opto/mulnode.cpp
Outdated
| Node *in1 = in(1); | ||
| Node *in2 = in(2); | ||
| if (in1->Opcode() == Op_SubI && in2->Opcode() == Op_SubI) { | ||
| Node* n11 = in1->in(1); | ||
| Node* n21 = in2->in(1); | ||
| if (phase->type(n11)->higher_equal(TypeInt::ZERO) && | ||
| phase->type(n21)->higher_equal(TypeInt::ZERO)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, Type::is_zero_type can be used to detect 0 and instead of checking the opcodes, Node::is_Sub should be used.
| } | ||
| } | ||
|
|
||
| private static final long[][] longParams = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to https://git.openjdk.java.net/jdk/pull/5266, I would prefer random values for better coverage.
|
|
||
| /** | ||
| * @test | ||
| * @bug 8270366 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug number is incorrect.
| for (int index = 0; index < intParams.length; index ++) { | ||
| int result = intTest(intParams[index][0], intParams[index][1]); | ||
| for (int i = 0; i < 20_000; i++) { | ||
| if (result != intTest(intParams[index][0], intParams[index][1])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some warmup iterations, intTest will be C2 compiled and you are then comparing outputs of the same compiled method. I.e., if there's a bug in the C2 optimization, the test might not catch it. What you should do instead, is to compare the output of the C2 compiled method to the expected value (which is a * b in this case).
You should also prevent inlining of intTest.
The test you added with JDK-8270366 has the same problem.
src/hotspot/share/opto/mulnode.cpp
Outdated
| Node* n21 = in2->in(1); | ||
| if (phase->type(n11)->is_zero_type() && | ||
| phase->type(n21)->is_zero_type()) { | ||
| return make(in1->in(2), in2->in(2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to create a new node? Can't you simply update the inputs like the code below does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed your early comment. Fixed.
| } | ||
|
|
||
| private static void testInt(int a, int b) { | ||
| int expected = (-a) * (-b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about this is the expected value? As the method has been invoked 2000 times, I think it would be compiled by c2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default CompileThreshold is 10K when tiered compilation is disabled, which is the case here, so there is no risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why don't you compute expected as a * b?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep as it is to match testxxx() functions. I think it articulates that JIT-ed result matches interpreter's.
| private static void testLong(long a, long b) { | ||
| long expected = (-a) * (-b); | ||
| for (int i = 0; i < 20_000; i++) { | ||
| if (expected != test(a, b)) { | ||
| throw new RuntimeException("Incorrect result."); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about calculating the expected value outside the iteration to avoid it to be compiled?
private static void testLong() {
for (int i = 0; i < 20_000; i++) {
long a = random.nextLong();
long b = random.nextLong();
long expected = (-a) * (-b);
if (expected != test(a, b)) {
throw new RuntimeException("Incorrect result.");
}
}
}And just call this method once in main to prevent it from being too hot.
e1iu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks, @theRealELiu |
src/hotspot/share/opto/mulnode.cpp
Outdated
| Node *in1 = in(1); | ||
| Node *in2 = in(2); | ||
| if (in1->is_Sub() && in2->is_Sub()) { | ||
| Node* n11 = in1->in(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with below code, I would name the local in11 or simply use phase->type(in1->in(1)) because it's the only user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| import java.util.Random; | ||
|
|
||
| public class TestNegMultiply { | ||
| private static Random random = new Random(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use Utils.getRandomInstance() from jdk.test.lib.Utils to ensure that the seed is printed for reproducibility. You can check other tests for an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| /** | ||
| * @test | ||
| * @bug 8273454 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test needs * @key randomness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
|
|
||
| private static void testInt(int a, int b) { | ||
| int expected = (-a) * (-b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why don't you compute expected as a * b?
|
|
||
| private static void testInt(int a, int b) { | ||
| int expected = (-a) * (-b); | ||
| for (int i = 0; i < 20_000; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need a second loop in here? It's sufficient to set TEST_COUNT high enough to trigger compilation. I would suggest something like this:
private static int testInt(int a, int b) {
return (-a) * (-b);
}
private static void runIntTests() {
for (int i = 0; i < TEST_COUNT; i++) {
int a = random.nextInt();
int b = random.nextInt();
int res = testInt(a, b);
Asserts.assertEQ(a * b, res);
}
}
And then run with -XX:CompileCommand=dontinline,TestNegMultiply::test*. No need to disable OnStackReplacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner loop ensures that all tests hit JIT-ed version. If the transformation is broken, I would prefer the test fails for the very first iteration, instead of somewhere in the middle.
I refactored the code to remove inner loop.
Also, fixed command option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't control the iteration in which the test would fail if there's a bug in C2 (it could only fail for some random values). Therefore, you could as well use random values for the warmup and simply increase TEST_COUNT to ensure that C2 compilation is triggered and we run a reasonable amount of iterations with C2 compiled code.
Your newest version of the test now has the problem that OSR compilation might C2 compile the computation of the expected value and then you are comparing the output of a C2 compiled method to a C2 compiled method instead of the interpreter. You have the following options:
- Compute the expected value as
a * b. In that case it's fine if the computation is C2 compiled as well. - Prevent compilation of the
run*methods (either by disabling OSR compilation or by completely disabling compilation of these methods)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And sorry for being picky here but I would like to keep tests as simple as possible :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed according to you comments.
I really appreciate you suggestions, thanks!
TobiHartmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making these changes, looks good to me.
|
@zhengyu123 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 117 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
chhagedorn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
|
@TobiHartmann @chhagedorn Thanks! |
|
/integrate |
|
Going to push as commit 7c9868c.
Your commit was automatically rebased without conflicts. |
|
@zhengyu123 Pushed as commit 7c9868c. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The transformation reduce instructions in generated code.
x86_64:
Before:
After:
AArch64:
Before:
After:
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5403/head:pull/5403$ git checkout pull/5403Update a local copy of the PR:
$ git checkout pull/5403$ git pull https://git.openjdk.java.net/jdk pull/5403/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5403View PR using the GUI difftool:
$ git pr show -t 5403Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5403.diff