8273021: C2: Improve Add and Xor ideal optimizations #5266

Closed
wants to merge 7 commits into from

Conversation

kelthuzadx
Member

@kelthuzadx kelthuzadx commented Aug 26, 2021

Greetings. This patch adds the following identities to the Ideal transformations of the Xor and Add nodes, respectively, which may enable further optimizations.

~(x-1) => -x
~x + 1 => -x
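
Both identities follow from two's-complement arithmetic, where ~x equals -x - 1. As a minimal sketch (not part of the patch; the class and method names below are invented for illustration), the following Java program checks them on a few boundary values for int and long, including the MIN_VALUE cases where negation wraps around:

```java
public class NegationIdentities {
    // ~(x - 1): an Xor with -1 applied to a subtraction.
    static int notOfDecrement(int x) { return ~(x - 1); }

    // ~x + 1: an Add of 1 applied to an Xor with -1.
    static int incrementOfNot(int x) { return ~x + 1; }

    // Same two shapes for long operands.
    static long notOfDecrement(long x) { return ~(x - 1L); }

    static long incrementOfNot(long x) { return ~x + 1L; }

    // Returns true if both identities hold for every sample value.
    static boolean holdsForSamples() {
        int[] ints = {0, 1, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : ints) {
            if (notOfDecrement(x) != -x || incrementOfNot(x) != -x) {
                return false;
            }
        }
        long[] longs = {0L, 1L, -1L, 42L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long x : longs) {
            if (notOfDecrement(x) != -x || incrementOfNot(x) != -x) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsForSamples());
    }
}
```

This only confirms the arithmetic at the Java level; whether C2 actually emits a single neg instruction still has to be checked in the generated code.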

Verified by inspecting the generated opto assembly; an IR verification test could be added later.

Compiled method (c2)      71    1             compiler.c2.TestAddXorIdeal::test1 (6 bytes)
  0x00007f9e11514800:   sub    $0x18,%rsp
  0x00007f9e11514807:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test1@-1 (line 39)
  0x00007f9e1151480c:   mov    %esi,%eax
  0x00007f9e1151480e:   neg    %eax                         ;*iadd {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test1@4 (line 39)
  0x00007f9e11514810:   add    $0x10,%rsp
  0x00007f9e11514814:   pop    %rbp
  0x00007f9e11514815:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151481c:   ja     0x00007f9e11514823
  0x00007f9e11514822:   retq   

Compiled method (c2)      73    2             compiler.c2.TestAddXorIdeal::test2 (6 bytes)
  0x00007f9e11512480:   sub    $0x18,%rsp
  0x00007f9e11512487:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test2@-1 (line 43)
  0x00007f9e1151248c:   mov    %esi,%eax
  0x00007f9e1151248e:   neg    %eax                         ;*ixor {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test2@4 (line 43)
  0x00007f9e11512490:   add    $0x10,%rsp
  0x00007f9e11512494:   pop    %rbp
  0x00007f9e11512495:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151249c:   ja     0x00007f9e115124a3
  0x00007f9e115124a2:   retq 

Compiled method (c2)      72    3             compiler.c2.TestAddXorIdeal::test3 (8 bytes)
  0x00007f9e11514b00:   sub    $0x18,%rsp
  0x00007f9e11514b07:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test3@-1 (line 47)
  0x00007f9e11514b0c:   mov    %rsi,%rax
  0x00007f9e11514b0f:   neg    %rax                         ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test3@6 (line 47)
  0x00007f9e11514b12:   add    $0x10,%rsp
  0x00007f9e11514b16:   pop    %rbp
  0x00007f9e11514b17:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e11514b1e:   ja     0x00007f9e11514b25
  0x00007f9e11514b24:   retq  

Compiled method (c2)      72    4             compiler.c2.TestAddXorIdeal::test4 (8 bytes)
  0x00007f9e11514500:   sub    $0x18,%rsp
  0x00007f9e11514507:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test4@-1 (line 51)
  0x00007f9e1151450c:   mov    %rsi,%rax
  0x00007f9e1151450f:   neg    %rax                         ;*lxor {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test4@6 (line 51)
  0x00007f9e11514512:   add    $0x10,%rsp
  0x00007f9e11514516:   pop    %rbp
  0x00007f9e11514517:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151451e:   ja     0x00007f9e11514525
  0x00007f9e11514524:   retq  

Compiled method (c2)      72    5             compiler.c2.TestAddXorIdeal::test5 (6 bytes)
  0x00007f9e11518500:   sub    $0x18,%rsp
  0x00007f9e11518507:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test5@-1 (line 55)
  0x00007f9e1151850c:   mov    %esi,%eax
  0x00007f9e1151850e:   neg    %eax                         ;*iadd {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test5@4 (line 55)
  0x00007f9e11518510:   add    $0x10,%rsp
  0x00007f9e11518514:   pop    %rbp
  0x00007f9e11518515:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151851c:   ja     0x00007f9e11518523
  0x00007f9e11518522:   retq  

Compiled method (c2)      74    6             compiler.c2.TestAddXorIdeal::test6 (6 bytes)
  0x00007f9e11512180:   sub    $0x18,%rsp
  0x00007f9e11512187:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test6@-1 (line 59)
  0x00007f9e1151218c:   mov    %esi,%eax
  0x00007f9e1151218e:   neg    %eax                         ;*ixor {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test6@4 (line 59)
  0x00007f9e11512190:   add    $0x10,%rsp
  0x00007f9e11512194:   pop    %rbp
  0x00007f9e11512195:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151219c:   ja     0x00007f9e115121a3
  0x00007f9e115121a2:   retq  

Compiled method (c2)      74    7             compiler.c2.TestAddXorIdeal::test7 (8 bytes)
  0x00007f9e11511e80:   sub    $0x18,%rsp
  0x00007f9e11511e87:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test7@-1 (line 63)
  0x00007f9e11511e8c:   mov    %rsi,%rax
  0x00007f9e11511e8f:   neg    %rax                         ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test7@6 (line 63)
  0x00007f9e11511e92:   add    $0x10,%rsp
  0x00007f9e11511e96:   pop    %rbp
  0x00007f9e11511e97:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e11511e9e:   ja     0x00007f9e11511ea5
  0x00007f9e11511ea4:   retq 


Compiled method (c2)      73    8             compiler.c2.TestAddXorIdeal::test8 (10 bytes)
  0x00007f9e11512780:   sub    $0x18,%rsp
  0x00007f9e11512787:   mov    %rbp,0x10(%rsp)              ;*synchronization entry
                                                            ; - compiler.c2.TestAddXorIdeal::test8@-1 (line 67)
  0x00007f9e1151278c:   mov    %rsi,%rax
  0x00007f9e1151278f:   neg    %rax                         ;*lxor {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - compiler.c2.TestAddXorIdeal::test8@8 (line 67)
  0x00007f9e11512792:   add    $0x10,%rsp
  0x00007f9e11512796:   pop    %rbp
  0x00007f9e11512797:   cmp    0x338(%r15),%rsp             ;   {poll_return}
  0x00007f9e1151279e:   ja     0x00007f9e115127a5
  0x00007f9e115127a4:   retq 

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8273021: C2: Improve Add and Xor ideal optimizations

Reviewers

Contributors

  • yulei <lei.yul@alibaba-inc.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5266/head:pull/5266
$ git checkout pull/5266

Update a local copy of the PR:
$ git checkout pull/5266
$ git pull https://git.openjdk.java.net/jdk pull/5266/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5266

View PR using the GUI difftool:
$ git pr show -t 5266

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5266.diff

@bridgekeeper

@bridgekeeper bridgekeeper bot commented Aug 26, 2021

👋 Welcome back yyang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Aug 26, 2021
@openjdk

@openjdk openjdk bot commented Aug 26, 2021

@kelthuzadx The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler label Aug 26, 2021
@mlbridge

@mlbridge mlbridge bot commented Aug 26, 2021

Webrevs

@kelthuzadx
Member Author

@kelthuzadx kelthuzadx commented Aug 26, 2021

/contributor add yulei lei.yul@alibaba-inc.com

@openjdk

@openjdk openjdk bot commented Aug 26, 2021

@kelthuzadx
Contributor yulei <lei.yul@alibaba-inc.com> successfully added.

  if (phase->type(in1->in(2)) == TypeInt::MINUS_1) {
    return new SubINode(phase->makecon(TypeInt::ZERO), in1->in(1));
  }
} else if (op2 == Op_AddI && phase->type(in1) == TypeInt::MINUS_1) {
Member

@TobiHartmann TobiHartmann Sep 2, 2021

Why do you need to check both inputs for constant -1? Shouldn't AddNode::Ideal canonicalize the inputs and ensure that constants are moved to the second input?

// We also canonicalize the Node, moving constants to the right input,
// and flatten expressions (so that 1+x+2 becomes x+3).
virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);
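
To illustrate the reviewer's point, here is a toy Java model of an add node that first moves a constant to the right input. It is purely hypothetical: it is not HotSpot code, and the `Node`, `Add`, `canonicalize` and `isDecrement` names are invented for this sketch. Once the canonical form is guaranteed, a pattern such as x + (-1) only needs to inspect the second input.

```java
public class CanonicalAddSketch {
    // Minimal expression node: either a named variable or an int constant.
    static final class Node {
        final String name; // non-null for variables, null for constants
        final int value;   // meaningful only for constants

        private Node(String name, int value) {
            this.name = name;
            this.value = value;
        }

        static Node var(String name) { return new Node(name, 0); }

        static Node con(int value) { return new Node(null, value); }

        boolean isCon() { return name == null; }
    }

    // Binary add with two inputs, loosely mirroring in(1) and in(2).
    static final class Add {
        Node in1;
        Node in2;

        Add(Node in1, Node in2) {
            this.in1 = in1;
            this.in2 = in2;
        }

        // Toy stand-in for canonicalization: swap so a constant ends up on the right.
        void canonicalize() {
            if (in1.isCon() && !in2.isCon()) {
                Node tmp = in1;
                in1 = in2;
                in2 = tmp;
            }
        }

        // After canonicalize(), checking only the right input is sufficient.
        boolean isDecrement() {
            return in2.isCon() && in2.value == -1;
        }
    }

    public static void main(String[] args) {
        Add add = new Add(Node.con(-1), Node.var("x")); // written as (-1) + x
        add.canonicalize();                             // becomes x + (-1)
        System.out.println(add.isDecrement());
    }
}
```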

Member Author

@kelthuzadx kelthuzadx Sep 8, 2021

Indeed, commute already moves loads and constants to the right input.

Changed.

* @bug 8273021
* @summary C2: Improve Add and Xor ideal optimizations
* @library /test/lib
* @run main/othervm -XX:-Inline -XX:-TieredCompilation -XX:TieredStopAtLevel=4 -XX:CompileCommand=compileonly,compiler.c2.TestAddXorIdeal::* compiler.c2.TestAddXorIdeal
Member

@TobiHartmann TobiHartmann Sep 2, 2021

What about -XX:CompileCommand=dontinline,compiler.c2.TestAddXorIdeal::test* instead of disabling all inlining and limiting compilation?

Member Author

@kelthuzadx kelthuzadx Sep 8, 2021

Changed. The magic numbers have been replaced with random numbers.

Asserts.assertTrue(test1(i + 5) == -(i + 5));
Asserts.assertTrue(test2(i - 7) == -(i - 7));
Asserts.assertTrue(test3(i + 100) == -(i + 100));
Asserts.assertTrue(test4(i - 1024) == -(i - 1024));
Member

@TobiHartmann TobiHartmann Sep 2, 2021

What about using random numbers for better coverage?

// convert into "x+ -c0" in SubXNode::Ideal. So ~(x-1) will eventually
// be -1^(x+(-1)).
if (op1 == Op_AddI && phase->type(in2) == TypeInt::MINUS_1) {
  if (phase->type(in1->in(2)) == TypeInt::MINUS_1) {
Contributor

@theRealELiu theRealELiu Sep 6, 2021

These two conditions could be combined.

Member Author

@kelthuzadx kelthuzadx Sep 8, 2021

Yes, I've combined these conditions.

Contributor

@theRealELiu theRealELiu left a comment

LGTM

* @bug 8273021
* @summary C2: Improve Add and Xor ideal optimizations
* @library /test/lib
* @run main/othervm -XX:-TieredCompilation -XX:TieredStopAtLevel=4
Member

@TobiHartmann TobiHartmann Sep 9, 2021

TieredStopAtLevel has no effect if Tiered Compilation is turned off. You can remove it.

/*
* @test
* @bug 8273021
* @summary C2: Improve Add and Xor ideal optimizations
Member

@TobiHartmann TobiHartmann Sep 9, 2021

The test needs * @key randomness

}

public static void main(String... args) {
  Random random = new Random();
Member

@TobiHartmann TobiHartmann Sep 9, 2021

You should use Utils.getRandomInstance() from import jdk.test.lib.Utils to ensure that the seed is printed for reproducibility. You can check other tests for an example.
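
The idea behind printing the seed can be sketched with plain `java.util.Random`. This is only an illustration of the principle and not the `jdk.test.lib.Utils` implementation; the `sketch.seed` property name and both helper methods are hypothetical.

```java
import java.util.Random;

public class SeededRandomSketch {
    // Hypothetical property name, used only in this sketch.
    static final String SEED_PROPERTY = "sketch.seed";

    // Takes the seed from the system property if set, otherwise picks a fresh one.
    static long chooseSeed() {
        String configured = System.getProperty(SEED_PROPERTY);
        return configured != null ? Long.parseLong(configured) : new Random().nextLong();
    }

    // Logs the seed so that a failing run can be replayed with the same value.
    static Random newLoggedRandom(long seed) {
        System.out.println("seed = " + seed);
        return new Random(seed);
    }

    public static void main(String[] args) {
        Random random = newLoggedRandom(chooseSeed());
        for (int j = 0; j < 50_000; j++) {
            int i = random.nextInt();
            if (~i + 1 != -i) {
                throw new AssertionError("identity failed for " + i);
            }
        }
    }
}
```

Rerunning with `-Dsketch.seed=<printed value>` would then replay the same sequence of inputs, which is the property the reviewer is asking for.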

Member Author

@kelthuzadx kelthuzadx Sep 10, 2021

That does make sense. Changed.

long n1 = 0;
for (int i = -5_000; i < 5_000; i++) {
  n = random.nextInt();
  Asserts.assertTrue(test1(i + n) == -(i + n));
Member

@TobiHartmann TobiHartmann Sep 9, 2021

Now that you are using random numbers, can't you simply check Asserts.assertTrue(test1(n) == -n)? And just loop for a fixed number of iterations.

public static void main(String... args) {
  Random random = new Random();
  int n = 0;
  long n1 = 0;
Member

@TobiHartmann TobiHartmann Sep 9, 2021

Should be declared in the loop.

Member Author

@kelthuzadx kelthuzadx Sep 10, 2021

Do you mean declared within the loop body? I've changed it, but it looks like a matter of preference.

Member

@TobiHartmann TobiHartmann Sep 10, 2021

Yes, in Java, local variables should be declared as close as possible to the point where they are first used (see, for example, Google's Java Style Guide). The declaration does not affect performance.

Here's how I would write the loop to improve readability:

for (int j = 0; j < 50_000; j++) {
  int i = random.nextInt();
  long l = random.nextLong();
  Asserts.assertTrue(test1(i) == -i);
  Asserts.assertTrue(test2(i) == -i);
  Asserts.assertTrue(test3(l) == -l);
  ...

Summary: No need to use negative initial value for loop induction variable (as it is not even used), increase number of iterations to ensure C2 compilation (you are running without -Xbatch), use same random numbers per loop iteration, use more intuitive variable names.

But these are just code style details, feel free to ignore.

Member Author

@kelthuzadx kelthuzadx Sep 10, 2021

Thanks for your patience. I agree with your comments about using 0 as the initial value and increasing the number of iterations. :)

I've used a and b as variable names, since long l looks like long 1.

@openjdk

@openjdk openjdk bot commented Sep 10, 2021

@kelthuzadx This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8273021: C2: Improve Add and Xor ideal optimizations

Co-authored-by: yulei <lei.yul@alibaba-inc.com>
Reviewed-by: thartmann, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 203 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Sep 10, 2021
Member

@TobiHartmann TobiHartmann left a comment

Thanks for making these changes. Looks good.

@kelthuzadx
Member Author

@kelthuzadx kelthuzadx commented Sep 10, 2021

Thanks @TobiHartmann and @theRealELiu for reviews.

@kelthuzadx
Member Author

@kelthuzadx kelthuzadx commented Sep 13, 2021

/integrate

@openjdk

@openjdk openjdk bot commented Sep 13, 2021

Going to push as commit a73c06d.
Since your change was applied there have been 213 commits pushed to the master branch:

  • 9f86082: 8273610: LogTestFixture::restore_config() should not restore options
  • 2ee1f96: 8273484: Cleanup unnecessary null comparison before instanceof check in java.naming
  • f189dff: 8273595: tools/jpackage tests do not work on apt-based Linux distros like Debian
  • 922e86f: 8273522: Rename test property vm.cds.archived.java.heap to vm.cds.write.archived.java.heap
  • f42b927: 8273609: Fix trivial doc typos in the compiler area
  • e4cd209: 8273611: Remove unused ProfilePrint_lock
  • f690a01: 8273278: Support XSLT on GraalVM Native Image--deterministic bytecode generation in XSLT
  • 5e1df2c: 8273162: AbstractSplittableWithBrineGenerator does not create a random salt
  • d4177a9: 8273351: bad tag in jdk.random module-info.java
  • ec9d1be: 8273194: Document the two possible cases when Lookup::ensureInitialized returns
  • ... and 203 more: https://git.openjdk.java.net/jdk/compare/9bc023220fbbb0b6ea1ed1a0ca2aa3848764e8cd...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Sep 13, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Sep 13, 2021
@openjdk

@openjdk openjdk bot commented Sep 13, 2021

@kelthuzadx Pushed as commit a73c06d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@kelthuzadx kelthuzadx deleted the 2_complement branch Sep 13, 2021
Labels
hotspot-compiler integrated
4 participants