8273454: C2: Transform (-a)*(-b) into a*b #5403

Closed
wants to merge 11 commits into from

Contributor

@zhengyu123 zhengyu123 commented Sep 7, 2021

The transformation reduces the number of instructions in the generated code.

x86_64:

Before:

  0x00007fb92c78b3ac:   neg    %esi
  0x00007fb92c78b3ae:   neg    %edx
  0x00007fb92c78b3b0:   mov    %esi,%eax
  0x00007fb92c78b3b2:   imul   %edx,%eax                    ;*imul {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - TestSub::runSub@4 (line 9)

After:

                                                           ; - TestSub::runSub@-1 (line 9)
  0x00007fc8c05b74ac:   mov    %esi,%eax
  0x00007fc8c05b74ae:   imul   %edx,%eax                    ;*imul {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - TestSub::runSub@4 (line 9)

AArch64:

Before:

 0x0000ffff814b4a70:   neg     w11, w1
 0x0000ffff814b4a74:   mneg    w0, w2, w11                 ;*imul {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - TestSub::runSub@4 (line 9)

After:

 0x0000ffff794a67f0:   mul     w0, w1, w2                  ;*imul {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - TestSub::runSub@4 (line 9)
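The rewrite is safe because negation and multiplication both wrap modulo 2^32 (or 2^64) in Java, so the two sign flips cancel even when an operand is MIN_VALUE. The following is a minimal sketch, not code from this PR, that checks the identity on a few edge values; the class and method names are made up for illustration.

```java
// Illustrative check, not part of the PR: with Java's wrapping
// two's-complement arithmetic, (-a) * (-b) and a * b produce the same bits.
final class NegMulIdentity {
    static int negMul(int a, int b) {
        return (-a) * (-b);
    }

    static long negMul(long a, long b) {
        return (-a) * (-b);
    }

    static boolean holdsFor(int a, int b) {
        return negMul(a, b) == a * b;
    }

    static boolean holdsFor(long a, long b) {
        return negMul(a, b) == a * b;
    }

    public static void main(String[] args) {
        // Edge values where negation or multiplication overflows.
        int[] ints = {0, 1, -1, 7, Integer.MAX_VALUE, Integer.MIN_VALUE};
        long[] longs = {0L, 1L, -1L, 7L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (int a : ints) {
            for (int b : ints) {
                if (!holdsFor(a, b)) {
                    throw new AssertionError("int mismatch: " + a + ", " + b);
                }
            }
        }
        for (long a : longs) {
            for (long b : longs) {
                if (!holdsFor(a, b)) {
                    throw new AssertionError("long mismatch: " + a + ", " + b);
                }
            }
        }
        System.out.println("identity held for all sampled pairs");
    }
}
```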


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5403/head:pull/5403
$ git checkout pull/5403

Update a local copy of the PR:
$ git checkout pull/5403
$ git pull https://git.openjdk.java.net/jdk pull/5403/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5403

View PR using the GUI difftool:
$ git pr show -t 5403

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5403.diff


@bridgekeeper bridgekeeper bot commented Sep 7, 2021

👋 Welcome back zgu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Sep 7, 2021

@openjdk openjdk bot commented Sep 7, 2021

@zhengyu123 The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler label Sep 7, 2021

@mlbridge mlbridge bot commented Sep 7, 2021

Node *in1 = in(1);
Node *in2 = in(2);
if (in1->Opcode() == Op_SubI && in2->Opcode() == Op_SubI) {
  Node* n11 = in1->in(1);
  Node* n21 = in2->in(1);
  if (phase->type(n11)->higher_equal(TypeInt::ZERO) &&
      phase->type(n21)->higher_equal(TypeInt::ZERO)) {
Contributor

@theRealELiu theRealELiu Sep 8, 2021

I was wondering whether it would be a good idea to move this code into MulNode, as it is much the same as the code in MulLNode.

Contributor Author

@zhengyu123 zhengyu123 Sep 8, 2021

I wonder about that too; the same goes for the rest of the MulINode/MulLNode::Ideal() code (and many other places). I am not sure how to work around the different types. Any suggestions?

Contributor

@theRealELiu theRealELiu Sep 9, 2021

Just a dogfood, but it works. https://gist.github.com/theRealELiu/328d62157975b1f20e3626b3ef747eb4

Too much abstraction makes the code hard to read. One needs to check the concrete class to see what the code actually does. E.g., in my patch, add_id() may be TypeInt::ZERO or TypeLong::ZERO, or even TypeD::ZERO. So I'm not sure if it's a good idea. Are there any guidelines on this: should we try to abstract such code, or put readability first? @TobiHartmann

Member

@TobiHartmann TobiHartmann Sep 9, 2021

Yes, I would also prefer to move the optimization into MulNode::Ideal. @theRealELiu's patch is good but can be further improved by modifying the node inputs instead of returning a new node (similar to the other optimizations in MulNode::Ideal).

Member

@TobiHartmann TobiHartmann Sep 9, 2021

Also, Type::is_zero_type can be used to detect 0 and instead of checking the opcodes, Node::is_Sub should be used.
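The review suggestions (match on node kind, detect a zero input, update the inputs of the existing node rather than building a new one) can be sketched on a toy expression graph. This is only an illustration in Java under stated assumptions: Expr, Const, Sub, Mul, isZero and idealize are hypothetical names for this sketch and are not HotSpot's C++ Node classes or API.

```java
// Toy model of an in-place "idealization" step. NOT HotSpot code:
// every type and method here is hypothetical and exists only for this sketch.
final class NegMulRewriteSketch {
    interface Expr {
        long eval();
    }

    static final class Const implements Expr {
        final long value;
        Const(long value) { this.value = value; }
        public long eval() { return value; }
    }

    static final class Sub implements Expr {
        final Expr left;
        final Expr right;
        Sub(Expr left, Expr right) { this.left = left; this.right = right; }
        public long eval() { return left.eval() - right.eval(); }
    }

    static final class Mul implements Expr {
        Expr left;   // mutable so the rewrite can replace inputs in place
        Expr right;
        Mul(Expr left, Expr right) { this.left = left; this.right = right; }
        public long eval() { return left.eval() * right.eval(); }

        // Rewrites (0 - a) * (0 - b) into a * b by updating this node's
        // inputs; returns true if the node was changed.
        boolean idealize() {
            if (left instanceof Sub && right instanceof Sub) {
                Sub l = (Sub) left;
                Sub r = (Sub) right;
                if (isZero(l.left) && isZero(r.left)) {
                    left = l.right;
                    right = r.right;
                    return true;
                }
            }
            return false;
        }
    }

    static boolean isZero(Expr e) {
        return e instanceof Const && ((Const) e).value == 0;
    }

    public static void main(String[] args) {
        Mul mul = new Mul(new Sub(new Const(0), new Const(6)),
                          new Sub(new Const(0), new Const(7)));
        long before = mul.eval();
        boolean changed = mul.idealize();
        System.out.println("changed=" + changed
                + " before=" + before + " after=" + mul.eval());
    }
}
```

Mutating the inputs keeps the identity of the multiply node, so anything already pointing at it sees the simplified form without further bookkeeping.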

Contributor Author

@zhengyu123 zhengyu123 Sep 9, 2021

Nice! Thanks, I will make changes accordingly.

}
}

private static final long[][] longParams = {
Member

@TobiHartmann TobiHartmann Sep 9, 2021

Similar to https://git.openjdk.java.net/jdk/pull/5266, I would prefer random values for better coverage.


/**
 * @test
 * @bug 8270366
Member

@TobiHartmann TobiHartmann Sep 9, 2021

The bug number is incorrect.

for (int index = 0; index < intParams.length; index++) {
    int result = intTest(intParams[index][0], intParams[index][1]);
    for (int i = 0; i < 20_000; i++) {
        if (result != intTest(intParams[index][0], intParams[index][1])) {
Member

@TobiHartmann TobiHartmann Sep 9, 2021

After some warmup iterations, intTest will be C2 compiled and you are then comparing outputs of the same compiled method. I.e., if there's a bug in the C2 optimization, the test might not catch it. What you should do instead is compare the output of the C2-compiled method to the expected value (which is a * b in this case).

You should also prevent inlining of intTest.

The test you added with JDK-8270366 has the same problem.
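The pitfall can be made concrete with a small sketch. It assumes a hypothetical, deliberately wrong method, miscompiled, standing in for a method whose compiled code is consistently incorrect; none of these names come from the PR.

```java
// Sketch of why self-comparison cannot catch a consistent miscompile.
// miscompiled() is a hypothetical stand-in for broken compiled code.
final class SelfComparisonPitfall {
    // Deterministic but wrong: always off by one from a * b.
    static int miscompiled(int a, int b) {
        return a * b + 1;
    }

    // Flawed check: compares the method against its own earlier output,
    // so a consistently wrong result still "passes".
    static boolean selfCheckPasses(int a, int b) {
        int first = miscompiled(a, b);
        for (int i = 0; i < 1_000; i++) {
            if (first != miscompiled(a, b)) {
                return false;
            }
        }
        return true;
    }

    // Sound check: compares against an independently computed a * b.
    static boolean independentCheckPasses(int a, int b) {
        return miscompiled(a, b) == a * b;
    }

    public static void main(String[] args) {
        System.out.println("self check passes: " + selfCheckPasses(3, 4));
        System.out.println("independent check passes: " + independentCheckPasses(3, 4));
    }
}
```

In this sketch the self check reports success while the independent check fails, which is the situation the review comment warns about.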

Node* n21 = in2->in(1);
if (phase->type(n11)->is_zero_type() &&
    phase->type(n21)->is_zero_type()) {
  return make(in1->in(2), in2->in(2));
Member

@TobiHartmann TobiHartmann Sep 9, 2021

Why do you need to create a new node? Can't you simply update the inputs like the code below does?

Contributor Author

@zhengyu123 zhengyu123 Sep 9, 2021

Sorry, I missed your early comment. Fixed.

}

private static void testInt(int a, int b) {
    int expected = (-a) * (-b);
Contributor

@theRealELiu theRealELiu Sep 10, 2021

Are you sure this is the expected value? As the method has been invoked 2000 times, I think it would be compiled by C2.

Contributor Author

@zhengyu123 zhengyu123 Sep 10, 2021

The default CompileThreshold is 10K when tiered compilation is disabled, which is the case here, so there is no risk.

Member

@TobiHartmann TobiHartmann Sep 14, 2021

But why don't you compute expected as a * b?

Contributor Author

@zhengyu123 zhengyu123 Sep 15, 2021

I would prefer to keep it as it is, to match the testxxx() functions. I think it makes clear that the JIT-ed result matches the interpreter's.

private static void testLong(long a, long b) {
    long expected = (-a) * (-b);
    for (int i = 0; i < 20_000; i++) {
        if (expected != test(a, b)) {
            throw new RuntimeException("Incorrect result.");
        }
    }
}
Contributor

@theRealELiu theRealELiu Sep 10, 2021

How about calculating the expected value outside the iteration to keep it from being compiled?

private static void testLong() {
    for (int i = 0; i < 20_000; i++) {
        long a = random.nextLong();
        long b = random.nextLong();
        long expected = (-a) * (-b);
        if (expected != test(a, b)) {
            throw new RuntimeException("Incorrect result.");
        }
    }
}

And just call this method once in main to prevent it from being too hot.

Contributor

@theRealELiu theRealELiu left a comment

LGTM

Contributor Author

@zhengyu123 zhengyu123 commented Sep 13, 2021

LGTM

Thanks, @theRealELiu

Node *in1 = in(1);
Node *in2 = in(2);
if (in1->is_Sub() && in2->is_Sub()) {
  Node* n11 = in1->in(1);
Member

@TobiHartmann TobiHartmann Sep 14, 2021

For consistency with the code below, I would name the local in11 or simply use phase->type(in1->in(1)), because it's the only user.

Contributor Author

@zhengyu123 zhengyu123 Sep 15, 2021

Fixed

import java.util.Random;

public class TestNegMultiply {
    private static Random random = new Random();
Member

@TobiHartmann TobiHartmann Sep 14, 2021

You should use Utils.getRandomInstance() from jdk.test.lib.Utils to ensure that the seed is printed for reproducibility. You can check other tests for an example.
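The point of printing the seed is that a failure seen with random inputs can be replayed exactly. The sketch below illustrates the idea with plain java.util.Random so it runs outside the JDK test tree; it is not the jdk.test.lib.Utils implementation, and the property name example.seed and the helper names are assumptions made up for this example.

```java
import java.util.Random;

// Sketch of seed logging for reproducible random tests. This only mimics
// the idea behind a seed-printing helper; it is not jdk.test.lib code.
final class ReproducibleRandom {
    // Hypothetical property name used only in this sketch.
    static final String SEED_PROPERTY = "example.seed";

    // Uses the seed given on the command line, or picks a fresh one.
    static long chooseSeed() {
        String fixed = System.getProperty(SEED_PROPERTY);
        return fixed != null ? Long.parseLong(fixed) : new Random().nextLong();
    }

    // Prints the seed before handing out the generator, so that a failing
    // run can be repeated with the same inputs.
    static Random newRandom(long seed) {
        System.out.println("To reproduce, rerun with -D" + SEED_PROPERTY + "=" + seed);
        return new Random(seed);
    }

    public static void main(String[] args) {
        Random random = newRandom(chooseSeed());
        int a = random.nextInt();
        int b = random.nextInt();
        if ((-a) * (-b) != a * b) {
            throw new RuntimeException("Incorrect result for " + a + ", " + b);
        }
    }
}
```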

Contributor Author

@zhengyu123 zhengyu123 Sep 15, 2021

Done


/**
 * @test
 * @bug 8273454
Member

@TobiHartmann TobiHartmann Sep 14, 2021

The test needs * @key randomness

Contributor Author

@zhengyu123 zhengyu123 Sep 15, 2021

Done


private static void testInt(int a, int b) {
    int expected = (-a) * (-b);
    for (int i = 0; i < 20_000; i++) {
Member

@TobiHartmann TobiHartmann Sep 14, 2021

Why do you need a second loop in here? It's sufficient to set TEST_COUNT high enough to trigger compilation. I would suggest something like this:

private static int testInt(int a, int b) {
    return (-a) * (-b);
}

private static void runIntTests() {
    for (int i = 0; i < TEST_COUNT; i++) {
        int a = random.nextInt();
        int b = random.nextInt();
        int res = testInt(a, b);
        Asserts.assertEQ(a * b, res);
    }
}

And then run with -XX:CompileCommand=dontinline,TestNegMultiply::test*. No need to disable OnStackReplacement.

Contributor Author

@zhengyu123 zhengyu123 Sep 15, 2021

The inner loop ensures that all tests hit the JIT-ed version. If the transformation is broken, I would prefer the test to fail on the very first iteration instead of somewhere in the middle.

I refactored the code to remove inner loop.

Also, fixed command option.

Member

@TobiHartmann TobiHartmann Sep 16, 2021

You can't control the iteration in which the test would fail if there's a bug in C2 (it could only fail for some random values). Therefore, you could as well use random values for the warmup and simply increase TEST_COUNT to ensure that C2 compilation is triggered and we run a reasonable amount of iterations with C2 compiled code.

Your newest version of the test now has the problem that OSR compilation might C2 compile the computation of the expected value and then you are comparing the output of a C2 compiled method to a C2 compiled method instead of the interpreter. You have the following options:

  • Compute the expected value as a * b. In that case it's fine if the computation is C2 compiled as well.
  • Prevent compilation of the run* methods (either by disabling OSR compilation or by completely disabling compilation of these methods)

Member

@TobiHartmann TobiHartmann Sep 16, 2021

And sorry for being picky here but I would like to keep tests as simple as possible :)

Contributor Author

@zhengyu123 zhengyu123 Sep 16, 2021

Fixed according to your comments.

I really appreciate your suggestions, thanks!

@openjdk openjdk bot removed the rfr label Sep 15, 2021
@openjdk openjdk bot added the rfr label Sep 15, 2021
Member

@TobiHartmann TobiHartmann left a comment

Thanks for making these changes, looks good to me.


@openjdk openjdk bot commented Sep 17, 2021

@zhengyu123 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8273454: C2: Transform (-a)*(-b) into a*b

Reviewed-by: thartmann, eliu, chagedorn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 117 new commits pushed to the master branch:

  • 1890d85: 8273872: ZGC: Explicitly use 2M large pages
  • 54b4567: 8273880: Zero: Print warnings when unsupported intrinsics are enabled
  • e07ab82: 8273408: java.lang.AssertionError: typeSig ERROR on generated class property of record
  • 8c022e2: 8270434: JDI+UT: Unexpected event in JDI tests
  • b982904: 8271073: Improve testing with VM option VerifyArchivedFields
  • bc48a0a: 8273902: Memory leak in OopStorage due to bug in OopHandle::release()
  • 9c5441c: 8271569: Clean up the use of CDS constants and field offsets
  • 12fa707: 8261941: Use ClassLoader for unregistered classes during -Xshare:dump
  • 7e92abe: 8273710: Remove redundant stream() call before forEach in jdk.jdeps
  • 59b2478: 8273659: Replay compilation crashes with SIGSEGV since 8271911
  • ... and 107 more: https://git.openjdk.java.net/jdk/compare/267c61a16a916e35762e8df5737ec74b06defae8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Sep 17, 2021
Member

@chhagedorn chhagedorn left a comment

Looks good!

Contributor Author

@zhengyu123 zhengyu123 commented Sep 18, 2021

@TobiHartmann @chhagedorn Thanks!

Contributor Author

@zhengyu123 zhengyu123 commented Sep 18, 2021

/integrate


@openjdk openjdk bot commented Sep 18, 2021

Going to push as commit 7c9868c.
Since your change was applied there have been 124 commits pushed to the master branch:

  • bb9d142: 8273958: gtest/MetaspaceGtests executes unnecessary tests in debug builds
  • 2a2e919: 8273685: Remove jtreg tag manual=yesno for java/awt/Graphics/LCDTextAndGraphicsState.java & show test instruction
  • 8302061: 8273774: CDSPluginTest should only expect classes_nocoops.jsa exists on supported 64-bit platforms
  • 2f8c221: 8273681: Add Vector API vs Arrays.mismatch intrinsic benchmark
  • 17f7a45: 8273913: Problem list some headful client jtreg tests that fail on macOS 12
  • 27d747a: 8273877: os::unsetenv unused
  • 35f6f1d: 8273808: Cleanup AddFontsToX11FontPath
  • 1890d85: 8273872: ZGC: Explicitly use 2M large pages
  • 54b4567: 8273880: Zero: Print warnings when unsupported intrinsics are enabled
  • e07ab82: 8273408: java.lang.AssertionError: typeSig ERROR on generated class property of record
  • ... and 114 more: https://git.openjdk.java.net/jdk/compare/267c61a16a916e35762e8df5737ec74b06defae8...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Sep 18, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Sep 18, 2021

@openjdk openjdk bot commented Sep 18, 2021

@zhengyu123 Pushed as commit 7c9868c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler integrated
4 participants