Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8286125: C2: "bad AD file" with PopulateIndex on x86_64 #8587

Closed
wants to merge 3 commits into from

Conversation

pfustc
Copy link
Member

@pfustc pfustc commented May 7, 2022

A fuzzer test reports an assertion failure issue with PopulateIndexNode
on x86_64. It can be reproduced by the new jtreg case inside this patch.
Root cause is that C2 superword creates a PopulateIndexNode by mistake
while vectorizing below loop.

    for (int i = 304; i > 15; i -= 3) {
        int c = 16;
        do {
            for (int t = 1; t < 1; t++) {}
            arr[c + 1] >>= i;
        } while (--c > 0);
    }

This is a corner loop case with redundant code inside. After several C2
optimizations, the do-while loop inside is unrolled and then isomorphic
right shift statements can be combined in the superword optimization.
Since all shift counts are the same loop IV value i, superword should
generate a RShiftCntVNode to create a vector of scalar replications of
the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
mistake because of the opd == iv() condition.

To fix this, we add a have_same_inputs condition here checking if all
inputs at position opd_idx of nodes in the pack are the same. If true,
C2 code should NOT run into this block to generate a PopulateIndexNode.
Instead, it should run into the next block for scalar replications.

Additionally, only adding this condition here is still not good enough
because it breaks the experimental post loop vectorization. As in post
loops, all packs are singleton, i.e., have_same_inputs is always true.
Hence, we also add a pack size check here to make post loop logic run
into this block. It's safe to let it go because post loop never needs
scalar replications of the loop IV - it never combines nodes in packs.

We also add two more assertions in the code.

Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
are tested and no issue is found.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 reviewer)

Issue

  • JDK-8286125: C2: "bad AD file" with PopulateIndex on x86_64

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/8587/head:pull/8587
$ git checkout pull/8587

Update a local copy of the PR:
$ git checkout pull/8587
$ git pull https://git.openjdk.java.net/jdk pull/8587/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8587

View PR using the GUI difftool:
$ git pr show -t 8587

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/8587.diff

A fuzzer test reports an assertion failure issue with PopulateIndexNode
on x86_64. It can be reproduced by the new jtreg case inside this patch.
Root cause is that C2 superword creates a PopulateIndexNode by mistake
while vectorizing below loop.

    for (int i = 304; i > 15; i -= 3) {
        int c = 16;
        do {
            for (int t = 1; t < 1; t++) {}
            arr[c + 1] >>= i;
        } while (--c > 0);
    }

This is a corner loop case with redundant code inside. After several C2
optimizations, the do-while loop inside is unrolled and then isomorphic
right shift statements can be combined in the superword optimization.
Since all shift counts are the same loop IV value `i`, superword should
generate a RShiftCntVNode to create a vector of scalar replications of
the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
mistake because of the `opd == iv()` condition.

To fix this, we add a `have_same_inputs` condition here checking if all
inputs at position `opd_idx` of nodes in the pack are the same. If true,
C2 code should NOT run into this block to generate a PopulateIndexNode.
Instead, it should run into the next block for scalar replications.

Additionally, only adding this condition here is still not good enough
because it breaks the experimental post loop vectorization. As in post
loops, all packs are singleton, i.e., `have_same_inputs` is always true.
Hence, we also add a pack size check here to make post loop logic run
into this block. It's safe to let it go because post loop never needs
scalar replications of the loop IV - it never combines nodes in packs.

We also add two more assertions in the code.

Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
are tested and no issue is found.
@bridgekeeper
Copy link

bridgekeeper bot commented May 7, 2022

👋 Welcome back pli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 7, 2022
@openjdk
Copy link

openjdk bot commented May 7, 2022

@pfustc The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 7, 2022
@mlbridge
Copy link

mlbridge bot commented May 7, 2022

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable. Let me test it.

@vnkozlov
Copy link
Contributor

vnkozlov commented May 9, 2022

Meanwhile, please, update it to latest JDK.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My tier1-4 testing passed clean.

@openjdk
Copy link

openjdk bot commented May 9, 2022

@pfustc This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8286125: C2: "bad AD file" with PopulateIndex on x86_64

Reviewed-by: kvn, thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the master branch:

  • 3fa1c40: 8286298: Remove unused methods in sun.invoke.util.VerifyType
  • 3462190: 8286163: micro-optimize Instant.plusSeconds
  • 60a91d1: 8286285: G1: Rank issues with ParGCRareEvent_lock and Threads_lock
  • d478958: 8286179: Node::find(int) should not traverse from new to old nodes
  • de8f4d0: 8286191: misc tests fail due to JDK-8285987
  • bf0dc4f: 8286367: riscv: riscv port is broken after JDK-8284161
  • 4fd79a6: 8285730: unify _WIN32_WINNT settings
  • bd6026c: 7124282: [macosx] Can't see table cell highlighter when the highlight border is the same color as the cell.
  • 9a3cb93: 8030121: java/awt/dnd/MissingDragExitEventTest/MissingDragExitEventTest.java fails
  • ace4230: 8261650: Add a comment with details for MTLVC_MAX_INDEX
  • ... and 1 more: https://git.openjdk.java.net/jdk/compare/61450bb061ecda9700ddbd387a1f0659ebd1cced...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 9, 2022
@pfustc
Copy link
Member Author

pfustc commented May 10, 2022

Meanwhile, please, update it to latest JDK.

Thanks @vnkozlov for looking at this. Patch is rebased to JDK master.

@vnkozlov
Copy link
Contributor

You need second review. It is not trivial.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in post loops, all packs are singleton, i.e., have_same_inputs is always true.
Hence, we also add a pack size check here to make post loop logic run
into this block. It's safe to let it go because post loop never needs
scalar replications of the loop IV - it never combines nodes in packs.

I don't understand why we need to execute that code for post loops. Could you elaborate why the p->size() == 1 check is needed?

@@ -0,0 +1,56 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@pfustc
Copy link
Member Author

pfustc commented May 10, 2022

Hi @TobiHartmann ,

Thanks for your review.

I don't understand why we need to execute that code for post loops. Could you elaborate why the p->size() == 1 check is needed?

In post loop vectorization, superword doesn't combine multiple scalar nodes to one vector node. Instead, each scalar node is transformed to a vector node with vector mask. If a loop body statement has the induction variable involved, such as

for (int i = 0; i < length; i++) {
    a[i] = i;
}

then superword should run into this if (opd == iv() && ...) {...} block to create an index vector for vectorizing post loop. But packs in post loops are special (singletons) - each has only one scalar node inside. So same_inputs(...) always returns true. If we just add the have_same_inputs condition, post loop vectorization will generate incorrect result because index vectors cannot be created. Hence we add another p->size() == 1 check here.

@TobiHartmann
Copy link
Member

Thanks for the clarification. Makes sense.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the newline from the beginning of the test before integrating.

@pfustc
Copy link
Member Author

pfustc commented May 10, 2022

Please remove the newline from the beginning of the test before integrating.

Thanks for reminding me. I've done that.

@pfustc
Copy link
Member Author

pfustc commented May 10, 2022

/integrate

@openjdk
Copy link

openjdk bot commented May 10, 2022

Going to push as commit 1ca5404.
Since your change was applied there have been 14 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 10, 2022
@openjdk openjdk bot closed this May 10, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 10, 2022
@openjdk
Copy link

openjdk bot commented May 10, 2022

@pfustc Pushed as commit 1ca5404.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@pfustc pfustc deleted the slpfix branch May 10, 2022 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
3 participants