8280696: C2 compilation hits assert(is_dominator(c, n_ctrl)) failed #8770

Closed
TobiHartmann wants to merge 2 commits

@TobiHartmann
Member

TobiHartmann commented May 18, 2022

We hit an assert when computing early control via PhaseIdealLoop::compute_early_ctrl for 1726 AddP because the control of one of its inputs, 1478 Region, does not dominate the AddP's current control, 1350 Loop:
[Screenshot from 2022-05-18 15-10-24]
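
For readers unfamiliar with the invariant, here is a minimal sketch of what the failing assert expresses, using a toy dominator tree rather than HotSpot's data structures. DomTree, add and the integer node ids are hypothetical; only the names is_dominator and compute_early_ctrl are borrowed from the description, and their bodies are simplified assumptions, not the C2 implementation.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree (not HotSpot code). idom[n] is the immediate dominator
// of control node n; the root is its own immediate dominator.
struct DomTree {
  std::vector<int> idom;
  std::vector<int> depth;  // depth in the dominator tree, root = 0

  // Adds a control node below `parent` (pass -1 to create the root).
  // Returns the id of the new node.
  int add(int parent) {
    int id = static_cast<int>(idom.size());
    idom.push_back(parent < 0 ? id : parent);
    depth.push_back(parent < 0 ? 0 : depth[parent] + 1);
    return id;
  }

  // True if control node d dominates control node n.
  // Every node dominates itself.
  bool is_dominator(int d, int n) const {
    while (depth[n] > depth[d]) {
      n = idom[n];  // walk up until n is no deeper than d
    }
    return n == d;
  }

  // Simplified "early control" of a data node: the deepest control among
  // the controls of its inputs. Assumes at least one input.
  int compute_early_ctrl(const std::vector<int>& input_ctrls) const {
    assert(!input_ctrls.empty());
    int early = input_ctrls[0];
    for (int c : input_ctrls) {
      if (depth[c] > depth[early]) {
        early = c;
      }
    }
    return early;
  }
};
```

In this model, a data node whose current control is n_ctrl is placed consistently only if is_dominator(compute_early_ctrl(inputs), n_ctrl) holds. If one input's control lies below or beside n_ctrl in the tree, the check fails, which is the situation the assert reports.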

I.e., the current control of the AddP is incorrect. The problem is that the code in PhaseIdealLoop::has_local_phi_input that special-cases AddP nodes only checks the control of the Address and Offset inputs, assuming that the control of the Base input is consistent.

// We allow the special case of AddP's with no local inputs.
// This allows us to split-up address expressions.
if (m->is_AddP() &&
    get_ctrl(m->in(2)) != n_ctrl &&
    get_ctrl(m->in(3)) != n_ctrl) {

This is not guaranteed, though, so the AddP can end up with a control that is not dominated by the control of its Base input.

As described below, this only reproduces with a very specific sequence of optimizations triggered by replay compilation with -XX:+StressIGVN and a fixed seed. I was not able to extract a regression test.

The fix is to also check the control of the Base input when moving the AddP up to a dominating point. For testing purposes, I added an assert(get_ctrl(m->in(1)) != n_ctrl, "sanity") without the fix to verify that this change does not affect common cases. It triggers in the failing case but not for any test in tiers 1-5. In addition, I slightly refactored the code of PhaseIdealLoop::compute_early_ctrl and added comments.
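
As a rough illustration of the difference between the two checks, here is a small self-contained model. ToyAddP and both function names are hypothetical, and controls are plain integers; the input slots follow the quoted condition (1 = Base, 2 = Address, 3 = Offset). This is a sketch of the idea, not the patch itself.

```cpp
#include <array>
#include <cassert>

// Toy stand-in for an AddP (hypothetical): the control assigned to each
// data input. Slot 0 is unused; 1 = Base, 2 = Address, 3 = Offset.
struct ToyAddP {
  std::array<int, 4> input_ctrl;
};

// The special case as quoted: only Address and Offset are inspected.
bool no_local_inputs_before(const ToyAddP& m, int n_ctrl) {
  return m.input_ctrl[2] != n_ctrl &&
         m.input_ctrl[3] != n_ctrl;
}

// The special case with the Base input also checked, as the fix describes.
bool no_local_inputs_after(const ToyAddP& m, int n_ctrl) {
  return m.input_ctrl[1] != n_ctrl &&
         m.input_ctrl[2] != n_ctrl &&
         m.input_ctrl[3] != n_ctrl;
}
```

With a ToyAddP whose Base control equals n_ctrl while Address and Offset do not, no_local_inputs_before returns true (the AddP would be treated as having no local inputs) and no_local_inputs_after returns false.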

Gory details below.

Relevant graph after parsing:

 197  CastPP  ===  1460  60  [[ ... 1724  1725 ]]  #java/io/BufferedReader:NotNull *
 1459  CastPP  ===  1460  60  [[ ...  1724  1725 ]]  #java/io/BufferedReader:NotNull *
 1725  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1724  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1726  AddP  === _  1724  1725  41  [[ ... ]]   Oop:java/io/BufferedReader:NotNull+24 *

1724 Phi is then processed by the following code in PhiNode::Ideal, which replaces its inputs with 1730 CastPP, a new cast of the unique uncasted input:

if (uncasted) {
  // Add cast nodes between the phi to be removed and its unique input.
  // Wait until after parsing for the type information to propagate from the casts.
  assert(can_reshape, "Invalid during parsing");

  197  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1459  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1730  CastPP  ===  1478  60  [[ ...  1724  1724 ]]  #java/io/BufferedReader:NotNull * strong dependency
 1725  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1724  Phi  ===  1478  1730  1730  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1726  AddP  === _  1724  1725  41  [[ ... ]]   Oop:java/io/BufferedReader:NotNull+24 *
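
The "unique input once casts are stripped" idea behind this step can be sketched as follows. ToyNode, uncast and unique_uncasted_input are hypothetical names in a toy model where a cast has exactly one input; the real PhiNode::Ideal logic handles more cases and is not reproduced here.

```cpp
#include <cassert>
#include <vector>

// Toy node (hypothetical): either a cast wrapping one input, or a plain value.
struct ToyNode {
  bool is_cast;          // true for a CastPP-like node
  const ToyNode* input;  // the casted value, or nullptr for a plain value
};

// Strips any chain of casts and returns the underlying value.
const ToyNode* uncast(const ToyNode* n) {
  while (n != nullptr && n->is_cast) {
    n = n->input;
  }
  return n;
}

// Returns the single value that all phi inputs reduce to once casts are
// stripped, or nullptr if the inputs reduce to different values.
const ToyNode* unique_uncasted_input(
    const std::vector<const ToyNode*>& phi_inputs) {
  assert(!phi_inputs.empty());
  const ToyNode* unique = nullptr;
  for (const ToyNode* in : phi_inputs) {
    const ToyNode* u = uncast(in);
    if (u == nullptr) {
      continue;  // ignore missing inputs in this toy model
    }
    if (unique == nullptr) {
      unique = u;
    } else if (unique != u) {
      return nullptr;  // two different underlying values: phi is needed
    }
  }
  return unique;
}
```

In the graph above, both phi inputs (1459 CastPP and 197 CastPP) are casts of node 60, so a function like this would report 60 as the unique uncasted input, and the phi can be replaced by a single new cast of it.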

Then 1724 Phi is replaced by the unique input 1730 CastPP:

  197  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1459  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1725  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1730  CastPP  ===  1478  60  [[ ...  1726 ]]  #java/io/BufferedReader:NotNull * strong dependency
 1726  AddP  === _  1730  1725  41  [[ ... ]]   Oop:java/io/BufferedReader:NotNull+24 *

Now 1459 CastPP is replaced by identical 197 CastPP:

 197  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1725  Phi  ===  1478  197  197  [[ 1726  1739 ]]  #java/io/BufferedReader:NotNull *
 1730  CastPP  ===  1478  60  [[ ...  1726 ]]  #java/io/BufferedReader:NotNull * strong dependency
 1726  AddP  === _  1730  1725  41  [[ ... ]]   Oop:java/io/BufferedReader:NotNull+24 *

Finally, 1725 Phi is replaced by its unique input 197 CastPP, and the AddP ends up with two casts of the same oop, with different control, as its Base and Address inputs:

 197  CastPP  ===  1460  60  [[ ...  1725 ]]  #java/io/BufferedReader:NotNull *
 1730  CastPP  ===  1478  60  [[ ...  1726 ]]  #java/io/BufferedReader:NotNull * strong dependency
 1726  AddP  === _  1730  197  41  [[ ... ]]   Oop:java/io/BufferedReader:NotNull+24 *

Looking at the above transformation, the root cause is really the 1730 CastPP added by PhiNode::Ideal, which is not needed and prevents the two casts from being merged. Is it worth filing a follow-up enhancement to fix this?
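
To illustrate why casts with different control are not merged, here is a toy value-numbering table. It assumes that two nodes are considered identical only if opcode, control input and data input all match; ToyGvn and find_or_insert are hypothetical names, and the real C2 hashing also considers further node properties.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Toy value-numbering table (hypothetical). A node is identified by its
// opcode, its control input and a single data input.
class ToyGvn {
 public:
  using Key = std::tuple<std::string, int, int>;

  // Returns the id of an already registered identical node if there is one;
  // otherwise registers `id` under this key and returns it.
  int find_or_insert(const std::string& opcode, int ctrl, int input, int id) {
    assert(id >= 0);  // node ids are non-negative in this toy model
    auto result = table_.emplace(Key{opcode, ctrl, input}, id);
    return result.first->second;
  }

 private:
  std::map<Key, int> table_;
};
```

Registering CastPP(ctrl 1460, input 60) as 197 and then looking up the same key for 1459 returns 197, matching the step where 1459 is replaced by 197. A CastPP of the same input under control 1478 has a different key, so 1730 stays a separate node.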

Thanks,
Tobias


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 reviewer)

Issue

  • JDK-8280696: C2 compilation hits assert(is_dominator(c, n_ctrl)) failed

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/8770/head:pull/8770
$ git checkout pull/8770

Update a local copy of the PR:
$ git checkout pull/8770
$ git pull https://git.openjdk.java.net/jdk pull/8770/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8770

View PR using the GUI difftool:
$ git pr show -t 8770

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/8770.diff

@bridgekeeper

bridgekeeper bot commented May 18, 2022

👋 Welcome back thartmann! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 18, 2022
@openjdk

openjdk bot commented May 18, 2022

@TobiHartmann The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 18, 2022
@mlbridge

mlbridge bot commented May 18, 2022

Webrevs

@vnkozlov
Contributor

The question is why we have separate similar Phi for Base and Address?:

 1725  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *
 1724  Phi  ===  1478  1459  197  [[ 1726 ]]  #java/io/BufferedReader:NotNull *

Contributor

@vnkozlov vnkozlov left a comment

I am fine with adding the check for Base's control. But I would like additional investigation into why we have similar Phi nodes for Base and Address, and into your suggestion about CastPP, both in separate RFEs.

@openjdk

openjdk bot commented May 18, 2022

@TobiHartmann This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8280696: C2 compilation hits assert(is_dominator(c, n_ctrl)) failed

Reviewed-by: kvn, chagedorn, roland

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 23 new commits pushed to the master branch:

  • af7cda5: 8285733: [s390] Vector Instruction Emitters for element-wise access are broken
  • d24c84e: 8286873: Improve websocket test execution time
  • db19dc6: 8284688: Minor cleanup could be done in java.security.jgss
  • 6e38666: 8286944: Loom: Common ContinuationEntry cookie handling
  • 47500b2: 8286897: Loom: Cleanup x86_64 StubGenerator
  • fc3edf2: 8285687: Remove jtreg tag manual=yesno for java/awt/print/PrinterJob/PageRangesDlgTest.java
  • 022e717: 8286462: Incorrect copyright year for src/java.base/share/classes/jdk/internal/vm/FillerObject.java
  • dbda0e2: 8286969: Add a new test library API to execute kinit in SecurityTools.java
  • 26c7c92: 8286694: Incorrect argument processing in java launcher
  • 2a2d54e: 8286984: (ch) Problem list java/nio/channels/FileChannel/LargeMapTest.java on Windows
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/69ff86a32088d9664e5e0dae12edddc0643e3fd3...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 18, 2022
@TobiHartmann
Member Author

Thanks for the review, Vladimir!

> The question is why we have separate similar Phi for Base and Address?

That happens here:

if (base == NULL) {
  base = new PhiNode(in(0), base_type, NULL);
  for (uint i = 1; i < req(); i++) {
    base->init_req(i, in(i)->in(AddPNode::Base));
  }
  phase->is_IterGVN()->register_new_node_with_optimizer(base);
}
if (address == NULL) {
  address = new PhiNode(in(0), address_type, NULL);
  for (uint i = 1; i < req(); i++) {
    address->init_req(i, in(i)->in(AddPNode::Address));
  }
  phase->is_IterGVN()->register_new_node_with_optimizer(address);
}
if (offset == NULL) {
  offset = new PhiNode(in(0), TypeX_X, NULL);
  for (uint i = 1; i < req(); i++) {
    offset->init_req(i, in(i)->in(AddPNode::Offset));
  }
  phase->is_IterGVN()->register_new_node_with_optimizer(offset);
}
return new AddPNode(base, address, offset);

@TobiHartmann
Member Author

That code was introduced by JDK-8231291. Maybe @rwestrel can comment on why it's necessary to create individual Phis for base and address.

@vnkozlov
Contributor

> That code was introduced by JDK-8231291. Maybe @rwestrel can comment on why it's necessary to create individual Phis for base and address.

Bug!!!:
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/cfgnode.cpp#L2168
should use AddPNode::Base:

        if (in(i)->in(AddPNode::Base) != base) {
          base = NULL;
        }
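
The intent of that comparison, as I read it, is "keep a shared input only if every AddP feeding the Phi carries the same node in that slot". A minimal sketch under that assumption follows; AddPInputs, Slot, kNoCommon and common_input are hypothetical names, node ids are plain integers, and this is not the cfgnode.cpp code.

```cpp
#include <cassert>
#include <vector>

// Toy AddP (hypothetical): node ids of its Base, Address and Offset inputs.
struct AddPInputs {
  int base;
  int address;
  int offset;
};

enum class Slot { Base, Address, Offset };

constexpr int kNoCommon = -1;  // marker: the inputs differ in this slot

// Reads the input stored in the requested slot.
int input_of(const AddPInputs& a, Slot s) {
  switch (s) {
    case Slot::Base:    return a.base;
    case Slot::Address: return a.address;
    default:            return a.offset;
  }
}

// Returns the node that every AddP carries in slot s, or kNoCommon if they
// differ (in which case a separate Phi would be needed for that slot).
int common_input(const std::vector<AddPInputs>& addps, Slot s) {
  assert(!addps.empty());
  int common = input_of(addps[0], s);
  for (const AddPInputs& a : addps) {
    // Compare like with like: the same slot on both sides.
    if (input_of(a, s) != common) {
      return kNoCommon;
    }
  }
  return common;
}
```

Comparing one slot of an input against the candidate for a different slot makes the check fail even when all Base inputs agree, which would lead to a Phi being created where the shared node could have been reused.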

@TobiHartmann
Member Author

Oh, good catch! I fixed that as well (the original issue still reproduces but requires a different seed).

Member

@chhagedorn chhagedorn left a comment

Nice analysis and good catch by Vladimir! Looks good to me, too.

Contributor

@rwestrel rwestrel left a comment

Looks good to me.

@TobiHartmann
Member Author

Christian, Roland, thanks for the reviews!

As Vladimir requested, I filed a follow-up RFE (JDK-8287009) for the useless CastPPs.

I think the code creating two Phis for the Base and Address inputs is fine because the two inputs can differ, but I'll leave it to @rwestrel to comment on that.

Contributor

@vnkozlov vnkozlov left a comment

Update looks good.

@TobiHartmann
Member Author

Thanks, Vladimir!

@TobiHartmann
Member Author

/integrate

@openjdk

openjdk bot commented May 19, 2022

Going to push as commit fa1b56e.
Since your change was applied there have been 23 commits pushed to the master branch: https://git.openjdk.java.net/jdk/compare/69ff86a32088d9664e5e0dae12edddc0643e3fd3...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 19, 2022
@openjdk openjdk bot closed this May 19, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 19, 2022
@openjdk

openjdk bot commented May 19, 2022

@TobiHartmann Pushed as commit fa1b56e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
4 participants