Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8253644: C2: assert(skeleton_predicate_has_opaque(iff)) failed: unexpected #1448

Closed
wants to merge 4 commits into from

Conversation

@chhagedorn
Copy link
Member

@chhagedorn chhagedorn commented Nov 26, 2020

test1() fails while creating pre/main/post loops when copying skeleton predicates to the main loop. The problem is that we find phi nodes when following a skeleton Opaque4 bool node, when trying to find the OpaqueLoopInit and OpaqueLoopStride nodes. This is unexpected and lets the assertion fail. This happens due to the following reason:

An inner loop of a nested loop is first unswitched and the skeleton predicates are copied to the slow and fast loop by just creating a new If node that share the same Opaque4 node:

ProjNode* fast_proj = create_new_if_for_predicate(iffast, NULL, reason, iff->Opcode(), predicate_proj->is_IfTrue());
ProjNode* slow_proj = create_new_if_for_predicate(ifslow, NULL, reason, iff->Opcode(), predicate_proj->is_IfTrue());
// Replace bool input by input from original predicate
_igvn.replace_input_of(fast_proj->in(0), 1, iff->in(1));
_igvn.replace_input_of(slow_proj->in(0), 1, iff->in(1));

The loop tree building algorithm recognizes both loops as children of the parent loop:

Loop: N0/N0  has_sfpt
  Loop: N338/N314  limit_check profile_predicated predicated counted [0,100),+1 (2 iters)  has_sfpt
     Loop: N459/N458  profile_predicated predicated counted [0,10000),+1 (5271 iters)  has_sfpt (slow loop)
     Loop: N343/N267  profile_predicated predicated counted [0,10000),+1 (5271 iters)  has_sfpt (fast loop)

Some additional optimizations are then applied such that the fast loop no longer has any backedge/path to the parent loop while the slow loop still has. As a result, the loop tree building algorithm only recognizes the slow loop as child while the fast loop is not. The fast loop is treated as a separate loop on the same level as the parent loop:

Loop: N0/N0  has_sfpt
  Loop: N338/N314  limit_check profile_predicated predicated counted [0,100),+1 (2 iters)  has_sfpt
    [N459, but the loop could actually be removed in the meantime but the skeleton predicates are still there]
  Loop: N343/N267  profile_predicated predicated counted [0,10000),+1 (5274 iters)  has_sfpt

Now the original parent loop (N338) gets peeled. The fast and the slow loop still both share skeleton Opaque4 bool nodes with all their inputs nodes up to and including the OpaqueLoopInit/Stride nodes. Let's look at one of the skeleton If nodes for the fast loop that uses such a Opaque4 node. The skeleton If is no longer part of the original parent loop and is therefore not peeled. But now we need some phi nodes to select the correct nodes either from the peeled iteration or from N338 for this skeleton If of the fast loop. This is done in PhaseIdealLoop::clone_iff() which creates a new Opaque4 node together with new Bool and Cmp nodes and then inserts some phi nodes to do the selection.

When afterwards creating pre/main/post loops for the fast loop (N343) that is no child anymore, we find these phi nodes on the path to the OpaqueLoopInit/Stride nodes which lets the assertion fail. A more detailed explanation why this happens can be find at test1() in TestUnswitchCloneSkeletonPredicates.

I propose to copy the skeleton predicates to the unswitched loops in the same way as we copy the skeleton predicates to the main loop by cloning all the nodes on the path to the OpaqueLoopInit/Stride nodes with a small adaptation: We should also copy the OpaqueLoopInit/Stride nodes and we should keep the uncommon traps because we only want to replace them by Halt nodes once we create pre/main/post loops.

Thanks,
Christian


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8253644: C2: assert(skeleton_predicate_has_opaque(iff)) failed: unexpected

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1448/head:pull/1448
$ git checkout pull/1448

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Nov 26, 2020

👋 Welcome back chagedorn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Nov 26, 2020

@chhagedorn The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr label Nov 26, 2020
@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 26, 2020

Webrevs

@rwestrel
Copy link
Contributor

@rwestrel rwestrel commented Nov 26, 2020

Some additional optimizations are then applied such that the fast loop no longer has any backedge/path to the parent loop while the slow loop still has.

Does one round of loop opts stop here...
and another starts below?

As a result, the loop tree building algorithm only recognizes the slow loop as child while the fast loop is not. The fast loop is treated as a separate loop on the same level as the parent loop:

[N459, but the loop could actually be removed in the meantime but the skeleton predicates are still there]

I don't understand that part. Is N459 no longer a loop?

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Nov 27, 2020

Some additional optimizations are then applied such that the fast loop no longer has any backedge/path to the parent loop while the slow loop still has.

Does one round of loop opts stop here...
and another starts below?

As a result, the loop tree building algorithm only recognizes the slow loop as child while the fast loop is not. The fast loop is treated as a separate loop on the same level as the parent loop:

[N459, but the loop could actually be removed in the meantime but the skeleton predicates are still there]

I don't understand that part. Is N459 no longer a loop?

Yes, exactly. So, we unswitch in one round A. Then we have the first loop tree shown with the slow and fast loop at the start of the next round B. In this round B, we apply Split If that is then able to remove the IfNode (2). We always take the true projection ( x == 100 is true) which means that we immediately exit the slow loop N459 on the first iteration. The CountedLoopNode is therefore removed. The second loop tree shown is at the start of the next round C where we then finally apply the peeling and pre/main/post steps which lets the assertion fail. This description of N459 is indeed a bit misleading.

@rwestrel
Copy link
Contributor

@rwestrel rwestrel commented Nov 27, 2020

Yes, exactly. So, we unswitch in one round A. Then we have the first loop tree shown with the slow and fast loop at the start of the next round B. In this round B, we apply Split If that is then able to remove the IfNode (2). We always take the true projection ( x == 100 is true) which means that we immediately exit the fast loop N459 on the first iteration. The CountedLoopNode is therefore removed. The second loop tree shown is at the start of the next round C where we then finally apply the peeling and pre/main/post steps which lets the assertion fail. This description of N459 is indeed a bit misleading. When afterwards talking about the fast loop I just mean where the fast loop was before to keep things simpler.

Thanks for the clarification.
On round C, N459 is not part of the loop tree anymore then?
If that's case shouldn't we remove useless skeleton predicates because then don't guard a loop anymore (PhaseIdealLoop::eliminate_useless_predicates())?

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Nov 27, 2020

Yes, exactly. So, we unswitch in one round A. Then we have the first loop tree shown with the slow and fast loop at the start of the next round B. In this round B, we apply Split If that is then able to remove the IfNode (2). We always take the true projection ( x == 100 is true) which means that we immediately exit the fast loop N459 on the first iteration. The CountedLoopNode is therefore removed. The second loop tree shown is at the start of the next round C where we then finally apply the peeling and pre/main/post steps which lets the assertion fail. This description of N459 is indeed a bit misleading. When afterwards talking about the fast loop I just mean where the fast loop was before to keep things simpler.

Thanks for the clarification.
On round C, N459 is not part of the loop tree anymore then?
If that's case shouldn't we remove useless skeleton predicates because then don't guard a loop anymore (PhaseIdealLoop::eliminate_useless_predicates())?

I had a look at that method. Apparently, we are only removing Opaque1 predicates but no Opaque4 skeleton predicates:

for (int i = C->predicate_count(); i > 0; i--) {
Node * n = C->predicate_opaque1_node(i-1);
assert(n->Opcode() == Op_Opaque1, "must be");
if (!useful_predicates.member(n)) { // not in the useful list
_igvn.replace_node(n, n->in(1));
}
}

Could that be improved? But as of the current implementation, the Opaque4 node with its inputs is shared between a fast and a slow loop. So we could not remove the Opaque4 node when either the fast or slow loop does not need it anymore while the other one does.

@openjdk
Copy link

@openjdk openjdk bot commented Nov 30, 2020

@chhagedorn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8253644: C2: assert(skeleton_predicate_has_opaque(iff)) failed: unexpected

Reviewed-by: roland, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 128 new commits pushed to the master branch:

  • 6937d9f: 8257799: Update JLS cross-references in java.compiler
  • a5297bd: 8254939: macOS: unused function 'replicate4_imm'
  • 36c0600: 8257805: Add compiler/blackhole tests to tier1
  • 395b6bd: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB
  • a265c20: 8255619: Localized WinResources.properties have MsiInstallerStrings_en.wxl resource
  • e3793e5: 8257514: Fix the issues in jdk.jpackage identified by SpotBugs
  • bbc44f5: 8257186: Size of heap segments is not computed correctlyFix overflow in size computation for heap segments
  • b4b9828: 8254784: javac should reject records with @SafeVarargs applied to varargs record component
  • dcf63f8: 8257788: Class fields could be local in the SunJSSE provider
  • d29c78d: 8257679: Improved unix compatibility layer in Windows build (winenv)
  • ... and 118 more: https://git.openjdk.java.net/jdk/compare/3d460bd29566b7b6cdfcc463d5dc4b1354b07258...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Nov 30, 2020
@rwestrel
Copy link
Contributor

@rwestrel rwestrel commented Nov 30, 2020

Could that be improved? But as of the current implementation, the Opaque4 node with its inputs is shared between a fast and a slow loop. So we could not remove the Opaque4 node when either the fast or slow loop does not need it anymore while the other one does.

Right. Yes, I think it would make sense to remove the skeleton predicates that become useless. Anyway, your fix looks good to me.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Good.

I suggest to merge latest source and re-test again. I am concern that some Git testing failed.

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Dec 1, 2020

Could that be improved? But as of the current implementation, the Opaque4 node with its inputs is shared between a fast and a slow loop. So we could not remove the Opaque4 node when either the fast or slow loop does not need it anymore while the other one does.

Right. Yes, I think it would make sense to remove the skeleton predicates that become useless. Anyway, your fix looks good to me.

Thanks for your careful review! I filed JDK-8257498 to also eliminate useless skeleton predicates.

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Dec 1, 2020

Good.

I suggest to merge latest source and re-test again. I am concern that some Git testing failed.

Thanks for your review! Good idea, I will do that and start some more testing.

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Dec 7, 2020

Extended testing with tier1-7 and hs-precheckin-comp looked good.

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Dec 8, 2020

@vnkozlov @rwestrel Thanks for your reviews!

@chhagedorn
Copy link
Member Author

@chhagedorn chhagedorn commented Dec 8, 2020

/integrate

@openjdk openjdk bot closed this Dec 8, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Dec 8, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Dec 8, 2020

@chhagedorn Since your change was applied there have been 138 commits pushed to the master branch:

  • 51ac376: 8256411: Based anonymous classes have a weird end position
  • 0b6b6eb: 8257813: [redo] C2: Filter type in PhiNode::Value() for induction variables of trip-counted integer loops
  • 500ab45: 8257769: Cipher.getParameters() throws NPE for ChaCha20-Poly1305
  • 6ff18e3: 8257855: Example SafeVarargsNotApplicableToRecordAccessors breaks test tools/javac/diags/CheckExamples.java
  • cef606f: 8253762: JFR: getField(String) should be able to access subfields
  • 39b8a2e: 8257670: sun/security/ssl/SSLSocketImpl/SSLSocketLeak.java reports leaks
  • c43c224: 8257796: [TESTBUG] TestUseSHA512IntrinsicsOptionOnSupportedCPU.java fails on x86_32
  • 62c7788: 8257211: C2: Enable call devirtualization during post-parse phase
  • 149a02f: 8257572: Deprecate the archaic signal-chaining interfaces: sigset and signal
  • f92745d: 8257718: LogCompilation: late_inline doesnt work right for JDK 8 logs
  • ... and 128 more: https://git.openjdk.java.net/jdk/compare/3d460bd29566b7b6cdfcc463d5dc4b1354b07258...master

Your commit was automatically rebased without conflicts.

Pushed as commit 1d0adbb.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@chhagedorn chhagedorn deleted the JDK-8253644 branch Dec 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants