Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8335747: C2: fix overflow case for LoopLimit with constant inputs #23024

Closed

Conversation

eme64
Copy link
Contributor

@eme64 eme64 commented Jan 10, 2025

LoopLimitNode::Value tries to constant-fold when it has constant inputs. However, there can be an overflow in the int-computation, but we check for it with if (final_con == (jlong)final_int) { and do not constant fold in that case.

However, there was an assert that checked that such an overflow would never be encountered. We already had to make an exception for this assert during PhaseCCP with JDK-8309266.

Why did we not hit this assert before?
LoopLimitNode needs to have constant inputs. We used to assume that if the constants would lead to an overflow, then the loop-limit-check would also get similar constants, and detect that limit <= max_int-stride does not hold, and it would constant-fold away the loop, together with the LoopLimitNode.

But now we found a second case:

// The final value should be in integer range in almost all cases,
// since the loop is counted and the limit was checked for overflow.
// There some exceptions, for example:
// - During CCP, there might be a temporary overflow from PhiNodes, see JDK-8309266.
// - During PhaseIdealLoop::split_thru_phi, the LoopLimitNode floats possibly far above
// the loop and its predicates, and we might get constants on one side of the phi that
// would lead to overflows. Such a code path would never lead us to enter the loop
// because of the loop limit overflow check that happens after the LoopLimitNode
// computation with overflow, but before we enter the loop, see JDK-8335747.

In PhaseIdealLoop::split_thru_phi, we temporarily split the LoopLimitNode through the phi, generating a new LoopLimitNode for each branch of the phi. We then call Value on it to see if that leads us to constant fold one of the branches, which would be considered a "win".

for (uint i = 1; i < region->req(); i++) {
Node* x;
Node* the_clone = nullptr;
if (region->in(i) == C->top()) {
x = C->top(); // Dead path? Use a dead data op
} else {
x = n->clone(); // Else clone up the data op
the_clone = x; // Remember for possible deletion.
// Alter data node to use pre-phi inputs
if (n->in(0) == region)
x->set_req( 0, region->in(i) );
for (uint j = 1; j < n->req(); j++) {
Node* in = n->in(j);
if (in->is_Phi() && in->in(0) == region)
x->set_req(j, in->in(i)); // Use pre-Phi input for the clone
}
}
// Check for a 'win' on some paths
const Type* t = x->Value(&_igvn);

In the regression test, we have this example:

int x = flag ? 1000 : 2147483647;
// Creates a Phi(1000, 2147483647)
// We do loop-predication, and add a
// LoopLimitNode(init=0, limit=x, stride=4)
//
// Later, we find try to PhaseIdealLoop::split_thru_phi
// the LoopLimitNode, through the Phi(1000, 2147483647).
//
// This creates a temporary
// LoopLimitNode(init=0, limit=2147483647, stride=4)
//
// And then we get this:
// init_con 0
// limit_con 2147483647
// stride_con 4
// trip_count 536870912
// final_int -2147483648
// final_con 2147483648
//
for (int i = 0; i < x; i+=4 /* works for at least 2..64 */) {
// Break before going out of bounds
// but with quadratic check to not affect limit.
if (i * i > 1000_000) { return; }
a[i] = 34;
}

We generate a temporary clone of LoopLimitNode(init=0, limit=x, stride=4) (would not constant fold because of variable x = Phi(1000, 2147483647)), which happens to be LoopLimitNode(init=0, limit=2147483647, stride=4). We evaluate Value on this temporary clone, and hit the overflow case.

Why is it ok to just remove the assert and allow LoopLimitNode to overflow?
We still have the loop limit check, which checks that limit <= max_int-stride, and this means we would never enter the loop if we took the Phi branch that led to the overflow.

I could not just remove the assert, because in LoopLimitNode::Ideal we have this (strange?) check that does not optimize the LoopLimitNode if the inputs are constants:

  if (in(Init)->is_Con() && in(Limit)->is_Con())
    return nullptr;  // Value

The assumption seems to be that we want Value to do the constant folding here - but of course we did not constant-fold because we had detected the overflow in Value. Not optimizing further here has a unfortunate consequence: on platforms that do not have LoopLimit implemented in the backend directly, we would have "lowered" the LoopLimit further down into ConvI2L, SubL, AddL, DivL, ConvL2I ... nodes. With this check, we do never lower it, and end up with a "bad AD", i.e. compilation bailout in product.

I think this check can reasonably be removed, because Value should be called before Ideal anyway, and so if we can constand fold because of constant inputs, we would have already done so.

Note, that the lowering is delayed until post_loop_opts_phase, but we never did record_for_post_loop_opts_igvn, and so it was not guaranteed that we actually ever processed the LoopLimitNode again, which would mean we got "bad AD" again, i.e. compilation bailout in product.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8335747: C2: fix overflow case for LoopLimit with constant inputs (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23024/head:pull/23024
$ git checkout pull/23024

Update a local copy of the PR:
$ git checkout pull/23024
$ git pull https://git.openjdk.org/jdk.git pull/23024/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23024

View PR using the GUI difftool:
$ git pr show -t 23024

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23024.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 10, 2025

👋 Welcome back epeter! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 10, 2025

@eme64 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8335747: C2: fix overflow case for LoopLimit with constant inputs

Reviewed-by: kvn, qamai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 38 new commits pushed to the master branch:

  • 062f2dc: 8347554: [BACKOUT] C2: implement optimization for series of Add of unique value
  • a289bcf: 8306579: Consider building with /Zc:throwingNew
  • cede304: 8347482: Remove unused field in ParkEvent
  • fa5ff82: 8342062: Reformat keytool and jarsigner output for keys with a named parameter set
  • cc19897: 8293123: Fix various include file ordering
  • 6e43f48: 8346929: runtime/ClassUnload/DictionaryDependsTest.java fails with "Test failed: should be unloaded"
  • c885e59: 8346377: Properly support static builds for Windows
  • 0612636: 8347373: HTTP/2 flow control checks may count unprocessed data twice
  • 450636a: 8347274: Gatherers.mapConcurrent exhibits undesired behavior under variable delays, interruption, and finishing
  • 82e2a79: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic
  • ... and 28 more: https://git.openjdk.org/jdk/compare/f6492aa63486393593ea8761cef5362ef46abf13...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8335747 8335747: C2: assert((in(Init)->is_ConI() && in(Limit)->is_ConI() && in(Stride)->is_ConI()) ? final_con == (jlong)final_int : true) failed: final value should be integer Jan 10, 2025
@openjdk
Copy link

openjdk bot commented Jan 10, 2025

@eme64 The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jan 10, 2025
@eme64 eme64 changed the title 8335747: C2: assert((in(Init)->is_ConI() && in(Limit)->is_ConI() && in(Stride)->is_ConI()) ? final_con == (jlong)final_int : true) failed: final value should be integer 8335747: C2: fix overflow case for LoopLimit with constant inputs Jan 10, 2025
@eme64 eme64 marked this pull request as ready for review January 10, 2025 07:11
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 10, 2025
@mlbridge
Copy link

mlbridge bot commented Jan 10, 2025

Webrevs

@vnkozlov
Copy link
Contributor

I could not just remove the assert, because in LoopLimitNode::Ideal we have this (strange?) check that does not optimize the LoopLimitNode if the inputs are constants:

May be we check it to not touch this loop until we fully unroll it (for small number of iterations)

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 10, 2025
@merykitty
Copy link
Member

I think this check can reasonably be removed, because Value should be called before Ideal anyway, and so if we can constant fold because of constant inputs, we would have already done so.

I think you are mistaken here, Ideal is called before Value.

Why is it ok to just remove the assert and allow LoopLimitNode to overflow?
We still have the loop limit check, which checks that limit <= max_int-stride, and this means we would never enter the loop if we took the Phi branch that led to the overflow.

In this case, can we return Type::TOP, so that if this assumption is false we will get an error?

@eme64
Copy link
Contributor Author

eme64 commented Jan 13, 2025

@merykitty

I think you are mistaken here, Ideal is called before Value.

Yikes, I think you are right. Still, the lowering happens only after post-loop-opts phase, so the Value optimization could have constant folded it in most cases by then. I hope that is good enough. The alternative is to test-run Value during Ideal, and check if it would constant fold... but that is a little hacky too.

In this case, can we return Type::TOP, so that if this assumption is false we will get an error?

Hmm, I'm not sure. I'm a little worried about the if here:
int x = flag ? 1000 : 2147483647;
The LoopLimitNode gets split through the Phi of the Region of this If. If the LoopLimitNode constant folds to TOP on the right branch, then the phi collapses. But the If here does not need to collapse. Indeed: we can take the right branch of the if, but we just cannot enter the loop after having taken it.
Also: LoopLimitNode::Ideal generates nodes in the lowering. Those could in principle also reach an overflow case, and in some strange case this could later constant fold. Then we would not get TOP either... I think we should give back a valid int value or range for the Value case as well, and not TOP. I would rather do that then possibly mess up the graph.
What do you think?

@merykitty
Copy link
Member

You are right, but I believe the resulting expansion will constant fold regardless. So, why do we need to reject constant folding of the LoopLimitNode in the presence of overflow?

@eme64
Copy link
Contributor Author

eme64 commented Jan 13, 2025

@merykitty You are probably right, we could probably just constant-fold in Value. I mean is it not a little strange anyway: why do we have the optimization twice: once in Value and then the lowering in Ideal. Feels a little like duplication.

We could investigate this in a follow-up RFE, what do you think?

Copy link
Member

@merykitty merykitty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, please go ahead.

@eme64
Copy link
Contributor Author

eme64 commented Jan 14, 2025

@eme64
Copy link
Contributor Author

eme64 commented Jan 14, 2025

Thanks @merykitty @vnkozlov for the reviews!
/integrate

@openjdk
Copy link

openjdk bot commented Jan 14, 2025

Going to push as commit f0af830.
Since your change was applied there have been 58 commits pushed to the master branch:

  • fec769b: 8346778: Enable native access should work with the source launcher
  • cbb2b84: 8344130: C2: Avoid excessive hoisting in scheduler due to minuscule differences in block frequency
  • bb93f67: 8347646: module-info classfile missing the preview flag
  • 3e989fd: 8346986: Remove ASM from java.base
  • 3967696: 8347496: Test jdk/jfr/jvm/TestModularImage.java fails after JDK-8347124: No javac
  • 6eb83ef: 8347500: hsdis cannot be built with Capstone.next
  • c1d322f: 8347627: Compiler replay tests are failing after JDK-8346990
  • 0ae5c6b: 8342996: Enhance Attach API to support arbitrary length arguments - OSX
  • 91b63ca: 8345016: [ASAN] java.c reported ‘%s’ directive argument is null [-Werror=format-truncation=]
  • 379d05b: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros
  • ... and 48 more: https://git.openjdk.org/jdk/compare/f6492aa63486393593ea8761cef5362ef46abf13...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 14, 2025
@openjdk openjdk bot closed this Jan 14, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 14, 2025
@openjdk
Copy link

openjdk bot commented Jan 14, 2025

@eme64 Pushed as commit f0af830.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants