8274074: SIGFPE with C2 compiled code with -XX:+StressGCM #5651

Closed
wants to merge 4 commits into from

Conversation

chhagedorn
Member

@chhagedorn chhagedorn commented Sep 23, 2021

In the testcase, the divisor input node of a DivI node is sunk out of a loop to the uncommon trap (UCT) of a division-by-zero check and is pinned with a CastII node so that it cannot float back into the loop. The divisor is then optimized under the assumption that it is only executed on the uncommon path. The CastII, however, is removed later, and the division floats back into the loop, which results in a SIGFPE crash.

The relevant lines in the testcase are the following two divisions:

static int iFld = 1;
static int q = 0;
...
y = iFld - q; // divisor
y = (iArrFld[2] / y); // division 1
y = (5 / iFld); // division 2
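
For orientation, the two divisions can be placed in a small self-contained program. This is only a sketch: the loop shape, the array contents, and the names `DivSketch` and `test` are assumptions, not the test attached to this PR. It illustrates that at the Java level neither division can see a zero divisor when iFld = 1 and q = 0, so a SIGFPE can only come from how the compiled code is scheduled.

```java
public class DivSketch {
    static int iFld = 1;
    static int q = 0;
    // Hypothetical array contents; the real test's values are not shown here.
    static int[] iArrFld = new int[] {10, 20, 30};

    static int test() {
        int y = 0;
        for (int i = 0; i < 100; i++) { // hypothetical loop shape
            y = iFld - q;               // divisor: 1 - 0 = 1
            y = (iArrFld[2] / y);       // division 1: 30 / 1, result is overwritten below
            y = (5 / iFld);             // division 2: 5 / 1 = 5
        }
        return y;
    }

    public static void main(String[] args) {
        System.out.println(test()); // prints 5
    }
}
```

The sketch is not claimed to reproduce the crash; it only pins down the values involved.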

After sinking the divisor of division 1 in the testcase to the div by zero UCT of division 2, the graph looks like this:

[Image: Screenshot from 2021-09-21 14-40-37, showing the graph after sinking]

  • 201 If is the zero check of division 2 (will always succeed because iFld = 1, i.e. UCT is never taken).
  • 193 DivI (division 1) is not sunk because its get_ctrl() is 203 IfFalse (outside the loop already because there is no use inside the loop since the local y is directly overwritten again).
  • 275 SubI (divisor) was sunk out of the loop and is pinned by 276 CastII (unconditional dependency).

In IGVN, CastII::Value() is called for 276 CastII. It sees an If/Cmp (zero check of division 2) with the same 137 LoadI input as for the 276 CastII. Therefore, we set its type to [0,0] here:

t = TypeInt::make(lo_int, hi_int, Type::WidenMax);
res = res->filter_speculative(t);
return res;

As a result, IGVN replaces 276 CastII with the constant zero. But 275 SubI and 193 DivI have now lost their pin to the uncommon path of the zero check of division 2. 193 DivI is only used on the uncommon path but can float freely again, including back into the loop, which is what happens in the testcase. Inside the loop, the division is executed with the optimized divisor 0 - q = 0, which is a division by zero, and we crash.
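
The narrowing step can be illustrated with a toy interval type. This is a sketch under stated assumptions: `IntRange` and `narrowOnFailedZeroCheck` are hypothetical names, and the class is a stand-in for an integer type range, not HotSpot's TypeInt or the real CastII::Value() logic. It shows why a value that is only valid on the failed-zero-check path collapses to [0,0], and what divisor follows from that.

```java
public class RangeSketch {
    /** Toy closed interval [lo, hi]; a stand-in, not HotSpot's TypeInt. */
    public static final class IntRange {
        public final int lo;
        public final int hi;

        public IntRange(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        /** Intersection of two intervals, similar in spirit to filtering a type. */
        public IntRange intersect(IntRange other) {
            return new IntRange(Math.max(lo, other.lo), Math.min(hi, other.hi));
        }

        public boolean isConstant() {
            return lo == hi;
        }
    }

    /** Range of a value on the path where a "value != 0" check has failed. */
    public static IntRange narrowOnFailedZeroCheck(IntRange loaded) {
        return loaded.intersect(new IntRange(0, 0));
    }

    public static void main(String[] args) {
        IntRange load = new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);
        IntRange cast = narrowOnFailedZeroCheck(load);
        System.out.println("[" + cast.lo + "," + cast.hi + "]"); // prints [0,0]

        int q = 0;                 // value of the static field in the testcase
        int divisor = cast.lo - q; // the optimized divisor 0 - q
        System.out.println("divisor = " + divisor); // prints divisor = 0
    }
}
```

The constant is only correct on the path that produced it, which is why the node carrying it must stay pinned to that path.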

In summary, it's not a problem that a Div node floats above its zero check here but rather that we optimize an input node used as divisor by assuming that we only execute the division on the uncommon path when the zero check of division 2 failed (which never happens). This divisor optimization would be wrong when the division is executed inside the loop. But due to losing the pin, we end up doing exactly that which results in a SIGFPE crash.

The suggested fix is to extend the sinking algorithm to rewire data nodes with a control input inside a loop whose get_ctrl() is actually completely outside loops on uncommon paths. The control input is set to get_ctrl() to force the nodes out of loops. In the example above, the control input of 193 DivI is set to 203 IfFalse, ensuring that it is still pinned to the uncommon path after 276 CastII is removed. This fix is also beneficial if we do not sink any nodes at all later.
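
The rewiring idea can be sketched on a toy graph. Everything here is hypothetical and heavily simplified: `Block`, `Node`, `rewireOutOfLoop`, and the `getCtrl` map are illustration-only names, loop membership is reduced to a depth counter, and none of this is the actual PhaseIdealLoop code. The sketch only mirrors the condition described above: a node whose get_ctrl() lies outside all loops but whose control input is still inside a loop gets its control input replaced by get_ctrl().

```java
import java.util.HashMap;
import java.util.Map;

public class RewireSketch {
    /** Toy control block; loopDepth 0 means "outside all loops". */
    public static final class Block {
        public final String name;
        public final int loopDepth;

        public Block(String name, int loopDepth) {
            this.name = name;
            this.loopDepth = loopDepth;
        }
    }

    /** Toy data node with an optional control input, standing in for n->in(0). */
    public static final class Node {
        public final String name;
        public Block in0;

        public Node(String name, Block in0) {
            this.name = name;
            this.in0 = in0;
        }
    }

    /** Pins n to getCtrl(n) if that block is outside all loops but n's control input is not. */
    public static boolean rewireOutOfLoop(Node n, Map<Node, Block> getCtrl) {
        Block ctrl = getCtrl.get(n);
        if (ctrl != null && ctrl.loopDepth == 0 && n.in0 != null && n.in0.loopDepth > 0) {
            n.in0 = ctrl;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Block inLoop = new Block("control inside loop (hypothetical)", 1);
        Block ifFalse = new Block("203 IfFalse", 0);
        Node div = new Node("193 DivI", inLoop);

        Map<Node, Block> getCtrl = new HashMap<>();
        getCtrl.put(div, ifFalse);

        rewireOutOfLoop(div, getCtrl);
        System.out.println(div.name + " is pinned to " + div.in0.name); // prints 193 DivI is pinned to 203 IfFalse
    }
}
```

In this model the pin no longer depends on an intermediate cast node, so removing such a node later cannot release the division back into the loop.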

Thanks,
Christian


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8274074: SIGFPE with C2 compiled code with -XX:+StressGCM

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5651/head:pull/5651
$ git checkout pull/5651

Update a local copy of the PR:
$ git checkout pull/5651
$ git pull https://git.openjdk.java.net/jdk pull/5651/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5651

View PR using the GUI difftool:
$ git pr show -t 5651

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5651.diff

@bridgekeeper

bridgekeeper bot commented Sep 23, 2021

👋 Welcome back chagedorn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label (Pull request is ready for review) Sep 23, 2021
@openjdk

openjdk bot commented Sep 23, 2021

@chhagedorn The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler label (hotspot-compiler-dev@openjdk.org) Sep 23, 2021
@mlbridge

mlbridge bot commented Sep 23, 2021

Webrevs

@@ -1558,6 +1558,11 @@ void PhaseIdealLoop::try_sink_out_of_loop(Node* n) {
_igvn.remove_dead_node(n);
}
_dom_lca_tags_round = 0;
} else if (n_loop == _ltree_root && n->in(0) != NULL && get_loop(n->in(0)) != _ltree_root) {
Contributor

Couldn't the node be out of this loop but not necessarily out of all loops?

Member Author

You're right, that's an unnecessary limitation. I've tried to find an example where this happens but could not come up with one. But I'm sure that situation will occur at some point. I pushed an update.

Member

@TobiHartmann TobiHartmann left a comment

Looks good to me.

// Zero check Z2 with UCT
// DivI node D is only used on IfFalse path of zero check Z2 into UCT (on IfTrue path, the result is not used anywhere
// because we directly overwrite it again with "y = (5 / iFld)). The IfFalse path of the zero check, however, is never
// taken because iFld = 1. But before applying the sinking algorithm, the DivI node D could be be executed during the
Member

"could be be" -> "could be"

// and optimizing it accordingly (iFld is found to be zero because the zero check Z2 failed, i.e. iFld is zero which is
// propagated into the CastII node whose type is improved to [0,0] and the node is replaced by constant zero), the
// DivI node must NOT be executed inside the loop anymore. But the DivI node is executed in the loop because of losing
// the CastII pin. The fix is to updated the control input of the DivI node to the get_ctrl() input outside the loop
Member

"to updated" -> "to update"

@chhagedorn
Member Author

Thanks Tobias for your review!

Contributor

@rwestrel rwestrel left a comment

Looks good.

@openjdk

openjdk bot commented Sep 27, 2021

@chhagedorn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274074: SIGFPE with C2 compiled code with -XX:+StressGCM

Reviewed-by: roland, thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 138 new commits pushed to the master branch:

  • e3aff8f: 8274289: jdk/jfr/api/consumer/TestRecordedFrameType.java failed with "RuntimeException: assertNotEquals: expected Interpreted to not equal Interpreted"
  • 252aaa9: 8274293: Build failure on macOS with Xcode 13.0 as vfork is deprecated
  • 7700b25: 8273401: Disable JarIndex support in URLClassPath
  • 5ec1cdc: 8274321: Standardize values of @since tags in javax.lang.model
  • 4838a2c: 8274143: Disable "invalid entry for security.provider.X" error message in log file when security.provider.X is empty
  • ab28db1: 8274312: ProblemList 2 serviceability/dcmd/gc tests with ZGC on macos-all
  • 8c122af: 8274314: Typo in WatchService#poll(long timeout, TimeUnit unit) javadoc
  • 9bc865d: 8273960: Redundant condition in Metadata.TypeComparator.compare
  • 5756385: 8274273: Update testing docs for MacOS with Non-US locale
  • 61ac53f: 8210927: JDB tests do not update source path after doing a redefine class
  • ... and 128 more: https://git.openjdk.java.net/jdk/compare/74ffe12267cb3ae63072a06f50083fd0352d8049...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label (Pull request is ready to be integrated) Sep 27, 2021
@chhagedorn
Member Author

Thanks Roland for your review!

@chhagedorn
Member Author

/integrate

@openjdk

openjdk bot commented Sep 27, 2021

Going to push as commit b0983df.
Since your change was applied there have been 140 commits pushed to the master branch:

  • 7436a77: 8274317: Unnecessary reentrant synchronized block in java.awt.Cursor
  • 7426fd4: 8274325: C4819 warning at vm_version_x86.cpp on Windows after JDK-8234160
  • e3aff8f: 8274289: jdk/jfr/api/consumer/TestRecordedFrameType.java failed with "RuntimeException: assertNotEquals: expected Interpreted to not equal Interpreted"
  • 252aaa9: 8274293: Build failure on macOS with Xcode 13.0 as vfork is deprecated
  • 7700b25: 8273401: Disable JarIndex support in URLClassPath
  • 5ec1cdc: 8274321: Standardize values of @since tags in javax.lang.model
  • 4838a2c: 8274143: Disable "invalid entry for security.provider.X" error message in log file when security.provider.X is empty
  • ab28db1: 8274312: ProblemList 2 serviceability/dcmd/gc tests with ZGC on macos-all
  • 8c122af: 8274314: Typo in WatchService#poll(long timeout, TimeUnit unit) javadoc
  • 9bc865d: 8273960: Redundant condition in Metadata.TypeComparator.compare
  • ... and 130 more: https://git.openjdk.java.net/jdk/compare/74ffe12267cb3ae63072a06f50083fd0352d8049...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Sep 27, 2021
@openjdk openjdk bot added the integrated label (Pull request has been integrated) and removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Sep 27, 2021
@openjdk

openjdk bot commented Sep 27, 2021

@chhagedorn Pushed as commit b0983df.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@chhagedorn chhagedorn deleted the JDK-8274074 branch October 1, 2021 07:57
Labels
hotspot-compiler (hotspot-compiler-dev@openjdk.org), integrated (Pull request has been integrated)
3 participants