8292301: [REDO v2] C2 crash when allocating array of size too large #10038

Closed
wants to merge 16 commits into from

@rwestrel rwestrel commented Aug 26, 2022

On top of the redo, this fixes two bugs:

8288184: the problem here is that the ValidLengthTest input of an
AllocateArrayNode becomes a constant. The CatchNode would then change
types if it was reprocessed but it's not. Custom logic is needed to
enqueue the CatchNode when the ValidLengthTest input of an
AllocateArrayNode changes. The CastII out of the AllocateArrayNode
becomes top but the fallthrough path doesn't die. This happens with
igvn in the case of the bug but could also happen with ccp. I fixed
both in this patch.
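
To make the 8288184 fix concrete, here is a minimal C++ sketch of the enqueueing rule. It is a toy model, not HotSpot code: `ToyNode`, `ToyWorklist`, and the `catch_node` shortcut are hypothetical names invented for illustration. The actual change is in `PhaseIterGVN::add_users_to_worklist()`, and the real graph walk from the allocation to its CatchNode is more involved.

```cpp
#include <cassert>
#include <deque>
#include <unordered_set>
#include <vector>

// Toy stand-ins for C2 node kinds; these are not the real HotSpot classes.
enum class Kind { Bool, AllocateArray, Catch, Other };

struct ToyNode {
  Kind kind;
  std::vector<ToyNode*> users;           // nodes that consume this node
  ToyNode* valid_length_test = nullptr;  // only meaningful for AllocateArray
  ToyNode* catch_node = nullptr;         // hypothetical shortcut to the downstream Catch
  explicit ToyNode(Kind k) : kind(k) {}
};

// A worklist that ignores duplicate pushes.
struct ToyWorklist {
  std::deque<ToyNode*> queue;
  std::unordered_set<ToyNode*> members;

  void push(ToyNode* n) {
    if (n != nullptr && members.insert(n).second) {
      queue.push_back(n);
    }
  }
  bool contains(ToyNode* n) const { return members.count(n) != 0; }
};

// When n changes, enqueue its direct users. If a user is an array allocation
// whose length test is n, also enqueue the Catch node: it is not a direct
// user of n, so it would otherwise never be reprocessed.
void add_users_to_worklist(ToyNode* n, ToyWorklist& wl) {
  for (ToyNode* use : n->users) {
    wl.push(use);
    if (use->kind == Kind::AllocateArray && use->valid_length_test == n) {
      wl.push(use->catch_node);
    }
  }
}
```

In this sketch, calling `add_users_to_worklist()` on the length test puts both the allocation and its Catch node on the worklist, which is the behavior the fix is after.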

8291665: the code pattern for this is 2 AllocateArrayNodes out of loop
with a shared ValidLengthTest input in a loop. When the loop is cloned
that causes Phis to be added between the AllocateArrayNodes and the
BoolNode of the ValidLengthTest inputs. Split if runs next and it
doesn't expect the Phi at the ValidLengthTest inputs. The fix here is
to clone the Bool/Cmp subgraph down on loop cloning. There's logic for
that when the use of the bool is an If for instance so I simply added
a special case to run that logic for an AllocateArrayNode use as
well. Note that the test case I added fails reliably on 11 but not
with the current jdk development branch. AFAICT, the bug is there but
something unrelated changed and a slightly different graph is built
for the test case that prevents split if.
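
The "clone the Bool/Cmp subgraph down" step for 8291665 can be sketched on a toy graph. This is an illustration under simplifying assumptions, loosely modeled on what `clone_iff()` is described as doing: `ToyNode`, `ToyGraph`, and `clone_bool_down()` are hypothetical names, and the toy Phi has no region input, unlike a real C2 PhiNode.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class Kind { Value, Cmp, Bool, Phi };

struct ToyNode {
  Kind kind;
  std::vector<ToyNode*> in;  // inputs, in order
};

// Owns every node so raw pointers stay valid for the lifetime of the graph.
struct ToyGraph {
  std::vector<std::unique_ptr<ToyNode>> nodes;

  ToyNode* make(Kind k, std::vector<ToyNode*> in = {}) {
    nodes.emplace_back(new ToyNode{k, std::move(in)});
    return nodes.back().get();
  }
};

// Given Phi(Bool(Cmp(a1, b1)), Bool(Cmp(a2, b2))), build
// Bool(Cmp(Phi(a1, a2), Phi(b1, b2))). The user that expected a Bool at its
// test input then sees a Bool again instead of a Phi.
ToyNode* clone_bool_down(ToyGraph& g, ToyNode* phi) {
  assert(phi->kind == Kind::Phi);
  std::vector<ToyNode*> lhs;
  std::vector<ToyNode*> rhs;
  for (ToyNode* b : phi->in) {
    assert(b->kind == Kind::Bool && b->in.size() == 1);
    ToyNode* cmp = b->in[0];
    assert(cmp->kind == Kind::Cmp && cmp->in.size() == 2);
    lhs.push_back(cmp->in[0]);
    rhs.push_back(cmp->in[1]);
  }
  ToyNode* lhs_phi = g.make(Kind::Phi, lhs);
  ToyNode* rhs_phi = g.make(Kind::Phi, rhs);
  ToyNode* new_cmp = g.make(Kind::Cmp, {lhs_phi, rhs_phi});
  return g.make(Kind::Bool, {new_cmp});
}
```

The design point is that the Phi moves below the comparison, onto its operands, so later passes such as split if keep seeing the shape they expect at the test input.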


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8292301: [REDO v2] C2 crash when allocating array of size too large

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10038/head:pull/10038
$ git checkout pull/10038

Update a local copy of the PR:
$ git checkout pull/10038
$ git pull https://git.openjdk.org/jdk pull/10038/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10038

View PR using the GUI difftool:
$ git pr show -t 10038

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10038.diff

@bridgekeeper

bridgekeeper bot commented Aug 26, 2022

👋 Welcome back roland! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Aug 26, 2022

@rwestrel The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Aug 26, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 26, 2022
@mlbridge

mlbridge bot commented Aug 26, 2022

Webrevs

@navyxliu
Member

This PR combines 3 patches. The base is JDK-9279219, which has been reviewed. On top of that, the two other patches are bugfixes for corner cases.

 void PhaseCCP::push_catch(Unique_Node_List& worklist, const Node* use) {
-  if (use->is_Call()) {
+  if (use->is_Call() || use->is_AllocateArray()) {
Member

Hi Roland,
I understand your intention here, but isn't AllocateArrayNode also a CallNode?
My understanding is that use->is_Call() is true if use is an AllocateNode.

Contributor Author

Good catch. There's no issue with CCP then. I'll update the change.
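
The redundancy pointed out in this thread can be shown with a small standalone sketch. It mirrors only the inheritance relation under discussion (an array allocation node is a kind of call node). The `Toy*` classes are hypothetical, and the virtual functions are an assumption made for brevity: HotSpot implements its `is_Call()`-style queries differently.

```cpp
#include <cassert>

// Toy hierarchy; not the real HotSpot classes.
struct ToyNode {
  virtual ~ToyNode() = default;
  virtual bool is_Call() const { return false; }
  virtual bool is_AllocateArray() const { return false; }
};

struct ToyCallNode : ToyNode {
  bool is_Call() const override { return true; }
};

struct ToyAllocateNode : ToyCallNode {};

struct ToyAllocateArrayNode : ToyAllocateNode {
  bool is_AllocateArray() const override { return true; }
};

// The condition before the proposed change.
bool original_check(const ToyNode& use) {
  return use.is_Call();
}

// The condition after the proposed change. Because every array allocation
// is already a call, the second operand can never change the result.
bool extended_check(const ToyNode& use) {
  return use.is_Call() || use.is_AllocateArray();
}
```

For any node in this toy hierarchy, `original_check()` and `extended_check()` return the same value, which is why the extra test was dropped from the CCP part of the change.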

Member

@navyxliu navyxliu left a comment

The two bugfixes work for me. LGTM.
I am not a Reviewer, so other reviewers need to approve it.

@rwestrel
Contributor Author

rwestrel commented Sep 1, 2022

the two bugfixes work for me. LGTM. I am not a reviewer. need other reviewers to approve it.

Thanks for reviewing this.

Member

@navyxliu navyxliu left a comment

LGTM.

Member

@TobiHartmann TobiHartmann left a comment

Looks good otherwise.

src/hotspot/share/opto/phaseX.cpp (outdated)
@@ -1640,6 +1640,13 @@ void PhaseIterGVN::add_users_to_worklist( Node *n ) {
if (imem != NULL) add_users_to_worklist0(imem);
}
}
if (use_op == Op_AllocateArray && n == use->in(AllocateNode::ValidLengthTest)) {
Member

Please add a comment.

Contributor Author

Done in an updated change.

if (iff->in(1)->is_Phi()) {
Node *b = clone_iff(iff->in(1)->as_Phi());
_igvn.replace_input_of(iff, 1, b);
uint input = iff->Opcode() == Op_AllocateArray ? AllocateNode::ValidLengthTest : 1;
Member

Suggested change
-uint input = iff->Opcode() == Op_AllocateArray ? AllocateNode::ValidLengthTest : 1;
+uint input = (iff->Opcode() == Op_AllocateArray) ? AllocateNode::ValidLengthTest : 1;

Contributor Author

I left that one out as I think this pattern is common enough that it shouldn't be ambiguous.
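
For readers unsure about the precedence involved: in C++, `==` binds more tightly than the conditional operator `?:`, so the two spellings in the suggestion compute the same value and differ only in readability. A minimal sketch follows; the enum values and `kToyValidLengthTest` are hypothetical stand-ins, not the real `Op_AllocateArray` or `AllocateNode::ValidLengthTest` values.

```cpp
#include <cassert>

// Hypothetical opcodes and input index, chosen only for illustration.
enum ToyOpcode { Op_ToyIf = 1, Op_ToyAllocateArray = 2 };
constexpr unsigned kToyValidLengthTest = 9;

// Without parentheses: parsed as (opcode == Op_ToyAllocateArray) ? ... : ...
unsigned input_without_parens(int opcode) {
  return opcode == Op_ToyAllocateArray ? kToyValidLengthTest : 1;
}

// With explicit parentheses around the comparison.
unsigned input_with_parens(int opcode) {
  return (opcode == Op_ToyAllocateArray) ? kToyValidLengthTest : 1;
}
```

Both functions select the length-test input for the toy array allocation opcode and input 1 for everything else.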

@@ -2037,7 +2037,8 @@ void PhaseIdealLoop::clone_loop_handle_data_uses(Node* old, Node_List &old_new,
// in the loop to break the loop, then test is again outside of the
// loop to determine which way the loop exited.
// Loop predicate If node connects to Bool node through Opaque1 node.
-if (use->is_If() || use->is_CMove() || C->is_predicate_opaq(use) || use->Opcode() == Op_Opaque4) {
+if (use->is_If() || use->is_CMove() || C->is_predicate_opaq(use) || use->Opcode() == Op_Opaque4 ||
Member

Please add a comment describing the new case.

Contributor Author

Done in updated change.

@openjdk

openjdk bot commented Sep 8, 2022

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8292301: [REDO v2] C2 crash when allocating array of size too large

Reviewed-by: xliu, thartmann, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 8 new commits pushed to the master branch:

  • 6ecd081: 8294270: make test passes awkward -status:-status:error,fail to jtreg
  • eca9749: 8288325: [windows] Actual and Preferred Size of AWT Non-resizable frame are different
  • 2e20e7e: 8294271: Remove use of ThreadDeath from make utilities
  • e45f3d5: 8294281: Allow warnings to be disabled on a per-file basis
  • 664e5b1: 8294187: RISC-V: Unify all relocations for the backend into AbstractAssembler::relocate()
  • acd75e0: 8294053: Unneeded local variable in handle_safefetch()
  • 0b56b82: 8293991: java/lang/Float/Binary16ConversionNaN.java fails on silent NaN conversions
  • acd5bcf: 8289610: Degrade Thread.stop

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 8, 2022
@rwestrel
Contributor Author

rwestrel commented Sep 9, 2022

Looks good otherwise.

Thanks for reviewing.

Member

@TobiHartmann TobiHartmann left a comment

Thanks for adding these comments. Looks good!

Member

@navyxliu navyxliu left a comment

still LGTM.

Member

@navyxliu navyxliu left a comment

LGTM.

@rwestrel
Contributor Author

LGTM.

Thanks for re-reviewing. @TobiHartmann recommended one more review as this change caused issues and had to be backed out.

Contributor

@vnkozlov vnkozlov left a comment

Should we move check_no_dead_use() call after final_graph_reshaping() call because we may have dead uses?

call->entry_point() == OptoRuntime::new_array_nozero_Java()) {
assert(call->is_CallStaticJava(), "static call expected");
assert(call->req() == call->jvms()->endoff() + 1, "missing extra input");
call->del_req(call->req()-1); // valid length test useless now
Contributor

How do you remove the graph (test) attached to the ValidLengthTest input? How do you know that the call node still has this input and that it is at call->req()-1?

Contributor Author

You're right that the subgraph at ValidLengthTest goes dead and is not properly removed. I pushed a new change to fix that.
The ValidLengthTest input from the AllocateArrayNode is always moved to the call so there's no reason it shouldn't still be there.
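
The invariant behind `call->req()-1` can be sketched with a plain vector of inputs. This is a toy model under stated assumptions: `ToyCall`, `attach_valid_length_test()`, and `detach_valid_length_test()` are hypothetical names, inputs are integer ids rather than `Node*`, and `endoff` stands in for `call->jvms()->endoff()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyCall {
  std::vector<int> inputs;  // ids of input nodes; real C2 stores Node*
  std::size_t endoff;       // index one past the regular inputs (hypothetical)

  std::size_t req() const { return inputs.size(); }
  void add_req(int n) { inputs.push_back(n); }
  void del_req(std::size_t idx) {
    inputs.erase(inputs.begin() + static_cast<std::ptrdiff_t>(idx));
  }
};

// Macro expansion side: the length test is always appended as exactly one
// extra trailing input on the call.
void attach_valid_length_test(ToyCall& call, int test) {
  assert(call.req() == call.endoff);
  call.add_req(test);
}

// Final graph reshaping side: check the invariant, read the trailing input,
// then drop it because it is no longer needed.
int detach_valid_length_test(ToyCall& call) {
  assert(call.req() == call.endoff + 1 && "missing extra input");
  int test = call.inputs[call.req() - 1];
  call.del_req(call.req() - 1);
  return test;
}
```

Because the extra input is appended unconditionally in one place and consumed in one place, the consumer can rely on it being the last required input, and the assertion documents that assumption.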

arg1->as_Type()->type()->join(TypeInt::POS)->empty()) {
assert(call->is_CallStaticJava(), "static call expected");
assert(call->req() == call->jvms()->endoff() + 1, "missing extra input");
Node* valid_length_test = call->in(call->req()-1);
Contributor

How do you know that the call node still has this input and that it is at call->req()-1?

Contributor Author

Same as above: the ValidLengthTest input from the AllocateArrayNode is always moved to the call so there's no reason it shouldn't still be there.

// For array allocations, copy the valid length check to the call node so Compile::final_graph_reshaping() can verify
// that the call has the expected number of CatchProj nodes (in case the allocation always fails and the fallthrough
// path dies).
if (valid_length_test != NULL) {
Contributor

Should we check for TOP too?

Contributor Author

valid_length_test is NULL for AllocateNode and not NULL for AllocateArrayNode. PhaseMacroExpand::expand_allocate_common() is shared by the 2 node types and that test is used to differentiate the two. I don't think top is possible here. I also think for simplicity we want to always move the ValidLength input from the AllocateArrayNode to the call.

Contributor

Got it.

@vnkozlov
Contributor

I don't see a link to mach5 pre-integration testing in JBS. Did we run it, or are we waiting for the final version?

@openjdk

openjdk bot commented Sep 23, 2022

@rwestrel this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8292301
git fetch https://git.openjdk.org/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Sep 23, 2022
@rwestrel
Contributor Author

Thanks for reviewing this.

Should we move check_no_dead_use() call after final_graph_reshaping() call because we may have dead uses?

Yes, but we would then hit: https://bugs.openjdk.org/browse/JDK-8211759
That is, there are other transformations in final graph reshape that don't properly collect dead nodes so I think this should be done as a standalone change.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Sep 23, 2022
@TobiHartmann
Member

I don't see link to mach5 pre-integration testing in JBS. Did we run it or we waiting final version?

I re-submitted testing and will report back once it passed.

* @requires vm.compiler2.enabled
* @library /test/lib /
* @build sun.hotspot.WhiteBox
* @run driver jdk.test.lib.helpers.ClassFileInstaller sun.hotspot.WhiteBox
Member

The test fails with

test result: Error. can't find sun.hotspot.WhiteBox in test directory or libraries

It should use jdk.test.whitebox.WhiteBox.

Contributor Author

Thanks. Fixed now.

@vnkozlov
Contributor

Thanks for reviewing this.

Should we move check_no_dead_use() call after final_graph_reshaping() call because we may have dead uses?

Yes, but we would then hit: https://bugs.openjdk.org/browse/JDK-8211759 That is, there are other transformations in final graph reshape that don't properly collect dead nodes so I think this should be done as a standalone change.

Okay. I agree on separate changes.

Contributor

@vnkozlov vnkozlov left a comment

Good.

@TobiHartmann
Member

I'll re-run testing and report back once it passed.

@TobiHartmann
Member

All tests passed.

@rwestrel
Contributor Author

thanks for the review @vnkozlov and testing @TobiHartmann

@rwestrel
Contributor Author

/integrate

@openjdk

openjdk bot commented Sep 28, 2022

Going to push as commit 1ea0d6b.
Since your change was applied there have been 48 commits pushed to the master branch:

  • c13e0ef: 8292848: AWT_Mixing and TrayIcon tests fail on el8 with hard-coded isOel7
  • 79ccc79: 8293613: need to properly handle and hide tmp VTMS transitions
  • 5e1e449: 8290920: sspi_bridge.dll not built if BUILD_CRYPTO is false
  • d827fd8: 8294430: RISC-V: Small refactoring for movptr_with_offset
  • 9d76ac8: 8292158: AES-CTR cipher state corruption with AVX-512
  • e5b65c4: 8290482: Update JNI Specification of DestroyJavaVM for better alignment with JLS, JVMS, and Java SE API Specifications
  • f8d9fa8: 8294483: Remove vmTestbase/nsk/jvmti/GetThreadState tests.
  • 6ad151d: 8293143: Workaround for JDK-8292217 when doing "step over" of bytecode with unresolved cp reference
  • 22b59b6: 8294471: SpecTaglet is inconsistent with SpecTree for inline property
  • 763d4bf: 8293592: Remove JVM_StopThread, stillborn, and related cleanup
  • ... and 38 more: https://git.openjdk.org/jdk/compare/05c8cabdad7b5c573046b1c5d235c33ac5cb266c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 28, 2022
@openjdk openjdk bot closed this Sep 28, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 28, 2022
@openjdk

openjdk bot commented Sep 28, 2022

@rwestrel Pushed as commit 1ea0d6b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

caizixian added a commit to caizixian/mmtk-openjdk that referenced this pull request Oct 25, 2022