-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8292301: [REDO v2] C2 crash when allocating array of size too large #10038
Conversation
…ng array of size too large" This reverts commit 967a28c.
👋 Welcome back roland! A progress list of the required criteria for merging this PR into |
This PR compounds 3 patches. The base is JDK-9279219, which has reviewed. On top of that, two others patches are bugfixes for the corner cases. |
src/hotspot/share/opto/phaseX.cpp
Outdated
void PhaseCCP::push_catch(Unique_Node_List& worklist, const Node* use) { | ||
if (use->is_Call()) { | ||
if (use->is_Call() || use->is_AllocateArray()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi, Roland,
I understand your intention here, but isn't AllocateArrayNode also a CallNode?
My understanding is that use->is_Call() is true if use is an AllocationNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. There's no issue with CCP then. I'll update the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the two bugfixes work for me. LGTM.
I am not a reviewer. need other reviewers to approve it.
Thanks for reviewing this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good otherwise.
@@ -1640,6 +1640,13 @@ void PhaseIterGVN::add_users_to_worklist( Node *n ) { | |||
if (imem != NULL) add_users_to_worklist0(imem); | |||
} | |||
} | |||
if (use_op == Op_AllocateArray && n == use->in(AllocateNode::ValidLengthTest)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in an updated change.
if (iff->in(1)->is_Phi()) { | ||
Node *b = clone_iff(iff->in(1)->as_Phi()); | ||
_igvn.replace_input_of(iff, 1, b); | ||
uint input = iff->Opcode() == Op_AllocateArray ? AllocateNode::ValidLengthTest : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint input = iff->Opcode() == Op_AllocateArray ? AllocateNode::ValidLengthTest : 1; | |
uint input = (iff->Opcode() == Op_AllocateArray) ? AllocateNode::ValidLengthTest : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left that one out as I think this pattern is common enough that it shouldn't be ambiguous.
@@ -2037,7 +2037,8 @@ void PhaseIdealLoop::clone_loop_handle_data_uses(Node* old, Node_List &old_new, | |||
// in the loop to break the loop, then test is again outside of the | |||
// loop to determine which way the loop exited. | |||
// Loop predicate If node connects to Bool node through Opaque1 node. | |||
if (use->is_If() || use->is_CMove() || C->is_predicate_opaq(use) || use->Opcode() == Op_Opaque4) { | |||
if (use->is_If() || use->is_CMove() || C->is_predicate_opaq(use) || use->Opcode() == Op_Opaque4 || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment describing the new case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in updated change.
@rwestrel This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 8 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
Thanks for reviewing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these comments. Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for re-reviewing. @TobiHartmann recommended one more review as this change caused issues and had to be backed out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move check_no_dead_use()
call after final_graph_reshaping()
call because we may have dead uses?
src/hotspot/share/opto/compile.cpp
Outdated
call->entry_point() == OptoRuntime::new_array_nozero_Java()) { | ||
assert(call->is_CallStaticJava(), "static call expected"); | ||
assert(call->req() == call->jvms()->endoff() + 1, "missing extra input"); | ||
call->del_req(call->req()-1); // valid length test useless now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you remove graph (test) attached to ValidLengthTest input? How you know that call
node still has this input and it is call->req() -1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that the subgraph at ValidLengthTest goes dead and is not properly removed. I pushed a new change to fix that.
The ValidLengthTest input from the AllocateArrayNode is always moved to the call so there's no reason it shouldn't still be there.
src/hotspot/share/opto/compile.cpp
Outdated
arg1->as_Type()->type()->join(TypeInt::POS)->empty()) { | ||
assert(call->is_CallStaticJava(), "static call expected"); | ||
assert(call->req() == call->jvms()->endoff() + 1, "missing extra input"); | ||
Node* valid_length_test = call->in(call->req()-1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you know that call node still has this input and it is call->req() -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: the ValidLengthTest input from the AllocateArrayNode is always moved to the call so there's no reason it shouldn't still be there.
// For array allocations, copy the valid length check to the call node so Compile::final_graph_reshaping() can verify | ||
// that the call has the expected number of CatchProj nodes (in case the allocation always fails and the fallthrough | ||
// path dies). | ||
if (valid_length_test != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check for TOP too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
valid_length_test is NULL for AllocateNode and not NULL for AllocateArrayNode. PhaseMacroExpand::expand_allocate_common() is shared by the 2 node types and that test is used to differentiate the two. I don't think top is possible here. I also think for simplicity we want to always move the ValidLength input from the AllocateArrayNode to the call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
I don't see link to mach5 pre-integration testing in JBS. Did we run it or we waiting final version? |
@rwestrel this pull request can not be integrated into git checkout JDK-8292301
git fetch https://git.openjdk.org/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
Thanks for reviewing this.
Yes, but we would then hit: https://bugs.openjdk.org/browse/JDK-8211759 |
I re-submitted testing and will report back once it passed. |
* @requires vm.compiler2.enabled | ||
* @library /test/lib / | ||
* @build sun.hotspot.WhiteBox | ||
* @run driver jdk.test.lib.helpers.ClassFileInstaller sun.hotspot.WhiteBox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test fails with
test result: Error. can't find sun.hotspot.WhiteBox in test directory or libraries
It should use jdk.test.whitebox.WhiteBox
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed now.
Okay. I agree on separate changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.
// For array allocations, copy the valid length check to the call node so Compile::final_graph_reshaping() can verify | ||
// that the call has the expected number of CatchProj nodes (in case the allocation always fails and the fallthrough | ||
// path dies). | ||
if (valid_length_test != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
I'll re-run testing and report back once it passed. |
All tests passed. |
thanks for the review @vnkozlov and testing @TobiHartmann |
/integrate |
Going to push as commit 1ea0d6b.
Your commit was automatically rebased without conflicts. |
On top of the redo, this fixed 2 bugs:
8288184: the problem here is that the ValidLengthTest input of an
AllocateArrayNode becomes a constant. The CatchNode would then change
types if it was reprocessed but it's not. Custom logic is needed to
enqueue the CatchNode when the ValidLengthTest input of an
AllocateArrayNode changes. The CastII out of the AllocateArrayNode
becomes top but the fallthrough path doesn't die. This happens with
igvn in the case of the bug but could also happen with ccp. I fixed
both in this patch.
8291665: the code pattern for this is 2 AllocateArrayNodes out of loop
with a shared ValidLengthTest input in a loop. When the loop is cloned
that causes Phis to be added between the AllocateArrayNodes and the
BoolNode of the ValidLengthTest inputs. Split if runs next and it
doesn't expect the Phi at the ValidLengthTest inputs. The fix here is
to clone the Bool/Cmp subgraph down on loop cloning. There's logic for
that when the use of the bool is an If for instance so I simply added
a special case to run that logic for an AllocateArrayNode use as
well. Note that the test case I added fails reliably on 11 but not
with the current jdk developement branch. AFAICT, the bug is there but
something unrelated changed and a slightly different graph is built
for the test case that prevents split if.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10038/head:pull/10038
$ git checkout pull/10038
Update a local copy of the PR:
$ git checkout pull/10038
$ git pull https://git.openjdk.org/jdk pull/10038/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10038
View PR using the GUI difftool:
$ git pr show -t 10038
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10038.diff