Navigation Menu

Skip to content
This repository has been archived by the owner on Sep 2, 2022. It is now read-only.

8269285: Crash/miscompile in CallGenerator::for_method_handle_inline after JDK-8191998 #169

Closed
wants to merge 5 commits into from

Conversation

shipilev
Copy link
Member

@shipilev shipilev commented Jun 29, 2021

See the bug report for more details.

I believe the JDK-8191998 change introduced a slight regression, where the speculative type join may empty the type. It would then crash on assert in fastdebug builds, or miscompile the null-check to true in release bits. New test captures both failure modes.

This is not a recent regression, but a regression nevertheless, so I would like to have that fix in JDK 17. Please review carefully, or speak up if you want to move it to JDK 18+ and then backport later.

Additional testing:

  • New test fails without the patch, passes with it
  • Linux x86_64 fastdebug tier1

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8269285: Crash/miscompile in CallGenerator::for_method_handle_inline after JDK-8191998

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk17 pull/169/head:pull/169
$ git checkout pull/169

Update a local copy of the PR:
$ git checkout pull/169
$ git pull https://git.openjdk.java.net/jdk17 pull/169/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 169

View PR using the GUI difftool:
$ git pr show -t 169

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk17/pull/169.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 29, 2021

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 29, 2021
@openjdk
Copy link

openjdk bot commented Jun 29, 2021

@shipilev The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.java.net label Jun 29, 2021
@mlbridge
Copy link

mlbridge bot commented Jun 29, 2021

Webrevs

@vnkozlov
Copy link
Contributor

@shipilev can it be related to 8268884?

@shipilev
Copy link
Member Author

@shipilev can it be related to 8268884?

It does not look like it to me, but I am not the expert in this code. I thought this one is similar to the code in other place. Anyhow, applying the patch from JDK-8268884 PR does not fix the new regression test from this PR.

@shipilev
Copy link
Member Author

@vnkozlov More precisely, here are the types that join_speculative encounters on the failing path:

STDOUT:
arg_type: byte[int:>=0]:exact * (speculative=byte[int:>=0]:NotNull:exact *)
sig_type: java/io/Serializable *
narrowed_arg_type: java/lang/Object:AnyNull *,iid=top

@veresov
Copy link
Contributor

veresov commented Jun 29, 2021

I'm not an expert on this but perhaps we shouldn't be doing a join and instead transfer the speculative part of the argument to the speculative part of type of the signature?

@shipilev
Copy link
Member Author

I'm not an expect on this but perhaps we shouldn't be doing a join and instead transfer the speculative part of the argument to the speculative part of type of the signature?

I thought join_speculative was the attempt at that; but that fails when argument and signature are seriously different (interface and boxed class (?) here). It seems some other uses of join_speculative appreciate that corner case and handle it. I would be happy to try other approaches, but cannot find an easy (as in clean and obvious) way to do this.

@veresov
Copy link
Contributor

veresov commented Jun 29, 2021

Well, join_speculative() also does a join between the normal types. And evidently the type system has no idea how byte[] relates to Serializable, so the answer is the lowest type possible, which is Object. I hope Roland cleans this up in his type system rework down the road.

To do the surgery on the speculative part, it seems like the only way to do that is to add a Type* cast_with_speculative(Type*) to the Type interface. Otherwise I don't see how we can clone all the little protected fields in TypeOopPtr. Or may be some idiom exists that already does that... But I don't know. @vnkozlov, what do you think?

@shipilev
Copy link
Member Author

Right. Is it odd that join_speculative produces AnyNull here? Anyhow, given the miscompilation is already observed in the wild, maybe the current patch is a good protective measure for the cases where Type join fails?

@veresov
Copy link
Contributor

veresov commented Jun 29, 2021

AnyNull seems odd, but that's what makes you fix work, making empty() return true.

@veresov
Copy link
Contributor

veresov commented Jun 29, 2021

Come to think of it, this pathology arises because one of these guys is an interface. The type system doesn't quite know how to deal with that. Perhaps we should guard for it and not attempt the join if the argument type is an interface?

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for JDK 17 it is good fix. It revert result type to one before 8191998 changes in case join failed. In this sense it is not worser then before 8191998. We can look more on this later. May be when Vladimir Ivanov back.

@openjdk
Copy link

openjdk bot commented Jun 30, 2021

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8269285: Crash/miscompile in CallGenerator::for_method_handle_inline after JDK-8191998

Reviewed-by: kvn, iveresov, vlivanov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 18 new commits pushed to the master branch:

  • 2b17e95: 8269691: ProblemList sun/management/jdp/JdpDefaultsTest.java on Linux-aarch64
  • 1da5d4b: 8269486: CallerAccessTest fails for non server variant
  • be0ac92: 8269614: [s390] Interpreter checks wrong bit for slow path instance allocation
  • 4b4bef4: 8269594: assert(_handle_mark_nesting > 1) failed: memory leak: allocating handle outside HandleMark
  • 4ee400a: 8268320: Better error recovery for broken patterns in switch
  • ca283c3: 8265907: JVM crashes when matching VectorMaskCmp Node
  • c3c9189: 8269141: Switch statement containing pattern case label element gets in the loop during execution
  • 6b64a79: 8268350: Remove assert that ensures thread identifier remains the same
  • 90eb118: 8269528: VectorAPI Long512VectorTest fails on X86 KNL target
  • a661686: 8269065: [REDO] vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java failed with OutOfMemoryError
  • ... and 8 more: https://git.openjdk.java.net/jdk17/compare/56240690f62f9048a45a53525efccffdec235a8d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 30, 2021
@vnkozlov
Copy link
Contributor

About joining with Interface type. Returning Object:AnyNull is normal answer from C2 for Interfaces. We can't handle them currently and it is not the goal of this patch.

@veresov
Copy link
Contributor

veresov commented Jun 30, 2021

Ok, if AnyNull is what is reliably returned for interfaces, then empty() is essentially equivalent to is_interface() in this case. Thanks for the explanation.

@iwanowww
Copy link

I'm not fond of the proposed fix. Though it works in this particular case, Type::empty() may give a wrong signal in some pathological/corner cases.

As an example,join/join_speculative may produce TOP when call site happens to be in effectively dead (or dying) part of the graph and paradoxical IR shapes can arise. In such case, there's no reason to continue inlining through the call site, but give up (return NULL) immediately.

In this particular case (sig_type represents an interface), it goes off the rails when arg_type->higher_equal(sig_type) returns false though arrays always implement Serializable. The cast node is redundant here and can be simply omitted. (Recently, JDK-8268371 addressed a similar problem uncovered by CHA-related changes.)

I'm in favor of explicitly filtering out interface types coming from signatures and simply omitting casts when interface is encountered.

@shipilev
Copy link
Member Author

I'm in favor of explicitly filtering out interface types coming from signatures and simply omitting casts when interface is encountered.

All right, would this be acceptable then?

--- a/src/hotspot/share/opto/callGenerator.cpp
+++ b/src/hotspot/share/opto/callGenerator.cpp
@@ -1148,9 +1148,10 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
         // Cast receiver to its type.
         if (!target->is_static()) {
           Node* arg = kit.argument(0);
+          ciKlass* sig_klass = signature->accessing_klass();
           const TypeOopPtr* arg_type = arg->bottom_type()->isa_oopptr();
-          const Type*       sig_type = TypeOopPtr::make_from_klass(signature->accessing_klass());
-          if (arg_type != NULL && !arg_type->higher_equal(sig_type)) {
+          const Type*       sig_type = TypeOopPtr::make_from_klass(sig_klass);
+          if (arg_type != NULL && !arg_type->higher_equal(sig_type) && !sig_klass->is_interface()) {
             const Type* recv_type = arg_type->join_speculative(sig_type); // keep speculative part
             Node* cast_obj = gvn.transform(new CheckCastPPNode(kit.control(), arg, recv_type));
             kit.set_argument(0, cast_obj);
@@ -1161,9 +1162,10 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
           ciType* t = signature->type_at(i);
           if (t->is_klass()) {
             Node* arg = kit.argument(receiver_skip + j);
+            ciKlass* sig_klass = t->as_klass();
             const TypeOopPtr* arg_type = arg->bottom_type()->isa_oopptr();
-            const Type*       sig_type = TypeOopPtr::make_from_klass(t->as_klass());
-            if (arg_type != NULL && !arg_type->higher_equal(sig_type)) {
+            const Type*       sig_type = TypeOopPtr::make_from_klass(sig_klass);
+            if (arg_type != NULL && !arg_type->higher_equal(sig_type) && !sig_klass->is_interface()) {
               const Type* narrowed_arg_type = arg_type->join_speculative(sig_type); // keep speculative part
               Node* cast_obj = gvn.transform(new CheckCastPPNode(kit.control(), arg, narrowed_arg_type));
               kit.set_argument(receiver_skip + j, cast_obj);

@shipilev
Copy link
Member Author

What I am concerned about with filtering interfaces in signatures, is that we are dealing with MethodHandle handling block. Would such filtering break when arg_type and sig_type are both interfaces?

@shipilev
Copy link
Member Author

shipilev commented Jun 30, 2021

Wait, actually I should just filter before calling join_speculative, which makes my concern moot. See new commit. It passes the regression test, I'll run tier1, tier2 now...

@iwanowww
Copy link

Would such filtering break when arg_type and sig_type are both interfaces?

What particular scenario do you have in mind? The check is performend against sig_klass only, so it covers the case when both arg_type and sig_type represent interfaces. Interface types aren't trusted in bytecode, so the worst case scenario is there'll be a repeated subtype check at runtime needed later in callee.

MethodHandle invocation performs an adhoc type check between call site signature and MethodHandle type (MH.type() == MT_call_site) and the relevant logic in CallGenerator::for_method_handle_inline() reifies its effects on argument types in IR.

@@ -1161,10 +1164,13 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
ciType* t = signature->type_at(i);
if (t->is_klass()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I suggested is to extend the guard with !t->as_klass()->is_interface() check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced it is safe. At least it does not look safe for JDK 17. The "must" in the comment block above makes me wary about skipping the casting for interfaces...

        // In lambda forms we erase signature types to avoid resolving issues
        // involving class loaders.  When we optimize a method handle invoke
        // to a direct call we must cast the receiver and arguments to its
        // actual types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what safety concerns you have in mind w.r.t interfaces. On bytecode level interface treatment is lax (e.g., verifier ignores interface type info; see JVMS-4.10.1.2 for details).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am mostly concerned about pushing a "less than obvious and wider than definitely required" fix to JDK 17. Filtering out by "empty join type" is quite probably too wide. Filtering out the interface arguments is much less obvious. Yes, I get that interface arguments in signatures might be filtered wholesale, but I would rather wait for more testing to see if that actually works.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly speaking, I wouldn't expect too much from testing considering the problem went unnoticed for 2,5 years.

But, as an additional data point, I'd like to reiterate that the fix for JDK-8268371 applied a similar strategy in a much more common scenario - inlining of interface methods - and so far there were no problems spotted.

@shipilev
Copy link
Member Author

OK, so there are basically three ways to fix it so far. In the draft form, they are:

  1. Handle empty() join result:
--- a/src/hotspot/share/opto/callGenerator.cpp
+++ b/src/hotspot/share/opto/callGenerator.cpp
@@ -1164,8 +1166,10 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
             const TypeOopPtr* arg_type = arg->bottom_type()->isa_oopptr();
             const Type*       sig_type = TypeOopPtr::make_from_klass(t->as_klass());
             if (arg_type != NULL && !arg_type->higher_equal(sig_type)) {
-              const Type* narrowed_arg_type = arg_type->join_speculative(sig_type); // keep speculative part
-              Node* cast_obj = gvn.transform(new CheckCastPPNode(kit.control(), arg, narrowed_arg_type));
+              // Keep the speculative part, but disallow it to empty the type
+              const Type* narrowed_arg_type = arg_type->join_speculative(sig_type);
+              const Type* cast_type = narrowed_arg_type->empty() ? sig_type : narrowed_arg_type;
+              Node* cast_obj = gvn.transform(new CheckCastPPNode(kit.control(), arg, cast_type));
               kit.set_argument(receiver_skip + j, cast_obj);
             }
           }
  1. Filter is_interface before performing the join, but still attempt to cast.
diff --git a/src/hotspot/share/opto/callGenerator.cpp b/src/hotspot/share/opto/callGenerator.cpp
index ab7f7897797..eaf19d37b72 100644
--- a/src/hotspot/share/opto/callGenerator.cpp
+++ b/src/hotspot/share/opto/callGenerator.cpp
@@ -1161,10 +1164,13 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
           ciType* t = signature->type_at(i);
           if (t->is_klass()) {
             Node* arg = kit.argument(receiver_skip + j);
             const TypeOopPtr* arg_type = arg->bottom_type()->isa_oopptr();
            const Type*       sig_type = TypeOopPtr::make_from_klass(t->as_klass()));
             if (arg_type != NULL && !arg_type->higher_equal(sig_type)) {
-              const Type* narrowed_arg_type = arg_type->join_speculative(sig_type); // keep speculative part
+              // Keep the speculative part, but do not join with interfaces
+              const Type* narrowed_arg_type = t->as_klass()->is_interface() ?
+                      sig_type : arg_type->join_speculative(sig_type);
               Node* cast_obj = gvn.transform(new CheckCastPPNode(kit.control(), arg, narrowed_arg_type));
               kit.set_argument(receiver_skip + j, cast_obj);
             }
  1. Filter is_interface arguments completely, i.e. do not even attempt the cast.
--- a/src/hotspot/share/opto/callGenerator.cpp
+++ b/src/hotspot/share/opto/callGenerator.cpp
@@ -1159,7 +1159,7 @@ CallGenerator* CallGenerator::for_method_handle_inline(JVMState* jvms, ciMethod*
         // Cast reference arguments to its type.
         for (int i = 0, j = 0; i < signature->count(); i++) {
           ciType* t = signature->type_at(i);
-          if (t->is_klass()) {
+          if (t->is_klass() && !t->as_klass()->is_interface()) {
             Node* arg = kit.argument(receiver_skip + j);
             const TypeOopPtr* arg_type = arg->bottom_type()->isa_oopptr();
             const Type*       sig_type = TypeOopPtr::make_from_klass(t->as_klass());

All three fix the regression test. All three pass tier1, tier2.

I think (2) is the least risky for JDK 17. Thoughts?

@iwanowww
Copy link

Also, another way to fix the immediate bug is to switch from join_speculative to filter_speculative which, among other things, tries to work around limitations imposed by interface types:

> p arg_type->dump()
byte[int:>=0]:exact * (speculative=byte[int:>=0]:NotNull:exact *)

> p sig_type->dump()
java/io/Serializable *

> p arg_type->join_speculative(sig_type)->dump()
java/lang/Object:AnyNull *,iid=top

> p arg_type->filter_speculative(sig_type)->dump()
java/io/Serializable *

@veresov
Copy link
Contributor

veresov commented Jun 30, 2021

I like having is_interface() there because it makes it easier to find this place later, when the type system is fixed.

@iwanowww
Copy link

After thinking more about it (and studying TypeOopPtr::filter_helper), switching to filter_speculative looks like the right way to fix the bug.

Also, it won't require any additional changes when the type system is fixed.

@shipilev
Copy link
Member Author

Cool, filter_speculative works for regression test. See new commit. I'll run tier{1,2} for it.

@iwanowww
Copy link

Also, it won't require any additional changes when the type system is fixed.

I take it back: the sole need for joining arg_type and sig_type is to work around limitations imposed by CheckCastPP. And the sole purpose of CheckCastPP (compared to CastPP) is to work around limitations imposed by interface types. Once the type system is enhanced with proper support of interface types, I believe it will become possible to get rid of CheckCastPP in favor of CastPP and there won't be any need in explicit joining of types in CallGenerator::for_method_handle_inline() since CastPPNode::Value() will do "the right thing" and keep the speculative part along the way.

BTW I don't see any reason why CheckCastPPNode::Value() can't propagate speculative part of the input type right away, but that's something to consider as a followup enhancement.

Copy link

@iwanowww iwanowww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

src/hotspot/share/opto/callGenerator.cpp Outdated Show resolved Hide resolved
@shipilev
Copy link
Member Author

tier1 and tier2 are fine with filter_speculative. @vnkozlov, @veresov -- are you good with new version?

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks good.

@veresov
Copy link
Contributor

veresov commented Jun 30, 2021

Looks good to me.

@shipilev
Copy link
Member Author

shipilev commented Jul 1, 2021

Thank you, all.

/integrate

@openjdk
Copy link

openjdk bot commented Jul 1, 2021

Going to push as commit c16d1fc.
Since your change was applied there have been 28 commits pushed to the master branch:

  • ad27d9b: 8269088: C2 fails with assert(!n->is_Store() && !n->is_LoadStore()) failed: no node with a side effect
  • c67a7b0: 8269230: C2: main loop in micro benchmark never executed
  • 962f1c1: 8262886: javadoc generates broken links with {@inheritdoc}
  • f7ffd58: 8267602: [macos] [lanai] java/awt/PrintJob/Text/stringwidth.sh doesn't exit on cancelling print dialog
  • 4930ae9: 8268592: JDK-8262891 causes an NPE in Lint.augment
  • 9ac63a6: 8262841: Clarify the behavior of PhantomReference::refersTo
  • aba6c55: 8269703: ProblemList vmTestbase/nsk/jvmti/scenarios/sampling/SP07/sp07t002/TestDescription.java on Windows-X64 with -Xcomp
  • 3e02224: 8269513: Clarify the spec wrt useOldISOCodes system property
  • 0dc65d3: 8268897: [TESTBUG] compiler/compilercontrol/mixed/RandomCommandsTest.java must not fail on Command.quiet
  • 3826012: 8268557: Module page uses unstyled table class
  • ... and 18 more: https://git.openjdk.java.net/jdk17/compare/56240690f62f9048a45a53525efccffdec235a8d...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jul 1, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 1, 2021
@openjdk
Copy link

openjdk bot commented Jul 1, 2021

@shipilev Pushed as commit c16d1fc.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@shipilev shipilev deleted the JDK-8269285-c2-speculative branch July 1, 2021 10:57
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.java.net integrated Pull request has been integrated
4 participants