Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8274983: C1 optimizes the invocation of private interface methods #6445

Closed
wants to merge 6 commits into from

Conversation

navyxliu
Copy link
Member

@navyxliu navyxliu commented Nov 18, 2021

The root cause of the C1 regression is that some regex generate multiple classes which all implement
an interface. In SlowStartupTest.java, the following invokeinterface happens frequently with different receivers. the target is a private interface method.

9: invokeinterface #25,  3           // InterfaceMethod java/util/regex/Pattern$BmpCharPredicate.lambda$union$2:(Ljava/util/regex/Pattern$CharPredicate;I)Z

This patch allows c1 to generate the optimized virtual call for invokeinterface
whose targets are the private interface methods.

Before JDK-823835, LambdaMetaFactory generates invokespecial in this case. Because the private
interface methods can not be overrided, c1 generates the optimized virtual call. After JDK-823835,
LambdaMetaFactory generates invokeinterface instead. C1 generates the regular virtual call because
it can not recognize the new pattern. If a multiple of subclasses all implement a same interface,
it is possible that they trash the IC stub using their own concrete klass in runtime.

Optimized virtual call uses relocInfo::opt_virtual_call_type(3), It will call VM
'resolve_opt_virtual_call_C' once and resolve the target to the VEP of the nmethod.
Therefore, this patch can prevent the callsite from trashing.

Before this patch, SlowStartupTest had 38770 times _resolve_invoke_virtual_cnt and 38695 _handle_wrong_method_cnt per 10k iterations. To dump C1Statistics, we use fastdebug build for comparison.

$java -XX:TieredStopAtLevel=1 -XX:+PrintC1Statistics SlowStartupTest 1
Executed 10000 iterations in 736ms
C1 Runtime statistics:
 _resolve_invoke_virtual_cnt:     38770
 _resolve_invoke_opt_virtual_cnt: 186
 _resolve_invoke_static_cnt:      44
 _handle_wrong_method_cnt:        38695
 _ic_miss_cnt:                    35

With this patch, only 1 _handle_wrong_method_cnt is triggered but we have 3 more _resolve_invoke_opt_virtual_cnt events instead. The total runtime reduces from 736ms to 9ms.

$java -XX:TieredStopAtLevel=1 -XX:+PrintC1Statistics SlowStartupTest 1
Executed 10000 iterations in 9ms
C1 Runtime statistics:
 _resolve_invoke_virtual_cnt:     77
 _resolve_invoke_opt_virtual_cnt: 189
 _resolve_invoke_static_cnt:      45
 _handle_wrong_method_cnt:        1
 _ic_miss_cnt:                    39

Codegen wise, before the patch, C1 generates LIR for the invokeinterface whose target is a private interface methoda as follows.

__bci__use__tid____instr____________________________________
. 1    0    v2     a1.invokeinterface()
                   InvokePrivateInterfaceMethod$I.bar()V
. 6    0    v3     return

With this patch, C1 generates LIR as follows. it first check a1 is a subtype of InvokePrivateInterfaceMethod$I. if so, an optimized virtual call is generated. The callsite will be fixed up once and only one time in runtime.

__bci__use__tid____instr____________________________________
. 1    1    a2     checkcast(a1) InvokePrivateInterfaceMethod$I
                   stack [0:a1]
. 1    0    v3     a2.invokeinterface()
                   InvokePrivateInterfaceMethod$I.bar()V
. 6    0    v4     return

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8274983: C1 optimizes the invocation of private interface methods

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6445/head:pull/6445
$ git checkout pull/6445

Update a local copy of the PR:
$ git checkout pull/6445
$ git pull https://git.openjdk.java.net/jdk pull/6445/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6445

View PR using the GUI difftool:
$ git pr show -t 6445

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6445.diff

This patch allows c1 to generate the optimized virtual call for invokeinterface
whose targets are the private interface methods.

Before JDK-823835, LambdaMetaFactory generates invokespecial in this case. Because the private
interface methods can not be overrided, c1 generates the optimized virtual call. After JDK-823835,
LambdaMetaFactory generates invokeinterface instead. C1 generates the regular virtual call because
it can not recognize the new pattern. If a multiple of subclasses all implement a same interface,
it is possible that they trash the IC stub using their own concrete klass in runtime.

Optimized virtual call uses relocInfo::opt_virtual_call_type(3), It will call VM
'resolve_opt_virtual_call_C' once and resolve the target to the VEP of the nmethod.
Therefore, this patch can prevent the callsite from trashing.
@bridgekeeper
Copy link

bridgekeeper bot commented Nov 18, 2021

👋 Welcome back xliu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 18, 2021
@openjdk
Copy link

openjdk bot commented Nov 18, 2021

@navyxliu The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Nov 18, 2021
@mlbridge
Copy link

mlbridge bot commented Nov 18, 2021

Webrevs

@dean-long
Copy link
Member

dean-long commented Nov 20, 2021

It looks like this does what we want for private interface methods, but I'm wondering if we handle all combinations of private/final and invokevirtual/invokeinterface, or are we missing some cases where can_be_statically_bound() would return true?

@navyxliu
Copy link
Member Author

navyxliu commented Nov 20, 2021

hi, @dean-long,

I think C1 covers all cases as long as the target method is loaded. I have seen cases which target methods haven't been loaded in startup time, but they are rare.

ciMethod::can_be_statically_bound() return true if the method is private or final. The matrix shows the modifiers of target methods.

final private
invokevirtual 1 2
invokespecial N/A1 3
invokeinterface N/A2 4
  1. generates the optimized virtual call because x->target_is_final() is true.
  2. transforms to invokespecial here , then it will be case 3.
  3. generates the optimize virtual call because x->code() == Bytecodes::_invokespecial is true.
  4. is what this patch covers.

NA-1. I think it's impossible for javac. it would be an optimized virtual call like case 3 even it existed.
NA-2: it's an illegal modifier for an interface method.
https://docs.oracle.com/javase/specs/jls/se17/html/jls-9.html#jls-InterfaceMethodModifier

@dean-long
Copy link
Member

dean-long commented Nov 20, 2021

Thanks @navyxliu. I wonder if we can do 2) and 3) for invokeinterface, simplying the patch. Something like:

  // Some methods are obviously bindable without any type checks so
  // convert them directly to an invokespecial or invokestatic.
  if (target->is_loaded() && !target->is_abstract() && target->can_be_statically_bound()) {
    switch (bc_raw) {
    case Bytecodes::_invokevirtual:
    case Bytecodes::_invokeinterface:: // XXX add invokeinterface here
      code = Bytecodes::_invokespecial;
      break;

[...]

  // invoke-special-super
  if (code == Bytecodes::_invokespecial && !target->is_object_initializer()) { // XXX use "code" here
    ciInstanceKlass* sender_klass = calling_klass;
    if (sender_klass->is_interface()) {

[...]

What do you think?

@navyxliu navyxliu changed the title 8274983: Pattern.matcher performance regression after JDK-8238358 8274983: C1 optimizes the invocation of private interface methods Nov 24, 2021
@navyxliu
Copy link
Member Author

navyxliu commented Nov 24, 2021

hi, @dean-long ,
I update the PR. The new revision generate Invoke HIR with code == Invokespecial. I don't need to touch the LIR part, so I revert them.

I call c->set_incompatible_class_change_check() instead of set_invokespecial_receiver_check(). this will call a runtime to throw ICCError directly. Before that, it went to uncommon_trap and reinterpreted the bytecode.

src/hotspot/share/c1/c1_GraphBuilder.cpp Outdated Show resolved Hide resolved
Copy link
Member

@dean-long dean-long left a comment

As far as I can tell, this now matches what C2 does.

@dean-long
Copy link
Member

dean-long commented Nov 25, 2021

/reviewers 2

@openjdk
Copy link

openjdk bot commented Nov 25, 2021

@navyxliu This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274983: C1 optimizes the invocation of private interface methods

Reviewed-by: dlong, iveresov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 50 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 25, 2021
@openjdk
Copy link

openjdk bot commented Nov 25, 2021

@dean-long
The number of required reviews for this PR is now set to 2 (with at least 1 of role reviewers).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Nov 25, 2021
@openjdk openjdk bot removed the rfr Pull request is ready for review label Nov 29, 2021
@navyxliu
Copy link
Member Author

navyxliu commented Nov 29, 2021

I add a JMH benchmark for this one.

$make test TEST='micro:org.openjdk.bench.vm.compiler.InterfacePrivateCalls' CONF=linux-x86_64-server-release

//master(7b2d823e)
Benchmark                                             Mode  Cnt   Score   Error  Units
InterfacePrivateCalls.invokePrivateInterfaceMethodC1  avgt    5  11399.058 ± 1086.745  ns/op
InterfacePrivateCalls.invokePrivateInterfaceMethodC2  avgt    5  23.741 ± 0.189  ns/op

// with this patch
Benchmark                                             Mode  Cnt   Score   Error  Units
InterfacePrivateCalls.invokePrivateInterfaceMethodC1  avgt    5  24.534 ± 0.130  ns/op
InterfacePrivateCalls.invokePrivateInterfaceMethodC2  avgt    5  23.800 ± 0.384  ns/op

The average cost of C1 has improved from 11399 ns/op to 24.534 ns/op, or 464x faster. now the cost of c1 is comparable to C2.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 29, 2021
Copy link
Contributor

@veresov veresov left a comment

Seems reasonable.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 29, 2021
@navyxliu
Copy link
Member Author

navyxliu commented Nov 30, 2021

/integrate

@openjdk
Copy link

openjdk bot commented Nov 30, 2021

Going to push as commit 21d9ca6.
Since your change was applied there have been 63 commits pushed to the master branch:

  • 98a9f03: 8277602: Deopt code does not extend the stack enough if the caller is an optimize entry blob
  • 9150840: 8277899: Parallel: Simplify PSVirtualSpace::initialize logic
  • 01cefc9: 8277977: Incorrect references to --enable-reproducible-builds in docs
  • 69f56a0: 8264485: build.tools.depend.Depend.toString(byte[]) creates malformed hex strings
  • fecf906: 8267928: Loop predicate gets inexact loop limit before PhaseIdealLoop::rc_predicate
  • a5f2a58: 8277846: Implement fast-path for ASCII-compatible CharsetEncoders on ppc64
  • ceae380: 8277843: [Vector API] scalar2vector generates incorrect type info for mask operations if Op_MaskAll is unavailable
  • 3ee26c6: 8267767: Redundant condition check in SafepointSynchronize::thread_not_running
  • d230fee: 8277931: Parallel: Remove unused PSVirtualSpace::expand_into
  • fde6fe7: 8277824: Remove empty RefProcSubPhasesWorkerTimeTracker destructor
  • ... and 53 more: https://git.openjdk.java.net/jdk/compare/7b2d823e842e6a66dbe46b048da44ca9e5485c75...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Nov 30, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 30, 2021
@openjdk
Copy link

openjdk bot commented Nov 30, 2021

@navyxliu Pushed as commit 21d9ca6.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
3 participants