Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 #16686

Closed
wants to merge 6 commits into from

Conversation

sspitsyn
Copy link
Contributor

@sspitsyn sspitsyn commented Nov 16, 2023

This is a fix of a performance/scalability related issue. The JvmtiThreadState objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the interp_only_mode is enabled. Otherwise, some events which are generated in interp_only_mode from the debugging version of interpreter chunks can be missed.
However, it has to be okay to avoid eager creation of these object if no interp_only_mode has ever been requested.
It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases.
It is done by introducing the flag JvmtiThreadState::_seen_interp_only_mode which indicates when the JvmtiThreadState objects have to be created eagerly.

Additionally, the fix includes the following related changes:

  • Use condition double checking idiom for MutexLocker mu(JvmtiThreadState_lock) in the function JvmtiVTMSTransitionDisabler::VTMS_mount_end which is on a performance-critical path and looks like this:
  JvmtiThreadState* state = thread->jvmti_thread_state();
  if (state != nullptr && state->is_pending_interp_only_mode()) {
    MutexLocker mu(JvmtiThreadState_lock);
    state = thread->jvmti_thread_state();
    if (state != nullptr && state->is_pending_interp_only_mode()) {
      JvmtiEventController::enter_interp_only_mode();
    }
  }
  • Add extra check of JvmtiExport::can_support_virtual_threads() when virtual thread mount and unmount are posted.
  • Minor: Added a ThreadsListHandle to the JvmtiEventControllerPrivate::enter_interp_only_mode. It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI SetEventNotificationMode implementation.

Performance mesurements:

  • Without this fix the test provided by the bug submitter gives execution numbers:

    • no ClassLoad events enabled: 3251 ms
    • ClassLoad events are enabled: 40534 ms
  • With the fix:

    • no ClassLoad events enabled: 3270 ms
    • ClassLoad events are enabled: 3385 ms

Testing:

  • Ran mach5 tiers 1-6, no regressions are noticed

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16686/head:pull/16686
$ git checkout pull/16686

Update a local copy of the PR:
$ git checkout pull/16686
$ git pull https://git.openjdk.org/jdk.git pull/16686/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16686

View PR using the GUI difftool:
$ git pr show -t 16686

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16686.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 16, 2023

👋 Welcome back sspitsyn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 16, 2023
@openjdk
Copy link

openjdk bot commented Nov 16, 2023

@sspitsyn The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Nov 16, 2023
@mlbridge
Copy link

mlbridge bot commented Nov 16, 2023

Webrevs

src/hotspot/share/prims/jvmtiThreadState.hpp Outdated Show resolved Hide resolved
src/hotspot/share/prims/jvmtiThreadState.cpp Outdated Show resolved Hide resolved
src/hotspot/share/prims/jvmtiThreadState.cpp Outdated Show resolved Hide resolved
src/hotspot/share/prims/jvmtiThreadState.hpp Outdated Show resolved Hide resolved
src/hotspot/share/prims/jvmtiEventController.cpp Outdated Show resolved Hide resolved
@dcubed-ojdk
Copy link
Member

dcubed-ojdk commented Nov 29, 2023

I'm going to resurrect the failing guarantee() code and part of the stack trace that was removed
and yack a bit about this code path.

Here's the location of the failing guarantee():

void Handshake::execute(HandshakeClosure* hs_cl, ThreadsListHandle* tlh, JavaThread* target) {
  . . .
  guarantee(target != nullptr, "must be");
  if (tlh == nullptr) {
    guarantee(Thread::is_JavaThread_protected_by_TLH(target),
              "missing ThreadsListHandle in calling context.");

and here's part of the stack trace that got us here:

V  [libjvm.so+0x117937d]  JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState*)+0x45d  (jvmtiEventController.cpp:402)
V  [libjvm.so+0x1179520]  JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState*) [clone .part.0]+0x190  (jvmtiEventController.cpp:632)
V  [libjvm.so+0x117a1e1]  JvmtiEventControllerPrivate::thread_started(JavaThread*)+0x351  (jvmtiEventController.cpp:1174)
V  [libjvm.so+0x117e608]  JvmtiExport::get_jvmti_thread_state(JavaThread*)+0x98  (jvmtiExport.cpp:424)
V  [libjvm.so+0x118a86c]  JvmtiExport::post_field_access(JavaThread*, Method*, unsigned char*, Klass*, Handle, _jfieldID*)+0x6c  (jvmtiExport.cpp:2214)

This must have been a stack trace from a build with some optimizations enabled because
when I look at last night's code base, I see 8 frames from the JvmtiExport::get_jvmti_thread_state()
call to Handshake::execute() with three params:

src/hotspot/share/prims/jvmtiExport.cpp:
    JvmtiExport::get_jvmti_thread_state(JavaThread *thread) {
      assert(thread == JavaThread::current(), "must be current thread");
      if (thread->is_vthread_mounted() && thread->jvmti_thread_state() == nullptr) {
        JvmtiEventController::thread_started(thread);
      }

The above code asserts that the thread parameter is the current thread so
we know that the calling thread is operating on itself.

src/hotspot/share/prims/jvmtiEventController.cpp
    JvmtiEventControllerPrivate::thread_started(JavaThread *thread) {
      assert(thread == Thread::current(), "must be current thread");
<snip>
      // if we have any thread filtered events globally enabled, create/update the thread state
      if (is_any_thread_filtered_event_enabled_globally()) { // intentionally racy
        JvmtiThreadState::state_for(thread);

The above code asserts that the thread parameter is the current thread so
we know that the calling thread is operating on itself. Note that we're calling
the single parameter version of JvmtiThreadState::state_for() here and in
that case the thread_handle parameter is Handle thread_handle = Handle().

src/hotspot/share/prims/jvmtiThreadState.inline.hpp
    inline JvmtiThreadState* JvmtiThreadState::state_for(JavaThread *thread, Handle thread_handle) {
      // In a case of unmounted virtual thread the thread can be null.
      JvmtiThreadState* state = thread_handle == nullptr ? thread->jvmti_thread_state() :
                                                    java_lang_Thread::jvmti_thread_state(thread_handle());
      if (state == nullptr) {
        MutexLocker mu(JvmtiThreadState_lock);
        // check again with the lock held
        state = state_for_while_locked(thread, thread_handle());
        JvmtiEventController::recompute_thread_filtered(state);

The above code grabs the JvmtiThreadState from the thread parameter and
passes that to the JvmtiEventController::recompute_thread_filtered() call.
We know that thread parameter is the current thread.

src/hotspot/share/prims/jvmtiEventController.cpp
    void
    JvmtiEventControllerPrivate::recompute_thread_filtered(JvmtiThreadState *state) {
<snip>
      if (is_any_thread_filtered_event_enabled_globally()) {
        JvmtiEventControllerPrivate::recompute_thread_enabled(state);

The above code is just a filter wrapper for calling JvmtiEventControllerPrivate::recompute_thread_enabled(.

src/hotspot/share/prims/jvmtiEventController.cpp
    jlong
    JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *state) {
<snip>
      // compute interp_only mode
      bool should_be_interp = (any_env_enabled & INTERP_EVENT_BITS) != 0 || has_frame_pops;
      bool is_now_interp = state->is_interp_only_mode();

      if (should_be_interp != is_now_interp) {
        if (should_be_interp) {
          enter_interp_only_mode(state);

The above code determines that the current thread needs to be in
interpreted mode so it calls enter_interp_only_mode(state).

src/hotspot/share/prims/jvmtiEventController.cpp
    void JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *state) {
<snip>
      JavaThread *target = state->get_thread();
      Thread *current = Thread::current();
<snip>
      if (target->is_handshake_safe_for(current)) {
        hs.do_thread(target);
      } else {
        assert(state->get_thread() != nullptr, "sanity check");
        Handshake::execute(&hs, target);

We know from our previous code analysis that the JvmtiThreadState *state
we were passed was fetched from the current thread. See JvmtiThreadState::state_for
above. So that target thread and the current should be the same thread.

Why does this check return false:

    if (target->is_handshake_safe_for(current)) {

which allows us to travel down this call: Handshake::execute(&hs, target)

src/hotspot/share/runtime/handshake.cpp
    void Handshake::execute(HandshakeClosure* hs_cl, JavaThread* target) {
      // tlh == nullptr means we rely on a ThreadsListHandle somewhere
      // in the caller's context (and we sanity check for that).
      Handshake::execute(hs_cl, nullptr, target);
    }

The two parameter version of Handshake::execute() is just a wrapper
that passed a nullptr for the ThreadsListHandle to the three parameter
version of Handshake::execute(). And that's how we get to the failing
guarantee()...

src/hotspot/share/runtime/handshake.cpp
    void Handshake::execute(HandshakeClosure* hs_cl, ThreadsListHandle* tlh, JavaThread* target) {
      JavaThread* self = JavaThread::current();
      HandshakeOperation op(hs_cl, target, Thread::current());

      jlong start_time_ns = os::javaTimeNanos();

      guarantee(target != nullptr, "must be");
      if (tlh == nullptr) {
        guarantee(Thread::is_JavaThread_protected_by_TLH(target),
                  "missing ThreadsListHandle in calling context.");
        target->handshake_state()->add_operation(&op);

One possible fix for the guarantee is this version:

        guarantee(self == target || Thread::is_JavaThread_protected_by_TLH(target),
                  "missing ThreadsListHandle in calling context.");

However, that ignores why this code in JvmtiEventControllerPrivate::enter_interp_only_mode
returned false:

    if (target->is_handshake_safe_for(current)) {

when we have these local variable values:

      JavaThread *target = state->get_thread();
      Thread *current = Thread::current();
src/hotspot/share/runtime/javaThread.hpp
      // A JavaThread can always safely operate on it self and other threads
      // can do it safely if they are the active handshaker.
      bool is_handshake_safe_for(Thread* th) const {
        return _handshake.active_handshaker() == th || this == th;
      }

It seems to me that this portion of the logic: this == th should have
returned true and not false. What am I missing here?

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Nov 30, 2023

@dcubed-ojdk
Thank you for the analysis. I agree with it. It is why I've removed this stack trace and posted my statement:

I've removed a part of this comment with stack traces as my traps were not fully correct, need to double check everything. > This issue is not well reproducible but I'm still trying to reproduce it again.

Initially, I added a trap to catch the issue with the JVMTI SetEventNotificationMode related code path and mistakenly thought that I've caught it. But it was a different code path that you just described. So. I've removed this part of my comment as it was wrong. Then I tried to reproduce the code path I wanted to catch but had no luck to reproduce it.
Now, I'm going to remove this line of code from my PR we are discussing.
I'm suggesting to investigate the issue with guarantee separately from this PR.
Is it okay with you? (I'm sorry, if my previous comments were not clear.)

Just to comment on this code:

      if (target->is_handshake_safe_for(current)) {
        hs.do_thread(target);
      } else {
        assert(state->get_thread() != nullptr, "sanity check");
        Handshake::execute(&hs, target);

The condition target->is_handshake_safe_for(current) should return true for current thread, so this statement has to be executed: hs.do_thread(target);

Also, I had a general note (kind of unrelated to the line we discussed) about this guarantee:

void Handshake::execute(HandshakeClosure* hs_cl, ThreadsListHandle* tlh, JavaThread* target) {
  . . .
  guarantee(target != nullptr, "must be");
  if (tlh == nullptr) {
    guarantee(Thread::is_JavaThread_protected_by_TLH(target),
              "missing ThreadsListHandle in calling context.");

I think, this fragment should account that the target passed as argument can be current.
Should it be extended to check for current thread and avoid guarantee firing in such a case?

@dcubed-ojdk
Copy link
Member

Removing the TLH is the right thing to do. If we get that failure mode again, then we
can file a new bug and hopefully have an hs_err_pid file with it.

I don't think we should change the guarantee() in Handshake::execute(). When the
three parameter version of execute() is called with tlh == nullptr, the caller is
saying that there is supposed to be a ThreadsListHandle in the calling context. Yes,
if the target thread is the calling thread, then a ThreadsListHandle is not needed,
but that's why we have this code to prevent the call to Handshake::execute():

      if (target->is_handshake_safe_for(current)) {
        hs.do_thread(target);

In other words, I think Handshake::execute() is working the way it is supposed to
when tlh == nullptr is passed.

if (JvmtiThreadState::seen_interp_only_mode()) {
JvmtiEventController::thread_started(thread);
}
if (JvmtiExport::should_post_vthread_start()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this check: if (JvmtiExport::can_support_virtual_threads()) { removed?

Copy link
Contributor Author

@sspitsyn sspitsyn Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because the JvmtiExport::should_post_vthread_start() returns true only if the JvmtiExport::can_support_virtual_threads() returns true.

The JVMTI SetEventNotificationMode() function checks for a required event capability by using the JvmtiUtil::has_event_capability() function that is defined in the jvmtiEnter.cpp generated with the jvmtiEnter.xsl script.

It looks like this:

// Check Event Capabilities
bool JvmtiUtil::has_event_capability(jvmtiEvent event_type, const jvmtiCapabilities* capabilities_ptr) {
  switch (event_type) {
    . . .
    case JVMTI_EVENT_VIRTUAL_THREAD_START:
      return capabilities_ptr->can_support_virtual_threads != 0;
    case JVMTI_EVENT_VIRTUAL_THREAD_END:
      return capabilities_ptr->can_support_virtual_threads != 0;
    . . .

src/hotspot/share/prims/jvmtiThreadState.cpp Show resolved Hide resolved
@@ -231,6 +231,8 @@ class JvmtiThreadState : public CHeapObj<mtInternal> {
inline JvmtiEnvThreadState* head_env_thread_state();
inline void set_head_env_thread_state(JvmtiEnvThreadState* ets);

static bool _seen_interp_only_mode; // interp_only_mode was requested once
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps: s/requested once/requested at least once/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you - fixed now.

src/hotspot/share/prims/jvmtiThreadState.cpp Outdated Show resolved Hide resolved
@sspitsyn
Copy link
Contributor Author

Removing the TLH is the right thing to do. If we get that failure mode again, then we
can file a new bug and hopefully have an hs_err_pid file with it.

Agreed.

@sspitsyn
Copy link
Contributor Author

I don't think we should change the guarantee() in Handshake::execute(). When the
three parameter version of execute() is called with tlh == nullptr, the caller is
saying that there is supposed to be a ThreadsListHandle in the calling context. Yes,
if the target thread is the calling thread, then a ThreadsListHandle is not needed,
but that's why we have this code to prevent the call to Handshake::execute():
if (target->is_handshake_safe_for(current)) {
hs.do_thread(target);

In other words, I think Handshake::execute() is working the way it is supposed to
when tlh == nullptr is passed.

Just to share my view...
It is a little bit ugly to do it for each call site.
The Handshake::execute() can do it instead, so its call sites could be simplified.
BTW, it is done in the JvmtiHandshake::execute() and one can find it to be convenient.

@dcubed-ojdk
Copy link
Member

It is a little bit ugly to do it for each call site.
The Handshake::execute() can do it instead, so its call sites could be simplified.
BTW, it is done in the JvmtiHandshake::execute() and one can find it to be convenient.

Agreed, but it is intentional that direct uses of the three parameter
version of Handshake::execute() with tlh == nullptr require that
the caller understands the calling context. That means knowing where
your ThreadsListHandle is and not getting away with not having one
when the caller is current thread.

Higher layer callers of Handshake::execute() like JvmtiHandshake::execute()
or JvmtiEventControllerPrivate::enter_interp_only_mode() can make that
less strict choice if they choose to.

Copy link
Member

@dcubed-ojdk dcubed-ojdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the comments.

@openjdk
Copy link

openjdk bot commented Nov 30, 2023

@sspitsyn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10

Reviewed-by: dcubed, cjplummer, amenkov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 222 new commits pushed to the master branch:

  • 92f7e51: 8312098: Update man page for javadoc
  • f6be7fd: 8321059: Unneeded array assignments in MergeCollation and CompactByteArray
  • d568562: 8320798: Console read line with zero out should zero out underlying buffer
  • 3087e14: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics
  • 54957ac: 8312560: Annotation on Decomposed Record Component in Enhanced For Loop Fails Compilation
  • 3b30095: 8321130: Microbenchmarks do not build any more after 8254693 on 32 bit platforms
  • 8f1d40b: 8321114: Rename "Unnamed Classes" to "Implicitly Declared Classes" better
  • ecd335d: 8321107: Add more test cases for JDK-8319372
  • 2476be4: 8320716: ResolvedModule::reads includes self when configuration contains two or more automatic modules
  • 4ba94ef: 8320145: Compiler should accept final variable in Record Pattern
  • ... and 212 more: https://git.openjdk.org/jdk/compare/faeea07fe5d27e0c18c26f99705cc552e5ab9bdc...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 30, 2023
@sspitsyn
Copy link
Contributor Author

Thank you for review, Dan!

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Dec 1, 2023

Chris and Alex, thank you for review!

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Dec 1, 2023

/integrate

@openjdk
Copy link

openjdk bot commented Dec 1, 2023

Going to push as commit 42af8ce.
Since your change was applied there have been 223 commits pushed to the master branch:

  • 1839433: 8320941: Discuss receiver type handling
  • 92f7e51: 8312098: Update man page for javadoc
  • f6be7fd: 8321059: Unneeded array assignments in MergeCollation and CompactByteArray
  • d568562: 8320798: Console read line with zero out should zero out underlying buffer
  • 3087e14: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics
  • 54957ac: 8312560: Annotation on Decomposed Record Component in Enhanced For Loop Fails Compilation
  • 3b30095: 8321130: Microbenchmarks do not build any more after 8254693 on 32 bit platforms
  • 8f1d40b: 8321114: Rename "Unnamed Classes" to "Implicitly Declared Classes" better
  • ecd335d: 8321107: Add more test cases for JDK-8319372
  • 2476be4: 8320716: ResolvedModule::reads includes self when configuration contains two or more automatic modules
  • ... and 213 more: https://git.openjdk.org/jdk/compare/faeea07fe5d27e0c18c26f99705cc552e5ab9bdc...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 1, 2023
@openjdk openjdk bot closed this Dec 1, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 1, 2023
@openjdk
Copy link

openjdk bot commented Dec 1, 2023

@sspitsyn Pushed as commit 42af8ce.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@sspitsyn sspitsyn deleted the b10 branch January 23, 2024 01:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
5 participants