Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8281615: Deadlock caused by jdwp agent #7461

Closed
wants to merge 6 commits into from

Conversation

zhengyu123
Copy link
Contributor

@zhengyu123 zhengyu123 commented Feb 14, 2022

There are scenarios that JDWP agent can deadlock on classTrackLock monitor. Following is the scenario in bug report.

Java Thread

  • loads a class and post JVMTI_EVENT_CLASS_PREPARE event
  • JDWP event callback handler calls classTrack_processUnloads() to handle the event.
  • classTrack_processUnloads() takes classTrackLock lock, then tries to allocate a new bag under the lock.
  • bag allocation code calls jvmtiAllocate(), which may be blocked by ongoing safepoint due to state transition.

If the safepoint is GC safepoint (prior to JDK16) or VM_JvmtiPostObjectFree safepoint (JDK16 or later)

VM Thread

  • post JVMTI_EVENT_OBJECT_FREE
  • JDWP event callback handler calls cbTrackingObjectFree() to handle the event
  • cbTrackingObjectFree() tries to acquire classTrackLock lock, leads to deadlock

From my research, there are three events that may be posted at safepoints, JVMTI_EVENT_GARBAGE_COLLECTION_START, JVMTI_EVENT_GARBAGE_COLLECTION_FINISH and JVMTI_EVENT_OBJECT_FREE, but only JVMTI_EVENT_OBJECT_FREE is relevant to JDWP agent.

The solution I purpose here, is simply move allocation/deallocation code outside of classTrackLock lock.

Test:

  • tier1
  • vmTestbase_nsk_jdi
  • vmTestbase_nsk_jdwp
  • vmTestbase_nsk_jvmti

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7461/head:pull/7461
$ git checkout pull/7461

Update a local copy of the PR:
$ git checkout pull/7461
$ git pull https://git.openjdk.java.net/jdk pull/7461/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7461

View PR using the GUI difftool:
$ git pr show -t 7461

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7461.diff

Zhengyu Gu added 2 commits Feb 12, 2022
@bridgekeeper
Copy link

bridgekeeper bot commented Feb 14, 2022

👋 Welcome back zgu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Feb 14, 2022
@openjdk
Copy link

openjdk bot commented Feb 14, 2022

@zhengyu123 The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the serviceability label Feb 14, 2022
@mlbridge
Copy link

mlbridge bot commented Feb 14, 2022

Webrevs

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Hi Zhengyu,

I don't think the fix is correct - see below.

Thanks,
David

@@ -88,8 +96,18 @@ classTrack_processUnloads(JNIEnv *env)
return NULL;
}
struct bag* deleted = deletedSignatures;
deletedSignatures = bagCreateBag(sizeof(char*), 10);
deletedSignatures = NULL;
debugMonitorExit(classTrackLock);
Copy link
Member

@dholmes-ora dholmes-ora Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks risky as the critical section is broken and the NULL deleted signatures exposed. If cbTrackingObjectFree occurs while this is true then you will lose the record of the deleted signature.

Alternatively you could allow for lock-free reading of deletedSignatures, preemptively allocate a new bad if needed then take the lock. Or even use the lock to read deletedSignatures to determine if a new bag is needed, then drop the lock, create the bag, take the lock and re-check everything.

Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -194,8 +212,9 @@ classTrack_initialize(JNIEnv *env)
void
classTrack_activate(JNIEnv *env)
{
struct bag* new_bag = bagCreateBag(sizeof(char*), 1000);
Copy link
Member

@dholmes-ora dholmes-ora Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there can be any race during activation but this change is harmless.

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure classTrack_activate() has the problem, but it is called outside of initialization, inside debug loop ... so I assume that event callback might happen.

Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This code is called when the debug agent receives a CLASS_UNLOAD EventRequest (which the debug agent handles by requesting JVMTI GC_FINISH events). So there is nothing that prevents classTrack_processUnloads() from being called at the same time.

Copy link
Member

@dholmes-ora dholmes-ora Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case the classTrack_processUnloads is racing with the classTrack_activate call. I would not expect it to be possible to generate an event in relation to this before things have been "activated". But moving on ...

Copy link
Member

@dholmes-ora dholmes-ora Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I see from later comments that in fact no such race exists, but we have to synchronize with cbTrackingObjectFree

Copy link
Member

@dholmes-ora dholmes-ora Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah! But then a comment after that says it does exist again.

There seem to be issues understanding exactly what the concurrent call sequences can be with this code.

Copy link
Contributor

@plummercj plummercj Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be confusing classTrack_activate() with classTrack_reset(). I believe the former has a race but the latter does not.

Copy link
Contributor

@plummercj plummercj Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and even with the race, I still I think it is safe for a new bagSize(deletedSignatures) == 0 check to be added outside the lock. The only thing classTrack_activate() can do to deletedSignatures is set it when it is null. There is no risk of deletedSignatures being deleted and reallocated by another thread once you are in classTrack_processUnloads(). This is because it is called while holding the handlerLock, and as I explained elsewhere, by the time classTrack_reset() is called, it's no longer possible to be in classTrack_processUnloads().

bagEnumerateOver(deletedSignatures, cleanDeleted, NULL);
bagDestroyBag(deletedSignatures);
to_delete = deletedSignatures;
Copy link
Member

@dholmes-ora dholmes-ora Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanDeleted also calls jvmtiDeallocate, so either both need to be outside the lock or neither.

It is really unclear to me how many threads can be involved here and which functions can be called when.

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Moved bagEnumerateOver() outside of lock as well.

@plummercj
Copy link
Contributor

plummercj commented Feb 15, 2022

Slight correction to the following in your description:

JDWP event callback handler calls classTrack_processUnloads() to handle the event.

classTrack_processUnloads() isn't called to handle the event. It's called as a side affect of handling the event.

/*
* Note: jvmtiAllocate/jvmtiDeallocate() may be blocked by ongoing safepoints.
* It is dangerous to call them (via bagCreateBag/bagDestroyBag())while holding monitor(s),
* because jvmti may post events, e.g. JVMTI_EVENT_OBJECT_FREE at safepoints and its event
* handler may acquire the same monitor(s), e.g. classTrackLock in cbTrackingObjectFree(),
* which can lead to deadlock.
*/
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug agent does a lot alloc/dealloc calls while processing JVMTI events and holding locks. So the question is why this is not problematic other than this classTrackLock issue. I believe the answer is that cbTrackingObjectFree() is unique in that it is handled outside of the debug agent's normal event handling code, which is serialized. cbTrackingObjectFree() is called so the debug agent can maintain a list of loaded classes, and is not an event that gets passed on to the debugger like most JVMTI events are. So we have a case here where classTrackLock can be grabbed by both "typical" JVMTI event handling code via the classTrack_processUnloads() call, and then this special cbTrackingObjectFree() event handling.

I think having this comment here doesn't help the reader of the code below unless they somehow read the comment first and then recognized it's application in the code below. At the very least the code below should tersely state while the lock is being released, and then refer to this comment for details.

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved comments inside classTrack_processUnloads(), let me know if you prefer other places.

@@ -88,8 +96,18 @@ classTrack_processUnloads(JNIEnv *env)
return NULL;
}
struct bag* deleted = deletedSignatures;
deletedSignatures = bagCreateBag(sizeof(char*), 10);
deletedSignatures = NULL;
debugMonitorExit(classTrackLock);
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

struct bag* deleted = NULL;
jboolean retry = JNI_FALSE;
do {
// Avoid unnecessary allocations when class track has yet been activated.
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment is accurate. It's not that we want to avoid unnecessary allocations. We want to avoid unnecessary tracking of loaded classes.

}
}
debugMonitorExit(classTrackLock);
} while (retry == JNI_TRUE);
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the retry is necessary. Nor is the check for deletedSignatures != NULL. I assume you are trying to cover the case where initially deletedSignatures was NULL (so no bag was allocated), but is not NULL by the time you grab the lock. If you return when NULL like I suggested above, you won't have this issue. Note that if not NULL, there is no way another thread can get into this same code in clear it. This is because all callers first grab the handlerLock. Grabbing of the classTrackLock here is only done to synchronize with cbTrackingObjectFree(), not with other callers of classTrack_processUnloads().

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried that classTrack_activate() races against classTrack_processUnloads() to set deletedSignatures, and you are right, it is also protected under handlerLock lock. It simplifies a code a lot.

jboolean retry = JNI_FALSE;
do {
// Avoid unnecessary allocations when class track has yet been activated.
if (deletedSignatures != NULL) {
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now kind of hard to read what used to be the deletedSignatures == NULL case. I think you can first just check for NULL outside the lock and then return. It will make it clear that it is very low overhead in this case.

}
debugMonitorExit(classTrackLock);
} while (retry == JNI_TRUE);
bagDestroyBag(new_bag);
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other comment about classTrack_processUnloads() in general that also applies to the original version. It seems pretty inefficient. Every time the debug agent gets an event it is called. On almost ever call it is returning and empty back and allocating a new one. It seems a check for bagSize(deletedSignatures) == 0 and returning NULL if true would help performance. I also believe this can be done outside of the lock (would like David's opinion on this).

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it is possible, but checking bagSize(deletedSignatures) == 0 seems to race against classTrack_reset() where it does not take handlerLock lock.

Copy link
Contributor

@plummercj plummercj Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it is possible, but checking bagSize(deletedSignatures) == 0 seems to race against classTrack_reset() where it does not take handlerLock lock.

I had thought of that too, but I think the way classTrack_reset() is called, it is likely not possible for there to be a classTrack_processUnloads() also coming in because everything is shut down:

    threadControl_onDisconnect();
    standardHandlers_onDisconnect();

    /*
     * Cut off the transport immediately. This has the effect of
     * cutting off any events that the eventHelper thread might
     * be trying to send.
     */
    transport_close();
    debugMonitorDestroy(cmdQueueLock);

    /* Reset for a new connection to this VM if it's still alive */
    if ( ! gdata->vmDead ) {
        debugInit_reset(getEnv());    <--- calls classTrack_reset()
    }

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shutting down debug loop does not seem to have effect on ongoing jvmti callback, e.g. thread 4 in the bug report.

Copy link
Contributor

@plummercj plummercj Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread 4 callback is triggered due to the debugger having requested CLASS_UNLOAD events. This is shutdown by eventHandler_reset(), which is called by debugInit_reset() before it calls classTrack_reset(). So that means by the time we get to classTrack_reset(), the thread 4 callback is no longer possible.

@@ -194,8 +212,9 @@ classTrack_initialize(JNIEnv *env)
void
classTrack_activate(JNIEnv *env)
{
struct bag* new_bag = bagCreateBag(sizeof(char*), 1000);
Copy link
Contributor

@plummercj plummercj Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This code is called when the debug agent receives a CLASS_UNLOAD EventRequest (which the debug agent handles by requesting JVMTI GC_FINISH events). So there is nothing that prevents classTrack_processUnloads() from being called at the same time.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

The correctness here depends on exactly how these different functions can be called. I would have expected the allowed sequence to be:
activate -> 1or more processUnloads -> reset

but given processUnloads can somehow occur concurrently with activate, there is obviously more at play here. The main question for me is whether reset can be called more than once and whether it too can be called when processUnloads is still executing? If yes then there are still problems.

@plummercj
Copy link
Contributor

plummercj commented Feb 17, 2022

processUnloads is continuously called on every event. It's purpose is to process pending CLASS_UNLOAD events that have accumulated since the last JVMTI event. This is a rather hacky solution. JVMTI does not actually send a CLASS_UNLOAD event. It doesn't support them. What it does support is OBJECT_FREE events. So when the debugger has enabled CLASS_UNLOAD events, the debug agent enables OBJECT_FEE events. It tags each j.l.Class instance so it will get the OBJECT_FREE event for them, and when the event comes in, the Class gets added to deletedSignatures so the agent can later notify the debugger about it.

As a side note, this OBJECT_FREE solution is somewhat new. The old solution was to always do an "allClasses" during initialization, and then diff that with "allClasses" every time there was a full GC to see which classes unloaded. I wonder now if there isn't a reason we just don't send the CLASS_UNLOAD when the OBJECT_FREE is received. It would probably greatly simplify classTrack.c, and will fix an outstanding bug we have where sometimes there is a lengthy delay before CLASS_UNLOAD events are sent. This is because once there is a GC, they events are not sent until the next JVMTI event. It's actually possible there would never be one if the only thing the debugger is requesting is CLASS_UNLOAD events.

Anyway, back to processUnloads synchronization. The reason for synchronization between processUnload and activate is because one thread could be handling some random JVMTI event (maybe a breakpoint) and is calling processUnloads, and at the same time another thread is handling a CLASS_UNLOAD event request from the debugger, and calling activate. But now that I think of it, this should be harmless even without synchronization. The processUnloads thread will see the allocated deletedSignatures if activate has gotten that far, otherwise it won't. It should behave correctly in either case. I'm not convince that activate even needs synchronization. The purpose of the classTrack synchronization is to prevent one thread in processUnloads from grabbing and using deletedSignatures, only to have another thread in processUnloads do the same, and delete deletedSignatures in the process, leaving the 1st thread holding onto a deallocated pointer. However, even this should not be possible because all callers to processUnloads have already grabbed the handlerLock.

What I'm tempted to say here is just have this PR keep all the synchronization and I'll file a CR to possibly do a lot of cleanup here. Either remove (maybe all of) the synchronization, or maybe even implement sending the CLASS_UNLOAD when the OBJECT_FREE comes in, which will make the synchronization question moot, and also fix that other bug I mentioned.

BTW, one other bit of nonsense. classTrack_activate() does:

deletedSignatures = bagCreateBag(sizeof(char*), 1000);

This will be deleted the very first time an event comes in, even if no classes were unloaded, and processUnloads replaces it with:

deletedSignatures = bagCreateBag(sizeof(char*), 10);

@plummercj
Copy link
Contributor

plummercj commented Feb 17, 2022

Actually a bit of a correction to my last statement above. processUnloads is only called if there has been a GC since the last event came in. This also reduces the need to optimize processUnloads for the case where deletedSignatures is NULL or is empty.

@dholmes-ora
Copy link
Member

dholmes-ora commented Feb 17, 2022

Thanks for the additional explanation @plummercj. I agree with your suggestion to move this forward and revisit.

I suspect the issue with directly issuing the CLASS_UNLOAD event is due to when, and on what thread, the OBJECT_FREE event is issued.

@zhengyu123
Copy link
Contributor Author

zhengyu123 commented Feb 17, 2022

Thanks for the additional explanation @plummercj. I agree with your suggestion to move this forward and revisit.

I suspect the issue with directly issuing the CLASS_UNLOAD event is due to when, and on what thread, the OBJECT_FREE event is issued.

I believe OBJECT_FREE event is always issued from VMThread, at a GC safepoint prior to JDK16 or a dedicated safepoint post.

@zhengyu123
Copy link
Contributor Author

zhengyu123 commented Feb 17, 2022

The correctness here depends on exactly how these different functions can be called. I would have expected the allowed sequence to be: activate -> 1or more processUnloads -> reset

but given processUnloads can somehow occur concurrently with activate, there is obviously more at play here. The main question for me is whether reset can be called more than once and whether it too can be called when processUnloads is still executing? If yes then there are still problems.

classTrack_activate() is called inside debugLoop_run(), so it looks like that jvmti callback is enabled at the time.

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libjdwp.so+0xf0ea]  classTrack_activate+0x3a
C  [libjdwp.so+0x166dd]  installHandler+0xdd
C  [libjdwp.so+0x618c]  setCommand+0x17c
C  [libjdwp.so+0x13cd9]  debugLoop_run+0x299
C  [libjdwp.so+0x275e4]  attachThread+0x54
V  [libjvm.so+0x12be611]  JvmtiAgentThread::call_start_function()+0x181
V  [libjvm.so+0x1a9e496]  JavaThread::thread_main_inner()+0x5c6
V  [libjvm.so+0x1aa7100]  Thread::call_run()+0x100
V  [libjvm.so+0x1666814]  thread_native_entry(Thread*)+0x104

@zhengyu123
Copy link
Contributor Author

zhengyu123 commented Feb 17, 2022

processUnloads is continuously called on every event. It's purpose is to process pending CLASS_UNLOAD events that have accumulated since the last JVMTI event. This is a rather hacky solution. JVMTI does not actually send a CLASS_UNLOAD event. It doesn't support them. What it does support is OBJECT_FREE events. So when the debugger has enabled CLASS_UNLOAD events, the debug agent enables OBJECT_FEE events. It tags each j.l.Class instance so it will get the OBJECT_FREE event for them, and when the event comes in, the Class gets added to deletedSignatures so the agent can later notify the debugger about it.

As a side note, this OBJECT_FREE solution is somewhat new. The old solution was to always do an "allClasses" during initialization, and then diff that with "allClasses" every time there was a full GC to see which classes unloaded. I wonder now if there isn't a reason we just don't send the CLASS_UNLOAD when the OBJECT_FREE is received. It would probably greatly simplify classTrack.c, and will fix an outstanding bug we have where sometimes there is a lengthy delay before CLASS_UNLOAD events are sent. This is because once there is a GC, they events are not sent until the next JVMTI event. It's actually possible there would never be one if the only thing the debugger is requesting is CLASS_UNLOAD events.

Anyway, back to processUnloads synchronization. The reason for synchronization between processUnload and activate is because one thread could be handling some random JVMTI event (maybe a breakpoint) and is calling processUnloads, and at the same time another thread is handling a CLASS_UNLOAD event request from the debugger, and calling activate. But now that I think of it, this should be harmless even without synchronization. The processUnloads thread will see the allocated deletedSignatures if activate has gotten that far, otherwise it won't. It should behave correctly in either case. I'm not convince that activate even needs synchronization. The purpose of the classTrack synchronization is to prevent one thread in processUnloads from grabbing and using deletedSignatures, only to have another thread in processUnloads do the same, and delete deletedSignatures in the process, leaving the 1st thread holding onto a deallocated pointer. However, even this should not be possible because all callers to processUnloads have already grabbed the handlerLock.

What I'm tempted to say here is just have this PR keep all the synchronization and I'll file a CR to possibly do a lot of cleanup here. Either remove (maybe all of) the synchronization, or maybe even implement sending the CLASS_UNLOAD when the OBJECT_FREE comes in, which will make the synchronization question moot, and also fix that other bug I mentioned.

BTW, one other bit of nonsense. classTrack_activate() does:

deletedSignatures = bagCreateBag(sizeof(char*), 1000);

This will be deleted the very first time an event comes in, even if no classes were unloaded, and processUnloads replaces it with:

deletedSignatures = bagCreateBag(sizeof(char*), 10);

I believe checking bagSize() == 0 is an optimization. I think just taking handlerLock lock before calling classTrack_reset() ensures no race can happen and seems harmless.

@plummercj
Copy link
Contributor

plummercj commented Feb 17, 2022

I suspect the issue with directly issuing the CLASS_UNLOAD event is due to when, and on what thread, the OBJECT_FREE event is issued.

Yes, I think there is an issue trying to send all the accumulated CLASS_UNLOAD events when the GC_FINISH arrives because it arrives on a VM thread, not a java thread. The debug agent may suspend whichever thread sent the JDWP event to the debugger, and it will stay suspended until the debugger tells the debug agent to resume the thread. I say "may suspend" because it depends on the suspend policy of the EventRequest. It will suspend if the policy is SUSPEND_THREAD or SUSPEND_ALL, but not for SUSPEND_NONE. So we have two problems then. The VM thread is suspended, and also the debugger sees the VM thread as the event thread (only java threads should be event threads). I assume OBJECT_FREE also arrives on a VM thread and would have the same issue if it triggered sending the CLASS_UNLOAD event right away.

To fix the issue of having to wait for the next event before sending the CLASS_UNLOAD events (and it sometimes being a long time in coming), the debug agent could enable a JVMTI event it knows will be triggered right away, such as MethodEntry or MethodExit (these might be the only two candidates). This at least would ensure triggering the calling of processUnloads right way.

@plummercj
Copy link
Contributor

plummercj commented Feb 17, 2022

I believe checking bagSize() == 0 is an optimization. I think just taking handlerLock lock before calling classTrack_reset() ensures no race can happen and seems harmless.

Yes, it would be an optimization, and one I'm now seeing is not really needed since processUnloads() is not called as frequently as I thought (only called after a GC completes).

The grabbing of the handlerLock is questionable. Kim also called it out in JDK-8256811:

I think this can be fixed by removing the conditionalization of the call to classTrack_processUnloads, and just do it unconditionally. There is a (somewhat confusing) comment saying that the conditionalization is there to avoid taking the handler lock when that isn't needed. But it's not clear why that lock is being taken on that code path at all, as classTrack_processUnloads contains it's own internal locking.

@openjdk
Copy link

openjdk bot commented Feb 17, 2022

@zhengyu123 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8281615: Deadlock caused by jdwp agent

Reviewed-by: dholmes, cjplummer

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 142 new commits pushed to the master branch:

  • 2557ef8: 8282276: Problem list failing two Robot Screen Capture tests
  • 6445ee4: 5041655: (ch) FileLock: negative param and overflow issues
  • 7feabee: 8261407: ReflectionFactory.checkInitted() is not thread-safe
  • 58e1882: 8282042: [testbug] FileEncodingTest.java depends on default encoding
  • 3cb3867: 8281315: Unicode, (?i) flag and backreference throwing IndexOutOfBounds Exception
  • 957dae0: 8280958: G1/Parallel: Unify marking code structure
  • e44d067: 8244593: Clean up GNM/NM after JEP 381
  • 41355e2: 8276686: Malformed Javadoc inline tags in JDK source in /java/util/regex/Pattern.java
  • 022d807: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests time out because of excessive GC (CodeCache GC Threshold) in loom
  • ab6d8e6: 8260328: Drop redundant CSS properties from java.desktop HTML files
  • ... and 132 more: https://git.openjdk.java.net/jdk/compare/c5c8c0644d9442846de15422285fffeb91c3e0a1...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Feb 17, 2022
@plummercj
Copy link
Contributor

plummercj commented Feb 17, 2022

classTrack_activate() is called inside debugLoop_run(), so it looks like that jvmti callback is enabled at the time.

Yes. This is due to the debugger sending a JDWP EventRequest command for CLASS_UNLOAD events. At that point class tracking is activated and the debug agent enables OBJECT_FREE events.

@zhengyu123
Copy link
Contributor Author

zhengyu123 commented Feb 22, 2022

@plummercj Thanks for the review.

@dholmes-ora Are you okay with current version? Thanks.

@dholmes-ora
Copy link
Member

dholmes-ora commented Feb 22, 2022

I thought from:
#7461 (comment)
there was srtill an outstanding issue with the handlerLock ? Though @plummercj seems to have approved anyway.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Changes seem reasonable given the discussion.

Thanks,
David

/* Allocate new bag outside classTrackLock lock to avoid deadlock.
*
* Note: jvmtiAllocate/jvmtiDeallocate() may be blocked by ongoing safepoints.
* It is dangerous to call them (via bagCreateBag/bagDestroyBag())while holding monitor(s),
Copy link
Member

@dholmes-ora dholmes-ora Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: need space after ))

Copy link
Contributor Author

@zhengyu123 zhengyu123 Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks, @dholmes-ora

@plummercj
Copy link
Contributor

plummercj commented Feb 22, 2022

I thought from:
#7461 (comment)
there was srtill an outstanding issue with the handlerLock ? Though @plummercj seems to have approved anyway.

That's was just part of the discussion as to how much locking is really needed, and we agreed not to address those issues in this PR. So the code is (and already was) probably over synchronized, but that's ok and maybe we can address it at a later point.

@zhengyu123
Copy link
Contributor Author

zhengyu123 commented Feb 23, 2022

/integrate

@openjdk
Copy link

openjdk bot commented Feb 23, 2022

Going to push as commit e1060be.
Since your change was applied there have been 143 commits pushed to the master branch:

  • 6f882de: 8280964: [Linux aarch64] : drawImage dithers TYPE_BYTE_INDEXED images incorrectly
  • 2557ef8: 8282276: Problem list failing two Robot Screen Capture tests
  • 6445ee4: 5041655: (ch) FileLock: negative param and overflow issues
  • 7feabee: 8261407: ReflectionFactory.checkInitted() is not thread-safe
  • 58e1882: 8282042: [testbug] FileEncodingTest.java depends on default encoding
  • 3cb3867: 8281315: Unicode, (?i) flag and backreference throwing IndexOutOfBounds Exception
  • 957dae0: 8280958: G1/Parallel: Unify marking code structure
  • e44d067: 8244593: Clean up GNM/NM after JEP 381
  • 41355e2: 8276686: Malformed Javadoc inline tags in JDK source in /java/util/regex/Pattern.java
  • 022d807: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests time out because of excessive GC (CodeCache GC Threshold) in loom
  • ... and 133 more: https://git.openjdk.java.net/jdk/compare/c5c8c0644d9442846de15422285fffeb91c3e0a1...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated label Feb 23, 2022
@openjdk openjdk bot closed this Feb 23, 2022
@openjdk openjdk bot removed ready rfr labels Feb 23, 2022
@openjdk
Copy link

openjdk bot commented Feb 23, 2022

@zhengyu123 Pushed as commit e1060be.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@zhengyu123 zhengyu123 deleted the JDK-8281615-jdwp branch Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated serviceability
3 participants