8252871: fatal error: must own lock JvmtiThreadState_lock #60

Closed
wants to merge 1 commit into from

Contributor

@robehn robehn commented Sep 7, 2020

When these two methods (set_frame_pop/clear_frame_pop) are called in a handshake, the requesting thread will have locked the JvmtiThreadState_lock.
But the thread executing one of these in the handshake may not be the owner.
So we only check that JvmtiThreadState_lock is locked.
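
The difference between the two checks can be illustrated with a minimal sketch. It assumes hypothetical `Thread` and `OwnedLock` types that only model lock ownership; they are not HotSpot's real `Mutex` API, and `run_frame_pop_op` is an illustrative name, not a function from the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins that only model ownership; not HotSpot classes.
struct Thread { const char* name; };

class OwnedLock {
 public:
  void lock(Thread* t)   { assert(_owner == nullptr); _owner = t; }
  void unlock(Thread* t) { assert(_owner == t); _owner = nullptr; }
  // Strict check: the given thread is the one holding the lock.
  bool owned_by(const Thread* t) const { return _owner == t; }
  // Relaxed check: some thread holds the lock.
  bool is_locked() const { return _owner != nullptr; }
 private:
  Thread* _owner = nullptr;
};

struct CheckResult {
  bool strict_passes;   // "the executing thread owns the lock"
  bool relaxed_passes;  // "the lock is held by someone"
};

// The requester takes the lock, then the operation runs on `executor`,
// which may be either the requester or the target thread.
CheckResult run_frame_pop_op(OwnedLock& lock, Thread* requester, Thread* executor) {
  lock.lock(requester);
  CheckResult r{lock.owned_by(executor), lock.is_locked()};
  lock.unlock(requester);
  return r;
}
```

When the target thread ends up executing the operation, only the relaxed check passes in this model, which corresponds to the situation described above.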

While verifying the callers of these methods I noticed that "clear_to_frame_pop" was unused, so instead of fixing it I removed it.

Passes testing locally, still running T3 and T7.

Now passed!


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issues

  • JDK-8252871: fatal error: must own lock JvmtiThreadState_lock
  • JDK-8252816: JvmtiEnvThreadState::clear_to_frame_pop() is not used

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/60/head:pull/60
$ git checkout pull/60

@bridgekeeper bridgekeeper bot commented Sep 7, 2020

👋 Welcome back rehn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot commented Sep 7, 2020

@robehn The following labels will be automatically applied to this pull request: hotspot serviceability.

When this pull request is ready to be reviewed, an RFR email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label (add|remove) "label" command.

@robehn robehn marked this pull request as ready for review Sep 7, 2020
@openjdk openjdk bot added the rfr label Sep 7, 2020
@mlbridge mlbridge bot commented Sep 7, 2020

Webrevs

Member

@YaSuenag YaSuenag left a comment

Thanks for catching this. Looks good.

@openjdk openjdk bot commented Sep 7, 2020

@robehn This change now passes all automated pre-integration checks. In addition to the automated checks, the change must also fulfill all project-specific requirements.

After integration, the commit message will be:

8252871: fatal error: must own lock JvmtiThreadState_lock
8252816: JvmtiEnvThreadState::clear_to_frame_pop() is not used

Reviewed-by: ysuenaga, dholmes
  • If you would like to add a summary, use the /summary command.
  • To credit additional contributors, use the /contributor command.
  • To add additional solved issues, use the /issue command.

Since the source branch of this PR was last updated there have been 13 commits pushed to the master branch:

  • bf5da0c: 8252897: Minor .jcheck/conf update
  • 7600274: 8252859: Inconsistent use of alpha in class AbsSeq
  • 4fb1980: 8252853: AArch64: gc/shenandoah/TestVerifyJCStress.java fails intermittently with C1
  • 73ba3ae: 8252500: ZGC on aarch64: Unable to allocate heap for certain Linux kernel configurations
  • 5dd1ead: 8252767: URLConnection.setRequestProperty throws IllegalAccessError
  • 2cceeed: 8166554: Avoid compilation blocking in OverloadCompileQueueTest.java
  • 188b0bc: 8252868: Clean up unused function from G1MMUTracker
  • 891886b: 8252887: Zero VM is broken after JDK-8252661
  • 7686e87: 8250968: Symlinks attributes not preserved when using jarsigner on zip files
  • 8d6d43c: 8251193: bin/idea.sh is generating wrong folder definitions for JVMCI modules
  • ... and 3 more: https://git.openjdk.java.net/jdk/compare/e0c8d4420c8e1a84581927cf77314498b8e5aa52...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid automatic rebasing, please merge master into your branch, and then specify the current head hash when integrating, like this: /integrate bf5da0c778f2cbc8c1b8568ca16c4bf51f7972b5.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Sep 7, 2020
@mlbridge mlbridge bot commented Sep 8, 2020

Mailing list message from David Holmes on hotspot-dev:

Hi Robbin,

On 8/09/2020 4:56 am, Robbin Ehn wrote:

> When these two methods (set_frame_pop/clear_frame_pop) are called in a handshake the requesting thread will have locked
> the JvmtiThreadState_lock. But the thread executing one of these in the handshake may not be the owner.

Ouch! We continue to get bitten by the fact the executor of a handshake
operation is not known a-priori. :(

> So we only check that JvmtiThreadState_lock is locked.

That avoids the problem but it is not really a sufficient check - we
need to know that it is the handshaker thread that owns the lock and
that it acquired it purely for the purpose of this handshake operation!

More importantly this issue shows that the locking code is in fact
incorrect. We are on very dangerous ground if we are going to allow
locks to be "proxied" in this fashion! This is akin to reintroducing
"sneaky locks"!

If a handshake operation can be executed by either the target thread or
the handshaker thread, and locks are required, then they will have to be
acquired in the context of the handshake operation. If we can't do that
because of potential deadlocks in the handshake mechanism then I feel
that handshakes have a serious flaw that needs to be rectified. Perhaps
we need a way to force the operation to be executed by the handshaker so
that they can set up the right execution environment prior to the handshake?

> When verifying the callers to these methods I noticed "clear_to_frame_pop" was unused, so instead of fixing it I removed
> it.

There is already a bug for that:

https://bugs.openjdk.java.net/browse/JDK-8252816

so if this proceeds you should add that issue to the PR.

Thanks,
David
-----

Contributor Author

@robehn robehn commented Sep 8, 2020

Hi David,

>> When these two methods (set_frame_pop/clear_frame_pop) are called in a handshake the requesting thread will have locked
>> the JvmtiThreadState_lock. But the thread executing one of these in the handshake may not be the owner.

> Ouch! We continue to get bitten by the fact the executor of a handshake
> operation is not known a-priori. :(

We could add this. But I would prefer not doing it in this bug fix.

>> So we only check that JvmtiThreadState_lock is locked.

> That avoids the problem but it is not really a sufficient check - we
> need to know that it is the handshaker thread that owns the lock and
> that it acquired it purely for the purpose of this handshake operation!

Right now we know that because I traced the code :)

As I said, I can make the requesting thread known to the execution thread,
but I would really prefer not doing that in this bug fix.

> More importantly this issue shows that the locking code is in fact
> incorrect. We are on very dangerous ground if we are going to allow
> locks to be "proxied" in this fashion! This is akin to reintroducing
> "sneaky locks"!

A problem is that we use a safepointing global lock to protect a per-thread resource.
The way we get around this with safepoints is "assert_locked_or_safepoint". (During a safepoint, the assert may be passed by a non-JavaThread, or by a JavaThread in blocked/native, without holding the lock; checking for safepoint + VM thread would fix that.)
It has this in it:

    // see if invoker of VM operation owns it
    VM_Operation* op = VMThread::vm_operation();
    if (op != NULL && op->calling_thread() == lock->owner()) return;

So we already have proxied locks, and in other cases we just avoid them (by only checking whether we are at a safepoint).

And as I have said a few times now, I can make the requesting thread known to the execution thread,
but I would really prefer not doing that in this bug fix.

> If a handshake operation can be executed by either the target thread or
> the handshaker thread, and locks are required, then they will have to be
> acquired in the context of the handshake operation. If we can't do that
> because of potential deadlocks in the handshake mechanism then I feel
> that handshakes have a serious flaw that needs to be rectified. Perhaps
> we need a way to force the operation to be executed by the handshaker so
> that they can setup the right execution environment prior to the handshake?

The problem is that it's a safepointing lock.
Only executing the handshake by the requester is an interesting idea, I'll investigate that.

>> When verifying the callers to these methods I notice "clear_to_frame_pop" was unused, so instead of fixing it I remove
>> it.

> There is already a bug for that:
>
> https://bugs.openjdk.java.net/browse/JDK-8252816
>
> so if this proceeds you should add that issue to the PR.

Ok, I'll do that.

Thanks for having a look, let me know how to proceed.

> Thanks,
> David
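
The proxied check quoted in the comment above can be modelled in a few lines. This is a simplified sketch under stated assumptions: `Thread`, `Lock`, `VMOperation` and `locked_or_safepoint` are hypothetical stand-ins, the safepoint state and current operation are passed in explicitly instead of being read from VM globals, and the function returns `false` where the real assertion would report a fatal error.

```cpp
#include <cassert>

// Hypothetical stand-ins; not HotSpot's real types.
struct Thread { const char* name; };

struct Lock {
  const Thread* owner = nullptr;  // nullptr models "unlocked"
};

// Models a VM operation that remembers which thread requested it.
struct VMOperation {
  const Thread* calling_thread;
};

// Simplified model of an "assert locked or safepoint" style check.
bool locked_or_safepoint(const Lock& lock, const Thread* current,
                         bool at_safepoint, const VMOperation* op) {
  if (lock.owner == current) return true;  // common case: caller owns it
  if (at_safepoint) return true;           // lock not required at a safepoint
  // Proxied case: the invoker of the running VM operation owns the lock.
  if (op != nullptr && op->calling_thread == lock.owner) return true;
  return false;
}
```

In this model a thread that does not own the lock still passes the check when it runs an operation on behalf of the owner, which is the "proxying" the discussion refers to.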

Contributor Author

@robehn robehn commented Sep 8, 2020

/issue add JDK-8252816

@openjdk openjdk bot commented Sep 8, 2020

@robehn
Adding additional issue to issue list: 8252816: JvmtiEnvThreadState::clear_to_frame_pop() is not used.

@mlbridge mlbridge bot commented Sep 8, 2020

Mailing list message from Robbin Ehn on hotspot-dev:

Hi David, our mail service did the wrong thing here.

This was a reply to your comment, so you should have been in the To field;
the quote above "Hi David" should not have been there, and this was a
reply to your mail (via a GitHub comment).

Thanks, Robbin

On 2020-09-08 09:39, Robbin Ehn wrote:

@mlbridge mlbridge bot commented Sep 8, 2020

Mailing list message from David Holmes on hotspot-dev:

Hi Robbin,

On 8/09/2020 5:39 pm, Robbin Ehn wrote:

> Hi David,

>>> When these two methods (set_frame_pop/clear_frame_pop) are called in a handshake the requesting thread will have locked
>>> the JvmtiThreadState_lock. But the thread executing one of these in the handshake may not be the owner.

>> Ouch! We continue to get bitten by the fact the executor of a handshake
>> operation is not known a-priori. :(

> We could add this. But I would prefer not doing it in this bug fix.

Okay we can put in the current proposed fix to get the test working
again. Please file a new bug to fix the underlying locking problem.

>>> So we only check that JvmtiThreadState_lock is locked.

>> That avoids the problem but it is not really a sufficient check - we
>> need to know that it is the handshaker thread that owns the lock and
>> that it acquired it purely for the purpose of this handshake operation!

> Right now we know that because I traced the code :)

I'm sure you get my point though. :)

> As I said I can make the requesting thread known to execution thread,
> but I would really prefer not doing that in this bug fix.

>> More importantly this issue shows that the locking code is in fact
>> incorrect. We are on very dangerous ground if we are going to allow
>> locks to be "proxied" in this fashion! This is akin to reintroducing
>> "sneaky locks"!

> A problem is that we use a safepointing global lock to protect per thread resource.
> The way we get around this with safepoints is:
> "assert_locked_or_safepoint" (assert may be passed by a non-JavaThread or JavaThread in blocked/native without lock
> during a safepoint (safepoint + VM thread would fix that)) Which have this in it:
>
>     // see if invoker of VM operation owns it
>     VM_Operation* op = VMThread::vm_operation();
>     if (op != NULL && op->calling_thread() == lock->owner()) return;
>
> So we already have proxied locks and in other cases just avoiding them (by just checking if at safepoint).

Okay ... we know that the VMThread and safepoint VMops are special, and
always have been special, and people absolutely loathe the things we did
in supporting that special-ness and we (well you, Erik, Patricio, Dan)
have been working hard to get rid of a lot of that special code. So I
can't accept an argument that it is okay to take VMThread/VMop special
behaviour (e.g. lock proxying) and now spread it around to make other
special cases.

> And as I said a few times now, I said I can make the requesting thread known to execution thread,
> but I would really prefer not doing that in this bug fix.

Understood.

I have a general concern with locking in relation to handshakes, that we
cannot actually get rid of all the special handling that pertained to
safepoints (safepoint-check-always/never, lock ranking) just because we
now use handshakes. We have the same kinds of deadlock concerns if a
handshake operation tries to take a lock and might block the thread. So
same problem as safepoints and locks but no supporting code to try and
help us with that.

So if we cannot simply grab the necessary locks as part of the handshake
operation, then we need a way to ensure locking correctness prior to the
op - and the simplest seems to be that the handshaker does the necessary
locking and the handshake mechanism allows us to ensure the handshaker
executes the operation not the target.

Thanks,
David
-----
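
The alternative suggested at the end of this message, where the handshaker takes the lock and is guaranteed to execute the operation itself, might look roughly like the following sketch. Everything here is an assumption made for illustration: `HandshakeOp`, its `requester_executes_only` flag, `choose_executor` and `strict_check_holds` are hypothetical names, and the race between requester and target is reduced to a boolean parameter.

```cpp
#include <cassert>

// Hypothetical stand-ins that only model ownership; not HotSpot classes.
struct Thread { const char* name; };

class OwnedLock {
 public:
  void lock(Thread* t)   { assert(_owner == nullptr); _owner = t; }
  void unlock(Thread* t) { assert(_owner == t); _owner = nullptr; }
  bool owned_by(const Thread* t) const { return _owner == t; }
 private:
  Thread* _owner = nullptr;
};

// Hypothetical handshake operation that remembers who requested it.
struct HandshakeOp {
  Thread* requester;
  bool requester_executes_only;  // assumed knob: never run on the target
};

// Picks the executing thread; `target_first` models the target thread
// reaching the operation before the requester does.
Thread* choose_executor(const HandshakeOp& op, Thread* target, bool target_first) {
  if (op.requester_executes_only) return op.requester;
  return target_first ? target : op.requester;
}

// Returns whether the strict "executor owns the lock" check would hold.
bool strict_check_holds(const HandshakeOp& op, Thread* target, bool target_first) {
  OwnedLock lock;
  lock.lock(op.requester);  // the handshaker sets up the environment
  Thread* executor = choose_executor(op, target, target_first);
  bool ok = lock.owned_by(executor);
  lock.unlock(op.requester);
  return ok;
}
```

With the flag set, the strict ownership check holds regardless of the race in this model; without it, the check fails whenever the target executes the operation.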

Contributor Author

@robehn robehn commented Sep 8, 2020

Hi David,

Okay we can put in the current proposed fix to get the test working
again. Please file a new bug to fix the underlying locking problem.

I created https://bugs.openjdk.java.net/browse/JDK-8252902
Please feel free to add sub-task, edit or change if I missed anything.

Thanks, Robbin

Contributor Author

@robehn robehn commented Sep 8, 2020

/integrate

@openjdk openjdk bot closed this Sep 8, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Sep 8, 2020
@openjdk openjdk bot commented Sep 8, 2020

@robehn Since your change was applied there have been 13 commits pushed to the master branch:

  • bf5da0c: 8252897: Minor .jcheck/conf update
  • 7600274: 8252859: Inconsistent use of alpha in class AbsSeq
  • 4fb1980: 8252853: AArch64: gc/shenandoah/TestVerifyJCStress.java fails intermittently with C1
  • 73ba3ae: 8252500: ZGC on aarch64: Unable to allocate heap for certain Linux kernel configurations
  • 5dd1ead: 8252767: URLConnection.setRequestProperty throws IllegalAccessError
  • 2cceeed: 8166554: Avoid compilation blocking in OverloadCompileQueueTest.java
  • 188b0bc: 8252868: Clean up unused function from G1MMUTracker
  • 891886b: 8252887: Zero VM is broken after JDK-8252661
  • 7686e87: 8250968: Symlinks attributes not preserved when using jarsigner on zip files
  • 8d6d43c: 8251193: bin/idea.sh is generating wrong folder definitions for JVMCI modules
  • ... and 3 more: https://git.openjdk.java.net/jdk/compare/e0c8d4420c8e1a84581927cf77314498b8e5aa52...master

Your commit was automatically rebased without conflicts.

Pushed as commit 704f784.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@robehn robehn deleted the 8252871-wrong-assert branch Sep 8, 2020
3 participants