8336384: AbstractQueuedSynchronizer.acquire should cancel acquire when failing due to a LinkageError or other errors #20548

Closed
wants to merge 2 commits into from

Conversation

viktorklang-ora
Contributor

@viktorklang-ora viktorklang-ora commented Aug 12, 2024


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8336384: AbstractQueuedSynchronizer.acquire should cancel acquire when failing due to a LinkageError or other errors (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20548/head:pull/20548
$ git checkout pull/20548

Update a local copy of the PR:
$ git checkout pull/20548
$ git pull https://git.openjdk.org/jdk.git pull/20548/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20548

View PR using the GUI difftool:
$ git pr show -t 20548

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20548.diff

Webrev

Link to Webrev Comment

@viktorklang-ora
Contributor Author

@DougLea @AlanBateman FYI

@bridgekeeper

bridgekeeper bot commented Aug 12, 2024

👋 Welcome back vklang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Aug 12, 2024

@viktorklang-ora This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8336384: AbstractQueuedSynchronizer.acquire should cancel acquire when failing due to a LinkageError or other errors

Reviewed-by: alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 146 new commits pushed to the master branch:

  • 73f7a5f: 8338155: Fix -Wzero-as-null-pointer-constant warnings involving PTHREAD_MUTEX_INITIALIZER
  • c27a8c8: 8338124: C2 SuperWord: MulAddS2I input permutation still partially broken after JDK-8333840
  • 73ddb7d: 8335628: C2 SuperWord: cleanup: remove SuperWord::longer_type_for_conversion
  • d77e6fe: 8338154: Fix -Wzero-as-null-pointer-constant warnings in gtest framework
  • e70c9bc: 8338248: PartialArrayStateAllocator::Impl leaks Arena array
  • 5079c38: 8338160: Fix -Wzero-as-null-pointer-constant warnings in management.cpp
  • 4417c27: 8330535: Update nsk/jdb tests to use driver instead of othervm
  • b93b74e: 8338060: jdk/internal/util/ReferencedKeyTest should be more robust
  • 41e31d6: 8337622: IllegalArgumentException in java.lang.reflect.Field.get
  • 2ca136a: 8337815: Relax G1EvacStats atomic operations
  • ... and 136 more: https://git.openjdk.org/jdk/compare/156f0b4332bf076165898417cf6678d2fc32df5c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Aug 12, 2024

@viktorklang-ora The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs (core-libs-dev@openjdk.org) and rfr (Pull request is ready for review) labels Aug 12, 2024
@mlbridge

mlbridge bot commented Aug 12, 2024

Webrevs

            LockSupport.parkNanos(this, nanos);
        else
            break;
    } catch (Error ex) { // rethrow VM errors
Contributor

Part of me thinks this should be Throwable, to allow for possible runtime exceptions when timed-parking virtual threads. Separate work is needed to ensure a runtime exception is never thrown, but this would at least cancel here.

Contributor

Or use (Error | RuntimeException ex)?

Contributor

That would be okay too as there are no checked exceptions here.

Contributor Author

@AlanBateman If we really want it airtight I think we need to catch Throwable and sneaky-rethrow it? 🤔
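For readers unfamiliar with the "sneaky rethrow" idiom mentioned above, here is a minimal sketch. It is not code from this PR: `SneakyRethrow`, `sneakyThrow`, `runWithCancel` and the `cancelled` flag are hypothetical names used only for illustration. The idea is that a generic `throws E` clause lets a method rethrow any Throwable, including a checked one, without the caller declaring it.

```java
import java.io.IOException;

public class SneakyRethrow {
    // Hypothetical flag standing in for "the queued node was cancelled".
    public static boolean cancelled = false;

    // E is inferred (or specified) as an unchecked type at the call site,
    // so the compiler does not require a throws clause there. The cast is
    // erased at runtime, so any Throwable propagates unchanged.
    @SuppressWarnings("unchecked")
    public static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // Runs the body, records a cancellation on any failure, then rethrows.
    public static void runWithCancel(Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            cancelled = true; // cleanup happens before the failure propagates
            SneakyRethrow.<RuntimeException>sneakyThrow(t);
        }
    }

    public static void main(String[] args) {
        try {
            // The body itself smuggles out a checked exception, roughly what
            // misbehaving instrumented code could do.
            runWithCancel(() ->
                SneakyRethrow.<RuntimeException>sneakyThrow(new IOException("simulated")));
        } catch (Throwable t) {
            System.out.println(t.getClass().getName() + ", cancelled=" + cancelled);
        }
    }
}
```

Whether this is worth the extra machinery in AQS is exactly the judgment call being discussed in this thread; the sketch only shows what the idiom looks like.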

            LockSupport.parkNanos(this, nanos);
        else
            break;
    } catch (Error | RuntimeException ex) {
Contributor Author

@viktorklang-ora viktorklang-ora Aug 12, 2024

@DougLea @AlanBateman Changed to Error | RuntimeException here and for AQS

Contributor

lgtm!

Member

catch (Throwable ex) would be consistent with the similar block at line 331. Though I'm unclear how that compiles without the method declaring throws Throwable ??

Contributor

> Though I'm unclear how that compiles without the method declaring throws Throwable ??

It wouldn't need that because of precise rethrow. In any case, having Error and RuntimeException are okay.
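A small, self-contained illustration of the precise-rethrow rule referred to above (Java 7 and later). This is a sketch, not the AQS code: `PreciseRethrow`, `parkOrFail` and the `cancelled` counter are hypothetical. Because the try body can only throw unchecked exceptions and the catch parameter is effectively final, the compiler accepts `throw ex` without a `throws Throwable` clause on the method.

```java
public class PreciseRethrow {
    // Hypothetical counter standing in for "the acquire was cancelled".
    public static int cancelled = 0;

    // Hypothetical stand-in for a park call that may fail with an unchecked exception.
    public static void parkOrFail(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated park failure");
        }
    }

    // Note: no "throws Throwable" here. The compiler works out that the try
    // body can only throw Error or RuntimeException, and 'ex' is never
    // reassigned, so rethrowing it needs no declaration.
    public static void acquire(boolean fail) {
        try {
            parkOrFail(fail);
        } catch (Throwable ex) {
            cancelled++; // cleanup before propagating
            throw ex;
        }
    }

    public static void main(String[] args) {
        acquire(false);
        try {
            acquire(true);
        } catch (IllegalStateException expected) {
            System.out.println("cancelled=" + cancelled);
        }
    }
}
```

If the catch block assigned a new value to `ex`, the precise analysis would no longer apply and the method would need to declare `throws Throwable`.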

Member

@dholmes-ora dholmes-ora left a comment

It has been a while since I knew this code reasonably well so perhaps I have just forgotten this difference between AQS and built-in monitors, but it seems that a Condition.await can return by throwing an exception without re-acquiring the associated synchronizer. Or is that handled at a higher-level?

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Aug 13, 2024
@AlanBateman
Contributor

> It has been a while since I knew this code reasonably well so perhaps I have just forgotten this difference between AQS and built-in monitors, but it seems that a Condition.await can return by throwing an exception without re-acquiring the associated synchronizer. Or is that handled at a higher-level?

The semantics are the same as monitor wait/notify so Condition.await must guarantee to hold the lock when it returns. If ConditionNode.block were to throw something like StackOverflowError then there would be an issue (it's a different park to the one changed in this PR but I think you do have a good point).
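The guarantee described above can be observed in the ordinary (non-failing) cases with a short sketch. `AwaitHoldsLock` and its method names are hypothetical; only standard `java.util.concurrent.locks` calls are used. It checks that the lock is held both when a timed await returns and when `await` exits by throwing InterruptedException. It does not, and cannot easily, reproduce the StackOverflowError scenario raised in this thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class AwaitHoldsLock {
    // Timed await with nobody signalling: the lock is released while waiting
    // and must be held again by the time await returns.
    public static boolean heldAfterTimedAwait() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        lock.lock();
        try {
            cond.await(10, TimeUnit.MILLISECONDS);
            return lock.isHeldByCurrentThread();
        } catch (InterruptedException e) {
            throw new IllegalStateException("unexpected interrupt", e);
        } finally {
            lock.unlock();
        }
    }

    // await exits by throwing InterruptedException; the lock must still be held
    // in the catch block, as with Object.wait on a monitor.
    public static boolean heldAfterInterruptedAwait() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        lock.lock();
        try {
            Thread.currentThread().interrupt(); // make await fail fast
            try {
                cond.await();
                return false; // not expected: await should have thrown
            } catch (InterruptedException e) {
                return lock.isHeldByCurrentThread();
            }
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("timed await: held=" + heldAfterTimedAwait());
        System.out.println("interrupted await: held=" + heldAfterInterruptedAwait());
    }
}
```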

@viktorklang-ora
Contributor Author

/integrate

@openjdk

openjdk bot commented Aug 13, 2024

Going to push as commit fbe4cc9.
Since your change was applied there have been 148 commits pushed to the master branch:

  • ba69ed7: 8338202: Shenandoah: Improve handshake closure labels
  • 5bf2709: 8334475: UnsafeIntrinsicsTest.java#ZGenerationalDebug assert(!assert_on_failure) failed: Has low-order bits set
  • 73f7a5f: 8338155: Fix -Wzero-as-null-pointer-constant warnings involving PTHREAD_MUTEX_INITIALIZER
  • c27a8c8: 8338124: C2 SuperWord: MulAddS2I input permutation still partially broken after JDK-8333840
  • 73ddb7d: 8335628: C2 SuperWord: cleanup: remove SuperWord::longer_type_for_conversion
  • d77e6fe: 8338154: Fix -Wzero-as-null-pointer-constant warnings in gtest framework
  • e70c9bc: 8338248: PartialArrayStateAllocator::Impl leaks Arena array
  • 5079c38: 8338160: Fix -Wzero-as-null-pointer-constant warnings in management.cpp
  • 4417c27: 8330535: Update nsk/jdb tests to use driver instead of othervm
  • b93b74e: 8338060: jdk/internal/util/ReferencedKeyTest should be more robust
  • ... and 138 more: https://git.openjdk.org/jdk/compare/156f0b4332bf076165898417cf6678d2fc32df5c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Aug 13, 2024
@openjdk openjdk bot closed this Aug 13, 2024
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Aug 13, 2024
@openjdk

openjdk bot commented Aug 13, 2024

@viktorklang-ora Pushed as commit fbe4cc9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dholmes-ora
Member

> > It has been a while since I knew this code reasonably well so perhaps I have just forgotten this difference between AQS and built-in monitors, but it seems that a Condition.await can return by throwing an exception without re-acquiring the associated synchronizer. Or is that handled at a higher-level?
>
> The semantics are the same as monitor wait/notify so Condition.await must guarantee to hold the lock when it returns. If ConditionNode.block were to throw something like StackOverflowError then there would be an issue (it's a different park to the one changed in this PR but I think you do have a good point).

AFAICS await would call the acquire method that was changed here. I know we have issues with OOME and SOE, but these changes admit more general exception possibilities that would seem to undermine the required guarantee. But perhaps a different acquire is involved in the await case?

@AlanBateman
Contributor

> AFAICS await would call the acquire method that was changed here. I know we have issues with OOME and SOE, but these changes admit more general exception possibilities that would seem to undermine the required guarantee. But perhaps a different acquire is involved in the await case?

The changes here just cancel the attempt to acquire before throwing. There is more to Condition.awaitXXX in that they park until they can acquire, then acquire. So SOE or some other resource error would mean awaitXXX throws without holding the lock. We have done some work on the virtual thread implementation of LockSupport.park/parkNanos/unpark so that they never fail with OOME. There is a bit more to come there, but otherwise it is not clear how to deal with other resource errors.

@dholmes-ora
Member

When we have catch (Error | RuntimeException ex), exactly which RuntimeExceptions are we trying to account for? They should not happen, and may break the synchronizer if they do.

@AlanBateman
Contributor

> When we have catch (Error | RuntimeException ex), exactly which RuntimeExceptions are we trying to account for? They should not happen, and may break the synchronizer if they do.

Right now, the catch is just defending against LockSupport.park or parkNanos throwing. A timed-park on a virtual thread requires queueing a timer task, and first use can involve class loading, initialisation, and running more code than is obvious. We are trying to harden this as much as possible to never throw OOME. SOE is a lost cause. REE is possible right now and we have to do more to prevent that. Instrumentation brings a truckload of possible issues, as some wild agent could instrument core classes and cause all manner of exceptions and issues. All we are doing here is cancelling the acquire when throwing. It's just too surprising, and too hard to diagnose, to throw and leave a node queued.
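To make the "throw and leave a node queued" problem concrete, here is a deliberately simplified toy lock. It is not AQS and not the code changed in this PR: `ToyLock`, its waiter queue, and the injectable `parker` hook (standing in for LockSupport.park) are all assumptions made for illustration. The point is only the shape of the fix: if parking throws, the waiter removes itself from the queue before the failure propagates.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

public class ToyLock {
    private final AtomicBoolean locked = new AtomicBoolean();
    private final ConcurrentLinkedQueue<Thread> waiters = new ConcurrentLinkedQueue<>();
    private final Runnable parker; // hypothetical hook; normally LockSupport::park

    public ToyLock(Runnable parker) {
        this.parker = parker;
    }

    // Non-reentrant acquire: enqueue, park until the CAS succeeds, dequeue.
    public void lock() {
        if (locked.compareAndSet(false, true)) {
            return; // fast path, never queued
        }
        Thread me = Thread.currentThread();
        waiters.add(me);
        try {
            while (!locked.compareAndSet(false, true)) {
                parker.run(); // may throw if parking fails
            }
            waiters.remove(me);
        } catch (Error | RuntimeException ex) {
            waiters.remove(me); // cancel: do not leave a stale waiter queued
            throw ex;
        }
    }

    public void unlock() {
        locked.set(false);
        Thread next = waiters.peek();
        if (next != null) {
            LockSupport.unpark(next);
        }
    }

    public int queueLength() {
        return waiters.size();
    }

    public static void main(String[] args) {
        ToyLock lock = new ToyLock(() -> {
            throw new IllegalStateException("simulated park failure");
        });
        lock.lock(); // uncontended, succeeds
        try {
            lock.lock(); // contended (toy lock is not reentrant), so park fails
        } catch (IllegalStateException ex) {
            System.out.println("queued waiters after failure: " + lock.queueLength());
        }
        lock.unlock();
    }
}
```

Without the `waiters.remove(me)` in the catch block, the failed thread would stay in the queue and a later `unlock()` would keep trying to wake a thread that is no longer waiting.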

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
4 participants