8323782: Race: Thread::interrupt vs. AbstractInterruptibleChannel.begin #17444

@reinrich reinrich commented Jan 16, 2024

Set interrupted in Thread::interrupt before reading nioBlocker for correct (Dekker scheme) synchronization with concurrent execution of AbstractInterruptibleChannel::begin.
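To illustrate the ordering argument, here is a minimal sketch of a Dekker-style handshake. It is a simplified model and not the JDK code: the class `DekkerSketch`, its fields, and the `wakeups` counter are hypothetical names, and the real `nioBlocker` handling additionally involves the `interruptLock`. Each side writes its own flag first and then reads the other side's flag, so at least one of them should observe the other.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the handshake, NOT the JDK source; all names are hypothetical.
class DekkerSketch {
    // Both fields are volatile, so each thread's write is ordered
    // before its subsequent read of the other field.
    private volatile boolean interrupted;
    private volatile Runnable blocker;
    final AtomicInteger wakeups = new AtomicInteger();

    // Interrupter side: write the status first, then read the blocker.
    void interrupt() {
        interrupted = true;
        Runnable b = blocker;
        if (b != null) {
            b.run();
        }
    }

    // I/O side: install the blocker first, then read the status.
    void begin() {
        Runnable b = wakeups::incrementAndGet;
        blocker = b;
        if (interrupted) {
            b.run();
        }
    }

    // Runs both sides concurrently and returns the number of rounds in
    // which neither side noticed the other (a lost interrupt).
    static int lostInterrupts(int rounds) {
        int lost = 0;
        for (int i = 0; i < rounds; i++) {
            DekkerSketch s = new DekkerSketch();
            Thread io = new Thread(s::begin);
            Thread interrupter = new Thread(s::interrupt);
            io.start();
            interrupter.start();
            try {
                io.join();
                interrupter.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            if (s.wakeups.get() == 0) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("lost interrupts: " + lostInterrupts(2_000));
    }
}
```

If `interrupt()` in this model read `blocker` before writing `interrupted`, both sides could miss each other, which is the kind of lost interrupt this change addresses.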

The change passed our CI functional testing: JTReg tests (tier1-4 of hotspot and jdk, all of langtools and jaxp), SPECjvm2008, SPECjbb2015, Renaissance Suite, and SAP-specific tests.
Testing was done with fastdebug and release builds on the main platforms and also on Linux/PPC64le and AIX.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8323782: Race: Thread::interrupt vs. AbstractInterruptibleChannel.begin (Bug - P4)

Reviewers

Contributors

  • Alan Bateman <alanb@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17444/head:pull/17444
$ git checkout pull/17444

Update a local copy of the PR:
$ git checkout pull/17444
$ git pull https://git.openjdk.org/jdk.git pull/17444/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17444

View PR using the GUI difftool:
$ git pr show -t 17444

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17444.diff


bridgekeeper bot commented Jan 16, 2024

👋 Welcome back rrich! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Jan 16, 2024

@reinrich The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Jan 16, 2024
@reinrich (Member Author)

Thanks for the test @AlanBateman
/contributor add @AlanBateman

openjdk bot commented Jan 16, 2024

@reinrich
Contributor Alan Bateman <alanb@openjdk.org> successfully added.

@reinrich (Member Author)

Without the fix, the new test LotsOfInterrupts.java hangs after a few repetitions (using jtreg's REPEAT_COUNT). With the fix, it always terminates successfully.

@reinrich reinrich marked this pull request as ready for review January 17, 2024 09:01
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 17, 2024
mlbridge bot commented Jan 17, 2024

Webrevs

@AlanBateman AlanBateman left a comment

Thanks for finding this issue. It duplicates on JDK 8, and looking at the ancient history, it looks like it's been there since JDK 1.4 but just not noticed or diagnosed.

Reading the nioBlocker after setting the interrupt status is good.

There are about 20 tests for async interrupt of I/O ops in tier2. I see you've run tier1-4 so you've run them.

        // Write interrupted before reading nioBlocker for correct synchronization.
        interrupted = true;
        interrupt0(); // inform VM of interrupt
        if (this != Thread.currentThread()) {
            // thread may be blocked in an I/O operation
Contributor

I think the existing comment "thread may be blocked in an I/O operation" can move to before the if statement so that it's a bit clearer that this is code for I/O operations.

@@ -1702,24 +1702,23 @@ public final void stop() {
    public void interrupt() {
        if (this != Thread.currentThread()) {
            checkAccess();

        }
        // Write interrupted before reading nioBlocker for correct synchronization.
Contributor

I think I'd prefer if this said that it sets the interrupt status, and must be done before reading the nioBlocker.

openjdk bot commented Jan 17, 2024

@reinrich This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8323782: Race: Thread::interrupt vs. AbstractInterruptibleChannel.begin

Co-authored-by: Alan Bateman <alanb@openjdk.org>
Reviewed-by: alanb, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 3 new commits pushed to the master branch:

  • a0e5e16: 8325162: Remove duplicate GCMParameters class
  • 0d51b76: 8325877: Split up NativeCompilation.gmk
  • 2b1a840: 8325860: Serial: Move Generation.java to serial folder

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 17, 2024
@reinrich (Member Author)

Thanks for your help Alan.

@@ -0,0 +1,91 @@
/*
* Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
Contributor

I guess you should credit SAP here.

Member Author

Alan wrote the test. I added him as contributor.

Comment on lines +1708 to +1709
interrupted = true;
interrupt0(); // inform VM of interrupt
Member

Is it really safe/correct to move this outside the synchronized block? I know things have changed a bit with loom but we've "always" held a lock when doing the actual interrupt. I'd have to check the VM logic to be sure it can be called concurrently from multiple threads for the same target thread.

@AlanBateman AlanBateman Jan 18, 2024

> Is it really safe/correct to move this outside the synchronized block? I know things have changed a bit with loom but we've "always" held a lock when doing the actual interrupt. I'd have to check the VM logic to be sure it can be called concurrently from multiple threads for the same target thread.

This hasn't changed. The interruptLock is used to coordinate the add/remove of the nioBlocker. When there is no nioBlocker set then the interrupt status and unparking (as in JavaThread::interrupt) has always executed without the interruptLock (named "blockerLock" in the past).
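As a small illustration of the no-blocker path described here, the following sketch interrupts a thread that is parked without any I/O blocker installed. It only uses the public `Thread` and `LockSupport` APIs; the class and method names are hypothetical, and it does not show the VM-internal `JavaThread::interrupt` logic.

```java
import java.util.concurrent.locks.LockSupport;

// Illustration only; ParkInterruptSketch is a hypothetical name.
class ParkInterruptSketch {
    // Returns true if the parked thread terminated after being interrupted.
    static boolean interruptWakesParkedThread() {
        Thread t = new Thread(() -> {
            // park() may return spuriously, so loop on the interrupt status.
            while (!Thread.currentThread().isInterrupted()) {
                LockSupport.park();
            }
        });
        t.start();
        // No blocker is installed here: the interrupt status is set
        // and the target thread is unparked.
        t.interrupt();
        try {
            t.join(10_000);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + interruptWakesParkedThread());
    }
}
```

Because `park()` returns immediately when the interrupt status is already set, the sketch should not depend on whether the interrupt arrives before or after the thread parks.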

Member Author

I think that interrupting is just asynchronous to some extent.
E.g. a thread polls its interrupt status, thereby clearing it (without a lock), before calling NIO. A concurrent interrupt can then be lost even if the lock is acquired.
(Maybe clearing should not be done by a public method.)
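The polling behaviour mentioned here can be sketched with the two public accessors: `Thread.interrupted()` reports and clears the status, while `isInterrupted()` only reports it. The class and method names below are hypothetical and only serve the illustration.

```java
// Illustration only; InterruptStatusSketch is a hypothetical name.
class InterruptStatusSketch {
    // Returns the three observations made on the current thread:
    // [status before polling, result of polling, status after polling].
    static boolean[] observe() {
        Thread.currentThread().interrupt();                       // set the status
        boolean before = Thread.currentThread().isInterrupted();  // reads without clearing
        boolean polled = Thread.interrupted();                    // reads and clears
        boolean after = Thread.currentThread().isInterrupted();   // status is now cleared
        return new boolean[] { before, polled, after };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

An interrupt that lands just before such a poll is consumed by it, so a subsequent blocking I/O call made by the same thread would not see it.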

Member

Yep my bad on the VM side of things - no change there. But in the nioBlocker case doesn't this inherently make things more racy? Now maybe those races are allowed, but this might lead to a change in behaviour.

@AlanBateman AlanBateman Jan 19, 2024

> Yep my bad on the VM side of things - no change there. But in the nioBlocker case doesn't this inherently make things more racy? Now maybe those races are allowed, but this might lead to a change in behaviour.

I/O threads always check their interrupt status after installing the nioBlocker. The interrupter and the I/O threads (there can be several) may race to call postInterrupt, but that is okay.
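The I/O-side check can be observed from the outside with a small sketch, assuming the documented `InterruptibleChannel` behaviour: a thread whose interrupt status is already set and that invokes a blocking operation gets a `ClosedByInterruptException`, the channel is closed, and the status stays set. The class and method names are hypothetical, and `Pipe` is used only because it needs no external resources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

// Illustration only; PreInterruptedReadSketch is a hypothetical name.
class PreInterruptedReadSketch {
    // Returns true if a read by an already-interrupted thread fails with
    // ClosedByInterruptException, closes the channel, and keeps the status set.
    static boolean readFailsWhenAlreadyInterrupted() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                // Make data available so the read cannot block indefinitely.
                sink.write(ByteBuffer.wrap(new byte[] { 1 }));
                Thread.currentThread().interrupt();
                try {
                    source.read(ByteBuffer.allocate(1));
                    return false; // the read was expected to fail
                } catch (ClosedByInterruptException e) {
                    return !source.isOpen()
                            && Thread.currentThread().isInterrupted();
                } finally {
                    Thread.interrupted(); // clear the status for the caller
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("failed as expected: " + readFailsWhenAlreadyInterrupted());
    }
}
```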

@reinrich (Member Author)

I noticed that VirtualThread overrides isInterrupted

    @Override
    public boolean isInterrupted() {
        return interrupted;
    }

with just the same implementation as Thread has:

    /**
     * Tests whether this thread has been interrupted. The <i>interrupted
     * status</i> of the thread is unaffected by this method.
     *
     * @return {@code true} if this thread has been interrupted;
     *         {@code false} otherwise.
     * @see #interrupted()
     */
    public boolean isInterrupted() {
        return interrupted;
    }

This potentially hinders performance. Is there a reason to have this override?

@AlanBateman (Contributor)

> I noticed that VirtualThread overrides isInterrupted
> Is there a reason to have this override?

It was necessary at one point but no reason to now except to keep it close at the source level with the other methods that access the interrupt status.

@reinrich (Member Author)

>> I noticed that VirtualThread overrides isInterrupted
>> Is there a reason to have this override?
>
> It was necessary at one point but no reason to now except to keep it close at the source level with the other methods that access the interrupt status.

Do you want to keep it then or should I open an RFE to remove it?

@AlanBateman (Contributor)

> Do you want to keep it then or should I open an RFE to remove it?

Maybe leave it for now as there is significant churn in this code right now, with more accumulating in the loom repo in advance of the monitors work.

@dholmes-ora dholmes-ora (Member) left a comment

Sorry for leaving this hanging. Change looks good.

Thanks

@reinrich (Member Author)

Thanks for helping and reviewing.
Richard.
/integrate

openjdk bot commented Feb 16, 2024

Going to push as commit 4018b2b.
Since your change was applied there have been 15 commits pushed to the master branch:

  • 2705ed0: 8325074: ZGC fails assert(index == 0 || is_power_of_2(index)) failed: Incorrect load shift: 11
  • 3d85103: 8316813: NMT: Using WhiteBox API, virtual memory tracking should also be stressed in JMH tests
  • ba8db1f: 8325876: crashes in docker container tests on Linuxppc64le Power8 machines
  • 18cea82: 8319801: Recursive lightweight locking: aarch64 implementation
  • 9029bf6: 8316451: 6 java/lang/instrument/PremainClass tests ignore VM flags
  • 99c9ae1: 8323664: java/awt/font/JNICheck/FreeTypeScalerJNICheck.java still fails with JNI warning on some Windows configurations
  • 0fdfdf7: 8325983: Build failure after JDK-8324580
  • 3b1062d: 8322239: [macos] a11y : java.lang.NullPointerException is thrown when focus is moved on the JTabbedPane
  • 5a988a5: 8322750: Test "api/java_awt/interactive/SystemTrayTests.html" failed because A blue ball icon is added outside of the system tray
  • a231706: 8324580: SIGFPE on THP initialization on kernels < 4.10
  • ... and 5 more: https://git.openjdk.org/jdk/compare/b718ae35a87e5696cd6d26952ab1f7d3fda27691...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 16, 2024
@openjdk openjdk bot closed this Feb 16, 2024
@openjdk openjdk bot removed the ready and rfr labels Feb 16, 2024
openjdk bot commented Feb 16, 2024

@reinrich Pushed as commit 4018b2b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@reinrich reinrich deleted the 8323782__Race__Thread__interrupt_vs__AbstractInterruptibleChannel_begin branch February 19, 2024 09:25