Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8309688: Data race on java.io.ClassCache$CacheRef.strongReferent #14386

Closed
wants to merge 6 commits into from

Conversation

caoman
Copy link
Contributor

@caoman caoman commented Jun 8, 2023

Hi all,

Could anyone review this small fix for a data race in java.io.ClassCache$CacheRef? This fix makes the code safer by making the code data-race-free.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8309688: Data race on java.io.ClassCache$CacheRef.strongReferent (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14386/head:pull/14386
$ git checkout pull/14386

Update a local copy of the PR:
$ git checkout pull/14386
$ git pull https://git.openjdk.org/jdk.git pull/14386/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14386

View PR using the GUI difftool:
$ git pr show -t 14386

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14386.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 8, 2023

👋 Welcome back manc! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 8, 2023
@openjdk
Copy link

openjdk bot commented Jun 8, 2023

@caoman The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Jun 8, 2023
@caoman
Copy link
Contributor Author

caoman commented Jun 8, 2023

The pre-submit failures for windows are due to https://bugs.openjdk.org/browse/CODETOOLS-7903482.

@mlbridge
Copy link

mlbridge bot commented Jun 8, 2023

Webrevs

@caoman
Copy link
Contributor Author

caoman commented Jun 9, 2023

The presubmit failure on linux-x86 looks unrelated to this PR. It is https://bugs.openjdk.org/browse/JDK-8309592.

@AlanBateman
Copy link
Contributor

Could anyone review this small fix for a data race in java.io.ClassCache$CacheRef? This fix makes the code safer by making the code data-race-free.

If I read the existing code correctly, ClassCache::get can't return null. If there are two threads racing then it's okay for both to read a non-null strong ref, one read non-null and the other null, or both read null (if someone else cleared it). It just means there may be fallback to the soft ref or if that gets cleared then the remove/re-compute the ClassValue. Maybe more eyes are needed but it could be that this TSAN is reporting a false positive here.

@RogerRiggs
Copy link
Contributor

I think the race is benign also but the change maintains the essential semantics.
The use of AtomicRef will have a very slight performance impact.

The semantic is that at-least-one caller gets the reference before it can become "only" soft-referenced and might be reclaimed.

The use of AtomicReference changes the semantic to be exactly-one caller gets the initial hard reference; other callers will be able to get the value from the soft-reference as long as the first caller holds the reference and perhaps beyond depending on memory pressure.

ClassCaches are used by serialization to provide ObjectStreamClass descriptors.
It is quite possible that multiple threads may be serializing the same classes.

@caoman
Copy link
Contributor Author

caoman commented Jun 9, 2023

Agree with the analysis that this race looks benign, and the slight change in semantics is OK. It is still a true data race and TSAN is NOT reporting a false positive though.

In our experience, the best practice is to avoid all data races unless there's a strong reason that fixing the race has undesirable effect (e.g. noticeable performance penalty). A benign race could become problematic in the future with unsuspecting changes to this file, and has higher maintenance cost in general.

I hope we can actually fix this data race. If we decide not to, at least add a comment about this intentionally benign data race.

@RogerRiggs
Copy link
Contributor

/reviewers 2 reviewer

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

@openjdk
Copy link

openjdk bot commented Jun 12, 2023

@caoman This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8309688: Data race on java.io.ClassCache$CacheRef.strongReferent

Reviewed-by: rriggs, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the master branch:

  • 3e0bbd2: 8285368: Overhaul doc-comment inheritance
  • 3eeb681: 8167252: Some of Charset.availableCharsets() does not contain itself
  • 653a8d0: 8310129: SetupNativeCompilation LIBS should match the order of the other parameters
  • 947f149: 8308444: LoadStoreNode::result_not_used() is too conservative
  • 8b4af46: 8309974: some JVMCI tests fail when VM options include -XX:+EnableJVMCI
  • 0038491: 8309978: [x64] Fix useless padding
  • 5f3613e: 8309960: ParallelGC young collections very slow in DelayInducer
  • 83d9267: 8303513: C2: LoadKlassNode::make fails with 'expecting TypeKlassPtr'
  • de8aca2: 8307907: [ppc] Remove RTM locking implementation
  • 4c0e164: 8309717: C2: Remove Arena::move_contents usage
  • ... and 1 more: https://git.openjdk.org/jdk/compare/181845ae46157a9bb3bf8e2a328fa59eddc0273a...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 12, 2023
@openjdk
Copy link

openjdk bot commented Jun 12, 2023

@RogerRiggs
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jun 12, 2023
Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this data race is benign, because ClassValue gives us the required memory semantics. Introducing additional performance hogs seems weird in this case, see below.

Since TSAN complains about the unsynchronized conflicting accesses in getStrong() and clearStrong(), maybe the better solution would be marking strong as volatile? It would still be awkward, but we would "only" pay a price of volatile load on a hot path.

src/java.base/share/classes/java/io/ClassCache.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/io/ClassCache.java Outdated Show resolved Hide resolved
@caoman
Copy link
Contributor Author

caoman commented Jun 13, 2023

Thank you @shipilev! Good point about the cost of extra AtomicReference instance, and slowing down the fast path with a CAS.
Agree that making strongReferent volatile is better and simpler.

For AtomicReferenceFieldUpdater or VarHandle, they are more complicated and I don't see much benefit over a simple volatile in this case.

@shipilev
Copy link
Member

Unfortunately, even the single volatile on that fast path seems to cost quite a bit :(

On M1:

# CONF=macosx-aarch64-server-release make images test TEST=micro:java.io.ObjectStreamClasses

Benchmark                       Mode  Cnt    Score   Error  Units

# Before
ObjectStreamClasses.testLookup  avgt    8  106,946 ± 0,691  ns/op

# After
ObjectStreamClasses.testLookup  avgt    8  119,984 ± 0,588  ns/op

So again, this looks like a benign data race, and we are penalizing the serialization a bit even with the lightweight fix. This is only to make TSAN happy, right? If so, I would rather keep the code as is.

There are plenty of benign data races in JDK, we should be able to somehow tell TSAN which are actually fine.

@caoman
Copy link
Contributor Author

caoman commented Jun 15, 2023

Thanks for the benchmarking. I was unaware of the microbenchmark java.io.ObjectStreamClasses.
The performance penalty is likely related to aarch64 being slower on volatile load. I see no observable difference on x86-64:

Benchmark                       Mode  Cnt    Score    Error  Units
# Before
ObjectStreamClasses.testLookup  avgt    8  230.255 ± 10.540  ns/op

# After
ObjectStreamClasses.testLookup  avgt    8  230.291 ± 19.369  ns/op

Anyway, this race does look benign and I'm OK with just documenting this benign race, and suppress it in TSAN. Updated this PR to a comment-only change.

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I have suggestions about the comment.

src/java.base/share/classes/java/io/ClassCache.java Outdated Show resolved Hide resolved
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 15, 2023
Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am okay with it. Since this is my comment, I'd like someone else to re-review :)

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine

@caoman
Copy link
Contributor Author

caoman commented Jun 15, 2023

Thank you for the reviews and suggestions!

@caoman
Copy link
Contributor Author

caoman commented Jun 15, 2023

/integrate

@openjdk
Copy link

openjdk bot commented Jun 15, 2023

Going to push as commit 5c70516.
Since your change was applied there have been 15 commits pushed to the master branch:

  • 81bfd78: 8309632: JDK 21 RDP1 L10n resource files update
  • 9f64a64: 8301379: Verify TLS_ECDH_* cipher suites cannot be negotiated
  • 4a5475c: 8309953: Strengthen and optimize oopDesc age methods
  • 79ff72a: 8308499: Test vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter001/TestDescription.java failed: VMDisconnectedException
  • 3e0bbd2: 8285368: Overhaul doc-comment inheritance
  • 3eeb681: 8167252: Some of Charset.availableCharsets() does not contain itself
  • 653a8d0: 8310129: SetupNativeCompilation LIBS should match the order of the other parameters
  • 947f149: 8308444: LoadStoreNode::result_not_used() is too conservative
  • 8b4af46: 8309974: some JVMCI tests fail when VM options include -XX:+EnableJVMCI
  • 0038491: 8309978: [x64] Fix useless padding
  • ... and 5 more: https://git.openjdk.org/jdk/compare/181845ae46157a9bb3bf8e2a328fa59eddc0273a...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 15, 2023
@openjdk openjdk bot closed this Jun 15, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 15, 2023
@openjdk
Copy link

openjdk bot commented Jun 15, 2023

@caoman Pushed as commit 5c70516.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@caoman caoman deleted the JDK8309688 branch June 16, 2023 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
4 participants