8273790: Potential cyclic dependencies between Gregorian and CalendarSystem #5683

Closed
wants to merge 3 commits into from

Member

@jaikiran jaikiran commented Sep 24, 2021

Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?

As noted in that issue, concurrently loading sun.util.calendar.CalendarSystem and sun.util.calendar.Gregorian in separate threads can deadlock because their static initialization is cyclic. More specifically, consider this case:

  • Thread T1 initiates a class load of sun.util.calendar.CalendarSystem.
  • This gives T1 the implicit class init lock on CalendarSystem.
  • Thread T2, at the same time, initiates a class load of sun.util.calendar.Gregorian.
  • This gives T2 an implicit class init lock on Gregorian.
  • T1, still holding the lock on CalendarSystem, attempts to load Gregorian because it wants to create a (singleton) instance of Gregorian and assign it to the static final GREGORIAN_INSTANCE member. Since T2 is holding the class init lock on Gregorian, T1 ends up waiting.
  • T2, on the other hand, is still loading the Gregorian class. Gregorian itself "is a" CalendarSystem, so during this loading T2 travels up the class hierarchy and asks for the lock on CalendarSystem. However, T1 is holding this lock, so T2 ends up waiting on T1, which is waiting on T2. That is the deadlock.
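The scenario above can be reproduced in miniature. The following is a minimal sketch, not the JDK code: `Base` and `Derived` are hypothetical stand-ins for CalendarSystem and Gregorian, and the 500 ms sleep is an assumption added only to widen the race window so that the second thread reliably starts initializing the subclass while the first is still inside the superclass initializer.

```java
import java.util.concurrent.CountDownLatch;

final class InitCycleDemo {
    // Signals that T1 has entered Base's static initializer.
    static final CountDownLatch baseInitStarted = new CountDownLatch(1);

    // Hypothetical stand-in for CalendarSystem.
    static class Base {
        static final Base DEFAULT;
        static {
            baseInitStarted.countDown();
            sleepQuietly(500);        // widen the race window (assumption, for the demo only)
            DEFAULT = new Derived();  // requires Derived to be initialized
        }
    }

    // Hypothetical stand-in for Gregorian; initializing it requires Base first.
    static class Derived extends Base { }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static void joinQuietly(Thread t, long millis) {
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if both threads are still blocked after the timeouts. */
    static boolean deadlocks() {
        // T1 triggers initialization of Base by reading a static field.
        Thread t1 = new Thread(() -> { Object unused = Base.DEFAULT; }, "T1");
        // T2 waits until T1 is inside Base's initializer, then initializes Derived.
        Thread t2 = new Thread(() -> {
            try {
                baseInitStarted.await();
            } catch (InterruptedException e) {
                return;
            }
            new Derived();
        }, "T2");
        // Daemon threads, so the JVM can still exit while they are stuck.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        joinQuietly(t1, 2000);
        joinQuietly(t2, 2000);
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlocks());
    }
}
```

Note that this kind of hang involves class initialization locks rather than Java monitors, so a thread dump (as attached to the JBS issue) is usually the practical way to diagnose it.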

The linked JBS issue has a thread dump which shows this in action.

The commit here delays the creation of the Gregorian instance by moving it from the static initialization of the CalendarSystem class to the first call to CalendarSystem#getGregorianCalendar(). This prevents CalendarSystem from needing a lock on Gregorian during its static init. (Of course, if some code in this static init flow called CalendarSystem#getGregorianCalendar(), we would be back to square one. I have verified, both manually and through the jtreg test, that the code in question makes no such calls.)

A new jtreg test has been introduced to reproduce the issue and verify the fix. In addition to loading the 2 classes in question, the test concurrently loads a few other classes. These classes have static initialization that leads to calls to CalendarSystem#getGregorianCalendar() or CalendarSystem#forName(). Including them in the test ensures that the deadlock hasn't "moved" to a different location. I have run this test approximately 25 times with the fix and haven't seen it deadlock.
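For readers who want the shape of such a test without opening the diff, here is a rough sketch of the general technique: several tasks wait on a shared latch and then each triggers loading of one class, with a timeout standing in for deadlock detection. It is not the test from this PR. `ConcurrentClassLoadSketch`, `loadConcurrently`, and the timeout value are hypothetical names and choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ConcurrentClassLoadSketch {

    /** Loads all named classes at (roughly) the same time; false means a task timed out. */
    static boolean loadConcurrently(List<String> classNames, long timeoutSeconds) {
        CountDownLatch trigger = new CountDownLatch(1);
        List<Callable<Class<?>>> tasks = new ArrayList<>();
        for (String name : classNames) {
            tasks.add(() -> {
                trigger.await();             // hold every task until all are submitted
                return Class.forName(name);  // loads and initializes the class
            });
        }
        // One thread per task so that no task queues behind another;
        // daemon threads so a hung task cannot keep the JVM alive.
        ExecutorService executor = Executors.newFixedThreadPool(tasks.size(), r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            List<Future<Class<?>>> futures = new ArrayList<>();
            for (Callable<Class<?>> task : tasks) {
                futures.add(executor.submit(task));
            }
            trigger.countDown();             // release all tasks at once
            for (Future<Class<?>> f : futures) {
                f.get(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false;                    // treated here as a likely deadlock
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean ok = loadConcurrently(
                List.of("sun.util.calendar.CalendarSystem", "sun.util.calendar.Gregorian"), 10);
        System.out.println("completed without timeout: " + ok);
    }
}
```

Each run exercises the race only once per JVM, because a class is initialized at most once; that is presumably why the real test is run in a fresh VM several times.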


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8273790: Potential cyclic dependencies between Gregorian and CalendarSystem

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5683/head:pull/5683
$ git checkout pull/5683

Update a local copy of the PR:
$ git checkout pull/5683
$ git pull https://git.openjdk.java.net/jdk pull/5683/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5683

View PR using the GUI difftool:
$ git pr show -t 5683

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5683.diff


@bridgekeeper bridgekeeper bot commented Sep 24, 2021

👋 Welcome back jpai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Sep 24, 2021

@openjdk openjdk bot commented Sep 24, 2021

@jaikiran The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.


@mlbridge mlbridge bot commented Sep 24, 2021

Webrevs

Member

@naotoj naotoj left a comment

Thanks, Jaikiran. Looks good. Some minor comments.

@@ -120,7 +120,17 @@ private static void initNames() {
      * @return the <code>Gregorian</code> instance
      */
     public static Gregorian getGregorianCalendar() {
-        return GREGORIAN_INSTANCE;
+        var gCal = GREGORIAN_INSTANCE;
Member

@naotoj naotoj Sep 24, 2021

Do we need the local variable gCal?

Member Author

@jaikiran jaikiran Sep 25, 2021

This was there to avoid additional volatile reads in that method. A performance optimization. However, with the change Roger suggested, this is no longer relevant.
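To illustrate the point about volatile reads: the sketch below shows the common shape of a volatile-based lazy getter that copies the field into a local variable. It is an assumption about the general technique, not a reproduction of the first revision of this PR; `VolatileLazySketch` and its nested `Gregorian` class are hypothetical stand-ins.

```java
final class VolatileLazySketch {
    // Hypothetical stand-in for sun.util.calendar.Gregorian.
    static final class Gregorian { }

    // Written once, lazily; volatile so other threads see a fully constructed object.
    private static volatile Gregorian instance;

    static Gregorian getGregorianCalendar() {
        // Copy to a local: the common (already initialized) path then performs
        // exactly one volatile read instead of one for the check and one for the return.
        Gregorian g = instance;
        if (g == null) {
            synchronized (VolatileLazySketch.class) {
                g = instance;                 // re-check under the lock
                if (g == null) {
                    g = new Gregorian();
                    instance = g;
                }
            }
        }
        return g;
    }

    public static void main(String[] args) {
        // Both calls return the same lazily created instance.
        System.out.println(getGregorianCalendar() == getGregorianCalendar());
    }
}
```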

* @run main/othervm CalendarSystemDeadLockTest
* @run main/othervm CalendarSystemDeadLockTest
* @run main/othervm CalendarSystemDeadLockTest
* @run main/othervm CalendarSystemDeadLockTest
Member

@naotoj naotoj Sep 24, 2021

Just curious: before the fix, how many test instances caused the deadlock? I'd think these 5 runs are an arbitrary number; I just wanted to confirm that 5 runs are appropriate.

Member Author

@jaikiran jaikiran Sep 25, 2021

Hello @naotoj,
On my setup, without the fix, the test almost always deadlocks on the first run. There have been cases where it passed the first time, but running it a second time has always reproduced the failure. The 5 runs that I have in this test are indeed an arbitrary number. Given how quickly this test completes, I decided to use a slightly higher number of 5 instead of maybe 2 or 3. Do you think we should change the run count to something else?

// add a couple of tasks which directly invoke sun.util.calendar.CalendarSystem#getGregorianCalendar()
tasks.add(new GetGregorianCalTask(taskTriggerLatch));
tasks.add(new GetGregorianCalTask(taskTriggerLatch));
final ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
Member

@naotoj naotoj Sep 24, 2021

Asserting tasks.size() == numTasks may help here.

Member Author

@jaikiran jaikiran Sep 25, 2021

Yes, that makes sense. I've updated the test to add this check.

}
}
}
}
Member

@naotoj naotoj Sep 24, 2021

Need a new line at the EOF.

Member Author

@jaikiran jaikiran Sep 25, 2021

Done. I've updated this in the latest version of the PR.

Contributor

@RogerRiggs RogerRiggs commented Sep 24, 2021

As an alternative, can the Gregorian instance be moved to a nested (static) class?
That will delay initialization until it is needed. This "holder" pattern is used elsewhere to defer initialization and break cycles.
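For reference, the holder idiom in its generic form looks roughly like the sketch below. The class names (`HolderPatternSketch`, `CalSystem`, `Greg`, `GregorianHolder`) are hypothetical stand-ins and not the names used in the JDK sources. The key property is that the outer class's static initializer no longer touches the subclass, so initializing the superclass and the subclass concurrently cannot form a cycle.

```java
final class HolderPatternSketch {

    // Hypothetical stand-in for CalendarSystem.
    static abstract class CalSystem {
        // The JVM initializes this nested class only on first access to INSTANCE,
        // that is, on the first call to getGregorianCalendar(), not while
        // CalSystem itself is being initialized.
        private static final class GregorianHolder {
            static final Greg INSTANCE = new Greg();
        }

        static Greg getGregorianCalendar() {
            return GregorianHolder.INSTANCE;
        }
    }

    // Hypothetical stand-in for Gregorian.
    static final class Greg extends CalSystem { }

    public static void main(String[] args) {
        Greg first = CalSystem.getGregorianCalendar();
        Greg second = CalSystem.getGregorianCalendar();
        // Class initialization is thread-safe and happens once, so both
        // references point at the same object without explicit locking.
        System.out.println(first == second);
    }
}
```

Compared with a volatile field, this relies entirely on the class initialization guarantees of the JVM, so the getter needs neither a null check nor synchronization.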

Member Author

@jaikiran jaikiran commented Sep 25, 2021

Hello Roger,

> As an alternative, can the Gregorian instance be moved to a nested (static) class?
> That will delay initialization until it is needed. This "holder" pattern is used elsewhere to defer initialization and break cycles.

I did indeed have that in mind when I started work on this. That was something Chris Hegarty had suggested and we have used in a different (but similar) issue a while back[1]. I was however unsure if that's a common enough technique, so had started off with the volatile approach. I've now updated the PR to use the holder technique instead.

[1] #2893 (comment)

Member

@kelthuzadx kelthuzadx left a comment

@jaikiran Thanks for fixing this. Delaying instance creation via a static holder class seems reasonable to me.

naotoj approved these changes Sep 27, 2021
Member

@naotoj naotoj left a comment

Looks good. Thank you for the fix!

Contributor

@RogerRiggs RogerRiggs left a comment

👍


@openjdk openjdk bot commented Sep 27, 2021

@jaikiran This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8273790: Potential cyclic dependencies between Gregorian and CalendarSystem

Reviewed-by: naoto, yyang, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 22 new commits pushed to the master branch:

  • 2cffe4c: 8274326: [macos] Ensure initialisation of sun/lwawt/macosx/CAccessibility in JavaComponentAccessibility.m
  • 172900d: 8274255: Update javac messages to use "enum class" rather than "enum type"
  • b0983df: 8274074: SIGFPE with C2 compiled code with -XX:+StressGCM
  • 7436a77: 8274317: Unnecessary reentrant synchronized block in java.awt.Cursor
  • 7426fd4: 8274325: C4819 warning at vm_version_x86.cpp on Windows after JDK-8234160
  • e3aff8f: 8274289: jdk/jfr/api/consumer/TestRecordedFrameType.java failed with "RuntimeException: assertNotEquals: expected Interpreted to not equal Interpreted"
  • 252aaa9: 8274293: Build failure on macOS with Xcode 13.0 as vfork is deprecated
  • 7700b25: 8273401: Disable JarIndex support in URLClassPath
  • 5ec1cdc: 8274321: Standardize values of @since tags in javax.lang.model
  • 4838a2c: 8274143: Disable "invalid entry for security.provider.X" error message in log file when security.provider.X is empty
  • ... and 12 more: https://git.openjdk.java.net/jdk/compare/f214d6e8736a620c8e1b87c30587aa0977cccc4c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Sep 27, 2021
Member Author

@jaikiran jaikiran commented Sep 28, 2021

Thank you Roger, Naoto and Yi Yang for the reviews.

Member Author

@jaikiran jaikiran commented Sep 28, 2021

/integrate


@openjdk openjdk bot commented Sep 28, 2021

Going to push as commit ddc2627.
Since your change was applied there have been 31 commits pushed to the master branch:

  • 633eab2: 8174819: java/nio/file/WatchService/LotsOfEvents.java fails intermittently
  • 8876eae: 8269685: Optimize HeapHprofBinWriter implementation
  • c880b87: 8274367: Re-indent stack-trace examples for Throwable.printStackTrace
  • c4b52c7: 8271303: jcmd VM.cds {static, dynamic}_dump should print more info
  • 5b660f3: 8274392: Suppress more warnings on non-serializable non-transient instance fields in java.sql.rowset
  • 0865120: 8274345: make build-test-lib is broken
  • 75404ea: 8267636: Bump minimum boot jdk to JDK 17
  • 14100d5: 8274170: Add hooks for custom makefiles to augment jtreg test execution
  • daaa47e: 8274311: Make build.tools.jigsaw.GenGraphs more configurable
  • 2cffe4c: 8274326: [macos] Ensure initialisation of sun/lwawt/macosx/CAccessibility in JavaComponentAccessibility.m
  • ... and 21 more: https://git.openjdk.java.net/jdk/compare/f214d6e8736a620c8e1b87c30587aa0977cccc4c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Sep 28, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Sep 28, 2021

@openjdk openjdk bot commented Sep 28, 2021

@jaikiran Pushed as commit ddc2627.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@jaikiran jaikiran deleted the 8273790 branch Sep 28, 2021
4 participants