Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8197825: [Test] Intermittent timeout with javax/swing JColorChooser Test #2220

Closed
wants to merge 5 commits into from

Conversation

prsadhuk
Copy link
Contributor

@prsadhuk prsadhuk commented Jan 25, 2021

This test was failing in our nightly mach5 testing. Appropriate stability code in form of waitForIdle and delay is added.
mach5 job running for several iterations on all platforms is ok. Link in JBS.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8197825: [Test] Intermittent timeout with javax/swing JColorChooser Test

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2220/head:pull/2220
$ git checkout pull/2220

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 25, 2021

👋 Welcome back psadhukhan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 25, 2021
@openjdk
Copy link

openjdk bot commented Jan 25, 2021

@prsadhuk The following label will be automatically applied to this pull request:

  • swing

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the swing client-libs-dev@openjdk.org label Jan 25, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 25, 2021

Webrevs

Copy link
Member

@aivanov-jdk aivanov-jdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Yet I suggest fixing the typo in the bug synopsis: Intermiitent → Intermittent

@openjdk
Copy link

openjdk bot commented Jan 25, 2021

@prsadhuk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8197825: [Test] Intermittent timeout with javax/swing JColorChooser Test

Reviewed-by: trebari, aivanov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 136 new commits pushed to the master branch:

  • 8a9004d: 8260574: Remove parallel constructs in GenCollectedHeap::process_roots
  • 0da9cad: 8260501: [Vector API] Improve register usage for shift operations on x86
  • a61ff87: 8260685: ProblemList 2 compiler/jvmci/compilerToVM tests in Xcomp configs
  • fcfe647: 8260462: Missing in Modality.html
  • 67a34da: 8260630: Templatize literal_size
  • 6b24e98: 8259008: ArithmeticException was thrown at "Monitor Cache Dump" on HSDB
  • 69ee314: 8249867: xml declaration is not followed by a newline
  • fb46c91: 8260674: ProblemList jdk/incubator/vector/VectorHash.java in Xcomp configs
  • 71bfe96: 8217633: Configurable extensions with system properties
  • f5ca838: 8065773: JDI: UOE is not thrown, when redefineClasses changes a class modifier
  • ... and 126 more: https://git.openjdk.java.net/jdk/compare/1452280073757eb4ed058ff3f07a8106f5c386f7...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 25, 2021
@mrserb
Copy link
Member

mrserb commented Jan 25, 2021

This bug looks like a product bug, infinite loop in the "Toolkit.shutdown", so it is better to fix the jdk than the test.

@openjdk openjdk bot added 2d client-libs-dev@openjdk.org awt client-libs-dev@openjdk.org labels Jan 27, 2021
while (!tk.IsDisposed()) {
Sleep(100);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can cause infinite looping as the message queue to get DISPOSE events is removed so if we are not getting disposed by now, we probably will not get DISPOSE event ever causing infinite loop.
XToolkit, LWToolkit does not have this infinite loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that this is not a test bug, so the test should not be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at the "AwtToolkit::Dispose()" method, on how much stuff should be done to properly shutdown the toolkit. This Dispose() method is executed immediately when we exit the message loop in the "Java_sun_awt_windows_WToolkit_eventLoop". So when the shutdown hook is executed we should have the message loop, then we call tk.QuitMessageLoop to stop it, and wait until all code in the Dispose() is executed. But since the IsDisposed() return false we for unknow reason hang, does it mean that the message loop still operates? Or we got some error during "QuitMessageLoop"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems in successful run, when the test finish

  • AwtToolkit::MessageLoop starts
  • DoQuitMessageLoop is called
  • AwtToolkit::QuitMessageLoop starts
  • AwtToolkit::QuitMessageLoop finishes
  • AwtToolkit::MessageLoop finish
  • Dispose() is called, m_isDisposed sets to true
  • shutdown hook isDisposed is true so no infinite loop

During unsuccessful run,

  • AwtToolkit::MessageLoop starts
  • DoQuitMessageLoop is called
  • AwtToolkit::QuitMessageLoop starts
  • AwtToolkit::QuitMessageLoop finishes
  • AwtToolkit::MessageLoop NEVER ends so Dispose() is not called so m_isDisposed is not set to true so shutdown hook goes in infinite loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the code yesterday. Could it be because of synchronisation? I mean, do we need to use native synchronisation to guarantee variable changes are seen across the threads?

Does MessageLoop not receive Quit / Null message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that this is not a test bug, so the test should not be changed.

The test never dispose of the frame. Why is it expected to shut down the toolkit? Shall the frame be disposed of when the main thread in the test finishes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test never dispose of the frame. Why is it expected to shut down the toolkit? Shall the frame be disposed of when the main thread in the test finishes?

The shutdown is caused by the System.exit call while the toolkit active, so we should shut down it before the end.

Copy link
Contributor Author

@prsadhuk prsadhuk Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems "m_breakMessageLoop" is never true for unsuccessful run even though AwtToolkit::QuitMessageLoop finish execution (where m_breakMessageLoop is set to true),
so AwtToolkit::MessageLoop never ends and shutdown hook gets called where tk.isDisposed() loop start spinning. seems to be some timing issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems "m_breakMessageLoop" is never true for unsuccessful run even though AwtToolkit::QuitMessageLoop finish execution (where m_breakMessageLoop is set to true),
so AwtToolkit::MessageLoop never ends and shutdown hook gets called where tk.isDisposed() loop start spinning. seems to be some timing issue.

Then this does look like a synchronisation problem. One thread changes the value of m_breakMessageLoop but another doesn't see it's changed. Should m_breakMessageLoop be declared as volatile?

Comment on lines +77 to +78
if (frame != null) {
SwingUtilities.invokeAndWait(() -> frame.dispose());

This comment was marked as outdated.

@aivanov-jdk
Copy link
Member

Yet I suggest fixing the typo in the bug synopsis: Intermiitent → Intermittent

I've edited the JBS synopsis. Please also update the PR subject.

@prsadhuk prsadhuk changed the title 8197825: [Test] Intermiitent timeout with javax/swing JColorChooser Test 8197825: [Test] Intermittent timeout with javax/swing JColorChooser Test Jan 29, 2021
@@ -475,7 +475,7 @@ class AwtToolkit {
BOOL m_verifyComponents;
BOOL m_breakOnError;

BOOL m_breakMessageLoop;
volatile BOOL m_breakMessageLoop;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this volatile modifier resolve the now-removed infinite loop in while (!tk.IsDisposed()) in WToolkit_shutdown?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this volatile modifier resolve the now-removed infinite loop in while (!tk.IsDisposed()) in WToolkit_shutdown?

The loop should not be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, volatile modifier has no effect if infinite loop is reinstated..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this volatile modifier resolve the now-removed infinite loop in while (!tk.IsDisposed()) in WToolkit_shutdown?

The loop should not be removed.

No, it should not, as you noted previously.

However, making m_breakMessageLoop volatile does not look right either. If QuitMessageLoop is called from !IsMainThread() thread, it is posted as a message to run on the correct thread. Thus m_breakMessageLoop should be accessed on a single thread only; if it's the case, volatile is unneeded.

And @prsadhuk's latest test confirms it. There must be something else…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the test locally for several iterations but it does not fail locally so I believe it's a test issue. Re reverting the test changes made in 1st webrev...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a reason to change the test until the root cause of the hang is not recognized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were many instances faced during test sprint where a test was failing in one platform and we added the stability code of waitForIdle,delay etc to make it work. It was not cited as product bug...I believe it's the same here...I am not able to reproduce it locally at all, and I also checked with other who was also not able to reproduce it locally. It only failed in mach5 and it passed once the test is modified with stability fixes...I dont see a reason to stall it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this is not the same, this a hang in the product due to failed toolkit shutdown, please create one more bug and attach the old test to it reproduce the bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. JDK-8261016

@openjdk openjdk bot removed 2d client-libs-dev@openjdk.org awt client-libs-dev@openjdk.org labels Feb 1, 2021
@prsadhuk
Copy link
Contributor Author

prsadhuk commented Feb 3, 2021

/integrate

@openjdk openjdk bot closed this Feb 3, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 3, 2021
@openjdk
Copy link

openjdk bot commented Feb 3, 2021

@prsadhuk Since your change was applied there have been 172 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

Pushed as commit c008410.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@prsadhuk prsadhuk deleted the 8197825 branch February 3, 2021 04:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated swing client-libs-dev@openjdk.org
4 participants