Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8327137: Add test for ConcurrentModificationException in BasicDirectoryModel #18109

Conversation

aivanov-jdk
Copy link
Member

@aivanov-jdk aivanov-jdk commented Mar 4, 2024

I'm adding a regression test for JDK-8323670 and JDK-8307091; it's also a regression test for JDK-8240690.

I referenced this test in PR #17462 in this comment. I fine-tuned the test since that time.

The test doesn't fail all the time without the fix for JDK-8323670, however, it fails sometimes. If you run the test several times, it will likely fail without the fix.

For me, the test fails about 10 times from 40 runs in the CI. It fails on macOS more frequently than on Linux.

When the test passes, it usually completes in 5 minutes.

How the test works

The test creates a temporary directory in the current directory and creates a number of files in it. (The number of files is controlled by NUMBER_OF_THREADS constant). Then the test creates JFileChooser in the temporary directory.

The test starts several scanner threads, the number is controlled by NUMBER_OF_THREADS, which repeatedly call fileChooser.rescanCurrentDirectory(). This results in calling BasicDirectoryModel.validateFileCache which starts a background thread — "Basic L&F File Loading Thread" — to enumerate the files.

A timer is used to create new files in the directory that the file chooser is using.

After enumerating the files, the File Loading Thread posts an event to EDT. The event updates fileCache and fires a ListDataEvent.

If the File Loading Thread is iterating over fileCache using Iterator (when fileCache.subList or fileCache.equals is running; or a new Vector instance is created from a fileCache or its sublist) and fileCache is being updated on EDT, then ConcurrentModificationException is thrown.

On Linux and on headless macOS, ShellFolder.invoke is executed in the caller, which makes it easier to reproduce the issue. Because of JDK-8325179, there are several File Loading Threads, which also helps to reproduce the issue.

On headful macOS, the BasicDirectoryModel is not used, so the test does not reproduce the issue.

On Windows, the test does not fail or fails with OutOfMemoryError. It is because all the File Loading Threads are serialised on the COM thread, ShellFolder.invoke submits the task to the COM thread and waits for it to complete. The chance of updating fileCache while another thread is iterating over it is significantly reduced.

Even more, filechooser.isTraversable(file) is also executed on the COM thread, which means there's a heavy contention for the COM thread from each File Loading Thread.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8327137: Add test for ConcurrentModificationException in BasicDirectoryModel (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18109/head:pull/18109
$ git checkout pull/18109

Update a local copy of the PR:
$ git checkout pull/18109
$ git pull https://git.openjdk.org/jdk.git pull/18109/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18109

View PR using the GUI difftool:
$ git pr show -t 18109

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18109.diff

Webrev

Link to Webrev Comment

This test creates 10 threads which concurrently call
rescanFileSystem that calls BasicDirectoryModel.validateFileCache.
Make the timer create a random number of files and more frequently.
Ensure the test cleans up after itself:
 * Handle exceptions in the threads and timer task,
 * Interrupt threads in this case,
 * Pass the first exception as the test result,
 * Add as suppressed exceptions if more exceptions occur.
It's more jtreg friendly, files are created in the scratch directory
which is cleaned up automatically. Otherwise, the directory could
be left on the system when the test failed.
ConcurrentModificationException is thrown in a thread where this test
doesn't catch the exception.

Mark timeStart as final.
When all the test threads are created within a new ThreadGroup,
uncaughtException can catch ConcurrentModificationException which
occurs in the 'Basic L&F File Loading Thread'. In this case,
the test exits quickly and cleans up the files.

Restructure the source file: main, wrapper, runTest; exception handlers;
helper classes.

Change the test parameters.
All the 8323670, 8307091, 8240690 are about the same problem where
ConcurrentModificationException is thrown from BasicDirectoryModel.
On Windows, the test usually fails with OutOfMemoryException.
Because ShellFolder.invoke() runs on COM thread on Windows,
all the (worker) threads are serialised on COM thread, reducing
the opportunity to concurrently modify fileCache.
On Linux and *headless* macOS, ShellFolder.invoke() runs in the
same thread, therefore all the threads run simultaneously.
It's important to do it on macOS, otherwise the test won't fail
because BasicDirectoryModel isn't used in headful environment on macOS.
@bridgekeeper
Copy link

bridgekeeper bot commented Mar 4, 2024

👋 Welcome back aivanov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 4, 2024
@openjdk
Copy link

openjdk bot commented Mar 4, 2024

@aivanov-jdk The following label will be automatically applied to this pull request:

  • client

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the client client-libs-dev@openjdk.org label Mar 4, 2024
@mlbridge
Copy link

mlbridge bot commented Mar 4, 2024

Webrevs

@aivanov-jdk
Copy link
Member Author

@mrserb, @RealCLanger, could you run this test in your environments, please? It is for Linux and macOS only.

I'm looking for confirmation that the test fails without the fix for JDK-8323670. Since JDK-8323670 is fixed in mainline, you can run it in 21u or 17u which don't have the fix yet.

You'll need to run the test several times because it fails only occasionally. That's why I want to collect more data on whether the test is useful/stable enough.

I greatly appreciate your help.

@prrace
Copy link
Contributor

prrace commented Mar 12, 2024

"When the test passes, it usually completes in 5 minutes."

and you've never seen the test time out ? I thought default jtreg timeout is 120 seconds.

@aivanov-jdk
Copy link
Member Author

"When the test passes, it usually completes in 5 minutes."

and you've never seen the test time out ? I thought default jtreg timeout is 120 seconds.

No, I've never seen the test time out.

The default timeout is factored by 2 in client jobs, another factor could be applied on the hosts.

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

@aivanov-jdk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8327137: Add test for ConcurrentModificationException in BasicDirectoryModel

Reviewed-by: serb, tr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 444 new commits pushed to the master branch:

  • 5cddc2d: 8325252: C2 SuperWord: refactor the packset
  • 6b1b0e9: 8329103: assert(!thread->in_asgct()) failed during multi-mode profiling
  • bc546c2: 8328561: test java/awt/Robot/ManualInstructions/ManualInstructions.java isn't used
  • af7c6af: 8324808: Manual printer tests have no Pass/Fail buttons, instructions close set 3
  • d3fc8df: 8329135: Store Universe::*exception_instance() in CDS archive
  • a85c849: 8328273: sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java failed with java.rmi.server.ExportException: Port already in use
  • 70c8ff1: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout
  • ecd2b71: 8329354: java/text/Format/MessageFormat/CompactSubFormats.java fails
  • c2979c1: 8329425: ProblemList containers/docker/TestJFREvents.java on linux-x64
  • 5698f7a: 8329134: Reconsider TLAB zapping
  • ... and 434 more: https://git.openjdk.org/jdk/compare/8f6edd8dc866bf970b7e7b8358f62832887e6e8b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

try {
createFiles(temp);

final JFileChooser fc = new JFileChooser(temp.toFile());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be created on EDT, just in case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters here… The test does not rely on even processing on EDT, no state of JFileChooser is modified except for calling rescanCurrentDirectory. Technically, I am not allowed to do so either. But if I call rescanCurrentDirectory from EDT only, the test becomes void.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will fire the property change events via firePropertyChange from the constructor and it is better to do that on EDT.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it does fire property change events on the current thread rather than EDT in this case. I still don't think it matters in this case. After the JFileChooser is created, it is accessed on the Scanner threads only, and only to call its rescanCurrentDirectory method. This call, in its turn, creates a new background thread which accesses the fields and methods of JFileChooser as well as FileSystemView.

Instantiating JFileChooser on EDT adds complexity for no benefit. The JFileChooser is still accessed concurrently from several threads, including Swing internal thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instantiating JFileChooser on EDT adds complexity for no benefit. The JFileChooser is still accessed concurrently from several threads, including Swing internal thread.

But still, this is a common requirement to create UI components on EDT, and that will replicate more or less how it will work in the application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instantiating JFileChooser on EDT adds complexity for no benefit. The JFileChooser is still accessed concurrently from several threads, including Swing internal thread.

But still, this is a common requirement to create UI components on EDT, and that will replicate more or less how it will work in the application.

And it is a common requirement not to call fileChooser.rescanCurrentDirectory concurrently in 5 different threads. So it does not apply.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I agree all Swing components are to be created and accessed on EDT.

However, there are cases where instantiating and testing is safe on main thread. It is when the component is not shown on the screen, its state does not depend on any events dispatched by the event queue. There are such tests in the JDK code base.

I think this test is one of such cases where JFileChooser can be (safely) created on the current thread. The test does not display the file chooser, its state does not depend on receiving input events and handling them.

Essentially, the test verifies that the implementation of JFileChooser.rescanCurrentDirectory is thread-safe. It wasn't. Two bugs were resolved to make the implementation thread-safe: JDK-8323670 and JDK-8325179.

Copy link
Member

@mrserb mrserb Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, there are cases where instantiating and testing is safe on main thread.

That is my point, make it less safe - when the component is created on one thread:EDT and then for some reason accessed on a different thread, the rescanCurrentDirectory should still work.

Copy link
Member Author

@aivanov-jdk aivanov-jdk Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, there are cases where instantiating and testing is safe on main thread.

That is my point, make it less safe - when the component is created on one thread:EDT and then for some reason accessed on a different thread, the rescanCurrentDirectory should still work.

I got your idea!

However, I still see no benefit for doing it.

Yes, when JFileChooser is instantiated, it creates its UI delegates and the default model, BasicDirectoryModel. The constructor of the model calls validateFileCache which starts the background thread and will update fileCache.

It does not make the initialisation less thread-safe: the same two threads are involved — FilesLoader and EDT. Yet there's no race yet…

The ConcurrentModificationException is thrown when the background FilesLoader is iterating over fileCache while EDT runs DoChangeContents.run which adds elements to fileCache or removes elements from it.

By the time the Scanner threads start, a race becomes possible. If the initial FilesLoader completes and posts the DoChangeContents object to EDT, the FilesLoader threads created by the scanners will race… but fileCache is empty initially, thus iterating over it is very quick.

Therefore getting ConcurrentModificationException is unlikely until fileCache contains a list of files. This state is reached later in the timeline of the test.

The initial FilesLoader thread that's started when JFileChooser is instantiated plays a role too… Yet it's running concurrently with the Scanner threads whether the JFileChooser object is created on the current thread or on EDT.


Taking the above into account, instantiating JFileChooser on EDT adds complexity to the test but brings no benefits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I agree with you.

@RealCLanger
Copy link
Contributor

@mrserb, @RealCLanger, could you run this test in your environments, please? It is for Linux and macOS only.

I'm looking for confirmation that the test fails without the fix for JDK-8323670. Since JDK-8323670 is fixed in mainline, you can run it in 21u or 17u which don't have the fix yet.

You'll need to run the test several times because it fails only occasionally. That's why I want to collect more data on whether the test is useful/stable enough.

I greatly appreciate your help.

I added it to our testing. Results early next week.

@aivanov-jdk
Copy link
Member Author

I added it to our testing. Results early next week.

@RealCLanger Thank you. I'm more interested in the failing case. How reliable does the test reproduce the problem?

@RealCLanger
Copy link
Contributor

I added it to our testing. Results early next week.

@RealCLanger Thank you. I'm more interested in the failing case. How reliable does the test reproduce the problem?

Yes, it's running in 21 and 22 as well, so we'll see what happens there.

@RealCLanger
Copy link
Contributor

I added it to our testing. Results early next week.

@RealCLanger Thank you. I'm more interested in the failing case. How reliable does the test reproduce the problem?

Yes, it's running in 21 and 22 as well, so we'll see what happens there.

We got the following, once in jdk21u and once in jdk21u-dev on macintelx64 through the weekend:

2024-03-16 05:55:37.407 java[3056:1287194577] XType: Using static font registry.
2024-03-16 05:55:38.047 java[3056:1287194577] name is : .SFNS-Regular
2024-03-16 05:55:38.047 java[3056:1287194577] family is : .AppleSystemUIFont
2024-03-16 05:55:38.047 java[3056:1287194577] name is : .SFNS-Bold
2024-03-16 05:55:38.047 java[3056:1287194577] family is : .AppleSystemUIFont
2024-03-16 05:55:38.616 java[3056:1287194577] nsFont-name is : .AppleSystemUIFont
2024-03-16 05:55:38.616 java[3056:1287194577] nsFont-family is : .AppleSystemUIFont
2024-03-16 05:55:38.616 java[3056:1287194577] nsFont-desc-name is : .SFNS-Regular
2024-03-16 05:55:38.616 java[3056:1287194577] nsFont-name is : .AppleSystemUIFont
2024-03-16 05:55:38.617 java[3056:1287194577] nsFont-family is : .AppleSystemUIFontBold
2024-03-16 05:55:38.617 java[3056:1287194577] nsFont-desc-name is : .SFNS-Bold
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-name is : .AppleSystemUIFont
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-family is : .AppleSystemUIFont
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-desc-name is : .SFNS-Regular
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-name is : .AppleSystemUIFont
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-family is : .AppleSystemUIFontBold
2024-03-16 05:55:38.629 java[3056:1287194577] nsFont-desc-name is : .SFNS-Bold
Exception in Basic L&F File Loading Thread: class java.util.ConcurrentModificationException
java.util.ConcurrentModificationException
at java.base/java.util.AbstractList$SubList.checkForComodification(AbstractList.java:906)
at java.base/java.util.AbstractList$SubList.size(AbstractList.java:792)
at java.base/java.util.AbstractCollection.toArray(AbstractCollection.java:143)
at java.base/java.util.Collections$SynchronizedCollection.toArray(Collections.java:2310)
at java.base/java.util.Vector.(Vector.java:182)
at java.desktop/javax.swing.plaf.basic.BasicDirectoryModel$FilesLoader$1.call(BasicDirectoryModel.java:365)
at java.desktop/javax.swing.plaf.basic.BasicDirectoryModel$FilesLoader$1.call(BasicDirectoryModel.java:342)
at java.desktop/sun.awt.shell.ShellFolderManager$DirectInvoker.invoke(ShellFolderManager.java:148)
at java.desktop/sun.awt.shell.ShellFolder.invoke(ShellFolder.java:532)
at java.desktop/sun.awt.shell.ShellFolder.invoke(ShellFolder.java:518)
at java.desktop/javax.swing.plaf.basic.BasicDirectoryModel$FilesLoader.run0(BasicDirectoryModel.java:342)
at java.desktop/javax.swing.plaf.basic.BasicDirectoryModel$FilesLoader.run(BasicDirectoryModel.java:295)
at java.base/java.lang.Thread.run(Thread.java:1583)

So it seems to work that the test can reproduce the issue.

@aivanov-jdk
Copy link
Member Author

Yes, it's running in 21 and 22 as well, so we'll see what happens there.

We got the following, once in jdk21u and once in jdk21u-dev on macintelx64 through the weekend:

Thank you, Christoph! @RealCLanger. At least one failure is detected. It's better than nothing, isn't it?

@TejeshR13
Copy link
Contributor

Yes, it's running in 21 and 22 as well, so we'll see what happens there.

We got the following, once in jdk21u and once in jdk21u-dev on macintelx64 through the weekend:

Thank you, Christoph! @RealCLanger. At least one failure is detected. It's better than nothing, isn't it?

Definitely helpful. For these fixes - JDK-8323670 and JDK-8307091, the only way to debug was with code walkthrough. This test will help a lot even its able to reproduce intermittently according to me, especially to build some confidence for the fix.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 31, 2024
Copy link
Contributor

@TejeshR13 TejeshR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@aivanov-jdk
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Apr 10, 2024

Going to push as commit 9731b1c.
Since your change was applied there have been 577 commits pushed to the master branch:

  • c5150c7: 8309751: Duplicate constant pool entries added during default method processing
  • 86cb767: 8326568: jdk/test/com/sun/net/httpserver/bugs/B6431193.java should use try-with-resource and try-finally
  • b49ba42: 8330002: Remove redundant public keyword in BarrierSet
  • dd6e453: 8329767: G1: Move G1BlockOffsetTable::set_for_starts_humongous to HeapRegion
  • e0fd6c4: 8329545: [s390x] Fix garbage value being passed in Argument Register
  • 51ed69a: 8327621: Check return value of uname in os::get_host_name
  • bea9acc: 8328482: Convert and Open source few manual applet test to main based
  • d037a59: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments
  • bab7019: 8329431: Improve speed of writing CDS heap objects
  • 47df145: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking
  • ... and 567 more: https://git.openjdk.org/jdk/compare/8f6edd8dc866bf970b7e7b8358f62832887e6e8b...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 10, 2024
@openjdk openjdk bot closed this Apr 10, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 10, 2024
@openjdk
Copy link

openjdk bot commented Apr 10, 2024

@aivanov-jdk Pushed as commit 9731b1c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@aivanov-jdk aivanov-jdk deleted the 8323670-BasicDirectoryModel-concurrency branch April 17, 2024 11:38
@aivanov-jdk aivanov-jdk restored the 8323670-BasicDirectoryModel-concurrency branch April 25, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client client-libs-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

5 participants