Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaner optimizaton #1536

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Conversation

pmconrad
Copy link

@pmconrad pmconrad commented Aug 3, 2023

Fixes #1535 (also fixes #1521 )

This PR fixes the cleaner overload issue by creating a Cleaner instance per thread so that each thread will clean up the objects it previously created.
An additional "Master Cleaner" cleans up after thread exits. The Master Cleaner terminates itself 1 minute after all registered objects have been cleaned up.

Copy link
Member

@matthiasblaesing matthiasblaesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this indeed looks like a good improvement. Looking at the logic it seems to be an improvement and good stabilization.

This might need another look:

while ((now = System.currentTimeMillis()) < lastNonEmpty + MAX_LINGER_MS || !deleteIfEmpty()) {
if (!cleanerImpls.isEmpty()) { lastNonEmpty = now; }
try {
Reference<?> ref = impl.referenceQueue.remove(CLEANUP_INTERVAL_MS);
if(ref instanceof CleanerRef) {
((CleanerRef) ref).clean();
} else {
masterCleanup();
}
} catch (InterruptedException ex) {
// Can be raised on shutdown. If anyone else messes with
// our reference queue, well, there is no way to separate
// the two cases.
// https://groups.google.com/g/jna-users/c/j0fw96PlOpM/m/vbwNIb2pBQAJ
break;
} catch (Exception ex) {
Logger.getLogger(Cleaner.class.getName()).log(Level.SEVERE, null, ex);
}
}

Consider this situation:

If new threads allocating native memory are periodically created and end with an interval that is smaller than CLEANUP_INTERVAL_MS, every iteration of the while loop (158) will yield a CleanerRef in 161 and masterCleanup() will never be called. If now multiple threads are created (and are kept alive), that allocate native memory, the last allocation of these threads might not be collected, even if it should be gced, because for these cases cleanup is only done from cleanQueue, which is only called from masterCleanup then.

The Callback Cleanup Test failes on windows (also locally reproducible, I tested with JDK8 both 32bit and 64bit):

Appveyor (64bit):
https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/2x995gulpks9il6y?fullLog=true#L2061
https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/2x995gulpks9il6y?fullLog=true#L2092

Appveyor (32bit):
https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/9423p77olylb56ua?fullLog=true#L2063
https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/9423p77olylb56ua?fullLog=true#L2094

On windows 64bit I also got a test failure on a manual run:
Bildschirmfoto vom 2023-08-05 18-44-56

src/com/sun/jna/internal/Cleaner.java Outdated Show resolved Hide resolved
@pmconrad
Copy link
Author

If new threads allocating native memory are periodically created and end with an interval that is smaller than CLEANUP_INTERVAL_MS, every iteration of the while loop (158) will yield a CleanerRef in 161 and masterCleanup() will never be called. If now multiple threads are created (and are kept alive), that allocate native memory, the last allocation of these threads might not be collected, even if it should be gced, because for these cases cleanup is only done from cleanQueue, which is only called from masterCleanup then.

Good catch, thanks. Should be fixed now.

The Callback Cleanup Test failes on windows (also locally reproducible, I tested with JDK8 both 32bit and 64bit):
Appveyor (64bit): https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/2x995gulpks9il6y?fullLog=true#L2061 https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/2x995gulpks9il6y?fullLog=true#L2092
Appveyor (32bit): https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/9423p77olylb56ua?fullLog=true#L2063 https://ci.appveyor.com/project/matthiasblaesing/jna/build/job/9423p77olylb56ua?fullLog=true#L2094

I've found and fixed a possible cause. Will see if it works.

On windows 64bit I also got a test failure on a manual run:

That's what I would expect if you ran the first commit only (test case to demonstrate #1535 ). Can you reproduce that with the latest commit?

@matthiasblaesing
Copy link
Member

Windows is still a problem. The Linux side seems to cleanup nicely (github actions run clean).

These are only the amd64 ones:

On my virtual machine I get:

image
image

The first set of problems might be a to short test time, the latter a too small heap.

@pmconrad
Copy link
Author

Sorry for the delay.
Rebased on latest master and refactored some tests to get rid of the windows-specific error. Works for me with 64bit Oracle JDK 1.8.0.

@matthiasblaesing
Copy link
Member

The situation is much improved. What I'm seeing though is, that the cleaner thread does not shut down. From my POV it can't. The static com.sun.jna.internal.Cleaner.MY_INSTANCE holds a strong reference on a one cleaner per Thread (when at least one cleaner relevant action is executed in that thread). That strong reference in turn will prevent GC/shutdown of the Masterthread.

Could you please recheck this?

@pmconrad
Copy link
Author

What I'm seeing though is, that the cleaner thread does not shut down. From my POV it can't. The static com.sun.jna.internal.Cleaner.MY_INSTANCE holds a strong reference on a one cleaner per Thread (when at least one cleaner relevant action is executed in that thread). That strong reference in turn will prevent GC/shutdown of the Masterthread.

Yes and no. The cleaner thread does not shut down as long as there are Cleaner instances alive. This is on purpose - the cleaner thread is required as long as the Cleaner instances could still be used for registering new cleanables.

The idea is that in the "interesting" cases, the Cleaner instances will eventually be GC'd because either

  • the threads holding Cleaner instances terminate, or
  • the Cleaner class is unloaded (which means the static ThreadLocal will eventually be GC'd).

If neither happens, the cleaner thread will (and needs to) stay alive.

I admit there is one edge case that is not covered by this, i.e. if an application uses JNA only during startup and then never again. In that case, the cleaner thread will stay alive and consume some (few) resources. If that is a problem, it can be circumvented by executing the initial JNA calls in a separate thread that shuts down after JNA is no longer needed.

Now that I've explained this, I'm not sure that the class unloading actually works. :-/ Will test.

@matthiasblaesing
Copy link
Member

#1555 was created as reaction to #1521. It is not easy to get JNA unloaded. The pattern to use a static INSTANCE variable on the interface makes it hard to unload the library, but it is possible.

I disagree with the assessment that it is uncommon to call JNA methods from long running threads. For example can it be necessary to invoke functions from the EDT or using a thread pool. Firing up a separate thread to call into native is the opposite of a performance improvement and also might not be feasible at all.

@pmconrad
Copy link
Author

I've simplified the Cleaner instance handling. It no longer relies on Thread instances being GC'd, which wasn't working as expected, nor on Java's ThreadLocal implementation, which wasn't either.
Instead I'm using a simple Map Thread->Cleaner, and let the MasterCleaner thread take over instances that are either no longer in active use, or that belong to dead threads. That should properly handle long-running threads. It should also allow class unloading once all to-be-cleaned-up references have been GC'd, but I'm unsure how to best test this.

Copy link
Member

@matthiasblaesing matthiasblaesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update! For a smoke test I use a setup based on this:

#1555 (comment)

Maven project: Test.zip

I used NetBeans for testing, opened the project and "Ran" the "Test" class with Debugging. I observed output and running threads. After the hostname is printed and the "JNA Cleaner" thread vanished, I put a breakpoint onto the first if(exit) { line (line 36). Using the debugger I set the variable exit to true. I allow the programm to continue and observe, that a new "JNA Cleaner" thread is spun up, when the second run method (40) is invoked. So shutting down and starting up again works.

Comments are left inline.

I ran ant checkstyle over this and checkstyle is not happy:

matthias@enterprise:~/src/jnalib$ ant checkstyle
Buildfile: /home/matthias/src/jnalib/build.xml

checkstyle:
[checkstyle] Running Checkstyle 8.17 on 579 files
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:138: Das Unterelement von 'block' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:139: Das Unterelement von 'block' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:140: Das Unterelement von 'block' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:141: 'while' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:142: 'if' hat eine unerwartete Einrückungstiefe von 24 (erwartet: 20). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:143: 'try' hat eine unerwartete Einrückungstiefe von 24 (erwartet: 20). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:144: Das Unterelement von 'try' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:145: 'if' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:146: Das Unterelement von 'if' hat eine unerwartete Einrückungstiefe von 32 (erwartet: 28). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:147: 'if rcurly' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:149: 'if' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:150: Das Unterelement von 'if' hat eine unerwartete Einrückungstiefe von 32 (erwartet: 28). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:151: Das Unterelement von 'if' hat eine unerwartete Einrückungstiefe von 32 (erwartet: 28). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:152: 'if rcurly' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:153: 'try rcurly' hat eine unerwartete Einrückungstiefe von 24 (erwartet: 20). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:158: Das Unterelement von 'catch' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:159: 'catch rcurly' hat eine unerwartete Einrückungstiefe von 24 (erwartet: 20). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:160: Das Unterelement von 'catch' hat eine unerwartete Einrückungstiefe von 28 (erwartet: 24). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:161: 'catch rcurly' hat eine unerwartete Einrückungstiefe von 24 (erwartet: 20). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:162: 'while rcurly' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:163: Das Unterelement von 'block' hat eine unerwartete Einrückungstiefe von 20 (erwartet: 16). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/src/com/sun/jna/internal/Cleaner.java:164: 'block rcurly' hat eine unerwartete Einrückungstiefe von 16 (erwartet: 12). [Indentation]
[checkstyle] [ERROR] /home/matthias/src/jnalib/test/com/sun/jna/MasterCleanerTest.java:1: Zeile entspricht nicht der erwarteten Kopfzeile '^/\*( Copyright.*$|$)'. [header-check]
[checkstyle] [ERROR] /home/matthias/src/jnalib/test/com/sun/jna/internal/MasterAccessor.java:1: Die erforderliche Kopfzeile fehlt - Datei hat nicht genug Zeilen. [header-check]

BUILD FAILED
/home/matthias/src/jnalib/build.xml:1581: Got 24 errors and 0 warnings.

Total time: 4 seconds
matthias@enterprise:~/src/jnalib$

Comment on lines +79 to +94
private void cleanQueue() {
if (lock.compareAndSet(false, true)) {
try {
Reference<?> ref;
while ((ref = referenceQueue.poll()) != null) {
try {
if (ref instanceof Cleanable) {
((Cleanable) ref).clean();
}
} catch (RuntimeException ex) {
Logger.getLogger(Cleaner.class.getName()).log(Level.SEVERE, null, ex);
}
}
} finally {
lock.set(false);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this locking necessary? I would expect the referenceQueue#poll to ensure a reference is only delivered to one poller.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API docs of ReferenceQueue do not mention that poll() is thread-safe, so I assume it isn't.

Comment on lines +174 to +190
Cleaner cleaner = entry.getValue();
long currentCount = cleaner.counter.get();
if (currentCount == cleaner.lastCount // no new cleanables registered since last master cleanup interval -> assume it is no longer in use
|| !entry.getKey().isAlive()) { // owning thread died -> assume it is no longer in use
it.remove();
CleanerImpl impl = cleaner.impl;
LOG.log(Level.FINE, () -> "MasterCleaner stealing cleaner " + impl + " from thread " + entry.getKey());
referencedCleaners.add(impl);
watchedCleaners.add(impl);
register(cleaner, () -> {
referencedCleaners.remove(impl);
LOG.log(Level.FINE, "Cleaner {0} no longer referenced", impl);
});
cleaners.remove(cleaner);
} else {
cleaner.lastCount = currentCount;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks strange and I think I see a problem. Consider a thread that does native work, then sleeps for a time longer time than MASTER_CLEANUP_INTERVAL_MS, and does this in a loop. While doing this a native reference is kept in the Thread. Then in each interation a new CleanerImpl will be constructed.

In any case I find it hard to see the relationship between cleaners, referencedCleaners and watchedCleaners, a short description might help here.

This might be simplified:

I would start with the currentCount == cleaner.lastCount case. Use this as an indicator, that the master thread should poll the reference queue, i.e. run Cleaner#cleanQueue().

Then I would adopt all cleaners, that have no cleanables registered anymore or associated to dead threads. The former one needs adoption to handle the possible race where the check for "no cleanables" happens and disassociation from the thread happens. The latter needs to latter to not keep dead threads around because of this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a thread that does native work, then sleeps for a time longer time than MASTER_CLEANUP_INTERVAL_MS, and does this in a loop. While doing this a native reference is kept in the Thread. Then in each interation a new CleanerImpl will be constructed.
...
The former one needs adoption to handle the possible race where the check for "no cleanables" happens

Yes.

I did consider what you suggest, and decided against it because I couldn't see a way to avoid the race without introducing some kind of synchronization in the Cleaner.. The original Cleaner implementation seemed to deliberately avoid synchronization, so I wanted to avoid it too. The price to be paid for this is the possibility of having more Cleaner instances than strictly required, but I think the overhead is small except in rather degenerate cases.

@@ -0,0 +1,92 @@
package com.sun.jna;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing license header.

@matthiasblaesing
Copy link
Member

I just had another look at this and I think I see another problem related to the adopted cleaners. If I'm not mistaken, the situation could happen: the last active cleaner is determined to be stale and moved to watched/referenced cleaners. In the next cycle the MasterCleaner can shut down because MasterCleaner#cleaners is now empty. But the watched/referenced cleaners still hold the instances and these are not processes anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cleaner thread too slow Cleaner (thread) related memory leak
2 participants