We noticed that the JVM heap was growing for a Vert.x-based service, and upon taking a heap dump we saw that the TagsCache for four of our event loop contexts had grown to over a million entries.
What follows is my speculation on what happened. I have not been able to reproduce this bug in a test case, but I am working on doing so.
The tag cache size is limited by an implementation of removeEldestEntry in a LinkedHashMap that signals an entry should be removed once the map exceeds 512 entries. We have seen these maps grow to more than 1M entries.
Concurrent modification of a LinkedHashMap (which is not supported) can cause the removeEldestEntry check to fail, since the implementation is not thread-safe; this provides a way for the map to grow beyond what the removeEldestEntry check would allow.
The TagsCache implementation tries to avoid concurrent access to the cache by doing the following tests:
if (context == null || context.isWorkerContext() || !context.inThread()) {
// Don't use the cache
}
For an event loop context in a Netty NioEventLoop, the context is non-null, isWorkerContext() returns false, and context.inThread() returns nettyEventLoop().inEventLoop().
I'm not sure this last check (return nettyEventLoop().inEventLoop();) guarantees single-threadedness. For Netty event loops served by multiple threads, it only checks whether the calling thread is one of them, not whether it is the event loop thread(?).
The result is concurrent access to the cache, which causes it to grow without bounds.
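For illustration only (this is my sketch, not Netty's or Vert.x's actual code; the class InThreadCheckDemo and its owner field are hypothetical names), a thread-confinement check that does guarantee single-threaded access can capture the one owning thread and compare thread identities:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class InThreadCheckDemo {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    // Strict confinement check: true only on the single owning thread.
    boolean inThread() {
        return Thread.currentThread() == owner.get();
    }

    // Returns {result on the loop thread, result on the caller's thread}.
    static boolean[] check() throws Exception {
        InThreadCheckDemo demo = new InThreadCheckDemo();
        // Single-threaded executor standing in for an event loop; the
        // thread factory records the one thread that will run tasks.
        ExecutorService loop = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "event-loop");
            demo.owner.set(t);
            return t;
        });
        try {
            boolean onLoop = loop.submit(demo::inThread).get();
            boolean offLoop = demo.inThread();
            return new boolean[] { onLoop, offLoop };
        } finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = check();
        System.out.println(r[0] + " " + r[1]); // prints "true false"
    }
}
```

If inThread() can return true on more than one thread at once, the cache is no longer confined and the unsynchronized LinkedHashMap can be corrupted.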
Do you have a reproducer?
Working on it, will update issue when I have one.
Steps to reproduce
(Suspected)
Call TagsCache.getOrCreate concurrently from a context where the context implementation's inThread() returns true for more than one thread.
Extra
Ubuntu 23.04, JDK17
Proof of exceeding LinkedHashMap size:
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUTest {

    public static void main(String[] args) throws Throwable {
        final var map = lruMap();
        var threads = new ArrayList<Thread>();
        for (int t = 0; t < 16; ++t) {
            threads.add(new Thread() {
                public void run() {
                    for (int i = 0; i < 1024000; ++i) {
                        map.put(i, i);
                    }
                }
            });
        }
        for (var t : threads) {
            t.start();
        }
        for (var t : threads) {
            t.join();
        }
        System.out.println(map.size());
    }

    private static LinkedHashMap<Integer, Integer> lruMap() {
        return new LinkedHashMap<Integer, Integer>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > 512;
            }
        };
    }
}
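For contrast (my addition, not part of the original report): the same bounded map stays at exactly 512 entries when only a single thread writes to it, which supports the theory that the unbounded growth comes from concurrent modification rather than from removeEldestEntry itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Single-threaded counterpart to the reproducer above: with one writer,
// the removeEldestEntry bound holds and the map never exceeds 512 entries.
public class LRUSingleThreadTest {

    static int fill() {
        LinkedHashMap<Integer, Integer> map = new LinkedHashMap<>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > 512;
            }
        };
        // Same number of puts as one reproducer thread performs.
        for (int i = 0; i < 1024000; ++i) {
            map.put(i, i);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fill()); // prints 512
    }
}
```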
Version
4.4.1, 4.4.4