Rollback of ThreadLocal optimization change #43
Merged
This change rolls back 3def696, 7597ac7 and f6ab6e0 due to an
internal (Google) memory regression.
The cause is a memory leak from a non-static ThreadLocal whose value held references back to the owning object instance (Machine -> RE2 objects).
The main issue is that creating an unbounded number of new ThreadLocal instances leaks them, since they are never GCed.
The repro case is much easier:
What is happening here?
Each Pattern.compile call creates a new RE2 instance, and each RE2 instance creates its own ThreadLocal.
When we use a ThreadLocal, we access a per-thread map, ThreadLocalMap. This map contains entries with the ThreadLocal as the key and the thread-local value as the value.
The entry type of the map (ThreadLocalMap.Entry) extends WeakReference. And here is the problem: the entry itself is not weak. Only the key is weak.
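The weak-key, strong-value shape can be sketched with a small model. `EntrySketch` and `EntrySketchDemo` are hypothetical names used only for illustration (the real `ThreadLocalMap.Entry` is private to the JDK), and `clear()` stands in for the collector clearing the key:

```java
import java.lang.ref.WeakReference;

// Hypothetical stand-in for ThreadLocalMap.Entry; not the JDK class itself.
final class EntrySketch extends WeakReference<ThreadLocal<?>> {
    // Strong reference: reachable for as long as the entry itself is reachable.
    Object value;

    EntrySketch(ThreadLocal<?> key, Object value) {
        super(key); // only the key is weakly referenced
        this.value = value;
    }
}

final class EntrySketchDemo {
    // Builds an entry, then simulates the GC clearing its weak key.
    static EntrySketch entryWithClearedKey() {
        EntrySketch e = new EntrySketch(new ThreadLocal<Object>(), new byte[1024]);
        e.clear(); // stands in for the collector clearing the weak key
        return e;
    }

    public static void main(String[] args) {
        EntrySketch e = entryWithClearedKey();
        System.out.println("key cleared: " + (e.get() == null));
        System.out.println("value still held: " + (e.value != null));
    }
}
```

Even with the key gone, the value stays strongly reachable through the entry until something removes the entry from the map.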
So we are essentially creating thousands of entries in that map, using a different ThreadLocal as the key each time.
Why are they not GCed? The ThreadLocal is no longer referenced, right? It is referenced:
Each Entry in that map (which is not weak, remember; only the key is weak) contains a value that is... the Machine. The Machine points to the RE2... and the RE2 points to the ThreadLocal! And there we have the cycle.
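The cycle can be sketched as follows. `Re2Sketch`, `MachineSketch` and `CycleSketchDemo` are hypothetical stand-ins that only mirror the reference shape described above; they are not the actual RE2/J classes, and the loop is an assumed approximation of the repro pattern (one new instance per iteration, each used once):

```java
// Hypothetical stand-ins for the RE2 and Machine classes; not the real RE2/J code.
final class Re2Sketch {
    // Non-static: every Re2Sketch instance owns a distinct ThreadLocal key.
    final ThreadLocal<MachineSketch> machine =
            ThreadLocal.withInitial(() -> new MachineSketch(this));

    boolean match(String input) {
        // The first call on a thread adds an entry (key = this.machine) to that thread's map.
        return machine.get().run(input);
    }
}

final class MachineSketch {
    final Re2Sketch re2; // back-reference that closes the cycle

    MachineSketch(Re2Sketch re2) {
        this.re2 = re2;
    }

    boolean run(String input) {
        return !input.isEmpty(); // placeholder for real matching work
    }
}

final class CycleSketchDemo {
    public static void main(String[] args) {
        // One new ThreadLocal per iteration, each used once and then dropped by the caller.
        for (int i = 0; i < 10_000; i++) {
            new Re2Sketch().match("abc");
        }
    }
}
```

Each iteration leaves behind an entry whose strongly held value (the machine) reaches its own key through `re2.machine`, so the weak key is never cleared while the thread is alive.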
TL;DR: Never create an unbounded number of ThreadLocals. If a ThreadLocal is not static, make sure that the number of instances is controlled (IOW, not unbounded).
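As a minimal sketch of the bounded alternative, assuming the per-thread state does not need to reference the owning instance: a single static ThreadLocal gives one key per class, however many instances are created. `BoundedSketch` and its methods are hypothetical and are not what this rollback restores:

```java
// Hypothetical example of a bounded ThreadLocal; not code from this change.
final class BoundedSketch {
    // Static: one ThreadLocal key for the whole class, regardless of instance count.
    private static final ThreadLocal<StringBuilder> SCRATCH =
            ThreadLocal.withInitial(StringBuilder::new);

    // Exposes the calling thread's scratch buffer so sharing can be observed.
    static StringBuilder scratch() {
        return SCRATCH.get();
    }

    String render(String input) {
        StringBuilder sb = SCRATCH.get();
        sb.setLength(0); // reuse the per-thread buffer instead of allocating a new one
        return sb.append('[').append(input).append(']').toString();
    }
}
```

Every instance on a given thread reuses the same buffer, and the thread's map holds at most one entry for this class. The value also holds no reference back to any instance.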
But does this mean that ThreadLocals are never GCed from the ThreadLocalMap?
No
When we look up a ThreadLocal in the ThreadLocalMap (by calling threadLocal.get(), for example) and the simple hash-code lookup in the array does not find it, the map calls getEntryAfterMiss. This method calls expungeStaleEntry if Entry.get() returns null (the Entry is the weak reference to the key, the ThreadLocal). This code (and other paths that end up calling expungeStaleEntry) cleans up the stale entry once the ThreadLocal has been GCed.
But in this case it was never GCed, because the value holds an indirect reference to the ThreadLocal (which is a non-static member of RE2).
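The cleanup rule can be modeled in a few lines. This is a simplified, hypothetical model (`StaleEntryModel` uses a list, whereas the JDK's table is an open-addressed array inside ThreadLocalMap), and `clear()` is called by hand to simulate the collector clearing a key:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of stale-entry cleanup; not the JDK implementation.
final class StaleEntryModel {
    static final class Entry extends WeakReference<Object> {
        Object value; // strongly held until the entry is expunged

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    final List<Entry> table = new ArrayList<>();

    void put(Object key, Object value) {
        table.add(new Entry(key, value));
    }

    // Removes every entry whose weak key has been cleared; returns how many were removed.
    int expungeStale() {
        int removed = 0;
        for (Iterator<Entry> it = table.iterator(); it.hasNext(); ) {
            Entry e = it.next();
            if (e.get() == null) { // key was cleared, so the entry is stale
                e.value = null;    // release the value
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

An entry only qualifies for removal after its key has been cleared. An entry whose value keeps its own key strongly reachable never reaches that state, which is the situation described above.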