Revised type cache#1189
Conversation
This change overhauls the core type cache The new approach aims to achieve several things... 1 - cache is strictly bounded -- no variance for number of classes of ClassLoaders 2 - cache is significantly smaller 3 - cache doesn't compromise start-up time 4 - primary eviction policy isn't driven by time 5 - primary eviction policy isn't driven by GC There are some slight compromises here. In practice, start-up does increase slightly in a memory rich environment; however, start-up improves considerably in a memory poor environment. The basic approcach is to have a single unified Guava cache for all ClassLoaders -- nominally keyed a composite of ClassLoader & class name The ByteBuddy CacheProvider are simply thin wrappers around the Guava cache associated to a particular ClassLoader However rather than having a large number of WeakReferences floating around. The cache assigns an ID to each ClassLoader. To further avoid, consuming memory the cache only preserves a small map of Loader / ID assignments. This means a ClassLoader may have more than one active ID. This introduce the possibility for ID exhaustion. That unlikely case is handle by retiring the internal CacheInstance and starting anew.
| * be created in its place. | ||
| */ | ||
| private static final class CacheInstance { | ||
| static final int CONCURRENCY_LEVEL = 8; |
There was a problem hiding this comment.
I've currently sized the cache based on start-up performance tests with a Spring application.
Since we might something larger in some apps, I'll probably add a Spec object and make the size configurable.
tylerbenson
left a comment
There was a problem hiding this comment.
I'm not sure I follow all of the benefits with this change. I'm not opposed to it, but I have several questions below.
| return EvictingCacheProvider.withObjectType(cleaner, 1, TimeUnit.MINUTES); | ||
| static final long EXHAUSTED_ID = LIMIT_ID; | ||
|
|
||
| // Many things are package visible for testing purposes -- |
There was a problem hiding this comment.
With groovy you can still test private methods. I suggest marking things as private where it make sense.
There was a problem hiding this comment.
So groovy allows private methods but not private fields?
Anyway, I don't want to generate synthetic accessors either, so I'll probably leave most things visible.
| this.cacheId = cacheId; | ||
| this.name = name; | ||
|
|
||
| hashCode = (int) (31 * cacheId) ^ name.hashCode(); |
There was a problem hiding this comment.
Why use this algorithm? Maybe add a comment?
There was a problem hiding this comment.
No particular reason -- just a variation on FNV
| */ | ||
| final Cache<ClassLoader, Long> loaderIdCache = | ||
| CacheBuilder.newBuilder() | ||
| .weakKeys() |
There was a problem hiding this comment.
Pointing out that we are still using weak keys on the classloader, but I don't see any way to avoid that.
There was a problem hiding this comment.
The only way would be push the ID into the ClassLoader itself -- or inject a Meta info class into each ClassLoader.
| */ | ||
| final Cache<TypeCacheKey, TypePool.Resolution> sharedResolutionCache = | ||
| CacheBuilder.newBuilder() | ||
| .softValues() |
There was a problem hiding this comment.
Just pointing out that you're still using softValues.
There was a problem hiding this comment.
Yes, that was a deliberate choice. I think they are unnecessary at this point, but for now, I kept them just to be safe.
| * Single shared Type.Resolution cache -- uses a composite key of loader ID & class name The | ||
| * initial capacity is set to the maximum capacity to avoid expansion overhead. | ||
| */ | ||
| final Cache<TypeCacheKey, TypePool.Resolution> sharedResolutionCache = |
There was a problem hiding this comment.
One problem I see with this new shared cache is if a classloader gets GC'd, the associated resolution classes won't be automatically cleared out. It seems they would actually stay there until evicted by the cache's size constraints.
There was a problem hiding this comment.
That's true, but I don't think it is necessarily a problem.
If that ClassLoader goes cold, then those cache entries will also go cold.
If the cache needs to hold new entries, then those abandoned entries should be evicted.
I think a more useful eviction addition would be to dump the cache, once ClassLoader activity slows regardless of whether the ClassLoaders are still reachable or not.
Muzzle doesn't like creation of SecureClassLoader-s, so switching to a URLClassLoader for my placeholder loader in tests
First pass at replacing ID generation with WeakReference reuse In this first version, the Cache<ClassLoader, ID> was replaced with Cache<ClassLoader, WeakReference<ClassLoader>>. The core cache is still of Cache<TypeCacheKey, TypePool.Resolution> and TypeCacheKey logically remains a composite key of ClassLoader, class name. The removal of ID assignment means ID exhaustion is no longer na issue, so there's never a need to rebuild the cache. For that reason, CacheInstance has removed and the core caching logic has been moved into DDCachingPoolStrategy. While TypeCacheKey remains conceptually the same, the internals have changed somewhat. The TypeCacheKey now has 3 core fields... - loaderHash - loadeRef - class name Since loader refs are recycled, the fast path for key equivalence can use reference equivalence of the reference objects. This change ripples through the CacheProvider-s which also have to store loaderHash and loaderRef. It may be worth going a step further and switching to a Cache<Loader, TypePool> as well. That still avoid the creation of many WeakReference-s, since the underlying CacheProvider will hold a canonical WeakReference per ClassLoader.
MuzzlePlugin groovy checks that no threads are spawned because this holds the ClassLoader live. This was breaking with the caching change because the cache no longer uses the Cleaner service. This caused a problem because the Thread behind the cleaner is created lazily when the first task is created, but without the cache the creation was delayed. To solve this, I addressed the original cause of the leak. The newly created Thread automatically inherits the contextClassLoader of its parent, but that's unnecessary for a cleaner thread. So I changed the ThreadFactory for cleaner to explicitly null out the contextClassLoader. We should probably null out contextClassLoader in other thread factories and also reduce our use of contextClassLoaders in general, but that will left to another PR.
tylerbenson
left a comment
There was a problem hiding this comment.
Pretty minor changes requested... Approved pending these changes.
| * <p>We also use our bootstrap proxy when matching against the bootstrap loader. | ||
| * <ul> | ||
| * There two core parts to the cache... | ||
| * <li>a cache of ID assignments for ClassLoaders |
There was a problem hiding this comment.
This seems to be out of date with the latest changes.
There was a problem hiding this comment.
Good catch, I'll fix that.
| .concurrencyLevel(CONCURRENCY_LEVEL) | ||
| .initialCapacity(LOADER_CAPACITY / 2) | ||
| .maximumSize(LOADER_CAPACITY) | ||
| .build(); |
There was a problem hiding this comment.
use the build(CacheLoader) method instead and you won't need the null check below.
There was a problem hiding this comment.
Not using CacheLoader was a deliberate choice, I deliberately chose to keep the critical section as short as possible by doing the WeakReference construction outside.
Admittedly, this is debatable choice because it does risk creating extra WeakReferences.
There was a problem hiding this comment.
In that case, it might be good to explain your reasoning in a comment or risk someone else making this change later.
| final ClassLoader key = | ||
| BOOTSTRAP_CLASSLOADER == classLoader ? BOOTSTRAP_CLASSLOADER_PLACEHOLDER : classLoader; | ||
| final TypePool.CacheProvider cache = typePoolCache.computeIfAbsent(key, this); | ||
| private final TypePool createNonCachingTypePool(final ClassFileLocator classFileLocator) { |
There was a problem hiding this comment.
This method is no longer used.
There was a problem hiding this comment.
Yes, you are right. I'll get rid of it.
- Removed unused method from earlier version - Corrected previously overlooked comments that were remnant of prior version
This PR overhauls the core type cache
The new approach aims to achieve several things...
1 - cache is strictly bounded -- no variance for number of classes or ClassLoaders
2 - cache is significantly smaller
3 - cache doesn't compromise start-up time
4 - primary eviction policy isn't driven by time
5 - primary eviction policy isn't driven by GC
There are some slight compromises here.
In practice, start-up can increase slightly in a memory rich environment; however, start-up improves considerably in a memory poor environment.
The basic approach is to have a single unified Guava cache for all ClassLoaders -- conceptually keyed by a composite of ClassLoader & class name
The ByteBuddy CacheProviders are simply thin wrappers around the Guava cache associated to a particular ClassLoader.
However rather than having a large number of WeakReferences floating around, the WeakReferences are normalized. This reduces GC pressure and allows for fast equivalence checks.
In experiments with a Spring application with largish heap (>512MiB)
On smaller heaps (<256MiB), the memory reduction is more modest, but start-up time improves.