Skip to content

Revised type cache#1189

Merged
dougqh merged 11 commits into
masterfrom
dougqh/type-caching2
Jan 30, 2020
Merged

Revised type cache#1189
dougqh merged 11 commits into
masterfrom
dougqh/type-caching2

Conversation

@dougqh
Copy link
Copy Markdown
Contributor

@dougqh dougqh commented Jan 23, 2020

This PR overhauls the core type cache

The new approach aims to achieve several things...
1 - cache is strictly bounded -- no variance for number of classes or ClassLoaders
2 - cache is significantly smaller
3 - cache doesn't compromise start-up time
4 - primary eviction policy isn't driven by time
5 - primary eviction policy isn't driven by GC

There are some slight compromises here.
In practice, start-up can increase slightly in a memory rich environment; however, start-up improves considerably in a memory poor environment.

The basic approach is to have a single unified Guava cache for all ClassLoaders -- conceptually keyed by a composite of ClassLoader & class name

The ByteBuddy CacheProviders are simply thin wrappers around the Guava cache associated to a particular ClassLoader.

However rather than having a large number of WeakReferences floating around, the WeakReferences are normalized. This reduces GC pressure and allows for fast equivalence checks.

In experiments with a Spring application with largish heap (>512MiB)

  • a 40+MiB reduction in heap usage - w/ a small < 0.25s increase in start-up

On smaller heaps (<256MiB), the memory reduction is more modest, but start-up time improves.

  • 10MiB reduction in heap usage and starts 1-2sec faster

This change overhauls the core type cache

The new approach aims to achieve several things...
1 - cache is strictly bounded -- no variance for number of classes of ClassLoaders
2 - cache is significantly smaller
3 - cache doesn't compromise start-up time
4 - primary eviction policy isn't driven by time
5 - primary eviction policy isn't driven by GC

There are some slight compromises here.
In practice, start-up does increase slightly in a memory rich environment; however, start-up improves considerably in a memory poor environment.

The basic approcach is to have a single unified Guava cache for all ClassLoaders -- nominally keyed a composite of ClassLoader & class name

The ByteBuddy CacheProvider are simply thin wrappers around the Guava cache associated to a particular ClassLoader

However rather than having a large number of WeakReferences floating around.  The cache assigns an ID to each ClassLoader.

To further avoid, consuming memory the cache only preserves a small map of Loader / ID assignments.  This means a ClassLoader may have more than one active ID.

This introduce the possibility for ID exhaustion.  That unlikely case is handle by retiring the internal CacheInstance and starting anew.
@dougqh dougqh requested a review from a team as a code owner January 23, 2020 15:59
@dougqh dougqh changed the title Dougqh/type caching2 Revised type cache Jan 23, 2020
* be created in its place.
*/
private static final class CacheInstance {
static final int CONCURRENCY_LEVEL = 8;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've currently sized the cache based on start-up performance tests with a Spring application.
Since we might something larger in some apps, I'll probably add a Spec object and make the size configurable.

Copy link
Copy Markdown
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow all of the benefits with this change. I'm not opposed to it, but I have several questions below.

return EvictingCacheProvider.withObjectType(cleaner, 1, TimeUnit.MINUTES);
static final long EXHAUSTED_ID = LIMIT_ID;

// Many things are package visible for testing purposes --
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With groovy you can still test private methods. I suggest marking things as private where it make sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So groovy allows private methods but not private fields?
Anyway, I don't want to generate synthetic accessors either, so I'll probably leave most things visible.

this.cacheId = cacheId;
this.name = name;

hashCode = (int) (31 * cacheId) ^ name.hashCode();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use this algorithm? Maybe add a comment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason -- just a variation on FNV

*/
final Cache<ClassLoader, Long> loaderIdCache =
CacheBuilder.newBuilder()
.weakKeys()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointing out that we are still using weak keys on the classloader, but I don't see any way to avoid that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way would be push the ID into the ClassLoader itself -- or inject a Meta info class into each ClassLoader.

*/
final Cache<TypeCacheKey, TypePool.Resolution> sharedResolutionCache =
CacheBuilder.newBuilder()
.softValues()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pointing out that you're still using softValues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was a deliberate choice. I think they are unnecessary at this point, but for now, I kept them just to be safe.

* Single shared Type.Resolution cache -- uses a composite key of loader ID & class name The
* initial capacity is set to the maximum capacity to avoid expansion overhead.
*/
final Cache<TypeCacheKey, TypePool.Resolution> sharedResolutionCache =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem I see with this new shared cache is if a classloader gets GC'd, the associated resolution classes won't be automatically cleared out. It seems they would actually stay there until evicted by the cache's size constraints.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but I don't think it is necessarily a problem.
If that ClassLoader goes cold, then those cache entries will also go cold.
If the cache needs to hold new entries, then those abandoned entries should be evicted.

I think a more useful eviction addition would be to dump the cache, once ClassLoader activity slows regardless of whether the ClassLoaders are still reachable or not.

Muzzle doesn't like creation of SecureClassLoader-s, so switching to a URLClassLoader for my placeholder loader in tests
First pass at replacing ID generation with WeakReference reuse

In this first version, the Cache<ClassLoader, ID> was replaced with Cache<ClassLoader, WeakReference<ClassLoader>>.

The core cache is still of Cache<TypeCacheKey, TypePool.Resolution> and TypeCacheKey logically remains a composite key of ClassLoader, class name.

The removal of ID assignment means ID exhaustion is no longer na issue, so there's never a need to rebuild the cache.  For that reason, CacheInstance has removed and the core caching logic has been moved into DDCachingPoolStrategy.

While TypeCacheKey remains conceptually the same, the internals have changed somewhat.  The TypeCacheKey now has 3 core fields...
- loaderHash
- loadeRef
- class name

Since loader refs are recycled, the fast path for key equivalence can use reference equivalence of the reference objects.

This change ripples through the CacheProvider-s which also have to store loaderHash and loaderRef.

It may be worth going a step further and switching to a Cache<Loader, TypePool> as well.  That still avoid the creation of many WeakReference-s, since the underlying CacheProvider will hold a canonical WeakReference per ClassLoader.
MuzzlePlugin groovy checks that no threads are spawned because this holds the ClassLoader live.

This was breaking with the caching change because the cache no longer uses the Cleaner service.

This caused a problem because the Thread behind the cleaner is created lazily when the first task is created, but without the cache the creation was delayed.

To solve this, I addressed the original cause of the leak.  The newly created Thread automatically inherits the contextClassLoader of its parent, but that's unnecessary for a cleaner thread.

So I changed the ThreadFactory for cleaner to explicitly null out the contextClassLoader.

We should probably null out contextClassLoader in other thread factories and also reduce our use of contextClassLoaders in general, but that will left to another PR.
Copy link
Copy Markdown
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty minor changes requested... Approved pending these changes.

* <p>We also use our bootstrap proxy when matching against the bootstrap loader.
* <ul>
* There two core parts to the cache...
* <li>a cache of ID assignments for ClassLoaders
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be out of date with the latest changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll fix that.

.concurrencyLevel(CONCURRENCY_LEVEL)
.initialCapacity(LOADER_CAPACITY / 2)
.maximumSize(LOADER_CAPACITY)
.build();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the build(CacheLoader) method instead and you won't need the null check below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using CacheLoader was a deliberate choice, I deliberately chose to keep the critical section as short as possible by doing the WeakReference construction outside.

Admittedly, this is debatable choice because it does risk creating extra WeakReferences.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it might be good to explain your reasoning in a comment or risk someone else making this change later.

final ClassLoader key =
BOOTSTRAP_CLASSLOADER == classLoader ? BOOTSTRAP_CLASSLOADER_PLACEHOLDER : classLoader;
final TypePool.CacheProvider cache = typePoolCache.computeIfAbsent(key, this);
private final TypePool createNonCachingTypePool(final ClassFileLocator classFileLocator) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is no longer used.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I'll get rid of it.

- Removed unused method from earlier version
- Corrected previously overlooked comments that were remnant of prior version
@dougqh dougqh merged commit 53d32b4 into master Jan 30, 2020
@devinsba devinsba added this to the 0.42.0 milestone Jan 31, 2020
@tylerbenson tylerbenson deleted the dougqh/type-caching2 branch January 31, 2020 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants