[#10093] fix(core): Introduce ClassLoaderPool to share ClassLoaders across same-type catalogs#10480
LuciferYang wants to merge 3 commits into apache:main
Conversation
> We had tried it in this way, see #2644, but aborted it as a potential risk exists in classloader sharing, including
>
> These two factors mentioned above will affect the correctness of the Gravitino catalogs, so we hesitate to move on.
@yuqi1129 Thank you for the context on #2644. The three concerns raised by @jerryshao are valid for that PR's approach — here's how this PR differs and addresses each one.

Background: how this PR differs from #2644

PR #2644 proposed reusing ClassLoaders during alter operations on a single catalog, passing the old ClassLoader to the new. This PR takes a fundamentally different approach: a reference-counted pool that shares ClassLoaders across different catalog instances based on a composite key. The alter path is unaffected — alter still invalidates the old cache entry (releasing the pool reference) and creates a new wrapper (acquiring from the pool). The pool handles whether to reuse or create a new ClassLoader based on whether the key changed.

Addressing the three specific concerns

1. "The classloader reuse for alteration makes the code a little broken"

This PR does not change the alter path's control flow. When a catalog is altered:
2. "If we modify some properties that require rebooting and refreshing the classloader, with this we cannot support it"

This concern applies when alter changes properties that affect ClassLoader construction. In this PR, the properties that affect ClassLoader construction are part of the key, so altering one of them yields a new key and therefore a fresh ClassLoader. If the alter changes a property that is NOT part of the key (e.g., metastore URI, JDBC URL), the ClassLoader is reused — which is correct, because these properties do not affect ClassLoader construction (they are consumed by the catalog code at runtime).

3. "Sharing classloader between catalogs is dangerous, because some static variables that were created based on logic A will be used by another catalog that are based on logic B"

I audited all catalog implementations for static mutable state:
The key insight is that ClassLoader construction in Gravitino depends only on JAR paths (provider + package + auth plugin), not on catalog runtime properties. Two catalogs with the same key therefore load exactly the same set of JARs.

Future extensibility
If new ClassLoader-scoped static state is discovered in the future, the isolation criteria can be extended by introducing a server configuration such as:

gravitino.catalog.classloader.isolation.properties = authentication.type,authentication.kerberos.principal,authentication.kerberos.keytab-uri
Since I don't have a thorough understanding of the project yet, please correct me if I've misunderstood anything.
yuqi1129 left a comment:
Thanks for this PR — the OOM/Metaspace problem is real and the benchmark numbers are impressive. I spent some time studying the design carefully and have a few concerns about correctness under certain real-world configurations. I'd like to discuss these before merging.
Background: What ClassLoader sharing gives up
IsolatedClassLoader was designed to give every catalog its own isolated class space so that third-party library static state (Hadoop UGI, FileSystem cache, JDBC DriverManager, HiveConf, etc.) cannot bleed between catalogs. ClassLoaderPool partially relaxes that isolation. The question is whether ClassLoaderKey captures all the dimensions along which static state can diverge between two catalogs of the same type.
Concern 1: ClassLoaderKey is missing critical backend-URI dimensions
The current key is:
provider + packageProperty + authorizationPkgPath + kerberosPrincipal + kerberosKeytab
This is correct for isolating Kerberos identity, but it doesn't account for catalogs that point to different backends of the same type. Consider:
Scenario — two Iceberg catalogs with different HMS URIs:
catalog-A: provider=lakehouse-iceberg, metastore.uris=thrift://hms-A:9083
catalog-B: provider=lakehouse-iceberg, metastore.uris=thrift://hms-B:9083
Both produce the same ClassLoaderKey and share one IsolatedClassLoader. Inside that ClassLoader, HiveConf has a static configuration space. HiveConf.ConfVars and the valuesPerLabel cache are static. If catalog-B's initialization overwrites catalog-A's HMS URI in HiveConf, catalog-A starts talking to the wrong metastore.
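The overwrite hazard described above can be sketched with a plain static field. This is a hypothetical stand-in (SharedBackendState is not Gravitino or Hive code), only illustrating how a static configuration space behaves once two catalogs share one ClassLoader:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a library's static configuration space
// (the role HiveConf's static caches play); not actual Hive code.
class SharedBackendState {
    static final Map<String, String> CONF = new HashMap<>();
}

public class StaticContamination {
    public static void main(String[] args) {
        // catalog-A initializes inside the shared ClassLoader
        SharedBackendState.CONF.put("metastore.uris", "thrift://hms-A:9083");
        // catalog-B initializes later in the SAME ClassLoader and overwrites it
        SharedBackendState.CONF.put("metastore.uris", "thrift://hms-B:9083");
        // catalog-A now reads catalog-B's URI
        System.out.println(SharedBackendState.CONF.get("metastore.uris")); // thrift://hms-B:9083
    }
}
```

With per-catalog ClassLoaders, each catalog gets its own copy of the class and therefore its own static map; sharing the ClassLoader collapses the two maps into one.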
The same problem exists for:

- fs.defaultFS (Hadoop's FileSystem.CACHE is a per-ClassLoader static keyed on URI + conf + UserGroupInformation; two catalogs pointing at different HDFS clusters may cross-contaminate the FileSystem cache)
- JDBC URL (less likely to corrupt static state, but the AbandonedConnectionCleanupThread and driver registry are global per-ClassLoader)
Suggested fix: extend ClassLoaderKey to include the backend URI(s) that anchor static state. For Iceberg/Paimon Hive-backend: metastore.uris. For JDBC catalogs: jdbc-url. For fileset catalogs: fs.defaultFS.
Concern 2: FileSystem.closeAll() during doFinalCleanup can disconnect live catalogs
closeStatsDataClearerInFileSystem calls FileSystem.closeAll() — a static method that closes every cached FileSystem in that ClassLoader's cache. doFinalCleanup runs only when refCount reaches 0 (i.e., the last catalog sharing this ClassLoader is closed), so under the current logic it won't fire while other catalogs are live.
However, if Concern 1 is fixed and two catalogs with different HDFS URIs are given separate keys, this is safe. But as long as ClassLoaderKey is under-specified (same key for different backends), there is a window where catalog-A's cleanup kills catalog-B's live HDFS connections.
Concern 3: ThreadLocal cross-contamination during shared lifetime
With per-catalog ClassLoaders, ThreadLocal values from different catalogs are of different Class objects (loaded by different ClassLoaders), so they are naturally isolated. With a shared ClassLoader, Catalog A and Catalog B load the same Class, meaning a ThreadLocal set by catalog-A's code on a Jetty thread is visible to catalog-B's code running on the same thread.
Concretely: if Iceberg or Hive sets a per-request ThreadLocal (e.g., Hadoop SecurityContext, Hive SessionState, or Iceberg's ResolvingFileIO context), catalog-B could pick up a stale value left by catalog-A. This is especially risky for Hive SessionState, which is a ThreadLocal singleton used throughout HMS interaction.
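The mechanism behind this concern can be shown with a minimal sketch. SessionHolder is a hypothetical stand-in for a library ThreadLocal (the pattern Hive's SessionState uses), not real Hive code:

```java
// Hypothetical stand-in for a library-held ThreadLocal; with a shared
// ClassLoader, both catalogs see the SAME ThreadLocal instance.
class SessionHolder {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
}

public class ThreadLocalLeak {
    public static void main(String[] args) {
        // catalog-A's request handler sets the ThreadLocal on a pooled
        // thread and forgets to clear it
        SessionHolder.CURRENT.set("catalog-A-session");

        // catalog-B's handler later runs on the same thread; because both
        // catalogs share one ClassLoader, the stale value leaks through
        System.out.println(SessionHolder.CURRENT.get()); // catalog-A-session
    }
}
```

With separate ClassLoaders, catalog-B would load its own SessionHolder class, whose ThreadLocal starts out unset on every thread.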
Comparison with industry practice
The closest industry analogue is Trino's plugin ClassLoader: a single ClassLoader is shared across all connector instances of the same plugin. Trino makes this safe because its connectors are stateless — all per-request state lives in ConnectorSession, not in static fields. Gravitino's catalogs are stateful (they hold HMS connections, HDFS FileSystems, UGI login state), which is the key difference that makes sharing riskier here.
Summary of concerns
| Issue | Severity | Affected catalogs |
|---|---|---|
| ClassLoaderKey missing metastore.uris / fs.defaultFS | High | Iceberg-Hive, Hive, Paimon-Hive, fileset |
| FileSystem.closeAll() can disconnect live catalog (if keys are under-specified) | High (conditional on above) | Any catalog using HDFS |
| ThreadLocal cross-contamination between shared-ClassLoader catalogs | Medium | Iceberg, Hive (SessionState) |
| AWS SDK v2 static state not cleaned up (only v1 handled) | Low | Iceberg/Paimon with S3 backend |
Suggestion
The OOM fix is valuable and I'd like to see it merged. One path forward:
- Extend ClassLoaderKey to include the backend URI dimensions that anchor per-ClassLoader static state. This makes sharing conservative (fewer catalogs will share) but correct.
- Or, limit sharing to the case where all config properties are identical (not just the 5 current dimensions). This is the safest interpretation of "same type" — if two catalogs are truly identical in configuration, they genuinely cannot diverge in static state.
- Add a test that creates two same-type catalogs pointing at different backends and verifies they get separate ClassLoader entries in the pool.
Happy to discuss the trade-offs — the refCount/cleanup mechanics look solid and the testConnection leak fix is a clear win regardless.
@yuqi1129 Thanks for your valuable advice. I'm a bit busy today, but I'll take care of this tomorrow.
The failure of the S3 integration test appears to be unrelated to the current PR.
Thanks @yuqi1129 for the thorough review. Here is the status of each concern.

Concern 1: ClassLoaderKey missing backend-URI dimensions — Fixed

The key now covers the following isolation properties:
| Category | Property keys |
|---|---|
| Classpath | package (Catalog.PROPERTY_PACKAGE), authorization-provider |
| Kerberos identity | authentication.type, authentication.kerberos.principal, authentication.kerberos.keytab-uri |
| Backend URIs | metastore.uris, jdbc-url, fs.defaultFS |
These defaults cannot be removed. Operators can add more via a new server config:
gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpoint

ClassLoaderKey stores isolation properties as a generic Map<String, String>, decoupled from specific property names — only CatalogManager.buildClassLoaderKey needs to know which properties matter. This makes the pool infrastructure key-agnostic and extensible without modifying pool classes.
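The Map-based key idea can be sketched as follows. The names here (buildKey, BUILT_IN_KEYS) are illustrative, not the PR's actual code, and the built-in list is taken from the property table above:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: build a ClassLoader isolation key from a fixed built-in list of
// property names plus operator-configured extras. Names are hypothetical.
public class KeySketch {
    static final List<String> BUILT_IN_KEYS = List.of(
        "package", "authorization-provider",
        "authentication.type", "authentication.kerberos.principal",
        "authentication.kerberos.keytab-uri",
        "metastore.uris", "jdbc-url", "fs.defaultFS");

    static Map<String, String> buildKey(
            String provider, Map<String, String> catalogProps, List<String> extraKeys) {
        Map<String, String> key = new LinkedHashMap<>();
        key.put("provider", provider);
        for (String k : BUILT_IN_KEYS) {
            if (catalogProps.containsKey(k)) key.put(k, catalogProps.get(k));
        }
        // extras come from the isolation.extra-properties server config
        for (String k : extraKeys) {
            if (catalogProps.containsKey(k)) key.put(k, catalogProps.get(k));
        }
        return key;
    }

    public static void main(String[] args) {
        Map<String, String> a = buildKey("lakehouse-iceberg",
            Map.of("metastore.uris", "thrift://hms-A:9083"), List.of());
        Map<String, String> b = buildKey("lakehouse-iceberg",
            Map.of("metastore.uris", "thrift://hms-B:9083"), List.of());
        // different backends -> different keys -> separate ClassLoaders
        System.out.println(a.equals(b)); // false
    }
}
```

Because the pool compares keys only by Map equality, adding a new isolation dimension is purely a matter of adding a property name to the list.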
Concern 2: FileSystem.closeAll() can disconnect live catalogs — Resolved by Concern 1 fix
With backend URIs now in the key, catalogs pointing at different HDFS clusters get separate ClassLoaders. doFinalCleanup only runs when refCount reaches 0, so FileSystem.closeAll() cannot affect live catalogs sharing the same ClassLoader.
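The refCount/cleanup mechanics can be sketched in miniature. Entry, acquire, and release are simplified placeholders (a String stands in for IsolatedClassLoader), not the PR's actual API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a reference-counted ClassLoader pool using
// ConcurrentHashMap.compute() for atomic acquire/release.
public class PoolSketch {
    static final class Entry {
        final String loader; // stands in for a shared IsolatedClassLoader
        final AtomicInteger refCount = new AtomicInteger(1);
        Entry(String loader) { this.loader = loader; }
    }

    final Map<String, Entry> pool = new ConcurrentHashMap<>();

    String acquire(String key) {
        // compute() makes get-or-create-and-increment a single atomic step
        return pool.compute(key, (k, e) -> {
            if (e == null) return new Entry("loader-for-" + k);
            e.refCount.incrementAndGet();
            return e;
        }).loader;
    }

    void release(String key) {
        // decrement-and-maybe-remove is also atomic; final cleanup
        // (driver deregistration etc.) would run only when count hits 0
        pool.compute(key, (k, e) -> {
            if (e == null) return null; // resilient to double-release
            return e.refCount.decrementAndGet() == 0 ? null : e;
        });
    }

    public static void main(String[] args) {
        PoolSketch p = new PoolSketch();
        String a = p.acquire("iceberg|hms-A");
        String b = p.acquire("iceberg|hms-A"); // same key -> shared loader
        System.out.println(a.equals(b));       // true
        p.release("iceberg|hms-A");
        p.release("iceberg|hms-A");            // refCount hits 0 -> evicted
        System.out.println(p.pool.isEmpty());  // true
    }
}
```

Doing both transitions inside compute() is what closes the close-during-acquire race: no thread can observe an entry between the decrement to zero and its removal.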
Concern 3: ThreadLocal cross-contamination — Not fully resolved, inherent trade-off
This is a genuine limitation of ClassLoader sharing. With per-catalog ClassLoaders, ThreadLocal values are naturally isolated because each ClassLoader loads its own copy of the class. With a shared ClassLoader, catalogs on the same thread can see each other's ThreadLocal state.
The Concern 1 fix reduces the blast radius — catalogs sharing a ClassLoader now have identical backend configurations, so leaked ThreadLocal state is less likely to cause incorrect behavior (e.g., talking to the wrong metastore). But it does not eliminate the problem. If a library sets a ThreadLocal with per-catalog state (e.g., Hive SessionState, Hadoop SecurityContext), cross-contamination is still possible between catalogs sharing the same ClassLoader on the same thread.
What this PR does:
- Reduces exposure by isolating catalogs with different backends into separate ClassLoaders (Concern 1 fix)
- Provides the extra-properties config as an operational escape hatch — if a specific ThreadLocal issue surfaces, operators can add the relevant property key to force separation without a code release
What this PR does not do:
- It does not guarantee ThreadLocal isolation between catalogs sharing a ClassLoader. This is an inherent trade-off of sharing and cannot be fully solved at the key level.
Open question for the community: Is this trade-off acceptable given the Metaspace savings, or should we consider additional safeguards (e.g., a per-catalog opt-out property to force a dedicated ClassLoader)? Happy to discuss.
Concern 4: AWS SDK v2 static state — Not addressed in this PR
Low severity, and not related to the key design. It can be addressed as a follow-up if this PR is merged.
Suggestion 3: Tests for different-backend isolation — Added
New tests:
- testDifferentMetastoreUrisCreateDifferentEntries
- testSameMetastoreUrisShareEntry
- testDifferentJdbcUrlsCreateDifferentEntries
- testDifferentDefaultFsCreateDifferentEntries
- testKeyWithAuthorizationProvider
Total: 19 unit tests + 3 integration tests.
I will take time to review it again. Thanks for your quick response.
Thank you @yuqi1129 |
What changes were proposed in this pull request?
Introduce a ClassLoaderPool with reference counting to share IsolatedClassLoader instances across catalogs of the same type, and centralize ClassLoader resource cleanup into the pool's lifecycle.

Core mechanism: Catalogs with identical isolation-relevant properties share a single IsolatedClassLoader. The isolation key includes the provider, package, authorization provider, Kerberos identity, and backend URIs (metastore.uris, jdbc-url, fs.defaultFS). Operators can extend the isolation dimensions via gravitino.catalog.classloader.isolation.extra-properties without code changes. The pool uses ConcurrentHashMap.compute() for atomic acquire/release, and performs cleanup (JDBC driver deregistration, ThreadLocal clearing, MySQL AbandonedConnectionCleanupThread shutdown) only when the last catalog releases the shared ClassLoader.

New classes:

- ClassLoaderKey — Map<String, String>-based key for ClassLoader sharing, decoupled from specific property names
- ClassLoaderPool — thread-safe pool with reference counting and lifecycle management
- PooledClassLoaderEntry — holds a shared ClassLoader and its reference count

Changes to existing classes:

- CatalogManager — integrates the pool into catalog creation, test connection, and close paths; fixes ClassLoader leaks in testConnection() and getResolvedProperties(); defines built-in isolation property keys and supports configurable extra keys
- Configs — adds the gravitino.catalog.classloader.isolation.extra-properties configuration
- ClassLoaderResourceCleanerUtils — broadens ThreadLocal cleanup from webserver-only to all application threads; adds MySQL cleanup
- JdbcCatalogOperations, IcebergCatalogWrapper, IcebergCatalogOperations, and PaimonCatalogOperations

Why are the changes needed?
Concurrent catalog creation with different names but the same provider type causes OutOfMemoryError: Metaspace. Each catalog creates an independent IsolatedClassLoader that loads all provider JARs into Metaspace. With MaxMetaspaceSize=512m (default) and Iceberg catalogs consuming ~30-80 MB each, ~10 catalogs exhaust the limit.

This patch addresses four root causes:

- testConnection() — wrapper was never closed after connection test
- ClassLoaderResourceCleanerUtils

Fix: #10093
Does this PR introduce any user-facing change?
Yes. A new optional server configuration is added:

gravitino.catalog.classloader.isolation.extra-properties — comma-separated list of additional catalog property keys used to determine ClassLoader isolation. It supplements the built-in defaults (package, authorization-provider, Kerberos identity, metastore.uris, jdbc-url, fs.defaultFS) and cannot remove them. Default is empty.
Unit tests (TestClassLoaderPool — 19 tests): acquire/release semantics, reference counting, concurrent access with 20 threads, close-during-acquire race, double-release resilience, Kerberos key isolation, backend URI isolation (metastore URIs, JDBC URLs, fs.defaultFS), authorization provider isolation, package property isolation.

Integration tests (TestClassLoaderPoolIntegration — 3 tests): same-type catalogs share a ClassLoader instance, dropping one doesn't affect others, manager close cleans up the pool.

Existing tests:
TestCatalogManager and TestJdbcCatalogOperations pass without modification.

Benchmark (JDK 17, -XX:MaxMetaspaceSize=512m, fileset provider, 10 concurrent threads), measuring Metaspace growth (committed KB) and classes loaded against baseline (main):

Baseline Metaspace grows O(N) with catalog count. The pool stays flat at ~8.7 MB — O(number of distinct keys). No OOM or performance regression on either version. For Iceberg catalogs (~50 MB/ClassLoader), baseline OOMs at ~10 catalogs; with the pool, catalogs sharing the same key reuse a single ClassLoader, so Metaspace scales with the number of distinct configurations rather than the number of catalog instances.
Future extensibility
ClassLoaderKey stores isolation properties as a generic Map<String, String>, decoupled from any specific property names. This makes the pool infrastructure key-agnostic — only the logic that builds the key needs to know which properties matter.

If new ClassLoader-scoped static state is discovered in the future, the isolation criteria can be extended by adding property keys to the server configuration:

gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpoint

CatalogManager reads this list at startup and extracts matching values from catalog properties when building the key. Operators can then add isolation criteria for environment-specific static state without code changes or recompilation. The ClassLoaderKey, ClassLoaderPool, and PooledClassLoaderEntry classes require no modification — only the source of property keys changes.
gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpointCatalogManagerreads this list at startup and extracts matching values from catalog properties when building the key. Operators can then add isolation criteria for environment-specific static state without code changes or recompilation. TheClassLoaderKey,ClassLoaderPool, andPooledClassLoaderEntryrequire no modification — only the source of property keys changes.