[#10093] fix(core): Introduce ClassLoaderPool to share ClassLoaders across same-type catalogs#10480
LuciferYang wants to merge 3 commits into apache:main
Conversation
> We had tried it in this way, see #2644, but aborted it as a potential risk exists in classloader sharing, including
>
> These two factors mentioned above will affect the correctness of the Gravitino catalogs, so we hesitate to move on.
@yuqi1129 Thank you for the context on #2644. The three concerns raised by @jerryshao are valid for that PR's approach — here's how this PR differs and addresses each one.

Background: how this PR differs from #2644

PR #2644 proposed reusing ClassLoaders during alter operations on a single catalog, passing the old ClassLoader to the new. This PR takes a fundamentally different approach: a reference-counted pool that shares ClassLoaders across different catalog instances based on a composite key. The alter path is unaffected — alter still invalidates the old cache entry (releasing the pool reference) and creates a new wrapper (acquiring from the pool). The pool handles whether to reuse or create a new ClassLoader based on whether the key changed.

Addressing the three specific concerns

1. "The classloader reuse for alteration makes the code a little broken"

This PR does not change the alter path's control flow. When a catalog is altered:
2. "If we modify some properties that require rebooting and refreshing the classloader, with this we cannot support it"

This concern applies when alter changes properties that affect ClassLoader construction. In this PR, the properties that affect ClassLoader construction are part of the key, so altering one of them yields a new key and therefore a fresh ClassLoader. If the alter changes a property that is NOT part of the key (e.g., metastore URI, JDBC URL), the ClassLoader is reused — which is correct, because these properties do not affect ClassLoader construction (they are consumed by the catalog code at runtime).

3. "Sharing classloader between catalogs is dangerous, because some static variables that were created based on logic A will be used by another catalog that are based on logic B"

I audited all catalog implementations for static mutable state:
The key insight is that ClassLoader construction in Gravitino depends only on JAR paths (provider + package + auth plugin), not on catalog runtime properties. Two catalogs with the same key therefore load exactly the same set of JARs.

Future extensibility
If new ClassLoader-scoped static state is discovered in the future, the isolation criteria can be extended by introducing a server configuration such as:

gravitino.catalog.classloader.isolation.properties = authentication.type,authentication.kerberos.principal,authentication.kerberos.keytab-uri
Since I don't have a thorough understanding of the project yet, please correct me if I've misunderstood anything.
yuqi1129 left a comment:
Thanks for this PR — the OOM/Metaspace problem is real and the benchmark numbers are impressive. I spent some time studying the design carefully and have a few concerns about correctness under certain real-world configurations. I'd like to discuss these before merging.
Background: What ClassLoader sharing gives up
IsolatedClassLoader was designed to give every catalog its own isolated class space so that third-party library static state (Hadoop UGI, FileSystem cache, JDBC DriverManager, HiveConf, etc.) cannot bleed between catalogs. ClassLoaderPool partially relaxes that isolation. The question is whether ClassLoaderKey captures all the dimensions along which static state can diverge between two catalogs of the same type.
Concern 1: ClassLoaderKey is missing critical backend-URI dimensions
The current key is:
provider + packageProperty + authorizationPkgPath + kerberosPrincipal + kerberosKeytab
This is correct for isolating Kerberos identity, but it doesn't account for catalogs that point to different backends of the same type. Consider:
Scenario — two Iceberg catalogs with different HMS URIs:
catalog-A: provider=lakehouse-iceberg, metastore.uris=thrift://hms-A:9083
catalog-B: provider=lakehouse-iceberg, metastore.uris=thrift://hms-B:9083
Both produce the same ClassLoaderKey and share one IsolatedClassLoader. Inside that ClassLoader, HiveConf has a static configuration space. HiveConf.ConfVars and the valuesPerLabel cache are static. If catalog-B's initialization overwrites catalog-A's HMS URI in HiveConf, catalog-A starts talking to the wrong metastore.
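The overwrite hazard described above can be sketched with a plain static field. This is a hypothetical stand-in (SharedBackendState is not Gravitino or Hive code), only illustrating how a static configuration space behaves once two catalogs share one ClassLoader:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a library's static configuration space
// (the role HiveConf's static caches play); not actual Hive code.
class SharedBackendState {
    static final Map<String, String> CONF = new HashMap<>();
}

public class StaticContamination {
    public static void main(String[] args) {
        // catalog-A initializes inside the shared ClassLoader
        SharedBackendState.CONF.put("metastore.uris", "thrift://hms-A:9083");
        // catalog-B initializes later in the SAME ClassLoader and overwrites it
        SharedBackendState.CONF.put("metastore.uris", "thrift://hms-B:9083");
        // catalog-A now reads catalog-B's URI
        System.out.println(SharedBackendState.CONF.get("metastore.uris")); // thrift://hms-B:9083
    }
}
```

With per-catalog ClassLoaders, each catalog gets its own copy of the class and therefore its own static map; sharing the ClassLoader collapses the two maps into one.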
The same problem exists for:

- fs.defaultFS (Hadoop's FileSystem.CACHE is a per-ClassLoader static keyed on URI + conf + UserGroupInformation; two catalogs pointing at different HDFS clusters may cross-contaminate the FileSystem cache)
- JDBC URL (less likely to corrupt static state, but the AbandonedConnectionCleanupThread and driver registry are global per-ClassLoader)
Suggested fix: extend ClassLoaderKey to include the backend URI(s) that anchor static state. For Iceberg/Paimon Hive-backend: metastore.uris. For JDBC catalogs: jdbc-url. For fileset catalogs: fs.defaultFS.
Concern 2: FileSystem.closeAll() during doFinalCleanup can disconnect live catalogs
closeStatsDataClearerInFileSystem calls FileSystem.closeAll() — a static method that closes every cached FileSystem in that ClassLoader's cache. doFinalCleanup runs only when refCount reaches 0 (i.e., the last catalog sharing this ClassLoader is closed), so under the current logic it won't fire while other catalogs are live.
However, if Concern 1 is fixed and two catalogs with different HDFS URIs are given separate keys, this is safe. But as long as ClassLoaderKey is under-specified (same key for different backends), there is a window where catalog-A's cleanup kills catalog-B's live HDFS connections.
Concern 3: ThreadLocal cross-contamination during shared lifetime
With per-catalog ClassLoaders, ThreadLocal values from different catalogs are of different Class objects (loaded by different ClassLoaders), so they are naturally isolated. With a shared ClassLoader, Catalog A and Catalog B load the same Class, meaning a ThreadLocal set by catalog-A's code on a Jetty thread is visible to catalog-B's code running on the same thread.
Concretely: if Iceberg or Hive sets a per-request ThreadLocal (e.g., Hadoop SecurityContext, Hive SessionState, or Iceberg's ResolvingFileIO context), catalog-B could pick up a stale value left by catalog-A. This is especially risky for Hive SessionState, which is a ThreadLocal singleton used throughout HMS interaction.
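The mechanism behind this concern can be shown with a minimal sketch. SessionHolder is a hypothetical stand-in for a library ThreadLocal (the pattern Hive's SessionState uses), not real Hive code:

```java
// Hypothetical stand-in for a library-held ThreadLocal; with a shared
// ClassLoader, both catalogs see the SAME ThreadLocal instance.
class SessionHolder {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
}

public class ThreadLocalLeak {
    public static void main(String[] args) {
        // catalog-A's request handler sets the ThreadLocal on a pooled
        // thread and forgets to clear it
        SessionHolder.CURRENT.set("catalog-A-session");

        // catalog-B's handler later runs on the same thread; because both
        // catalogs share one ClassLoader, the stale value leaks through
        System.out.println(SessionHolder.CURRENT.get()); // catalog-A-session
    }
}
```

With separate ClassLoaders, catalog-B would load its own SessionHolder class, whose ThreadLocal starts out unset on every thread.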
Comparison with industry practice
The closest industry analogue is Trino's plugin ClassLoader: a single ClassLoader is shared across all connector instances of the same plugin. Trino makes this safe because its connectors are stateless — all per-request state lives in ConnectorSession, not in static fields. Gravitino's catalogs are stateful (they hold HMS connections, HDFS FileSystems, UGI login state), which is the key difference that makes sharing riskier here.
Summary of concerns
| Issue | Severity | Affected catalogs |
|---|---|---|
| ClassLoaderKey missing metastore.uris / fs.defaultFS | High | Iceberg-Hive, Hive, Paimon-Hive, fileset |
| FileSystem.closeAll() can disconnect live catalog (if keys are under-specified) | High (conditional on above) | Any catalog using HDFS |
| ThreadLocal cross-contamination between shared-ClassLoader catalogs | Medium | Iceberg, Hive (SessionState) |
| AWS SDK v2 static state not cleaned up (only v1 handled) | Low | Iceberg/Paimon with S3 backend |
Suggestion
The OOM fix is valuable and I'd like to see it merged. One path forward:
- Extend ClassLoaderKey to include the backend URI dimensions that anchor per-ClassLoader static state. This makes sharing conservative (fewer catalogs will share) but correct.
- Or, limit sharing to the case where all config properties are identical (not just the 5 current dimensions). This is the safest interpretation of "same type" — if two catalogs are truly identical in configuration, they genuinely cannot diverge in static state.
- Add a test that creates two same-type catalogs pointing at different backends and verifies they get separate ClassLoader entries in the pool.
Happy to discuss the trade-offs — the refCount/cleanup mechanics look solid and the testConnection leak fix is a clear win regardless.
@yuqi1129 Thanks for your valuable advice. I'm a bit busy today, but I'll take care of this tomorrow.
The failure of the S3 integration test appears to be unrelated to the current PR.
Thanks @yuqi1129 for the thorough review. Here is the status of each concern.

Concern 1: ClassLoaderKey missing backend-URI dimensions — Fixed

The key now covers the following isolation properties:
| Category | Property keys |
|---|---|
| Classpath | package (Catalog.PROPERTY_PACKAGE), authorization-provider |
| Kerberos identity | authentication.type, authentication.kerberos.principal, authentication.kerberos.keytab-uri |
| Backend URIs | metastore.uris, jdbc-url, fs.defaultFS |
These defaults cannot be removed. Operators can add more via a new server config:
gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpoint

ClassLoaderKey stores isolation properties as a generic Map<String, String>, decoupled from specific property names — only CatalogManager.buildClassLoaderKey needs to know which properties matter. This makes the pool infrastructure key-agnostic and extensible without modifying pool classes.
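The Map-based key idea can be sketched as follows. The names here (buildKey, BUILT_IN_KEYS) are illustrative, not the PR's actual code, and the built-in list is taken from the property table above:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: build a ClassLoader isolation key from a fixed built-in list of
// property names plus operator-configured extras. Names are hypothetical.
public class KeySketch {
    static final List<String> BUILT_IN_KEYS = List.of(
        "package", "authorization-provider",
        "authentication.type", "authentication.kerberos.principal",
        "authentication.kerberos.keytab-uri",
        "metastore.uris", "jdbc-url", "fs.defaultFS");

    static Map<String, String> buildKey(
            String provider, Map<String, String> catalogProps, List<String> extraKeys) {
        Map<String, String> key = new LinkedHashMap<>();
        key.put("provider", provider);
        for (String k : BUILT_IN_KEYS) {
            if (catalogProps.containsKey(k)) key.put(k, catalogProps.get(k));
        }
        // extras come from the isolation.extra-properties server config
        for (String k : extraKeys) {
            if (catalogProps.containsKey(k)) key.put(k, catalogProps.get(k));
        }
        return key;
    }

    public static void main(String[] args) {
        Map<String, String> a = buildKey("lakehouse-iceberg",
            Map.of("metastore.uris", "thrift://hms-A:9083"), List.of());
        Map<String, String> b = buildKey("lakehouse-iceberg",
            Map.of("metastore.uris", "thrift://hms-B:9083"), List.of());
        // different backends -> different keys -> separate ClassLoaders
        System.out.println(a.equals(b)); // false
    }
}
```

Because the pool compares keys only by Map equality, adding a new isolation dimension is purely a matter of adding a property name to the list.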
Concern 2: FileSystem.closeAll() can disconnect live catalogs — Resolved by Concern 1 fix
With backend URIs now in the key, catalogs pointing at different HDFS clusters get separate ClassLoaders. doFinalCleanup only runs when refCount reaches 0, so FileSystem.closeAll() cannot affect live catalogs sharing the same ClassLoader.
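The refCount/cleanup mechanics can be sketched in miniature. Entry, acquire, and release are simplified placeholders (a String stands in for IsolatedClassLoader), not the PR's actual API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a reference-counted ClassLoader pool using
// ConcurrentHashMap.compute() for atomic acquire/release.
public class PoolSketch {
    static final class Entry {
        final String loader; // stands in for a shared IsolatedClassLoader
        final AtomicInteger refCount = new AtomicInteger(1);
        Entry(String loader) { this.loader = loader; }
    }

    final Map<String, Entry> pool = new ConcurrentHashMap<>();

    String acquire(String key) {
        // compute() makes get-or-create-and-increment a single atomic step
        return pool.compute(key, (k, e) -> {
            if (e == null) return new Entry("loader-for-" + k);
            e.refCount.incrementAndGet();
            return e;
        }).loader;
    }

    void release(String key) {
        // decrement-and-maybe-remove is also atomic; final cleanup
        // (driver deregistration etc.) would run only when count hits 0
        pool.compute(key, (k, e) -> {
            if (e == null) return null; // resilient to double-release
            return e.refCount.decrementAndGet() == 0 ? null : e;
        });
    }

    public static void main(String[] args) {
        PoolSketch p = new PoolSketch();
        String a = p.acquire("iceberg|hms-A");
        String b = p.acquire("iceberg|hms-A"); // same key -> shared loader
        System.out.println(a.equals(b));       // true
        p.release("iceberg|hms-A");
        p.release("iceberg|hms-A");            // refCount hits 0 -> evicted
        System.out.println(p.pool.isEmpty());  // true
    }
}
```

Doing both transitions inside compute() is what closes the close-during-acquire race: no thread can observe an entry between the decrement to zero and its removal.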
Concern 3: ThreadLocal cross-contamination — Not fully resolved, inherent trade-off
This is a genuine limitation of ClassLoader sharing. With per-catalog ClassLoaders, ThreadLocal values are naturally isolated because each ClassLoader loads its own copy of the class. With a shared ClassLoader, catalogs on the same thread can see each other's ThreadLocal state.
The Concern 1 fix reduces the blast radius — catalogs sharing a ClassLoader now have identical backend configurations, so leaked ThreadLocal state is less likely to cause incorrect behavior (e.g., talking to the wrong metastore). But it does not eliminate the problem. If a library sets a ThreadLocal with per-catalog state (e.g., Hive SessionState, Hadoop SecurityContext), cross-contamination is still possible between catalogs sharing the same ClassLoader on the same thread.
What this PR does:
- Reduces exposure by isolating catalogs with different backends into separate ClassLoaders (Concern 1 fix)
- Provides the extra-properties config as an operational escape hatch — if a specific ThreadLocal issue surfaces, operators can add the relevant property key to force separation without a code release
What this PR does not do:
- It does not guarantee ThreadLocal isolation between catalogs sharing a ClassLoader. This is an inherent trade-off of sharing and cannot be fully solved at the key level.
Open question for the community: Is this trade-off acceptable given the Metaspace savings, or should we consider additional safeguards (e.g., a per-catalog opt-out property to force a dedicated ClassLoader)? Happy to discuss.
Concern 4: AWS SDK v2 static state — Not addressed in this PR
Low severity, and not related to the key design. It can be addressed as a follow-up if this PR is merged.
Suggestion 3: Tests for different-backend isolation — Added
New tests:
- testDifferentMetastoreUrisCreateDifferentEntries
- testSameMetastoreUrisShareEntry
- testDifferentJdbcUrlsCreateDifferentEntries
- testDifferentDefaultFsCreateDifferentEntries
- testKeyWithAuthorizationProvider
Total: 19 unit tests + 3 integration tests.
I will take time to review it again. Thanks for your quick response.
Thank you @yuqi1129 |
What changes were proposed in this pull request?
Introduce a ClassLoaderPool with reference counting to share IsolatedClassLoader instances across catalogs of the same type, and centralize ClassLoader resource cleanup into the pool's lifecycle.

Core mechanism: Catalogs with identical isolation-relevant properties share a single IsolatedClassLoader. The isolation key includes the provider, package, authorization provider, Kerberos identity, and backend URIs (metastore.uris, jdbc-url, fs.defaultFS). Operators can extend the isolation dimensions via gravitino.catalog.classloader.isolation.extra-properties without code changes. The pool uses ConcurrentHashMap.compute() for atomic acquire/release, and performs cleanup (JDBC driver deregistration, ThreadLocal clearing, MySQL AbandonedConnectionCleanupThread shutdown) only when the last catalog releases the shared ClassLoader.

New classes:

- ClassLoaderKey — Map<String, String>-based key for ClassLoader sharing, decoupled from specific property names
- ClassLoaderPool — thread-safe pool with reference counting and lifecycle management
- PooledClassLoaderEntry — holds a shared ClassLoader and its reference count

Changes to existing classes:

- CatalogManager — integrates the pool into catalog creation, test connection, and close paths; fixes ClassLoader leaks in testConnection() and getResolvedProperties(); defines built-in isolation property keys and supports configurable extra keys
- Configs — adds the gravitino.catalog.classloader.isolation.extra-properties configuration
- ClassLoaderResourceCleanerUtils — broadens ThreadLocal cleanup from webserver-only to all application threads; adds MySQL cleanup
- JdbcCatalogOperations, IcebergCatalogWrapper, IcebergCatalogOperations, and PaimonCatalogOperations

Why are the changes needed?
Concurrent catalog creation with different names but the same provider type causes OutOfMemoryError: Metaspace. Each catalog creates an independent IsolatedClassLoader that loads all provider JARs into Metaspace. With MaxMetaspaceSize=512m (default) and Iceberg catalogs consuming ~30-80 MB each, ~10 catalogs exhaust the limit.

This patch addresses four root causes:

- testConnection() — wrapper was never closed after connection test
- ClassLoaderResourceCleanerUtils

Fix: #10093
Does this PR introduce any user-facing change?
Yes. A new optional server configuration is added:

gravitino.catalog.classloader.isolation.extra-properties — comma-separated list of additional catalog property keys used to determine ClassLoader isolation. It supplements the built-in defaults (package, authorization-provider, Kerberos identity, metastore.uris, jdbc-url, fs.defaultFS) and cannot remove them. Default is empty.
Unit tests (TestClassLoaderPool — 19 tests): acquire/release semantics, reference counting, concurrent access with 20 threads, close-during-acquire race, double-release resilience, Kerberos key isolation, backend URI isolation (metastore URIs, JDBC URLs, fs.defaultFS), authorization provider isolation, package property isolation.

Integration tests (TestClassLoaderPoolIntegration — 3 tests): same-type catalogs share a ClassLoader instance, dropping one doesn't affect others, manager close cleans up the pool.

Existing tests:
TestCatalogManager and TestJdbcCatalogOperations pass without modification.

Benchmark (JDK 17, -XX:MaxMetaspaceSize=512m, fileset provider, 10 concurrent threads), measuring Metaspace growth (committed KB) and classes loaded against baseline (main):

Baseline Metaspace grows O(N) with catalog count. The pool stays flat at ~8.7 MB — O(number of distinct keys). No OOM or performance regression on either version. For Iceberg catalogs (~50 MB/ClassLoader), baseline OOMs at ~10 catalogs; with the pool, catalogs sharing the same key reuse a single ClassLoader, so Metaspace scales with the number of distinct configurations rather than the number of catalog instances.
Future extensibility
ClassLoaderKey stores isolation properties as a generic Map<String, String>, decoupled from any specific property names. This makes the pool infrastructure key-agnostic — only the logic that builds the key needs to know which properties matter.

If new ClassLoader-scoped static state is discovered in the future, the isolation criteria can be extended by adding property keys to the server configuration:

gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpoint

CatalogManager reads this list at startup and extracts matching values from catalog properties when building the key. Operators can then add isolation criteria for environment-specific static state without code changes or recompilation. The ClassLoaderKey, ClassLoaderPool, and PooledClassLoaderEntry classes require no modification — only the source of property keys changes.
gravitino.catalog.classloader.isolation.extra-properties = custom.backend.endpointCatalogManagerreads this list at startup and extracts matching values from catalog properties when building the key. Operators can then add isolation criteria for environment-specific static state without code changes or recompilation. TheClassLoaderKey,ClassLoaderPool, andPooledClassLoaderEntryrequire no modification — only the source of property keys changes.