[#11534] fix(spark-connector): GravitinoGlueCatalog does not invalidate Iceberg SparkCatalog cache after table mutations #11559

Merged
diqiu50 merged 1 commit into apache:main from diqiu50:fix/glue-iceberg-cache-invalidation
Jun 10, 2026

@diqiu50 diqiu50 commented Jun 10, 2026

Contributor

What changes were proposed in this pull request?

Override invalidateTable in GravitinoGlueCatalog to also clear icebergGlueCatalog's cache. Change BaseCatalog mutation methods to call invalidateTable(ident) via virtual dispatch so the override takes effect.
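The dispatch change described above can be sketched with simplified stand-ins. This is a minimal illustration, not the actual Gravitino code: `RecordingBackend`, `SketchBaseCatalog`, and `SketchGlueCatalog` are hypothetical names, identifiers are plain strings, and the null check assumes the Iceberg backend is created lazily.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a catalog backend; it only records invalidations.
class RecordingBackend {
  final List<String> invalidated = new ArrayList<>();

  void invalidateTable(String ident) {
    invalidated.add(ident);
  }
}

// Simplified stand-in for BaseCatalog.
class SketchBaseCatalog {
  protected final RecordingBackend sparkCatalog = new RecordingBackend();

  public void invalidateTable(String ident) {
    sparkCatalog.invalidateTable(ident);
  }

  public void alterTable(String ident) {
    // ... apply the change to the metadata service (omitted) ...
    // Calling sparkCatalog.invalidateTable(ident) here would bypass any
    // subclass override; calling the instance method lets overrides run.
    invalidateTable(ident);
  }
}

// Simplified stand-in for GravitinoGlueCatalog with a second backend.
class SketchGlueCatalog extends SketchBaseCatalog {
  // Assumed to stay null until the Iceberg backend is first needed.
  RecordingBackend icebergGlueCatalog;

  @Override
  public void invalidateTable(String ident) {
    super.invalidateTable(ident);
    if (icebergGlueCatalog != null) {
      icebergGlueCatalog.invalidateTable(ident);
    }
  }
}

class DispatchSketch {
  public static void main(String[] args) {
    SketchGlueCatalog catalog = new SketchGlueCatalog();
    catalog.icebergGlueCatalog = new RecordingBackend();
    catalog.alterTable("db.t");
    // Both backends have now recorded an invalidation for "db.t".
    System.out.println(catalog.sparkCatalog.invalidated);
    System.out.println(catalog.icebergGlueCatalog.invalidated);
  }
}
```

Because `alterTable` calls the overridable method, a subclass only has to override `invalidateTable` once to cover every mutation path in the base class.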

Why are the changes needed?

icebergGlueCatalog's internal CachingCatalog was never invalidated after table mutations, causing stale schema reads and IllegalStateException: Couldn't find <col> in [...] after ALTER TABLE ADD COLUMNS.
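The failure mode can be illustrated with a toy cache. This is only a sketch under stated assumptions: `SketchMetastore` and `SketchCachingCatalog` are hypothetical classes, a table is reduced to a list of column names, and the real Iceberg CachingCatalog is considerably more involved.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical source of truth, standing in for the metadata service.
class SketchMetastore {
  final Map<String, List<String>> schemas = new HashMap<>();
}

// Hypothetical caching layer: loads a table's columns once, then serves the cached copy.
class SketchCachingCatalog {
  private final SketchMetastore metastore;
  private final Map<String, List<String>> cache = new HashMap<>();

  SketchCachingCatalog(SketchMetastore metastore) {
    this.metastore = metastore;
  }

  List<String> loadTable(String ident) {
    return cache.computeIfAbsent(ident, key -> metastore.schemas.get(key));
  }

  void invalidateTable(String ident) {
    cache.remove(ident);
  }
}

class StaleCacheSketch {
  public static void main(String[] args) {
    SketchMetastore metastore = new SketchMetastore();
    metastore.schemas.put("db.t", List.of("id"));
    SketchCachingCatalog catalog = new SketchCachingCatalog(metastore);

    catalog.loadTable("db.t"); // warms the cache with the original columns
    metastore.schemas.put("db.t", List.of("id", "name")); // simulates ALTER TABLE ADD COLUMNS

    System.out.println(catalog.loadTable("db.t")); // still the pre-ALTER columns
    catalog.invalidateTable("db.t");
    System.out.println(catalog.loadTable("db.t")); // reloaded, now includes "name"
  }
}
```

Without the `invalidateTable` call, any reader that combines the cached table with freshly resolved column names sees two different schemas for the same table.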

Fix #11534

Does this PR introduce any user-facing change?

Yes. Reads of Iceberg Glue tables no longer fail with a stale-schema error after ALTER, DROP, PURGE, or RENAME.

How was this patch tested?

  • Unit tests in TestGravitinoGlueCatalog
  • Integration test testIcebergAlterTableAddColumnCacheInvalidation in SparkGlueCatalogIT, verified against real AWS Glue + S3 (Spark 3.5)

…table mutations in GravitinoGlueCatalog

GravitinoGlueCatalog maintains two separate catalog backends: sparkCatalog
(HiveTableCatalog) for non-Iceberg tables, and icebergGlueCatalog (Iceberg
SparkCatalog) for Iceberg tables. BaseCatalog.alterTable/dropTable/purgeTable/
renameTable all called sparkCatalog.invalidateTable directly, bypassing virtual
dispatch and never invalidating icebergGlueCatalog's internal CachingCatalog.

After ALTER TABLE ADD COLUMNS, the stale CachingCatalog returned the pre-ALTER
SparkTable, causing SparkIcebergTable.newScanBuilder() to use the old schema
while schema() reported the new one, resulting in:
  IllegalStateException: Couldn't find <newCol> in [<oldCols>]

Fix: change BaseCatalog to call invalidateTable(ident) (virtual dispatch) instead
of sparkCatalog.invalidateTable(ident) directly, then override invalidateTable in
GravitinoGlueCatalog to also invalidate icebergGlueCatalog when initialized.
Copilot AI review requested due to automatic review settings June 10, 2026 09:22

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@diqiu50 diqiu50 requested a review from Copilot June 10, 2026 10:00
@diqiu50 diqiu50 changed the title from "[#11534] Fix GravitinoGlueCatalog not invalidating Iceberg SparkCatalog cache after table mutations" to "[#11534] fix(spark-connector): GravitinoGlueCatalog does not invalidate Iceberg SparkCatalog cache after table mutations" Jun 10, 2026
@diqiu50 diqiu50 self-assigned this Jun 10, 2026
@diqiu50 diqiu50 added the branch-1.3 Automatically cherry-pick commit to branch-1.3 label Jun 10, 2026

@github-actions

Code Coverage Report

Overall Project 66.93% -0.31% 🟢
Files changed 48.17% 🔴

Module Coverage
aliyun 1.72% 🔴
api 46.82% 🟢
authorization-common 85.96% 🟢
aws 3.66% 🔴
azure 2.47% 🔴
catalog-common 10.4% 🔴
catalog-fileset 80.23% 🟢
catalog-glue 66.72% -19.27% 🟢
catalog-hive 79.44% 🟢
catalog-jdbc-clickhouse 80.02% 🟢
catalog-jdbc-common 44.22% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% -2.76% 🟢
catalog-jdbc-oceanbase 78.38% 🟢
catalog-jdbc-postgresql 82.29% 🟢
catalog-jdbc-starrocks 78.51% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 58.53% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 85.79% 🟢
catalog-lakehouse-paimon 79.15% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 77.91% +0.16% 🟢
common 49.9% 🟢
core 82.38% 🟢
filesystem-hadoop3 77.27% 🟢
flink 0.0% 🔴
flink-common 45.72% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.39% 🔴
hive-metastore-common 53.9% 🟢
iceberg-common 57.41% 🟢
iceberg-rest-server 73.61% 🟢
idp-basic 86.18% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 20.81% 🔴
lance-rest-server 60.54% 🟢
lineage 53.02% 🟢
optimizer 82.95% 🟢
optimizer-api 21.95% 🔴
server 85.87% 🟢
server-common 73.28% 🟢
spark 28.57% 🔴
spark-common 41.4% -6.7% 🟢
trino-connector 40.13% 🟢
Files
| Module | File | Coverage |
| --- | --- | --- |
| catalog-glue | GlueCatalogOperations.java | 72.16% 🟢 |
| catalog-glue | GlueIcebergTableHelper.java | 38.66% 🔴 |
| catalog-jdbc-mysql | MysqlColumnDefaultValueConverter.java | 37.14% 🔴 |
| client-java | HTTPClient.java | 81.21% 🟢 |
| spark-common | GravitinoGlueCatalog.java | 42.11% 🔴 |
| spark-common | BaseCatalog.java | 1.95% 🔴 |

@roryqi roryqi left a comment

Contributor

LGTM.

@diqiu50 diqiu50 merged commit ff3f385 into apache:main Jun 10, 2026
29 checks passed
jerryshao pushed a commit that referenced this pull request Jun 11, 2026
…lueCatalog does not invalidate Iceberg SparkCatalog cache after table mutations (#11559) (#11577)

**Cherry-pick Information:**
- Original commit: ff3f385
- Target branch: `branch-1.3`
- Status: ✅ Clean cherry-pick (no conflicts)

Co-authored-by: Yuhui <hui@datastrato.com>
@diqiu50 diqiu50 deleted the fix/glue-iceberg-cache-invalidation branch June 18, 2026 06:57
Labels

branch-1.3 Automatically cherry-pick commit to branch-1.3

Development

Successfully merging this pull request may close these issues.

[Bug report] GravitinoGlueCatalog does not invalidate Iceberg SparkCatalog cache after ALTER TABLE, causing stale schema on read