AWS: Handle S3 Table Bucket purge gracefully in GlueCatalog (#14449)#16073

Open
yadavay-amzn wants to merge 1 commit into
apache:mainfrom
yadavay-amzn:fix/14449-glue-s3-table-purge

Conversation

@yadavay-amzn
Contributor

When calling GlueCatalog.dropTable() with purge=true on a table in an S3 Table Bucket, the purge fails because S3 Table Buckets do not allow manual file deletion.

This change wraps CatalogUtil.dropTableData() in a try-catch so that purge failures are logged as warnings instead of propagating and failing the entire drop operation. The table is still successfully dropped from the Glue catalog.

Closes #14449

@github-actions github-actions Bot added the AWS label Apr 21, 2026
…4449)

When calling GlueCatalog.dropTable() with purge=true on a table in an
S3 Table Bucket, the purge fails because S3 Table Buckets do not allow
manual file deletion. This change wraps CatalogUtil.dropTableData() in
a try-catch so that purge failures are logged as warnings instead of
propagating and failing the entire drop operation.

Closes apache#14449
@yadavay-amzn yadavay-amzn force-pushed the fix/14449-glue-s3-table-purge branch from 840236c to e996c80 Compare April 21, 2026 22:02
LOG.info("Glue table {} data purged", identifier);
try {
  CatalogUtil.dropTableData(ops.io(), lastMetadata);
  LOG.info("Glue table {} data purged", identifier);
Member

Is there any way to check whether the target table lives in an S3 Table Bucket?

Contributor Author

The location could be checked for S3 Table Bucket ARN patterns, but catching the exception is more robust as it handles any case where purge fails (permissions, bucket policies, etc.) without needing to enumerate all possible URI formats.
Looks like this also aligns with the Trino approach you linked!

Happy to add a URI check if you'd prefer a more targeted approach though.
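
For reference, a targeted location check of the kind mentioned here might look like the minimal sketch below. It assumes that S3 Tables-managed locations use a bucket name ending in `--table-s3`; that suffix, the class name, and the method name are all assumptions for illustration, not something confirmed in this thread or part of GlueCatalog.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not part of Iceberg or GlueCatalog. */
final class S3TablesLocationCheck {
  // Assumption: S3 Tables-managed table locations use a bucket name with this suffix.
  private static final String ASSUMED_BUCKET_SUFFIX = "--table-s3";

  private S3TablesLocationCheck() {}

  /** Returns true if the location looks like an S3 Tables-managed location. */
  static boolean looksLikeS3TablesLocation(String location) {
    if (location == null) {
      return false;
    }
    try {
      URI uri = URI.create(location);
      String scheme = uri.getScheme();
      String bucket = uri.getAuthority();
      // Accept s3, s3a, and s3n style schemes.
      return scheme != null
          && scheme.startsWith("s3")
          && bucket != null
          && bucket.endsWith(ASSUMED_BUCKET_SUFFIX);
    } catch (IllegalArgumentException e) {
      // Unparseable locations are treated as "not S3 Tables".
      return false;
    }
  }
}
```

A check like this depends on a naming convention rather than a contract, which is the fragility the comment above points out.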

Member

I was wondering if we could do both (S3 Table check + try-catch) to avoid redundant S3 requests and warning logs. I think we should keep the try-catch regardless of S3 Table because it may fail for other reasons.

The "Enumerate all possible URI formats" approach doesn't look straightforward. Only adding try-catch looks good to me.

  LOG.info("Glue table {} data purged", identifier);
} catch (Exception e) {
  LOG.warn(
      "Failed to purge data for table: {}, continuing drop without purge", identifier, e);
Member

The table has already been dropped by the time we reach this line, so this change makes sense to me.

The Trino Iceberg connector also suppresses failures when it cannot delete data using the Glue catalog:

https://github.com/trinodb/trino/blob/5a116341b53f9f3a3b29b8b405773010e307e40b/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java#L676-L696

try {
  CatalogUtil.dropTableData(ops.io(), lastMetadata);
  LOG.info("Glue table {} data purged", identifier);
} catch (Exception e) {
Contributor

Is it possible to catch a specific exception rather than catching all of Exception?

Contributor Author

Intentionally catching broad Exception here — the table metadata has already been dropped by this point, so the try-catch is a safety net to prevent any unexpected failure from blocking the drop operation. Narrowing to a specific exception risks missing edge cases from different IO implementations (S3, GCS, HDFS, etc.). This also aligns with the Trino approach that @ebyhr linked.

Member

I'm hesitant about catching this broad an exception. We should be strict about what we catch here; it's a purge path that other use cases could hit.

Contributor Author

Fair point — two reviewers have raised this now. What exception type would you prefer here? The challenge is that CatalogUtil.dropTableData delegates to different IO implementations (S3, GCS, HDFS) which throw different exception types. Would RuntimeException be narrow enough, or would you prefer something S3-specific like SdkServiceException?
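
To make the trade-off concrete, here is a minimal sketch of a narrowed catch. The thread has not established which exception S3 Tables throws on delete, so `PurgeNotSupportedException` is a hypothetical stand-in, and `purgeQuietly` is an illustrative name rather than anything in GlueCatalog. The purge itself is passed in as a `Runnable` so the example stays self-contained.

```java
/** Illustrative sketch only; the names here are hypothetical. */
final class NarrowPurgeCatch {
  /** Hypothetical stand-in for whatever S3 Tables actually throws on delete. */
  static class PurgeNotSupportedException extends RuntimeException {
    PurgeNotSupportedException(String message) {
      super(message);
    }
  }

  private NarrowPurgeCatch() {}

  /**
   * Runs the purge. Returns true if it completed, false if it was skipped because
   * the storage does not support client-side purge. Any other failure propagates.
   */
  static boolean purgeQuietly(Runnable purge) {
    try {
      purge.run();
      return true;
    } catch (PurgeNotSupportedException e) {
      System.err.println("Continuing drop without purge: " + e.getMessage());
      return false;
    }
  }
}
```

With this shape, an unrelated failure such as an `IllegalStateException` thrown from the purge still surfaces to the caller instead of being logged and swallowed.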

Member

@geruh geruh left a comment

Stepping back, this PR is the first time in OSS (on the Java side) that GlueCatalog acknowledges the "federation" story from Glue, where Glue can talk to linked catalogs, whether they're IRC-backed or, in this case, S3 Tables. In some cases federation is painful, as in this managed-storage story where S3 Tables limits normal S3 operations and uses workarounds for location allocation.

With that said, it's worth being explicit about that rather than implicit with a broad try/catch.

So the root cause is that this Glue table is federated from an S3 Tables catalog where data is managed server-side, so our client-side CatalogUtil.dropTableData is unnecessary and will fail. Catching Exception around dropTableData can swallow unrelated failures, like IAM issues or bugs, that we would otherwise want surfaced in non-S3 Tables catalogs, especially during a purge.

There are 3 options here:

  1. Build support for Glue's federation use case. Glue responses expose a federated connection type, so in this case, if it's S3 Tables, we could skip the client-side purge. This is what we do today.

  2. Decide GlueCatalog doesn't model federation yet and keep this scoped down. If we're not ready to introduce Glue federation support, then this PR should at a minimum narrow the try/catch to what S3 Tables throws on delete, and log it. We'd also need some documentation on this.

  3. Block all requests against federated tables. This is quite difficult, if I remember correctly, because a federated table can be linked to a database? I'd need to think this one through more.

I'd prefer option 1 because it will scale with the federation stories in Glue.

WDYT?
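
As a rough illustration of option 1, the decision could be reduced to a small predicate like the sketch below. It is not wired to the AWS SDK: the federated connection name is passed in as a plain, possibly null string, and the value `aws:s3tables` is an assumption about how S3 Tables federation is identified, not something confirmed in this thread. The class and method names are hypothetical.

```java
/** Illustrative sketch only; not part of GlueCatalog. */
final class FederatedPurgeDecision {
  // Assumption: the connection name that marks a table as federated from S3 Tables.
  static final String ASSUMED_S3_TABLES_CONNECTION = "aws:s3tables";

  private FederatedPurgeDecision() {}

  /**
   * Decides whether the client should delete table files itself.
   *
   * @param purgeRequested the purge flag passed to dropTable
   * @param federatedConnectionName connection name from the Glue response, or null if
   *     the table is not federated
   */
  static boolean shouldPurgeClientSide(boolean purgeRequested, String federatedConnectionName) {
    if (!purgeRequested) {
      return false;
    }
    // Storage managed server-side: skip the client-side purge entirely.
    return !ASSUMED_S3_TABLES_CONNECTION.equalsIgnoreCase(federatedConnectionName);
  }
}
```

Because the federated case is skipped before any delete is attempted, a non-federated purge failure would still propagate as it does today.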

        .build());
LOG.info("Successfully dropped table {} from Glue", identifier);
if (purge && lastMetadata != null) {
  CatalogUtil.dropTableData(ops.io(), lastMetadata);
Member

Technically, S3 Tables is purging data behind the scenes, right? So in the case of dropTable(identifier, false) we should throw and force the user to be explicit about purging.

Contributor Author

Good point about S3 Tables purging behind the scenes. The current fix only affects the purge=true path — when purge=false, dropTableData is never called so there's no change in behavior. But I agree the broader S3 Tables story may need more thought beyond this PR.

Member

This is a matter of expected behavior: someone could call drop table with purge set to false, and their data would still be deleted by S3 Tables. I'm leaning towards forcing the user to specify purge when dropping, and failing otherwise.
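
A minimal sketch of that stricter behavior, assuming the catalog can already tell that a table's storage is managed server-side (how it would know is the open question in this thread). The class name, method name, and error message are hypothetical.

```java
/** Illustrative sketch only; not part of GlueCatalog. */
final class ManagedStorageDropCheck {
  private ManagedStorageDropCheck() {}

  /**
   * Rejects a drop without purge when the storage layer deletes the data anyway,
   * so the caller has to opt in to data loss explicitly.
   */
  static void requireExplicitPurge(boolean purge, boolean storageManagedServerSide) {
    if (storageManagedServerSide && !purge) {
      throw new IllegalArgumentException(
          "Dropping this table also deletes its data; call dropTable with purge=true");
    }
  }
}
```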

}

@Test
public void testDropTableWithPurgeFailure() {
Member

Were you able to test this functionality against a real s3tables catalog?

Contributor Author

Haven't tested against a real S3 Tables catalog. Is that strictly necessary for this change, or would the unit test coverage be sufficient?

Member

Well, with some real testing we'd know which exceptions the catch above actually needs to account for, right? We should be strict in the handling here to avoid masking any real bugs.

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions Bot added the stale label May 26, 2026
Development

Successfully merging this pull request may close these issues.

AWS Glue catalog dropTable purge fails when target is an S3 Table bucket

4 participants