Skip to content

HIVE-29035: Fixing cache handling for REST catalog #5882

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Aug 9, 2025

Conversation

henrib
Copy link
Contributor

@henrib henrib commented Jun 20, 2025

  • add logic check on cache to ensure returned table is not stale

What changes were proposed in this pull request?

Adding logic to verify if a cached table actually points to the latest known version

Why are the changes needed?

Cache handling for REST Iceberg catalog may return stale tables.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests

henrib added 2 commits June 30, 2025 17:46
- add an event listener to invalidate cached tables impervious to source of change (direct HMS or REST);
- added configuration option for event class handler;
- lengthened default cache TTL;
@okumin okumin changed the title HIVE-29016: fixing cache handling for REST catalog; HIVE-29035: Fixing cache handling for REST catalog; Aug 6, 2025
Copy link
Contributor

@okumin okumin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, looks good. I left some comments. I also changed the title because HIVE-29016 has been already merged and released.

…apache/iceberg/rest/HMSCatalogFactory.java

Co-authored-by: Shohei Okumiya <okumin@apache.org>
@okumin okumin changed the title HIVE-29035: Fixing cache handling for REST catalog; HIVE-29035: Fixing cache handling for REST catalog Aug 7, 2025
Copy link

sonarqubecloud bot commented Aug 7, 2025

Copy link
Contributor

@okumin okumin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@okumin okumin merged commit 9234ddd into apache:master Aug 9, 2025
4 checks passed
final TableIdentifier canonicalized = identifier.toLowerCase();
Table cachedTable = cache.getIfPresent(canonicalized);
if (cachedTable != null) {
String location = catalog.getTableLocation(canonicalized);
Copy link
Member

@deniskuzZ deniskuzZ Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henrib, did you even test the code ??? I don't see even a single test that validates the caching functionality

catalog.getTableLocation(canonicalized)
file:/var/folders/qy/whx78wjd1w72_z8_wgqfwd_c0000gn/T/RESTCatalogServer_b5e4f53d-d4c9-4b1a-8897-c9b9e82ef29e/external/newdb.db/newtable/metadata/00000-fd422e1e-bf30-4529-9baf-606662a2253a.metadata.json

cachedTable.location()
file:/var/folders/qy/whx78wjd1w72_z8_wgqfwd_c0000gn/T/RESTCatalogServer_b5e4f53d-d4c9-4b1a-8897-c9b9e82ef29e/external/newdb.db/newtable

one returns metadata file location and another just the root table location. I am reverting this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure the cached table is not stale, we check against the stored reference that the location it points to is still current. This situation (staleness) can occur if users are updating the table from HMS/Hive concurrently with through Catalog and/or HA. @okumin do you concur ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current server-side cache assumes the following race conditions. Say we have two HAed HMS instances X and Y with RANDOM load-balancing.

  1. User A caches a table at T1 in HMS X
  2. User B updates the same table to T2 and caches it in HMS Y
  3. User C fetches the same table from HMS X, and HMS X returns the table at T1

I suppose the server-side eventual consistency requires us to design and advertise the consistency semantics carefully. I'd say the client side should do such a control.

Copy link
Member

@deniskuzZ deniskuzZ Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

location objects won't ever match !!!
one returns metadata file location and another just the root table location

Copy link
Member

@deniskuzZ deniskuzZ Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the server-side eventual consistency requires us to design and advertise the consistency semantics carefully. I'd say the client side should do such a control.

@okumin, I think similar thing was discussed in https://lists.apache.org/thread/zb5ldjfxvjz4fjwll1lc9cm8rl015zx2

Copy link
Contributor Author

@henrib henrib Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

location objects won't ever match !!! one returns metadata file location and another just the root table location

You are absolutely right, I must missed that during debug, thanks for catching it. It could have been fixed with a call to:

  private static String getMetadataLocation(final Table table) {
    if (table instanceof HasTableOperations tableOps) {
      final TableOperations ops = tableOps.operations();
      final TableMetadata meta;
      if (ops != null && (meta = ops.current()) != null) {
        return meta.metadataFileLocation();
      }
    }
    return null;
  }

Copy link
Member

@deniskuzZ deniskuzZ Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 remark: hive-catalog is a clone of iceberg's hive-metastore. So whenever we need to apply changes there - we should raise a PR for iceberg as well.

maybe instead of adding getTableLocation we could use

getMetadataLocation(catalog.loadTable(tableidentifier))

Copy link
Contributor

@okumin okumin Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

location objects won't ever match !!!

You're right. My bad. Let's add an addendum patch.

1 remark: hive-catalog is a clone of iceberg's hive-metastore. So whenever we need to apply changes there - we should raise a PR for iceberg as well.

I'm not finding the policy about how to manage cloned files, but it is also right anyway. I'm wondering if we can push the entire caching feature into HiveCatalog in apache/iceberg. I can try it in my spare time if no one tries it. Probably off topic

@okumin, I think similar thing was discussed in https://lists.apache.org/thread/zb5ldjfxvjz4fjwll1lc9cm8rl015zx2

Thanks for identifying the related discussion. We might want to implement the finally-decided solution, probably the ETag-based one.

}
}
Table table = cache.get(canonicalized, catalog::loadTable);
if (table instanceof BaseMetadataTable) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we only supposed to cache metadata tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 92 does cache non-metadata table. Per the caffeine doc, cache 'get' obeys "if cached, return; otherwise create, cache and return" pattern ; line 93 and after are meant to deal with the specifics of metadata table.
What am I missing ?

@henrib
Copy link
Contributor Author

henrib commented Aug 12, 2025

Newer version in #6022 with an actual test that verifies the correct caching behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants