-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29035: Fixing cache handling for REST catalog #5882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...etastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSCachingCatalog.java
Outdated
Show resolved
Hide resolved
...etastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSCachingCatalog.java
Outdated
Show resolved
Hide resolved
...metastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSEventListener.java
Outdated
Show resolved
Hide resolved
e73bb8c
to
acc2463
Compare
a07bcbd
to
f5c717e
Compare
- add an event listener to invalidate cached tables impervious to source of change (direct HMS or REST); - added configuration option for event class handler; - lengthened default cache TTL;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, looks good. I left some comments. I also changed the title because HIVE-29016 has been already merged and released.
...etastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSCatalogFactory.java
Outdated
Show resolved
Hide resolved
...etastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSCatalogFactory.java
Show resolved
Hide resolved
…apache/iceberg/rest/HMSCatalogFactory.java Co-authored-by: Shohei Okumiya <okumin@apache.org>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
final TableIdentifier canonicalized = identifier.toLowerCase(); | ||
Table cachedTable = cache.getIfPresent(canonicalized); | ||
if (cachedTable != null) { | ||
String location = catalog.getTableLocation(canonicalized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henrib, did you even test the code ??? I don't see even a single test that validates the caching functionality
catalog.getTableLocation(canonicalized)
file:/var/folders/qy/whx78wjd1w72_z8_wgqfwd_c0000gn/T/RESTCatalogServer_b5e4f53d-d4c9-4b1a-8897-c9b9e82ef29e/external/newdb.db/newtable/metadata/00000-fd422e1e-bf30-4529-9baf-606662a2253a.metadata.json
cachedTable.location()
file:/var/folders/qy/whx78wjd1w72_z8_wgqfwd_c0000gn/T/RESTCatalogServer_b5e4f53d-d4c9-4b1a-8897-c9b9e82ef29e/external/newdb.db/newtable
one returns metadata file location and another just the root table location. I am reverting this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To ensure the cached table is not stale, we check against the stored reference that the location it points to is still current. This situation (staleness) can occur if users are updating the table from HMS/Hive concurrently with through Catalog and/or HA. @okumin do you concur ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current server-side cache assumes the following race conditions. Say we have two HAed HMS instances X and Y with RANDOM load-balancing.
- User A caches a table at T1 in HMS X
- User B updates the same table to T2 and caches it in HMS Y
- User C fetches the same table from HMS X, and HMS X returns the table at T1
I suppose the server-side eventual consistency requires us to design and advertise the consistency semantics carefully. I'd say the client side should do such a control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
location objects won't ever match !!!
one returns metadata file location and another just the root table location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the server-side eventual consistency requires us to design and advertise the consistency semantics carefully. I'd say the client side should do such a control.
@okumin, I think similar thing was discussed in https://lists.apache.org/thread/zb5ldjfxvjz4fjwll1lc9cm8rl015zx2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
location objects won't ever match !!! one returns metadata file location and another just the root table location
You are absolutely right, I must missed that during debug, thanks for catching it. It could have been fixed with a call to:
private static String getMetadataLocation(final Table table) {
if (table instanceof HasTableOperations tableOps) {
final TableOperations ops = tableOps.operations();
final TableMetadata meta;
if (ops != null && (meta = ops.current()) != null) {
return meta.metadataFileLocation();
}
}
return null;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 remark: hive-catalog
is a clone of iceberg's hive-metastore
. So whenever we need to apply changes there - we should raise a PR for iceberg as well.
maybe instead of adding getTableLocation
we could use
getMetadataLocation(catalog.loadTable(tableidentifier))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
location objects won't ever match !!!
You're right. My bad. Let's add an addendum patch.
1 remark: hive-catalog is a clone of iceberg's hive-metastore. So whenever we need to apply changes there - we should raise a PR for iceberg as well.
I'm not finding the policy about how to manage cloned files, but it is also right anyway. I'm wondering if we can push the entire caching feature into HiveCatalog in apache/iceberg. I can try it in my spare time if no one tries it. Probably off topic
@okumin, I think similar thing was discussed in https://lists.apache.org/thread/zb5ldjfxvjz4fjwll1lc9cm8rl015zx2
Thanks for identifying the related discussion. We might want to implement the finally-decided solution, probably the ETag-based one.
This reverts commit 9234ddd.
} | ||
} | ||
Table table = cache.get(canonicalized, catalog::loadTable); | ||
if (table instanceof BaseMetadataTable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we only supposed to cache metadata tables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 92 does cache non-metadata table. Per the caffeine doc, cache 'get' obeys "if cached, return; otherwise create, cache and return" pattern ; line 93 and after are meant to deal with the specifics of metadata table.
What am I missing ?
Newer version in #6022 with an actual test that verifies the correct caching behavior. |
What changes were proposed in this pull request?
Adding logic to verify if a cached table actually points to the latest known version
Why are the changes needed?
Cache handling for REST Iceberg catalog may return stale tables.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests