HIVE-29035 : new version of cache #6441
Conversation
…e DB to ensure no stale table object is returned;
- improved check; - quiesce console logs due to internal throws in servlet;
|
@ayushtkn if you have some spare time to review, thanks |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@henrib there are some warnings from Sonar & I feel few could be fixed |
| String tableName = baseTableIdentifier.name(); | ||
| try { | ||
| List<Table> tables = clients.run( | ||
| client -> client.getTables(catName, database, Collections.singletonList(tableName), PARAM_SPEC) |
There was a problem hiding this comment.
TODO: It is possible to use IMetaStoreClient#getTable, but #getTables with PARAM_SPEC could be more lightweight. Which is better?
| LOGGER.debug("Table {} not found: {}", baseTableIdentifier, e.getMessage()); | ||
| throw e; | ||
| } catch (NoSuchObjectException e) { | ||
| throw new NoSuchTableException("Table %s not found: %s", baseTableIdentifier, e.getMessage()); |
There was a problem hiding this comment.
This might also not happen since we obtain the list of tables. Also, this method can either return null or throw NoSuchTableException. We may use only either.
| // A RESTException is thrown by HMSCatalogAdapter.execute() after the error handler has | ||
| // already written the correct HTTP status and body to the response (e.g. 404, 403). | ||
| // It is not an unexpected server failure, so log at DEBUG to avoid flooding the console. | ||
| LOG.debug("REST request resulted in a client error (already handled): {}", e.getMessage()); |
There was a problem hiding this comment.
We may not need this one now
|
Please also address issues reported by SonarQube. |
…cache performance metrics; - remove end point to access cache performance metrics; - enhanced tests to check L1 cache; - simplified MetadataLocator exception handling;
Can't remove all of them in particular usage of clientPool(). |
| // so the first post-reset load is a genuine cold miss rather than an L1/L2 hit. | ||
| // NOTE: catalog.invalidateTable() only clears the REST *client* state and does not | ||
| // reach the server-side cache. | ||
| serverCachingCatalog.invalidateTable(tableId); |
There was a problem hiding this comment.
I feel we should write this not as an integration test but as a unit test. If we do so, we can remove an escape hatch, i.e., getLatestCache
| MBeanServer mbs = ManagementFactory.getPlatformMBeanServer(); | ||
| String sanitized = catalogName == null || catalogName.isEmpty() | ||
| ? "default" | ||
| : catalogName.replaceAll("[^a-zA-Z0-9.\\\\-]", "_"); |
There was a problem hiding this comment.
Should we use CATALOG_DEFAULT in MetastoreConf?
| this.jmxObjectName = name; | ||
| LOG.info("Registered JMX MBean: {}", name); | ||
| } catch (JMException e) { | ||
| LOG.warn("Failed to register JMX MBean for HMSCachingCatalog", e); |
There was a problem hiding this comment.
This is probably eligible for error
| LOG.warn("Failed to register JMX MBean for HMSCachingCatalog", e); | |
| LOG.error("Failed to register JMX MBean for HMSCachingCatalog", e); |
| private final int l1CacheSize; | ||
|
|
||
| // Metrics counters. | ||
| private final AtomicLong cacheHitCount = new AtomicLong(0); |
There was a problem hiding this comment.
LongAdder could be an option
| this(catalog, expirationMs, /*caseSensitive*/ true); | ||
| } | ||
|
|
||
| public HMSCachingCatalog(HiveCatalog catalog, long expirationMs, boolean caseSensitive) { |
There was a problem hiding this comment.
Can we use this constructor in the near future? If no, we may always use the default sensitivity.
| * | ||
| * @param tid the table identifier | ||
| */ | ||
| protected void onCacheMetaLoad(TableIdentifier tid) { |
There was a problem hiding this comment.
| protected void onCacheMetaLoad(TableIdentifier tid) { | |
| private void onCacheMetaLoad(TableIdentifier tid) { |
| * | ||
| * @param tid the table identifier | ||
| */ | ||
| protected void onL1CacheHit(TableIdentifier tid) { |
There was a problem hiding this comment.
| protected void onL1CacheHit(TableIdentifier tid) { | |
| private void onL1CacheHit(TableIdentifier tid) { |
| * | ||
| * @param tid the table identifier | ||
| */ | ||
| protected void onL1CacheMiss(TableIdentifier tid) { |
There was a problem hiding this comment.
| protected void onL1CacheMiss(TableIdentifier tid) { | |
| private void onL1CacheMiss(TableIdentifier tid) { |
| // If the table is no longer in L1 cache, we need to check the location. | ||
| final String location = metadataLocator.getLocation(canonicalized); | ||
| if (location == null) { | ||
| LOG.debug("Table {} has no location, returning cached table without location", canonicalized); |
There was a problem hiding this comment.
I think this case needs to throw NoSuchTableException because it does mean the table metadata is deleted, which means manifest lists are highly likely to be deleted or already invalid
| LOG.debug("Table {} is in L1 cache, returning cached table", canonicalized); | ||
| onL1CacheHit(canonicalized); | ||
| onCacheHit(canonicalized); | ||
| return cachedTable; |
There was a problem hiding this comment.
Can we make L1 stuff out of scope from this PR? As CachingCatalog is designed for client-side caching, there is no additional authorization against cached objects. It means if Alice successfully loads abc table into the cache and then Bob fetches it, the security boundary can be breached. It never happens on the client-side since it is always used by Alice if it is initialized by Alice. In the case of server-side (HMS), it is not always true. I'm not 100% confident that the current implementation is safe enough.
If we could split a PR, I could merge the majority of this PR quickly and then closely review the security model for the advanced feature in the second PR.
| final TableIdentifier baseTableIdentifier; | ||
| if (!catalog.isValidIdentifier(identifier)) { | ||
| if (!isValidMetadataIdentifier(identifier)) { | ||
| return null; |
There was a problem hiding this comment.
If we follow the semantics of loadTable, NoSuchTableException
| clients.run(client -> client.getTables(catName, database, Collections.singletonList(tableName), PARAM_SPEC)); | ||
| if (tables != null && !tables.isEmpty()) { | ||
| Table table = tables.getFirst(); | ||
| if (table != null) { |
There was a problem hiding this comment.
I guess this is not a view
| if (table != null) { | |
| if (table != null) { | |
| HiveOperationsBase.validateIcebergViewNotLoadedAsIcebergTable(table, fullName); |
| return table.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP); | ||
| } | ||
| } | ||
| return null; |
There was a problem hiding this comment.
Probably, NoSuchTableException
| * | ||
| * @param tid the table identifier to invalidate | ||
| */ | ||
| protected void onCacheInvalidate(TableIdentifier tid) { |
There was a problem hiding this comment.
Or we may not need a method
|



This checks the table location from HMS DB to ensure no stale table object is returned;