HIVE-29035 : new version of cache #6441

Open
henrib wants to merge 14 commits into apache:master from henrib:HIVE-29035

Conversation

henrib (Contributor) commented Apr 17, 2026

This checks the table location in the HMS DB to ensure that no stale table object is returned.

…e DB to ensure no stale table object is returned;
Copilot AI review requested due to automatic review settings April 17, 2026 18:13

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

- improved check;
- quiesce console logs due to internal throws in servlet;
@henrib henrib requested a review from Copilot April 28, 2026 16:34
henrib (Contributor, Author) commented Apr 28, 2026

@ayushtkn if you have some spare time to review, thanks

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

ayushtkn (Member) commented May 7, 2026

String tableName = baseTableIdentifier.name();
try {
List<Table> tables = clients.run(
client -> client.getTables(catName, database, Collections.singletonList(tableName), PARAM_SPEC)

TODO: It is possible to use IMetaStoreClient#getTable, but #getTables with PARAM_SPEC could be more lightweight. Which is better?

LOGGER.debug("Table {} not found: {}", baseTableIdentifier, e.getMessage());
throw e;
} catch (NoSuchObjectException e) {
throw new NoSuchTableException("Table %s not found: %s", baseTableIdentifier, e.getMessage());

This might also never happen, since we obtain a list of tables. Also, this method can currently either return null or throw NoSuchTableException. We should settle on one of the two.
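A minimal sketch of the "pick one contract" idea, assuming the lookup always throws for a missing table. `LocationLookup` and `TableNotFoundException` are hypothetical names used only for illustration (the exception stands in for Iceberg's NoSuchTableException), and a map stands in for HMS:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: not the PR's MetadataLocator. A map stands in for HMS.
final class LocationLookup {
  // Stand-in for Iceberg's NoSuchTableException so the sketch is self-contained.
  static final class TableNotFoundException extends RuntimeException {
    TableNotFoundException(String message) {
      super(message);
    }
  }

  private final Map<String, String> locations = new ConcurrentHashMap<>();

  void register(String tableName, String metadataLocation) {
    locations.put(tableName, metadataLocation);
  }

  // Single contract: a missing table always raises, so callers never null-check.
  String getLocation(String tableName) {
    String location = locations.get(tableName);
    if (location == null) {
      throw new TableNotFoundException("Table " + tableName + " not found");
    }
    return location;
  }
}
```

With a single contract, callers either get a non-null location or handle one exception type; they do not need both a null check and a catch block.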

// A RESTException is thrown by HMSCatalogAdapter.execute() after the error handler has
// already written the correct HTTP status and body to the response (e.g. 404, 403).
// It is not an unexpected server failure, so log at DEBUG to avoid flooding the console.
LOG.debug("REST request resulted in a client error (already handled): {}", e.getMessage());

We may not need this one now

okumin (Contributor) commented May 10, 2026

…cache performance metrics;

- remove end point to access cache performance metrics;
- enhanced tests to check L1 cache;
- simplified MetadataLocator exception handling;
henrib (Contributor, Author) commented May 12, 2026

Please also address issues reported by SonarQube. https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=6441&issueStatuses=OPEN,CONFIRMED&sinceLeakPeriod=true

Can't remove all of them, in particular the usage of clientPool().

// so the first post-reset load is a genuine cold miss rather than an L1/L2 hit.
// NOTE: catalog.invalidateTable() only clears the REST *client* state and does not
// reach the server-side cache.
serverCachingCatalog.invalidateTable(tableId);

I feel we should write this not as an integration test but as a unit test. If we do so, we can remove an escape hatch, i.e., getLatestCache

MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
String sanitized = catalogName == null || catalogName.isEmpty()
? "default"
: catalogName.replaceAll("[^a-zA-Z0-9.\\\\-]", "_");

Should we use CATALOG_DEFAULT in MetastoreConf?
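A rough sketch of how the fallback name could be passed in rather than hard-coded, so the caller can supply whatever default the configuration defines. `CatalogMBeanNames`, the `org.example.cache` JMX domain, and the `defaultName` parameter are assumptions for illustration and are not taken from the PR:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper; the domain and key names are illustrative only.
final class CatalogMBeanNames {
  private CatalogMBeanNames() {
  }

  // Falls back to the supplied default and replaces characters outside [a-zA-Z0-9.-].
  static String sanitize(String catalogName, String defaultName) {
    return catalogName == null || catalogName.isEmpty()
        ? defaultName
        : catalogName.replaceAll("[^a-zA-Z0-9.\\-]", "_");
  }

  // Builds the JMX name; a malformed name is reported as an unchecked exception.
  static ObjectName objectName(String catalogName, String defaultName) {
    String name = sanitize(catalogName, defaultName);
    try {
      return new ObjectName("org.example.cache:type=CachingCatalog,name=" + name);
    } catch (MalformedObjectNameException e) {
      throw new IllegalArgumentException("Invalid JMX name for catalog " + name, e);
    }
  }
}
```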

this.jmxObjectName = name;
LOG.info("Registered JMX MBean: {}", name);
} catch (JMException e) {
LOG.warn("Failed to register JMX MBean for HMSCachingCatalog", e);

This probably deserves to be logged at ERROR level.

Suggested change
- LOG.warn("Failed to register JMX MBean for HMSCachingCatalog", e);
+ LOG.error("Failed to register JMX MBean for HMSCachingCatalog", e);

private final int l1CacheSize;

// Metrics counters.
private final AtomicLong cacheHitCount = new AtomicLong(0);

LongAdder could be an option
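As a sketch of what that could look like: `java.util.concurrent.atomic.LongAdder` tends to scale better than `AtomicLong` when many threads increment the same counter, at the cost of `sum()` not being an atomic snapshot. The class and method names below are hypothetical, not the PR's fields:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical metrics holder; names are illustrative, not the PR's fields.
final class CacheCounters {
  private final LongAdder hits = new LongAdder();
  private final LongAdder misses = new LongAdder();

  void recordHit() {
    hits.increment();
  }

  void recordMiss() {
    misses.increment();
  }

  // sum() is not an atomic snapshot under concurrent updates; usually fine for metrics.
  long hitCount() {
    return hits.sum();
  }

  long missCount() {
    return misses.sum();
  }

  // Returns 0.0 when nothing has been recorded yet, to avoid dividing by zero.
  double hitRatio() {
    long h = hits.sum();
    long total = h + misses.sum();
    return total == 0 ? 0.0 : (double) h / total;
  }
}
```

If a counter is also used for compare-and-set style logic rather than pure counting, `AtomicLong` remains the better fit.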

this(catalog, expirationMs, /*caseSensitive*/ true);
}

public HMSCachingCatalog(HiveCatalog catalog, long expirationMs, boolean caseSensitive) {

Will we use this constructor in the near future? If not, we could always use the default case sensitivity.

*
* @param tid the table identifier
*/
protected void onCacheMetaLoad(TableIdentifier tid) {

Suggested change
- protected void onCacheMetaLoad(TableIdentifier tid) {
+ private void onCacheMetaLoad(TableIdentifier tid) {

*
* @param tid the table identifier
*/
protected void onL1CacheHit(TableIdentifier tid) {

Suggested change
- protected void onL1CacheHit(TableIdentifier tid) {
+ private void onL1CacheHit(TableIdentifier tid) {

*
* @param tid the table identifier
*/
protected void onL1CacheMiss(TableIdentifier tid) {

Suggested change
- protected void onL1CacheMiss(TableIdentifier tid) {
+ private void onL1CacheMiss(TableIdentifier tid) {

// If the table is no longer in L1 cache, we need to check the location.
final String location = metadataLocator.getLocation(canonicalized);
if (location == null) {
LOG.debug("Table {} has no location, returning cached table without location", canonicalized);

I think this case needs to throw NoSuchTableException: a missing location means the table metadata has been deleted, so the manifest lists are very likely deleted or already invalid.
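A minimal sketch of that behavior, assuming a cached entry is only served while its metadata location still matches the one the metastore reports, and that a missing location evicts the entry and throws. `LocationCheckedCache`, `Entry`, and `TableGoneException` are hypothetical names (the exception stands in for NoSuchTableException), and the locator is any function that returns null for a table without a location:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a location-validated cache; not the PR's HMSCachingCatalog.
final class LocationCheckedCache {
  // Stand-in for Iceberg's NoSuchTableException.
  static final class TableGoneException extends RuntimeException {
    TableGoneException(String message) {
      super(message);
    }
  }

  // A cached table, reduced to the metadata location it was loaded from.
  static final class Entry {
    final String metadataLocation;

    Entry(String metadataLocation) {
      this.metadataLocation = metadataLocation;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();
  private final Function<String, String> locator; // returns null when there is no location
  int loads = 0; // counts full loads; demo only, not thread-safe

  LocationCheckedCache(Function<String, String> locator) {
    this.locator = locator;
  }

  Entry load(String table) {
    String current = locator.apply(table);
    if (current == null) {
      // No metadata location: drop any cached copy instead of serving it.
      cache.remove(table);
      throw new TableGoneException("Table " + table + " not found");
    }
    Entry cached = cache.get(table);
    if (cached != null && cached.metadataLocation.equals(current)) {
      return cached; // location unchanged, cached entry is still fresh
    }
    // Missing or stale entry: reload and replace.
    loads++;
    Entry fresh = new Entry(current);
    cache.put(table, fresh);
    return fresh;
  }
}
```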

LOG.debug("Table {} is in L1 cache, returning cached table", canonicalized);
onL1CacheHit(canonicalized);
onCacheHit(canonicalized);
return cachedTable;
okumin (Contributor) commented May 12, 2026

Can we move the L1 work out of scope for this PR? Since CachingCatalog is designed for client-side caching, there is no additional authorization check on cached objects. This means that if Alice successfully loads the abc table into the cache and Bob then fetches it, the security boundary can be breached. That never happens on the client side, because a cache initialized by Alice is only ever used by Alice. On the server side (HMS), that is not always true. I'm not 100% confident that the current implementation is safe enough.
If we could split the PR, I could merge the majority of it quickly and then closely review the security model for the advanced feature in a second PR.
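To make the concern concrete, here is a small sketch of a shared cache with and without a per-request permission check. Everything in it is hypothetical (`AuthorizingTableCache`, the `canRead` predicate, tables modeled as strings); it does not describe how the PR or HMS authorization actually works:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch; the PR's cache and HMS authorization differ in detail.
final class AuthorizingTableCache {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final BiPredicate<String, String> canRead; // (user, table) -> allowed?
  private final Function<String, String> loader;     // table -> loaded table (a String here)

  AuthorizingTableCache(BiPredicate<String, String> canRead, Function<String, String> loader) {
    this.canRead = canRead;
    this.loader = loader;
  }

  // Unsafe on a shared server: a cache hit skips the permission check done at load time.
  String loadUnchecked(String user, String table) {
    String cached = cache.get(table);
    if (cached != null) {
      return cached;
    }
    return loadChecked(user, table);
  }

  // Authorizes on every call, so a cache hit cannot widen access.
  String loadChecked(String user, String table) {
    if (!canRead.test(user, table)) {
      throw new SecurityException(user + " may not read " + table);
    }
    return cache.computeIfAbsent(table, loader);
  }
}
```

In this sketch, once Alice has populated the entry through `loadChecked`, Bob can read it through `loadUnchecked` even though `canRead` would reject him, which is the breach described above.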

final TableIdentifier baseTableIdentifier;
if (!catalog.isValidIdentifier(identifier)) {
if (!isValidMetadataIdentifier(identifier)) {
return null;

If we follow the semantics of loadTable, this should throw NoSuchTableException.

clients.run(client -> client.getTables(catName, database, Collections.singletonList(tableName), PARAM_SPEC));
if (tables != null && !tables.isEmpty()) {
Table table = tables.getFirst();
if (table != null) {

I guess we should verify that this is not a view.

Suggested change
- if (table != null) {
+ if (table != null) {
+ HiveOperationsBase.validateIcebergViewNotLoadedAsIcebergTable(table, fullName);

return table.getParameters().get(BaseMetastoreTableOperations.METADATA_LOCATION_PROP);
}
}
return null;

This should probably throw NoSuchTableException.

*
* @param tid the table identifier to invalidate
*/
protected void onCacheInvalidate(TableIdentifier tid) {

Or we may not need this method at all.

5 participants