
PHOENIX-6883 : Phoenix Metadata Caching Redesign #1883

Open
wants to merge 10 commits into base: master
Conversation

palashc
Contributor

@palashc palashc commented Apr 24, 2024

JIRA - https://issues.apache.org/jira/browse/PHOENIX-6883

  1. Introduce a region server endpoint for Phoenix and server-side metadata caching on region servers.
  2. The server-side cache is kept consistent by invalidating cache entries synchronously during DDL operations.
  3. The client sets UPDATE_CACHE_FREQUENCY=never and uses LAST_DDL_TIMESTAMP to validate metadata before a query/upsert.
  4. The client refreshes its cache via the getTable RPC only when it receives a stale cache exception from the server.
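The client-side flow in points 3 and 4 can be sketched as a compare-and-retry loop. The following is an illustrative toy simulation, not Phoenix code: `MetadataCacheDemo`, its maps, and `serverValidate`/`refreshClientCache` are hypothetical stand-ins for the real client/server interaction.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of LAST_DDL_TIMESTAMP validation (hypothetical names, not Phoenix classes).
public class MetadataCacheDemo {

    static class StaleMetadataCacheException extends Exception {}

    // "Server": authoritative table -> lastDdlTimestamp map.
    static final Map<String, Long> serverDdlTimestamps = new HashMap<>();

    // "Client": locally cached timestamps (UPDATE_CACHE_FREQUENCY=never, so never aged out).
    static final Map<String, Long> clientDdlTimestamps = new HashMap<>();

    // Server-side check: reject the request if the client's timestamp is stale.
    static void serverValidate(String table, long clientTs)
            throws StaleMetadataCacheException {
        if (clientTs != serverDdlTimestamps.get(table)) {
            throw new StaleMetadataCacheException();
        }
    }

    // Stand-in for the getTable RPC that refreshes the client cache.
    static void refreshClientCache(String table) {
        clientDdlTimestamps.put(table, serverDdlTimestamps.get(table));
    }

    // Client query path: validate, and on a stale-cache exception refresh once and retry.
    static boolean runQuery(String table) {
        try {
            serverValidate(table, clientDdlTimestamps.get(table));
            return true; // first attempt succeeded, no getTable RPC needed
        } catch (StaleMetadataCacheException stale) {
            refreshClientCache(table); // single getTable RPC
            try {
                serverValidate(table, clientDdlTimestamps.get(table));
                return true;
            } catch (StaleMetadataCacheException retryFailed) {
                return false;
            }
        }
    }
}
```

The key property this models is that the expensive getTable RPC happens only on the stale path; steady-state queries validate with a cheap timestamp comparison.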

@virajjasani virajjasani self-requested a review April 26, 2024 07:29
boolean isRetry) throws Throwable {
RegionServerEndpointProtos.InvalidateServerMetadataCacheRequest protoRequest =
getRequest(invalidateCacheRequests);
// TODO Do I need my own executor or can I re-use QueryServices#Executor
Contributor Author

This came up in discussion today and the recommendation is to create a separate thread pool for executing cache invalidations. I can create a subtask for the same and add the changes to this PR.
@shahrs87 @tkhurana @kadirozde
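A dedicated pool as recommended above could look like the sketch below. This is a hypothetical illustration, assuming the pool is created once (e.g. at CQSI initialization) and shared across DDL operations; the class and thread names are invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a dedicated, shared pool for cache invalidation RPCs,
// created once rather than per DDL operation.
public class InvalidateCachePool {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    public static ExecutorService newPool(int threads) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable,
                    "phoenix-invalidate-cache-" + COUNTER.getAndIncrement());
            t.setDaemon(true); // do not block JVM shutdown
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }
}
```

Naming the threads makes invalidation work visible in thread dumps, and a bounded pool keeps a burst of DDLs from spawning unbounded threads.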

Contributor

Assuming that these new methods are only called on the server side, I think we should move this code to a separate (or existing) server-side class.

Contributor Author

Since the cache invalidation code is in CQSI and we want to avoid creating a new thread pool for every DDL operation, how do you suggest we store the executor? We could keep an executor in CQSI, but we would not want a client-side CQSI to create an executor unnecessarily; or is that okay?

Contributor Author

@kadirozde This code did start out in a separate class, but I remember @shahrs87 and @jpisaac had some reasons to move it to CQSI. I could not find which PR that was. @shahrs87, can you recall the details?

Contributor Author

This was the PR - #1748
@kadirozde Please see comments here #1748 (comment)

@@ -636,7 +636,7 @@ public MutationPlan compile(UpsertStatement upsert) throws SQLException {
// as max TS, so that the query can safely restarted and still work of a snapshot
// (so it won't see its own data in case of concurrent splits)
// see PHOENIX-4849
-            long serverTime = selectResolver.getTables().get(0).getCurrentTime();
+            long serverTime = selectResolver.getTables().get(0).getTimeStamp();
Contributor

What is the difference between getCurrentTime() and getTimeStamp()?

Contributor Author
@palashc palashc Jun 13, 2024

@tkhurana Currently both of those methods return the same value, i.e. upperBoundTimeStamp, which is either -1 or whatever is provided to the TableRef constructor. Current code:

// if UPDATE_CACHE_FREQUENCY is set, always let the server set timestamps
this.upperBoundTimeStamp = table.getUpdateCacheFrequency() != 0
        ? QueryConstants.UNSET_TIMESTAMP : upperBoundTimeStamp;
this.currentTime = this.upperBoundTimeStamp;

public long getTimeStamp() {
    return this.upperBoundTimeStamp;
}

public long getCurrentTime() {
    return this.currentTime;
}

Once we set UPDATE_CACHE_FREQUENCY to NEVER, currentTime was being set to -1, and features that rely on currentTime were breaking (for example, QueryOptimizer deciding whether a disabled index is under its usability threshold, or asyncCreatedDate during CreateIndex). The following change in TableRef keeps currentTime at whatever is provided to the TableRef constructor, so features using currentTime continue to work. There is effectively no change in UpsertCompiler; it still uses the same value, i.e. TableRef.upperBoundTimeStamp.

this.currentTime = upperBoundTimeStamp;
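The effect of that one-line change can be illustrated with a simplified, hypothetical TableRef; the class, field visibility, and constant below are stand-ins for the real Phoenix code.

```java
// Toy TableRef illustrating the fix discussed above (simplified, hypothetical).
public class TableRefDemo {
    static final long UNSET_TIMESTAMP = -1L; // stands in for QueryConstants.UNSET_TIMESTAMP

    final long upperBoundTimeStamp;
    final long currentTime;

    TableRefDemo(long updateCacheFrequency, long upperBoundTimeStamp) {
        // if UPDATE_CACHE_FREQUENCY is set, always let the server set timestamps
        this.upperBoundTimeStamp = updateCacheFrequency != 0
                ? UNSET_TIMESTAMP : upperBoundTimeStamp;
        // after the fix: keep the constructor-supplied value,
        // independent of the assignment above
        this.currentTime = upperBoundTimeStamp;
    }

    long getTimeStamp()   { return upperBoundTimeStamp; }
    long getCurrentTime() { return currentTime; }
}
```

With UPDATE_CACHE_FREQUENCY set to a nonzero value (i.e. NEVER), getTimeStamp() still reports UNSET_TIMESTAMP for the server, while getCurrentTime() retains the caller-supplied value that features like QueryOptimizer depend on.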

public static final long DEFAULT_UPDATE_CACHE_FREQUENCY
= (long) ConnectionProperty.UPDATE_CACHE_FREQUENCY.getValue("ALWAYS");
public static final boolean DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED = false;
public static final boolean DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED = false;
Contributor

Where is this being used?

Contributor Author
@palashc palashc Jun 13, 2024

These are the client-side/server-side flags we would use to enable the feature.

DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED is used in helper methods in ValidateLastDDLTimestampUtil, which the client uses to decide whether to validate timestamps, and DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED is used in CQSI.invalidateServerMetadataCache to decide whether to invalidate the cache on the server side.
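Such defaults are typically resolved by letting an explicit connection property win over the compiled-in constant. A minimal sketch of that pattern follows; the property key shown is illustrative only, not the actual Phoenix configuration name.

```java
import java.util.Properties;

// Hypothetical sketch of boolean feature-flag resolution: an explicit
// property wins, otherwise the compiled-in default applies.
public class FeatureFlags {
    public static final boolean DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED = false;

    public static boolean isValidationEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(
                // illustrative key name, not the real Phoenix property
                "phoenix.last.ddl.timestamp.validation.enabled",
                Boolean.toString(DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED)));
    }
}
```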

setLastQueryPlan(plan);

//verify metadata for the table/view/index in the query plan
//plan.getTableRef can be null in some cases like EXPLAIN <query>
Contributor

Nice comment

requestBuilder.addLastDDLTimestampRequests(innerBuilder);

// add all indexes of the current table
for (PTable idxPTable : tableRef.getTable().getIndexes()) {
Contributor

When validating timestamps, we validate the timestamps of all the ancestors and the current table, as well as its indexes. When the server returns StaleMetadataCacheException, are we also re-populating the client cache for all the objects that we are validating? Are we also re-populating the indexes of the current table?

Contributor Author

Yes. Currently we re-populate everything, since we use the MetaDataClient.updateCache workflow: it resolves ancestors, goes through the process of adding inherited columns and indexes from them, and also re-populates the indexes of the current table into the client cache.

/**
* Factory to instantiate InvalidateMetadataCacheControllers
*/
public class InvalidateMetadataCacheControllerFactory extends RpcControllerFactory {
Contributor

Why didn't we consider using the ServerSideRpcControllerFactory instead of creating a new factory?

Contributor Author

I am not sure about this one.
