PHOENIX-6883 : Phoenix Metadata Caching Redesign #1883
base: master
Conversation
            boolean isRetry) throws Throwable {
        RegionServerEndpointProtos.InvalidateServerMetadataCacheRequest protoRequest =
                getRequest(invalidateCacheRequests);
        // TODO Do I need my own executor or can I re-use QueryServices#Executor
This came up in discussion today and the recommendation is to create a separate thread pool for executing cache invalidations. I can create a subtask for the same and add the changes to this PR.
@shahrs87 @tkhurana @kadirozde
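A dedicated pool for invalidation fan-out might look like the sketch below. The class and method names are hypothetical (not the actual Phoenix code); the point is to fan out one invalidation task per region server on a small, named, daemon-threaded pool rather than borrowing the shared query executor:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MetadataCacheInvalidator {
    // Bounded pool reserved for cache-invalidation RPCs so they cannot
    // starve (or be starved by) the shared query executor.
    private static final ExecutorService INVALIDATION_POOL =
            Executors.newFixedThreadPool(4, new java.util.concurrent.ThreadFactory() {
                private final AtomicInteger id = new AtomicInteger();
                @Override public Thread newThread(Runnable r) {
                    Thread t = new Thread(r,
                            "metadata-cache-invalidation-" + id.incrementAndGet());
                    t.setDaemon(true); // don't keep the JVM alive for this pool
                    return t;
                }
            });

    /** Fan out one invalidation task per server; false if any fails or times out. */
    public static boolean invalidateAll(List<Callable<Boolean>> perServerTasks,
                                        long timeoutMs) {
        try {
            List<Future<Boolean>> results =
                    INVALIDATION_POOL.invokeAll(perServerTasks, timeoutMs,
                            TimeUnit.MILLISECONDS);
            for (Future<Boolean> f : results) {
                if (!f.get()) {
                    return false; // one server reported failure
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (CancellationException | ExecutionException e) {
            return false; // a task timed out or threw
        }
    }
}
```

`invokeAll` with a timeout gives a natural "all servers must acknowledge within the deadline" semantic, which matters because a missed invalidation would leave a region server serving stale metadata.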
Assuming that these new methods are only called at the server side, I think we should move this code to a separate or existing server side class.
Since the cache invalidation code is in CQSI and we would want to avoid creating a new thread pool for every DDL operation, how do you suggest we save the executor? We could keep an executor in CQSI, but we would not want a purely client-side CQSI to create an executor unnecessarily - or is that okay?
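One way to get both properties (a single shared pool, and no allocation for client-side instances) is lazy initialization with double-checked locking. The sketch below is hypothetical, not the actual CQSI code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LazyInvalidationExecutor {
    // Lazily created on first DDL invalidation; a purely client-side
    // instance that never invalidates caches never allocates the pool.
    private volatile ExecutorService invalidationExecutor;

    public ExecutorService getInvalidationExecutor() {
        ExecutorService e = invalidationExecutor;
        if (e == null) {
            synchronized (this) {
                e = invalidationExecutor;
                if (e == null) {
                    e = Executors.newFixedThreadPool(4, r -> {
                        Thread t = new Thread(r, "cqsi-cache-invalidation");
                        t.setDaemon(true); // don't block JVM shutdown
                        return t;
                    });
                    invalidationExecutor = e;
                }
            }
        }
        return e;
    }
}
```

The `volatile` field plus the second null check inside the lock makes this safe under concurrent DDLs while keeping the common path (pool already created) lock-free.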
@kadirozde This code did start out in a separate class but I remember @shahrs87 and @jpisaac had some reasons to move it to CQSI. I could not find which PR that was. @shahrs87 Can you recall those details?
This was the PR - #1748
@kadirozde Please see comments here #1748 (comment)
phoenix-core-client/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java
@@ -636,7 +636,7 @@ public MutationPlan compile(UpsertStatement upsert) throws SQLException {
     // as max TS, so that the query can safely restarted and still work of a snapshot
     // (so it won't see its own data in case of concurrent splits)
     // see PHOENIX-4849
-    long serverTime = selectResolver.getTables().get(0).getCurrentTime();
+    long serverTime = selectResolver.getTables().get(0).getTimeStamp();
What is the difference between getCurrentTime() and getTimeStamp()?
@tkhurana Currently both of those methods return the same value, i.e. upperBoundTimeStamp, which is either -1 or whatever is provided to the TableRef constructor. Current code:

    // if UPDATE_CACHE_FREQUENCY is set, always let the server set timestamps
    this.upperBoundTimeStamp = table.getUpdateCacheFrequency() != 0
            ? QueryConstants.UNSET_TIMESTAMP : upperBoundTimeStamp;
    this.currentTime = this.upperBoundTimeStamp;

    public long getTimeStamp() {
        return this.upperBoundTimeStamp;
    }

    public long getCurrentTime() {
        return this.currentTime;
    }

Once we set UPDATE_CACHE_FREQUENCY to NEVER, currentTime was being set to -1, and features that use currentTime were breaking (like QueryOptimizer deciding whether a disabled index is under its usability threshold, or asyncCreatedDate during CreateIndex). The following change in TableRef keeps currentTime at whatever is provided to the TableRef constructor, so that features using currentTime can continue to use it. There is effectively no change in UpsertCompiler - it will still be using the same value, i.e. TableRef.upperBoundTimeStamp.

    this.currentTime = upperBoundTimeStamp;
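A minimal model of the change (simplified from the real TableRef, with QueryConstants.UNSET_TIMESTAMP inlined as -1) shows why getCurrentTime() now survives a nonzero UPDATE_CACHE_FREQUENCY while getTimeStamp() still reports the unset upper bound:

```java
public class TableRefModel {
    static final long UNSET_TIMESTAMP = -1L; // stands in for QueryConstants.UNSET_TIMESTAMP
    private final long upperBoundTimeStamp;
    private final long currentTime;

    public TableRefModel(long upperBound, long updateCacheFrequency) {
        // if UPDATE_CACHE_FREQUENCY is set, let the server set timestamps
        this.upperBoundTimeStamp =
                updateCacheFrequency != 0 ? UNSET_TIMESTAMP : upperBound;
        // after the fix: keep the caller-supplied value instead of copying
        // this.upperBoundTimeStamp, so currentTime-based features still work
        this.currentTime = upperBound;
    }

    public long getTimeStamp()   { return upperBoundTimeStamp; }
    public long getCurrentTime() { return currentTime; }
}
```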
phoenix-core-client/src/main/java/org/apache/phoenix/cache/ServerMetadataCacheImpl.java
phoenix-core-client/src/main/java/org/apache/phoenix/compile/FromCompiler.java
...ix-core-server/src/main/java/org/apache/phoenix/coprocessor/PhoenixRegionServerEndpoint.java
    public static final long DEFAULT_UPDATE_CACHE_FREQUENCY =
            (long) ConnectionProperty.UPDATE_CACHE_FREQUENCY.getValue("ALWAYS");
    public static final boolean DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED = false;
    public static final boolean DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED = false;
Where is this being used?
These are the client-side/server-side flags we would use to enable the feature. DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED is used in helper methods in ValidateLastDDLTimestampUtil, which the client uses to decide whether to validate timestamps, and DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED is used in CQSI.invalidateServerMetadataCache to decide whether to invalidate the cache on the server side.
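A sketch of how such flags are typically read, defaulting to "off" so the feature is opt-in. The property key strings below are illustrative placeholders, not the actual Phoenix configuration key names; only the two DEFAULT_* constants come from this PR:

```java
import java.util.Properties;

public class MetadataCachingFlags {
    // Illustrative keys only - NOT the real Phoenix property names.
    public static final String VALIDATION_KEY =
            "example.last.ddl.timestamp.validation.enabled";
    public static final String INVALIDATE_KEY =
            "example.metadata.invalidate.cache.enabled";
    // Defaults from the PR: feature disabled unless explicitly turned on.
    public static final boolean DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED = false;
    public static final boolean DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED = false;

    /** Client side: should we validate LAST_DDL_TIMESTAMP before query/upsert? */
    public static boolean isValidationEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(VALIDATION_KEY,
                String.valueOf(DEFAULT_LAST_DDL_TIMESTAMP_VALIDATION_ENABLED)));
    }

    /** Server side: should DDL invalidate region server metadata caches? */
    public static boolean isInvalidateEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(INVALIDATE_KEY,
                String.valueOf(DEFAULT_PHOENIX_METADATA_INVALIDATE_CACHE_ENABLED)));
    }
}
```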
phoenix-core-client/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java
        setLastQueryPlan(plan);

        // verify metadata for the table/view/index in the query plan
        // plan.getTableRef can be null in some cases like EXPLAIN <query>
Nice comment
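That null case can be made explicit with an Optional-returning guard. This is a hypothetical sketch (the interfaces below are simplified stand-ins for Phoenix's QueryPlan and TableRef), not the actual PhoenixStatement code:

```java
import java.util.Optional;

public class PlanValidation {
    public interface TableRef {}
    public interface QueryPlan { TableRef getTableRef(); }

    /** The table to validate, or empty for plans (e.g. EXPLAIN) with no table ref. */
    public static Optional<TableRef> tableToValidate(QueryPlan plan) {
        // plan.getTableRef() can be null for EXPLAIN <query> and similar plans,
        // so guard before attempting LAST_DDL_TIMESTAMP validation
        return Optional.ofNullable(plan.getTableRef());
    }
}
```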
        requestBuilder.addLastDDLTimestampRequests(innerBuilder);

        // add all indexes of the current table
        for (PTable idxPTable : tableRef.getTable().getIndexes()) {
When validating timestamps, we validate the timestamps of all the ancestors and the current table, as well as its indexes. When the server returns StaleMetadataCacheException, are we also re-populating the client cache for all the objects we are validating? Are we also re-populating the indexes of the current table?
Yes. Currently we re-populate everything, since we use the MetaDataClient.updateCache workflow: it resolves ancestors, goes through the process of adding inherited columns and indexes from them, and also re-populates the indexes into the client cache.
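The refresh-and-retry flow described above can be sketched as follows. Names are simplified stand-ins (the real code catches the server's stale-cache exception and goes through MetaDataClient.updateCache); the exception is modeled as unchecked here purely to keep the sketch short:

```java
public class StaleCacheRetry {
    /** Models the server's signal that the client's LAST_DDL_TIMESTAMP is stale. */
    public static class StaleMetadataCacheException extends RuntimeException {}

    public interface Rpc<T> { T call(); }
    public interface CacheRefresher { void updateCache(); }

    /**
     * Run the operation; on a stale-cache exception, refresh the client
     * cache (table + ancestors + indexes) once, then retry with fresh metadata.
     */
    public static <T> T runWithRefresh(Rpc<T> op, CacheRefresher refresher) {
        try {
            return op.call();
        } catch (StaleMetadataCacheException e) {
            refresher.updateCache(); // re-resolves ancestors, re-populates indexes
            return op.call();        // single retry; a second failure propagates
        }
    }
}
```

Allowing only one retry keeps the worst case bounded: if the cache is stale again immediately after a refresh, something else is wrong and the error should surface.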
phoenix-core-server/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
/**
 * Factory to instantiate InvalidateMetadataCacheControllers
 */
public class InvalidateMetadataCacheControllerFactory extends RpcControllerFactory {
Why didn't we consider using the ServerSideRpcControllerFactory instead of creating a new factory ?
I am not sure about this one
JIRA - https://issues.apache.org/jira/browse/PHOENIX-6883
This redesign sets UPDATE_CACHE_FREQUENCY=NEVER and uses LAST_DDL_TIMESTAMP to validate metadata before query/upsert; the client makes a getTable RPC only when it receives a stale cache exception from the server.