branch-3.1: [fix](cloud) modify some partition getVisibleVersion to getCachedVisibleVersion (#60072) #61038

Closed
mymeiyi wants to merge 1 commit into apache:branch-3.1 from mymeiyi:branch-3.1-pick-60072

mymeiyi commented Mar 4, 2026

pick #60072

mymeiyi requested a review from morrySnow as a code owner March 4, 2026 09:15
Copilot AI review requested due to automatic review settings March 4, 2026 09:15

Thearas commented Mar 4, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Pull request overview

Backports #60072 to branch-3.1 to reduce cloud-mode reliance on live partition visible-version lookups by switching several call sites from Partition#getVisibleVersion() to Partition#getCachedVisibleVersion().

Changes:

  • Replace getVisibleVersion() with getCachedVisibleVersion() in multiple FE code paths (diagnosis, metadata viewing, row count / sampling, proximate row count).
  • Simplify CloudReplica#checkVersionCatchUp() to an unconditional true implementation (and remove now-unused Partition import).
  • Remove partition-cache update request building logic from RowBatchBuilder and a PartitionRange.PartitionSingle#debug() helper.
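
To make the distinction between the two accessors concrete, here is a minimal sketch, assuming a simplified model in which the live lookup goes to a remote meta-service and the cached lookup only reads a local field. The class `CachedVersionSketch`, the `LongSupplier` standing in for the meta-service, and the `UNKNOWN_VERSION` constant are all hypothetical and are not the Doris classes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

/** Simplified stand-in for a cloud partition; NOT the Doris Partition/CloudPartition class. */
class CachedVersionSketch {
    // Assumed sentinel meaning "no version has been fetched yet".
    static final long UNKNOWN_VERSION = -1L;

    // Hypothetical stand-in for a remote meta-service call.
    private final LongSupplier metaService;
    private long cachedVisibleVersion = UNKNOWN_VERSION;

    CachedVersionSketch(LongSupplier metaService) {
        this.metaService = metaService;
    }

    /** Live lookup: performs the remote call and refreshes the local cache. */
    long getVisibleVersion() {
        cachedVisibleVersion = metaService.getAsLong();
        return cachedVisibleVersion;
    }

    /** Cached lookup: no remote call; may still be UNKNOWN_VERSION. */
    long getCachedVisibleVersion() {
        return cachedVisibleVersion;
    }

    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachedVersionSketch partition = new CachedVersionSketch(() -> {
            remoteCalls.incrementAndGet();
            return 7L;
        });
        System.out.println("cached before fetch: " + partition.getCachedVisibleVersion());
        System.out.println("live: " + partition.getVisibleVersion());
        System.out.println("cached after fetch: " + partition.getCachedVisibleVersion()
                + ", remote calls: " + remoteCalls.get());
    }
}
```

In this model, call sites that only need an approximate version (diagnostics, row-count estimates) avoid the remote call, at the cost of possibly seeing a stale or unknown value.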

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| fe/fe-core/src/main/java/org/apache/doris/system/Diagnoser.java | Uses cached partition version during tablet diagnosis. |
| fe/fe-core/src/main/java/org/apache/doris/statistics/OlapAnalysisTask.java | Uses cached partition version when estimating tablet rows for sampling. |
| fe/fe-core/src/main/java/org/apache/doris/qe/cache/RowBatchBuilder.java | Removes partition-cache update request builder method. |
| fe/fe-core/src/main/java/org/apache/doris/qe/cache/PartitionRange.java | Removes PartitionSingle.debug() logging helper. |
| fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/RowCountAction.java | Uses cached partition version when computing row counts. |
| fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java | Makes checkVersionCatchUp() always return true; removes Partition import and redundant logic. |
| fe/fe-core/src/main/java/org/apache/doris/catalog/OlapTable.java | Uses cached partition version in proximateRowCount(). |
| fe/fe-core/src/main/java/org/apache/doris/catalog/MetadataViewer.java | Uses cached partition version when building replica status output. |
Comments suppressed due to low confidence (1)

fe/fe-core/src/main/java/org/apache/doris/system/Diagnoser.java:156

  • In cloud mode, CloudPartition.getCachedVisibleVersion() may be -1 (unknown) until a meta-service fetch occurs. With this change, visibleVersion can be -1, and the subsequent replica.getVersion() != visibleVersion check will always report a version mismatch for every replica, making ReplicaVersionStatus misleading/noisy in cloud. Consider skipping the replica-vs-partition version equality check when Config.isCloudMode() (or when visibleVersion < Partition.PARTITION_INIT_VERSION), or explicitly fetching partition.getVisibleVersion() only for this diagnostic path if accurate partition version is required.
        // for local mode, getCachedVisibleVersion return visibleVersion.
        // for cloud mode, the replica version is not updated.
        long visibleVersion = partition.getCachedVisibleVersion();
        for (Replica replica : replicas) {
            // backend
            do {
                Backend be = infoService.getBackend(replica.getBackendIdWithoutException());
                if (be == null) {
                    backendErr.append("Backend "
                            + replica.getBackendIdWithoutException() + " does not exist. ");
                    break;
                }
                if (!be.isAlive()) {
                    backendErr.append("Backend " + replica.getBackendIdWithoutException() + " is not alive. ");
                    break;
                }
                if (be.isDecommissioned()) {
                    backendErr.append("Backend " + replica.getBackendIdWithoutException() + " is decommission. ");
                    break;
                }
                if (!be.isLoadAvailable()) {
                    backendErr.append("Backend " + replica.getBackendIdWithoutException() + " is not load available. ");
                    break;
                }
                if (!be.isQueryAvailable()) {
                    backendErr.append("Backend "
                            + replica.getBackendIdWithoutException() + " is not query available. ");
                    break;
                }
                if (be.diskExceedLimit()) {
                    backendErr.append("Backend " + replica.getBackendIdWithoutException() + " has no space left. ");
                    break;
                }
            } while (false);
            // version
            if (replica.getVersion() != visibleVersion) {
                versionErr.append("Replica on backend " + replica.getBackendIdWithoutException() + "'s version ("
                        + replica.getVersion() + ") does not equal"
                        + " to partition visible version (" + visibleVersion + ")");
            } else if (replica.getLastFailedVersion() != -1) {
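
One way to act on the suggestion above is to skip the equality check whenever the cached partition version is still unknown. The following is a minimal sketch under that assumption; `ReplicaVersionCheckSketch`, `checkVersion`, and the `UNKNOWN_VERSION` sentinel are hypothetical names for illustration and are not part of Diagnoser.java.

```java
/** Illustrative guard for the replica-vs-partition version check; NOT Doris code. */
class ReplicaVersionCheckSketch {
    // Assumed sentinel: the comment above says the cached version may be -1 (unknown).
    static final long UNKNOWN_VERSION = -1L;

    /**
     * Returns an error message when the replica version differs from a known
     * partition version, or an empty string when they match or when the
     * partition version is unknown and the comparison is skipped.
     */
    static String checkVersion(long backendId, long replicaVersion, long cachedVisibleVersion) {
        if (cachedVisibleVersion == UNKNOWN_VERSION) {
            // Nothing reliable to compare against, so report no mismatch.
            return "";
        }
        if (replicaVersion != cachedVisibleVersion) {
            return "Replica on backend " + backendId + "'s version (" + replicaVersion
                    + ") does not equal to partition visible version ("
                    + cachedVisibleVersion + ")";
        }
        return "";
    }

    public static void main(String[] args) {
        // Unknown partition version: the check is skipped.
        System.out.println("[" + checkVersion(10001L, 5L, UNKNOWN_VERSION) + "]");
        // Known partition version that differs: a mismatch is reported.
        System.out.println("[" + checkVersion(10001L, 5L, 6L) + "]");
    }
}
```

The alternative named in the comment, fetching the live version only on the diagnostic path, trades this guard for one remote lookup per diagnosis.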

Comment on lines 153 to 157
            }
        }
        return key;
    }

    /**
     * Rowbatch split to Row
     */
    public InternalService.PUpdateCacheRequest buildPartitionUpdateRequest(String sql) {
        if (updateRequest == null) {
            updateRequest = InternalService.PUpdateCacheRequest.newBuilder()
                    .setSqlKey(CacheProxy.getMd5(sql))
                    .setCacheType(InternalService.CacheType.PARTITION_CACHE).build();
        }
        HashMap<Long, List<byte[]>> partRowMap = new HashMap<>();
        List<byte[]> partitionRowList;
        PartitionRange.PartitionKeyType cacheKey;
        for (byte[] row : rowList) {
            cacheKey = getKeyFromRow(row, keyIndex, keyType);
            if (!cachePartMap.containsKey(cacheKey.realValue())) {
                LOG.info("cant find partition key {}", cacheKey.realValue());
                continue;
            }
            if (!partRowMap.containsKey(cacheKey.realValue())) {
                partitionRowList = Lists.newArrayList();
                partitionRowList.add(row);
                partRowMap.put(cacheKey.realValue(), partitionRowList);
            } else {
                partRowMap.get(cacheKey.realValue()).add(row);
            }
        }

        for (HashMap.Entry<Long, List<byte[]>> entry : partRowMap.entrySet()) {
            Long key = entry.getKey();
            PartitionRange.PartitionSingle partition = cachePartMap.get(key);
            partitionRowList = entry.getValue();
            updateRequest = updateRequest.toBuilder()
                    .addValues(InternalService.PCacheValue.newBuilder()
                            .setParam(InternalService.PCacheParam.newBuilder()
                                    .setPartitionKey(key)
                                    .setLastVersion(partition.getPartition().getVisibleVersion())
                                    .setLastVersionTime(partition.getPartition().getVisibleVersionTime())
                                    .build()).setDataSize(dataSize).addAllRows(
                                    partitionRowList.stream().map(ByteString::copyFrom)
                                            .collect(Collectors.toList()))).build();
        }
        return updateRequest;
    }
}

Copilot AI Mar 4, 2026

buildPartitionUpdateRequest() was removed from RowBatchBuilder, but it is still referenced by PartitionCache.updateCache() (see PartitionCache.java calling rowBatchBuilder.buildPartitionUpdateRequest(...)). This will cause a compilation failure and also breaks partition cache update functionality. Either restore this method (updated to use getCachedVisibleVersion() as intended) or change PartitionCache to use the new/alternative update-request builder API.
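
If the method is restored, its row-grouping step could be written more compactly. The sketch below is only an illustration of that step using plain Java collections: `PartitionRowGroupingSketch`, `groupRows`, and the key-extraction function are hypothetical stand-ins, and the protobuf request building and the version lookup from the original method are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToLongFunction;

/** Illustrates grouping rows by partition key; NOT the RowBatchBuilder implementation. */
class PartitionRowGroupingSketch {
    /**
     * Groups rows by the key that keyOf extracts, skipping rows whose key is
     * not in knownKeys (the role cachePartMap plays in the removed method).
     */
    static Map<Long, List<byte[]>> groupRows(List<byte[]> rows,
                                             ToLongFunction<byte[]> keyOf,
                                             Set<Long> knownKeys) {
        Map<Long, List<byte[]>> grouped = new LinkedHashMap<>();
        for (byte[] row : rows) {
            long key = keyOf.applyAsLong(row);
            if (!knownKeys.contains(key)) {
                continue; // unknown partition key, same as the original "continue"
            }
            // computeIfAbsent replaces the containsKey/put/get branches.
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical rows whose first byte is treated as the partition key.
        List<byte[]> rows = Arrays.asList(
                new byte[] {1, 10}, new byte[] {2, 20}, new byte[] {1, 30});
        Map<Long, List<byte[]>> grouped =
                groupRows(rows, row -> row[0], Collections.singleton(1L));
        System.out.println("groups: " + grouped.size()
                + ", rows for key 1: " + grouped.get(1L).size());
    }
}
```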

mymeiyi closed this Mar 4, 2026