Remove validations for clustering metadata #16864

@hudi-bot

Description

When a clustering plan includes log files that delete every record in a partition or base file, clustering produces no write statuses and previously failed because of the following check in validateClusteringCommit
[https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java#L490C5-L490C29]
{code:java}
if (clusteringMetadata.getWriteStatuses().isEmpty()) {
  HoodieClusteringPlan clusteringPlan = ClusteringUtils.getClusteringPlan(
          table.getMetaClient(),
          ClusteringUtils.getInflightClusteringInstant(clusteringCommitTime, table.getActiveTimeline(), table.getInstantGenerator()).get())
      .map(Pair::getRight).orElseThrow(() -> new HoodieClusteringException(
          "Unable to read clustering plan for instant: " + clusteringCommitTime));
  throw new HoodieClusteringException("Clustering plan produced 0 WriteStatus for " + clusteringCommitTime
      + " #groups: " + clusteringPlan.getInputGroups().size() + " expected at least "
      + clusteringPlan.getInputGroups().stream().mapToInt(HoodieClusteringGroup::getNumOutputFileGroups).sum()
      + " write statuses");
}
{code}

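We can remove this validation: when the log files delete every record in the clustered file groups, an empty list of write statuses is a valid result, not an error.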
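
To illustrate the scenario, here is a minimal, self-contained sketch that uses no Hudi classes. The class name ClusteringValidationSketch and the methods rewriteFileGroup and validateStrict are hypothetical stand-ins, and the assumption that a fully deleted file group yields no write status is taken from the description above rather than from Hudi's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these names are Hudi APIs.
class ClusteringValidationSketch {

  // Rewrites one file group: drops keys deleted by log files and reports
  // a write status only if at least one record survives (an assumption).
  static List<String> rewriteFileGroup(String fileGroupId, Set<String> baseKeys, Set<String> deletedKeys) {
    Set<String> surviving = new HashSet<>(baseKeys);
    surviving.removeAll(deletedKeys);
    if (surviving.isEmpty()) {
      return Collections.emptyList(); // nothing to write, so no write status
    }
    return Collections.singletonList(fileGroupId + " wrote " + surviving.size() + " record(s)");
  }

  // Mirrors the strict check quoted above: empty write statuses always fail.
  static void validateStrict(List<String> writeStatuses, String instant) {
    if (writeStatuses.isEmpty()) {
      throw new IllegalStateException("Clustering plan produced 0 WriteStatus for " + instant);
    }
  }

  public static void main(String[] args) {
    Set<String> baseKeys = new HashSet<>(Arrays.asList("k1", "k2"));
    Set<String> deletedKeys = new HashSet<>(Arrays.asList("k1", "k2")); // log files delete everything
    List<String> statuses = rewriteFileGroup("fg-1", baseKeys, deletedKeys);
    System.out.println("write statuses: " + statuses.size());
    try {
      validateStrict(statuses, "001");
    } catch (IllegalStateException e) {
      System.out.println("strict validation rejected the result: " + e.getMessage());
    }
  }
}
```

In this model the strict check cannot tell "every record was deleted" apart from "the job wrote nothing by mistake", which is why it rejects a run that completed correctly.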
