Skip to content

Check conflict strat#9259

Closed
codope wants to merge 2 commits intoapache:masterfrom
codope:check-conflict-strat
Closed

Check conflict strat#9259
codope wants to merge 2 commits intoapache:masterfrom
codope:check-conflict-strat

Conversation

@codope
Copy link
Member

@codope codope commented Jul 21, 2023

Change Logs

Describe context and summary for this change. Highlight if any code was copied.

Impact

Describe any public API or user-facing feature change or any performance impact.

Risk level (write none, low medium or high below)

If medium or high, explain what verification was done to mitigate the risks.

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

HoodieClusteringConfig clusteringConfig = HoodieClusteringConfig.newBuilder().withClusteringMaxNumGroups(10)
.withClusteringTargetPartitions(0).withInlineClusteringNumCommits(1)
.withClusteringUpdatesStrategy("org.apache.hudi.client.clustering.update.strategy.SparkAllowUpdateStrategy")
.withRollbackPendingClustering(true)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change can be ignored.. i was trying to check whether explicitly setting rollback pending clustering reverts the inflight replacecommit or not. But, this config does not have any effect.


// Do a rollback on the replacecommit that is failed
clusteringWriteClient.rollback(clusteringCommitTime);
// clusteringWriteClient.rollback(clusteringCommitTime);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per my understanding, there should not be a need to rollback the clustering commit explicitly.. it should be done automatically if a conflict is detected by the new strategy. But, both replacecommit inflight and requested remain in the timeline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, rollback of these clustering commits is handled separately, so it will leave out the inflight in the timeline. We use SparkAllowUpdateStrategy so only those cases you are allowed to use IngestionPrimaryWriterBasedConflictResolutionStrategy, we are using couple of approaches to clean these inflights, one by explicitly assigning a rollback for the failed commit and another approach is by including replacecommits as part of rollbackFailedWrites that way ingestion takes care of clearing them.
I think we need to make immutable nature of clustering commits as a table property i.e. store it in hoodie.properties. That way ingestion knows whether a clustering commits can be rolled back or not and accordingly it can either use SparkRejectUpdateStrategy or SparkAllowUpdateStrategy implementations and cleanup can. be done separately.

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@github-actions github-actions bot added the size:XS PR with lines of changes in <= 10 label Feb 26, 2024
@yihua
Copy link
Contributor

yihua commented Mar 26, 2024

@codope is this PR still needed?

@yihua yihua added the area:table-service Table services label Mar 26, 2024
@yihua
Copy link
Contributor

yihua commented Oct 8, 2024

Closing this PR. @codope @suryaprasanna Feel free to reopen this if needed.

@yihua yihua closed this Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:table-service Table services size:XS PR with lines of changes in <= 10

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants