[HUDI-9140] Fix log block io type and other rollback strategy fixes for table version 6#12992
[HUDI-9140] Fix log block io type and other rollback strategy fixes for table version 6#12992lokeshj1703 wants to merge 4 commits intoapache:masterfrom
Conversation
nsivabalan
left a comment
There was a problem hiding this comment.
I am also in a dilemma whether to abstract this out propertly. i.e adding version based classes for ListingBasedRollbackStrategy.
Will jam w/ @yihua .
| // (B.3) Rollback triggered for first commit - Same as (B.1) | ||
| // (B.4) Rollback triggered for recurring commits - Same as (B.2) plus we need to delete the log files | ||
| // as well if the base file gets deleted. | ||
| HoodieCommitMetadata commitMetadata = TimelineUtils.getCommitMetadata(instantToRollback, table.getMetaClient().getCommitsTimeline()); |
There was a problem hiding this comment.
why do we need this commit metadata. already in L 112, we do ahve commitMetadataOptional right and in fact we are using the commitMetadataOptional in L 209
| if (metaClient.getTableConfig().getTableVersion().lesserThan(HoodieTableVersion.EIGHT)) { | ||
| hoodieRollbackRequests.addAll(getHoodieRollbackRequests(partitionPath, filesToDelete.get())); | ||
| } else { | ||
| hoodieRollbackRequests.addAll(getHoodieRollbackRequests(partitionPath, |
There was a problem hiding this comment.
can we add java docs whats differerence b/w version 6 and 8
| if (tableVersion < HoodieTableVersion.EIGHT.versionCode()) { | ||
| // Ensure rollback metadata contains rollback log files for table version 6 | ||
| assertTrue(rollbackMetadata.getPartitionMetadata().values().stream() | ||
| .noneMatch(rollbackPartitionMetadata -> rollbackPartitionMetadata.getRollbackLogFiles().isEmpty())); |
There was a problem hiding this comment.
rollbackLogFiles should only refer to the log files added due to rollback command block.
original log files added with the commit being rolledback should be part of logBlocksToBeDeleted
There was a problem hiding this comment.
or incase of completed rollback, it should be part of logFilesFromFailedCommit
There was a problem hiding this comment.
logBlocksToBeDeleted will be the case for rollback plan
There was a problem hiding this comment.
synced up f2f. we are good here.
| assertTrue(rollbackMetadata.getPartitionMetadata().values().stream() | ||
| .noneMatch(rollbackPartitionMetadata -> rollbackPartitionMetadata.getRollbackLogFiles().isEmpty())); | ||
| // Total 3 partitions would contain 6 log files - one rollback log file and one log file with data block | ||
| assertEquals(6, logFiles.size()); |
There was a problem hiding this comment.
can we also validate logFilesFromFailedCommit from completed rollback commit metadata. this should refer to actual data file added by the commit being rolledback
|
Being covered as part of #13007 |
Change Logs
The PR fixes the iotype as APPEND for log blocks in table version 6. It also reverts some changes made to MarkerBasedRollbackStrategy for table version 6 in HUDI-9030.
Impact
NA
Risk level (write none, low medium or high below)
low
Documentation Update
NA
Contributor's checklist