[HUDI-5859] Adding standalone restore tool #8044

Closed
nsivabalan wants to merge 5 commits into apache:master from nsivabalan:restoreTool

@nsivabalan nsivabalan commented Feb 25, 2023

Change Logs

For MOR tables, restoring to a very old delta commit is very time-consuming, since internally we roll back one commit at a time. This standalone tool takes a stab at improving the performance of restore. You can choose a delta commit just before a compaction commit, and the tool will directly delete the files of all file slices newer than the chosen delta commit.

This tool does not yet support restoring to the middle of a file slice.
The restore timestamp has to be the latest delta commit before a compaction commit.

For example, given this timeline (dc = delta commit, c = compaction commit):
dc1, dc2, c3, dc4, dc5, c6, dc7, dc8, c9, dc10, dc11

Valid commit times to restore to with this tool:
dc2, dc5, or dc8.

In other words, this tool can only clean up entire file slices, hence the restriction above.
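The rule for valid restore times can be illustrated with a small, self-contained sketch. This is not code from the PR: the class `RestorePointSketch`, its method names, and the "dc"/"c" label prefixes are hypothetical and only mirror the example timeline above. It picks every delta commit that is immediately followed by a compaction commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real tool works on the Hudi timeline, not on string labels.
final class RestorePointSketch {
    // Assumption: labels starting with "dc" are delta commits, labels starting with "c" are compactions.
    static boolean isDeltaCommit(String instant) {
        return instant.startsWith("dc");
    }

    static boolean isCompaction(String instant) {
        return instant.startsWith("c");
    }

    // Returns the delta commits that sit directly before a compaction commit.
    static List<String> validRestorePoints(List<String> timeline) {
        List<String> valid = new ArrayList<>();
        for (int i = 0; i + 1 < timeline.size(); i++) {
            if (isDeltaCommit(timeline.get(i)) && isCompaction(timeline.get(i + 1))) {
                valid.add(timeline.get(i));
            }
        }
        return valid;
    }
}
```

For the timeline in the example, this selects dc2, dc5, and dc8; dc11 is excluded because no compaction follows it.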

After cleaning up the data files, this tool will also delete the corresponding commit meta files from ".hoodie".

Caution:
Metadata has to be disabled.
Also, this tool takes the unconventional route of not going via rollback. It directly lists the files and deletes them, and also deletes the timeline files if necessary.
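A rough sketch of the "list and delete" approach, under stated assumptions, might look as follows. The `Slice` class, the file names in the usage note, and the `execute` flag semantics are hypothetical stand-ins, not the PR's actual code; the sketch also assumes instant times are fixed-width timestamps, so that string comparison matches time order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; it does not use or reproduce any Hudi API.
final class SliceCleanupSketch {
    // Stand-in for a file slice: a base instant time plus its base and log file paths.
    static final class Slice {
        final String baseInstantTime;
        final List<String> files;

        Slice(String baseInstantTime, List<String> files) {
            this.baseInstantTime = baseInstantTime;
            this.files = files;
        }
    }

    // Collects the files of every slice whose base instant is newer than the restore instant.
    static List<String> filesToDelete(List<Slice> slices, String restoreInstant) {
        List<String> doomed = new ArrayList<>();
        for (Slice slice : slices) {
            // Assumption: fixed-width timestamps, so lexicographic order equals time order.
            if (slice.baseInstantTime.compareTo(restoreInstant) > 0) {
                doomed.addAll(slice.files);
            }
        }
        return doomed;
    }

    // Deletes only when execute is true; otherwise just reports. Returns the number of files deleted.
    static int apply(List<String> doomed, boolean execute, Consumer<String> deleter) {
        int deleted = 0;
        for (String file : doomed) {
            if (execute) {
                deleter.accept(file);
                deleted++;
            } else {
                System.out.println("[dry run] would delete " + file);
            }
        }
        return deleted;
    }
}
```

With slices based at hypothetical instants "003", "006", and "009" and a restore instant of "005", only the files of the "006" and "009" slices are collected. Passing `execute = false` leaves everything in place, which is the kind of dry-run behavior a sanity test could assert on.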

Sample command

./bin/spark-submit --num-executors 1 --executor-cores 2 --deploy-mode client --driver-memory 8g --executor-memory 8g  --class org.apache.hudi.utilities.MORRestoreTool BUNDLE_LOCATION/hudi-utilities-bundle_2.11-0.14.0-SNAPSHOT.jar --base-path /tmp/hudi_trips_more_restore/ --parallelism 4 --commitTime 20230225091008404 --spark-master local[2] --execute --cleanUpMetadata

Impact

Describe any public API or user-facing feature change or any performance impact.

Risk level (write none, low medium or high below)

If medium or high, explain what verification was done to mitigate the risks.

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

break;
} else {
// we need to collect only partial list of log files to be deleted
/*TableSchemaResolver tableSchemaResolver = new TableSchemaResolver(metaClient);
Contributor Author

It is feasible to support restoring to any delta commit. For now, I have not tested this part.

@nsivabalan nsivabalan force-pushed the restoreTool branch 2 times, most recently from 038d577 to 8742a53 Compare February 25, 2023 17:39
* Clears hoodie.table.metadata.partitions in hoodie.properties
*/
private void clearMetadataTablePartitionsConfig(Option<MetadataPartitionType> partitionType, boolean clearAll) {
public static void clearMetadataTablePartitionsConfig(Option<MetadataPartitionType> partitionType, boolean clearAll, HoodieTableMetaClient metaClient) {
Contributor

Let's avoid moving methods to static. This only makes the code harder to unit test. Also changing from private to public static seems like maybe we should move this functionality outside of the HoodieTable class, what do you think?

private final HoodieTableMetaClient metaClient;

public MORRestoreTool(HoodieTableMetaClient metaClient) {
this.metaClient = metaClient;
Contributor

This constructor isn't setting the other instance vars, can we set those to avoid NPEs? Are there some vars that only need to exist within the constructor?

}
}
}
}
Contributor

Can you add some assertions that the table can still be read and that the records retrieved match expectations?

Contributor

Are there any assertions on the metadata table that also need to be added here to make sure that has the correct end result/state?

DATA_GENERATOR.close();
}

@Test
Contributor

Add a sanity-check test that the "dry run" is taken into account.

HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
FileSystem fs = metaClient.getFs();
if (fs.exists(new Path(basePath + "/" + METADATA_TABLE_FOLDER_PATH))) {
if (!cfg.cleanupMetadata) {
Contributor

Should there be a dry-run option for this?

filesToDelete.add(Pair.of(pPath, logFile.getPath().toString()));
});
LOG.info(fileSlice.getBaseInstantTime() + " Not processing remaining file slices");
break;
Contributor

What is the purpose of using break here and below?

@nsivabalan nsivabalan changed the title [WIP] Adding standalone restore tool [HUDI-5859] Adding standalone restore tool Feb 28, 2023
@xushiyan xushiyan self-assigned this Feb 28, 2023

@vinothchandar
Member

Let's build this as a CLI enhancement? And also for COW and MOR in general.
