[HUDI-5629] Clean CDC log files for enable/disable scenario #7767

Merged
merged 2 commits into from
Jan 28, 2023

YannByron
Contributor

Change Logs

With the current clean logic for CDC, CDC log files are cleaned only when the CDC config is enabled. But if a table enables CDC first and then disables it, some CDC log files will probably be left behind and can never be cleaned.

So clean the log files directly whenever they exist, regardless of the CDC config.
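
A minimal sketch of the difference, for illustration only. It does not reproduce Hudi's clean planner: the `Slice` class, both method names, the `cdcEnabled` flag, and the file names are hypothetical stand-ins. It assumes a file slice that still carries log files written while CDC was enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class CdcCleanSketch {

    // Hypothetical stand-in for a file slice: one base file plus its log files.
    public static final class Slice {
        final String baseFile;
        final List<String> logFiles;

        public Slice(String baseFile, List<String> logFiles) {
            this.baseFile = baseFile;
            this.logFiles = logFiles;
        }
    }

    // Simplified old behavior: log files are collected only while the
    // (hypothetical) CDC flag is on, so files written before CDC was
    // disabled are never scheduled for cleaning.
    public static List<String> gatedCleanPaths(Slice slice, boolean cdcEnabled) {
        List<String> cleanPaths = new ArrayList<>();
        cleanPaths.add(slice.baseFile);
        if (cdcEnabled) {
            cleanPaths.addAll(slice.logFiles);
        }
        return cleanPaths;
    }

    // Behavior described in this PR: collect whatever log files exist,
    // without consulting the CDC config.
    public static List<String> unconditionalCleanPaths(Slice slice) {
        List<String> cleanPaths = new ArrayList<>();
        cleanPaths.add(slice.baseFile);
        cleanPaths.addAll(slice.logFiles);
        return cleanPaths;
    }

    public static void main(String[] args) {
        // CDC was enabled earlier and has since been disabled.
        Slice slice = new Slice("f1_base.parquet", List.of(".f1.log.1-cdc"));
        System.out.println("gated:         " + gatedCleanPaths(slice, false));
        System.out.println("unconditional: " + unconditionalCleanPaths(slice));
    }
}
```

When the slice has no log files, the unconditional version adds nothing beyond the base file, so it costs only an empty-list append.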

Impact

NONE

Risk level (write none, low medium or high below)

NONE

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

// and normal log files for mor tables.
cleanPaths.addAll(
nextSlice.getLogFiles().map(lf -> new CleanFileInfo(lf.getPath().toString(), false))
.collect(Collectors.toList()));
return cleanPaths;
Contributor

Should the precondition hoodieTable.getMetaClient().getTableType() == HoodieTableType.MERGE_ON_READ be kept for efficiency?

@xushiyan xushiyan changed the title [HUDI-5629] Clean CDC log fils [HUDI-5629] Clean CDC log files for enable/disable scenario Jan 28, 2023
@hudi-bot

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@YannByron
Contributor Author

The failed UTs are not related to this PR; they are fixed in #7768. Merging this now.
