[HUDI-8551] Support no precombine field with EVENT_TIME_ORDERING merge mode #12292

Merged: codope merged 5 commits into apache:master from yihua:HUDI-8551-no-precombine on Jan 29, 2025
@yihua (Contributor) commented Nov 19, 2024

Change Logs

This PR fixes failures that occur when the precombine field is not set and the merge mode is EVENT_TIME_ORDERING. Note that after #12600, the precombine field config has no default and is no longer required for COMMIT_TIME_ORDERING. For EVENT_TIME_ORDERING, if the precombine field is not set, the logic should fall back to the default ordering value of (int) 0, which has the highest priority (i.e., the merge behaves as commit time ordering).

Detailed fixes include:

  • Handles a missing precombine field when getting the ordering value in HoodieSparkRecord and HoodieAvroIndexedRecord, to avoid an NPE;
  • Checks the precombine field name for both null and empty, so that either case is handled;
  • Accounts for an empty ordering field specified by the Hudi Streamer when creating HoodieRecords from the Streamer.

New tests are added in TestMORDataSource and TestHoodieDeltaStreamer to cover the cases that fail without this PR.
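
For readers who want a concrete picture of the fallback described above, here is a minimal, self-contained sketch. It is not the code in this PR and does not use Hudi's API: the class `OrderingFallbackSketch`, its methods, and the use of a plain `Map` as a record are all hypothetical. It also assumes that a missing field value falls back to the default, and that on a tie in ordering value the incoming (later-written) record wins, which is how equal ordering values would reduce to commit time ordering.

```java
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hudi.
final class OrderingFallbackSketch {
    // Default ordering value named in the PR description: (int) 0.
    static final Integer DEFAULT_ORDERING_VALUE = 0;

    // Treat both null and empty field names as "precombine field not set".
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Returns the ordering value for a record, falling back to the default
    // instead of dereferencing a missing field name.
    static Comparable<?> orderingValue(Map<String, ?> record, String precombineField) {
        if (isNullOrEmpty(precombineField)) {
            return DEFAULT_ORDERING_VALUE;
        }
        Object value = record.get(precombineField);
        // Assumption of this sketch: a missing value also falls back to the default.
        return value == null ? DEFAULT_ORDERING_VALUE : (Comparable<?>) value;
    }

    // Assumed tie-breaking rule: the incoming record wins when its ordering
    // value is greater than or equal to the existing one. Both values must be
    // of the same comparable type.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean incomingWins(Comparable existing, Comparable incoming) {
        return incoming.compareTo(existing) >= 0;
    }
}
```

Under these assumptions, when no precombine field is configured every record gets the same ordering value, so `incomingWins` is always true and the latest write is kept.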

Impact

Supports no precombine field with EVENT_TIME_ORDERING merge mode.

Risk level

low

Documentation Update

Release docs will be updated.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:M PR with lines of changes in (100, 300] label Nov 19, 2024
@yihua yihua force-pushed the HUDI-8551-no-precombine branch 3 times, most recently from bd2b267 to 511d02e on January 28, 2025 17:06

@nsivabalan (Contributor) left a comment

can we de-couple partial update support w/ Commit time ordering and sql writes separately

Once we do that, we are good to land. changes LGTM otherwise.

@nsivabalan nsivabalan added the release-1.0.1 Patches targetted for 1.0.1 release label Jan 28, 2025
@yihua yihua force-pushed the HUDI-8551-no-precombine branch from 511d02e to 5c65a5d on January 29, 2025 08:51
@github-actions github-actions bot added size:S PR with lines of changes in (10, 100] and removed size:M PR with lines of changes in (100, 300] labels Jan 29, 2025
@yihua yihua changed the title from "[HUDI-8551][DNM] Support no precombine field in MOR tables" to "[HUDI-8551] Support no precombine field in MOR tables" Jan 29, 2025
@yihua yihua changed the title from "[HUDI-8551] Support no precombine field in MOR tables" to "[HUDI-8551] Support no precombine field with EVENT_TIME_ORDERING" Jan 29, 2025
@github-actions github-actions bot added size:M PR with lines of changes in (100, 300] and removed size:S PR with lines of changes in (10, 100] labels Jan 29, 2025
@yihua yihua changed the title from "[HUDI-8551] Support no precombine field with EVENT_TIME_ORDERING" to "[HUDI-8551] Support no precombine field with EVENT_TIME_ORDERING merge mode" Jan 29, 2025
@yihua yihua marked this pull request as ready for review January 29, 2025 10:26
@hudi-bot (Collaborator) posted a CI report.

@codope codope merged commit 8f6b1ae into apache:master Jan 29, 2025
43 of 45 checks passed