[HUDI-8126] Use union to parallelize data and error table writes by vinishjail97 · Pull Request #12813 · apache/hudi

vinishjail97 · 2025-02-08T00:34:47Z

Change Logs

Enable writing the error table and the data table in parallel. This behavior is disabled by default and can be enabled by setting the error table config property hoodie.errortable.write.union.enable to true.
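
As a rough illustration of opting in, the sketch below sets the property on a plain `java.util.Properties` object. Only the key `hoodie.errortable.write.union.enable` and its default of `false` come from this PR; the class name `ErrorTableUnionProps` and its helper methods are hypothetical, and the other settings an error table needs are not shown.

```java
import java.util.Properties;

// Minimal sketch: build writer properties with the union flag turned on.
// ErrorTableUnionProps and its methods are hypothetical helpers, not Hudi APIs.
final class ErrorTableUnionProps {
  // Key taken from this PR's description.
  static final String UNION_KEY = "hoodie.errortable.write.union.enable";

  // Returns a copy of the base properties with the union flag set to "true".
  static Properties withUnionEnabled(Properties base) {
    Properties props = new Properties();
    props.putAll(base);
    props.setProperty(UNION_KEY, "true");
    return props;
  }

  // Reads the flag, falling back to the documented default of false.
  static boolean isUnionEnabled(Properties props) {
    return Boolean.parseBoolean(props.getProperty(UNION_KEY, "false"));
  }

  public static void main(String[] args) {
    Properties props = withUnionEnabled(new Properties());
    System.out.println(UNION_KEY + "=" + props.getProperty(UNION_KEY));
  }
}
```

Leaving the key unset keeps the existing sequential behavior, which is why the helper copies the input rather than mutating it.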

Impact

The DAGs for the data table and error table writes run sequentially today; this change executes them as a union to better utilize the Spark executor resources.
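
The idea can be pictured with a conceptual analogy in plain Java, where streams stand in for Spark datasets: tag each record with its target table, union the two inputs, process the union in one parallel pass, and split the results per table afterwards. This is only a sketch of the general technique, not the code in this PR; `UnionWriteSketch`, `Tagged`, and `write` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Conceptual analogy only: plain Java streams stand in for Spark datasets.
final class UnionWriteSketch {
  enum Target { DATA, ERROR }

  // A record tagged with the table it belongs to (hypothetical type).
  static final class Tagged {
    final Target target;
    final String payload;

    Tagged(Target target, String payload) {
      this.target = target;
      this.payload = payload;
    }
  }

  // Stand-in for the per-record write; returns a "write status" string.
  static String write(Tagged t) {
    return t.target + ":" + t.payload;
  }

  // Tags both inputs, unions them, processes the union in one parallel pass,
  // then splits the resulting statuses back out per target table.
  static Map<Target, List<String>> writeAsUnion(List<String> data, List<String> errors) {
    Stream<Tagged> union = Stream.concat(
        data.stream().map(p -> new Tagged(Target.DATA, p)),
        errors.stream().map(p -> new Tagged(Target.ERROR, p)));
    return union.parallel()
        .collect(Collectors.groupingBy(t -> t.target,
            Collectors.mapping(UnionWriteSketch::write, Collectors.toList())));
  }

  public static void main(String[] args) {
    Map<Target, List<String>> statuses =
        writeAsUnion(Arrays.asList("r1", "r2"), Arrays.asList("bad1"));
    System.out.println(statuses);
  }
}
```

In the sequential version the second input would only start once the first had finished; with the union, both share the same pool of workers in a single pass.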

Risk level (write none, low medium or high below)

Low.

Documentation Update

  public static final ConfigProperty<Boolean> ENABLE_ERROR_TABLE_WRITE_UNIFICATION = ConfigProperty
      .key("hoodie.errortable.write.union.enable")
      .defaultValue(false)
      .withDocumentation("Enable error table union with data table when writing for improved commit performance. "
          + "By default it is disabled meaning data table and error table writes are sequential");
### Contributor's checklist

Read through contributor's guide
Change Logs and Impact were stated clearly
Adequate tests were added if applicable
CI passed

nsivabalan · 2025-02-11T19:27:58Z

@the-other-tim-brown @rmahindra: Can you folks review this? Once everything looks good and CI is green, let me know and I can help land the patch.

the-other-tim-brown · 2025-02-13T02:10:20Z

hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieErrorTableConfig.java

      .defaultValue(ErrorWriteFailureStrategy.ROLLBACK_COMMIT.name())
      .withDocumentation("The config specifies the failure strategy if error table write fails. "
          + "Use one of - " + Arrays.toString(ErrorWriteFailureStrategy.values()));
+  public static final ConfigProperty<Boolean> ENABLE_ERROR_TABLE_WRITE_UNIFICATION = ConfigProperty


I am wondering if we really need a flag for this. It seems like this will be more performant for users who have the error table writer enabled.

vinishjail97 · Feb 19, 2025

An error table writer implementation can't do a union if it implements the upsertAndCommit method, so we still need to keep this behind a feature flag to avoid breaking things for existing users.
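
The gating described here could look roughly like the sketch below: the union path is chosen only when the error table is in use and the flag is explicitly on, so existing setups keep the sequential path. `ErrorWritePathSelector` and `WritePath` are hypothetical names for illustration and are not taken from the PR's code.

```java
// Hypothetical sketch of keeping the union path behind a feature flag.
final class ErrorWritePathSelector {
  enum WritePath { SEQUENTIAL, UNION }

  // errorTableEnabled: whether an error table writer is configured at all.
  // unionFlagEnabled: value of hoodie.errortable.write.union.enable (default false).
  static WritePath select(boolean errorTableEnabled, boolean unionFlagEnabled) {
    if (errorTableEnabled && unionFlagEnabled) {
      return WritePath.UNION;
    }
    // Default: unchanged behavior for existing users, including writers
    // that rely on upsertAndCommit.
    return WritePath.SEQUENTIAL;
  }

  public static void main(String[] args) {
    System.out.println(select(true, false));
    System.out.println(select(true, true));
  }
}
```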

vinishjail97 · 2025-02-20T04:50:28Z

@nsivabalan How do I see the logs for the Azure run? Can we re-trigger the CI run if possible?
Nothing to show. Final logs are missing. This can happen when the job is cancelled or times out.

vinishjail97 · 2025-02-20T06:23:34Z

This seems to be caused by testReadArchivedCommitsIncrementally; Azure failed in another PR as well.

https://dev.azure.com/apachehudi/hudi-oss-ci/_build/results?buildId=3732&view=logs&j=d8698f62-59df-5d60-659e-8e4b90e4e5ba&t=7e6a176b-fa3b-5e2b-bced-8d295e83a4a8

nsivabalan · 2025-02-20T16:20:45Z

have retriggered

vinishjail97 · 2025-02-21T17:46:48Z

Rebased with latest master.

vinishjail97 · 2025-02-21T20:07:15Z

Jacoco failures.

Jacoco CLI jar: jacoco-lib/lib/jacococli.jar
Hudi source directory: /home/vsts/work/1/s
[INFO] Loading execution data file /home/vsts/work/1/s/hudi-client/hudi-spark-client/target/jacoco-agent/jacoco1.exec.
[INFO] Loading execution data file /home/vsts/work/1/s/hudi-client/hudi-spark-client/target/jacoco-agent/jacoco2.exec.
Exception in thread "main" java.io.EOFException
	at java.base/java.io.DataInputStream.readUnsignedShort(DataInputStream.java:345)
	at java.base/java.io.DataInputStream.readUTF(DataInputStream.java:594)
	at java.base/java.io.DataInputStream.readUTF(DataInputStream.java:569)
	at org.jacoco.cli.internal.core.data.ExecutionDataReader.readExecutionData(ExecutionDataReader.java:149)
	at org.jacoco.cli.internal.core.data.ExecutionDataReader.readBlock(ExecutionDataReader.java:116)
	at org.jacoco.cli.internal.core.data.ExecutionDataReader.read(ExecutionDataReader.java:93)
	at org.jacoco.cli.internal.core.tools.ExecFileLoader.load(ExecFileLoader.java:60)
	at org.jacoco.cli.internal.core.tools.ExecFileLoader.load(ExecFileLoader.java:74)
	at org.jacoco.cli.internal.commands.Merge.loadExecutionData(Merge.java:61)
	at org.jacoco.cli.internal.commands.Merge.execute(Merge.java:45)
	at org.jacoco.cli.internal.Main.execute(Main.java:90)
	at org.jacoco.cli.internal.Main.main(Main.java:105)

##[error]Bash exited with code '1'.

vinishjail97 · 2025-02-21T20:08:04Z

Test failures.

2025-02-21T19:32:37.9608824Z [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.143 s - in org.apache.hudi.utilities.TestManifestFileWriterSpark
2025-02-21T19:32:39.5325904Z [INFO] 
2025-02-21T19:32:39.5331397Z [INFO] Results:
2025-02-21T19:32:39.5338649Z [INFO] 
2025-02-21T19:32:39.5361383Z [ERROR] Errors: 
2025-02-21T19:32:39.5934520Z [ERROR]   TestHoodieIncrSourceE2E.testSyncE2ENoPrevCkpThenSyncMultipleTimes:284 » HoodieIO
2025-02-21T19:32:39.5935804Z [ERROR]   TestHoodieIncrSourceE2E.testSyncE2ENoPrevCkpThenSyncMultipleTimes:284 » HoodieIO
2025-02-21T19:32:39.5936486Z [ERROR]   TestHoodieIncrSourceE2EAutoUpgrade.testSyncE2ENoPrevCkpThenSyncMultipleTimes:221 » HoodieIO
2025-02-21T19:32:39.5936840Z [INFO] 
2025-02-21T19:32:39.5937117Z [ERROR] Tests run: 771, Failures: 0, Errors: 3, Skipped: 6
2025-02-21T19:32:39.5937390Z [INFO] 
2025-02-21T19:32:39.6887020Z [INFO] ------------------------------------------------------------------------
2025-02-21T19:32:39.6890829Z [INFO] BUILD FAILURE
2025-02-21T19:32:39.6895163Z [INFO] ------------------------------------------------------------------------

nsivabalan · 2025-02-22T18:47:28Z

@hudi-bot run azure

hudi-bot · 2025-02-22T23:59:52Z

CI report:

7eda562 UNKNOWN
2a312c2 Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

github-actions bot added the size:L PR with lines of changes in (300, 1000] label Feb 8, 2025

the-other-tim-brown reviewed Feb 13, 2025

kroushan-nit mentioned this pull request Feb 18, 2025

[HUDI-8126] Use union to parallelize data and error table writes #11843

Closed

4 tasks

vinishjail97 force-pushed the HUDI-8126_error-table branch 2 times, most recently from d72d93b to 1189123, February 19, 2025 20:42

the-other-tim-brown approved these changes Feb 20, 2025

Use union to parallelize data and error table writes

cf313c0

vinishjail97 force-pushed the HUDI-8126_error-table branch from 1189123 to cf313c0, February 21, 2025 17:46

Handle table version upgrade use-case in writeToSinkAndDoMetaSync

f586076

vinishjail97 commented Feb 21, 2025

hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java (outdated, resolved)

Reload metaclient if timelineLayout version changes

462b5e9

nsivabalan reviewed Feb 22, 2025

hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java (outdated, resolved)

Address comments

7eda562

nsivabalan approved these changes Feb 22, 2025

Call getLatestCommittedInstant behind errorTableWriter

2a312c2

nsivabalan merged commit cb32e5e into apache:master Feb 23, 2025
43 checks passed

yihua added the release-1.0.2 label Mar 14, 2025

voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 8, 2025

[HUDI-8126] Use union to parallelize data and error table writes (apache#12813) (cherry picked from commit cb32e5e)

2c5c1d0

voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 8, 2025

[HUDI-8126] Use union to parallelize data and error table writes (apache#12813) (cherry picked from commit cb32e5e)

b1721e1

voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 9, 2025

[HUDI-8126] Use union to parallelize data and error table writes (apache#12813) (cherry picked from commit cb32e5e)

a1f7483

voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 15, 2025

[HUDI-8126] Use union to parallelize data and error table writes (apache#12813) (cherry picked from commit cb32e5e)

ab0f9d2

nsivabalan merged 5 commits into apache:master from vinishjail97:HUDI-8126_error-table

6 participants
