[HUDI-6480] Flink support non-blocking concurrency control#9850
[HUDI-6480] Flink support non-blocking concurrency control#9850danny0405 merged 4 commits intoapache:masterfrom
Conversation
| return OptionsResolver.isMorTable(conf) | ||
| ? TestUtils.getLastDeltaCompleteInstant(basePath) | ||
| : TestUtils.getLastCompleteInstant(basePath, HoodieTimeline.COMMIT_ACTION); | ||
| return this.ckpMetadata.lastCompleteInstant(); |
There was a problem hiding this comment.
Why we must fetch the instant from ckp metadata?
There was a problem hiding this comment.
Updated.
Add a special branch to handle lockless multiple writers.
For multiple writes, the writer started first might finish later, similarly, the writer started later might finish first. So the last completed instant from ckp metadata might not be equals to the last completed instant from timeline.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnRead.java
Outdated
Show resolved
Hide resolved
| pipeline1.checkpoint(1) | ||
| .assertNextEvent() | ||
| .checkpointCompleteThrows(1, HoodieWriteConflictException.class, "Cannot resolve conflicts"); | ||
| testConcurrentCommit(pipeline1); |
There was a problem hiding this comment.
Can we add new tests instead of ammending existing one?
There was a problem hiding this comment.
The behavior of existed test cases (testWriteMultiWriterInvolved and testWriteMultiWriterPartialOverlapping) change after support non-blocking multiple writers. So those two tests need to be updated.
I also add some new tests.
|
@danny0405 Thanks for review. I would add more tests soon. |
e01944d to
d6a5091
Compare
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
Show resolved
Hide resolved
...flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/OptionsResolver.java
Show resolved
Hide resolved
| .checkpoint(1) | ||
| .assertNextEvent(); | ||
| if (OptionsResolver.isLocklessMultiWriter(conf)) { | ||
| // should success for concurrent modification of same fileGroups if using lockless multi writers |
There was a problem hiding this comment.
It should be always true? OptionsResolver.isLocklessMultiWriter(conf)
There was a problem hiding this comment.
No, it's only be true if all the following condition is satisfied:
- the table is MOR table
- the index is simple bucket index
- enable OPTIMISTIC_CONCURRENCY_CONTROL
So it could be false for COW table or MOR table with other index type.
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/TransactionUtils.java
Show resolved
Hide resolved
...atasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnReadWithCompact.java
Show resolved
Hide resolved
| // If there are lockless multiple writer, fetch last complete instant of current writer from ckp metadata | ||
| // because the writer started first might finish later, similarly, the writer started later might finish first. | ||
| return this.ckpMetadata.lastCompleteInstant(); | ||
| } else { |
There was a problem hiding this comment.
Got it, maybe we can use return this.ckpMetadata.lastCompleteInstant(); directly.
There was a problem hiding this comment.
In most case return this.ckpMetadata.lastCompleteInstant(); directly works fine.
But the case testSubtaskFails would fail because failover happens, the ckp meta might be cleaned, fetch the instant from timeline.
I add branch to handle this case.
There was a problem hiding this comment.
Does this.ckpMetadata.isEmpty() indicate a failover?
a247acb to
f73c1b6
Compare
|
6480.patch.zip |
…riteMultiWriterPartialOverlapping' 2. Introduce a new test case to test complex multiple writers with compaction
…nstant time as input argument
f73c1b6 to
1297401
Compare
Change Logs
Since #9776 is merged, this pr aims to support multiple streaming writers into the same MOR table with simple bucket index.
Set the following configure to enable this feature:
Besides, only enable async cleaning for one writer (by default, they are all enabled).
ps: This work is follow up of #9125. Thanks for contribution @danny0405
Impact
NA
Risk level (write none, low medium or high below)
NA
Documentation Update
Would be add in a separate PR.
Contributor's checklist