[HUDI-1575] Early Conflict Detection For Multi-writer #6133
Conversation
@zhangyue19921010 Would you please update the PR to fix the conflicts?
@hudi-bot run azure
Generally, the logic looks good and follows the design. We need to think about better code abstraction and reuse to avoid any discrepancy compared to the existing conflict detection and resolution strategy.
*/
private TypedProperties refreshLockConfig(HoodieWriteConfig writeConfig, String key) {
  TypedProperties props = new TypedProperties(writeConfig.getProps());
  props.setProperty(LockConfiguration.ZK_LOCK_KEY_PROP_KEY, key);
Here it should check if the ZK-based lock is configured. Otherwise, it should throw an exception.
Generally, we should think about how to support different lock provider implementations. For the first cut, it may be okay to have this specific logic here.
Sure thing, changed. We could also add a TODO here to support more lock providers as a next step.
Sounds good. Let's use LOCK_PROVIDER_CLASS_NAME instead of ZK_BASE_PATH_PROP_KEY for checking whether a ZK-based lock is configured.
@zhangyue19921010 could you file a JIRA ticket in addition to the TODO, since this requires more work?
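For illustration, a minimal sketch of the agreed-upon check. The config key and provider class name here are assumptions modeled on Hudi's configs, not the exact HoodieLockConfig API; check them against the actual codebase:

```java
import java.util.Properties;

public class LockConfigCheck {
    // Assumed config key and provider class name for illustration only.
    static final String LOCK_PROVIDER_CLASS_NAME = "hoodie.write.lock.provider";
    static final String ZK_PROVIDER =
        "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider";

    // Returns true only if the configured lock provider is the ZK-based one.
    static boolean isZkLockConfigured(Properties props) {
        return ZK_PROVIDER.equals(props.getProperty(LOCK_PROVIDER_CLASS_NAME));
    }

    // Fails fast when a non-ZK lock provider is configured, per the review suggestion.
    static void validateForEarlyConflictDetection(Properties props) {
        if (!isZkLockConfigured(props)) {
            // TODO: support more lock providers (tracked in a follow-up JIRA)
            throw new IllegalStateException(
                "Early conflict detection currently supports only the ZK-based lock provider");
        }
    }
}
```

Keying the check on the provider class name (rather than a ZK path property) avoids false positives when ZK properties happen to be set for other reasons.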
}

if (earlyConflictDetectionStrategy.hasMarkerConflict()) {
  earlyConflictDetectionStrategy.resolveMarkerConflict(basePath, markerDir, markerName);
No exception should be thrown here at the timeline server if there is detected conflict. The timeline server should simply return false for the marker creation request and let the executor/write handle resolve the marker conflict (throw the exception).
For the same reason: this check is batched and asynchronous. When a specific request gets a false result, it means the marker checker found a conflict, but it may not be a conflict related to the current request's marker.
So is it acceptable to let the current executor handle another writer's conflict, given that the timeline server runs the check in async and batch mode? :)
The timeline server should simply return false for the marker creation request
Totally agree. For now, the timeline server returns false for the executor's request, and the executor will:
if (success) {
  return Option.of(new Path(FSUtils.getPartitionPath(markerDirPath, partitionPath), markerFileName));
} else {
  // this failure may be due to early conflict detection, so we need to throw
  throw new HoodieEarlyConflictDetectionException(new ConcurrentModificationException("Early conflict detected but cannot resolve conflicts for overlapping writes"));
}
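A toy sketch of this contract (names are illustrative, not the actual MarkerHandler API): on conflict, the server side answers the marker creation request with false instead of throwing, and the writer is responsible for turning false into an exception.

```java
import java.util.Set;

public class MarkerCreationSketch {
    // Simulates the server-side decision: conflict -> return false, never throw.
    // existingFileGroups stands in for markers already created by other writers.
    static boolean createMarkerIfNoConflict(Set<String> existingFileGroups, String fileGroup) {
        if (existingFileGroups.contains(fileGroup)) {
            // conflict detected; the writer translates `false` into
            // a HoodieEarlyConflictDetectionException on its side
            return false;
        }
        existingFileGroups.add(fileGroup); // "create" the marker
        return true;
    }
}
```

Keeping the throw on the writer side means the timeline server stays a plain request/response service and the exception surfaces in the job that actually has to abort.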
* @param instants
* @return
*/
private List<String> getCandidateInstants(List<Path> instants, String currentInstantTime) {
Can we adapt the common logic from ConflictResolutionStrategy instead of reinventing similar logic?
Yep, actually there are some differences here despite the same name :)
For OCC, getCandidateInstants depends on a state:
// To find which instants are conflicting, we apply the following logic
// 1. Get completed instants timeline only for commits that have happened since the last successful write.
// 2. Get any scheduled or completed compaction or clustering operations that have started and/or finished
// after the current instant. We need to check for write conflicts since they may have mutated the same files
// that are being newly created by the current write.
For the current early conflict detection, getCandidateInstants:
/**
 * Get candidate instants to do conflict checking:
 * 1. Skip the current writer's related instant (currentInstantTime)
 * 2. Skip all instants after currentInstantTime
 * 3. Skip dead writers' related instants based on heartbeat
 * @param instants
 * @return
 */
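The three filtering rules above could be sketched as follows. This is a simplified model where heartbeat liveness is precomputed per instant; method and parameter names are illustrative, not the actual Hudi implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class CandidateInstantFilter {
    // instant time -> whether the owning writer's heartbeat is still alive.
    // Hudi instant times are sortable timestamp strings, so lexicographic
    // comparison orders them chronologically.
    static List<String> getCandidateInstants(Map<String, Boolean> instantToHeartbeatAlive,
                                             String currentInstantTime) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : instantToHeartbeatAlive.entrySet()) {
            String instant = e.getKey();
            if (instant.equals(currentInstantTime)) {
                continue; // rule 1: skip the current writer's own instant
            }
            if (instant.compareTo(currentInstantTime) > 0) {
                continue; // rule 2: skip instants started after the current one
            }
            if (!e.getValue()) {
                continue; // rule 3: skip instants whose writers are dead per heartbeat
            }
            candidates.add(instant);
        }
        Collections.sort(candidates);
        return candidates;
    }
}
```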
Also, thanks for your review here! Really appreciated!
long maxAllowableHeartbeatIntervalInMs = config.getHoodieClientHeartbeatIntervalInMs() * config.getHoodieClientHeartbeatTolerableMisses();

HoodieDirectMarkerBasedEarlyConflictDetectionStrategy strategy =
    (HoodieDirectMarkerBasedEarlyConflictDetectionStrategy) ReflectionUtils.loadClass(config.getEarlyConflictDetectionStrategyClassName(),
nit: we can think about loading the strategy class through reflection in a common place for reuse, instead of loading for every marker creation.
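As a sketch of the suggested reuse, the strategy instance could be cached per class name so the reflective load happens once instead of on every marker creation. This is illustrative code, not Hudi's ReflectionUtils:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StrategyLoader {
    // Cache keyed by class name; safe for concurrent marker creation requests.
    private static final ConcurrentMap<String, Object> CACHE = new ConcurrentHashMap<>();

    // Loads and instantiates the class once per name; later calls reuse the instance.
    static Object loadOnce(String className) {
        return CACHE.computeIfAbsent(className, name -> {
            try {
                return Class.forName(name).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot load strategy class: " + name, e);
            }
        });
    }
}
```

Note this assumes the strategy is stateless enough to share; if each marker creation needs fresh constructor arguments, caching the resolved Class object instead of the instance would be the safer variant.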
@hudi-bot run azure
LGTM. The marker APIs can be further improved, which can be addressed in a separate PR. @zhangyue19921010 Good job on getting the implementation done!
I'm doing more thorough tests in CI. Please do not merge this PR now.
@hudi-bot run azure
The Azure CI run with the feature flag turned on by default (#7703) has succeeded. The CI failure of this PR is due to flaky tests. Merging this PR.
Before this PR, Hudi implements OCC (Optimistic Concurrency Control) to detect write conflicts at pre-commit time to ensure data consistency, integrity, and correctness between multiple writers. OCC detects conflicts at Hudi's file group level, i.e., two concurrent writers updating the same file group are detected as a conflict. Currently, conflict detection is performed before committing metadata and after the data writing is completed. If any conflict is detected, it leads to a waste of cluster resources because computing and writing are already finished.

To solve this problem, this change implements an early conflict detection mechanism to detect conflicts during the data writing phase and abort the writing early if a conflict is detected, using Hudi's marker mechanism. Before writing each data file, the writer creates a corresponding marker to mark that the file is created, so that later on, the writer can use the markers to automatically clean up uncommitted data files in failure and rollback scenarios. We leverage the markers to identify conflicts at the file group level while writing data.

There are subtle differences in the early conflict detection workflow among different types of markers. For direct markers, the writer lists the necessary marker files directly and checks for conflicts before creating markers and starting to write the corresponding data file. For timeline-server-based markers, the writer gets the result of marker conflict detection by contacting the timeline server. The conflict detection is executed asynchronously and periodically at the timeline server so that write conflicts can be detected as early as possible; both writers may still write data files of the same file slice until the conflict is detected in the next round of detection.

Note that the implemented early conflict detection operates within OCC. Any conflict detection outside the scope of OCC is not handled. For example, the current OCC for multiple writers cannot detect a conflict if two concurrent writers perform INSERT operations for the same set of record keys, because the writers write to different file groups. This set of changes does not intend to address this problem.

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
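To illustrate the file-group-level detection described above, here is a toy model (not Hudi code; the marker-name format used here is an assumption for illustration) showing that two writers conflict exactly when their marker sets share a file group:

```java
import java.util.HashSet;
import java.util.Set;

public class FileGroupConflictSketch {
    // Assume a marker name starts with the file ID, e.g.
    // "fg1_0-1-2_001.parquet.marker.CREATE" (format is illustrative).
    static String fileGroupOf(String markerName) {
        return markerName.substring(0, markerName.indexOf('_'));
    }

    // Two writers conflict iff their markers touch a common file group.
    static boolean hasConflict(Set<String> markersA, Set<String> markersB) {
        Set<String> groupsA = new HashSet<>();
        for (String m : markersA) {
            groupsA.add(fileGroupOf(m));
        }
        for (String m : markersB) {
            if (groupsA.contains(fileGroupOf(m))) {
                return true;
            }
        }
        return false;
    }
}
```

This also shows the stated limitation: two INSERT writers producing the same record keys in different file groups share no file group, so no conflict is reported.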
Replaced #6059
Change Logs
Please take a look at #6003 for more details.
Impact
No impact.
Risk level: low
Documentation Update
Please take a look at #6003 for more details.
Contributor's checklist