[CELEBORN-2287] Split mode should be HARD_SPLIT when disk is full#3653
[CELEBORN-2287] Split mode should be HARD_SPLIT when disk is full#3653zaynt4606 wants to merge 4 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates worker-side partition split decision logic so that when a disk is detected as full, the split response is always HARD_SPLIT (i.e., not downgraded to SOFT_SPLIT based on split mode).
Changes:
- Return
StatusCode.HARD_SPLITimmediately whendiskFullis true and the disk file length meets the minimum split size. - Keep existing split-threshold behavior for primary partitions (SOFT vs HARD based on
PartitionSplitModeand max size).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
|
@zaynt4606, please update the description of this pull request. |
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Show resolved
Hide resolved
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
worker/src/test/scala/org/apache/celeborn/service/deploy/worker/PushDataHandlerSuite.scala
Outdated
Show resolved
Hide resolved
worker/src/test/scala/org/apache/celeborn/service/deploy/worker/PushDataHandlerSuite.scala
Outdated
Show resolved
Hide resolved
RexXiong
left a comment
There was a problem hiding this comment.
LGTM. The PR has been significantly improved with:
- Comprehensive unit tests - Covers all key scenarios including disk full, split enabled/disabled, primary/replica, and SOFT/HARD modes
- Cleaner code structure - Using
splitStatusvariable makes the logic more readable - Correct fix - Forces
HARD_SPLITwhen disk is full to prevent reserved space from being exhausted
Thanks for adding the tests!
by claude
### What changes were proposed in this pull request? Change the split mode to `HARD_SPLIT` when disk is full. ### Why are the changes needed? When the disk is already in a full state, continuous writing data in `SOFT_SPLIT` mode may cause the reserved space to be filled up as well. ### Does this PR resolve a correctness bug? No. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual test and UT. Closes #3653 from zaynt4606/clb2287. Authored-by: zhengtao <shuaizhentao.szt@alibaba-inc.com> Signed-off-by: SteNicholas <programgeek@163.com> (cherry picked from commit 913d027) Signed-off-by: SteNicholas <programgeek@163.com>
|
Merged to main(v0.7.0) and branch-0.6(v0.6.3). |
What changes were proposed in this pull request?
Change the split mode to
HARD_SPLITwhen disk is full.Why are the changes needed?
When the disk is already in a full state, continuous writing data in
SOFT_SPLITmode may cause the reserved space to be filled up as well.Does this PR resolve a correctness bug?
No.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual test and UT.