HDDS-11043. Explore client retry optimizations after write() and hsync() are desynced #9195
Conversation
Force-pushed 3f6ba7e to 28e738b.
… enhance clarity in handling putBlock operations. Introduced RetryChunkSelection for better management of chunk buffers during retries, and streamlined the optimization logic in RetryRequestBatcher.
Force-pushed 2623b35 to e522657.
Hi @jojochuang and @smengcl — just checking in. If you have a moment, I’d love any initial thoughts on this. Thanks!
Pull request overview
This PR introduces RetryRequestBatcher, a sliding-window optimizer that reduces network overhead during retry operations by combining multiple failed writeChunk requests and consolidating putBlock metadata operations. The optimization is particularly effective when write() and hsync() operations are desynchronized, reducing retry latency by minimizing redundant RPCs.
- Introduces intelligent retry request batching that combines consecutive writeChunk operations and retains only the latest putBlock metadata
- Refactors `BlockOutputStream.writeOnRetry()` to leverage the optimizer for fewer, more efficient retry RPCs (a hypothetical sketch follows this list)
- Implements a lazy memory cleanup strategy that balances performance with memory usage
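The review summary above doesn't show the new class itself, so here is a minimal, self-contained sketch of what a sliding-window retry batcher along these lines could look like. `acknowledgeUpTo` and `optimizeForRetry` are named in the PR description below; everything else (`RetryRequestBatcherSketch`, `RetryPlan`, `recordWrite`, `recordPutBlock`) is an illustrative assumption, not the PR's actual API:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative sketch only; not the actual class from this PR. */
final class RetryRequestBatcherSketch {

  /** Optimized retry plan: ordered chunk list plus a single putBlock flag. */
  static final class RetryPlan {
    final List<ByteBuffer> chunks;
    final boolean needsPutBlock;

    RetryPlan(List<ByteBuffer> chunks, boolean needsPutBlock) {
      this.chunks = chunks;
      this.needsPutBlock = needsPutBlock;
    }
  }

  // Outstanding writeChunk payloads, keyed and iterated by end offset.
  private final NavigableMap<Long, ByteBuffer> pendingWrites = new TreeMap<>();
  // Only the most recent putBlock offset matters on retry.
  private long latestPutBlockOffset = -1;

  void recordWrite(long endOffset, ByteBuffer data) {
    pendingWrites.put(endOffset, data);
  }

  void recordPutBlock(long endOffset) {
    latestPutBlockOffset = Math.max(latestPutBlockOffset, endOffset);
  }

  // Shrink the window once the datanodes have committed up to flushPos.
  void acknowledgeUpTo(long flushPos) {
    pendingWrites.headMap(flushPos, true).clear();
  }

  // Collapse the window into one ordered chunk list + one putBlock decision.
  RetryPlan optimizeForRetry() {
    return new RetryPlan(new ArrayList<>(pendingWrites.values()),
        latestPutBlockOffset >= 0);
  }
}
```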
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/RetryRequestBatcher.java | New class implementing sliding-window retry optimization with efficient tracking, acknowledgement handling, and lazy memory cleanup |
| hadoop-hdds/client/src/test/java/org/apache/hadoop/hdds/scm/storage/TestRetryRequestBatcher.java | Comprehensive unit tests covering basic operations, multi-chunk scenarios, acknowledgements, edge cases, and complex failure scenarios |
| hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java | Integrates RetryRequestBatcher into the write path, refactors writeOnRetry to use optimized retry plans, adds helper classes for retry chunk selection |
| hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyOutputStream.java | Removes the @VisibleForTesting annotation from the handleWrite method, making it package-private for internal use |
@jojochuang @smengcl please take a look
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.
What changes were proposed in this pull request?
- Adds `RetryRequestBatcher`, a sliding-window planner that keeps failed writeChunk requests sorted by end offset, retains only the most recent putBlock offset, and produces an optimized retry plan (combined chunk list + putBlock flag).
- Wires it into `BlockOutputStream`: every outgoing writeChunk/putBlock updates the window, `writeOnRetry` now replays the optimized plan (piggybacking the final chunk when supported), and acknowledgements/clears shrink the window once putBlock succeeds.
- Adds `TestRetryRequestBatcher` to exercise the batching logic across basic, duplicate putBlock, acknowledgement, complex, and bookkeeping scenarios.
Benefit:

Shared setup: every writeChunk/putBlock RPC now flows through `RetryRequestBatcher`. On the happy path we track each write's end offset and the latest putBlock offset. If an RPC fails, the window already knows exactly which buffers still need to be retried and in what order; when a putBlock succeeds, `acknowledgeUpTo(flushPos)` removes all requests the datanodes have committed (see the usage sketch below).
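Assuming the hypothetical `RetryRequestBatcherSketch` defined earlier on this page, the bookkeeping could play out like this (offsets and chunk sizes are made up for illustration):

```java
ByteBuffer chunk1 = ByteBuffer.allocate(4 * 1024 * 1024);
ByteBuffer chunk2 = ByteBuffer.allocate(4 * 1024 * 1024);

RetryRequestBatcherSketch batcher = new RetryRequestBatcherSketch();

// Happy path: record each write's end offset and the latest putBlock offset.
batcher.recordWrite(4L * 1024 * 1024, chunk1);  // bytes [0, 4 MiB)
batcher.recordWrite(8L * 1024 * 1024, chunk2);  // bytes [4 MiB, 8 MiB)
batcher.recordPutBlock(8L * 1024 * 1024);

// putBlock succeeds with flushPos = 8 MiB: both chunks are committed,
// so the sliding window shrinks back to empty.
batcher.acknowledgeUpTo(8L * 1024 * 1024);
```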
Retry without piggyback: before, `writeOnRetry` blindly replayed each allocated chunk, issuing a `writeChunk` immediately followed by a standalone `putBlock`. That meant n failed chunks produced 2n retry RPCs, even if multiple writes could have been coalesced before the next metadata update. Now the retry path first calls `retryRequestBatcher.optimizeForRetry()`, which collapses all outstanding chunks into a single ordered list and keeps just the highest putBlock offset. The retry loop then issues each chunk exactly once and sends a single `putBlock` at the end. Result: fewer network round-trips, less checksum/compression work, and shorter retry latency. The sketch below illustrates the difference.
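In code, the before/after difference might look roughly like this. `writeChunk` and `putBlock` stand in for the client's RPC helpers and `failedChunks` for the replayed buffers; none of this is the actual `writeOnRetry` body:

```java
// Before (illustrative): n failed chunks => 2n retry RPCs.
for (ByteBuffer chunk : failedChunks) {
  writeChunk(chunk);  // one data RPC per chunk
  putBlock();         // plus one standalone metadata RPC per chunk
}

// After (illustrative): n failed chunks => n + 1 retry RPCs.
RetryRequestBatcherSketch.RetryPlan plan = batcher.optimizeForRetry();
for (ByteBuffer chunk : plan.chunks) {
  writeChunk(chunk);  // each outstanding chunk exactly once, in order
}
if (plan.needsPutBlock) {
  putBlock();         // one putBlock at the end, at the highest tracked offset
}
```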
Retry with piggyback enabled: before, each retried chunk was sent via `writeChunkAndPutBlock`, so we ended up sending a putBlock for every chunk in the window. Now only the final chunk in the plan is piggybacked (`writeChunkAndPutBlock`); all preceding chunks are sent as plain `writeChunk` calls. Effectively we collapse the retries to “N chunk writes + 1 piggybacked flush” instead of “N piggybacked writes”, reducing both network chatter and datanode commit work while preserving the benefit of piggybacking (no extra standalone putBlock). A sketch follows below.
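Sketched the same way, the piggybacked retry sends N - 1 plain writes plus one combined call; `writeChunkAndPutBlock` below stands in for the piggybacked RPC helper, so this too is a sketch rather than the PR's code:

```java
// After (illustrative): "N chunk writes + 1 piggybacked flush"
// instead of "N piggybacked writes".
RetryRequestBatcherSketch.RetryPlan plan = batcher.optimizeForRetry();
List<ByteBuffer> chunks = plan.chunks;
for (int i = 0; i < chunks.size() - 1; i++) {
  writeChunk(chunks.get(i));  // plain writeChunk for all but the last chunk
}
if (!chunks.isEmpty()) {
  // the final chunk carries the single putBlock
  writeChunkAndPutBlock(chunks.get(chunks.size() - 1));
}
```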
What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11043
How was this patch tested?
`TestRetryRequestBatcher` unit tests.