fix(common): Close log writer output stream on append failure#18909
fix(common): Close log writer output stream on append failure#18909fhan688 wants to merge 4 commits into
Conversation
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the contribution! This PR adds proper resource cleanup when HoodieLogFormatWriter.appendBlocks fails after opening the output stream, and ensures closeStream still closes the underlying stream when sync() throws — preserving the original exception and attaching close failures as suppressed exceptions. The control flow in closeStream correctly handles all four combinations of sync/close success/failure, and closeOutputStream uses a try-finally so writer state (output=null, closed=true) is updated even when output.close() throws. The behavioral change (writer becomes unusable after a failed append, since closed=true makes subsequent getCurrentSize() throw IllegalStateException) appears intentional and is asserted in the new tests; no callers in this repo rely on retrying on the same writer after a failed append. No correctness issues found. A few style/readability suggestions in the inline comments. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here. One naming nit on a newly added helper method; everything else is clean.
cc @yihua
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the iteration here. The original fix (closing the output stream on append failure) is preserved intact — closeStream, closeOutputStream, and closeOutputStreamOnAppendFailure are all still present and correctly handle the sync/close paths. However, the PR has expanded considerably beyond its original "fix log writer close on append failure" scope: it now also bundles a large Lombok refactor, the HoodieLogFormat.Writer interface → abstract class conversion, removal of WriterBuilder, builder method renames (withFileId → withLogFileId, onParentPath → withParentPath), and HoodieCleanStat builder method renames/constructor removal. These are binary/source-incompatible changes for any downstream consumers — it might be worth splitting them into a focused follow-up so reviewers can evaluate them independently from the original bug fix. The prior nit about renaming throwAsIOExceptionOrRuntimeException doesn't appear to be addressed but it's a nit and not blocking. One small inline question about the boolean → Boolean change in HoodieReaderContext. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here.
|
Heyo @fhan688, we recently started cleaning up our builder patterns with Lombok to reduce boilerplate code. The error you are seeing is a builder pattern migration introduce in: Just a headsup. |
thanks! I'll fix this later. |
…nOrRuntimeException to rethrow
| } | ||
| } | ||
|
|
||
| private void rethrow(Exception exception) throws IOException { |
There was a problem hiding this comment.
is this right? the method does not always throw IOException now?
There was a problem hiding this comment.
is this right? the method does not always throw IOException now?
Good point. I used Exception to simplify the catch path, but made rethrow handle generic checked exceptions safely by preserving IOException and RuntimeException, and wrapping any other checked exception as IOException
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18909 +/- ##
============================================
+ Coverage 68.77% 68.82% +0.04%
- Complexity 29087 29097 +10
============================================
Files 2517 2517
Lines 139822 139842 +20
Branches 17209 17207 -2
============================================
+ Hits 96169 96243 +74
+ Misses 35880 35820 -60
- Partials 7773 7779 +6
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for working on this! The PR closes the underlying FSDataOutputStream when appendBlocks fails after the stream has been opened, preserving the original exception while attaching close failures as suppressed. The closeStream refactor also correctly handles the case where sync() fails by still attempting to close the output stream. No new critical correctness issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.
cc @yihua
Describe the issue this Pull Request addresses
HoodieLogFormatWriter#appendBlockscan leave the underlyingFSDataOutputStreamopen when an exception is thrown after the output stream has already been opened, for example while writing log block bytes. In thisfailure path, the writer does not release the stream/lease until a later close path, which may not be reached by callers.
This PR ensures the output stream is closed on append failure while preserving the original failure as the primary exception.
Summary and Changelog
This PR improves failure-path resource cleanup in
HoodieLogFormatWriter.Changes:
appendBlocksfails after opening the stream.closeStreamstill closes the output stream even whensync()fails.sync()fails.No code was copied.
Impact
No public API change.
Successful append behavior is unchanged. This PR does not restore append-time
sync()/hsync, so the existing append performance behavior remains intact.The behavioral change is limited to failure handling:
Risk Level
low
The change is scoped to error handling and resource cleanup in
HoodieLogFormatWriter. The successful write path remains functionally unchanged.Verification:
mvn -pl hudi-hadoop-common -am -DskipITs -Dcheckstyle.skip=true -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestHoodieLogFormatWriter test Result: Tests run: 3, Failures: 0, Errors: 0, Skipped: 0Documentation Update
none
No new config, public API, or user-facing feature is added.
Contributor's checklist