fix: clear Hive work map after combine split failures#18719
Conversation
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the contribution! This PR wraps HoodieCombineHiveInputFormat.getSplits in a try/finally so Utilities.clearWorkMapForConf runs even when split generation fails, preventing stale ThreadLocal work-map state from leaking across reused threads. No correctness issues found. A few style/readability suggestions in the inline comments. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here. One comment about the finally-block comment still saying "after splits are generated" when the fix is specifically about the failure case.
cc @yihua
| } finally { | ||
| // Clear work from ThreadLocal after splits are generated in case the thread is reused in a pool. | ||
| Utilities.clearWorkMapForConf(job); | ||
| } |
There was a problem hiding this comment.
🤖 nit: the comment says "after splits are generated" but since this is now in a finally block, it runs even when generation fails — which is the whole point of the fix. Could you update it to something like // Clear work from ThreadLocal after getSplits completes (successfully or not), in case the thread is reused in a pool.?
- AI-generated; verify before applying. React 👍/👎 to flag quality.
HoodieCombineHiveInputFormat cleared Hive's work map only after successful split generation, leaving ThreadLocal work state behind when split classification or generation failed. Move the cleanup into a finally block and add a regression test that forces getSplits to fail before verifying Utilities.clearWorkMapForConf is still invoked.
34717d0 to
e9438ee
Compare
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the contribution! The PR wraps getSplits in a try/finally so the Hive work-map ThreadLocal is cleared even when split generation throws. The change is narrowly scoped and preserves the prior success-path cleanup behavior. No correctness issues found. A few style/readability suggestions in the inline comments. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here. One minor naming inconsistency in the new test method; the rest of the change is clean.
cc @yihua
|
|
||
| @Test | ||
| public void clearWorkMapForConfOnGetSplitsFailure() throws Exception { | ||
| StorageConfiguration<Configuration> conf = HoodieTestUtils.getDefaultStorageConf(); |
There was a problem hiding this comment.
🤖 nit: could you prefix this with test to match the convention used by the other methods in this class (testInternalSchemaCacheForMR, testHoodieRealtimeCombineHoodieInputFormat, etc.)? Something like testClearWorkMapForConfOnGetSplitsFailure.
- AI-generated; verify before applying. React 👍/👎 to flag quality.
|
Updated in Validation passed: |
|
Follow-up nit addressed in Validation passed: |
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the contribution! This PR wraps getSplits in a try/finally so Utilities.clearWorkMapForConf is invoked on both success and failure paths, preventing stale ThreadLocal work state from leaking across reused threads. No issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.
cc @yihua
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18719 +/- ##
=========================================
Coverage 68.14% 68.15%
+ Complexity 29051 29036 -15
=========================================
Files 2516 2516
Lines 140935 140935
Branches 17472 17475 +3
=========================================
+ Hits 96047 96049 +2
+ Misses 36993 36990 -3
- Partials 7895 7896 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
* fix: clear Hive work map after combine split failures HoodieCombineHiveInputFormat cleared Hive's work map only after successful split generation, leaving ThreadLocal work state behind when split classification or generation failed. Move the cleanup into a finally block and add a regression test that forces getSplits to fail before verifying Utilities.clearWorkMapForConf is still invoked.
* fix: clear Hive work map after combine split failures HoodieCombineHiveInputFormat cleared Hive's work map only after successful split generation, leaving ThreadLocal work state behind when split classification or generation failed. Move the cleanup into a finally block and add a regression test that forces getSplits to fail before verifying Utilities.clearWorkMapForConf is still invoked.
Describe the issue this Pull Request addresses
Closes #17985.
Summary and Changelog
HoodieCombineHiveInputFormat.getSplits now clears Hive's work map in a finally block so ThreadLocal work state is cleaned up even when split generation fails.
Added a regression test that forces split classification to fail and verifies Utilities.clearWorkMapForConf is still invoked.
Validation:
mvn -pl hudi-hadoop-mr -am -Dtest=TestHoodieCombineHiveInputFormat#clearWorkMapForConfOnGetSplitsFailure -Dsurefire.failIfNoSpecifiedTests=false -DfailIfNoTests=false -DskipITs=true -DskipFTs=true testImpact
Prevents stale Hive work-map state from leaking across reused threads after exceptional split generation. There is no public API change.
Risk Level
low. The change preserves the successful cleanup behavior and extends it to failure paths.
Documentation Update
none
Contributor's checklist