Fix potential NPE in BaseTaskGenerator.getNonConsumingSegmentsZKMetadataForRealtimeTable #18715
…ataForRealtimeTable
IdealState.getInstanceStateMap() can return null when a segment has no instance assignments (e.g. during cluster init or partition rebalance). Extract to a local variable and guard the containsValue call to avoid NPE.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18715      +/-   ##
============================================
+ Coverage     64.51%   64.58%   +0.07%
  Complexity     1291     1291
============================================
  Files          3372     3372
  Lines        208638   208667      +29
  Branches      32596    32606      +10
============================================
+ Hits         134604   134769     +165
+ Misses        63239    63068     -171
- Partials      10795    10830      +35
xiangfu0
left a comment
Found one high-signal correctness issue; see inline comment.
  if (idealStateSegments.contains(segmentName)
      && segmentZKMetadata.getStatus().isCompleted() // skip consuming segments
-     && !idealState.getInstanceStateMap(segmentName).containsValue(SegmentStateModel.CONSUMING)) {
+     && (instanceStateMap == null || !instanceStateMap.containsValue(SegmentStateModel.CONSUMING))) {
Treating a missing instanceStateMap as "non-consuming" changes the method's safety behavior. The comment below says these segments should be skipped so RealtimeSegmentValidationManager can repair the missing IdealState update first; with instanceStateMap == null || ... we now schedule minion work for exactly that broken-controller edge case.
You're absolutely right — thank you for catching this. I had the null case backwards. When instanceStateMap == null the segment is in exactly the broken edge case the comment describes (partition in IdealState but no instance assignments), so we must not schedule minion work and should let RealtimeSegmentValidationManager repair it first.
Fixed in the latest commit: condition is now instanceStateMap != null && !instanceStateMap.containsValue(CONSUMING).
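The difference between the two revisions of the guard can be sketched in isolation. This is an illustration only, not the Pinot code: a plain `Map` stands in for Helix's `IdealState`, and the class name `NonConsumingFilterSketch` and its methods `nullMeansInclude`, `nullMeansSkip`, and `selectSegments` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: a plain Map replaces Helix's IdealState, and the
// class and method names below are hypothetical, not Pinot code.
final class NonConsumingFilterSketch {
  static final String CONSUMING = "CONSUMING";

  // First revision of the patch: a missing map is treated as non-consuming,
  // so the segment is included.
  static boolean nullMeansInclude(Map<String, String> instanceStateMap) {
    return instanceStateMap == null || !instanceStateMap.containsValue(CONSUMING);
  }

  // Revision after review: a missing map means the segment is skipped.
  static boolean nullMeansSkip(Map<String, String> instanceStateMap) {
    return instanceStateMap != null && !instanceStateMap.containsValue(CONSUMING);
  }

  // Applies the post-review guard to each segment name, in order.
  static List<String> selectSegments(List<String> segmentNames,
      Map<String, Map<String, String>> assignments) {
    List<String> selected = new ArrayList<>();
    for (String segmentName : segmentNames) {
      // Extract once to a local variable; get() returns null for a missing key.
      Map<String, String> instanceStateMap = assignments.get(segmentName);
      if (nullMeansSkip(instanceStateMap)) {
        selected.add(segmentName);
      }
    }
    return selected;
  }
}
```

Both predicates avoid the NullPointerException; they differ only in whether a segment with no instance-state map is passed on or held back.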
When instanceStateMap is null the segment is in the broken edge case (partition exists in IdealState but has no instance assignments); we should NOT schedule minion work for it and instead let RealtimeSegmentValidationManager repair it first. Fixes the condition from (null || !containsValue) to (!=null && !containsValue) as pointed out in review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@xiangfu0 @Jackie-Jiang: correctness fix applied per xiangfu0's review.
@xiangfu0 @Jackie-Jiang all CI checks pass, review comments addressed. Please review when you get a chance.
  if (idealStateSegments.contains(segmentName)
      && segmentZKMetadata.getStatus().isCompleted() // skip consuming segments
-     && !idealState.getInstanceStateMap(segmentName).containsValue(SegmentStateModel.CONSUMING)) {
+     && instanceStateMap != null
We don't need this null check since we already checked idealStateSegments.contains(segmentName). No need to handle corrupted ideal state
Hey @xiangfu0, would appreciate a review on this when you get a chance!
Closing based on reviewer feedback. Thank you @xiangfu0 and @Jackie-Jiang for the review!
Description
IdealState.getInstanceStateMap(segmentName) can return null when a segment exists in the partition set but has no instance assignments (e.g., during cluster initialization or partition rebalance). The previous code called .containsValue() directly on the result without a null check, which would throw a NullPointerException.
Fix
Extract the getInstanceStateMap() result to a local variable and guard the containsValue call: when instanceStateMap is null, treat the segment as non-consuming (safe default: include it in the result).
Tests
Existing unit tests cover this code path. No functional behavior change for the normal case (non-null map).
Checklist