Skip to content

Fix potential NPE in BaseTaskGenerator.getNonConsumingSegmentsZKMetadataForRealtimeTable#18715

Closed
Akanksha-kedia wants to merge 2 commits into
apache:masterfrom
Akanksha-kedia:fix/base-task-generator-getinstancestatemap-npe
Closed

Fix potential NPE in BaseTaskGenerator.getNonConsumingSegmentsZKMetadataForRealtimeTable#18715
Akanksha-kedia wants to merge 2 commits into
apache:masterfrom
Akanksha-kedia:fix/base-task-generator-getinstancestatemap-npe

Conversation

@Akanksha-kedia

Copy link
Copy Markdown
Contributor

Description

IdealState.getInstanceStateMap(segmentName) can return null when a segment exists in the partition set but has no instance assignments (e.g., during cluster initialization or partition rebalance). The previous code called .containsValue() directly on the result without a null check, which would throw a NullPointerException.

Fix

Extract getInstanceStateMap() result to a local variable and guard the containsValue call:

  • If instanceStateMap is null, treat the segment as non-consuming (safe default: include it in the result)

Tests

Existing unit tests cover this code path. No functional behavior change for the normal case (non-null map).

Checklist

  • No new public APIs without documentation
  • Backward compatible change
  • Code follows existing patterns in the codebase

…ataForRealtimeTable

IdealState.getInstanceStateMap() can return null when a segment has no
instance assignments (e.g. during cluster init or partition rebalance).
Extract to a local variable and guard the containsValue call to avoid NPE.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 64.58%. Comparing base (4ff9dbe) to head (2787b0e).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...helix/core/minion/generator/BaseTaskGenerator.java 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18715      +/-   ##
============================================
+ Coverage     64.51%   64.58%   +0.07%     
  Complexity     1291     1291              
============================================
  Files          3372     3372              
  Lines        208638   208667      +29     
  Branches      32596    32606      +10     
============================================
+ Hits         134604   134769     +165     
+ Misses        63239    63068     -171     
- Partials      10795    10830      +35     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.58% <50.00%> (+0.07%) ⬆️
temurin 64.58% <50.00%> (+0.07%) ⬆️
unittests 64.58% <50.00%> (+0.07%) ⬆️
unittests1 57.00% <ø> (+0.07%) ⬆️
unittests2 37.15% <50.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one high-signal correctness issue; see inline comment.

if (idealStateSegments.contains(segmentName)
&& segmentZKMetadata.getStatus().isCompleted() // skip consuming segments
&& !idealState.getInstanceStateMap(segmentName).containsValue(SegmentStateModel.CONSUMING)) {
&& (instanceStateMap == null || !instanceStateMap.containsValue(SegmentStateModel.CONSUMING))) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Treating a missing instanceStateMap as "non-consuming" changes the method's safety behavior. The comment below says these segments should be skipped so RealtimeSegmentValidationManager can repair the missing IdealState update first; with instanceStateMap == null || ... we now schedule minion work for exactly that broken-controller edge case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right — thank you for catching this. I had the null case backwards. When instanceStateMap == null the segment is in exactly the broken edge case the comment describes (partition in IdealState but no instance assignments), so we must not schedule minion work and should let RealtimeSegmentValidationManager repair it first.

Fixed in the latest commit: condition is now instanceStateMap != null && !instanceStateMap.containsValue(CONSUMING).

When instanceStateMap is null the segment is in the broken edge case
(partition exists in IdealState but has no instance assignments); we
should NOT schedule minion work for it and instead let
RealtimeSegmentValidationManager repair it first.

Fixes the condition from (null || !containsValue) to (!=null && !containsValue)
as pointed out in review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Akanksha-kedia

Copy link
Copy Markdown
Contributor Author

@xiangfu0 @Jackie-Jiang — correctness fix applied per xiangfu0's review. instanceStateMap \!= null condition now correctly skips segments where the partition has no instance assignments, letting RealtimeSegmentValidationManager repair them first. Please re-review when you get a chance.

@Akanksha-kedia

Copy link
Copy Markdown
Contributor Author

@xiangfu0 @Jackie-Jiang all CI checks pass, review comments addressed. Please review when you get a chance.

if (idealStateSegments.contains(segmentName)
&& segmentZKMetadata.getStatus().isCompleted() // skip consuming segments
&& !idealState.getInstanceStateMap(segmentName).containsValue(SegmentStateModel.CONSUMING)) {
&& instanceStateMap != null

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this null check since we already checked idealStateSegments.contains(segmentName). No need to handle corrupted ideal state

@Akanksha-kedia

Copy link
Copy Markdown
Contributor Author

Hey @xiangfu0, would appreciate a review on this when you get a chance!

@Akanksha-kedia

Copy link
Copy Markdown
Contributor Author

Closing based on reviewer feedback. Thank you @xiangfu0 and @Jackie-Jiang for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants