Skip to content

Conversation

@krishan1390
Copy link
Contributor

@krishan1390 krishan1390 commented Dec 30, 2025

  1. Pinot segment record reader has a flag skipDefaultNullValues which is used to return NULL rather than default if the flag is set to true. We are adding a similar flag to PinotSegmentColumnReaderImpl.
  2. Also adding a new constructor to PinSegmentColumnReaderImpl if the client directly wants to pass a PinSegmentColumnReader implementation.
  3. support for tracking valid document IDs. This is particularly useful for upsert compaction scenarios where only valid (non-obsolete) documents should be processed.

@codecov-commenter
Copy link

codecov-commenter commented Dec 30, 2025

Codecov Report

❌ Patch coverage is 60.00000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.30%. Comparing base (7d40b9f) to head (1264cea).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...egment/readers/PinotSegmentColumnarDataSource.java 0.00% 5 Missing ⚠️
.../segment/readers/PinotSegmentColumnReaderImpl.java 78.57% 2 Missing and 1 partial ⚠️
...local/segment/creator/impl/BaseSegmentCreator.java 0.00% 1 Missing ⚠️
...che/pinot/spi/data/readers/ColumnarDataSource.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17443      +/-   ##
============================================
+ Coverage     63.29%   63.30%   +0.01%     
  Complexity     1475     1475              
============================================
  Files          3162     3163       +1     
  Lines        188681   188692      +11     
  Branches      28873    28872       -1     
============================================
+ Hits         119424   119459      +35     
+ Misses        60007    59979      -28     
- Partials       9250     9254       +4     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.28% <60.00%> (+0.05%) ⬆️
java-21 63.24% <60.00%> (-0.01%) ⬇️
temurin 63.30% <60.00%> (+0.01%) ⬆️
unittests 63.30% <60.00%> (+0.01%) ⬆️
unittests1 55.59% <0.00%> (+<0.01%) ⬆️
unittests2 34.05% <60.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

* @param indexSegment Source segment to read from
*
*/
public PinotSegmentColumnarDataSource(IndexSegment indexSegment, RoaringBitmap validDocIds) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the backward incomaptibility is fine as this class is unused and recently added by me

@Override
public ColumnReaderFactory createColumnReaderFactory() {
return new PinotSegmentColumnReaderFactory(_indexSegment, _initializeDefaultValueReaders);
return new PinotSegmentColumnReaderFactory(_indexSegment, true, false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're setting these flags by default based on their appropriate use case.

No client is yet using PinotSegmentColumnarDataSource for this to be a problem.

public void close()
throws IOException {
// Segment lifecycle is managed externally, so no cleanup needed here
_indexSegment.destroy();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pinot segment record reader also destroys the segment by default, thus enabling the same capability here.

* This parameter is kept for backward compatibility and will be removed in future.
*/
public PinotSegmentColumnReaderFactory(IndexSegment indexSegment, boolean initializeDefaultValueReaders) {
public PinotSegmentColumnReaderFactory(IndexSegment indexSegment, boolean skipDefaultNullValues,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the new variable in the middle because initialize default value readers will get deprecated in the future. Also, this is a new method added by me and not yet used, so it shouldn't cause any backward incompatibility.

@krishan1390 krishan1390 changed the title Add skipDefaultNullValues to PinotSegmentColumnReaderImpl Changes in Pinot segment column reader, data source, and factory to support nulls better and support valid doc IDs. Jan 7, 2026
Copy link
Contributor

@swaminathanmanish swaminathanmanish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@swaminathanmanish swaminathanmanish merged commit b313aa8 into apache:master Jan 7, 2026
37 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants