New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keeps nullness attributes of merged in comparison column values #10704
Keeps nullness attributes of merged in comparison column values #10704
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10704 +/- ##
============================================
+ Coverage 70.25% 70.28% +0.03%
+ Complexity 6429 5681 -748
============================================
Files 2112 2164 +52
Lines 113996 116347 +2351
Branches 17219 17597 +378
============================================
+ Hits 80084 81776 +1692
- Misses 28312 28857 +545
- Partials 5600 5714 +114
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 286 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this is incorrect.
When loading sealed segment, ComparisonColumns
might have multiple values set, in which case we need to compare multiple values to determine which value is newer.
So the fix should be: when comparisonIndex
is -1
, loop over all values and check if it has any value larger than the value in another ComparisonColumns
. There is no need to modify the value within the current ComparisonColumns
because it should already have values set.
@@ -177,26 +179,35 @@ public void close() | |||
|
|||
public static class MultiComparisonColumnReader implements UpsertUtils.ComparisonColumnReader { | |||
private final PinotSegmentColumnReader[] _comparisonColumnReaders; | |||
private final NullValueVectorReader[] _comparisonColumnNullReaders; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed. You may use PinotSegmentColumnReader.isNull()
instead
One case I'm struggling to figure out how to support is when altering from one set of comparison columns to a new set. In such a case, I believe there would be a need to modify the value within the current ComparisonColumns (ex. adding 1 new column in addition to a set of existing comparison columns, or migrating from a single comparison column to a set of columns). |
Are you referring to modifying the comparison columns after table is created? I don't think that is allowed even with single comparison column with partial upsert because we should not change the history of the records. |
Ya that makes sense, otherwise it would literally be changing the semantics of which docs should have been persisted in the first place. I do want to allow for support of adding a brand new column to the schema + comparisonColumns list though, which I agree should work. When reloading sealed segments, is processing done in global order of |
… comparison columns enabled
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ComparisonColumns.java
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/UpsertUtils.java
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ComparisonColumns.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ComparisonColumns.java
Outdated
Show resolved
Hide resolved
Sealed segments are loaded in parallel without any global order (we don't really have global ordered doc ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good if we can add an unit test to verify the behavior of ComparisonColumns
Keeps nullness attributes of merged comparison column values to avoid
IndexOutOfBounds
exception on segment read.