#4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. #4321
variable length byte dictionary. In the next iteration, we'll be deciding dynamically whether to use fixed or variable length impl.
Also fixed other review comments from Kishore.
for a column configurable through TableConfig.
- IndexLoadingConfig - RealtimeSegmentConfig
causing a race condition and unit test failure.
LGTM overall. Minor comments
pinot-core/src/main/java/org/apache/pinot/core/io/util/VarLengthBytesValueReaderWriter.java
...-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentDictionaryCreator.java
pinot-common/src/main/java/org/apache/pinot/common/config/IndexingConfig.java
.../src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
.../src/main/java/org/apache/pinot/core/data/manager/realtime/LLRealtimeSegmentDataManager.java
.../java/org/apache/pinot/core/segment/index/loader/defaultcolumn/BaseDefaultColumnHandler.java
...ore/src/main/java/org/apache/pinot/core/segment/index/readers/ImmutableDictionaryReader.java
What is wrong with a template implementation like the one in #4348, instead of duplicating the logic?
An IDE will refactor out Buffer, rename, etc. to our heart's content.
@buchireddy, @kishoreg please provide your comments on #4348. I need a ship-it from one of you. I have not addressed most of the comments by @mayankshriv; those will be handled in the implementation of VarLengthBytesValueReaderWriter. Basically, my proposal is to build mutable and immutable versions of OffHeapByteArrayStore. Either we can derive two classes out of it and use them, or we can play around with the ctor of OffHeapByteArrayStore to make it independent of the memory manager etc. (caller allocates) and provide only a pinot-databuffer that is mutable (vs. the immutable one, which could then become an empty constructor with an init call). We can work around these things. All I want is one way to encode strings in a compact fashion. Thanks.
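To make "encode strings in a compact fashion" concrete, here is a minimal sketch of one possible variable-length layout: a count, an offset table, then the value bytes packed back to back with no padding. The class name `VarLengthBytesSketch`, its methods, and the layout itself are assumptions for illustration only; this is not the format used by VarLengthBytesValueReaderWriter or OffHeapByteArrayStore.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Pinot code.
// Assumed layout: [count][offset_0 .. offset_count][value bytes]
public class VarLengthBytesSketch {

  // Packs all values into one buffer; offsets are absolute positions in that buffer.
  public static ByteBuffer encode(byte[][] values) {
    // One int for the count plus (count + 1) offsets; the last offset marks the end of the data.
    int headerSize = Integer.BYTES * (values.length + 2);
    int dataSize = 0;
    for (byte[] value : values) {
      dataSize += value.length;
    }

    ByteBuffer buffer = ByteBuffer.allocate(headerSize + dataSize);
    buffer.putInt(values.length);
    int offset = headerSize;
    for (byte[] value : values) {
      buffer.putInt(offset);
      offset += value.length;
    }
    buffer.putInt(offset);
    for (byte[] value : values) {
      buffer.put(value);
    }
    buffer.flip();
    return buffer;
  }

  // Reads the value at the given index using two adjacent offsets.
  public static byte[] get(ByteBuffer buffer, int index) {
    int start = buffer.getInt(Integer.BYTES * (index + 1));
    int end = buffer.getInt(Integer.BYTES * (index + 2));
    byte[] value = new byte[end - start];
    // Use a duplicate so the caller's position is left untouched.
    ByteBuffer view = buffer.duplicate();
    view.position(start);
    view.get(value);
    return value;
  }
}
```

With a layout like this, each entry costs its own length plus one 4-byte offset, whereas a fixed-length layout pads every entry to the longest value. Lookup by index stays constant time because the offsets are fixed width.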
Can you please re-format the files with Pinot format introduced in #3705? Also, for member variables, we prefix the value with '_' and do not use 'this.' when accessing them.
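For reference, the member-variable convention described above would look roughly like this. The class `DictionaryEntrySketch` and its fields are hypothetical and exist only to show the naming style; they are not taken from this PR.

```java
// Hypothetical class illustrating the naming convention only.
public class DictionaryEntrySketch {
  // Member variables carry a leading underscore.
  private final int _dictId;
  private final byte[] _value;

  public DictionaryEntrySketch(int dictId, byte[] value) {
    // The underscore distinguishes fields from parameters, so no 'this.' is needed.
    _dictId = dictId;
    _value = value;
  }

  public int getDictId() {
    return _dictId;
  }

  public int getLength() {
    return _value.length;
  }
}
```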
FTR, I was waiting to address review comments on this PR until now because @mcvsubbu was working on pulling out
@buchireddy Have you addressed all the comments? @Jackie-Jiang, can you take another look and approve?
@kishoreg @Jackie-Jiang @mayankshriv I've addressed the review comments and pushed the latest code. Please take a look and LMK if you have any more comments.
since we don't expect it to be ever called.
we can't really reuse it so removing the ThreadLocal byte[]. Thanks to the unit test which caught this issue 👍
Codecov Report
@@             Coverage Diff              @@
##             master    #4321      +/-   ##
============================================
+ Coverage     56.23%   65.35%   +9.12%
  Complexity       20       20
============================================
  Files          1061     1067       +6
  Lines         54980    55526     +546
  Branches       7824     7906      +82
============================================
+ Hits          30916    36289    +5373
+ Misses        21677    16665    -5012
- Partials       2387     2572     +185
LGTM. @mayankshriv, please review.
pinot-core/src/main/java/org/apache/pinot/core/realtime/converter/RealtimeSegmentConverter.java
so that we don't need to make any assumptions about where the data starts.
LGTM
Implementation of the feature mentioned in #4317