Support for disabling bitmap indexes. #5402
Can save space for columns where bitmap indexes are pointless (like free-form text).
docs/content/ingestion/index.md
Outdated
{
  "type": "string",
  "name": "comment",
  "bitmapIndex": false
Maybe "createBitmapIndex"?
Sure, why not. Changed it.
@@ -143,7 +144,7 @@ public static SerializerBuilder serializerBuilder()
   public static class SerializerBuilder
   {
     private VERSION version = null;
-    private int flags = NO_FLAGS;
+    private int flags = Feature.NO_BITMAP_INDEX.getMask();
I suggest adding a method maskOf(Collection<Feature>) and a constant DEFAULT_FEATURES = Collections.singletonList(NO_BITMAP_INDEX).
I made a STARTING_FLAGS constant, but I didn't add the maskOf function; it doesn't seem useful enough to exist at this point.
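For readers following the maskOf suggestion, here is a minimal sketch of what such a helper could look like. The enum below is a hypothetical stand-in that reuses the flag names from this thread. The bit layout (1 << ordinal()) and the FeatureMasks holder class are assumptions for illustration, not the code in the patch.

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in for the column's feature-flag enum. The names mirror
// the ones mentioned in this thread; the bit layout is an assumption.
enum Feature
{
  MULTI_VALUE,
  MULTI_VALUE_V3,
  NO_BITMAP_INDEX;

  int getMask()
  {
    return 1 << ordinal();
  }

  boolean isSet(int flags)
  {
    return (getMask() & flags) != 0;
  }
}

// Hypothetical holder class for the suggested helper and default.
final class FeatureMasks
{
  static final int NO_FLAGS = 0;

  // The suggested default: a new builder starts with "no bitmap index" set.
  static final Collection<Feature> DEFAULT_FEATURES =
      Collections.singletonList(Feature.NO_BITMAP_INDEX);

  private FeatureMasks() {}

  // OR together the masks of all given features.
  static int maskOf(Collection<Feature> features)
  {
    int mask = NO_FLAGS;
    for (Feature feature : features) {
      mask |= feature.getMask();
    }
    return mask;
  }

  public static void main(String[] args)
  {
    int flags = maskOf(DEFAULT_FEATURES);
    System.out.println(Feature.NO_BITMAP_INDEX.isSet(flags)); // true
    System.out.println(Feature.MULTI_VALUE.isSet(flags));     // false
  }
}
```

With a single default feature, a plain STARTING_FLAGS constant gives the same result as maskOf(DEFAULT_FEATURES), which is consistent with the reply above that the helper is not needed yet.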
@leventov thanks for the review; I've updated the patch.
);
if (!Feature.NO_BITMAP_INDEX.isSet(rFlags)) {
  GenericIndexed<ImmutableBitmap> rBitmaps = GenericIndexed.read(
      buffer, bitmapSerdeFactory.getObjectStrategy(), builder.getFileMapper()
One argument per line, please.
Changed it.
Left a few comments.
Also, it would be great if you could modify dimensionSchema in an integration test too, so that this feature is tested end to end.
@@ -109,13 +109,13 @@
   private static final Interval COMPACTION_INTERVAL = Intervals.of("2017-01-01/2017-06-01");
   private static final Map<Interval, DimensionSchema> MIXED_TYPE_COLUMN_MAP = ImmutableMap.of(
       Intervals.of("2017-01-01/2017-02-01"),
-      new StringDimensionSchema(MIXED_TYPE_COLUMN, null),
+      new StringDimensionSchema(MIXED_TYPE_COLUMN, null, null),
Minor nit: StringDimensionSchema has a constructor that takes just the name; we can use that here.
Sure, I changed these.
@@ -287,6 +287,7 @@ protected IncrementalIndex(
       if (dimSchema.getTypeName().equals(DimensionSchema.SPATIAL_TYPE_NAME)) {
         capabilities.setHasSpatialIndexes(true);
I noticed above that the NewSpatialDimensionSchema constructor passes true for hasBitmapIndexes, so we probably need to call capabilities.setBitmapIndex(true) here.
Moved this outside the if/else.
@@ -165,6 +167,12 @@ public SerializerBuilder withBitmapSerdeFactory(BitmapSerdeFactory bitmapSerdeFa

     public SerializerBuilder withBitmapIndex(GenericIndexedWriter<ImmutableBitmap> bitmapIndexWriter)
Nit: add a @Nullable annotation for bitmapIndexWriter.
Added
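To illustrate why the parameter is nullable, here is a simplified sketch of how a builder could tie the NO_BITMAP_INDEX flag to whether a writer was supplied. Everything in it is an assumption made for illustration: SketchSerializerBuilder is a hypothetical class, Object stands in for GenericIndexedWriter<ImmutableBitmap>, and the bit position of the flag is made up. It is not the code from the patch.

```java
// Hypothetical, simplified stand-in for the SerializerBuilder discussed above.
// Object stands in for GenericIndexedWriter<ImmutableBitmap>, and the bit
// position of NO_BITMAP_INDEX is an assumption.
final class SketchSerializerBuilder
{
  static final int NO_BITMAP_INDEX_MASK = 1 << 2;

  // Start from "no bitmap index" until a writer is actually supplied.
  static final int STARTING_FLAGS = NO_BITMAP_INDEX_MASK;

  private int flags = STARTING_FLAGS;
  private Object bitmapIndexWriter = null;

  // A null writer means "do not write a bitmap index"; in real code the
  // parameter would carry a @Nullable annotation.
  SketchSerializerBuilder withBitmapIndex(Object writer)
  {
    if (writer == null) {
      flags |= NO_BITMAP_INDEX_MASK;
    } else {
      flags &= ~NO_BITMAP_INDEX_MASK;
    }
    this.bitmapIndexWriter = writer;
    return this;
  }

  boolean writesBitmapIndex()
  {
    return (flags & NO_BITMAP_INDEX_MASK) == 0;
  }

  public static void main(String[] args)
  {
    System.out.println(new SketchSerializerBuilder().writesBitmapIndex()); // false
    System.out.println(
        new SketchSerializerBuilder().withBitmapIndex(new Object()).writesBitmapIndex()
    ); // true
    System.out.println(
        new SketchSerializerBuilder().withBitmapIndex(null).writesBitmapIndex()
    ); // false
  }
}
```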
if (Feature.MULTI_VALUE.isSet(flags)) {
  return VSizeColumnarMultiInts.readFromByteBuffer(buffer);
} else {
  throw new IAE("Unrecognized multi-value flag[%d] for version[%s]", flags, version);
Is the MULTI_VALUE_V3 handling missing here?
This is missing on purpose; from the code it looks like MULTI_VALUE_V3 is only supported for compressed formats.
Please add a comment to the code.
Will add a comment.
private static boolean mustWriteFlags(final int flags)
{
  return flags != NO_FLAGS && flags != Feature.MULTI_VALUE.getMask();
What about the MULTI_VALUE_V3 mask?
MULTI_VALUE_V3 must be written, so that's why it's not listed here.
Please add a comment to the code.
Added a comment.
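As a worked illustration of the check discussed above, the sketch below evaluates mustWriteFlags for each flag combination mentioned in this thread. The FlagHeaderSketch class and the bit positions are assumptions. The comment on the method only restates what is said in this conversation (MULTI_VALUE_V3 must be written); it is not the wording of the comment added in the patch.

```java
// Hypothetical sketch; the bit positions are assumptions for illustration.
final class FlagHeaderSketch
{
  static final int NO_FLAGS = 0;
  static final int MULTI_VALUE = 1;
  static final int MULTI_VALUE_V3 = 1 << 1;
  static final int NO_BITMAP_INDEX = 1 << 2;

  private FlagHeaderSketch() {}

  // Only the "no flags" case and the "MULTI_VALUE alone" case may skip the
  // flags field. MULTI_VALUE_V3 is deliberately not exempted because it must
  // be written; any combination that includes NO_BITMAP_INDEX is written too.
  static boolean mustWriteFlags(final int flags)
  {
    return flags != NO_FLAGS && flags != MULTI_VALUE;
  }

  public static void main(String[] args)
  {
    System.out.println(mustWriteFlags(NO_FLAGS));                      // false
    System.out.println(mustWriteFlags(MULTI_VALUE));                   // false
    System.out.println(mustWriteFlags(MULTI_VALUE_V3));                // true
    System.out.println(mustWriteFlags(NO_BITMAP_INDEX));               // true
    System.out.println(mustWriteFlags(MULTI_VALUE | NO_BITMAP_INDEX)); // true
  }
}
```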
LGTM after addressing the comments of @nishantmonu51.
LGTM. I left a minor comment about adding a small comment to the code; after that, 👍
@nishantmonu51 @leventov updated per your comments & added an integration test.
Thanks, @gianm.
Can save space for columns where bitmap indexes are pointless (like
free-form text).
Requires adding a new flag and version code to dictionary encoded string
columns. So, segments written with this option will not be backwards
compatible with older versions of Druid.