Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for disabling bitmap indexes. #5402

Merged
merged 6 commits into from
Mar 1, 2018

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Feb 20, 2018

Can save space for columns where bitmap indexes are pointless (like
free-form text).

Requires adding a new flag and version code to dictionary encoded string
columns. So, segments written with this option will not be backwards
compatible with older versions of Druid.

Can save space for columns where bitmap indexes are pointless (like
free-form text).
{
"type": "string",
"name": "comment",
"bitmapIndex": false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "createBitmapIndex"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not. Changed it.

@@ -143,7 +144,7 @@ public static SerializerBuilder serializerBuilder()
public static class SerializerBuilder
{
private VERSION version = null;
private int flags = NO_FLAGS;
private int flags = Feature.NO_BITMAP_INDEX.getMask();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested to make a method maskOf(Collection<Feature>), and DEFAULT_FEATURES = Collections.singletonList(NO_BITMAP_INDEX)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a STARTING_FLAGS constant but I didn't do the maskOf function- it doesn't seem useful enough to exist at this point.

@gianm
Copy link
Contributor Author

gianm commented Feb 20, 2018

@leventov thanks for review- updated the patch.

);
if (!Feature.NO_BITMAP_INDEX.isSet(rFlags)) {
GenericIndexed<ImmutableBitmap> rBitmaps = GenericIndexed.read(
buffer, bitmapSerdeFactory.getObjectStrategy(), builder.getFileMapper()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One arg at one line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it.

Copy link
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left few comments,
Also, would be great if you can also modify dimensionSchema in some integration test too so that this feature is tested end to end.

@@ -109,13 +109,13 @@
private static final Interval COMPACTION_INTERVAL = Intervals.of("2017-01-01/2017-06-01");
private static final Map<Interval, DimensionSchema> MIXED_TYPE_COLUMN_MAP = ImmutableMap.of(
Intervals.of("2017-01-01/2017-02-01"),
new StringDimensionSchema(MIXED_TYPE_COLUMN, null),
new StringDimensionSchema(MIXED_TYPE_COLUMN, null, null),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit - StringDimensionSchema has a constr for just name, we can use that here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I changed these.

@@ -287,6 +287,7 @@ protected IncrementalIndex(
if (dimSchema.getTypeName().equals(DimensionSchema.SPATIAL_TYPE_NAME)) {
capabilities.setHasSpatialIndexes(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed above in NewSpatialDimensionSchema the const passes true for hasBitmapIndexes,
probably need to set capabilities.setBitmapIndex(true) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this outside the if/else.

@@ -165,6 +167,12 @@ public SerializerBuilder withBitmapSerdeFactory(BitmapSerdeFactory bitmapSerdeFa

public SerializerBuilder withBitmapIndex(GenericIndexedWriter<ImmutableBitmap> bitmapIndexWriter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : @nullable annotation for bitmapIndexWriter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

if (Feature.MULTI_VALUE.isSet(flags)) {
return VSizeColumnarMultiInts.readFromByteBuffer(buffer);
} else {
throw new IAE("Unrecognized multi-value flag[%d] for version[%s]", flags, version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing multi value V3 handling ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing on purpose; from the code it looks like MULTI_VALUE_V3 is only supported for compressed formats.

Copy link
Member

@nishantmonu51 nishantmonu51 Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment to code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a comment


private static boolean mustWriteFlags(final int flags)
{
return flags != NO_FLAGS && flags != Feature.MULTI_VALUE.getMask();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi value V3 mask ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MUTLI_VALUE_V3 must be written so that's why it's not listed here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment to code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

@jihoonson
Copy link
Contributor

LGTM after addressing the comments of @nishantmonu51.

Copy link
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left minor comment about adding small comment to code, after that 👍

@gianm
Copy link
Contributor Author

gianm commented Feb 27, 2018

@nishantmonu51 @leventov updated per your comments & added an integration test.

@nishantmonu51
Copy link
Member

thanks @gianm

@fjy fjy added this to the 0.13.0 milestone Feb 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants