
Adds long compression methods #3148

Merged
merged 16 commits into from
Aug 30, 2016

Conversation

acslk
Contributor

@acslk acslk commented Jun 14, 2016

This PR introduces a few new compression strategies for long values that are not block based (described in #3147). The enum CompressionFormat is added to control the compression procedure, which can delegate to GenericIndexed for block-based compression.

The compression formats added include delta compression, table compression, and a new uncompressed format. Delta compression finds the smallest value in the segment and stores all values as offsets from it. Table compression maps unique values to ids and stores the ids. Offsets and ids are stored using the smallest number of bits required for the maximum value; for example, if the maximum offset is 200, then all values are stored using 8 bits. The new uncompressed format is the uncompressed format without the header for the position of each block and the empty bytes at the beginning of each block.
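
The delta idea above can be sketched as follows. This is a minimal illustration, not the actual Druid implementation; class and method names are made up for the example:

```java
import java.util.Arrays;

public class DeltaSketch
{
  // Number of bits needed to represent maxOffset (at least 1).
  static int bitsRequired(long maxOffset)
  {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(maxOffset));
  }

  // Scan a segment once and return {min, bitsPerValue}: each value is then
  // stored as (value - min) using bitsPerValue bits.
  static long[] analyze(long[] values)
  {
    long min = Arrays.stream(values).min().getAsLong();
    long max = Arrays.stream(values).max().getAsLong();
    return new long[]{min, bitsRequired(max - min)};
  }

  public static void main(String[] args)
  {
    long[] segment = {1000, 1050, 1200, 1007};
    long[] meta = analyze(segment);
    // The maximum offset is 200, which fits in 8 bits, as in the example above.
    System.out.println("min=" + meta[0] + " bitsPerValue=" + meta[1]);
  }
}
```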

Update

After some discussion in #3147, it was decided that compression strategy (which compresses and decompresses blocks of data without knowing the data format) and encoding format (which encodes data using knowledge of its attributes, such as the size of each element) should be separated. Users choose the combination of compression and encoding to use.

The existing compressions, including LZF, LZ4, and Uncompressed, all count as compression strategies. In addition, this PR adds the NONE strategy, which indicates that no compression should be used and the values should be stored contiguously instead of in blocks, since this allows faster random access. The LZF and Uncompressed strategies are still needed for backward compatibility, but they shouldn't be used for new data.
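
The random-access advantage of NONE can be sketched simply: with no blocks, a value's position in the buffer is a single multiplication, and no block has to be located and decompressed first. A minimal illustration (class name is made up; the PR's actual layout classes differ):

```java
import java.nio.ByteBuffer;

public class EntireLayoutSketch
{
  private final ByteBuffer buffer;

  EntireLayoutSketch(long[] values)
  {
    // All values stored contiguously, no block headers or padding.
    buffer = ByteBuffer.allocate(values.length * Long.BYTES);
    for (long v : values) {
      buffer.putLong(v);
    }
  }

  long get(int index)
  {
    // Direct seek: no block lookup, no decompression on the read path.
    return buffer.getLong(index * Long.BYTES);
  }

  public static void main(String[] args)
  {
    EntireLayoutSketch layout = new EntireLayoutSketch(new long[]{10, 20, 30});
    System.out.println(layout.get(2));
  }
}
```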

The encoding formats introduced in this PR are Delta, Table, and Longs. Delta and Table are described above, while Longs simply indicates that values are stored as 8-byte longs. Segments created prior to this PR are viewed as the compression strategy used + Longs encoding. The strategies for choosing the encoding are auto and longs. The auto strategy scans the data for its cardinality and maximum offset and chooses the most appropriate format; the longs strategy always chooses the Longs format.
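
The auto selection can be sketched like this. The class name, threshold, and tie-breaking rule here are illustrative assumptions, not the PR's actual logic; the point is the single scan for cardinality and maximum offset:

```java
import java.util.HashSet;
import java.util.Set;

public class AutoEncodingSketch
{
  enum Encoding { TABLE, DELTA, LONGS }

  // Illustrative table-size cap, assumed rather than taken from the PR.
  static final int MAX_TABLE_SIZE = 256;

  static Encoding choose(long[] values)
  {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    Set<Long> unique = new HashSet<>();
    for (long v : values) {
      min = Math.min(min, v);
      max = Math.max(max, v);
      unique.add(v);
    }
    // Bits per value for each candidate (overflow of max - min ignored here).
    int deltaBits = 64 - Long.numberOfLeadingZeros(Math.max(1, max - min));
    int tableBits = 32 - Integer.numberOfLeadingZeros(Math.max(1, unique.size() - 1));
    if (unique.size() <= MAX_TABLE_SIZE && tableBits < deltaBits) {
      return Encoding.TABLE;  // few distinct values: store small table ids
    }
    if (deltaBits < 64) {
      return Encoding.DELTA;  // offsets fit in fewer bits than raw longs
    }
    return Encoding.LONGS;    // nothing helps: store raw 8-byte longs
  }

  public static void main(String[] args)
  {
    System.out.println(choose(new long[]{5, 5, 7, 5, 7}));  // low cardinality
    long[] range = new long[5000];
    for (int i = 0; i < range.length; i++) {
      range[i] = 1000 + i;  // many distinct values in a small range
    }
    System.out.println(choose(range));
  }
}
```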

Below is a benchmark of all combinations of compression strategies and encoding strategies using some generated data sets.

Data sets include:

  • enumerated (values drawn from only a few choices, with probability heavily skewed toward one)
  • zipfLow (zipf distribution between 0 and 1000 with a low exponent)
  • zipfHigh (zipf distribution between 0 and 1000 with a high exponent)
  • uniform (uniform distribution between 0 and 1000)
  • sequential (sequential values starting from a timestamp)

Generated file sizes in KB (5,000,000 values)

zipfLow-LZ4-DELTA: 6870
zipfLow-LZ4-LONGS: 9495
zipfLow-NONE-DELTA: 7324
zipfLow-NONE-LONGS: 39062
uniform-LZ4-DELTA: 7354
uniform-LZ4-LONGS: 11184
uniform-NONE-DELTA: 7324
uniform-NONE-LONGS: 39062
sequential-LZ4-DELTA: 14706
sequential-LZ4-LONGS: 19542
sequential-NONE-DELTA: 14648
sequential-NONE-LONGS: 39062
enumerate-LZ4-DELTA: 442
enumerate-LZ4-LONGS: 711
enumerate-NONE-DELTA: 2441
enumerate-NONE-LONGS: 39062
zipfHigh-LZ4-DELTA: 1201
zipfHigh-LZ4-LONGS: 1784
zipfHigh-NONE-DELTA: 4884
zipfHigh-NONE-LONGS: 39062

Benchmarks:
readContinuous reads every value, while readSkipping skips randomly between 0 and 2000 values on each read
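
The two access patterns can be sketched as follows. This is an illustration of the patterns, not the actual JMH benchmark code; names are made up, and I assume a skip of 0 means simply reading the next value:

```java
import java.util.Random;
import java.util.function.LongUnaryOperator;

public class AccessPatternSketch
{
  // Touch every index in order, as readContinuous does.
  static long readContinuous(int totalSize, LongUnaryOperator get)
  {
    long sum = 0;
    for (int i = 0; i < totalSize; i++) {
      sum += get.applyAsLong(i);
    }
    return sum;
  }

  // Skip a random 0-2000 values before each read, as readSkipping does.
  static long readSkipping(int totalSize, LongUnaryOperator get, Random rng)
  {
    long sum = 0;
    for (int i = 0; i < totalSize; i += rng.nextInt(2001) + 1) {
      sum += get.applyAsLong(i);
    }
    return sum;
  }

  public static void main(String[] args)
  {
    long[] column = new long[5000];
    LongUnaryOperator get = i -> column[(int) i];
    System.out.println(readContinuous(column.length, get));
    System.out.println(readSkipping(column.length, get, new Random(0)));
  }
}
```

For a block-based layout the skipping pattern still forces whole blocks to be decompressed for each value touched, which is why the two patterns separate decompression cost from per-value read cost in the numbers below.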

ReadContinuous
 enumerate      auto   lz4  30.587 ± 0.493  ms/op
 enumerate      auto  none   7.093 ± 0.116  ms/op
 enumerate     longs   lz4  19.731 ± 0.849  ms/op
 enumerate     longs  none   3.107 ± 0.060  ms/op
   zipfLow      auto   lz4  36.530 ± 0.695  ms/op
   zipfLow      auto  none  11.070 ± 0.209  ms/op
   zipfLow     longs   lz4  33.192 ± 0.652  ms/op
   zipfLow     longs  none   3.127 ± 0.076  ms/op
  zipfHigh      auto   lz4  28.517 ± 0.804  ms/op
  zipfHigh      auto  none   2.808 ± 0.072  ms/op
  zipfHigh     longs   lz4  21.903 ± 0.508  ms/op
  zipfHigh     longs  none   3.033 ± 0.040  ms/op
sequential      auto   lz4  29.984 ± 0.807  ms/op
sequential      auto  none   6.771 ± 0.090  ms/op
sequential     longs   lz4  54.404 ± 1.106  ms/op
sequential     longs  none   3.172 ± 0.088  ms/op
   uniform      auto   lz4  32.214 ± 0.803  ms/op
   uniform      auto  none  10.702 ± 0.145  ms/op
   uniform     longs   lz4  31.917 ± 0.663  ms/op
   uniform     longs  none   3.055 ± 0.072  ms/op

ReadSkipping
 enumerate      auto   lz4   1.533 ± 0.075  ms/op
 enumerate      auto  none   0.083 ± 0.002  ms/op
 enumerate     longs   lz4   5.862 ± 0.471  ms/op
 enumerate     longs  none   0.285 ± 0.003  ms/op
   zipfLow      auto   lz4   5.017 ± 0.121  ms/op
   zipfLow      auto  none   0.172 ± 0.003  ms/op
   zipfLow     longs   lz4  19.103 ± 1.118  ms/op
   zipfLow     longs  none   0.283 ± 0.002  ms/op
  zipfHigh      auto   lz4   3.060 ± 0.164  ms/op
  zipfHigh      auto  none   0.109 ± 0.002  ms/op
  zipfHigh     longs   lz4   8.791 ± 1.083  ms/op
  zipfHigh     longs  none   0.280 ± 0.002  ms/op
sequential      auto   lz4   1.379 ± 0.304  ms/op
sequential      auto  none   0.253 ± 0.003  ms/op
sequential     longs   lz4  40.777 ± 0.681  ms/op
sequential     longs  none   0.281 ± 0.003  ms/op
   uniform      auto   lz4   0.822 ± 0.289  ms/op
   uniform      auto  none   0.171 ± 0.003  ms/op
   uniform     longs   lz4  18.799 ± 1.704  ms/op
   uniform     longs  none   0.286 ± 0.003  ms/op

Benchmark on master; lz4 corresponds to longs + lz4. There is no major performance difference.

ReadContinuous
 enumerate     lz4  19.714 ± 0.455  ms/op
   zipfLow     lz4  33.191 ± 0.640  ms/op
  zipfHigh     lz4  21.761 ± 0.488  ms/op
sequential     lz4  54.163 ± 1.077  ms/op
   uniform     lz4  31.790 ± 0.871  ms/op

ReadSkipping
 enumerate     lz4   5.741 ± 0.346  ms/op
   zipfLow     lz4  18.736 ± 0.425  ms/op
  zipfHigh     lz4   8.739 ± 0.753  ms/op
sequential     lz4  41.081 ± 1.034  ms/op
   uniform     lz4  18.322 ± 0.325  ms/op

@acslk acslk changed the title Additional long compression options Adds long compression options Jun 14, 2016
@acslk acslk changed the title Adds long compression options Adds long compression methods Jun 14, 2016
@fjy fjy added the Feature label Jun 14, 2016
@fjy fjy added this to the 0.9.2 milestone Jun 14, 2016
@xvrl xvrl added the Discuss label Jun 15, 2016
private final ByteOrder order;
private final CompressionFactory.LongEncodingFormatReader baseReader;

public BlockLayoutIndexedLongsSupplier(int totalSize, int sizePer, ByteBuffer fromBuffer, ByteOrder order,
Contributor


Add javadoc here please

@@ -743,12 +745,13 @@ private LongColumnSerializer setupTimeWriter(final IOPeon ioPeon) throws IOExcep
{
ArrayList<GenericColumnSerializer> metWriters = Lists.newArrayListWithCapacity(mergedMetrics.size());
final CompressedObjectStrategy.CompressionStrategy metCompression = indexSpec.getMetricCompressionStrategy();
final CompressionFactory.LongEncodingFormat metEncoding = indexSpec.getMetricLongEncodingFormat();
Contributor


This is more like a longEncoding than a metEncoding (keep in mind we will have long dimensions soon; also, __time is a long).

serializer.close();
}
}
}
Contributor


Newline @ EOF

@gianm
Contributor

gianm commented Aug 12, 2016

@acslk Okay, whew, finally finished reading through :)

I wrote some comments on specific areas but the general structure looks good to me. The benchmarks and storage numbers look good too!

Other than that, like I said in the previous comment, I do think it makes sense to include "none" compression for floats, minus the probably unnecessary floatEncoding stuff.

Anyone else have a chance to take a look at this now that it's been edited a bunch? @xvrl @jon-wei

@jon-wei
Contributor

jon-wei commented Aug 12, 2016

@acslk @gianm reviewing again

@acslk
Contributor Author

acslk commented Aug 12, 2016

Added float support for the "none" compression strategy. It uses the same BlockLayout/EntireLayout pattern as long compression, but does not have any encoding strategy/format.

@xvrl
Member

xvrl commented Aug 12, 2016

@acslk I'm a bit puzzled by some of the benchmark numbers. longs + lz4 really stands out in the readSkipping benchmarks compared to delta + lz4, while the difference is much less pronounced in readContinuous. I wonder if it would make sense to understand what's causing this, especially given that the compressed sizes are not that different.

@gianm As far as making delta+lz4 the default in the future, it still seems to be up to 3x slower compared to longs + lz4 for some full column scans, so we might want to experiment with real-world data before making that decision.

Overall I'm on board with this PR. I haven't had a chance to read the code much yet, but the approach in the description seems reasonable.

*/
@JsonCreator
public IndexSpec(
@JsonProperty("bitmap") BitmapSerdeFactory bitmapSerdeFactory,
@JsonProperty("dimensionCompression") String dimensionCompression,
@JsonProperty("metricCompression") String metricCompression
@JsonProperty("metricCompression") String metricCompression,
Contributor


I think the implementation of IndexSpec could be simplified by using the Enums directly in the constructor, instead of Strings, jackson can serialize/deserialize those to/from strings.

SegmentMetadataQuery.AnalysisType is an example using @JsonValue and @JsonCreator for that purpose

Contributor Author


it seems that jackson cannot convert the input string to uppercase for the enum, so serialize/deserialize need to take a string and apply the conversion.

Contributor Author


Never mind, added toString and fromString for the enums and replaced the strings with enum values. There seems to be special handling for the Uncompressed strategy for dimension compression, where getStrategy returns null for uncompressed. The caller treats the null as an indicator not to use compressedIndexedInt; I think it's doing something similar to the none strategy in this PR, for performance reasons. I removed the null return value and did the special handling for the UNCOMPRESSED strategy instead; this syntax change should not affect anything.
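
The toString/fromString pattern mentioned here can be sketched as below. The enum name is illustrative (not the PR's actual class), and with jackson on the classpath the two methods would carry @JsonValue and @JsonCreator respectively, in the style of SegmentMetadataQuery.AnalysisType:

```java
public enum CompressionStrategySketch
{
  LZ4, LZF, UNCOMPRESSED, NONE;

  // Would carry @JsonValue: the enum serializes as its lowercase name.
  @Override
  public String toString()
  {
    return name().toLowerCase();
  }

  // Would carry @JsonCreator: deserialization accepts either case, since
  // jackson does not uppercase the input string on its own.
  public static CompressionStrategySketch fromString(String name)
  {
    return valueOf(name.toUpperCase());
  }
}
```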

@jon-wei
Contributor

jon-wei commented Aug 16, 2016

done reviewing the new changes, had a few comments but looks good overall

@acslk
Contributor Author

acslk commented Aug 23, 2016

@xvrl After some investigation, I believe the discrepancy comes from the performance of different components. For the block-based compression layout, the total time for reading a column is composed of: copying the byte buffer block + decompression + reading values from the byte buffer.

longs + lz4 takes more time on the decompression part, while delta + lz4 takes more time on the read-value part. For readContinuous, the two differences roughly cancel out, making the performance similar. For readSkipping, the read-value part is basically gone with so few reads, while the decompression part remains about the same, making longs + lz4 take much longer.

I did some benchmarks using the uncompressed strategy, which goes through the same copy-block + read-value process as lz4, with the only difference being no decompression step. The difference in time between lz4 and uncompressed should therefore be the decompression time. From the benchmarks below, the readContinuous and readSkipping decompression times match.

Note that the delta strategy has been renamed to auto, since the strategy can use delta, table, or longs depending on data cardinality and maximum offset.

enumerate
readContinuous
 auto           lz4  10  29.033 ± 1.111  ms/op
 auto  uncompressed  10  28.591 ± 0.911  ms/op
longs           lz4  10  20.878 ± 1.856  ms/op
longs  uncompressed  10  17.101 ± 0.358  ms/op
readSkipping
 auto           lz4  10   1.459 ± 0.069  ms/op
 auto  uncompressed  10   0.236 ± 0.123  ms/op
longs           lz4  10   5.889 ± 2.222  ms/op
longs  uncompressed  10   3.327 ± 1.929  ms/op

zipfLow
readContinuous
 auto           lz4  10  35.505 ± 0.744  ms/op
 auto  uncompressed  10  31.311 ± 1.022  ms/op
longs           lz4  10  33.904 ± 2.328  ms/op
longs  uncompressed  10  17.404 ± 0.690  ms/op
readSkipping
 auto           lz4  10   4.941 ± 0.360  ms/op
 auto  uncompressed  10   0.522 ± 0.141  ms/op
longs           lz4  10  18.387 ± 1.321  ms/op
longs  uncompressed  10   3.390 ± 2.403  ms/op


zipfHigh
readContinuous
 auto           lz4  10  26.399 ± 1.428  ms/op
 auto  uncompressed  10  23.593 ± 0.667  ms/op
longs           lz4  10  23.355 ± 1.868  ms/op
longs  uncompressed  10  17.335 ± 0.796  ms/op
readSkipping
 auto           lz4  10   2.914 ± 0.259  ms/op
 auto  uncompressed  10   0.383 ± 0.142  ms/op
longs           lz4  10   8.015 ± 0.590  ms/op
longs  uncompressed  10   3.406 ± 2.738  ms/op

sequential
readContinuous
 auto           lz4  10  29.491 ± 0.951  ms/op
 auto  uncompressed  10  28.636 ± 0.351  ms/op
longs           lz4  10  54.110 ± 0.894  ms/op
longs  uncompressed  10  17.035 ± 0.249  ms/op
readSkipping
 auto           lz4  10   1.335 ± 0.735  ms/op
 auto  uncompressed  10   1.079 ± 0.282  ms/op
longs           lz4  10  39.549 ± 0.544  ms/op
longs  uncompressed  10   3.472 ± 2.714  ms/op

uniform
readContinuous
 auto           lz4  10  31.522 ± 1.021  ms/op
 auto  uncompressed  10  33.260 ± 1.238  ms/op
longs           lz4  10  32.292 ± 1.532  ms/op
longs  uncompressed  10  17.211 ± 0.955  ms/op
readSkipping
 auto           lz4  10   0.746 ± 0.518  ms/op
 auto  uncompressed  10   0.528 ± 0.176  ms/op
longs           lz4  10  18.018 ± 1.105  ms/op
longs  uncompressed  10   3.302 ± 2.018  ms/op

I'm not sure why lz4 decompression is faster for auto compared to longs even though the compressed sizes are similar. Perhaps less work is done for auto decompression since the decompressed size is smaller.

// Run LongCompressionBenchmarkFileGenerator to generate the required files before running this benchmark

@State(Scope.Benchmark)
@Fork(value = 1)
Contributor


Let's make this:

@Fork(jvmArgsPrepend = "-server", value = 1)

so that the JVM runs in "server" mode:
http://www.oracle.com/technetwork/java/whitepaper-135217.html#2

@gianm
Contributor

gianm commented Aug 26, 2016

@xvrl yes, that's a good point. It looks like @acslk's findings indicate that auto+lz4 generally decompresses faster than longs+lz4, but individual values are then read out more slowly (which explains why full scans are hit-or-miss while filtered scans generally go in auto+lz4's favor).

I just read over the more recent changes (the float compression stuff and the introduction of "auto" encoding strategy) and they look good to me. We've also been running this on our test cluster for a couple of weeks now and it has been working out there.

👍 from me, any one else want to take one more look? @jon-wei @xvrl

@jon-wei
Contributor

jon-wei commented Aug 30, 2016

LGTM, 👍

@gianm gianm removed the Discuss label Aug 30, 2016
@gianm
Contributor

gianm commented Aug 30, 2016

Looks like we have a few +1 and no objections, so I'll merge this as discussed in the dev sync today. thanks everyone!

@gianm gianm merged commit c4e8440 into apache:master Aug 30, 2016
{
return ByteStreams.join(
return ByteSource.concat(
Contributor


This broke batch indexing that uses Guava pre-15

@gianm gianm mentioned this pull request Sep 29, 2016
private final String metricCompression;
private final CompressedObjectStrategy.CompressionStrategy dimensionCompression;
private final CompressedObjectStrategy.CompressionStrategy metricCompression;
private final CompressionFactory.LongEncodingStrategy longEncoding;
Member


Shouldn't longEncoding be checked in IndexSpec.equals() and hashCode()?

Contributor


Yes, it should.
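
A minimal sketch of the fix being discussed, with longEncoding included alongside the other fields (the class is trimmed to the fields in the diff above, with Strings standing in for the actual enum types; the real change landed separately):

```java
import java.util.Objects;

public class IndexSpecSketch
{
  private final String dimensionCompression;
  private final String metricCompression;
  private final String longEncoding;

  IndexSpecSketch(String dimensionCompression, String metricCompression, String longEncoding)
  {
    this.dimensionCompression = dimensionCompression;
    this.metricCompression = metricCompression;
    this.longEncoding = longEncoding;
  }

  @Override
  public boolean equals(Object o)
  {
    if (this == o) {
      return true;
    }
    if (!(o instanceof IndexSpecSketch)) {
      return false;
    }
    IndexSpecSketch that = (IndexSpecSketch) o;
    return Objects.equals(dimensionCompression, that.dimensionCompression)
           && Objects.equals(metricCompression, that.metricCompression)
           && Objects.equals(longEncoding, that.longEncoding);  // the field that was missing
  }

  @Override
  public int hashCode()
  {
    return Objects.hash(dimensionCompression, metricCompression, longEncoding);
  }
}
```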

Member


Fixed that here: #4353
