Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve bitmap vector offset to report contiguous groups #11039

Merged
merged 3 commits into from
Apr 13, 2021

Conversation

clintropolis
Copy link
Member

@clintropolis clintropolis commented Mar 27, 2021

Description

This PR is a follow-up to #11004, which improves BitmapVectorOffset to report the current set of offsets as "contiguous" when they are, in order to take advantage of any optimizations available when processing contiguous vectorized gets.

In #11004, I noticed the improvements to vectorized decoding of long values in my column scan benchmarks only seemed to impact a full scan. Looking closer, it was due to how the benchmark uses a BitmapVectorOffset to simulate filtering rows, which prior to this pr would never report the current set of offsets as contiguous, even if it was. Since #11004 primarily improved performance of contiguous vectorized gets, when using a BitmapVectorOffset we were missing out on these improvements entirely.

To help measure the improvements of allowing contiguous blocks to be processed with the contiguous get methods, I improved the benchmarks added in #11004 to now have several different filter "distributions" to simulate different filtering patterns, instead of previously only using a random distribution. These include:

  • random: BitmapVectorOffset, bitmap is set randomly for desired number of rows
  • contiguous-start: NoFilterVectorOffset from starting offset 0 to desired number of rows
  • contiguous-end: opposite end of column as 'contiguous-start'
  • contiguous-bitmap-start: BitmapVectorOffset, bitmap set from 0 to desired number of rows
  • contiguous-bitmap-end: opposite end of column as contiguous-bitmap-start
  • chunky-1000: BitmapVectorOffset, position chosen randomly and up to 1000 contiguous values set until chosen number of rows is set
  • chunky-10000: 'chunky-1000' but with chunks of 10k contiguous values

The results of this change is a pretty decent improvement to vectorized column read speed when there are contiguous vectors to read:

before-after-bitmaps

before:

Benchmark                                                                        (distribution)  (encoding)     (filterDistribution)  (filteredRowCountPercentage)   (rows)  (zeroProbability)  Mode  Cnt        Score      Error  Units
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.1  5000000                0.0  avgt   10     3096.136 ±   20.732  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.25  5000000                0.0  avgt   10     8142.492 ±  124.247  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.5  5000000                0.0  avgt   10    16204.635 ±  101.374  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.6  5000000                0.0  avgt   10    19377.004 ±  109.489  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.75  5000000                0.0  avgt   10    25621.704 ± 1544.863  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.8  5000000                0.0  avgt   10    26393.053 ±  190.878  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.9  5000000                0.0  avgt   10    29200.991 ±  493.996  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.95  5000000                0.0  avgt   10    30702.206 ±  198.864  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           1.0  5000000                0.0  avgt   10    17997.131 ±  127.793  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.1  5000000                0.0  avgt   10     3881.268 ±   19.018  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.25  5000000                0.0  avgt   10     8097.172 ±  284.560  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.5  5000000                0.0  avgt   10    14498.809 ±   77.822  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.6  5000000                0.0  avgt   10    18387.397 ±   96.425  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.75  5000000                0.0  avgt   10    21986.742 ±   89.077  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.8  5000000                0.0  avgt   10    23239.459 ±  290.921  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.9  5000000                0.0  avgt   10    25199.260 ±  162.413  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.95  5000000                0.0  avgt   10    27449.210 ± 1688.145  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           1.0  5000000                0.0  avgt   10    17981.037 ±   63.957  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.1  5000000                0.0  avgt   10     5120.464 ±  193.518  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.25  5000000                0.0  avgt   10    10236.347 ±  307.603  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.5  5000000                0.0  avgt   10    19979.496 ±  566.214  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.6  5000000                0.0  avgt   10    23576.997 ±  438.873  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.75  5000000                0.0  avgt   10    29285.756 ±  986.723  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.8  5000000                0.0  avgt   10    29974.530 ± 1441.189  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.9  5000000                0.0  avgt   10    33905.473 ± 2021.488  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.95  5000000                0.0  avgt   10    33024.158 ±  389.507  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           1.0  5000000                0.0  avgt   10    18252.778 ±  211.648  us/op

after:

Benchmark                                                                        (distribution)  (encoding)     (filterDistribution)  (filteredRowCountPercentage)   (rows)  (zeroProbability)  Mode  Cnt        Score     Error  Units
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.1  5000000                0.0  avgt   10     2396.812 ±   59.423  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.25  5000000                0.0  avgt   10     5941.108 ±  206.432  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.5  5000000                0.0  avgt   10    10904.502 ±  248.979  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.6  5000000                0.0  avgt   10    13237.317 ±  175.817  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.75  5000000                0.0  avgt   10    17216.386 ±  934.780  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.8  5000000                0.0  avgt   10    17088.263 ±  116.899  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           0.9  5000000                0.0  avgt   10    20996.326 ±   81.832  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                          0.95  5000000                0.0  avgt   10    20735.792 ±   71.165  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto  contiguous-bitmap-start                           1.0  5000000                0.0  avgt   10    17992.744 ±   87.609  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.1  5000000                0.0  avgt   10     3226.460 ±   16.228  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.25  5000000                0.0  avgt   10     6771.234 ±  127.713  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.5  5000000                0.0  avgt   10    12587.962 ±   50.479  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.6  5000000                0.0  avgt   10    15874.411 ±   72.437  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.75  5000000                0.0  avgt   10    18694.264 ±   34.897  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.8  5000000                0.0  avgt   10    20658.982 ±  214.946  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           0.9  5000000                0.0  avgt   10    21208.258 ±   73.285  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                          0.95  5000000                0.0  avgt   10    23644.473 ±  968.966  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto              chunky-1000                           1.0  5000000                0.0  avgt   10    17980.515 ±   95.417  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.1  5000000                0.0  avgt   10     5174.737 ±  237.912  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.25  5000000                0.0  avgt   10     9280.395 ±  334.380  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.5  5000000                0.0  avgt   10    18088.591 ±  467.872  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.6  5000000                0.0  avgt   10    21431.135 ±  217.091  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.75  5000000                0.0  avgt   10    25727.534 ± 1334.654  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.8  5000000                0.0  avgt   10    27875.558 ±  209.684  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           0.9  5000000                0.0  avgt   10    28871.937 ±  203.491  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                          0.95  5000000                0.0  avgt   10    32956.537 ±  524.990  us/op
ColumnarLongsSelectRowsFromGeneratorBenchmark.selectRowsVectorized                   uniform-12    lz4-auto                   random                           1.0  5000000                0.0  avgt   10    18136.036 ±  151.741  us/op

This PR has:

  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@@ -85,6 +87,14 @@ public void advance()

currentVectorSize = to;
}

if (currentVectorSize > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do this branch if currentVectorSize == 1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could, and I considered it, but ultimately I decided that 1 didn't really "feel" like it met the threshold for me to consider it really contiguous. I also figured it didn't really make a lot of difference between which read method it would end up in because there is only 1 offset either way. But I don't have strong opinions on it though since either seems valid, do you think it logically should be considered contiguous?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it doesn't really matter. I thought I might have been missing some reason why it wasn't ok to do it for the currentVectorSize == 1 case.

@@ -114,6 +124,9 @@ public int getCurrentVectorSize()
@Override
public int getStartOffset()
{
if (isContiguous) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the interface, getOffsets is supposed to throw an exception if isContiguous is true, so we should check it there too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added, and tests

@gianm gianm merged commit 08d3786 into apache:master Apr 13, 2021
@clintropolis clintropolis deleted the bitmap-vector-offset-contiguous branch April 13, 2021 18:49
@clintropolis clintropolis added this to the 0.22.0 milestone Aug 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants