ARROW-17305: [C++] Avoid spending time in popcount in BitmapAnd benchmark (#13794)

The CountSetBits (popcount) call inside the benchmark loop was artificially limiting the reported performance of BitmapAnd.

Before:
```
--------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0        1708 ns         1708 ns       408579 bytes_per_second=17.8726G/s
BenchmarkBitmapAnd/131072/0       6968 ns         6965 ns       102223 bytes_per_second=17.5262G/s
BenchmarkBitmapAnd/32768/1        3982 ns         3981 ns       175136 bytes_per_second=7.66574G/s
BenchmarkBitmapAnd/131072/1      15574 ns        15569 ns        44988 bytes_per_second=7.8404G/s
BenchmarkBitmapAnd/32768/2        3999 ns         3998 ns       175021 bytes_per_second=7.63248G/s
BenchmarkBitmapAnd/131072/2      15589 ns        15585 ns        44844 bytes_per_second=7.83234G/s
```

After:
```
--------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0         732 ns          732 ns       967465 bytes_per_second=41.6736G/s
BenchmarkBitmapAnd/131072/0       3105 ns         3105 ns       229726 bytes_per_second=39.3198G/s
BenchmarkBitmapAnd/32768/1        2913 ns         2913 ns       240233 bytes_per_second=10.4774G/s
BenchmarkBitmapAnd/131072/1      11528 ns        11526 ns        60865 bytes_per_second=10.5912G/s
BenchmarkBitmapAnd/32768/2        2924 ns         2924 ns       236873 bytes_per_second=10.4378G/s
BenchmarkBitmapAnd/131072/2      11552 ns        11550 ns        60619 bytes_per_second=10.5691G/s
```

(I didn't check, but the compiler here probably auto-vectorizes the aligned code path)

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
pitrou committed Aug 5, 2022
1 parent 81ded07 commit 56e6caf
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions cpp/src/arrow/util/bit_util_benchmark.cc
@@ -150,9 +150,7 @@ static void BenchmarkAndImpl(benchmark::State& state, DoAnd&& do_and) {

   for (auto _ : state) {
     do_and({bitmap_1, bitmap_2}, &bitmap_3);
-    auto total =
-        internal::CountSetBits(bitmap_3.data(), bitmap_3.offset(), bitmap_3.length());
-    benchmark::DoNotOptimize(total);
+    benchmark::ClobberMemory();
   }
   state.SetBytesProcessed(state.iterations() * nbytes);
 }
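The idea behind the change can be sketched without Google Benchmark: rather than reading the whole result back (a popcount) to keep the optimizer from discarding the work, the loop issues a compiler-level memory barrier after each call. The code below is a minimal illustration under stated assumptions, not Arrow's implementation. `AndBytes`, `ClobberMemorySketch` and `RunLoop` are hypothetical names, and the inline-asm barrier only takes effect on GCC and Clang.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the benchmarked operation: bytewise AND.
void AndBytes(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b,
              std::vector<uint8_t>* out) {
  for (std::size_t i = 0; i < out->size(); ++i) {
    (*out)[i] = a[i] & b[i];
  }
}

// Compiler barrier in the spirit of benchmark::ClobberMemory(): tells the
// compiler that memory may have been read or written, at no runtime cost.
// Assumption: GCC/Clang inline asm; on other compilers this is a no-op.
inline void ClobberMemorySketch() {
#if defined(__GNUC__) || defined(__clang__)
  asm volatile("" : : : "memory");
#endif
}

// Runs the operation `iterations` times and returns the byte count that a
// benchmark would report as processed. No popcount pass over the output.
std::size_t RunLoop(std::size_t nbytes, std::size_t iterations,
                    std::vector<uint8_t>* out) {
  std::vector<uint8_t> a(nbytes, 0xF0), b(nbytes, 0x3C);
  out->assign(nbytes, 0);
  for (std::size_t i = 0; i < iterations; ++i) {
    AndBytes(a, b, out);
    ClobberMemorySketch();  // replaces "count set bits + DoNotOptimize"
  }
  return iterations * nbytes;
}
```

The design point is that the barrier costs no instructions at run time, whereas a popcount over the output adds a second full pass over the buffer to every iteration, which then shows up in the measured time.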