Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor / clean up / optimize FullFilterBitsReader #5941

Closed
wants to merge 1 commit into from

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Oct 18, 2019

Summary: FullFilterBitsReader, after creating in BloomFilterPolicy, was
responsible for decoding metadata bits. This meant that
FullFilterBitsReader::MayMatch had some metadata checks in order to
implement "always true" or "always false" functionality in the case
of inconsistent or trivial metadata. This made for ugly
mixing-of-concerns code and probably had some runtime cost. It also
didn't really support plugging in alternative filter implementations
with extensions to the existing metadata schema.

BloomFilterPolicy::GetFilterBitsReader is now (exclusively) responsible
for decoding filter metadata bits and constructing appropriate instances
deriving from FilterBitsReader. "Always false" and "always true" derived
classes allow FullFilterBitsReader not to be concerned with handling of
trivial or inconsistent metadata. This also makes for easy expansion
to alternative filter implementations in new, alternative derived
classes. This change makes calls to FilterBitsReader::MayMatch
necessarily virtual because there's now more than one built-in
implementation. Compared with the previous implementation's extra
'if' checks in MayMatch, there's no consistent performance difference,
measured by (an older revision of) filter_bench (differences here seem
to be within noise):

Inside queries...
-  Dry run (407) ns/op: 35.9996
+  Dry run (407) ns/op: 35.2034
-  Single filter ns/op: 47.5483
+  Single filter ns/op: 47.4034
-  Batched, prepared ns/op: 43.1559
+  Batched, prepared ns/op: 42.2923
...
-  Random filter ns/op: 150.697
+  Random filter ns/op: 149.403
----------------------------
Outside queries...
-  Dry run (980) ns/op: 34.6114
+  Dry run (980) ns/op: 34.0405
-  Single filter ns/op: 56.8326
+  Single filter ns/op: 55.8414
-  Batched, prepared ns/op: 48.2346
+  Batched, prepared ns/op: 47.5667
-  Random filter ns/op: 155.377
+  Random filter ns/op: 153.942
     Average FP rate %: 1.1386

Also, the FullFilterBitsReader ctor was responsible for a surprising
amount of CPU in production, due in part to inefficient determination of
the CACHE_LINE_SIZE used to construct the filter being read. The
overwhelming common case (same as my CACHE_LINE_SIZE) is now
substantially optimized, as shown with filter_bench with
-new_reader_every=1 (old option - see below) (repeatable result):

Inside queries...
-  Dry run (453) ns/op: 118.799
+  Dry run (453) ns/op: 105.869
-  Single filter ns/op: 82.5831
+  Single filter ns/op: 74.2509
...
-  Random filter ns/op: 224.936
+  Random filter ns/op: 194.833
----------------------------
Outside queries...
-  Dry run (aa1) ns/op: 118.503
+  Dry run (aa1) ns/op: 104.925
-  Single filter ns/op: 90.3023
+  Single filter ns/op: 83.425
...
-  Random filter ns/op: 220.455
+  Random filter ns/op: 175.7
     Average FP rate %: 1.13886

However PR#5936 has/will reclaim most of this cost. After that PR, the optimization of this code path is likely negligible, but nonetheless it's clear we aren't making performance any worse.

Also fixed inadequate check of consistency between filter data size and
num_lines. (Unit test updated.)

Test Plan: previously added unit tests FullBloomTest.CorruptFilters and
FullBloomTest.RawSchema

Summary: FullFilterBitsReader, after creating in BloomFilterPolicy, was
responsible for decoding metadata bits. This meant that
FullFilterBitsReader::MayMatch had some metadata checks in order to
implement "always true" or "always false" functionality in the case
of inconsistent or trivial metadata. This made for ugly
mixing-of-concerns code and probably had some runtime cost. It also
didn't really support plugging in alternative filter implementations
with extensions to the existing metadata schema.

BloomFilterPolicy::GetFilterBitsReader is now (exclusively) responsible
for decoding filter metadata bits and constructing appropriate instances
deriving from FilterBitsReader. "Always false" and "always true" derived
classes allow FullFilterBitsReader not to be concerned with handling of
trivial or inconsistent metadata. This also makes for easy expansion
to alternative filter implementations in new, alternative derived
classes. This change makes calls to FilterBitsReader::MayMatch
*necessarily* virtual because there's now more than one implementation.
Compared with the previous implementation's extra 'if' checks in
MayMatch, there's no consistent performance difference, measured by
filter_bench (differences here seem to be within noise):

 Inside queries...
-  Dry run (407) ns/op: 35.9996
+  Dry run (407) ns/op: 35.2034
-  Single filter ns/op: 47.5483
+  Single filter ns/op: 47.4034
-  Batched, prepared ns/op: 43.1559
+  Batched, prepared ns/op: 42.2923
...
-  Random filter ns/op: 150.697
+  Random filter ns/op: 149.403
 ----------------------------
 Outside queries...
-  Dry run (980) ns/op: 34.6114
+  Dry run (980) ns/op: 34.0405
-  Single filter ns/op: 56.8326
+  Single filter ns/op: 55.8414
-  Batched, prepared ns/op: 48.2346
+  Batched, prepared ns/op: 47.5667
-  Random filter ns/op: 155.377
+  Random filter ns/op: 153.942
     Average FP rate %: 1.1386

Also, the FullFilterBitsReader ctor was responsible for a surprising
amount of CPU in production, due in part to inefficient determination of
the CACHE_LINE_SIZE used to construct the filter being read. The
overwhelming common case (same as my CACHE_LINE_SIZE) is now
substantially optimized, as shown with filter_bench with
-new_reader_every=1 (repeatable result):

 Inside queries...
-  Dry run (453) ns/op: 118.799
+  Dry run (453) ns/op: 105.869
-  Single filter ns/op: 82.5831
+  Single filter ns/op: 74.2509
...
-  Random filter ns/op: 224.936
+  Random filter ns/op: 194.833
 ----------------------------
 Outside queries...
-  Dry run (aa1) ns/op: 118.503
+  Dry run (aa1) ns/op: 104.925
-  Single filter ns/op: 90.3023
+  Single filter ns/op: 83.425
...
-  Random filter ns/op: 220.455
+  Random filter ns/op: 175.7
     Average FP rate %: 1.13886

Also fixed inadequate check of consistency between filter data size and
num_lines. (Unit test updated.)

Test Plan: recently added unit tests FullBloomTest.CorruptFilters and
FullBloomTest.RawSchema
Copy link
Contributor

@siying siying left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good change!

util/bloom.cc Show resolved Hide resolved
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdillinger is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@pdillinger merged this pull request in 5f8f2fd.

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Nov 12, 2019
Summary: Bug in PR facebook#5941 when char is unsigned that should only affect
assertion on unused/invalid filter metadata.

Test Plan: on ARM: ./bloom_test && ./db_bloom_filter_test && ./block_based_filter_block_test && ./full_filter_block_test && ./partitioned_filter_block_test
facebook-github-bot pushed a commit that referenced this pull request Nov 12, 2019
Summary:
Bug in PR #5941 when char is unsigned that should only affect
assertion on unused/invalid filter metadata.
Pull Request resolved: #6024

Test Plan: on ARM: ./bloom_test && ./db_bloom_filter_test && ./block_based_filter_block_test && ./full_filter_block_test && ./partitioned_filter_block_test

Differential Revision: D18461206

Pulled By: pdillinger

fbshipit-source-id: 68a7c813a0b5791c05265edc03cdf52c78880e9a
merryChris pushed a commit to merryChris/rocksdb that referenced this pull request Nov 18, 2019
Summary:
FullFilterBitsReader, after creating in BloomFilterPolicy, was
responsible for decoding metadata bits. This meant that
FullFilterBitsReader::MayMatch had some metadata checks in order to
implement "always true" or "always false" functionality in the case
of inconsistent or trivial metadata. This made for ugly
mixing-of-concerns code and probably had some runtime cost. It also
didn't really support plugging in alternative filter implementations
with extensions to the existing metadata schema.

BloomFilterPolicy::GetFilterBitsReader is now (exclusively) responsible
for decoding filter metadata bits and constructing appropriate instances
deriving from FilterBitsReader. "Always false" and "always true" derived
classes allow FullFilterBitsReader not to be concerned with handling of
trivial or inconsistent metadata. This also makes for easy expansion
to alternative filter implementations in new, alternative derived
classes. This change makes calls to FilterBitsReader::MayMatch
*necessarily* virtual because there's now more than one built-in
implementation. Compared with the previous implementation's extra
'if' checks in MayMatch, there's no consistent performance difference,
measured by (an older revision of) filter_bench (differences here seem
to be within noise):

    Inside queries...
    -  Dry run (407) ns/op: 35.9996
    +  Dry run (407) ns/op: 35.2034
    -  Single filter ns/op: 47.5483
    +  Single filter ns/op: 47.4034
    -  Batched, prepared ns/op: 43.1559
    +  Batched, prepared ns/op: 42.2923
    ...
    -  Random filter ns/op: 150.697
    +  Random filter ns/op: 149.403
    ----------------------------
    Outside queries...
    -  Dry run (980) ns/op: 34.6114
    +  Dry run (980) ns/op: 34.0405
    -  Single filter ns/op: 56.8326
    +  Single filter ns/op: 55.8414
    -  Batched, prepared ns/op: 48.2346
    +  Batched, prepared ns/op: 47.5667
    -  Random filter ns/op: 155.377
    +  Random filter ns/op: 153.942
         Average FP rate %: 1.1386

Also, the FullFilterBitsReader ctor was responsible for a surprising
amount of CPU in production, due in part to inefficient determination of
the CACHE_LINE_SIZE used to construct the filter being read. The
overwhelming common case (same as my CACHE_LINE_SIZE) is now
substantially optimized, as shown with filter_bench with
-new_reader_every=1 (old option - see below) (repeatable result):

    Inside queries...
    -  Dry run (453) ns/op: 118.799
    +  Dry run (453) ns/op: 105.869
    -  Single filter ns/op: 82.5831
    +  Single filter ns/op: 74.2509
    ...
    -  Random filter ns/op: 224.936
    +  Random filter ns/op: 194.833
    ----------------------------
    Outside queries...
    -  Dry run (aa1) ns/op: 118.503
    +  Dry run (aa1) ns/op: 104.925
    -  Single filter ns/op: 90.3023
    +  Single filter ns/op: 83.425
    ...
    -  Random filter ns/op: 220.455
    +  Random filter ns/op: 175.7
         Average FP rate %: 1.13886

However PR#5936 has/will reclaim most of this cost. After that PR, the optimization of this code path is likely negligible, but nonetheless it's clear we aren't making performance any worse.

Also fixed inadequate check of consistency between filter data size and
num_lines. (Unit test updated.)
Pull Request resolved: facebook#5941

Test Plan:
previously added unit tests FullBloomTest.CorruptFilters and
FullBloomTest.RawSchema

Differential Revision: D18018353

Pulled By: pdillinger

fbshipit-source-id: 8e04c2b4a7d93223f49a237fd52ef2483929ed9c
merryChris pushed a commit to merryChris/rocksdb that referenced this pull request Nov 18, 2019
Summary:
Bug in PR facebook#5941 when char is unsigned that should only affect
assertion on unused/invalid filter metadata.
Pull Request resolved: facebook#6024

Test Plan: on ARM: ./bloom_test && ./db_bloom_filter_test && ./block_based_filter_block_test && ./full_filter_block_test && ./partitioned_filter_block_test

Differential Revision: D18461206

Pulled By: pdillinger

fbshipit-source-id: 68a7c813a0b5791c05265edc03cdf52c78880e9a
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants