Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove deprecated block-based filter #10184

Closed
wants to merge 3 commits into from

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Jun 16, 2022

Summary: In #9535, release 7.0, we hid the old block-based filter from being created using
the public API, because of its inefficiency. Although we normally maintain read compatibility
on old DBs forever, filters are not required for reading a DB, only for optimizing read
performance. Thus, it should be acceptable to remove this code and the substantial
maintenance burden it carries as useful features are developed and validated (such
as user timestamp).

This change completely removes the code for reading and writing the old block-based
filters, net removing about 1370 lines of code no longer needed. Options removed from
testing / benchmarking tools. The prior existence is only evident in a couple of places:

  • CacheEntryRole::kDeprecatedFilterBlock - We can update this public API enum in
    a major release to minimize source code incompatibilities.
  • A warning is logged when an old table file is opened that used the old block-based
    filter. This is provided as a courtesy, and would be a pain to unit test, so manual testing
    should suffice. Unfortunately, sst_dump does not tell you whether a file uses
    block-based filter, and the structure of the code makes it very difficult to fix.
  • To detect that case, kObsoleteFilterBlockPrefix (renamed from kFilterBlockPrefix)
    for metaindex is maintained (for now).

Other notes:

  • In some cases where numbers are associated with filter configurations, we have had to
    update the assigned numbers so that they all correspond to something that exists.
  • Fixed potential stat counting bug by assuming filter_checked = false for cases
    like filter == nullptr rather than assuming filter_checked = true
  • Removed obsolete block_offset and prefix_extractor parameters from several
    functions.
  • Removed some unnecessary checks if (!table_prefix_extractor() && !prefix_extractor)
    because the caller guarantees the prefix extractor exists and is compatible

Test Plan: tests updated, manually test new warning in LOG using base version to
generate a DB

Summary: TODO

Test Plan: tests updated, TODO
@pdillinger pdillinger marked this pull request as ready for review June 16, 2022 16:50
@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't check in detail, but removing deprecated functionality which is already inaccessible in public API should be safe, assuming we have exercised good coding conventions so that all necessary changes are caught by compiler and tests.

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Jun 17, 2022
Summary: .. between facebook#10184 and facebook#10122 not detected by source control,
leading to non-compiling code.

Test Plan: updated test
facebook-github-bot pushed a commit that referenced this pull request Jun 17, 2022
Summary:
.. between #10184 and #10122 not detected by source control,
leading to non-compiling code.

Pull Request resolved: #10192

Test Plan: updated test

Reviewed By: hx235

Differential Revision: D37231921

Pulled By: pdillinger

fbshipit-source-id: fa21488716f4c006b111b8c4127d71c757c935c3
pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Jun 17, 2022
Summary: Ribbon micro-bench needs updating after re-numbering
`BloomLikeFilterPolicy::GetAllFixedImpls()` entries.

Also fixed memory leaks while using ASAN to validate my fix. (I assume
the leaks weren't intentional for some performance characteristic.)

Test Plan: run with ASAN
facebook-github-bot pushed a commit that referenced this pull request Jun 17, 2022
Summary:
Ribbon micro-bench needs updating after re-numbering
`BloomLikeFilterPolicy::GetAllFixedImpls()` entries. (CircleCI nightly
failure.)

Also fixed memory leaks while using ASAN to validate my fix. (I assume
the leaks weren't intentional for some performance characteristic.)

Pull Request resolved: #10195

Test Plan: run with ASAN

Reviewed By: jay-zhuang

Differential Revision: D37244459

Pulled By: pdillinger

fbshipit-source-id: 5a363e10de3c4c9c88099c937e3dc3b4cf24fd30
@gitbw95 gitbw95 mentioned this pull request Jun 27, 2022
gitbw95 pushed a commit to gitbw95/rocksdb that referenced this pull request Jun 28, 2022
Summary:
In facebook#9535, release 7.0, we hid the old block-based filter from being created using
the public API, because of its inefficiency. Although we normally maintain read compatibility
on old DBs forever, filters are not required for reading a DB, only for optimizing read
performance. Thus, it should be acceptable to remove this code and the substantial
maintenance burden it carries as useful features are developed and validated (such
as user timestamp).

This change completely removes the code for reading and writing the old block-based
filters, net removing about 1370 lines of code no longer needed. Options removed from
testing / benchmarking tools. The prior existence is only evident in a couple of places:
* `CacheEntryRole::kDeprecatedFilterBlock` - We can update this public API enum in
a major release to minimize source code incompatibilities.
* A warning is logged when an old table file is opened that used the old block-based
filter. This is provided as a courtesy, and would be a pain to unit test, so manual testing
should suffice. Unfortunately, sst_dump does not tell you whether a file uses
block-based filter, and the structure of the code makes it very difficult to fix.
* To detect that case, `kObsoleteFilterBlockPrefix` (renamed from `kFilterBlockPrefix`)
for metaindex is maintained (for now).

Other notes:
* In some cases where numbers are associated with filter configurations, we have had to
update the assigned numbers so that they all correspond to something that exists.
* Fixed potential stat counting bug by assuming `filter_checked = false` for cases
like `filter == nullptr` rather than assuming `filter_checked = true`
* Removed obsolete `block_offset` and `prefix_extractor` parameters from several
functions.
* Removed some unnecessary checks `if (!table_prefix_extractor() && !prefix_extractor)`
because the caller guarantees the prefix extractor exists and is compatible

Pull Request resolved: facebook#10184

Test Plan:
tests updated, manually test new warning in LOG using base version to
generate a DB

Reviewed By: riversand963

Differential Revision: D37212647

Pulled By: pdillinger

fbshipit-source-id: 06ee020d8de3b81260ffc36ad0c1202cbf463a80
gitbw95 pushed a commit to gitbw95/rocksdb that referenced this pull request Jun 28, 2022
Summary:
.. between facebook#10184 and facebook#10122 not detected by source control,
leading to non-compiling code.

Pull Request resolved: facebook#10192

Test Plan: updated test

Reviewed By: hx235

Differential Revision: D37231921

Pulled By: pdillinger

fbshipit-source-id: fa21488716f4c006b111b8c4127d71c757c935c3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants