Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1/4][PerfImprove][ResourceMngmt] Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter #9070

Closed

Conversation

hx235
Copy link
Contributor

@hx235 hx235 commented Oct 24, 2021

Note: This PR is the 1st part of a bigger PR stack (#9073).

Context:
Previously, the payload (i.e, filter data) within BlockBasedTableBuilder::Rep::FilterBlockBuilder object is not deallocated until BlockBasedTableBuilder is deallocated, despite it is no longer useful after its related filter_content being written.

Summary:

  • Transferred the payload (i.e, the filter data) out of BlockBasedTableBuilder::Rep::FilterBlockBuilder object
  • For PartitionedFilter:
    • Unified filters and filter_gc lists into one std::deque<FilterEntry> filters by adding a new field last_filter_entry_key and storing the std::unique_ptr filter_data with the Slice filter in the same entry
    • Reset last_filter_data in the case where filters is empty, which should be as by then we would've finish using all the Slice filter
  • Deallocated the payload by going out of scope as soon as we're done with using the filter_content associated with the payload
  • This is an internal interface change at the level of FilterBlockBuilder::Finish(), which leads to touching the inherited interface in BlockBasedFilterBlockBuilder. But for that, the payload transferring is ignored.

Test:

  • The main focus is to catch segment fault error during FilterBlockBuilder::Finish() and BlockBasedTableBuilder::Finish() and interface mismatch. Relying on existing CI tests is enough as assert(false) was temporarily added to verify the new logic of transferring ownership indeed run

@hx235 hx235 changed the title [BugFix] Early deallocation of BlockBasedTableBuilder::Rep::FilterBlockBuilder [1/3][BugFix] Early deallocation of BlockBasedTableBuilder::Rep::FilterBlockBuilder Oct 24, 2021
@hx235 hx235 changed the title [1/3][BugFix] Early deallocation of BlockBasedTableBuilder::Rep::FilterBlockBuilder [1/3][BugFix] Deallocate BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier Oct 24, 2021
@hx235 hx235 requested a review from pdillinger October 24, 2021 05:24
@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

HISTORY.md Outdated
@@ -10,6 +10,7 @@
* Fixed a bug where stalled writes would remain stalled forever after the user calls `WriteBufferManager::SetBufferSize()` with `new_size == 0` to dynamically disable memory limiting.
* Make `DB::close()` thread-safe.
* Fix a bug in atomic flush where one bg flush thread will wait forever for a preceding bg flush thread to commit its result to MANIFEST but encounters an error which is mapped to a soft error (DB not stopped).
* Fix a bug in `BlockBasedTableBuilder` where `BlockBasedTableBuilder::Rep::FilterBlockBuilder` object is not deallocated until `BlockBasedTableBuilder` is deallocated, despite the fact that it is no longer useful after `BlockBasedTableBuilder::Finish()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be considered a "performance improvement" because the only thing "wrong" before was using more memory than necessary. And the level of detail should be appropriate for someone only interested in public API, not internal details.

@@ -2009,6 +2009,11 @@ Status BlockBasedTableBuilder::Finish() {
if (ok()) {
WriteFooter(metaindex_block_handle, index_block_handle);
}
if (ok()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch, though I don't think we need to be so concerned about the FilterBlockBuilder object itself as we are about ensuring its payload is deallocated as soon as possible. And that should be done earlier, on call to FilterBlockBuilder::Finish(). Ideally, the FilterBlockBuilder would transfer ownership of the payload (e.g. FullFilterBlock::filter_data_) to the caller of FilterBlockBuilder::Finish() (like how FilterBitsBuilder::Finish() does in an output parameter). That's a small internal refactoring that seems worthwhile.

Btw, .reset() has same effect as .reset(nullptr)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though I don't think we need to be so concerned about the FilterBlockBuilder object itself as we are about ensuring its payload is deallocated as soon as possible. And that should be done earlier, on call to FilterBlockBuilder::Finish()

Good point! I will proceed with the small internal refactoring and then deallocate the payload on call to FilterBlockBuilder::Finish().

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 changed the title [1/3][BugFix] Deallocate BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier [1/3][PerfImprove]Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for FullFilter Oct 26, 2021
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

2 similar comments
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235
Copy link
Contributor Author

hx235 commented Oct 26, 2021

Update:

  • Transferred the payload (i.e, the filter data) out of BlockBasedTableBuilder::Rep::FilterBlockBuilder object by adding optional filter data output parameter to FilterBlockBuilder::Finish()
  • Updated HISTORY.md, code comments, PR description
  • Ready for review after signals are cleared
  • Will reach out for some tips on quantifying the claimed memory saving (if needed)

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@hx235 hx235 changed the title [1/3][PerfImprove]Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for FullFilter [1/3][PerfImprove] Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for FullFilter Oct 26, 2021
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise seems reasonable :)

// Deallocate the filter data payload of FilterBlockBuilder when done
// using it to release memory. Otherwise, it will remain until
// BlockBasedTableBuilder is deallocated.
if (filter_data) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a free before use (use after free) because filter_content should be referring to filter_data. So you can just let RAII free filter_data automatically when it goes out of scope.

I wonder why this was not caught by ASAN run. That could indicate a bug that neither of us sees. One way to verify your new logic is actually being exercised it to throw in a temporary assert(filter_data == nullptr); and make sure it crashes some tests.

You can put a short comment on filter_data, or not a big deal if adding comment to Finish().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why this was not caught by ASAN run.

It turns out to be a bug of mine that prevents this new logic from being run. Rather silly but previous impl in 084ea78 accidentally puts ownership transferring after return statement.

Valuable lessons learned:

  1. Make sure new logic being run by existing tests before relying on them by deliberately crashing some tests
  2. Always remember lesson 1 🙃

virtual Slice Finish(const BlockHandle& tmp, Status* status) = 0;
virtual Slice Finish(
const BlockHandle& tmp, Status* status,
std::unique_ptr<const char[]>* filter_data = nullptr) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think filter_data deserves a comment here (as well as const BlockHandle& which is very confusing in this context), like

 If filter_data is not nullptr, Finish() may transfer ownership of block data to the caller, so that it can be freed as soon as possible.

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 marked this pull request as draft October 26, 2021 19:43
@hx235
Copy link
Contributor Author

hx235 commented Oct 26, 2021

Update:

  • not ready for review
  • currently debugging by crashing some tests and adding better test to verify my logic

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the early_deallocate_filter_block_builder branch from 41e149c to 6ed0f9a Compare October 26, 2021 21:22
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@hx235
Copy link
Contributor Author

hx235 commented Oct 26, 2021

Update:

  • Fixed a bug of mine that prevents new logic from being run and tested
  • Fixed another free-before-use bug suggested by the comment and verified by testing after fixing the bug above
  • Clarified FilterBlockBuilder::Finish() interface
  • Updated PR summary to reflect when we can deallocate the payload and the new test plan
  • Ready for review after signals are cleared

@hx235 hx235 marked this pull request as ready for review October 26, 2021 23:08
@hx235 hx235 requested a review from pdillinger October 26, 2021 23:08
@hx235 hx235 force-pushed the early_deallocate_filter_block_builder branch from f862683 to d80cf48 Compare November 4, 2021 00:02
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@hx235
Copy link
Contributor Author

hx235 commented Nov 4, 2021

Update:

  • Resolved conflict on HISTORY.md

@hx235 hx235 changed the title [1/3][PerfImprove][ResourceMngmt] Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter [1/4][PerfImprove][ResourceMngmt] Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter Nov 4, 2021
@facebook-github-bot
Copy link
Contributor

@hx235 merged this pull request in 1ababeb.

facebook-github-bot pushed a commit that referenced this pull request Nov 18, 2021
#9073)

Summary:
Note: This PR is the 4th part of a bigger PR stack (#9073) and will rebase/merge only after the first three PRs (#9070, #9071, #9130) merge.

**Context:**
Similar to #8428, this PR is to track memory usage during (new) Bloom Filter (i.e,FastLocalBloom) and Ribbon Filter (i.e, Ribbon128) construction, moving toward the goal of [single global memory limit using block cache capacity](https://github.com/facebook/rocksdb/wiki/Projects-Being-Developed#improving-memory-efficiency). It also constrains the size of the banding portion of Ribbon Filter during construction by falling back to Bloom Filter if that banding is, at some point, larger than the available space in the cache under `LRUCacheOptions::strict_capacity_limit=true`.

The option to turn on this feature is `BlockBasedTableOptions::reserve_table_builder_memory = true` which by default is set to `false`. We [decided](#9073 (comment)) not to have separate option for separate memory user in table building therefore their memory accounting are all bundled under one general option.

**Summary:**
- Reserved/released cache for creation/destruction of three main memory users with the passed-in `FilterBuildingContext::cache_res_mgr` during filter construction:
   - hash entries (i.e`hash_entries`.size(), we bucket-charge hash entries during insertion for performance),
   - banding (Ribbon Filter only, `bytes_coeff_rows` +`bytes_result_rows` + `bytes_backtrack`),
   - final filter (i.e, `mutable_buf`'s size).
      - Implementation details: in order to use `CacheReservationManager::CacheReservationHandle` to account final filter's memory, we have to store the `CacheReservationManager` object and `CacheReservationHandle` for final filter in `XXPH3BitsFilterBuilder` as well as  explicitly delete the filter bits builder when done with the final filter in block based table.
- Added option fo run `filter_bench` with this memory reservation feature

Pull Request resolved: #9073

Test Plan:
- Added new tests in `db_bloom_filter_test` to verify filter construction peak cache reservation under combination of  `BlockBasedTable::Rep::FilterType` (e.g, `kFullFilter`, `kPartitionedFilter`), `BloomFilterPolicy::Mode`(e.g, `kFastLocalBloom`, `kStandard128Ribbon`, `kDeprecatedBlock`) and `BlockBasedTableOptions::reserve_table_builder_memory`
  - To address the concern for slow test: tests with memory reservation under `kFullFilter` + `kStandard128Ribbon` and `kPartitionedFilter` take around **3000 - 6000 ms** and others take around **1500 - 2000 ms**, in total adding **20000 - 25000 ms** to the test suit running locally
- Added new test in `bloom_test` to verify Ribbon Filter fallback on large banding in FullFilter
- Added test in `filter_bench` to verify that this feature does not significantly slow down Bloom/Ribbon Filter construction speed. Local result averaged over **20** run as below:
   - FastLocalBloom
      - baseline `./filter_bench -impl=2 -quick -runs 20 | grep 'Build avg'`:
         - **Build avg ns/key: 29.56295** (DEBUG_LEVEL=1), **29.98153** (DEBUG_LEVEL=0)
      - new feature (expected to be similar as above)`./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg'`:
         - **Build avg ns/key: 30.99046** (DEBUG_LEVEL=1), **30.48867** (DEBUG_LEVEL=0)
      - new feature of RibbonFilter with fallback  (expected to be similar as above) `./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
         - **Build avg ns/key: 31.146975** (DEBUG_LEVEL=1), **30.08165** (DEBUG_LEVEL=0)

    - Ribbon128
       - baseline `./filter_bench -impl=3 -quick -runs 20 | grep 'Build avg'`:
           - **Build avg ns/key: 129.17585** (DEBUG_LEVEL=1), **130.5225** (DEBUG_LEVEL=0)
       - new feature  (expected to be similar as above) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg' `:
           - **Build avg ns/key: 131.61645** (DEBUG_LEVEL=1), **132.98075** (DEBUG_LEVEL=0)
       - new feature of RibbonFilter with fallback (expected to be a lot faster than above due to fallback) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
          - **Build avg ns/key: 52.032965** (DEBUG_LEVEL=1), **52.597825** (DEBUG_LEVEL=0)
          - And the warning message of `"Cache reservation for Ribbon filter banding failed due to cache full"` is indeed logged to console.

Reviewed By: pdillinger

Differential Revision: D31991348

Pulled By: hx235

fbshipit-source-id: 9336b2c60f44d530063da518ceaf56dac5f9df8e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants