New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite memory-charging feature's option API #9926
Conversation
ae0463f
to
5c9d23d
Compare
e366b28
to
f01eb04
Compare
1bfca26
to
750f0f2
Compare
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There's a fairly high chance the future memory tracking configuration is more complex than a bool. For that reason I'd wrap the bool in a struct and put the struct in the map's value. Assuming we do that, the bool becomes inflexible because users must decide tracking even if they really only want to override some other setting. For that reason I'd make an |
Got you - let me look into that! |
I think a lot of the changes here are based on an implementation distinction that I don't think the user should be exposed to or thinking about: pinned vs. tracked. For the user, things that use memory in RocksDB have one of four treatments:
For example, if we decide to make table readers ejectable under block cache, the old state of affairs should not be "baked in" to the API. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
Just CI - not ready for review. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
db/db_sst_test.cc
Outdated
const auto iter_1 = | ||
table_options.cache_usage_options.options_overrides.find( | ||
CacheEntryRole::kBlockBasedTableReader); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
options_overrides_iter
may be a more conventional name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (it was chosen to quickly hack a PR to benchmark)
db/db_sst_test.cc
Outdated
// if (table_options.cache_usage_options | ||
// .options_overrides[static_cast<uint32_t>( | ||
// CacheEntryRole::kBlockBasedTableReader)] | ||
// .charged == CacheEntryRoleOptions::Decision::kEnabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of commented code that needs to be deleted (and some still needs to be migrated like in db_stress).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by removing the previous hacking code
include/rocksdb/table.h
Outdated
// `cache_usage_options` allows users to specify the default | ||
// options (`cache_usage_options.default_options`) and the overrides | ||
// options (`cache_usage_options.options_overrides`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_options
and options_overrides
doesn't feel like a perfect match to me since options_overrides
sounds like things that'd override options
. Some pairs that I like:
options
andoptions_overrides
base_options
and (role_to_options
oroptions_by_role
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll go with the first suggestion - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
include/rocksdb/table.h
Outdated
// For a certain `CacheEntryRole r` and a certain feature `f` of | ||
// `CacheEntryRoleOptions`: | ||
// (a) If `options_overrides[static_cast<uint32_t>(role)].f != | ||
// Decision::kFallback`, then it will be used to override the `Decision` value | ||
// of `cache_usage_options.default_options.f`. | ||
// | ||
// Example: | ||
// (a.1) | ||
// `options_overrides[static_cast<uint32_t>(CacheEntryRole::kBlockBasedTableReader)] | ||
// = {.charged = Decision::kEnabled}` | ||
// (a.2) `cache_usage_options.default_options.charged = Decision::kFallback` | ||
// (a.1) + (a.2) => the `CacheEntryRoleOptions::charged` | ||
// feature will be enabled for `CacheEntryRole::kBlockBasedTableReader` | ||
// | ||
// (b) If `options_overrides[static_cast<uint32_t>(role)].f == | ||
// Decision::kFallback`, then the `Decision` value of | ||
// `cache_usage_options.default_options.f` will be used. If this default value | ||
// is set to `Decision::kFallback`, then the existing behavior will be in | ||
// effect. See "Features" below for the existing behaviors. | ||
// | ||
// Example: | ||
// (b.1) | ||
// `options_overrides[static_cast<uint32_t>(CacheEntryRole::kBlockBasedTableReader)] | ||
// = {.charged = Decision::kFallback}` | ||
// (b.2) `cache_usage_options.default_options.charged = Decision::kFallback` | ||
// (b.1) + (b.2) => the `CacheEntryRoleOptions::charged` feature will fallback | ||
// to the existing behavior for `CacheEntryRole::kBlockBasedTableReader` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler to list the items in order of preference?
- If
options_overrides
has an entry forrole
andoptions_overrides[role].f != kFallback
, we useoptions_overrides[role].f
- Otherwise, if
options[role].f != kFallback
, we useoptions[role].f
- Otherwise, we follow the compatible existing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is basically a version of this for std::array (where we don't have "If options_overrides has an entry for role"). But my example and description might be too wordy so let me fix that with your style of listing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(And I realized your suggestion has the merit in " if options[role].f != kFallback, we use options[role].f" which was what I didn't have in my doc)
include/rocksdb/table.h
Outdated
// (2) CacheEntryRole::kFilterConstruction : kDisabled | ||
// If enabled, charge memory usage of Bloom Filter (format_version >= 5) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is describing existing behavior I'd say something like "Same as kDisabled - does not charge memory usage of ...".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
tools/db_bench_tool.cc
Outdated
block_based_options.cache_usage_options.options_overrides.insert( | ||
{CacheEntryRole::kFilterConstruction, | ||
{.charged = FLAGS_charge_filter_construction | ||
? CacheEntryRoleOptions::Decision::kEnabled | ||
: CacheEntryRoleOptions::Decision::kDisabled}}); | ||
block_based_options.cache_usage_options.options_overrides.insert( | ||
{CacheEntryRole::kFilterConstruction, | ||
{.charged = FLAGS_charge_filter_construction | ||
? CacheEntryRoleOptions::Decision::kEnabled | ||
: CacheEntryRoleOptions::Decision::kDisabled}}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two are the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
table/table_test.cc
Outdated
table_options.cache_usage_options.options_overrides.insert( | ||
{CacheEntryRole::kCompressionDictionaryBuildingBuffer, | ||
{.charged = CacheEntryRoleOptions::Decision::kEnabled}}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it needed? It's the default behavior to charge for it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed - it was there instead of a comment like "CacheEntryRoleOptions::charged
is enabled by default for
CacheEntryRole::kCompressionDictionaryBuildingBuffer".
But I am removing it here as suggested to for more testing coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
table/block_based/filter_policy.cc
Outdated
iter != context.table_options.cache_usage_options.options_overrides.end() | ||
? iter->second.charged == CacheEntryRoleOptions::Decision::kEnabled | ||
: context.table_options.cache_usage_options.default_options.charged == | ||
CacheEntryRoleOptions::Decision::kEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This expression scares me, originally because of its length/density and later because it appears to be assigning a temporary boolean to a reference. I guess the const bool&
's lifetime will be extended and it's indeed correct but it is too much extra difficulty to understand. Can we add an if-else and remove the reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by comparing .charged with CacheEntryRoleOptions::Decision instead of bool.
3199ea9
to
c2b2610
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
Not ready for review - just going through CI |
c2b2610
to
ac0e607
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
ac0e607
to
64d3b21
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
64d3b21
to
f13d55d
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@ajkr ready for review again! |
f13d55d
to
6c3baad
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for introducing this block cache configuration API!
include/rocksdb/table.h
Outdated
// reader (i.e, BlockBasedTableOptions::cache_index_and_filter_blocks == | ||
// false) | ||
// 3. Some internal data structures | ||
// 4. More to come... | ||
// + some internal data structures during table reader creation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- some internal data structures during table reader creation.
If we want to mention "some internal data structures" can it be moved together with "table properties + index block/filter block/uncompression dictionary"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
include/rocksdb/table.h
Outdated
// (1) CacheEntryRole::kCompressionDictionaryBuildingBuffer : kEnabled | ||
// Charge memory usage of the buffered data used as training samples for | ||
// dictionary compression. | ||
// If such memory usage exceeds the avaible space left in the block cache | ||
// at some point (i.e, causing a cache full under | ||
// LRUCacheOptions::strict_capacity_limit = true), construction will fall back | ||
// to Bloom Filter. | ||
// | ||
// Default: false | ||
bool reserve_table_builder_memory = false; | ||
|
||
// If true, a dynamically updating charge to block cache, loosely based | ||
// on the actual memory usage of table reader, will occur to account | ||
// the memory, if block cache available. | ||
// | ||
// Charged memory usage includes: | ||
// 1. Table properties | ||
// 2. Index block/Filter block/Uncompression dictionary if stored in table | ||
// `LRUCacheOptions::strict_capacity_limit` = true), the data will then be | ||
// unbuffered. | ||
// | ||
// (2) CacheEntryRole::kFilterConstruction : kDisabled | ||
// Same as kDisabled - does not charge memory usage of Bloom Filter | ||
// (format_version >= 5) and Ribbon Filter construction. If enabled and if | ||
// additional temporary memory of Ribbon Filter exceeds the avaible space left | ||
// in the block cache at some point (i.e, causing a cache full under | ||
// `LRUCacheOptions::strict_capacity_limit` = true), construction will fall | ||
// back to Bloom Filter. | ||
// | ||
// (3) CacheEntryRole::kBlockBasedTableReader : kDisabled | ||
// Same as kDisabled - does not charge memory usage of table properties + | ||
// index block/filter block/uncompression dictionary when stored in table | ||
// reader (i.e, BlockBasedTableOptions::cache_index_and_filter_blocks == | ||
// false) | ||
// 3. Some internal data structures | ||
// 4. More to come... | ||
// + some internal data structures during table reader creation. | ||
// If enabled and if such a table reader exceeds | ||
// the avaible space left in the block cache at some point (i.e, causing | ||
// a cache full under `LRUCacheOptions::strict_capacity_limit` = true), | ||
// creation will fail with Status::MemoryLimit(). | ||
// | ||
// (4) Other CacheEntryRole : Not supported | ||
// `Status::kNotSupported` will be returned if | ||
// `CacheEntryRoleOptions::charged` is set to {`kEnabled`, `kDisabled`}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm this feels like it contains instructions on how to use it that don't really belong in an "Existing behavior:" subsection. Perhaps we can make these subitems directly under "Memory charging to block cache". Each subitem can have three bullet points - what does kDisabled mean, what does kEnabled mean, and what is the compatible existing behavior (e.g., "Same as kDisabled"). Also can we label the subitems like (a), (b), (c), as I initially thought this is a totally separate list from the one starting with 1.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
CacheEntryRole::kBlockBasedTableReader); | ||
const auto table_reader_charged = | ||
options_overrides_iter != | ||
table_options_.cache_usage_options.options_overrides.end() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The InitializeOptions() logic looks like it populates entries for every CacheEntryRole
including filling in fallbacks to default options. In that case can we use table_options_.cache_usage_options.options_overrides.at(CacheEntryRole::kBlockBasedTableReader).charged
to get the final result directly? (Same question applies to similar code like for calculating filter_construction_charged
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For CacheEntryRole::kBlockBasedTableReader
, yeah we should remove (and already fixed it)
For filter_construction_charged
, some tests directly initialize with FilterBitsBuilder without going through InitializeOptions like https://github.com/facebook/rocksdb/blob/7.2.fb/util/bloom_test.cc#L301 so we can't do so without fixing a lot of the tests (so I will leave it as it is)
For dictionary compression buffer, still going through CI check and see if there is any test we need to fix for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok with me. Sounds like an issue with the tests if they are using internal APIs in a way that could not happen in non-test code.
6c3baad
to
b7306e4
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
b7306e4
to
6a51af1
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: **Context/Summary:** #9926 removed inefficient `reserve*` option API but forgot to mark them deprecated in `block_based_table_type_info` for compatible table format. Pull Request resolved: #10016 Test Plan: build-format-compatible Reviewed By: pdillinger Differential Revision: D36484247 Pulled By: hx235 fbshipit-source-id: c41b90cc99fb7ab7098934052f0af7290b221f98
Context:
Previous PR #9748, #9073, #8428 added separate flag for each charged memory area. Such API design is not scalable as we charge more and more memory areas. Also, we foresee an opportunity to consolidate this feature with other cache usage related features such as
cache_index_and_filter_blocks
usingCacheEntryRole
.Therefore we decided to consolidate all these flags with
CacheUsageOptions cache_usage_options
and this PR serves as the first step by consolidating memory-charging related flags.Summary:
kCompressionDictionaryBuildingBuffer
opt-out and added a unit test for thatTest plan:
kCompressionDictionaryBuildingBuffer
TEST_F(ChargeCompressionDictionaryBuildingBufferTest, Basic)
TEST_F(CacheUsageOptionsOverridesTest, SanitizeAndValidateOptions)
TEST_TMPDIR=/dev/shm/testdb ./db_bench -benchmarks=fillseq -db=$TEST_TMPDIR -charge_compression_dictionary_building_buffer=1(remove this for comparison) -compression_max_dict_bytes=10000 -disable_auto_compactions=1 -write_buffer_size=100000 -num=4000000 | egrep 'fillseq'
python3 tools/db_crashtest.py blackbox -charge_compression_dictionary_building_buffer=1 -charge_filter_construction=1 -charge_table_reader=1 -cache_size=1
killed as normal