Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Charge blob cache usage against the global memory limit #10321

Closed
wants to merge 27 commits into from

Conversation

gangliao
Copy link
Contributor

@gangliao gangliao commented Jul 7, 2022

Summary:

To help service owners to manage their memory budget effectively, we have been working towards counting all major memory users inside RocksDB towards a single global memory limit (see e.g. https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager#cost-memory-used-in-memtable-to-block-cache). The global limit is specified by the capacity of the block-based table's block cache, and is technically implemented by inserting dummy entries ("reservations") into the block cache. The goal of this task is to support charging the memory usage of the new blob cache against this global memory limit when the backing cache of the blob cache and the block cache are different.

This PR is a part of #10156

include/rocksdb/cache.h Outdated Show resolved Hide resolved
@gangliao
Copy link
Contributor Author

gangliao commented Jul 7, 2022

If we can charge blob cache usage against the global memory limit, we can also do that for blob file cache. But it's no particular urgency since the blob file cache is pretty small.

@gangliao gangliao mentioned this pull request Jul 8, 2022
14 tasks
Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @gangliao !

Comment on lines 63 to 71
Status s = GetShard(Shard(hash))
->Insert(key, hash, value, charge, deleter, handle, priority);
if (s.ok() && cache_res_mgr_) {
// Insert may cause the cache entry eviction if the cache is full. So we
// directly call the reservation manager to update the total memory used
// in the cache.
cache_res_mgr_->UpdateCacheReservation(GetUsage()).PermitUncheckedError();
}
return s;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding the cache reservation manager directly to ShardedCache, we could consider moving this logic a separate class which also implements the Cache interface, wraps another Cache, takes care of the cache reservations and forwards calls to the wrapped cache. This would be a more decoupled design, and would also make the charging logic work with any custom cache implementations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add some validation to ensure that such a "cache reservation aware" cache is configured as a blob cache if and only if charging is enabled for CacheEntryRole::kBlobCache.

Copy link
Contributor

@hx235 hx235 Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if and only if charging is enabled for CacheEntryRole::kBlobCache.

If we do open up API usage like https://github.com/facebook/rocksdb/pull/10321/files#diff-5c4ced6afb6a90e27fec18ab03b2cd89e8f99db87791b4ecc6fa2694284d50c0R2921-R2925 to users, update to BlockBasedTableOptions::cache_usage_options API comment is also needed.

Also, should we consider early option validation like https://github.com/facebook/rocksdb/blob/7.4.fb/table/block_based/block_based_table_factory.cc#L709-L714 but for kBlobCache e.g, no blob cache is specified or blob cache == block cache when CacheEntryRole::kBlobCache is enabled?

Copy link
Contributor Author

@gangliao gangliao Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding the cache reservation manager directly to ShardedCache, we could consider moving this logic a separate class which also implements the Cache interface, wraps another Cache, takes care of the cache reservations and forwards calls to the wrapped cache. This would be a more decoupled design, and would also make the charging logic work with any custom cache implementations.

Where is this cache wrapper created? It looks like the user can create it and pass it to options.blob_cache?
In this case, it's hard to add some validation. The way @hx235 said above is feasible, even if the options.blob_cache settings here are wrong, we are able to detect anomalies from option validation in block_based_table_factory and throw an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we consider early option validation like https://github.com/facebook/rocksdb/blob/7.4.fb/table/block_based/block_based_table_factory.cc#L709-L714 but for kBlobCache e.g, no blob cache is specified or blob cache == block cache when CacheEntryRole::kBlobCache is enabled?

Yes, and I suppose we could also do the validation I mentioned here (i.e. make sure that options.blob_cache is the above-mentioned special type of cache iff the blob cache is charged to the block cache).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this cache wrapper created? It looks like the user can create it and pass it to options.blob_cache? In this case, it's hard to add some validation. The way @hx235 said above is feasible, even if the options.blob_cache settings here are wrong, we are able to detect anomalies from option validation in block_based_table_factory and throw an error.

Yes, it would be created by the application and assigned to options.blob_cache. During validation, we could use e.g. dynamic_cast to ensure the application configured the right type of cache.

Copy link
Contributor Author

@gangliao gangliao Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like dynamic_cast is not allowed in rocksdb. I used a different approach here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm kind of having second thoughts re: having the application create the wrapper. I'm wondering if it would be better if we created it e.g. in BlobSource ? I feel it would be more user-friendly and would hide this implementation detail. The downside of doing so would be that the book-keeping would be bypassed if the application manipulates the underlying cache directly but that's not something they should be doing anyways...

Copy link
Contributor Author

@gangliao gangliao Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I also wanted to create a wrapper in the blob source at first. But if we create the cache wrapper in the blob source, how to ensure that rocksdb will call the cache wrapper's release or erase. I expect this eviction will happen automatically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something but I think if we turned BlobSource::blob_cache_ into the wrapper, it would work pretty much out of the box. BlobSource would be interacting with the wrapper, and when we transfer the cache handle to PinnableSlice, it would carry the address of the wrapper cache, no?

db/blob/blob_source_test.cc Outdated Show resolved Hide resolved
db/blob/blob_source_test.cc Outdated Show resolved Hide resolved
db/blob/blob_source_test.cc Outdated Show resolved Hide resolved
db/blob/blob_source.h Outdated Show resolved Hide resolved
@hx235 hx235 self-requested a review July 14, 2022 00:12
Copy link
Contributor

@hx235 hx235 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly recalled from our last discussion that (1) we will not make change to sharded_cache level since it's mainly for cache sharing and (2) we will not make change to our public interface cache.h

Any further discussion that I am not part of on why (1) and (2) are both violated in this PR?

The level of abstraction we looked at was lru_cache.h although it might suffer from being limited to only lru cache implementation (which was fine to me)

But I like what @ltamasi has mentioned about the wrapper idea above - which may even overcome this limitation.

@gangliao
Copy link
Contributor Author

gangliao commented Jul 14, 2022

I briefly recalled from our last discussion that (1) we will not make change to sharded_cache level since it's mainly for cache sharing and (2) we will not make change to our public interface cache.h Any further discussion that I am not part of on why (1) and (2) are both violated in this PR?

This is because we only have Cache* type for blob cache. dynamic_cast it to lru cache in blob source is not very general for long term goal. So here I use virtual function polymorphism.

The level of abstraction we looked at was lru_cache.h although it might suffer from being limited to only lru cache implementation.

class LRUCache
#ifdef NDEBUG
    final
#endif
    : public ShardedCache {

LRUCache does not overload those functions: Insert, Release, Erase,...

The real impl is LRUCacheShard final : public CacheShard, but CacheShard is not ShardedCache. Yes, the class names around here are confusing.

@gangliao
Copy link
Contributor Author

I like what @ltamasi has mentioned about the wrapper idea above - which may solve this limitation.

I agree, that's a good idea!

@hx235
Copy link
Contributor

hx235 commented Jul 14, 2022

This is because we only have Cache* type for blob cache.

Oh okay - then it seems wrapper approach will saves us from needing to set CRM directly and change public interface.

@gangliao gangliao force-pushed the new_global_limit branch 2 times, most recently from 8618c17 to 558a677 Compare July 14, 2022 21:53
@gangliao gangliao requested a review from hx235 July 15, 2022 00:07
@facebook-github-bot
Copy link
Contributor

@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@gangliao has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @gangliao !! Looks pretty good! Have some mostly minor comments, please see below

HISTORY.md Outdated Show resolved Hide resolved
cache/charged_cache.h Outdated Show resolved Hide resolved
cache/charged_cache.cc Outdated Show resolved Hide resolved
cache/charged_cache.h Outdated Show resolved Hide resolved
cache/charged_cache.h Show resolved Hide resolved
db/blob/blob_source_test.cc Outdated Show resolved Hide resolved
include/rocksdb/cache.h Outdated Show resolved Hide resolved
table/block_based/block_based_table_factory.cc Outdated Show resolved Hide resolved
tools/db_bench_tool.cc Outdated Show resolved Hide resolved
db_stress_tool/db_stress_test_base.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@gangliao has updated the pull request. You must reimport the pull request before landing.

cache/charged_cache.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@gangliao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@gangliao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants