Charge blob cache usage against the global memory limit #10321
Conversation
If we can charge blob cache usage against the global memory limit, we can also do that for the blob file cache. But it's not particularly urgent, since the blob file cache is pretty small.
Thanks for the PR @gangliao !
cache/sharded_cache.cc (Outdated)
```cpp
Status s = GetShard(Shard(hash))
               ->Insert(key, hash, value, charge, deleter, handle, priority);
if (s.ok() && cache_res_mgr_) {
  // Insert may cause the cache entry eviction if the cache is full. So we
  // directly call the reservation manager to update the total memory used
  // in the cache.
  cache_res_mgr_->UpdateCacheReservation(GetUsage()).PermitUncheckedError();
}
return s;
```
Instead of adding the cache reservation manager directly to ShardedCache, we could consider moving this logic to a separate class which also implements the Cache interface, wraps another Cache, takes care of the cache reservations, and forwards calls to the wrapped cache. This would be a more decoupled design, and would also make the charging logic work with any custom cache implementation.
We might want to add some validation to ensure that such a "cache reservation aware" cache is configured as a blob cache if and only if charging is enabled for CacheEntryRole::kBlobCache.
> if and only if charging is enabled for CacheEntryRole::kBlobCache.
If we do open up API usage like https://github.com/facebook/rocksdb/pull/10321/files#diff-5c4ced6afb6a90e27fec18ab03b2cd89e8f99db87791b4ecc6fa2694284d50c0R2921-R2925 to users, the API comment of BlockBasedTableOptions::cache_usage_options also needs to be updated.

Also, should we consider early option validation like https://github.com/facebook/rocksdb/blob/7.4.fb/table/block_based/block_based_table_factory.cc#L709-L714 but for kBlobCache, e.g., no blob cache is specified, or blob cache == block cache, when charging for CacheEntryRole::kBlobCache is enabled?
> Instead of adding the cache reservation manager directly to ShardedCache, we could consider moving this logic to a separate class which also implements the Cache interface, wraps another Cache, takes care of the cache reservations, and forwards calls to the wrapped cache. This would be a more decoupled design, and would also make the charging logic work with any custom cache implementations.
Where is this cache wrapper created? It looks like the user can create it and pass it to options.blob_cache?

In this case, it's hard to add validation. The way @hx235 described above is feasible: even if the options.blob_cache settings are wrong, we are able to detect anomalies via option validation in block_based_table_factory and throw an error.
> Also, should we consider early option validation like https://github.com/facebook/rocksdb/blob/7.4.fb/table/block_based/block_based_table_factory.cc#L709-L714 but for kBlobCache, e.g., no blob cache is specified, or blob cache == block cache, when CacheEntryRole::kBlobCache is enabled?

Yes, and I suppose we could also do the validation I mentioned here (i.e., make sure that options.blob_cache is the above-mentioned special type of cache iff the blob cache is charged to the block cache).
> Where is this cache wrapper created? It looks like the user can create it and pass it to options.blob_cache? In this case, it's hard to add validation. The way @hx235 described above is feasible: even if the options.blob_cache settings are wrong, we are able to detect anomalies via option validation in block_based_table_factory and throw an error.

Yes, it would be created by the application and assigned to options.blob_cache. During validation, we could use e.g. dynamic_cast to ensure the application configured the right type of cache.
Looks like dynamic_cast is not allowed in RocksDB. I used a different approach here.
Sorry, I'm kind of having second thoughts re: having the application create the wrapper. I'm wondering if it would be better if we created it, e.g., in BlobSource? I feel it would be more user-friendly and would hide this implementation detail. The downside is that the book-keeping would be bypassed if the application manipulated the underlying cache directly, but that's not something they should be doing anyway...
No problem, I also wanted to create a wrapper in the blob source at first. But if we create the cache wrapper in the blob source, how do we ensure that RocksDB calls the cache wrapper's Release or Erase? I'd expect this eviction to happen automatically.
Maybe I'm missing something, but I think if we turned BlobSource::blob_cache_ into the wrapper, it would work pretty much out of the box. BlobSource would be interacting with the wrapper, and when we transfer the cache handle to PinnableSlice, it would carry the address of the wrapper cache, no?
I briefly recall from our last discussion that (1) we would not make changes at the sharded_cache level, since it's mainly for cache sharding, and (2) we would not make changes to our public interface cache.h. Was there any further discussion, which I wasn't part of, on why (1) and (2) are both violated in this PR?

The level of abstraction we looked at was lru_cache.h, although it might suffer from being limited to the LRU cache implementation only (which was fine with me). But I like what @ltamasi has mentioned about the wrapper idea above, which may even overcome this limitation.
This is because we only have:

```cpp
class LRUCache
#ifdef NDEBUG
    final
#endif
    : public ShardedCache {
```

LRUCache does not override those functions: Insert, Release, Erase, ... The real impl is
I agree, that's a good idea!
Oh okay, then it seems the wrapper approach will save us from needing to set the CRM directly and from changing the public interface.
@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@gangliao has updated the pull request. You must reimport the pull request before landing.
Thanks for the updates @gangliao!! Looks pretty good! I have some mostly minor comments, please see below.
Summary:
To help service owners manage their memory budget effectively, we have been working towards counting all major memory users inside RocksDB towards a single global memory limit (see e.g. https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager#cost-memory-used-in-memtable-to-block-cache). The global limit is specified by the capacity of the block-based table's block cache, and is technically implemented by inserting dummy entries ("reservations") into the block cache. The goal of this task is to support charging the memory usage of the new blob cache against this global memory limit when the backing cache of the blob cache and the block cache are different.
This PR is a part of #10156