Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid recompressing cold block in CompressedSecondaryCache #10527

Closed
wants to merge 56 commits into from

Conversation

gitbw95
Copy link
Contributor

@gitbw95 gitbw95 commented Aug 14, 2022

Summary:
When a block is firstly Lookup from the secondary cache, we just insert a dummy block in the primary cache (charging the actual size of the block) and don’t erase the block from the secondary cache. A standalone handle is returned from Lookup. Only if the block is hit again, we erase it from the secondary cache and add it into the primary cache.

When a block is firstly evicted from the primary cache to the secondary cache, we just insert a dummy block (size 0) in the secondary cache. When the block is evicted again, it is treated as a hot block and is inserted into the secondary cache.

Implementation Details
Add a new state of LRUHandle: The handle is never inserted into the LRUCache (both hash table and LRU list) and it doesn't experience the above three states. The entry can be freed when refs becomes 0. (refs >= 1 && in_cache == false && IS_STANDALONE == true)

The behaviors of LRUCacheShard::Lookup() are updated if the secondary_cache is CompressedSecondaryCache:

  1. If a handle is found in primary cache:
    1.1. If the handle's value is not nullptr, it is returned immediately.
    1.2. If the handle's value is nullptr, this means the handle is a dummy one. For a dummy handle, if it was retrieved from secondary cache, it may still exist in secondary cache.
    • 1.2.1. If no valid handle can be Lookup from secondary cache, return nullptr.
    • 1.2.2. If the handle from secondary cache is valid, erase it from the secondary cache and add it into the primary cache.
  2. If a handle is not found in primary cache:
    2.1. If no valid handle can be Lookup from secondary cache, return nullptr.
    2.2. If the handle from secondary cache is valid, insert a dummy block in the primary cache (charging the actual size of the block) and return a standalone handle.

The behaviors of LRUCacheShard::Promote() are updated as follows:

  1. If e->sec_handle has value, one of the following steps can happen:
    1.1. Insert a dummy handle and return a standalone handle to caller when secondary_cache_ is CompressedSecondaryCache and e is a standalone handle.
    1.2. Insert the item into the primary cache and return the handle to caller.
    1.3. Exception handling.
  2. If e->sec_handle has no value, mark the item as not in cache and charge the cache as its only metadata that'll shortly be released.

The behavior of CompressedSecondaryCache::Insert() is updated:

  1. If a block is evicted from the primary cache for the first time, a dummy item is inserted.
  2. If a dummy item is found for a block, the block is inserted into the secondary cache.

The behavior of CompressedSecondaryCache:::Lookup() is updated:

  1. If a handle is not found or it is a dummy item, a nullptr is returned.
  2. If erase_handle is true, the handle is erased.

The behaviors of LRUCacheShard::Release() are adjusted for the standalone handles.

Test Plan:

  1. stress tests.
  2. unit tests.
  3. CPU profiling for db_bench.

@gitbw95 gitbw95 force-pushed the avoid_compress_cold_block_1 branch from da1ef91 to 412e4dc Compare August 14, 2022 06:55
@gitbw95 gitbw95 changed the title [Testing ]Avoid recompressing cold block [WIP ]Avoid recompressing cold block in CompressedSecondaryCache Aug 16, 2022
@gitbw95 gitbw95 marked this pull request as ready for review August 16, 2022 15:53
@gitbw95 gitbw95 requested a review from anand1976 August 16, 2022 15:53
@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@gitbw95
Copy link
Contributor Author

gitbw95 commented Aug 16, 2022

As discussed offline, I may do the following changes:

  1. Remove WAS_IN_SECONDARY_CACHE from LRUHandle. Add one param into Promote() to carry this information.
  2. Charge the capacity of secondary cache for the standalone handle.
  3. Find a way to make sure the behavior of current secondary cache implementation (e.g. CacheLibWrapper) is intact.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

4 similar comments
@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

15 similar comments
@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@gitbw95 gitbw95 changed the title [WIP ]Avoid recompressing cold block in CompressedSecondaryCache Avoid recompressing cold block in CompressedSecondaryCache Aug 22, 2022
@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -44,6 +46,16 @@ class CompressedSecondaryCacheResultHandle : public SecondaryCacheResultHandle {

// The CompressedSecondaryCache is a concrete implementation of
// rocksdb::SecondaryCache.
// When a block is firstly Lookup from CompressedSecondaryCache, we just
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"firstly Lookup" gives me a feeling that no matter whether we find a block from compressed cache or not, we would insert a dummy block. Perhaps clarify it.

//
// When a block is firstly evicted from the primary cache to
// CompressedSecondaryCache, we just insert a dummy block (size 0) in
// CompressedSecondaryCache. When the block is evicted again, it is treated as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should clarify "is evicted again". Only if it is evicted again before the dummy block is evicted from the cache, we insert it.

512, 640, 768, 896, 1024, 1280, 1536, 1792, 2048, 2560, 3072,
3584, 4096, 5120, 6144, 7168, 8192, 10240, 12288, 14336, 16384, 32768};
static constexpr std::array<uint16_t, 8> malloc_bin_sizes_{
128, 256, 512, 1024, 2048, 4096, 8192, 16384};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the change related?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The split and merge functions are not included in Lookup() and Insert() now.
I will revert this change.


Status Insert(const Slice& key, void* value,
const Cache::CacheItemHelper* helper) override;

std::unique_ptr<SecondaryCacheResultHandle> Lookup(
const Slice& key, const Cache::CreateCallback& create_cb, bool /*wait*/,
bool& is_in_sec_cache) override;
bool advise_erase, bool& is_in_sec_cache) override;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need comment here to explain the requirement for the Lookup() implementation by advise_erase.


if ((usage_ + e->total_charge) > capacity_ && strict_capacity_limit_) {
e->Unref();
e->Free();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I missed something but in my impression we don't call Free() in the mutex.

.PermitUncheckedError();
}
entry->Free();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole for-loop must be copied from somewhere else. Can we define a helper function and call it once?

@@ -463,34 +514,59 @@ Cache::Handle* LRUCacheShard::Lookup(
const ShardedCache::CreateCallback& create_cb, Cache::Priority priority,
bool wait, Statistics* stats) {
LRUHandle* e = nullptr;
bool erase_handle_in_sec_cache{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be easier to understand if it is called something like "found_dummy_entry".

"k1", test_item_creator, true, /*advise_erase=*/false, is_in_sec_cache);
ASSERT_EQ(handle1, nullptr);

// Insert k2 and k1 is evicted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we should also cover the case where, re-inserting k1 will insert a dummy handle, rather than a compressed block. Ignore this comment if the scenario is covered somewhere else.

// k1's item and k2's dummy item.
ASSERT_OK(cache->Insert(
"k2", item2_2, &CompressedSecondaryCacheTest::helper_, str2.length()));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see all those scenarios are covered in the unit test!

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@gitbw95 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants