Improve LRU cache's performance #7611

Open

acelyc111 wants to merge 2 commits into facebook:main from acelyc111:lru_optimize

Conversation

@acelyc111
Contributor

When LRUCache inserts and evicts a large number of entries, LRUHandleTable::Remove(e->key, e->hash) is called frequently, and each call has to look the entry up in the hash table again. Since we already know the entry 'e' to remove, we can unlink it directly from the hash table's collision list if that list is doubly linked.
This patch refactors the collision list into a doubly linked list. The simple benchmark CacheTest.SimpleBenchmark shows the time cost reduced by about 18% in my test environment.
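
Conceptually, the change looks roughly like the sketch below (illustrative only; member names such as prev_hash and list_ are assumptions, and the real patch differs in details). With a back link in each collision chain, Remove can unlink an entry it already holds in O(1) instead of walking the chain again:

#include <cstdint>
#include <vector>

struct LRUHandle {
  LRUHandle* next_hash = nullptr;  // next entry in the same bucket
  LRUHandle* prev_hash = nullptr;  // added by this patch: previous entry in the bucket
  uint32_t hash = 0;
  // ... key, refs, charge, etc.
};

class LRUHandleTable {
 public:
  explicit LRUHandleTable(uint32_t length) : length_(length), list_(length, nullptr) {}

  // Unlink 'e' from its collision chain without re-finding it.
  void Remove(LRUHandle* e) {
    if (e->prev_hash != nullptr) {
      e->prev_hash->next_hash = e->next_hash;
    } else {
      // 'e' is the first entry of its bucket.
      list_[e->hash & (length_ - 1)] = e->next_hash;
    }
    if (e->next_hash != nullptr) {
      e->next_hash->prev_hash = e->prev_hash;
    }
    e->next_hash = e->prev_hash = nullptr;
  }

 private:
  uint32_t length_;               // number of buckets (a power of two)
  std::vector<LRUHandle*> list_;  // bucket heads
};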

@acelyc111 force-pushed the lru_optimize branch 4 times, most recently from cc8183d to 341b5fa on October 31, 2020 09:40
@acelyc111 marked this pull request as ready for review on November 1, 2020 16:23
@acelyc111
Contributor Author

@mrambacher could you please review this PR? thanks!

@pdillinger
Contributor

If we took the space spent on reverse links and instead dedicated that to reducing load factor on the hash table, how much speed improvement is seen?

Btw, we know a lot of improvement in LRUCache should be possible, especially because it uses a lot of unpredictable branching and indirection.

Contributor

@mrambacher left a comment

So I ran some performance tests via db_bench (seekrandom, readrandom) using these changes. The performance improvement is about 1%.

Perf runs of the new versus the original code show:
New:
3.86% 0.12% ShardedCache::Insert
3.52% 0.13% LRUCacheShard::Insert
3.01% 0.02% LRUCacheShard::Lookup
2.83% 2.83% LRUHandleTable::FindPointer
1.66% 0.27% LRUCacheShard::EvictFromLRU
1.34% 1.34% LRUCacheShared::LRU_Remove

Original:
4.53% 0.13% ShardedCache::Insert
4.13% 0.15% LRUCacheShard::Insert
2.00% 0.02% LRUCacheShard::Lookup
2.31% 2.31% LRUHandleTable::FindPointer
1.99% 0.10% LRUCacheShard::EvictFromLRU
1.34% 1.34% LRUCacheShared::LRU_Remove

Note that in this test, perf shows that most of the time is spent in ZSTD compression.

I think making a change like this is probably worthwhile, but I wonder if the same thing can be accomplished without the extra "prev" pointer. For example, what if FindPointer returned both the current and the previous entry -- could you accomplish the same thing? The size of the metadata for the cache entry should be kept to a minimum.
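
A hypothetical sketch of that alternative (not RocksDB's actual FindPointer API; the names here are made up for illustration):

#include <cstdint>
#include <string>

struct Handle {
  Handle* next_hash = nullptr;  // singly linked collision chain
  uint32_t hash = 0;
  std::string key;
};

struct FindResult {
  Handle* prev = nullptr;  // entry just before the match (nullptr if the match is first)
  Handle* curr = nullptr;  // the matching entry (nullptr if not found)
};

// Walk the chain once, remembering the previous entry as we go.
FindResult FindWithPrev(Handle* bucket_head, uint32_t hash, const std::string& key) {
  FindResult r;
  for (Handle* h = bucket_head; h != nullptr; r.prev = h, h = h->next_hash) {
    if (h->hash == hash && h->key == key) {
      r.curr = h;
      break;
    }
  }
  return r;
}

// Unlink the found entry using the remembered previous entry.
void RemoveFound(Handle** bucket_head, const FindResult& r) {
  if (r.curr == nullptr) return;
  if (r.prev != nullptr) {
    r.prev->next_hash = r.curr->next_hash;
  } else {
    *bucket_head = r.curr->next_hash;
  }
}

This keeps the per-entry metadata unchanged; the chain is still walked once inside the find.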

Have you done any experiments or calculations that show how the change in the metadata size affects the number of entries that can be stored in the cache? Are items evicted more frequently and at what rate/size?

I support this performance improvement, but would like to better understand the tradeoffs we are making here.


namespace ROCKSDB_NAMESPACE {

class LRUHandleTableTest : public testing::Test {};
Contributor

Is there a reason not to put these tests in the existing lru_cache_test file?

Contributor Author

I just want to keep these tests independent: lru_cache_test covers LRUCacheShard, and cache_hash_table_test covers the internal LRUHandleTable.

@siying
Contributor

siying commented Dec 30, 2020

Are we supposed to resize the hash table to more than 1.5x the number of elements? There aren't supposed to be many hash table buckets with more than 1-2 elements, right? Can you measure how frequently we remove an element that is in the third position or lower, in your test that shows the improvement?

@mrambacher
Contributor

Are we supposed to resize the hash table to more than 1.5x the number of elements? There aren't supposed to be many hash table buckets with more than 1-2 elements, right? Can you measure how frequently we remove an element that is in the third position or lower, in your test that shows the improvement?

For kicks, I am running some experiments now to test this out using the seekrandom test. I can see the collision list depth reaching 8. Here are a couple of samples of the depths. Each element in depth is the count of how many lists had that depth for a given table (e.g., the first table had 2981 lists with no elements, 3062 with 1, 1511 with 2, etc.).
length=8192 depth=2981/3062/1511/477/132/26/2/0
length=8192 depth=2988/3057/1491/500/129/24/3/0
length=8192 depth=3011/3012/1522/505/101/31/7/3
length=8192 depth=3015/3008/1523/480/138/23/3/2
length=8192 depth=3031/2967/1531/509/136/15/3/0
length=8192 depth=3022/2987/1536/493/125/21/6/2
length=8192 depth=3023/3007/1481/532/120/27/1/1

So a rough estimate is that 8-10% of the lists within a table have a depth of >=3 and 2% of the lists have a depth >= 4.
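
A rough sketch of the kind of instrumentation that could produce these counts (illustrative only, not the actual print statements):

#include <cstdio>
#include <vector>

struct Handle {
  Handle* next_hash = nullptr;
};

// Count, for each chain depth, how many buckets have that depth, and print
// one "length=... depth=a/b/c/..." line like the samples above.
void PrintDepthHistogram(const std::vector<Handle*>& buckets, size_t max_depth = 8) {
  std::vector<size_t> depth_counts(max_depth, 0);
  for (Handle* head : buckets) {
    size_t depth = 0;
    for (Handle* h = head; h != nullptr; h = h->next_hash) {
      ++depth;
    }
    if (depth >= max_depth) depth = max_depth - 1;  // clamp deeper chains into the last bin
    ++depth_counts[depth];
  }
  std::printf("length=%zu depth=", buckets.size());
  for (size_t i = 0; i < depth_counts.size(); ++i) {
    std::printf("%zu%s", depth_counts[i], i + 1 < depth_counts.size() ? "/" : "\n");
  }
}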

@siying
Contributor

siying commented Dec 30, 2020

Are we supposed to resize the hash table to more than 1.5x the number of elements? There aren't supposed to be many hash table buckets with more than 1-2 elements, right? Can you measure how frequently we remove an element that is in the third position or lower, in your test that shows the improvement?

For kicks, I am running some experiments now to test this out using the seekrandom test. I can see the collision list depth reaching 8. Here are a couple of samples of the depths. Each element in depth is the count of how many lists had that depth for a given table (e.g., the first table had 2981 lists with no elements, 3062 with 1, 1511 with 2, etc.).
length=8192 depth=2981/3062/1511/477/132/26/2/0
length=8192 depth=2988/3057/1491/500/129/24/3/0
length=8192 depth=3011/3012/1522/505/101/31/7/3
length=8192 depth=3015/3008/1523/480/138/23/3/2
length=8192 depth=3031/2967/1531/509/136/15/3/0
length=8192 depth=3022/2987/1536/493/125/21/6/2
length=8192 depth=3023/3007/1481/532/120/27/1/1

So a rough estimate is that 8-10% of the lists within a table have a depth of >=3 and 2% of the lists have a depth >= 4.

Interesting. If we change these two thresholds: https://github.com/facebook/rocksdb/blob/master/cache/lru_cache.cc#L73-L75 and https://github.com/facebook/rocksdb/blob/master/cache/lru_cache.cc#L44 to keep the hash table larger, what would be the result?

@mrambacher
Contributor

Another interesting point is that approximately 35% of the lists contain no elements. From my print statements, it appears that this percentage stays roughly constant as the table size increases (as more elements are added to the cache).

@acelyc111
Contributor Author

I think making a change like this is probably worthwhile, but I wonder if the same thing can be accomplished without the extra "prev" pointer. For example, what if FindPointer returned both the current and the previous entry -- could you accomplish the same thing? The size of the metadata for the cache entry should be kept to a minimum.

That doesn't seem possible to implement. My goal is to avoid the list iteration when calling Remove(LRUHandle* h): callers such as EraseUnRefEntries, EvictFromLRU and Release already hold the entry and can remove it directly, so FindPointer is not called in those places, and that lookup is exactly what I want to avoid.

@acelyc111
Contributor Author

acelyc111 commented Jan 4, 2021

Have you done any experiments or calculations that show how the change in the metadata size affects the number of entries that can be stored in the cache? Are items evicted more frequently and at what rate/size?

An 8-byte LRUHandle* prev_hash is added to the original 72 bytes (the size of struct LRUHandle).
Following how capacity is calculated in https://github.com/facebook/rocksdb/blob/master/cache/lru_cache.h#L134, a simple formula for the extra space is:

extra_space_rate = 8 / (72 - 1 + key_length + charge)   // charge is roughly the value length
                 ~= 8 / (71 + kv_length)
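
As a rough illustrative calculation (not a measurement): with a 16-byte key and a 4 KiB data block (charge ≈ 4096), extra_space_rate ≈ 8 / (71 + 16 + 4096) ≈ 0.19%, so the added prev_hash pointer costs well under 1% of cache capacity for typical block sizes; only for very small entries (kv_length in the tens of bytes) does the overhead reach several percent.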
