
RFC: better control the memory size used by each part of the cache #4692

Closed
Qiaolin-Yu opened this issue Oct 6, 2022 · 7 comments

Labels
type/feature req Type: feature request

@Qiaolin-Yu (Contributor) commented Oct 6, 2022

Background

Currently, the kvstore of NebulaGraph only enables cache_index_and_filter_blocks when the partitioned index filter is enabled. In all other cases, the option cache_index_and_filter_blocks is never set and defaults to false.

```cpp
if (FLAGS_enable_partitioned_index_filter) {
  bbtOpts.index_type = rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
  bbtOpts.partition_filters = true;
  bbtOpts.cache_index_and_filter_blocks = true;
  bbtOpts.cache_index_and_filter_blocks_with_high_priority = true;
  bbtOpts.pin_top_level_index_and_filter = true;
  bbtOpts.pin_l0_filter_and_index_blocks_in_cache =
      baseOpts.compaction_style == rocksdb::CompactionStyle::kCompactionStyleLevel;
}
```

According to this section of the RocksDB wiki, cache_index_and_filter_blocks determines whether index and filter blocks are cached in the block cache. Given this background, there appears to be a tradeoff.

Tradeoff

If index and filter blocks are cached in the block cache, the memory used by RocksDB can be controlled more precisely. But if the configured block cache is too small, this may cause serious performance problems, because index and filter blocks are usually large.

Otherwise, index and filter blocks are stored on the heap and bounded only by max_open_files. They may be very large, their size cannot be accounted for accurately, and block_cache_tracer cannot trace them.
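(For reference: max_open_files is a RocksDB DBOptions entry, so in NebulaGraph it would be tuned through the storaged configuration, e.g. via the rocksdb_db_options flag. This is a sketch; the value below is purely illustrative.)

```
# Bound the table-reader (index/filter) footprint by capping open SST files.
# -1 means unlimited; a finite value limits how many index/filter blocks stay on the heap.
--rocksdb_db_options={"max_open_files":"1024"}
```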

Expectation

To the best of my knowledge, we could add a separate option for cache_index_and_filter_blocks to the storage configuration (defaulting to false). It could be enabled when the configured block cache is large enough and the user wants tighter control over total memory usage.
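Concretely, the proposal amounts to one new storaged flag, decoupled from the existing one. A sketch in the conf-file style NebulaGraph already uses — the new flag's name and default are chosen here for illustration only:

```
# Existing flag: controls partitioned index filters (and today also
# implicitly forces index/filter caching on).
--enable_partitioned_index_filter=false
# Proposed (hypothetical) flag: cache index/filter blocks in the block
# cache independently of the partitioned-index-filter setting.
--cache_index_and_filter_blocks=true
```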

I can help to complete this part of the work if you think it is correct and valuable.

@Qiaolin-Yu Qiaolin-Yu added the type/feature req Type: feature request label Oct 6, 2022
@xtcyclist (Contributor) commented Oct 7, 2022

If the trade-off you mentioned could be studied thoroughly, I think it would be possible to manage the caching of index and filter blocks automatically. On the case you mentioned ("But if the given block cache size is too small"): may I know what exactly your application scenario is? Do you have observations on, for example, what the lower bound of the block cache size is, below which we should disable the caching of index and filter blocks?

I think we should be cautious about adding a new option to configure. The trend is to reduce the number of parameters for user-friendliness. Simply offering an option for users to configure does not solve the problem, and may make the product harder to use. In reality, users often ignore many configurations and leave them at their defaults, which defeats the purpose of adding them.

I have seen users report that their best practice is to disable the block cache entirely to avoid the disturbance that locking causes on latencies, which they found to be the bottleneck after detailed performance analysis. The current set of configurations satisfies their needs.

@Qiaolin-Yu (Contributor, Author) commented Oct 7, 2022

Thanks for your comment. @xtcyclist

On the case you mentioned: But if the given block cache size is too small, may I know what exactly your application scenario is?

According to the RocksDB Wiki,
By putting index, filter, and compression dictionary blocks in block cache, these blocks have to compete against data blocks for staying in cache. Although index and filter blocks are being accessed more frequently than data blocks, there are scenarios where these blocks can be thrashing. This is undesired because index and filter blocks tend to be much larger than data blocks, and they are usually of higher value to stay in cache (the latter is also true for compression dictionary blocks).

When the configured block cache is small, the competition between data blocks and index/filter blocks for cache space is fiercer. In extreme cases (e.g., when the block cache is disabled), index/filter blocks cannot be cached at all.

Do you have observations on, for example, what the lower bound of the block cache size is, below which we shall disable the caching of index and filtering blocks?

Currently, I have no detailed observations on the lower bound of the block cache size, because it may differ across workloads.

I have seen users report that their best practice is to disable the block cache entirely to avoid the disturbance that locking causes on latencies, which they found to be the bottleneck after detailed performance analysis. The current set of configurations satisfies their needs.

Do they enable another cache after disabling the block cache, or do they use no cache at all and still achieve better performance? This seems like an interesting phenomenon.

@xtcyclist (Contributor) commented Oct 7, 2022

The RocksDB wiki is about RocksDB, or LSM-trees in general, which is fine. But for NebulaGraph, we had better identify a particular application scenario in the graph context while preparing an idea for kernel-level development in NebulaGraph. This is why I am asking for a specific application scenario to motivate this issue.

On one hand, when the block cache is not sufficiently large, caching indexes and filters may evict many data blocks, because index and filter blocks are accessed more frequently. In this case, it may make sense to adjust the priority of filters, indexes, and data blocks to avoid thrashing while still caching some hot data. Refer to this blog from smalldatum for the priority configurations. There may be many other options to consider for this issue before we conclude that the caching of filters and indexes should be cut off.

On the other hand, if the block cache is already very small, there may be very little benefit to gain from using it for query processing in the first place, considering that block caches come with a whole family of troubles. The locking overhead mentioned above is one of them. Refer to this blog for more details. So, what do you think of simply turning off the block cache when there is little DRAM available for it?
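The priority adjustment mentioned above can be illustrated with a toy model. This is not RocksDB's actual implementation (RocksDB uses midpoint insertion into a single LRU list, governed by high_pri_pool_ratio); it is a minimal two-pool sketch of why high-priority index/filter entries survive heavy data-block traffic instead of thrashing:

```python
from collections import OrderedDict

class TwoPoolLRU:
    """Toy LRU cache with a reserved high-priority pool.

    A simplification of RocksDB's high_pri_pool_ratio idea: entries
    marked high-priority only compete with each other for a reserved
    slice of the capacity, so data-block churn cannot evict them.
    """

    def __init__(self, capacity, high_pri_ratio=0.5):
        self.high_capacity = int(capacity * high_pri_ratio)
        self.low_capacity = capacity - self.high_capacity
        self.high = OrderedDict()  # e.g. index/filter blocks
        self.low = OrderedDict()   # e.g. data blocks

    def put(self, key, high_priority=False):
        pool, cap = ((self.high, self.high_capacity) if high_priority
                     else (self.low, self.low_capacity))
        pool[key] = True
        pool.move_to_end(key)          # mark as most recently used
        while len(pool) > cap:
            pool.popitem(last=False)   # evict LRU within the same pool

    def __contains__(self, key):
        return key in self.high or key in self.low

cache = TwoPoolLRU(capacity=8, high_pri_ratio=0.25)
cache.put("index:sst1", high_priority=True)
cache.put("filter:sst1", high_priority=True)
for i in range(100):                   # heavy data-block traffic
    cache.put(f"data:{i}")
print("filter:sst1" in cache)          # True: survived the churn
print("data:0" in cache)               # False: evicted by newer data blocks
```

With no priority separation (high_pri_ratio=0), the same traffic would evict the index and filter entries, which is exactly the thrashing the wiki warns about.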

@Qiaolin-Yu (Contributor, Author) commented

@xtcyclist Thank you for the relevant information.
In my previous evaluation on the LDBC graph dataset, when the block cache was completely disabled, the performance of NebulaGraph became extremely poor (cc @wenhaocs). I don't know whether disabling the block cache brings benefits under some workloads. It's true that many evaluations remain to be done on cache allocation for the graph database.

But this issue is not intended to determine which cache policy is best. My point is only that cache_index_and_filter_blocks may be better exposed as a separate option instead of being controlled solely by enable_partitioned_index_filter.

According to this blog, I think it is worth noting this part.

[Screenshot: excerpt from the blog post]

To change cache_index_and_filter_blocks, users currently have to set enable_partitioned_index_filter and max_open_files instead of setting cache_index_and_filter_blocks directly. That is confusing and may cause new problems, because sometimes I just want to set enable_partitioned_index_filter to false while setting cache_index_and_filter_blocks to true.

The trend is to reduce the number of parameters for user-friendliness. Simply offering an option for users to configure does not solve the problem, and may make the product harder to use. In reality, users often ignore many configurations and leave them at their defaults, which defeats the purpose of adding them.

I totally agree with you. If you think there is no need to add this config, I will close this issue.

@wenhaocs (Contributor) commented Oct 7, 2022

@Qiaolin-Yu "When the block cache is completely disabled, the performance of NebulaGraph becomes extremely poor." Are you using knife? Which queries are you running? Did you disable the page cache?

In my opinion, the block cache is more suitable for workloads with high recency. If the workload is mostly scans, the overhead of the block cache becomes more obvious. In my experiments with the LDBC benchmark, it works better with a smaller block cache, around 20 GB to 32 GB. But removing the block cache entirely definitely hurts.

On the other hand, I understand that a lot of companies like to have full control over memory usage by disabling the page cache and adding more cache layers above RocksDB. In that case, having control over indexes and filters totally makes sense. I think we can discuss whether to expose the parameter or, as a compromise, add this option to rocksdb_block_based_table_options as a somewhat hidden parameter. cc @xtcyclist

@Qiaolin-Yu (Contributor, Author) commented

"When the block cache is completely disabled, the performance of NebulaGraph becomes extremely poor." Are you using knife? Which queries are you running? Did you disable the page cache?

@wenhaocs I use knife and disable the page cache. For the workload interactive-short-1, the evaluation results are as follows.
| block cache size | cache_index_and_filter_blocks | P99 latency (us) |
| --- | --- | --- |
| 0 | false | 14084 |
| 1024 | false | 9732 |
| 128 | true | 1662408 |
| 256 | true | 340223 |
| 384 | true | 257253 |
| 512 | true | 147871 |
| 640 | true | 93044 |
| 768 | true | 70353 |
| 896 | true | 43138 |
| 1024 | true | 27330 |

@Qiaolin-Yu Qiaolin-Yu changed the title RFC: store the index block and filter block in the block cache RFC: better control the memory size used by each part of the cache Oct 8, 2022
@wey-gu (Contributor) commented Feb 21, 2023

@wenhaocs @xtcyclist could you please take a look at this RFC?
Thanks!
