
RFC: better control the memory size used by each part of the cache #4692

Closed
Qiaolin-Yu opened this issue Oct 6, 2022 · 7 comments

Labels
type/feature req Type: feature request

@Qiaolin-Yu (Contributor) commented Oct 6, 2022

Background

Currently, the kvstore of NebulaGraph only enables cache_index_and_filter_blocks when the partitioned index filter is enabled. In all other cases, the option cache_index_and_filter_blocks is never set and defaults to false.

```cpp
if (FLAGS_enable_partitioned_index_filter) {
  bbtOpts.index_type = rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
  bbtOpts.partition_filters = true;
  bbtOpts.cache_index_and_filter_blocks = true;
  bbtOpts.cache_index_and_filter_blocks_with_high_priority = true;
  bbtOpts.pin_top_level_index_and_filter = true;
  bbtOpts.pin_l0_filter_and_index_blocks_in_cache =
      baseOpts.compaction_style == rocksdb::CompactionStyle::kCompactionStyleLevel;
}
```

According to this section of the RocksDB wiki, cache_index_and_filter_blocks determines whether index and filter blocks are cached in the block cache. Given this background, there appears to be a tradeoff.

Tradeoff

If index and filter blocks are cached in the block cache, the memory used by RocksDB can be controlled more precisely. But if the configured block cache is too small, this may cause serious performance problems, because index and filter blocks are usually large.

Otherwise, index and filter blocks are stored on the heap and bounded only by max_open_files. They may be very large, their size cannot be accounted for accurately, and block_cache_tracer cannot trace them.
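(For reference: max_open_files is a RocksDB DBOptions entry, so in NebulaGraph it would be tuned through the storaged configuration, e.g. via the rocksdb_db_options flag. This is a sketch; the value below is purely illustrative.)

```
# Bound the table-reader (index/filter) footprint by capping open SST files.
# -1 means unlimited; a finite value limits how many index/filter blocks stay on the heap.
--rocksdb_db_options={"max_open_files":"1024"}
```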

Expectation

To the best of my knowledge, we could add a separate option for cache_index_and_filter_blocks to the storage configuration (defaulting to false). It could be enabled when the configured block cache is large enough and the user wants tighter control over total memory usage.
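Concretely, the proposal amounts to one new storaged flag, decoupled from the existing one. A sketch in the conf-file style NebulaGraph already uses — the new flag's name and default are chosen here for illustration only:

```
# Existing flag: controls partitioned index filters (and today also
# implicitly forces index/filter caching on).
--enable_partitioned_index_filter=false
# Proposed (hypothetical) flag: cache index/filter blocks in the block
# cache independently of the partitioned-index-filter setting.
--cache_index_and_filter_blocks=true
```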

I can help to complete this part of the work if you think it is correct and valuable.

@Qiaolin-Yu Qiaolin-Yu added the type/feature req Type: feature request label Oct 6, 2022
@xtcyclist (Contributor) commented Oct 7, 2022

If the trade-off you mentioned could be studied thoroughly, I think it would be possible to manage the caching of index and filter blocks automatically. On the case you mentioned ("But if the given block cache size is too small"): may I know what exactly your application scenario is? Do you have observations on, for example, what the lower bound of the block cache size is, below which we should disable the caching of index and filter blocks?

I think we should be cautious about adding a new option to configure. The trend is to reduce the number of parameters for user-friendliness. Simply offering an option for users to configure does not solve the problem, and may make the product harder to use. In reality, users often ignore many configurations and leave them at their defaults, which defeats the purpose of adding them.

I have seen users report that their best practice is to disable the block cache entirely to avoid the disturbance that locking causes on latencies, which they found to be the bottleneck after detailed performance analysis. The current set of configurations satisfies their needs.

@Qiaolin-Yu (Contributor, Author) commented Oct 7, 2022

Thanks for your comment. @xtcyclist

On the case you mentioned: But if the given block cache size is too small, may I know what exactly your application scenario is?

According to the RocksDB Wiki,
By putting index, filter, and compression dictionary blocks in block cache, these blocks have to compete against data blocks for staying in cache. Although index and filter blocks are being accessed more frequently than data blocks, there are scenarios where these blocks can be thrashing. This is undesired because index and filter blocks tend to be much larger than data blocks, and they are usually of higher value to stay in cache (the latter is also true for compression dictionary blocks).

When the configured block cache is small, the competition between data blocks and index/filter blocks for cache space is fiercer. In extreme cases (e.g., when the block cache is disabled), index/filter blocks cannot be cached at all.

Do you have observations on, for example, what the lower bound of the block cache size is, below which we shall disable the caching of index and filtering blocks?

Currently, I have no detailed observations on the lower bound of the block cache size, because it may differ across workloads.

I have seen users report that their best practice is to disable the block cache entirely to avoid the disturbance that locking causes on latencies, which they found to be the bottleneck after detailed performance analysis. The current set of configurations satisfies their needs.

Do they enable another cache after disabling the block cache, or do they use no cache at all and still achieve better performance? This seems like an interesting phenomenon.

@xtcyclist (Contributor) commented Oct 7, 2022

The RocksDB wiki is about RocksDB, or LSM-trees in general, which is fine. But for NebulaGraph, we had better identify a particular application scenario in the graph context while preparing an idea for kernel-level development in NebulaGraph. This is why I am asking for a specific application scenario to motivate this issue.

On one hand, when the block cache is not sufficiently large, caching indexes and filters may evict many data blocks, because index and filter blocks are accessed more frequently. In this case, it may make sense to adjust the priority of filters, indexes, and data blocks to avoid thrashing while still caching some hot data. Refer to this blog from smalldatum for the priority configurations. There may be many other options to consider for this issue before we conclude that the caching of filters and indexes should be cut off.

On the other hand, if the block cache is already very small, there may be very little benefit to gain from using it for query processing in the first place, considering that block caches come with a whole family of troubles. The locking overhead mentioned above is one of them. Refer to this blog for more details. So, what do you think of simply turning off the block cache when there is little DRAM available for it?
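The priority adjustment mentioned above can be illustrated with a toy model. This is not RocksDB's actual implementation (RocksDB uses midpoint insertion into a single LRU list, governed by high_pri_pool_ratio); it is a minimal two-pool sketch of why high-priority index/filter entries survive heavy data-block traffic instead of thrashing:

```python
from collections import OrderedDict

class TwoPoolLRU:
    """Toy LRU cache with a reserved high-priority pool.

    A simplification of RocksDB's high_pri_pool_ratio idea: entries
    marked high-priority only compete with each other for a reserved
    slice of the capacity, so data-block churn cannot evict them.
    """

    def __init__(self, capacity, high_pri_ratio=0.5):
        self.high_capacity = int(capacity * high_pri_ratio)
        self.low_capacity = capacity - self.high_capacity
        self.high = OrderedDict()  # e.g. index/filter blocks
        self.low = OrderedDict()   # e.g. data blocks

    def put(self, key, high_priority=False):
        pool, cap = ((self.high, self.high_capacity) if high_priority
                     else (self.low, self.low_capacity))
        pool[key] = True
        pool.move_to_end(key)          # mark as most recently used
        while len(pool) > cap:
            pool.popitem(last=False)   # evict LRU within the same pool

    def __contains__(self, key):
        return key in self.high or key in self.low

cache = TwoPoolLRU(capacity=8, high_pri_ratio=0.25)
cache.put("index:sst1", high_priority=True)
cache.put("filter:sst1", high_priority=True)
for i in range(100):                   # heavy data-block traffic
    cache.put(f"data:{i}")
print("filter:sst1" in cache)          # True: survived the churn
print("data:0" in cache)               # False: evicted by newer data blocks
```

With no priority separation (high_pri_ratio=0), the same traffic would evict the index and filter entries, which is exactly the thrashing the wiki warns about.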

@Qiaolin-Yu (Contributor, Author) commented

@xtcyclist Thank you for the relevant information.
In my previous evaluation on the LDBC graph dataset, when the block cache was completely disabled, the performance of NebulaGraph became extremely poor (cc @wenhaocs). I don't know whether disabling the block cache brings benefits under some workloads. It's true that many evaluations remain to be done on cache allocation for the graph database.

But this issue is not intended to determine which cache policy is best. My point is only that cache_index_and_filter_blocks may be better exposed as a separate option instead of being controlled solely by enable_partitioned_index_filter.

According to this blog, I think it is worth noting this part.

[Screenshot: excerpt from the blog post]

To change cache_index_and_filter_blocks, users currently have to set enable_partitioned_index_filter and max_open_files instead of setting cache_index_and_filter_blocks directly. That is confusing and may cause new problems, because sometimes I just want to set enable_partitioned_index_filter to false while setting cache_index_and_filter_blocks to true.

The trend is to reduce the number of parameters for user-friendliness. Simply offering an option for users to configure does not solve the problem, and may make the product harder to use. In reality, users often ignore many configurations and leave them at their defaults, which defeats the purpose of adding them.

I totally agree with you. If you think there is no need to add this config, I will close this issue.

@wenhaocs (Contributor) commented Oct 7, 2022

@Qiaolin-Yu "When the block cache is completely disabled, the performance of NebulaGraph becomes extremely poor." Are you using knife? Which queries are you running? Did you disable the page cache?

In my opinion, the block cache is more suitable for workloads with high recency. If the workload is mostly scans, the overhead of the block cache becomes more obvious. In my experiments with the LDBC benchmark, it works better with a smaller block cache, around 20 GB to 32 GB. But removing the block cache entirely definitely hurts.

On the other hand, I understand that a lot of companies like to have full control over memory usage by disabling the page cache and adding more cache layers above RocksDB. In that case, having control over indexes and filters totally makes sense. I think we can discuss whether to expose the parameter or, as a compromise, add this option to rocksdb_block_based_table_options as a somewhat hidden parameter. cc @xtcyclist

@Qiaolin-Yu (Contributor, Author) commented

"When the block cache is completely disabled, the performance of NebulaGraph becomes extremely poor." Are you using knife? Which queries are you running? Did you disable the page cache?

@wenhaocs I use knife and disable the page cache. For the workload interactive-short-1, the evaluation results are as follows.
| block cache size | cache_index_and_filter_blocks | P99 latency (us) |
| --- | --- | --- |
| 0 | false | 14084 |
| 1024 | false | 9732 |
| 128 | true | 1662408 |
| 256 | true | 340223 |
| 384 | true | 257253 |
| 512 | true | 147871 |
| 640 | true | 93044 |
| 768 | true | 70353 |
| 896 | true | 43138 |
| 1024 | true | 27330 |

@Qiaolin-Yu Qiaolin-Yu changed the title RFC: store the index block and filter block in the block cache RFC: better control the memory size used by each part of the cache Oct 8, 2022
@wey-gu (Contributor) commented Feb 21, 2023

@wenhaocs @xtcyclist could you please take a look at this RFC?
Thanks!
