Support in-memory compaction in rocksdb and use it for lock CF #8140

yiwu-arbug · 2020-06-25T23:47:42Z

Feature Request

Is your feature request related to a problem? Please describe:

In RocksDB (or any LSM-tree implementation), when data is deleted soon after being inserted, the result of flushing a memtable to disk can be significantly smaller than the memtable itself. This makes compaction inefficient because L0 files become too small. One way to work around this is to compact memtables: instead of flushing memtables when they are full, we can run compaction on them to reduce their size in memory. The technique reduces IO and also keeps hot data in memory for as long as possible.
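To make the idea concrete, here is a minimal sketch in Python of what compacting memtables could look like. It is not RocksDB code: `Entry`, `compact_memtables`, and the `drop_tombstones` flag are hypothetical names made up for illustration, and snapshots are ignored. Dropping a tombstone is only safe when no older version of the key exists on disk, so the sketch keeps tombstones unless the caller opts in.

```python
from typing import Dict, List, NamedTuple, Optional


class Entry(NamedTuple):
    """One memtable record (hypothetical layout, for illustration only)."""
    key: bytes
    seq: int                 # sequence number; larger means newer
    value: Optional[bytes]   # None marks a delete (tombstone)


def compact_memtables(memtables: List[List[Entry]],
                      drop_tombstones: bool = False) -> List[Entry]:
    """Merge several memtables, keeping only the newest entry per key.

    With drop_tombstones=True, keys whose newest entry is a delete are
    removed entirely. This ASSUMES no older version of the key lives in
    an SST on disk; otherwise the delete must be kept.
    """
    newest: Dict[bytes, Entry] = {}
    for table in memtables:
        for entry in table:
            current = newest.get(entry.key)
            if current is None or entry.seq > current.seq:
                newest[entry.key] = entry
    survivors = [
        e for e in newest.values()
        if not (drop_tombstones and e.value is None)
    ]
    return sorted(survivors, key=lambda e: e.key)


# Locks written and then deleted shortly afterwards, as in the lock CF.
older = [Entry(b"lock-a", 1, b"txn1"), Entry(b"lock-b", 2, b"txn2")]
newer = [Entry(b"lock-a", 3, None), Entry(b"lock-c", 4, b"txn3")]
compacted = compact_memtables([older, newer], drop_tombstones=True)
```

In this example the put and delete of `lock-a` cancel out, so the compacted memtable holds two entries instead of four and nothing has been written to disk.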

One challenge in implementing in-memory compaction is handling the WAL. Normally, when a memtable is flushed, the corresponding WAL file can be deleted since the data is persisted to SSTs. After in-memory compaction, however, such WAL files cannot be deleted, so they would grow indefinitely. A workaround is to force a memtable flush when the WAL size reaches a certain threshold. RocksDB already provides such functionality. The WAL wouldn't be a problem if we removed the use of the WAL from KvDB.
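The workaround can be sketched as a small decision policy. This is a toy model in Python, not RocksDB code: `WalAwarePolicy` and its methods are hypothetical names, and it simplifies by assuming that one flush covers the whole WAL. The existing RocksDB functionality referred to above is probably the `max_total_wal_size` option, though that is an assumption, since the thread does not name it.

```python
from dataclasses import dataclass


@dataclass
class WalAwarePolicy:
    """Decide between in-memory compaction and a real flush (toy model)."""
    wal_limit_bytes: int   # threshold that forces a flush to disk
    wal_bytes: int = 0     # WAL bytes not yet covered by any SST

    def on_write(self, nbytes: int) -> None:
        # Every write is appended to the WAL before reaching the memtable.
        self.wal_bytes += nbytes

    def on_memtable_full(self) -> str:
        if self.wal_bytes >= self.wal_limit_bytes:
            # Data goes to SSTs, so the WAL can be deleted.
            # Simplification: assume the flush covers the whole WAL.
            self.wal_bytes = 0
            return "flush"
        # In-memory compaction shrinks the memtable but persists nothing,
        # so the WAL has to be kept and keeps growing.
        return "compact_in_memory"


policy = WalAwarePolicy(wal_limit_bytes=100)
policy.on_write(40)
first = policy.on_memtable_full()    # under the limit: compact in memory
policy.on_write(70)
second = policy.on_memtable_full()   # limit reached: flush and release WAL
```

The threshold bounds WAL growth, and with it recovery time, at the cost of an occasional flush that in-memory compaction alone would have avoided.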

We can use in-memory compaction on the lock CF to reduce its IO cost and keep its content in memory for queries.

This is a common optimization for LSM implementations. We can probably sync with the RocksDB team to learn their views and plans.

jyizheng · 2020-07-06T04:48:44Z

Does this task involve changing RocksDB code? Can the in-memory compaction be controlled from TiKV code? It would be helpful if anybody could give me a pointer to the relevant code.

yiwu-arbug · 2020-07-06T23:59:44Z

@jyizheng Yes, this task requires changing RocksDB code. As a warning, this is a hard task that could easily take more than a month of work. If you are still interested, you can join the #sig-engine channel at tikv-wg.slack.com and ping us. Thanks.

jyizheng · 2020-07-07T00:05:53Z

@yiwu-arbug Yes, I still want to work on this issue. I want to learn more about RocksDB. Can this issue be assigned to me?

yiwu-arbug · 2020-07-07T22:39:59Z

Discussed with @jyizheng offline and he will be working on the task.

Connor1996 · 2021-06-16T09:08:36Z

From my test, the L0 flow of the lock CF is high.

I think in-memory compaction would save some unnecessary write IO.

dwangxxx · 2022-01-05T05:53:51Z

> From my test, l0 flow of lock CF is high.
>
> With in-memory compaction, I think it would save some unnecessary write IO.

Hi, how can I get a compaction flow graph like the one in this image? I'm not familiar with RocksDB, could you give me some guidance? Thanks!

yiwu-arbug added the help wanted, component/rocksdb, difficulty/hard, and sig/engine labels Jun 25, 2020

github-actions bot added this to To Do in Engine SIG Jun 25, 2020

yiwu-arbug removed the help wanted label Jul 7, 2020

yiwu-arbug moved this from To Do to In Progress(Developing) in Engine SIG Jul 7, 2020

Connor1996 self-assigned this Jun 16, 2021

This was referenced Mar 28, 2022

Optimize the performance and stability of TiDB running on AWS i3.xlarge/i3.2xlarge pingcap/tidb#18025

Open

Remove deleted keys during memtable flushing for lock cf #8652

Open
