Proposal: a better structure for LevelHandler #110

Open
Little-Wallace opened this issue Sep 27, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@Little-Wallace

Problem

LevelHandler re-sorts all the SST files in a level. If there are 100K files in one level, a single sort may cost 20~100 ms (I benchmarked it on my 2020 MacBook). And if many compaction jobs are running at the same time, they may block the read threads for too long.

Solution

I propose a two-level B+ tree for LevelHandler that splits the SSTs of a level into many pages. On each change, it only copies the affected page and updates one row in the copy, then swaps the copy in while holding the mutex for a short time. I think two levels are enough, because sorting one thousand strings can finish within 1 ms.
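To make the idea concrete, here is a minimal sketch of such a paged structure in Rust. It is not the existing LevelHandler code: `TableMeta`, `PagedLevel`, and `page_cap` are hypothetical names invented for illustration, writers are assumed to be serialized by a separate mutex, and key ranges are assumed not to overlap within the level. The copy and the ordered insert happen outside the root lock, so readers are only blocked while one pointer is swapped.

```rust
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical stand-in for SST metadata: an id and a key range.
#[derive(Clone, Debug, PartialEq)]
pub struct TableMeta {
    pub id: u64,
    pub smallest: Vec<u8>,
    pub largest: Vec<u8>,
}

/// A leaf page: a small sorted run of tables, immutable once published.
type Page = Arc<Vec<TableMeta>>;

/// Two-level index: a root vector of pages, each page sorted by `smallest`.
pub struct PagedLevel {
    root: RwLock<Vec<Page>>,
    writer: Mutex<()>, // serializes writers (an assumption of this sketch)
    page_cap: usize,
}

impl PagedLevel {
    pub fn new(page_cap: usize) -> Self {
        assert!(page_cap >= 2);
        PagedLevel {
            root: RwLock::new(vec![Arc::new(Vec::new())]),
            writer: Mutex::new(()),
            page_cap,
        }
    }

    /// Index of the last page whose first key is <= `key` (page 0 if none).
    fn page_index(root: &[Page], key: &[u8]) -> usize {
        let n = root.partition_point(|p| {
            p.first().map_or(true, |t| t.smallest.as_slice() <= key)
        });
        n.saturating_sub(1)
    }

    pub fn insert(&self, table: TableMeta) {
        let _w = self.writer.lock().unwrap();
        // Take a snapshot of the target page under a short read lock.
        let (idx, old) = {
            let root = self.root.read().unwrap();
            let idx = Self::page_index(&root, &table.smallest);
            (idx, Arc::clone(&root[idx]))
        };
        // Copy the page and insert in order, without holding the root lock.
        let mut page: Vec<TableMeta> = (*old).clone();
        let pos = page.partition_point(|t| t.smallest < table.smallest);
        page.insert(pos, table);
        // Split a page that grew past its capacity.
        let right = if page.len() > self.page_cap {
            Some(Arc::new(page.split_off(page.len() / 2)))
        } else {
            None
        };
        let left = Arc::new(page);
        // Publish: readers are blocked only for this pointer swap.
        let mut root = self.root.write().unwrap();
        root[idx] = left;
        if let Some(r) = right {
            root.insert(idx + 1, r);
        }
    }

    /// Finds the table whose [smallest, largest] range contains `key`.
    pub fn get(&self, key: &[u8]) -> Option<TableMeta> {
        let page = {
            let root = self.root.read().unwrap();
            Arc::clone(&root[Self::page_index(&root, key)])
        };
        let n = page.partition_point(|t| t.smallest.as_slice() <= key);
        let t = page.get(n.checked_sub(1)?)?;
        (key <= t.largest.as_slice()).then(|| t.clone())
    }

    pub fn page_count(&self) -> usize {
        self.root.read().unwrap().len()
    }
}
```

With `page_cap` around one thousand, each insert copies at most one page of that size, and the root vector for 100K files holds only a few hundred page pointers, so the critical section stays short. Deletion and merging of small pages are left out of this sketch.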

Little-Wallace added the enhancement label Sep 27, 2021
@coocood

coocood commented Sep 27, 2021

If we use multiple instances, the largest level only contains tens of files.

@skyzh
Member

skyzh commented Sep 27, 2021

> If we use multiple instances, the largest level only contains tens of files.

Using multiple instances of an LSM store on a single disk does not seem efficient. We would have as many WAL files as instances, which would lead to inefficient fsync and internal fragmentation in the SSD.

@BusyJay
Member

BusyJay commented Sep 27, 2021

Multiple instances can share a WAL, for example by using raft engine to store the WAL. But I do think that, as a generic engine, we need to consider the number of SSTs.

@Little-Wallace
Author

Do we need to consider how to split a manifest version of sst into multiple?

@Connor1996
Member

> Do we need to consider how to split a manifest version of sst into multiple?

What's it used for?

@Little-Wallace
Author

> > Do we need to consider how to split a manifest version of sst into multiple?
>
> What's it used for?

To split one engine into two engines, because @coocood commented that we may use multiple DB instances.
