GC in level_merge #276

Closed
Peiqi0714 opened this issue Jul 10, 2023 · 6 comments

Comments

@Peiqi0714

I don't understand how GC works under level_merge:

  1. If blobA's garbage reaches the threshold, a GC tag is added so that blobA takes part in the next merge.
  2. But only some of the keys in blobA are compaction-related keys. How are the remaining keys garbage-collected in the next merge?

Thanks a lot!
@Connor1996
Member

The still-valid key-value pairs are simply rewritten into a new blob file during the next merge. That is why level_merge increases write amplification.
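
To make the rewrite step concrete, here is a minimal toy model, not Titan's actual code. `ToyBlobFile`, `MergeCompactedKeys`, and `CanDelete` are hypothetical names invented for illustration, and a blob file is assumed to be just a map of its still-valid entries.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical toy model of a blob file: only the entries that are still
// valid (referenced from the LSM-tree) are tracked. Not Titan's real layout.
struct ToyBlobFile {
  std::map<std::string, std::string> live;  // still-valid key/value pairs
};

// Moves the live values of `old_file` whose keys take part in the current
// compaction into `new_file`. Values whose keys are not being compacted stay
// in `old_file`. Returns the number of values rewritten.
std::size_t MergeCompactedKeys(ToyBlobFile& old_file,
                               const std::set<std::string>& compacted_keys,
                               ToyBlobFile& new_file) {
  assert(&old_file != &new_file);  // merging a file into itself makes no sense
  std::size_t rewritten = 0;
  for (auto it = old_file.live.begin(); it != old_file.live.end();) {
    if (compacted_keys.count(it->first) != 0) {
      new_file.live[it->first] = it->second;  // extra write: write amplification
      it = old_file.live.erase(it);
      ++rewritten;
    } else {
      ++it;  // key is not in this compaction, so its value stays put
    }
  }
  return rewritten;
}

// In this model a file can be dropped only when no live value remains in it.
bool CanDelete(const ToyBlobFile& file) { return file.live.empty(); }
```

In this model, if a file holds live values for `k1`, `k2`, and `k3` and the compaction only covers `k1` and `k2`, the value for `k3` stays behind and `CanDelete` keeps returning false until a later compaction happens to include `k3`.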

@Peiqi0714
Author

Yes, I know, but I still have two questions:
1. Even if I don't set a GC tag on this vtable, it is still involved in a merge if it has compaction-related keys, correct?
2. Only valid compaction-related keys can be rewritten. If some keys are never involved in a compaction, the vtable still contains valid values, even if only a small portion, and we can't delete the vtable. How can we solve this problem? It may cause huge space amplification.

@Peiqi0714
Author

Here, "vtable" means blob file.

@Connor1996
Member

  1. Yes, see ShouldMerge.
  2. That's true, and we have observed it in tests: level merge has huge space amplification for update-intensive workloads. That's why it's not GA.
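
As a rough illustration of point 1, the sketch below shows one way a merge decision could admit a blob file even when it carries no GC tag. It is an assumption about the general shape of such a check, not a copy of Titan's `ShouldMerge`. The types `ToyFileState` and `ToyBlobFileMeta`, the function `ShouldMergeSketch`, and the level-based condition are all hypothetical, so consult the Titan source for the real conditions.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; these are not Titan's actual types.
enum class ToyFileState { kNormal, kToMerge };

struct ToyBlobFileMeta {
  int file_level;      // LSM level the blob file was written for (assumed)
  ToyFileState state;  // kToMerge plays the role of the "GC tag"
};

// Assumed rule for a blob file referenced by a key in the running compaction:
// merge it if it is tagged, or if it belongs to a level above the level the
// compaction writes its output to.
bool ShouldMergeSketch(const ToyBlobFileMeta* file, int target_level) {
  assert(target_level >= 0);
  if (file == nullptr) {
    return false;  // the value is not stored in any blob file
  }
  return file->file_level < target_level ||
         file->state == ToyFileState::kToMerge;
}
```

Under these assumptions the tag is only one of two ways into a merge: an untagged file from an upper level is still merged, while an untagged file already at the output level is left alone until it is tagged.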

@Peiqi0714
Author

Thanks. But I noticed that when I open a DB, a regular GC thread is started (it quits and is never called again if no file needs GC). Even with level merge enabled, when I reopen a DB this regular GC thread is started and cleans some blob files that meet the threshold. But if I open a new DB with level merge enabled, the thread starts and quits quickly, so regular GC cannot run later unless I reopen the DB.

@Connor1996
Member

Level merge is an alternative to regular GC, so there is no need to call background GC if level merge is enabled.
