About Deletion marker #655

Closed
hyunsooda opened this issue Feb 9, 2019 · 2 comments

@hyunsooda
I am a Korean student. May I ask a few questions, if you are not busy?

I know that leveldb's delete operation is implemented with a deletion marker. If the user deletes data many times, a lot of space will be wasted, so leveldb has a compaction algorithm.

  1. When does leveldb perform compaction to reclaim the wasted space?
  2. By any chance, do other operations wait during compaction, as in a mark-and-sweep algorithm?

Thank you.

@adam-azarchs
Contributor

Mark/sweep as you'd see in a garbage collector is way more complicated than what happens in leveldb. A simplified version of how leveldb works (ignoring some of the implementation details for the youngest generation and optimizations like bloom filters and key sharding) is that there's a stack of files representing (more or less) snapshots of the database. When you go to look up a key, you first look in the top file in the stack. If you find the key, or a deletion marker for it, you stop there. Otherwise you keep going down the stack until you either find the key or reach the bottom of the stack. So for example imagine this stack of tables:

key:    a     b     c
v0:     1     2    del
v1:           3     4
v2:           4     5

then what you would see iterating through the database would be {a: 1, b: 2}. If we edit or delete a key, we just make the change to the in-memory table in v0. Removing a could be done without a deletion marker. However, removing b would require a marker to keep a reader from falling back to v1. Once there's enough data in v0, it gets frozen and dumped to disk, all of the "levels" get incremented, and we start writing to a new v0.
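To make that lookup order concrete, here's a minimal sketch. The types and names are hypothetical, not taken from leveldb's actual sources: each level is modeled as a plain map, and a deletion marker is an empty optional.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// One "level" of the stack: key -> value, where an empty optional
// stands in for a deletion marker (tombstone).
using Level = std::map<std::string, std::optional<std::string>>;

// Search from the newest level (front) to the oldest (back). The
// first entry found for the key wins: a real value is returned, and
// a tombstone ends the search as "not found".
std::optional<std::string> Lookup(const std::vector<Level>& stack,
                                  const std::string& key) {
  for (const Level& level : stack) {
    auto it = level.find(key);
    if (it != level.end()) {
      return it->second;  // value, or nullopt for a deletion marker
    }
  }
  return std::nullopt;  // the key was never written at all
}
```

With the stack above, Lookup would return 1 for a, 2 for b, and "not found" for c, because the tombstone in v0 stops the search before it reaches v1.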

Compaction is basically removing the redundant data in lower levels of the stack. So if we were to compact v1 and v2 together above, it would basically just remove v2 since both of its values were overwritten in v1. If we compacted all 3 layers together after making a new v0, we'd have

key:    a     b     c
v0:
v1:     1     2

Because the lower levels of the stack are logically immutable, compaction can happen in the background and then the stack of tables can be atomically replaced by the new, single, compacted table.
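A toy version of that merge, reusing the hypothetical Level type from the lookup sketch above: entries from the newer level shadow the older one's, and tombstones can be dropped once the result becomes the bottom of the stack, since there is nothing left for a reader to fall back to.

```cpp
#include <map>
#include <optional>
#include <string>

// Same hypothetical representation as in the lookup sketch.
using Level = std::map<std::string, std::optional<std::string>>;

// Merge a newer level into an older one. If the merged output is
// the new bottom of the stack, the deletion markers themselves can
// be discarded.
Level Compact(const Level& newer, const Level& older, bool is_bottom) {
  Level merged = older;
  for (const auto& entry : newer) {
    merged[entry.first] = entry.second;  // newer value (or tombstone) wins
  }
  if (is_bottom) {
    for (auto it = merged.begin(); it != merged.end();) {
      if (!it->second.has_value()) {
        it = merged.erase(it);  // tombstone no longer needed
      } else {
        ++it;
      }
    }
  }
  return merged;
}
```

Running Compact over v1 and v2 from the first example reproduces the result described above: v2's values for b and c are shadowed, so the merge is just v1. Compacting all three layers with is_bottom set leaves only {a: 1, b: 2}.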

Some of the configuration options control when this happens. Basically, once a level grows above a certain size, it gets compacted with the level under it. Compaction can also be triggered manually for a key range, for example if you know you're not going to be writing to that range for a while but are going to be reading from it a lot (since compaction avoids the need to search multiple levels for your key). As the documentation states, however, you're probably better off leaving compaction up to the built-in logic.
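The manual trigger in leveldb's C++ API is DB::CompactRange. A minimal usage sketch (the database path and key range here are made up for illustration):

```cpp
#include <cassert>
#include "leveldb/db.h"

int main() {
  leveldb::DB* db;
  leveldb::Options options;
  options.create_if_missing = true;
  leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb", &db);
  assert(status.ok());

  // Compact only the keys in a hypothetical range...
  leveldb::Slice begin("user:0000");
  leveldb::Slice end("user:9999");
  db->CompactRange(&begin, &end);

  // ...or pass null pointers to compact the whole database.
  db->CompactRange(nullptr, nullptr);

  delete db;
  return 0;
}
```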

@cmumford
Contributor

Thx for the response @adam-azarchs. I think this is answered, but please reopen if not.
