Compaction causes data inconsistency when using snapshots #320

Open
specialforest opened this Issue Nov 27, 2015 · 2 comments


specialforest commented Nov 27, 2015

When snapshots are in use, background compaction causes data inconsistency. Existing records can go missing, and records that existed previously, or their old values, can reappear. A test program that reproduces the issue: https://gist.github.com/specialforest/fa136b660bdd672e3d4b


richcole commented Nov 27, 2015

The error occurs when leveldb runs a compaction while several records share the same key but have different sequence numbers, which can happen when snapshots are in play. The compaction algorithm can choose a file such that the latest record is pushed from level i to level i+1; a following get then retrieves the older record from level i, which is incorrect.

I see two ways to fix this: (i) always compact a file set that is closed with respect to records with the same key, e.g. if files f1 and f2 both have records with key k1, compact f1 and f2 together; or (ii) when considering f1 for compaction, check that no older record of any key k in f1 exists at level i. I believe it is sufficient to check the boundary elements of f1. If the criterion isn't met, try to compact the previous file instead (I assume you sort based on key and sequence number with increasing sequence numbers being later in the sequence of files at a level).
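The failure mode described above can be illustrated with a toy model. The sketch below is not leveldb code: the `get` function, the tuple layout `(user_key, sequence, value)`, and the list-of-lists "levels" are all assumptions made for illustration. It only mimics the one property that matters here, namely that a lookup stops at the first level containing the key.

```python
# Toy model of the lookup path; NOT leveldb code. Each level is a list of
# "files", and each file is a list of (user_key, sequence, value) records.
# All names here are hypothetical.

def get(levels, user_key, snapshot_seq):
    """Search levels top-down and stop at the first level that holds the key,
    returning the newest value visible at snapshot_seq within that level."""
    for files in levels:
        visible = [
            (seq, value)
            for records in files
            for (key, seq, value) in records
            if key == user_key and seq <= snapshot_seq
        ]
        if visible:
            return max(visible)[1]  # highest sequence number wins
    return None


# Two files at level i. A snapshot taken at sequence 7 keeps the older k1
# record alive, so k1 straddles the boundary between f1 and f2.
f1 = [("k0", 5, "a"), ("k1", 9, "new")]
f2 = [("k1", 7, "old"), ("k2", 3, "b")]

before = [[f1, f2], []]           # both files at level i
after_bad = [[f2], [f1]]          # only f1 pushed down to level i+1
after_good = [[], [f1 + f2]]      # f1 and f2 compacted together

latest = 10  # a read at the latest sequence number
results = (
    get(before, "k1", latest),
    get(after_bad, "k1", latest),
    get(after_good, "k1", latest),
)
```

In this model `results` is `("new", "old", "new")`: moving only `f1` down makes the stale record shadow the newer one, while compacting both files together, as in option (i), keeps the read correct.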

cmumford added the bug label Jan 15, 2016

richcole-at-amazon added a commit to richcole-at-amazon/leveldb that referenced this issue Jan 29, 2016

Fix snapshot compaction bug
Closes google#320

During compaction it was possible that records from a block b1=(l1,u1)
would be pushed down from level i to level i+1. If there is a block
b2=(l2,u2) at level i with k1 = user_key(u1) = user_key(l2) then
a subsequent search for k1 will yield the record l2 which has a smaller
sequence number than u1 because the sort order for records sorts
increasing by user key but decreasing by sequence number.

This change adds a call to a new function AddBoundaryInputs to
SetupOtherInputs. AddBoundaryInputs searches for a block b2 matching the
criteria above and adds it to the set of files to be compacted. Whenever
AddBoundaryInputs is called it is important that the compaction fileset
in level i+1 (known as c->inputs_[1] in the code) be recomputed. Each
call to AddBoundaryInputs is followed by a call to GetOverlappingInputs.

SetupOtherInputs is called on both manual and automated compaction
passes. It is called for both level zero and for levels greater than 0.

Testing Done:

  A test program (issue320_test) was constructed that performs mutations
  while snapshots are active. issue320_test fails without this bug fix
  after 64k writes. It passes with this bug fix. It was run with 200M
  writes and passed.

  Unit tests were written for the new function that was added to the
  code. Make check was run and seen to pass.

Signed-off-by: Richard Cole <richcole@amazon.com>
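The boundary search described in the commit message above can be sketched as follows. This is a Python illustration under assumptions, not the patch itself: the function names, the dict layout with `smallest` and `largest` keys, and the `(user_key, sequence)` tuples are hypothetical. The only facts taken from the commit message are the sort order (increasing by user key, decreasing by sequence number) and the rule that a file b2 whose smallest key shares its user key with the largest key of the current inputs must be added to the compaction.

```python
# Sketch of the boundary-file idea; NOT the code from the patch.
# A file is modelled as {"name", "smallest", "largest"}, where each bound is a
# (user_key, sequence) tuple. All names are hypothetical.

def internal_sort_key(bound):
    """Order by user key ascending, then by sequence number descending."""
    user_key, seq = bound
    return (user_key, -seq)


def find_boundary_file(level_files, largest):
    """Return the file whose smallest key has the same user key as `largest`
    and sorts after it, choosing the closest such file; None if there is none."""
    candidates = [
        f for f in level_files
        if f["smallest"][0] == largest[0]
        and internal_sort_key(f["smallest"]) > internal_sort_key(largest)
    ]
    return min(candidates, key=lambda f: internal_sort_key(f["smallest"]),
               default=None)


def add_boundary_inputs(level_files, inputs):
    """Extend `inputs` until no file left at the level continues the user key
    that ends the current input set."""
    inputs = list(inputs)
    if not inputs:
        return inputs
    largest = max((f["largest"] for f in inputs), key=internal_sort_key)
    while True:
        boundary = find_boundary_file(level_files, largest)
        if boundary is None:
            return inputs
        inputs.append(boundary)
        largest = boundary["largest"]


# k1 spans three files at the same level: f1 ends with it, f2 holds only
# older versions of it, and f3 starts with a still older version.
f1 = {"name": "f1", "smallest": ("k0", 5), "largest": ("k1", 9)}
f2 = {"name": "f2", "smallest": ("k1", 7), "largest": ("k1", 6)}
f3 = {"name": "f3", "smallest": ("k1", 4), "largest": ("k2", 3)}
f4 = {"name": "f4", "smallest": ("k3", 2), "largest": ("k4", 1)}
level = [f1, f2, f3, f4]

expanded = [f["name"] for f in add_boundary_inputs(level, [f1])]
```

Starting from `f1` alone, the sketch pulls in `f2` and then `f3`, and leaves `f4` out because it shares no user key with the expanded set. As the commit message notes, once the level-i input set grows this way, the overlapping files at level i+1 have to be recomputed.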

richcole-at-amazon referenced a pull request that will close this issue Jan 29, 2016

Open

Fix snapshot compaction bug #339


richcole-at-amazon commented Feb 4, 2016

In case it isn't obvious from the pretty graphics above included by GitHub, there is a patch for this bug here:

#339
