Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Browse the repository at this point in the history
Fix race with concurrent merges and deletes
Change order of scan & fold: now oldest -> youngest New regression test (currently broken) for recent scan change of oldest -> youngest Fix rough edges on test/bitcask_qc.erl & test/bitcask_qc_expiry.erl Write tombstone on delete only if there's a keydir entry, plus race bugfix Fix remaining bug when bitcask scan/fold order was changed to oldest->youngest The NIF change fixes a long-standing latent bug: when put'ing a key that does not exist, if there's a race with a merge, keydir_put_int() would return 'ok' (error) rather than 'already_exists' (correct). The 'already_exists' return value is a signal to the read-write owner of the bitcask that the current append file must be closed and a new one opened (with a larger fileid than any merge). The tombstone change adds a new tombstone data format. Old tombstones will be handled correctly. New tombstones for any key K contain the fileid & offset of the key that it is deleting. If the fileid F still exists, then the tombstone will always be merged forward. If the fileid F does not exist, then merging forward is not necessary. When F was merged, the on-disk representation of key K not be merged forward: K does not exist in the keydir (because it was deleted by this tombstone), or it was replaced by a newer put. FML: fix tombstone merging bug when K & K's tombstone are in same fileid Originally found with bitcask_pulse, I deconstructed the test case to help understand what was happening: the new EUnit test is new_20131217_a_test_. As a result of the puts, key #13 is written 3x to fileid #1 (normal, tombstone, normal) and 1x to fileid #2 (normal @ the very beginning of the file). The merge creates fileid #3 and copies only the tombstone (the normal entry isn't copied because it is out-of-date). Before the close, the internal keydir contains the correct info about key #13, but after the close and re-open, we see key #13's entries: normal (and most recent) in fileid 32, and tombstone in fileid #3, oops. The fix is to remove all of the merge input fileids from the set of fileids that will survive/exist after the merge is finished. WIP: new test new_20131217_c_test_() is broken, need to experiment with bitcask:delete & NIF usage before continuing Fix non-determinism in the fold_visits_unfrozen_test test ... now 100% broken Fix iterator freeze bug demonstrated by last commit: remove kh_put_will_resize() predicate test for creating a keydir->pending Fix bug introduced by almost-bugfix in commit 1a9c99 that was supposed to be a partial fix for GH #82 Fix unexported function errors, thanks Buildbot Omnibus bugfixing, including: * Add 'already_exists' return to bitcask_nifs_keydir_remove(): we need it to signal races with merge, alas. * Add state to #filestate to be able to 'undo' last update to both a data file and its hint file. This probably means that we're going to have to play some games with merge file naming, TBD, stay tuned. * For bitcask:delete(), make the keydir delete conditional: if it fails, redo the entire thing again. * inner_merge_write() can have a race that, if a partial merge happens at the proper time after, we see an old value reappearing. Fix by checking the return value of the keydir put, and if 'already_exists', then undo the write. * When do_put() has a race and gets 'already_exists' from the keydir, undo the write before retrying. If this key is deleted sometime later, and then a partial merge happens after that, we might see this value reappear after the merge is done. * Add file_truncate() to bitcask_file.erl. TODO: do the same for the NIF style I/O. * Better robustness (I hope) to EUnit tests in bitcask_merge_delete.erl Add UNIX epoch time to put_int NIF call, to avoid C use of time(3). I hope this will eliminate a nasty source of nondeterminism during PULSE testing. Fix buildbot ?TESTDIR definition location problem Uff da: most PULSE non-determinism problems fixed? Fix merge race when keydir_put specifies old fileid & offset races with delete Scenario, with 3 writers, 2 & 3 are racing: * Writer 1: Put K, write @ {file 1, offset 63} * Writer 2: Delete operation starts ... but there is no write to disk yet * Writer 3: Merge scans file 1, sees K @ {1,30} -> not out of date -> but there is no write to disk yet * Writer 2: writes a tombstone @ {3,48} * Writer 2: Keydir conditional delete @ old location @ {1,63} is ok * Writer 2: keydir delete returns from NIF-land * Writer 3: merge copies data from {1, 63} -> {4, 42} * Writer 3: keydir put {4, 42} conditional on {1,63} succeeds due to incorrect conditional validation: the record is gone, but bug permits put to return 'ok'. Fix bug in previous commit: offset=0 is valid, do not check it Use larger retry # for open_fold_files() and keydir_wait_ready() when PULSE testing Small bitcask_pulse changes: file size, pretty the keys used, slightly less verbosity Adjust command frequency weightings for PULSE model Add no_tombstones_after_reopen_test() to test desired condition Avoid storing tombstones in the keydir when possible When a Bitcask is opened and scan_key_files() is reading data from disk and loading the RAM keydir, we now detect if the key is a tombstone and, if so, do not store it in the keydir. Normally, only hint files are scanned during startup. However, hint files have not stored enough information to confirm that a key is/is not a tombstone. I have added such a flag in a backward-compatible way: the offset size has been reduced from 64 -> 63 bits, and the uppermost bit (which is assumed to be 0 in all cases -- we assume nobody has actually written a file bit enough to require 64 bits to describe the offset) is used to signal tombstone status. An optional argument was given to the increment_file_id() NIF to communicate to the NIF that a data file exists ... a fact that would otherwise be lost if a hint/data file contains only tombstones. For testing purposes, fold() and fold_keys() are extended with another argument to expose the presence of keydir tombstones. Adjust timeouts and concurrency limits in bitcask_pulse.erl to avoid the worst of false-positive errors when using the PULSE model: {badrpc,timeout} nonsense. Add checks to the PULSE model to try to catch improper tombstones in keydir Dialyzer fixes Add failing test: tombstones exist in keydir if hint files are destroyed Add {tombstone,Key} wrapper when folding .data files; fixes test failure in last commit Fix intermittent bitcask_merge_worker intermittent timeout failure NIF review: avoid looking at argv[1] directly Remove unwanted oopsie foo.erl Restore put will resize check for multi-fold Restoring the check that will postpone freezing the keydir until puts will require a structural change. Without this, the new multifold code is crippled. Redo merge/delete race fix Consolidating retry on merge race logic in do_put, making delete simply a wrapper around a put(tombstone) operation. Some cleanup regarding the different tombstone versions Add missing file truncate op for NIF mode Plus a little unit test to make sure the file truncation operation works.
- Loading branch information