Objects cannot be updated if a bad CRC is encountered by Bitcask #385

Merged
merged 6 commits into from Feb 26, 2013

5 participants

@dreverri

Steps to reproduce:
1. Start Riak with Bitcask as the backend
2. Write an object
3. Introduce an error in the Bitcask data file
4. Update the value of the object

Expected result:
Object is updated to new value

Actual result:
Update fails on the vnode with the bad CRC error

The following branch on Github has an eunit test that demonstrates the issue:
https://github.com/basho/riak_kv/tree/bitcask-badcrc-eunit-test

@engelsanchez engelsanchez was assigned Jan 31, 2013
engelsanchez added some commits Jan 31, 2013
@engelsanchez engelsanchez Merge branch 'master' into bitcask-badcrc-eunit-test
Preparing to bring the bad crc unit test into the present
c8b3c87
@engelsanchez engelsanchez Write even on bitcask bad_crc read
Fixes issue #385
Reads on corrupted files return a bad_crc error, which were not handled.
When writing, if a bad_crc was encountered on read of an existing key,
the write failed. Files being erased, or even truncated were handled
and the write proceeds, but bit corruption would die. I don't think
there is any benefit to the different handling.
eff2a96
@engelsanchez engelsanchez Whitespace change to trigger Travis ca86df4
@jrwest jrwest was assigned Feb 19, 2013
@jrwest jrwest and 1 other commented on an outdated diff Feb 19, 2013
src/riak_kv_vnode.erl
@@ -828,7 +828,7 @@ prepare_put(#state{vnodeid=VId,
prunetime=PruneTime},
IndexBackend) ->
case Mod:get(Bucket, Key, ModState) of
- {error, not_found, _UpdModState} ->
+ {error, _Err, _UpdModState} ->
@jrwest
jrwest added a note Feb 19, 2013

should we be matching on the specific bad CRC error instead of any error (e.g. {error, Err, _} when Err =:= not_found orelse Err =:= bad_crc)? I imagine there are some errors we would want the current behavior to continue for (don't try to write if the backend is totally shot).

I have basically the same question :). I was hoping to hear the opinion from more experienced riak people here. I have not found any kind of backend contract anywhere, so matching against a specific error like bad_crc doesn't sound right. Unless, of course, it was part of an official backend contract, in which case it would be a much better option. In any case, even if the put crashes on every read, there is nothing in place to mark the node or vnode as useless that I know of.

@jrwest
jrwest added a note Feb 19, 2013

So i think this is the spec for the backend API: http://docs.basho.com/riak/latest/references/appendices/Backend-API/. Looks like {error, term(), state()} is a valid return value so I think it would be safer to match specifically on the bad_crc atom (and level's equivalent?). Not ideal but the "error contract" isn't very specific. Thoughts?

Wow, so it's even worse: that spec says {ok, not_found, state()}, yet what our backends actually send now is {error, not_found, state()}. None of this is up to date. I think the real answer here is to get our story straight and define the expected backend return values and how they are handled. I would not add a reference to bad_crc until it's an official backend error code. But I still think that the best we have right now is to handle all errors in the same way. What I'm trying to fix here is the inconsistency of handling something like a complete removal of all backend files (handled as not_found, ok, go on and write), vs. a simple CRC failure bringing the process down and preventing a write, which is hardly HA.

@jrwest
jrwest added a note Feb 20, 2013

right. I agree with that but I guess my concern with the match all is we can potentially mask some other error that should crash the vnode. Perhaps that is totally off though. Maybe, like you said, the fix is to just better define the backend interface first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jrwest

This change affects other backends, is the error specific to bitcask (leveldb deals with the bad crc elsewhere?) and if so should it be handled in the bitcask backend itself?

@jrwest

The tests pass with the change as it stands. @engelsanchez and I have been discussing a small detail about the change that I think may need a 3rd party familiar with the backend's handling of bad crcs and other errors.

@seancribbs

@jrwest @engelsanchez One suggestion - perhaps the backends should not return an error tuple when the result is "not found". Not finding the key does not imply that the backend has failed.

@engelsanchez

@seancribbs it's even more complicated than that. Not finding the key could happen because of an error or an OK solution. In bitcask, for example, not finding it in the keydir vs. finding it in the keydir AND failing to find it in the file (because it's been truncated or removed). Right now both return error, not_found. Jordan and I agree that the responses from the backend and how we handle them need a serious looking at that's probably beyond this particular issue. I'm going to open a different issue and see if the Core/KV team is up to standardizing this communication with the backend. I'm hoping that this can get accepted as a "better for the moment" solution to the bad crc handling while we tackle the bigger issue.

@engelsanchez engelsanchez Adding bad_crc specifically on read before write
Although plenty of other situations may arise where an error upon
reading should simply allow the write to happen, we're going to wait to
standardize backend responses properly before trying to fix those.
This is just to be prudent until the consequences of doing that are
better understood. This makes the fix specific to bitcask bad CRC
errors and nothing more.
Standardization will hopefully happen in issue #496
f2c108e
@engelsanchez

Ok, changing this to only handle the specific case of bitcask CRC read errors. Deeper, more standard fixes will hopefully come out of #496

@jrwest jrwest and 1 other commented on an outdated diff Feb 26, 2013
src/riak_kv_vnode.erl
@@ -1565,6 +1584,29 @@ filter_keys_test() ->
flush_msgs().
+%% include bitcask.hrl for HEADER_SIZE macro
+-include_lib("bitcask/include/bitcask.hrl").
+
+%% Bad CRC's prevent objects from being writable
@jrwest
jrwest added a note Feb 26, 2013

nit picky i know but should the comment say "should not prevent objects from being writable"?

This test was written many months ago when the ticket was opened by Dan Reverri to show there was a problem. I'll change it to explain how it is now verifying that the problem is gone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jrwest jrwest and 1 other commented on an outdated diff Feb 26, 2013
src/riak_kv_vnode.erl
@@ -827,8 +827,22 @@ prepare_put(#state{vnodeid=VId,
starttime=StartTime,
prunetime=PruneTime},
IndexBackend) ->
- case Mod:get(Bucket, Key, ModState) of
- {error, not_found, _UpdModState} ->
+ GetReply =
+ case Mod:get(Bucket, Key, ModState) of
+ {error, not_found, _UpdModState} ->
+ ok;
+ % NOTE: bad_crc is NOT an official backend response. It is
+ % specific to bitcask currently and handling it may be changed soon.
+ % A standard set of responses will be agreed on
+ % https://github.com/basho/riak_kv/issues/496
+ {error, bad_crc, _UpdModState} ->
+ lager:info("Bad CRC detected while reading Bucket=~p, Key=~p", [Bucket, Key]),
@jrwest
jrwest added a note Feb 26, 2013

worth logging the partition here as well so admin would know what directory the bad key is in?

Yes, that would be hard to track down without it. Adding it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jrwest

made a few minor comments but looks good. tests pass. +1

@engelsanchez engelsanchez Add partition to bad bitcask CRC err msg
Also, changed comment on bad CRC unit test to show it is verifying the
fix. The old language was written by the person first showing the
problem existed (Dan R.).
831c0da
@engelsanchez

Added partition to the bad crc error msg and verified it displayed correctly. Also updated eunit test documentation.

@engelsanchez engelsanchez merged commit 468b210 into master Feb 26, 2013

1 check was pending

Details default The Travis build is in progress
@evanmcc

@engelsanchez did you see any evidence of whole node crashes with this one, or just the update errors?

@engelsanchez
@seancribbs seancribbs deleted the eas-handle-bitcask-bad-crc branch Apr 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment