Every read triggers a read-repair when Last-write-wins=true #334

Merged
engelsanchez merged 1 commit into master from eas-fix-lww-writes-on-read on Jan 31, 2013

Member

engelsanchez commented Jan 31, 2013

When LWW=true on a bucket, every replica increments the vclock on a write. On read, the mismatched vclocks at the replicas trigger a read-repair. The repair is itself a put, which increments the vclock again, so the next read triggers another read-repair (and so on ad infinitum).

This does not occur with an index-capable backend; it is a product of the fast-path optimisation for LWW that avoids a read before write: https://github.com/basho/riak_kv/blob/master/src/riak_kv_vnode.erl#L650

Conditions: LWW = true and the backend is not index capable.

Funnily, we get a write after every read rather than a read before every write.

A potential fix is to increment the vclock only on the co-ordinating vnode.
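To make the loop concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not riak_kv's actual code: the `ToyCluster` class, its method names, and the way read-repair merges clocks are all assumptions made for illustration. The `every_replica_increments` flag switches between the buggy behaviour (each replica bumps the vclock) and the proposed fix (only the co-ordinating vnode does).

```python
from typing import Dict, List

VClock = Dict[str, int]


def increment(clock: VClock, actor: str) -> VClock:
    """Return a copy of clock with actor's counter bumped by one."""
    bumped = dict(clock)
    bumped[actor] = bumped.get(actor, 0) + 1
    return bumped


def merge(clocks: List[VClock]) -> VClock:
    """Pointwise maximum of several vector clocks."""
    merged: VClock = {}
    for clock in clocks:
        for actor, count in clock.items():
            merged[actor] = max(merged.get(actor, 0), count)
    return merged


class ToyCluster:
    """Hypothetical stand-in for the replicas of a single key."""

    def __init__(self, vnodes: List[str], every_replica_increments: bool):
        self.vnodes = list(vnodes)
        self.every_replica_increments = every_replica_increments
        self.clocks: Dict[str, VClock] = {v: {} for v in self.vnodes}
        self.repair_writes = 0

    def _store(self, vnode: str, clock: VClock, coordinator: bool) -> VClock:
        # Buggy mode: every vnode that stores the object bumps the vclock.
        # Fixed mode: only the co-ordinating vnode does.
        if coordinator or self.every_replica_increments:
            clock = increment(clock, vnode)
        self.clocks[vnode] = clock
        return clock

    def put(self) -> None:
        """Write a new object, co-ordinated by the first vnode."""
        clock = self._store(self.vnodes[0], {}, coordinator=True)
        for vnode in self.vnodes[1:]:
            self._store(vnode, clock, coordinator=False)

    def get(self) -> int:
        """Read all replicas; return how many read-repair writes were needed."""
        merged = merge(list(self.clocks.values()))
        stale = [v for v in self.vnodes if self.clocks[v] != merged]
        for vnode in stale:
            # A read-repair is just another (non-coordinated) put.
            self._store(vnode, merged, coordinator=False)
        self.repair_writes += len(stale)
        return len(stale)
```

With three vnodes, the buggy mode reports three repair writes on every single read, while the fixed mode reports none, because all replicas end up holding the same vclock after the initial put.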

Thanks to Martin Sumner at the NHS for the catch.

Member

russelldb commented Aug 27, 2012

Probably not an issue, after all. See https://github.com/basho/riak_kv/blob/master/src/riak_kv_vnode.erl#L803. Rushed to open the issue without reading all the code. My mistake.

@engelsanchez engelsanchez was assigned Jan 31, 2013

@engelsanchez engelsanchez Fix write on read bug with lww+bitcask
Fixes issue #334.
Reading a value written into a bucket with last write wins set to true
when using a non-index-supporting backend (bitcask) was causing
read-repair writes. The problem was that the logic made every replica
vnode increment the vclock, instead of incrementing only on the
coordinating node and forwarding to replicas as normally happens.
8895d28

@engelsanchez engelsanchez added a commit to basho/riak_test that referenced this pull request Jan 31, 2013

@engelsanchez engelsanchez Verify fix to writes on reads when LWW+Bitcask
This verifies the fix to issue basho/riak_kv#334
The test needs to run with bitcask:
  * It sets last_write_wins on a bucket
  * Writes one object
  * Repeatedly reads it
  * Verifies that the write/read repair count doesn't change
5a80c4e
Contributor

engelsanchez commented Jan 31, 2013

The problem is real. You can verify it by running the riak_test referenced above (verify_no_writes_on_read) before and after the fix. Alternatively, simply do the steps by hand:

  • start a node with bitcask backend
  • set bucket property last_write_wins to true
  • write one object
  • repeatedly read the same object

You should see the write and read repair stats go up. You can also trace the bitcask backend put operation with redbug, and you'll see a put go by every time you re-read the object.
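The manual steps above could be scripted roughly as follows. This is a hedged sketch, not the riak_test itself: it assumes a node listening on the default HTTP port 8098, that the `/stats` endpoint reports `vnode_puts_total` and `read_repairs_total`, and the bucket and key names are made up. The sleeps are a guess at how long the stats need to settle, since read-repair happens asynchronously.

```python
import json
import time
import urllib.request
from typing import Dict, Optional

# Stat names assumed to be present in the node's /stats output.
WATCHED_STATS = ("vnode_puts_total", "read_repairs_total")


def props_url(base: str, bucket: str) -> str:
    return f"{base}/buckets/{bucket}/props"


def key_url(base: str, bucket: str, key: str) -> str:
    return f"{base}/buckets/{bucket}/keys/{key}"


def http(method: str, url: str, body: Optional[bytes] = None,
         content_type: str = "application/json") -> bytes:
    """Issue one HTTP request and return the response body."""
    headers = {"Content-Type": content_type} if body is not None else {}
    req = urllib.request.Request(url, data=body, method=method, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def watched_stats(base: str) -> Dict[str, int]:
    """Fetch /stats and keep only the counters we care about."""
    stats = json.loads(http("GET", f"{base}/stats"))
    return {name: stats[name] for name in WATCHED_STATS}


def stat_growth(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """How much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}


def reproduce(base: str = "http://127.0.0.1:8098", bucket: str = "lww_test",
              key: str = "k", reads: int = 20) -> Dict[str, int]:
    """Set last_write_wins, write once, read repeatedly, report stat growth."""
    props = json.dumps({"props": {"last_write_wins": True}}).encode()
    http("PUT", props_url(base, bucket), props)
    http("PUT", key_url(base, bucket, key), b"hello", "text/plain")
    time.sleep(2)  # let the initial write show up in the stats
    before = watched_stats(base)
    for _ in range(reads):
        http("GET", key_url(base, bucket, key))
    time.sleep(2)  # give asynchronous read-repairs time to be counted
    return stat_growth(before, watched_stats(base))
```

Calling `reproduce()` against a bitcask node without the fix should show both counters growing with the number of reads; with the fix applied, both deltas should stay at zero, provided nothing else is writing to the node.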

Contributor

engelsanchez commented Jan 31, 2013

The riak_test to verify this is not in master yet. Use the eas-fix-lww-writes-on-read branch if you want to run it (verify_no_writes_on_read).

@chardan chardan was assigned Jan 31, 2013

Contributor

chardan commented on 8895d28 Jan 31, 2013

+1

@engelsanchez engelsanchez added a commit that referenced this pull request Jan 31, 2013

@engelsanchez engelsanchez Merge pull request #334 from basho/eas-fix-lww-writes-on-read
Every read triggers a read-repair when Last-write-wins=true
605992e

@engelsanchez engelsanchez merged commit 605992e into master Jan 31, 2013

@engelsanchez engelsanchez was assigned Jan 31, 2013

@engelsanchez engelsanchez referenced this pull request in basho/riak_test Jan 31, 2013

Merged

Verify fix to writes on reads when LWW+Bitcask #189

tisba commented Apr 15, 2013

@engelsanchez any idea when this will land in riak? 1.3.1 does not have this, correct?

Contributor

slfritchie commented Apr 15, 2013

Am I reading the history correctly, that this PR was merged to master about a day after the 1.3.x branch was created?

Contributor

engelsanchez commented Apr 15, 2013

@tisba @slfritchie this was my first bit of work for the 1.4 release, which was supposed to go into code freeze this week. The 1.3.1 bugfix release was all about a specific data corruption problem (2i bigints encoding), and only major but very simple fixes went along with that. Since we are now working on a 1.3.2 bugfix release, which delays 1.4, and this fix has gotten some visibility, I will make sure it goes out with that release.

tisba commented Apr 15, 2013

@engelsanchez this is great, thanks!

@engelsanchez engelsanchez added a commit to basho/riak_test that referenced this pull request Jun 19, 2013

@engelsanchez engelsanchez Verify fix to writes on reads when LWW+Bitcask
This verifies the fix to issue basho/riak_kv#334
The test needs to run with bitcask:
  * It sets last_write_wins on a bucket
  * Writes one object
  * Repeatedly reads it
  * Verifies that the write/read repair count doesn't change
3d3cee0

@seancribbs seancribbs deleted the eas-fix-lww-writes-on-read branch Jan 6, 2014
