Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test deletion of object version (i.e. vclock) corresponding to CRDT version (i.e. context) #335

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

lucafavatella
Copy link

@lucafavatella lucafavatella commented Nov 23, 2016

Replaces #317.
Depends on basho/riak-client-tools#21 (merged).

Please refer to commit message for background and next steps.

Luca Favatella added 2 commits November 23, 2016 18:47
Add tests for deletion of Riak KV object corresponding to a CRDT
context i.e. at a specific vclock.

Background
====

The `riakc_pb_socket:fetch_type/{3,4}` function appears not to return
the vclock of the Riak KV object hence the user cannot (naively)
delete the object at the vclock corresponding to a specific CRDT
context being therefore induced to delete the object using the unsafe
`riakc_pb_socket:delete/3` rather than
`riakc_pb_socket:delete_vclock/4` - safer as catering for concurrent
writes.  In order to delete the Riak KV object corresponding to a
certain CRDT context the user has the only option of retrieving the
server-side representation of the object via mapreduce, compare CRDT
contexts (between the one to-be-deleted and the one retrieved
server-side) and - if equal - call `riakc_pb_socket:delete_vclock/4`
passing the retrieved server-side vclock.

This PR adds tests for the above for the set datatype; it also adds
function to retrieve context from set.

Next Steps
====

TODO Make extraction of context from Riak KV client-side CRDT
     representation part of behaviour `riakc_datatype` - not only of
     `riakc_set`.

The tests show that there is room for improvement in the user API.  I
would prefer either:

- The `riakc_pb_socket:fetch_type` function to include an option to
  return the vclock of the object; or

- A new function `riakc_pb_socket:delete_type` to be added.

I understand both options require changes in the Protocol Buffers
messages definition.

I partially understand the complexity in mapping from CRDT context to
Riak KV object vclock because in a corner case the Riak object could
contain not simply a CRDT e.g. set but unexpected siblings e.g. a set
CRDT, a map CRDT, a register CRDT, a non-CRDT (I read of this corner
case in the Riak KV server code). But I guess such corner case could
be handled returning an error probably as if it happen the user has
probably got more serious issues...
@lucafavatella
Copy link
Author

Updated this PR following merge of basho/riak-client-tools#21 - tests running now.

@lukebakken
Copy link
Contributor

This is related to basho/riak_kv#1360

As I'm sure you've found, there is no way to resolve a sibling tombstone if a CRDT is updated concurrently with a delete operation.

Are you running into this issue frequently in your environment?

@lukebakken lukebakken added this to the riak-erlang-client-2.5.1 milestone Nov 25, 2016
@lukebakken lukebakken self-assigned this Nov 25, 2016
@lucafavatella
Copy link
Author

Are you running into this issue frequently in your environment?

@lukebakken Thanks for your note - I was writing more context notes right now and you saved me some typing.

I do not experience this issue in any environment.

@lukebakken
Copy link
Contributor

OK, thanks. I just added some information to basho/riak_kv#1360. Since a resolution to this issue depends on work in Riak I am going to leave this PR open.

@lucafavatella
Copy link
Author

@lukebakken Thanks I will follow that ticket then. Feel free to close this PR / change title. I had proposed this (and #317) PR more as a starting point for conversation, and I now understand that that ticket is the correct place to follow.

I became aware of this "not easy point" in the CRDT API while prototyping / designing in a project back in Sep. I now moved on to other items in the same project and may come back to this in Feb 2017 / Mar 2017 (not sure whether and when).

@lucafavatella
Copy link
Author

@lukebakken I thought more about this and I understand that basho/riak_kv#1360 is targeting a different issue than the one exposed by this PR.

  • the delete a CRDT and update a CRDT issue [JIRA: RIAK-2424] riak_kv#1360 aims at merging two already-in-vnode values for the same object - namely a tombstone and a CRDT e.g. after healing from a network partition.

  • This PR aims at exposing that Riak KV API does not allow a client to avoid deleting updates it has not seen, hence not creating tombstone in the first place.

    • Do I understand correctly that this is a separate issue (potentially classifiable as wishlist and wontfix)? Or am I missing anything?

This PR starts from the assumption that riakc_pb_socket:delete_vclock prevents deleting unseen updates on an object. E.g. for a key:

  • Client A reads vclock V1;
  • Client B reads vclock V1;
  • Client B writes vclock V2 (V2 > V1);
  • Then client A calls riakc_pb_socket:delete_vclock with vclock V1;
  • Result: the key is not deleted (because V1 < V2) - that is safe.

The tests in this PR aim at preventing deleting unseen updates on a data type e.g. set. E.g. for a key:

  • Client A reads context C1;
  • Client B reads context C1;
  • Client B updates CRDT (at a distance) from context C1, resulting to context C2;
  • Context C2 is persisted on all vnodes, no network partition;
  • Then client A calls riakc_pb_socket:delete (it cannot call delete_vclock because it has no vclock - only CRDT context C1);
  • Result: the key is deleted - that is not safe because client A meant to delete object at CRDT context C1, not any subsequent updates.
    • The tests in this PR show that, in order not to delete unseen updates, client A needs to:
      • Resort to map-reduce for retrieving a consistent snapshot of the couple vclock-context on a vnode (N=1 in map-reduce);
      • Then compare the retrieved context to context C1 (the one based upon which it had taken the decision to delete); and:
        • If equal, issue riakc_pb_socket:delete_vclock with retrieved vclock;
        • If not equal, either:
          • Give up deletion (under likely assumption that there CRDT received update - saying likely as map-reduce could hit vnode with context earlier than C1); or
          • Execute a new read-CRDT & decide & attempt-to-delete-CRDT sequence.

@lukebakken
Copy link
Contributor

Yes, basho/riak_kv#1360 has an eventual goal of dealing with the situation you describe. @russelldb - do you have any comments?

@lukebakken lukebakken modified the milestones: riak-erlang-client-2.5.1, riak-erlang-client-2.5.2 Dec 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants