Consider replacing vector clocks with naive timestamp-based conflict resolution #2784

Closed
timmaxw opened this issue Jul 31, 2014 · 11 comments

@timmaxw
Member

timmaxw commented Jul 31, 2014

In #2663, we decided that the new ReQL administrative API should handle vector clock conflicts as follows: Reading from a conflicted value produces an error. Writing to a conflicted value resolves the conflict.

However, I'm not sure that this is actually the best solution. The problem is that we have no good way to return a document that is partially in error. Imagine this situation: A user accidentally causes a vector clock conflict on a table's database field. Now they can't access the table's metadata at all; they get an error telling them to "overwrite the document". But maybe they've forgotten what they had the config set to. Even though the valid config is still stored on the server, the user can't access it; they have to reconstruct it. The user experience is similarly bad if they do a range scan over the rethinkdb.table_config artificial table. The table with the metadata conflict will not appear; instead, they will get a message saying that there was an error.

We should consider replacing vector clocks with a structure consisting of (timestamp, uuid, value), where timestamp is a time_t. The semilattice join is defined by comparing first by timestamp, then by UUID. When the user writes to a field, the server will set timestamp to the larger of the server's current time() and the old timestamp plus one, and it will set uuid to its machine ID. (Or peer ID. Or a newly generated UUID. It doesn't matter.)
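A minimal sketch of the proposed structure, in Python rather than the server's own code. The names `Versioned`, `join`, and `write` are hypothetical and chosen for illustration only; the `now` parameter is an assumption added so the clock can be injected in tests.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Versioned:
    """A (timestamp, uuid, value) triple, as described in the proposal."""
    timestamp: int          # seconds since the epoch, like a time_t
    writer_id: uuid.UUID    # machine ID, peer ID, or a fresh UUID
    value: Any


def join(a: Versioned, b: Versioned) -> Versioned:
    """Semilattice join: compare by timestamp first, then by UUID."""
    if (a.timestamp, a.writer_id) >= (b.timestamp, b.writer_id):
        return a
    return b


def write(old: Versioned, new_value: Any, server_id: uuid.UUID,
          now: Optional[int] = None) -> Versioned:
    """Stamp a write with max(current time, old timestamp + 1)."""
    current = int(time.time()) if now is None else now
    return Versioned(max(current, old.timestamp + 1), server_id, new_value)
```

Taking `old.timestamp + 1` as a lower bound means a write made on a server always supersedes the value that server could already see, even if its clock is behind. Because `join` picks the maximum under a total order, it is commutative, associative, and idempotent, so servers converge regardless of the order in which they exchange values.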

Normally, this works transparently, just like vector clocks. If the user issues two updates almost simultaneously, one will be chosen arbitrarily. If the user writes the same field on both sides of a netsplit, whichever write happens later (by wall-clock time) will be chosen, as long as the servers' clocks are sane. If the servers' clocks are insane, then everything still works properly, except that if the user writes the same field on both sides of a netsplit the winner will be arbitrary. The user never sees a conflict; the system always picks a value for the field.
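The netsplit cases above can be illustrated with a small self-contained simulation. This is a sketch under stated assumptions: `Entry`, `join`, and `write` are hypothetical names, server IDs are plain strings instead of UUIDs, and clock readings are passed in explicitly to model sane and skewed clocks.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    timestamp: int
    server_id: str
    value: str


def join(a: Entry, b: Entry) -> Entry:
    # Highest (timestamp, server_id) wins; the value is never compared.
    return max(a, b, key=lambda e: (e.timestamp, e.server_id))


def write(old: Entry, value: str, server_id: str, clock: int) -> Entry:
    # The new timestamp is never lower than old.timestamp + 1.
    return Entry(max(clock, old.timestamp + 1), server_id, value)


# State that both sides of the netsplit agree on before the split.
shared = Entry(100, "server-a", "db1")

# Sane clocks: B writes later by wall-clock time, so B's write wins.
a_sane = write(shared, "from_a", "server-a", clock=110)
b_sane = write(shared, "from_b", "server-b", clock=120)
sane_winner = join(a_sane, b_sane)

# Skewed clock: B's clock reads 90, so its write is stamped 101
# and A's write wins, regardless of which one really happened later.
a_skew = write(shared, "from_a", "server-a", clock=110)
b_skew = write(shared, "from_b", "server-b", clock=90)
skew_winner = join(a_skew, b_skew)
```

In both cases every server ends up with the same value after the split heals, and no conflict state is ever exposed to the user; only the choice of winner depends on clock quality.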

This issue is probably not important.

@timmaxw timmaxw added this to the reql-admin milestone Jul 31, 2014
@timmaxw
Member Author

timmaxw commented Jul 31, 2014

Another benefit is that we wouldn't have to think about conflict states in all the code that reads from the vector clocks.

@mlucy
Member

mlucy commented Jul 31, 2014

> However, I'm not sure that this is actually the best solution. The problem is that we have no good way to return a document that is partially in error.

We could introduce a pseudotype for vector clock conflicts. (I thought that's what the plan was.) You can read this pseudotype to get the values in conflict, but if you try to use it in a ReQL expression it produces an error.

@timmaxw
Member Author

timmaxw commented Jul 31, 2014

That would work. But I think that the users' lives would be much simpler if we got rid of vector clocks completely.

@timmaxw
Member Author

timmaxw commented Jul 31, 2014

Well, "much simpler" is the wrong phrase. A typical user will rarely encounter vector clock conflicts. But I think they will be quite inconvenient when they do occur; the user will probably see them as a hassle. This will be especially true with the new ReQL admin API, because we'll encourage people to use scripts to configure their cluster. Vector clock conflicts matter when two humans make different changes at the same time, but it's rare for human admins to create one. With automated configuration tools, the risk of a coincidental conflict is higher, and the probability that the conflict is actually worth bothering the user about is lower.

@timmaxw
Member Author

timmaxw commented Aug 1, 2014

Another problem with vector clocks is that there's no good way to handle a vector clock conflict on the name field of a table when we are using the table's name as the primary key for rethinkdb.table_config.

@timmaxw
Member Author

timmaxw commented Aug 1, 2014

On further thought, the proposal to have the system raise an error if the user tries to read a vector clock conflict is hard to implement. Currently, we implement writes as a function that maps the old value to the new value; we would have to distinguish between writes that use the old value and writes that don't. It seems to me that naive timestamp-based resolution is significantly easier to implement than either of the vector-clock solutions.
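To illustrate why the distinction is awkward, here is a hypothetical sketch of writes modelled as functions from the old value to the new one. None of these names (`Resolved`, `Conflicted`, `apply_write`, `ConflictError`) come from the RethinkDB codebase; they are assumptions made for the example.

```python
from typing import Any, Callable, List


class ConflictError(Exception):
    """Raised when a write tries to read a conflicted value."""


class Resolved:
    def __init__(self, value: Any) -> None:
        self._value = value

    def get(self) -> Any:
        return self._value


class Conflicted:
    def __init__(self, candidates: List[Any]) -> None:
        self.candidates = list(candidates)

    def get(self) -> Any:
        # Reading a conflicted field is the error case from #2663.
        raise ConflictError("field is in conflict; overwrite it to resolve")


def apply_write(field: Any, fn: Callable[[Any], Any]) -> Resolved:
    # The server cannot tell in advance whether fn uses the old value;
    # it only finds out when fn calls field.get().
    return Resolved(fn(field))


def overwrite(field: Any) -> str:
    return "new_db"                 # ignores the old value


def rename(field: Any) -> str:
    return field.get() + "_v2"      # depends on the old value
```

With this shape, `apply_write(Conflicted(["db1", "db2"]), overwrite)` resolves the conflict, while the same call with `rename` raises `ConflictError`. Every write path would need this kind of wrapping, which is the extra complexity that timestamp-based resolution avoids.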

@timmaxw
Member Author

timmaxw commented Aug 16, 2014

This proposal has been approved. We'll implement it as part of the ReQL admin changes.

@mlucy
Member

mlucy commented Aug 16, 2014

I'm a little bit scared of this, but I guess it's probably fine.

@coffeemug
Contributor

I think it's fine for a couple of reasons:

  • People take advantage of this functionality (changing values on two sides of a netsplit) extremely rarely if ever, and it's usually by accident.
  • If a metadata conflict does happen, empirically people are very surprised and have no idea how to fix it. They're also frustrated that they have to deal with the issue.
  • It's temporary. Eventually we'll replace this with a consensus algorithm, so conflicts won't be able to happen at all.

I think timestamp-based resolution would result in a dramatically better user experience when the edge cases do happen. It's probably not ideal for large deployments, but for the moment there are much bigger issues in those scenarios anyway. By the time we fix those, we'll probably also add a consensus algorithm so this problem will go away.

@timmaxw
Member Author

timmaxw commented Aug 21, 2014

Implementation is in CR 1994.

@timmaxw
Member Author

timmaxw commented Aug 25, 2014

Merged into reql_admin in 284e341.

@timmaxw timmaxw closed this as completed Aug 25, 2014
@danielmewes danielmewes modified the milestones: reql-admin, 1.16 Jan 2, 2015