Skip to content
afeinberg edited this page Aug 17, 2010 · 18 revisions

What is it

The purpose of Hinted Handoff is to provide additional consistency guarantees, allowing consistency to be reached before (or without) read-repair taking place on the key. The idea is to allow handle transient failures and partitions in situations where perform read repair at the end of the get pipeline is expensive e.g., across data centers or where quorums are otherwise unwanted.

How it works

  1. Any single put or delete to a node (whether a part of a quorum or not) fails due to a non-application exception i.e., timeout, machine unreacheable
  2. At the end of the put (or delete) pipeline, a list of nodes to which requests have failed is kept.
  3. For each failed node in the list of failed nodes, randomly choose a live node elsewhere in the pipeline and perform a synchronous put to the slop store on that node, retaining the original vector clock. A slop store is a separate store that is not used for live quorums, but merely stores these hints. Each hint consist of: key, version (vector clock), original node, time handoff occured, operation (put or a delete) and, if the operation is a put, the value. The hints are serialized using protocol buffers.
  4. If ultimately the put or delete is a failure (required-writes not achieved), still return an exception to the client but if previous step suceeded specify, within the exception explanation string that a handoff has occurred
  5. Periodically, the nodes holding the hints should attempt to write the original key and value to the original node that failed in the first place. One a put or delete is confirmed as having occured, we can delete the hint

Note, that this is a deviation from the original Dynamo paper. There are no sloppy quorums for reads, if required-writes aren’t met by a strict quorum the request is still considered failed (even if hinted handoff succeeds) and hints are written to random nodes rather than to neighbours to avoid cascading failures.

Test plan

There are several things to test out for Hinted Handoff:

  1. Correct semantics. This is tested through unit tests.
  1. Transient (a few hours) failures on a real cluster, under realistic load are correctly handed off and restored. The plan is simple:
    start a cluster, start doing a mixed read/write work load (with known values written to known keys), kill a node in the cluster. Keep it down for an hour, bring the node back. Verify that the values written can now be read.
    1. There is no significant impact on performance on the other nodes in the cluster
    2. When a node is returned and a push of the hints occurs, the pushes shouldn’t cause a disrupt to ongoing activity (puts, writes)
  1. Longer terms (a whole day) failures don’t result in unexpected disk fill up or disruptive hand off push operations upon the return of the node.
  1. Failures of multiple nodes (for hours) should also not cause a disruption in normal activity.

Additional implementation work

  1. Hints should only be handed off to nodes within the same zone. This is simple logic and it should be implemented once the regular logic has been verified to be correct and safe
  1. Hinted handoff should be able to handle multi-hour (and if possible, multi-day) network partitions
  1. There should be some operational visibility into the slop store e.g., ability to see how many hints remain, ability to clear out hints

Clone this wiki locally