The purpose of Hinted Handoff is to provide an additional consistency mechanism, allowing consistency to be reached before (or without) read-repair taking place on the key. The idea is to handle transient failures and partitions in situations where performing read repair at the end of the get pipeline is expensive (e.g., across data centers) or where quorums are otherwise unwanted.
required-writes not achieved), we still return an exception to the client; if the previous step succeeded, specify within the exception's explanation string that a handoff has occurred.
Note that this is a deviation from the original Dynamo paper: there are no sloppy quorums for reads; if required-writes aren’t met by a strict quorum, the request is still considered failed (even if hinted handoff succeeds); and hints are written to random nodes rather than to neighbours in the ring, to avoid cascading failures.
Hinted handoff is not enabled by default.
The strategy that decides which live replica receives a request is a pluggable parameter. There are several implementations:
any-handoff (HandoffToAnyStrategy): This is the original strategy. It chooses a random live node in the cluster and hands the request off to it. Note: there may be scalability issues with this pattern in larger clusters (more than 15 nodes per data center).
consistent-handoff (ConsistentHandoffStrategy): Hands off to any of the replica-factor nodes adjacent to the failed node in the ring, the list being static and predetermined.
proximity-handoff (ProximityHandoffStrategy): Like HandoffToAnyStrategy, but routes the hints according to zone proximity to the client’s zone (data center) ID. Useful if all clients in a specific zone fail, for example.
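The three selection rules can be sketched roughly as follows. This is a toy model under stated assumptions, not Voldemort's implementation: `HandoffSketch`, `Node`, and all method names are hypothetical, the ring is modeled as a list in ring order, "adjacent" is taken to mean the nodes that follow the failed node, and zone proximity is modeled as a list of zone IDs ordered nearest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Toy model for illustration only; none of these names come from Voldemort.
final class HandoffSketch {

    static final class Node {
        final int id;
        final int zone;
        Node(int id, int zone) { this.id = id; this.zone = zone; }
    }

    // Returns the live nodes in ring order.
    static List<Node> liveNodes(List<Node> ring, Set<Integer> liveIds) {
        List<Node> live = new ArrayList<>();
        for (Node n : ring) {
            if (liveIds.contains(n.id)) live.add(n);
        }
        return live;
    }

    // any-handoff: every live node is a candidate, tried in random order.
    static List<Node> anyHandoff(List<Node> ring, Set<Integer> liveIds, Random rng) {
        List<Node> candidates = liveNodes(ring, liveIds);
        Collections.shuffle(candidates, rng);
        return candidates;
    }

    // consistent-handoff: a fixed list of the `count` nodes that follow the
    // failed node in the ring (assumed meaning of "adjacent"); dead nodes are skipped.
    static List<Node> consistentHandoff(List<Node> ring, int failedId, int count,
                                        Set<Integer> liveIds) {
        int pos = -1;
        for (int i = 0; i < ring.size(); i++) {
            if (ring.get(i).id == failedId) pos = i;
        }
        if (pos < 0) throw new IllegalArgumentException("unknown node id " + failedId);
        List<Node> candidates = new ArrayList<>();
        for (int step = 1; step <= count && step < ring.size(); step++) {
            Node n = ring.get((pos + step) % ring.size()); // wrap around the ring
            if (liveIds.contains(n.id)) candidates.add(n);
        }
        return candidates;
    }

    // proximity-handoff: random order as in any-handoff, then a stable sort
    // so that nodes in zones nearest to the client come first.
    static List<Node> proximityHandoff(List<Node> ring, Set<Integer> liveIds,
                                       List<Integer> zonesNearestFirst, Random rng) {
        List<Node> candidates = anyHandoff(ring, liveIds, rng);
        candidates.sort(Comparator.comparingInt((Node n) -> zoneRank(zonesNearestFirst, n.zone)));
        return candidates;
    }

    // Position of a zone in the proximity list; unknown zones sort last.
    static int zoneRank(List<Integer> zonesNearestFirst, int zone) {
        int idx = zonesNearestFirst.indexOf(zone);
        return idx < 0 ? Integer.MAX_VALUE : idx;
    }

    public static void main(String[] args) {
        List<Node> ring = new ArrayList<>();
        for (int i = 0; i < 6; i++) ring.add(new Node(i, i % 2)); // two zones: 0 and 1
        Set<Integer> live = new HashSet<>(Arrays.asList(0, 2, 3, 4, 5)); // node 1 is down
        Random rng = new Random();
        for (Node n : consistentHandoff(ring, 1, 2, live)) {
            System.out.println("consistent candidate: node " + n.id);
        }
        System.out.println("any-handoff first pick: node " + anyHandoff(ring, live, rng).get(0).id);
        System.out.println("proximity first pick: node "
                + proximityHandoff(ring, live, Arrays.asList(1, 0), rng).get(0).id);
    }
}
```

The sketch makes the trade-off visible: any-handoff spreads hints over the whole cluster, consistent-handoff restricts them to a small predetermined set, and proximity-handoff keeps the random choice but prefers nearby zones.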
To enable hinted handoff, specify the strategy type in the store definition (in stores.xml) for the appropriate stores.
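A store definition with hinted handoff enabled might look roughly like the fragment below. The store name and all values are placeholders, and the element name `hinted-handoff-strategy` is an assumption based on the usual stores.xml layout; check it against the schema of your Voldemort version.

```xml
<!-- Illustrative stores.xml fragment; the store name and values are placeholders. -->
<store>
  <name>example-store</name>
  <persistence>bdb</persistence>
  <routing>client</routing>
  <replication-factor>2</replication-factor>
  <required-reads>1</required-reads>
  <required-writes>2</required-writes>
  <!-- Assumed element name; the value is one of the strategy types listed above. -->
  <hinted-handoff-strategy>consistent-handoff</hinted-handoff-strategy>
  <key-serializer>
    <type>string</type>
  </key-serializer>
  <value-serializer>
    <type>string</type>
  </value-serializer>
</store>
```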
On the server side, you can use the following parameters:
|Parameter||Default||What it means|
|slop.store.engine||bdb||What storage engine should we use for storing misdelivered messages that need to be rerouted?|
|slop.pusher.enable||true||Enable the slop pusher job, which pushes slops every ‘slop.frequency.ms’ milliseconds|
|slop.read.byte.per.sec||10 * 1000 * 1000||Maximum slop read throughput, in bytes per second|
|slop.write.byte.per.sec||10 * 1000 * 1000||Maximum slop write throughput, in bytes per second|
|pusher.type||StreamingSlopPusherJob||Job type to use for pushing out the slops|
|slop.frequency.ms||5 * 60 * 1000||Interval, in milliseconds, at which we’ll try to push out the slops|