HTTPS clone URL
Subversion checkout URL
- Berkeley DB Java Engine Tuning Tips
- Binary JSON Serialization
- Build and Push Jobs for Voldemort Read Only Stores
- C++ client build instructions
- Client side failure detector implementations
- EC2 Testing Infrastructure
- Fun Projects
- Hinted Handoff
- JMX Monitoring
- Krati storage engine
- Multi datacenter results
- New configuration system
- Performance Tool
- Powered By Voldemort
- Presentations and Talks
- RoutedStore redesign
- Server side transforms in Voldemort
- Topology awareness capability
- Upgrading from 0.81
- Voldemort Admin tool
- Voldemort dev roadmap
- Voldemort thin client
- Writing own client for Voldemort
Clone this wiki locally
The purpose of Hinted Handoff is to provide an additional consistency mechanism, allowing consistency to be reached before (or without) read-repair taking place on the key. The idea is to handle transient failures and partitions in situations where performing read repair at the end of the get pipeline is expensive (e.g., across data centers) or where quorums are otherwise unwanted.
- Any single put or delete to a node (whether a part of a quorum or not) fails due to a non-application exception i.e., timeout, machine unreacheable
- If the operation is part of a background request, then upon failure a hinted handoff takes place asynchronously
- If the operation is a foreground request (a part of ConfigureNodes where we look for a master), then we add the id of the failed node to a list
- For each failed operation, use a specified HintedHandoffStrategy to route the request to a node elsewhere in the cluster and perform a synchronous put to the slop store on that node, retaining the original vector clock. A slop store is a separate store that is not used for live quorums, but merely stores these hints. Each hint consist of: key, version (vector clock), original node, time handoff occurred, operation (put or a delete) and, if the operation is a put, the value. The hints are serialized using protocol buffers.
- When in a multi-colo environment, make sure to route hints that occur due to failures of remote nodes to nodes in the local datacenter. Route hints occuring due to failures of local nodes to local datacenter.
- If ultimately the put or delete is a failure (
required-writesnot achieved), we still return an exception to the client; if previous step succeeded, specify within the exception explanation string that a handoff has occurred
- Periodically the nodes holding the hints should attempt to write the original key and value (or a request to delete the specified version) to the original (once failed) node. Once a put or delete is confirmed as having succeeded to the original node, we can delete the hint. To
preserve bandwidth and avoid round trip delays, we made an AdminClient (streaming) based implementation of hinted handoff.
Note that this is a deviation from the original Dynamo paper: there are no sloppy quorums for reads; if
required-writes aren’t met by a strict quorum, the request is still considered failed (even if hinted handoff succeeds) and hints are written to random nodes rather than to neighbours in the ring to avoid cascading failures.
Hinted handoff is not enabled by default
The strategy of which decides which live replica receives a request is a pluggable parameter. There are several implementations:
any-handoff (HandoffToAnyStrategy) — This is the original strategy. Here we choose
a random live node in the cluster and hand the request off to it. Note: there may be scalability issues with this specific pattern in larger (> 15 nodes per datacenter) clusters.
consistent-handoff (ConsistentHandoffStrategy) — Handoff to any of the
replica-factor nodes adjacent to the failed node in the ring, the list being static and predetermined.
proximity-handoff (ProximityHandoffStrategy) — Like HandoffToAnyStrategy but will route the hints according to the zone proximity to the client’s zone (data-center) id. Useful if all clients in a specific zone fail, for example.
To enable hinted handoff, specify the strategy type in a store definition – in stores.xml for the appropriate stores -
On the server side these are the parameters that you can use
|Parameter||Default||What it means|
|slop.store.engine||bdb||What storage engine should we use for storing misdelivered messages that need to be rerouted?|
|slop.pusher.enable||true||Enable the slop pusher job which pushes every ‘slop.frequency.ms’ ms|
|slop.read.byte.per.sec||10 * 1000 * 1000||Slop max read throughput|
|slop.write.byte.per.sec||10 * 1000 * 1000||Slop max write throughput|
|pusher.type||StreamingSlopPusherJob||Job type to use for pushing out the slops|
|slop.frequency.ms||5 * 60 * 1000||Frequency at which we’ll try to push out the slops|