Strong Consistency Reference
Riak was originally designed as an eventually consistent system, fundamentally geared toward providing partition (i.e. fault) tolerance and high read and write availability.
While this focus on high availability is a great fit for many data storage needs, there are also many use cases for which strong data consistency is more important than availability. Basho introduced a new strong consistency option in version 2.0 to address these use cases. In Riak, strong consistency is applied using bucket types, which enables developers to apply strong consistency guarantees on a per-key basis.
Strong vs. Eventual Consistency
If you successfully write a value to a key in a strongly consistent system, the next successful read of that key is guaranteed to show that write. A client will never see out-of-date values. The drawback is that some operations may fail if an insufficient number of object replicas are available. More on this in the section on trade-offs.
In an eventually consistent system, on the other hand, a read may return an out-of-date value, particularly during system or network failures. The advantage of this approach is that reads and writes can succeed even when a cluster is experiencing significant service degradation.
Building on the example presented in the eventual consistency doc,
imagine that information about who manages Manchester United is stored
in Riak, in the key
manchester-manager. In the eventual consistency
example, the value associated with this key was originally
David Moyes, meaning that that was the first successful write to that
key. But then
Louis van Gaal became Man U's manager, and a write was
executed to change the value of
Now imagine that this write failed on one node in a multi-node cluster.
Thus, all nodes report that the value of
Louis van Gaal except for one. On the errant node, the value of the
manchester-manager key is still
David Moyes. An eventually
consistent system is one in which a get request will most likely return
Louis van Gaal but could return the outdated value
In a strongly consistent system, conversely, any successful read on
manchester-manager will return
Louis van Gaal and never
Reads will return
Louis van Gaal every single time until Man U gets a new
manager and someone performs a successful write to
to change its value.
It might also be useful to imagine it a bit more abstractly. The following causal sequence would characterize a strongly consistent system:
- The value of the key
kis set to
- All successful reads on
- The value of
kis changed to
- All successful reads on
- And so forth
At no point in time does this system return an out-of-date value.
The following sequence could characterize an eventually consistent system:
- A write is made that sets the value of the key
- Nearly all reads to
v, but a small percentage return
- A write to
kchanges the value to
- Nearly all reads to
v2, but a small number return the outdated
not found) because the newer value hasn't yet been replicated to all nodes
Making the Strong vs. Eventual Decision
The first system described above may sound like the undisputed champion, and the second system undesirable. However:
- Reads and writes on the first system will often be slower---if only by a few milliseconds---because the system needs to manage reads and writes more carefully. If performance is of primary concern, the first system might not be worth the sacrifice.
- Reads and writes on the first system may fail entirely if enough servers are unavailable. If high availability is the top priority, then the second system has a significant advantage.
So when deciding whether to use strong consistency in Riak, the following question needs to be asked:
For the specific use case at hand, is it better for reads to fail than to return a potentially out-of-date value?
If the answer is yes, then you should seriously consider using Riak in a strongly consistent way for the data that demands it, while bearing in mind that other data can still be stored in Riak in an eventually consistent way.
Using Riak in a strongly consistent fashion comes with two unavoidable trade-offs:
- Less availability
- Slightly slower performance
Strongly consistent operations are necessarily less highly available
than eventually consistent operations because they require a quorum
of available object replicas to succeed. Quorum is defined as N / 2 + 1,
n_val / 2 + 1. If N is set to 7, at least 4 object replicas must be
available, 2 must be available if N=3, etc.
If there is a network partition that leaves less than a quorum of object replicas available within an ensemble, strongly consistent operations against the keys managed by that ensemble will fail.
Nonetheless, consistent operations do provide a great deal of fault tolerance. Consistent operations can still succeed when a minority of replicas in each ensemble can be offline, faulty, or unreachable. In other words, strongly consistent operations will succeed as long as quorum is maintained. A fuller discussion can be found in the operations documentation.
A second trade-off regards performance. Riak's implementation of strong consistency involves a complex consensus subsystem that typically requires more communication between Riak nodes than eventually consistent operations, which can entail a performance hit of varying proportions, depending on a variety of factors.
Ways to address this issue can be found in strong consistency and performance.