PBS for Cassandra
Probabilistically Bounded Staleness (PBS) describes the probability that an eventually consistent data store returns consistent data.
This is a clean rewrite of the code we used in our tech report, recently accepted to VLDB 2012. The main purpose of this Cassandra fork and accompanying analysis code is to provide a concise, readable implementation of PBS in a real data store. We've favored patch brevity over optimizations that might be worthwhile someday (e.g., wire protocol changes). We've done some basic checking to ensure that this code matches the original implementation from our paper.
We've implemented the analysis portion in Python, but, after making some design decisions about the best UI, it'd be trivial (a day or so) to implement it in Java and make it accessible via nodetool.
Please feel free to contact us with any questions.
This code instruments Cassandra to gather latency traces (WARS, in milliseconds) and uses those traces to predict the probability of consistent reads; the instrumentation is a basic patch, and the prediction is handled by a basic analysis script and supporting functions.
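For intuition, here is a minimal Monte Carlo sketch of the WARS model (a rough illustration, not the code in pbs/analyze_pbs.py; the trace tuples, tuple-resampling strategy, and k=1 restriction are simplifying assumptions):

```python
import random

def one_trial(n, r, w, t, wars):
    """One Monte Carlo trial of the WARS model (k=1 staleness only).

    wars: list of (W, A, R, S) tuples in ms, where W is the write's
    coordinator->replica delay, A the ack's replica->coordinator delay,
    and R/S the same two legs for a later read.
    """
    # Resample one (W, A, R, S) tuple per replica.
    samples = [random.choice(wars) for _ in range(n)]

    # The write commits once the coordinator holds w acks:
    # the w-th smallest W + A round trip.
    commit = sorted(s[0] + s[1] for s in samples)[w - 1]
    read_start = commit + t  # read issued t ms after commit

    # The read coordinator waits for the r fastest responders (by R + S).
    responders = sorted(samples, key=lambda s: s[2] + s[3])[:r]

    # A responder returns the new value if the write reached it before
    # the read request arrived there.
    return any(s[0] <= read_start + s[2] for s in responders)

def p_consistent(n, r, w, t, wars, iterations=10000):
    """Estimate the probability that a read t ms after commit is fresh."""
    return sum(one_trial(n, r, w, t, wars) for _ in range(iterations)) / iterations
```

For instance, p_consistent(3, 1, 1, 0.0, traces) mirrors the N=3, R=1, W=1 configuration in the sample output below.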
To view the WARS latencies, run nodetool with the pbswars option (e.g., bin/nodetool -h *host* pbswars).
By default, latencies are presented ordered by response number (e.g., the latencies of the first W, A, R, and S received for each request, then the second set, and so on). To see unordered latencies (useful under independence assumptions), run nodetool with the corresponding unordered option.
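To make the distinction concrete, a small sketch with hypothetical data (the actual nodetool output format may differ):

```python
# Hypothetical per-request W latencies (ms), each sublist sorted by
# arrival: [first ack, second ack, third ack].
w_latencies = [[3.1, 4.8, 9.2], [2.7, 5.0, 7.7], [3.4, 4.1, 11.3]]

# Ordered view: grouped by response rank across requests
# (all first acks, then all second acks, ...).
ordered = list(zip(*w_latencies))
# -> [(3.1, 2.7, 3.4), (4.8, 5.0, 4.1), (9.2, 7.7, 11.3)]

# Unordered view: one flat pool, appropriate under IID assumptions.
unordered = [lat for request in w_latencies for lat in request]
```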
To evaluate the probability of consistency given a set of traces, run pbs/analyze_pbs.py. Parameters are currently stored in pbs/pbs-params.yaml and include the replication factor (N), the time waited after a write commits before issuing a read (t, or t-visibility), the number of tolerable versions of staleness (k), the number of Monte Carlo simulations (ITERATIONS), and whether or not latencies are independently and identically distributed (IID_assumption; see the Caveats section).
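For reference, a parameter file might look like the sketch below; the key names and values are illustrative guesses based on the list above, not a verbatim copy of pbs/pbs-params.yaml:

```yaml
# Illustrative sketch of pbs/pbs-params.yaml; key names are assumed.
N: 3                   # replication factor
R: 1                   # replicas contacted per read
W: 1                   # acks required per write
t: 0.0                 # ms waited after write commit before reading
k: 1                   # tolerable versions of staleness
ITERATIONS: 10000      # number of Monte Carlo trials
IID_assumption: false  # resample latencies independently? (see Caveats)
latencymodel: testlatencies.out  # trace file from nodetool pbswars
```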
For example, run bin/nodetool -h localhost pbswars > testlatencies.out, edit pbs/pbs-params.yaml so that latencymodel points to testlatencies.out, then run pbs/analyze_pbs.py. You should see output like:
```
PBS Analysis Tool
At time 0.00ms and maximum version staleness of k=1:
N=3, R=1, W=1
Probability of fresh reads: 0.864730
Average read latency 8.70ms
Average write latency 8.71ms
...
```
Caveats for the Cassandra patch
The latency-gathering code in Cassandra currently requires globally
synchronized clocks and only records remote operations. We can fix
the latter if required (e.g., logging local reads as "zero latency").
This means that if you have exactly n replicas, you probably won't have enough latency data to run simulations where N=n. To fix this, you can make IID assumptions by setting the IID_assumption parameter to True. Note that the Python code does not make independence assumptions by default (as we do in the TR).
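One way to picture what the flag changes (a sketch, not the script's actual sampling code):

```python
import random

def sample_wars(wars, iid):
    """Draw one (W, A, R, S) tuple from collected traces.

    With iid=True, each of the four components is drawn from a
    different trace entry, so a small trace can stand in for any
    number of replicas; with iid=False, whole tuples are resampled,
    preserving correlation between phases of the same operation.
    """
    if iid:
        return tuple(random.choice(wars)[i] for i in range(4))
    return random.choice(wars)
```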
We currently log at most 10,000 WARS operation latencies according to an LRU policy; we can make this number variable in the future.
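The bounding behavior is roughly that of a fixed-size buffer that evicts the oldest entries; in Python terms (the patch itself is Java, so this is only an analogy):

```python
from collections import deque

MAX_TRACES = 10_000  # currently a fixed cap
wars_log = deque(maxlen=MAX_TRACES)

def record(w, a, r, s):
    # Appending past the cap silently drops the oldest entry.
    wars_log.append((w, a, r, s))
```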
The wire protocol can also be made more lightweight (at the expense of
a more complicated patch!).