Skip to content

lorin/raft-pluscal

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

91 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Raft spec in PlusCal (work in progress)

Introduction

This doc describes a specification of the Raft distributed consensus algorithm using PlusCal. The author of Raft, Diego Ongaro, has already written a TLA+ spec of Raft, but I wanted to write in PlusCal in order to better learn PlusCal and to see if I could write a more readable spec than the original, which I find difficult to follow.

I'm going to draw heavily on the memory interface examples from Leslie Lamport's Specifying Systems book, so you'll see similarities in some of the specs here to those examples.

Functional interface

Let's start by specifying an interface that describes the functionality that Raft provides. The specification at this level won't have anything about the Raft implementation itself, it will apply equally to any other kind of system that provides the same functional behavior (a non-distributed implementation, a Paxos-based one, etc.).

Functionally, Raft implements a key-value data store. If we were to define its interface in Java, and we assume for simplicifity that it only supports integers, it would look something like this:

interface Store {
    void Write(String key, int value);

    int Read(String key);
}

Since we're modeling a distributed system, we need to think in terms of messages: a client sends a write or a read request to the store, and the store eventually replies with a response.

Sequential store

Before we implement a specification for Raft, we'll start with something simpler: a specification for a sequential store. By a sequential store, we mean that the clients cannot make concurrent requests: only one client is permitted to make a request at a time.

The specification will contain two kinds of processes:

  • clients that submit read and write requests
  • the store that processes the requests and responds

Since only one client can make a request at a time in a sequential store, we could use a single client as a model. Instead, I'm going to use multiple clients, but use a mutex to ensure that only one client executes at a time. This should make the code easier to generalize to the final Raft implementation.

My focus here is on readability of the model, so I'm going to make heavy use of PlusCal macros so the model is easier to read.

The complete model is at SequentialStore.tla, I'll describe elements of it here.

Here's how the clients behave:

process Client \in 1..N
begin
c1: acquireMutex();
c2: with var \in Variables, val \in Values do
        either sendReadRequest(var);
        or sendWriteRequest(var, val)
        end either;
    end with;
c3: awaitResponse();
c4: releaseMutex();
    goto c1;
end process

A client acquires a mutex to ensure that client requests don't overlap. Clients send or a request for a read or a write, and then wait for a response.

Here's the store behavior:

process Store = 0
variables request,
          response,
          dict = [x \in Variables |-> NoVal];
begin
s1: getNextRequest();
s2: if request.op = ReadOp then
        response := Message(ResponseType, request.client, ReadOp, request.var, dict[request.var]);
    else \* it's a write
        dict[request.var] := request.val;
        response := Message(ResponseType, request.client, WriteOp, request.var, request.val);
    end if;
s3: responseQueues[response.client] := Append(responseQueues[response.client], response);
s4: goto s1;
end process

The store retrieves the next request from its queue, and then either reads to the dict variable if it's a read, or writes to it if it's a write. It takes the response message and places it at the head of the queue corresponding to the client.

The safety property we are interested in here is: every read of a variable must correspond to the most recent write of that variable. I expressed that like this:

ReadLastWrite == \A i \in 1..Len(log), var \in Variables, val \in Values :
   IsRead(i, var, val) =>
       \E j \in 1..(i-1) : /\ IsWrite(j, var, val)
                           /\ ~ \E k \in (j+1)..(i-1), v \in Values \ {val}: IsWrite(k, var, v)

In order to check this property, I needed to keep track of the history of requests and responses. To do that, I created a variable called log that contains a sequence. Every time a client sends a request or receives a response, the client appends to the log:

macro sendRequest(msg) begin
    requestQueue := Append(requestQueue, msg);
    log := Append(log, msg);
end macro;

...

macro awaitResponse()
begin await Len(responseQueues[self]) > 0;
      log := Append(log, Head(responseQueues[self]));
      responseQueues[self] := Tail(responseQueues[self]);
end macro;

The log variable is a good example of a bookkeeping variable that would not appear in a actual implementation, but is needed in the model in order to check a property with the model checker.

Checking liveness

As an exercise, we'll do a simple liveness check. The check we'll do is to ensure that clients can always make progress.

Here's the liveness property we'll check:

AllClientsEventuallyRun == \A client \in 1..N : []<>(pc[client] = "c2")

This says that clients can always eventually reach label "c2".

This property isn't satisfied by this model: client 1 could keep reacquiring the mutex, starving client 2. The TLA+ toolbox will check this for us.

Liveness checks require temporal properties, which are much slower to check in the TLA+ toolbox than invariants.

Before I ran the liveness check, I commented out all code that involved the "log" variable. That variable is only used for checking a safety property, and without it the liveness check is much simpler.

I deliberately chose a small model to make things simple. These are the variables specified by "What is the model" in the Model Overview of the TLA+ toolbox.

WriteOp <- "Write"
ReadOp <- "Read"
N <- 2
ResponseType <- "Response"
Variables <- [ model value ] {x}
defaultInitValue <- [ model value ]
Values <- {1}
NoVal <- [ model value ]
RequestType <- "Request"

The property we check is "AllClientsEventuallyRun".

This model runs in under a second. It immediately finds an error: "Temporal properties were violated".

In the Error-Trace window on the right, the last line is <Back to state 5>. The model checker found a path where it could loop forever without the liveness proeprty being true.

Linearizability

Raft claims to implement a linearizable store. Before we get into the Raft implementation itself, let's make it clear what linearizability is.

From Herlihy and Wing, who introduced linearizability:

Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response... ... First, each operation should appear to “take effect” instantaneously, and second, the order of nonconcurrent operations should be preserved.

More formally, from the Hierlihy and Wing paper:

A history H is linearizable if it can be extended (by appending zero or more response events) to some history H' such that

L1: complete(H') is equivalent to some legal sequential history S, and

L2:

where:

A history is a finite sequence of operation invocation and response events. You can think of these events as being from a client's point of view: a client invokes an operation on an object, and, some time in the future, the client receives a response from the object.

complete(H) is the maximal subsequence of H consisting only of invocations and matching responses.

is an irreflexive partial order of operations in H. In particular, if res(e0) precedes inv(e1) in H, where res(e0) is the response associated with operation e0, and inv(e1) is the invocation of operation e1.

A history H is sequential if:

  1. The first event of H is an invocation.
  2. Each invocation, except possibly the last, is immediately followed by a matching response. Each response is immediately preceded by a matching invocation.

Let's define a PlusCal operator that returns true if a history is lineralizable.

Raft

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages