Skip to content

How Sirius Works

Peter Cline edited this page Feb 25, 2014 · 4 revisions

==== Sirius works internally by distributing updates across all nodes in the cluster. These updates must be applied in the same order across all nodes in the cluster to maintain consistency. This sequence is created by establishing a total ordering of all updates to the state stored by Sirius. Updates are not applied until their turn comes. When applying an update a serialized version is also stored, ahead of its application, in order to allow the state to be recovered in the event of a restart.

When Sirius receives a request for an update it first decides an ordering using a variant of Multi-Paxos. There are ample discussions of what Paxos is and how it works available through various sites on the web. In particular this version is derived from the algorithm described in Paxos Made Moderately Complex. Once the Paxos algorithm arrives at a decision for the ordering it distributes this decision to the cluster. When nodes receive a new decision they notify the requesting client of the completion of ordering and queue it for application to the in-memory state.

As updates may arrive out of order, they are queued in memory until their time has come to be applied. An update is applied once all prior updates have been applied. Updates that arrive at a node prior to a previous update in the Paxos total ordering sequence remain in the queue until the earlier updates are received and applied. Updates may be received from the result of a Paxos round, from local storage when recovering a node’s state, or from another node in the cluster when catching up in the event of missed updates.

Each Sirius node periodically polls its neighbor nodes in the cluster to determine if there have been any updates that have not been processed by the node. This catch-up algorithm will allow nodes that have missed updates to get them from the rest of the cluster. The updates are requested by querying a neighbor node for a block of transactions that have occured after the last transaction in the local log. For example a node that has processed transaction X using a block size of B, will query a neighbor for transactions X+1 through X+B. When a node receives a full block from one of its neighbors it will process the returned transactions and request the next block. When an empty or partial block is returned the node will be caught up to the responding neighbor after processing the transactions returned in the empty or partial block.

Updates are written to persistent storage in a transaction log and are then applied to the in-memory state in an application specific manner using the callback functions. These callback functions are application specific and they have to handle put, get, and delete operations.

For reads, applications generally access the built-up data structures directly.