Architecture :: Core Persistence
The main purpose of the stasis project is to securely store binary data and to ensure that sufficient copies of it are kept, while allowing for easy retrieval by its owner.
Crates represent data stored by the system; they are usually accompanied by Manifests that describe the data itself, where it came from and where it was sent.
Crates are stored on one or more Nodes, either locally or remotely. The primary task of routing is to ensure that crates are copied to some minimum number of nodes, for redundancy. The goal is to create a type of shared network storage with redundancy guarantees.
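The routing decision can be sketched roughly as follows; the `Node` type and `route_crate` function are illustrative names only and do not come from the project:

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node model: an identifier and its free storage in bytes.
    id: str
    available: int


def route_crate(crate_size: int, min_copies: int, nodes: list[Node]) -> list[Node]:
    """Pick nodes that can each hold a copy of the crate, until the
    minimum redundancy requirement is met."""
    chosen = [n for n in nodes if n.available >= crate_size][:min_copies]
    if len(chosen) < min_copies:
        raise RuntimeError("not enough nodes to satisfy the redundancy requirement")
    return chosen


# Example: three nodes, two copies required; node "b" is too small to qualify.
nodes = [Node("a", 100), Node("b", 10), Node("c", 200)]
targets = route_crate(crate_size=50, min_copies=2, nodes=nodes)
```

In this sketch, `targets` ends up holding nodes `a` and `c`, the first two nodes with enough free space.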
A local node is storage available directly to the device itself, though not necessarily a local disk. For example, such a node can be created on a network share; it is considered local to the device even though the data is sent over a network. A local node could be a directory storing individual crates, a container holding multiple crates in a single file, or an in-memory data store.
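The local-node variants above share a common shape: something that can accept and return crate data. A minimal sketch, assuming hypothetical `CrateStore` interfaces (the project's actual types may differ):

```python
import abc
import pathlib


class CrateStore(abc.ABC):
    """Hypothetical local-node interface: put and get crate data by id."""

    @abc.abstractmethod
    def put(self, crate_id: str, data: bytes) -> None: ...

    @abc.abstractmethod
    def get(self, crate_id: str) -> bytes: ...


class MemoryStore(CrateStore):
    """In-memory data store; crates live only as long as the process."""

    def __init__(self) -> None:
        self._crates: dict[str, bytes] = {}

    def put(self, crate_id: str, data: bytes) -> None:
        self._crates[crate_id] = data

    def get(self, crate_id: str) -> bytes:
        return self._crates[crate_id]


class DirectoryStore(CrateStore):
    """Directory-backed store; each crate is an individual file under `root`."""

    def __init__(self, root: pathlib.Path) -> None:
        self.root = root

    def put(self, crate_id: str, data: bytes) -> None:
        (self.root / crate_id).write_bytes(data)

    def get(self, crate_id: str) -> bytes:
        return (self.root / crate_id).read_bytes()


store = MemoryStore()
store.put("crate-1", b"example data")
```

The same interface could back the network-share and single-file-container cases; only the `put`/`get` implementations change.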
A remote node is storage available on other peers of the core network; those peers may themselves have local nodes where crates are stored, or they may delegate storage to other remote nodes. HTTP and gRPC, both over TLS, are the protocols available for data exchange between remote nodes.
Before storing a Crate, the required storage needs to be reserved on the appropriate nodes; this avoids wasting time and resources on pushing data that cannot be fully stored by those nodes.
The current routing implementation is lacking in two areas:

- If node A reserves storage on node B, and node B itself should send copies to nodes C and D, the reservations for nodes C and D cannot be made upfront; node B will still attempt to send copies to those nodes, but the attempts could fail (for example, if a node or its storage is not available).
- There is currently no retry or callback mechanism that would instruct the original sender to try to resolve such a failure.
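The failure scenario can be sketched as a small simulation; the `reserve` and `push` helpers below are hypothetical and exist only to illustrate the flow:

```python
def reserve(node: str, size: int, availability: dict[str, int]) -> bool:
    """Attempt to reserve `size` bytes on `node`; True on success."""
    if availability.get(node, 0) >= size:
        availability[node] -= size
        return True
    return False


def push(target: str, size: int, availability: dict[str, int],
         placed: list[str]) -> bool:
    """Push a copy to `target` if a reservation succeeds there."""
    if reserve(target, size, availability):
        placed.append(target)
        return True
    return False


availability = {"B": 100, "C": 100, "D": 0}  # node D has no free storage
placed: list[str] = []

# Node A reserves on B upfront and pushes successfully.
push("B", 50, availability, placed)

# Node B then forwards copies to C and D; A never reserved those upfront.
for target in ["C", "D"]:
    if not push(target, 50, availability, placed):
        # The push to D fails here, and nothing notifies A of the shortfall.
        pass
```

After the run, only B and C hold copies; D silently missed its copy, and without a retry or callback mechanism the original sender never learns about it.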
Due to the above limitations, it cannot be guaranteed that enough (or any) remote nodes will be able to accept copies of the data; the redundancy requirements will still be satisfied, but possibly only by local nodes. At this stage, remote copies should be considered best-effort.