Skip to content

A bitcask inspired, replicated key-value store.

Notifications You must be signed in to change notification settings

jdockerty/squirrel

Repository files navigation

squirrel

As in "to squirrel away".

A (replicated) persistent key-value store which uses a simple implementation of bitcask as the underlying storage mechanism.

How it works

This follows a simple structure and exposes a very common API surface: set(k,v), get(k), and remove(k).

Get

A get(k) operation will read from the running store via its keydir. The value must exist here in order for it to be returned1, as the keydir contains a mapping of all known values and their respective offsets in the active or compacted log files.

If a key exists, its containing log file is opened, the offset is seeked to, and the entry deserialised for return.

In the event of a None value, this signifies either a tombstone value (from prior removal) or a key which has never existed. In either case, the value does not exist in the keydir so no value is returned.

Set

sequenceDiagram
    participant client
    participant squirrel
    participant WAL as active log file
    participant keydir

    client->>squirrel: set(k, v)
    squirrel->>WAL: Append 'LogEntry'
    note right of WAL: LogEntry consists of<br/>key (k), value (v)<br/> and metadata
    WAL-->>squirrel: Acknowledged
    squirrel->>keydir: Upsert KeydirEntry
    note right of keydir: Maps key to <br/>log file and offset
    keydir-->>squirrel: Acknowledged
Loading

Note

For now, both keys and values are restricted to String types.

Remove

A call to remove(k) is similar to get(k), except a tombstone value, represented by None, is appended to the active log file and updated in the keydir.

The tombstone value signifies that the entry should be dropped on the next compaction cycle2. This means that the value will no longer be present afterwards.

Attempting to remove a key which does not exist will result in an error.

Client/Server

TL;DR it uses gRPC with tonic.

See client.rs/server.rs/proto definitions.

Replication

A simplistic strategy for replication is achieved via the replication crate.

Notes

This was initially built through my implementation of the PingCAP talent plan course for building a key-value store in Rust:

And has since grown into my own toy project.

Footnotes

  1. When a call to open is made, the directory is scanned for any log files, which means that in the event of a crash or restart the keydir is always rebuilt to its prior state.

  2. Compaction is not a background job, it is a simple check over MAX_LOG_FILE_SIZE after either a set or remove operation, as these cause an append to the active log file.

Releases

No releases published

Packages

No packages published

Languages