A formally verified distributed append-only log protocol with concurrent writers, compaction, and readers — built on the coordination-delegated pattern, where stateless workers offload all coordination to an external linearizable store.
The Leaderless Log Protocol implements a distributed append-only log where any writer can append without being elected leader. There is no Raft, no Paxos, no leader election. Instead, writers are stateless processes that delegate ordering and index management to an external linearizable coordination store:
- Monotonic offsets are assigned via the store's `AtomicIncrement` (a sequence counter)
- Log index is maintained as linearizable key-value entries with atomic CAS
- Compaction rewrites ranges of WAL entries into compacted files via a 3-step CAS update
- Fencing prevents stale writers from corrupting the log after a takeover
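The shape of this append path can be sketched in a few lines. The snippet below is illustrative only — a plain dict stands in for the external linearizable store, and the CAS is a bare assert; see 1-leaderless-log-protocol.md for the canonical protocol:

```python
# Illustrative sketch only: a dict stands in for the external linearizable
# store. See 1-leaderless-log-protocol.md for the canonical append path.
store = {"seq": 0, "logState": "OPEN", "index": {}}
wal = []  # stand-in for durable WAL storage

def append(records):
    if store["logState"] != "OPEN":      # fencing: stale writers are rejected
        raise RuntimeError("log is fenced")
    wal.append(records)                  # 1. durable WAL write (data plane)
    wal_loc = len(wal) - 1
    store["seq"] += len(records)         # 2. AtomicIncrement assigns offsets
    end = store["seq"]
    start = end - len(records) + 1
    # 3. CAS-publish the index entry, keyed by its last offset; the key
    #    must not already exist (CAS from "absent" to the new entry)
    assert end not in store["index"]
    store["index"][end] = {"range": (start, end), "type": "WAL", "loc": wal_loc}
    return (start, end)

print(append(["a", "b"]))  # (1, 2)
print(append(["c"]))       # (3, 3)
```

Note how no writer ever talks to another writer — ordering falls out of the store's sequence counter alone.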
The protocol models three interacting sub-protocols — writer append path, compaction index update, and reader path — and is verified in both TLA+ and Fizzbee against safety and liveness properties (see Expected Results).
Full canonical specification: 1-leaderless-log-protocol.md.
The Leaderless Log Protocol formalized in this repo is the protocol used in Ursa, the lakehouse-native streaming engine from StreamNative. The formal spec here is the distilled, model-checked core of the protocol that Ursa runs in production.
- VLDB 2025 paper — Ursa: A Lakehouse-Native Streaming System: https://www.vldb.org/pvldb/vol18/p5184-guo.pdf
- Ursa for Kafka (engineering blog): https://streamnative.io/blog/ursa-for-kafka-native-apache-kafka-service-on-lakestream
S3-Queue is a distributed message queue built entirely on S3-compatible object storage — no external coordination store required. It instantiates the Leaderless Log Protocol with S3 conditional writes as the coordination primitive.
| Protocol | Leaderless Log |
|---|---|
| System SPEC | `examples/s3-queue/SPEC.md` — 1,100+ lines covering the full design |
| Implementation | Rust CLI (`s3q`) — generated by a coding agent from the SPEC alone |
Try it yourself — tell your favorite coding agent:
Implement S3-Queue according to the following spec: https://github.com/lakestream-io/leaderless-log-protocol/blob/main/examples/s3-queue/SPEC.md
See examples/ for the full list of examples.
The Leaderless Log Protocol is one instance of a broader architectural pattern: coordination-delegated distributed systems. In this pattern, worker nodes carry no consensus logic — they are stateless processes that offload all coordination decisions (ordering, locking, leader election, failure detection) to an external linearizable store. The store becomes the single source of truth for shared state, while workers focus purely on data-path operations.
┌───────────┐   ┌───────────┐   ┌───────────┐
│ Worker A  │   │ Worker B  │   │ Worker C  │
│(stateless)│   │(stateless)│   │(stateless)│
└─────┬─────┘   └─────┬─────┘   └─────┬─────┘
      │               │               │
      ▼               ▼               ▼
┌──────────────────────────────────────────┐
│ Coordination Store (linearizable) │
│ CAS · AtomicIncrement · Ephemeral keys │
└──────────────────────────────────────────┘
Unlike embedded consensus (Raft, Paxos), where every node participates in the consensus protocol, coordination-delegated systems separate the coordination plane from the data plane entirely. See 0-coordination-delegated-pattern.md for the full treatment of the pattern, its primitives, and its limitations.
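The API surface of such a store is small. The following is a toy in-memory model of the three primitives in the diagram above — illustrative only; real deployments use a linearizable store (for example etcd, ZooKeeper, or S3 conditional writes as in the S3-Queue example), and method names here are hypothetical:

```python
class CoordinationStore:
    """Toy in-memory model of the coordination primitives. Not actually
    linearizable across processes -- it illustrates the API shape only."""

    def __init__(self):
        self.data = {}
        self.ephemeral = {}  # key -> owning session

    def compare_and_set(self, key, expected, new):
        """Atomically write `new` iff the current value equals `expected`."""
        if self.data.get(key) != expected:
            return False
        self.data[key] = new
        return True

    def atomic_increment(self, key, by=1):
        """Atomically bump a counter, returning the new value."""
        self.data[key] = self.data.get(key, 0) + by
        return self.data[key]

    def put_ephemeral(self, key, value, session):
        """Create a key tied to a session; expiry deletes it automatically."""
        if key in self.ephemeral:
            return False
        self.data[key] = value
        self.ephemeral[key] = session
        return True

    def expire_session(self, session):
        """Session expiry (e.g., worker crash): its ephemeral keys vanish."""
        for key in [k for k, s in self.ephemeral.items() if s == session]:
            del self.ephemeral[key]
            del self.data[key]
```

A log writer would call `atomic_increment` for offsets and `compare_and_set` to publish index entries; a task-claiming worker would hold its lock as an ephemeral key.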
This repo formalizes a family of coordination-delegated protocols. The Leaderless Log Protocol is the headliner; additional protocols share the same pattern and coordination primitives.
| Layer | Document | Description |
|---|---|---|
| 0 | 0-coordination-delegated-pattern.md | The foundational pattern: abstract coordination primitives, properties, limitations, and reference implementations |
| 1 | 1-leaderless-log-protocol.md | Leaderless Log — distributed append-only log with concurrent writers, compaction, and readers |
| 2 | 2-coordination-delegated-task-claiming.md | Task Claiming — distributed task claiming with ephemeral locks and crash recovery |
Protocols 1 and 2 are independent and composable. Each instantiates the Layer 0 pattern with a specific subset of coordination primitives.
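As a rough sketch of Protocol 2's claim/execute/crash cycle — the names and in-memory state below are hypothetical stand-ins for the store's ephemeral-lock primitives, not the spec's actual API:

```python
# Hypothetical sketch of Task Claiming (Protocol 2); real workers do this
# against the coordination store's ephemeral keys, not local dicts.
tasks = {"t1": "PENDING", "t2": "PENDING"}
locks = {}                    # task -> worker holding the ephemeral lock
failures = {"t1": 0, "t2": 0}
MAX_FAILURES = 2              # cf. the MaxFailures model parameter

def try_claim(worker, task):
    # TryLockTask: atomic create of an ephemeral lock key
    if task in locks or tasks[task] != "PENDING":
        return False
    locks[task] = worker
    return True

def execute(worker, task, succeed):
    assert locks.get(task) == worker      # MutualExclusion holds here
    if succeed:
        tasks[task] = "DONE"
    else:
        failures[task] += 1
        if failures[task] >= MAX_FAILURES:
            tasks[task] = "DLQ"           # dead-letter after max failures
    del locks[task]                       # UnlockTask

def crash(worker):
    # SessionExpiry: the crashed worker's ephemeral locks vanish,
    # leaving its tasks claimable again (CrashRecovery)
    for task in [t for t, w in locks.items() if w == worker]:
        del locks[task]

try_claim("w1", "t1")
crash("w1")                   # w1 dies mid-task; its lock is released
assert try_claim("w2", "t1")  # another worker can take over
```

The crash path is the interesting part: because the lock is ephemeral, no explicit failure detector or leader is needed to reassign the task.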
The real value of formally verified protocols: you can write a system SPEC that instantiates a protocol for a specific system, hand that SPEC to a coding agent, and get a correct implementation in one shot.
┌───────────────┐     ┌─────────────────────┐     ┌────────┐     ┌────────────────┐
│ Protocol SPEC │     │ System SPEC         │     │ Coding │     │ Working        │
│ (formally     │  →  │ (language-agnostic, │  →  │ Agent  │  →  │ Implementation │
│  verified)    │     │  self-contained)    │     │        │     │                │
└───────────────┘     └─────────────────────┘     └────────┘     └────────────────┘
The protocol SPEC guarantees correctness properties (safety, liveness) via model checking. The system SPEC translates those guarantees into a concrete design — domain model, state machines, coordination primitives, pseudocode, error handling, and a test matrix. Together, they give a coding agent everything it needs to produce a correct implementation without back-and-forth.
├── 0-coordination-delegated-pattern.md -- Layer 0: The pattern
├── 1-leaderless-log-protocol.md -- Layer 1: Canonical spec (Leaderless Log)
├── 2-coordination-delegated-task-claiming.md -- Layer 2: Canonical spec (Task Claiming)
├── README.md -- This file
├── tlaplus/
│ ├── LeaderlessLog.tla -- TLA+ implementation of Protocol 1
│ ├── LeaderlessLog.cfg -- Small TLC config (fast, ~10^5 states)
│ ├── LeaderlessLog-medium.cfg -- Medium TLC config (thorough, ~10^6-7 states)
│ ├── TaskClaiming.tla -- TLA+ implementation of Protocol 2
│ ├── TaskClaiming.cfg -- Small TLC config
│ └── TaskClaiming-medium.cfg -- Medium TLC config
├── fizzbee/
│ ├── LeaderlessLog.fizz -- Fizzbee implementation of Protocol 1
│ └── TaskClaiming.fizz -- Fizzbee implementation of Protocol 2
└── examples/
└── s3-queue/ -- Example: S3-Queue (spec + Rust impl)
The canonical SPEC docs (Markdown) define each protocol precisely. Both TLA+ and Fizzbee are independent implementations of the same spec. The spec doc is the source of truth.
- If TLA+ and Fizzbee disagree on a property verdict, the spec doc resolves which is wrong
- Every state variable, action, and property in the spec has a corresponding construct in both TLA+ and Fizzbee
Prerequisites: Java 17+ and the TLA+ tools (`tla2tools.jar`). Quick setup: `make tlaplus-install`
Leaderless Log Protocol:
# Small config — fast checking (~10^5 states)
cd tlaplus
java -jar tla2tools.jar -config LeaderlessLog.cfg LeaderlessLog.tla
# Medium config — thorough checking (~10^6-7 states)
java -jar tla2tools.jar -config LeaderlessLog-medium.cfg LeaderlessLog.tla

Task Claiming Protocol:
# Small config
cd tlaplus
java -jar tla2tools.jar -config TaskClaiming.cfg TaskClaiming.tla
# Medium config
java -jar tla2tools.jar -config TaskClaiming-medium.cfg TaskClaiming.tla

| Parameter | Small | Medium | Description |
|---|---|---|---|
| `Writers` | `{w1, w2}` | `{w1, w2, w3}` | Writer processes |
| `MaxOffset` | 5 | 8 | Offset upper bound |
| `CompactRangeStart` | 1 | 1 | Compaction range start |
| `CompactRangeEnd` | 3 | 5 | Compaction range end |
| `MaxBatch` | 2 | 3 | Max records per write batch |
The `MaxBatch` parameter controls multi-record entry modeling:

- `MaxBatch = 1`: Dense index (every offset has an entry) — equivalent to the original 1:1 model
- `MaxBatch > 1`: Sparse index — writes non-deterministically choose batch sizes in `1..MaxBatch`, producing entries that cover multiple offsets
> **Note:** With `MaxBatch > 1`, some write sequences produce entries that don't align with `CompactRangeEnd`. For example, with `MaxBatch = 2` and `CompactRangeEnd = 3`, writes of `[1,2]` and `[3,4]` produce entries at offsets 2 and 4 — offset 3 has no entry boundary. The `CompactStart` guard naturally handles this by only firing when all entries are fully contained in the range.
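The sparse-index lookup and the containment guard can be made concrete with a small sketch — function names here are illustrative (the spec's helpers are `HasCoveringEntry`/`CoveringEntryOffset`), and the index is the two-entry example from the note above:

```python
# Sparse index from writes [1,2] then [3,4] with MaxBatch = 2: each entry
# is keyed by its last offset and covers a contiguous offset range.
index = {2: (1, 2), 4: (3, 4)}   # last offset -> (start, end)

def covering_entry_offset(off):
    """Return the key of the entry covering `off`, or None."""
    for key, (start, end) in sorted(index.items()):
        if start <= off <= end:
            return key
    return None

def range_fully_covered(range_start, range_end):
    """CompactStart-style guard: every entry that intersects the range
    must be fully contained in it."""
    for start, end in index.values():
        intersects = start <= range_end and end >= range_start
        contained = range_start <= start and end <= range_end
        if intersects and not contained:
            return False
    return True

print(covering_entry_offset(3))   # 4 -- the (3, 4) entry covers offset 3
print(range_fully_covered(1, 3))  # False -- entry (3, 4) straddles the range end
print(range_fully_covered(1, 4))  # True
```

With the small config's `CompactRangeEnd = 3`, the guard simply stays disabled for this write sequence, so no unsound compaction is ever attempted.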
| Parameter | Small | Medium | Description |
|---|---|---|---|
| `Workers` | `{w1, w2, w3}` | `{w1, w2, w3, w4}` | Worker processes |
| `Tasks` | `{t1, t2}` | `{t1, t2, t3}` | Tasks |
| `MaxFailures` | 2 | 3 | Max failures before DLQ |
# Install Fizzbee CLI locally (one-time setup)
make fizzbee-install
# Run Fizzbee model checker on all specs
make fizzbee
# Or run individual specs directly
cd fizzbee && ~/.local/fizzbee/fizz LeaderlessLog.fizz
cd fizzbee && ~/.local/fizzbee/fizz TaskClaiming.fizz

You can also upload `.fizz` files to fizzbee.io for browser-based checking.
| Property | Type | TLA+ | Fizzbee | Notes |
|---|---|---|---|---|
| `TypeOK` | Invariant | PASS | — | TLA+ only |
| `MonotonicOffsets` | Safety | PASS | PASS | `SequenceCounterPositive` in TLA+ |
| `FencedRejectsAppends` | Safety | PASS | — | Uses `ENABLED`; omitted from Fizzbee |
| `CompactionPreservesData` | Safety | PASS | PASS | |
| `NoPhantomEntries` | Safety | PASS | PASS | |
| `CursorConsistency` | Safety | PASS | PASS | |
| `NoOverlappingRanges` | Safety | PASS | PASS | WAL-WAL only; WAL-COMPACTED overlap allowed |
| `NoReaderError` | Safety | PASS | PASS | |
| `SequentialCompactionSafety` | Safety | PASS | PASS | Compositionality: round 2 preserves round 1 |
| `AppendProgress` | Liveness | PASS | — | Requires fairness (TLA+ only) |
| `CompactionCompletes` | Liveness | PASS | — | Requires fairness (TLA+ only) |
| `ReaderEventuallySucceeds` | Liveness | PASS | — | Vacuously true; retained as regression guard |
All properties pass. The ReadEntry action correctly models that both WAL and COMPACTED entries are readable — compaction reorganizes data into compacted files but does not delete it. A reader implementation dispatches on entry file type to route reads to the appropriate storage backend.
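That dispatch can be sketched as follows — the entry shape, backend mapping, and sample data are made up for illustration, not taken from the `s3q` implementation:

```python
# Illustrative reader dispatch: the covering entry's file type routes the
# read to WAL storage or compacted-file storage.
def read(off, index, wal_backend, compacted_backend):
    entry = next((e for e in index.values()
                  if e["start"] <= off <= e["end"]), None)
    if entry is None:
        raise KeyError(f"no entry covers offset {off}")
    backend = wal_backend if entry["type"] == "WAL" else compacted_backend
    return backend[entry["loc"]][off - entry["start"]]

# Offsets 1-2 were compacted; offset 3 is still a WAL entry.
index = {
    2: {"start": 1, "end": 2, "type": "COMPACTED", "loc": "c0"},
    3: {"start": 3, "end": 3, "type": "WAL", "loc": "w0"},
}
wal_files = {"w0": ["rec3"]}
compacted_files = {"c0": ["rec1", "rec2"]}

print(read(1, index, wal_files, compacted_files))  # rec1 (from compacted file)
print(read(3, index, wal_files, compacted_files))  # rec3 (still in the WAL)
```

Either way the record is returned — mirroring the `NoReaderError` property: compaction changes where data lives, never whether it is readable.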
| Property | Type | Expected |
|---|---|---|
| `MutualExclusion` | Safety | PASS |
| `NoDoubleExecution` | Safety | PASS |
| `DLQOnlyAfterMaxFailures` | Safety | PASS |
| `LockConsistency` | Safety | PASS |
| `NoOrphanExecution` | Safety | PASS |
| `TaskCompletion` | Liveness | PASS |
| `CrashRecovery` | Liveness | PASS |
| `NoStarvation` | Liveness | PASS |
All properties should pass for Protocol 2.
| Spec Section | TLA+ Construct | Fizzbee Construct |
|---|---|---|
| State: `logIndex` | `VARIABLE logIndex` | `logIndex = {}` |
| State: `sequenceCounter` | `VARIABLE sequenceCounter` | `sequenceCounter = 1` |
| State: `logState` | `VARIABLE logState` | `logState = OPEN` |
| State: `compactionCursor` | `VARIABLE compactionCursor` | `compactionCursor = 1` |
| State: `writerState` | `VARIABLE writerState` | `writerState = {}` |
| State: `writerBatchSize` | `VARIABLE writerBatchSize` | `writerBatchSize = {}` |
| State: `compactorState` | `VARIABLE compactorState` | `compactorState = "IDLE"` |
| Action 1: StartAppend | `StartAppend(w)` | `action StartAppend` |
| Action 2: WALWrite | `WALWriteSuccess(w)` / `WALWriteFail(w)` | `action WALWrite` (either/or) |
| Action 3: AssignOffset | `AssignOffset(w)` / `AssignOffsetFenced(w)` | `action AssignOffset` |
| Action 4: AppendComplete | `AppendComplete(w)` | `action AppendComplete` |
| Actions 5-9: Compaction | `CompactStart` through `CompactReset` | `action CompactStart` through `action CompactReset` |
| Actions 10-11: Fencing | `FenceLog` / `UnfenceLog` | `action FenceLog` / `action UnfenceLog` |
| Action 12: ReadEntry | `ReadEntry(w, off)` | `action ReadEntry` |
| Helper: `HasCoveringEntry` | `HasCoveringEntry(off)` | `hasCoveringEntry(off)` |
| Helper: `CoveringEntryOffset` | `CoveringEntryOffset(off)` | `coveringEntryOffset(off)` |
| Safety S2 | `MonotonicOffsets` | `always assertion MonotonicOffsets` |
| Safety S3 | `FencedRejectsAppends` | `always assertion FencedRejectsAppends` |
| Safety S4 | `CompactionPreservesData` | `always assertion CompactionPreservesData` |
| Safety S7 | `NoOverlappingRanges` | `always assertion NoOverlappingRanges` |
| Safety S8 | `NoReaderError` | `always assertion NoReaderError` |
| Safety S9 | `SequentialCompactionSafety` | `always assertion SequentialCompactionSafety` |
| Liveness L3 | `ReaderEventuallySucceeds` | — (liveness requires fairness; TLA+ only) |
| Spec Section | TLA+ Construct | Fizzbee Construct |
|---|---|---|
| State: `taskStatus` | `VARIABLE taskStatus` | `taskStatus = {}` |
| State: `lockState` | `VARIABLE lockState` | `lockState = {}` |
| State: `workerAlive` | `VARIABLE workerAlive` | `workerAlive = {}` |
| State: `workerState` | `VARIABLE workerState` | `workerState = {}` |
| Action 1: ScanTasks | `ScanTasks(w, t)` | `action ScanTasks` |
| Action 2: TryLockTask | `TryLockSuccess(w)` / `TryLockFail(w)` | `action TryLockSuccess` / `action TryLockFail` |
| Action 3: ExecuteTask | `ExecuteTaskSuccess(w)` / `ExecuteTaskFailure(w)` | `action ExecuteTask` (either/or) |
| Action 4: UnlockTask | `UnlockTask(w)` | `action UnlockTask` |
| Actions 5-7: Crash/Recovery | `WorkerCrash` / `SessionExpiry` / `WorkerRecover` | Same action names |
| Safety S1 | `MutualExclusion` | `always assertion MutualExclusion` |
| Safety S2 | `NoDoubleExecution` | `always assertion NoDoubleExecution` |
| Safety S4 | `LockConsistency` | `always assertion LockConsistency` |
| Liveness L1 | `TaskCompletion` | `always eventually assertion TaskCompletion` |
| Liveness L2 | `CrashRecovery` | `always eventually assertion CrashRecovery` |
After running both TLA+ and Fizzbee on both protocols:
- Confirm all property verdicts agree between TLA+ and Fizzbee
- All safety properties must pass in both protocols and both tools
- All liveness properties must pass in both protocols and both tools
If results disagree, consult the canonical spec doc to determine which implementation has the bug.
This project is licensed under the Apache License, Version 2.0 — see the LICENSE file for the full text.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.