The third Hamster release: erasure coding and self-healing repair — and the S3 endpoint joins the cluster.
Dev preview. Objects now live erasure-coded across a real cluster, but the S3 write path commits only on the Raft leader (clients retry elsewhere), multipart and server-side copy are not on the cluster path yet, the data-plane membership is effectively fixed once data exists (rebalance lands in v0.4), and on-disk/on-wire formats may still change between v0 releases. Please don't trust real data to it — but the cluster preview now stores objects, not just metadata.
What's in v0.3
- Erasure-coded objects, end to end. A PUT erasure-codes the body into `k+m` self-describing shards spread across `k+m` distinct nodes (no two shards of one object ever share a node); a GET reconstructs from any `k`. Storage profiles follow an auto ladder as the cluster grows, with a small-object `k=1` rule. Object data never passes through the Raft log; only the small metadata commit does, which is the design's first invariant.
- The S3 endpoint joins the cluster. `hamster cluster run -s3 <addr>` puts the full S3 API on every node: reads from the local replica, mutations as Raft proposals, objects through the erasure-coded data path. (Leader-only writes for now: a non-leader answers `503 SlowDown` and clients retry elsewhere; multipart and server-side copy are refused on this path until their erasure-coded design lands.)
- The write-ack rule, mechanically. All `k+m` shards durable on the healthy path; a degraded write acks at a hard floor of `k+1` and refuses below it with `SlowDown`. The metadata commit is the linearization point and happens only after the ack rule is met, so an object's durability budget is honest at the moment it is acknowledged.
- Self-healing repair. A repair sweep scrubs every shard against its replicated checksum and rebuilds missing or bit-rotted shards from any `k` verified survivors. Corruption is found by hashing and never laundered into a rebuild; a rotted shard is healed without anyone having to read the object first.
- Bounded memory, random-access reads. Windowed shard transfer keeps memory bounded per in-flight transfer regardless of object size, and a ranged GET moves only the shard slices it actually covers.
How it's verified
- The deterministic simulation harness drives the whole data path — placement, shard transfer, the ack rule, repair — through seeded cluster schedules: crashed receivers, down nodes, floor refusals, mid-PUT coordinator loss, degraded reads through crashed holders, an emptied node healed, two-shard bitrot rebuilt without a read, beyond-tolerance loss reported honestly, and crash-mid-sweep convergence. Durability is checked by decoding shard files off the surviving disks, not by trusting an ack.
- A six-node end-to-end suite runs real processes over loopback mTLS, stores objects `4+2`, and kills nodes mid-workload: reads reconstruct around the loss, writes ack at the floor, and once the cluster is below the floor, writes refuse honestly while reads keep serving at exactly `k`.
- The race detector and the v0.1 compatibility suite (`aws` CLI, rclone, restic, s3cmd) keep passing.
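
The six-node `4+2` scenario can be traced through the write-ack rule with a small Go sketch. This is a hedged illustration rather than the project's implementation: `decideAck`, `canRead`, and the `outcome` labels are hypothetical names, and it assumes one shard per node, so the number of live nodes bounds both the durable shards of a new write and the shards reachable by a read.

```go
package main

import "fmt"

// outcome is a hypothetical label for what the coordinator tells the client.
type outcome string

const (
	ackFull     outcome = "ack (all k+m durable)"
	ackDegraded outcome = "ack (degraded, at or above the k+1 floor)"
	refuse      outcome = "refuse (SlowDown)"
)

// decideAck applies the write-ack rule to the number of durable shards:
// the healthy path needs all k+m, a degraded write needs at least k+1,
// and anything below that floor is refused.
func decideAck(k, m, durable int) outcome {
	switch {
	case durable >= k+m:
		return ackFull
	case durable >= k+1:
		return ackDegraded
	default:
		return refuse
	}
}

// canRead reflects that a GET reconstructs from any k shards.
func canRead(k, reachable int) bool {
	return reachable >= k
}

func main() {
	const k, m = 4, 2
	// Assumption: one shard per node, so live nodes bound usable shards.
	for live := k + m; live >= k-1; live-- {
		fmt.Printf("live=%d write=%q read=%v\n",
			live, decideAck(k, m, live), canRead(k, live))
	}
}
```

Under these assumptions, six live nodes ack on the healthy path, five ack at the floor, four refuse writes while reads still succeed at exactly `k`, and three can serve neither.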
Binaries below are static (`CGO_ENABLED=0`), version-stamped (`hamster version`), with SHA-256 checksums in `SHA256SUMS`.

Next up, v0.4: partitioned placement made real. That means a stored, versioned cluster layout with zone-aware spread, capacity weighting, and rebalance, so a cluster can finally grow its data-plane membership after data exists.