v0.8.0 — Key and CA rotation

@incognick released this 17 Jun 03:01

The eighth Hamster release: key and CA rotation. Rotate the master key that protects every object, and rotate the cluster CA that authenticates every node, both with no downtime and without ever putting the master key or the CA private key on the wire or in the Raft log.

This is a dev preview; read the limits below. Rotation works and is verified end to end, but the v0.x limits hold: rotation is a cluster feature (the single-node serve preview has neither a cluster KEK nor a cluster CA), writes still commit only on the Raft leader, multipart and server-side copy are still not on the cluster path, and on-disk/on-wire formats may change between v0 releases. Hamster is not assessed or certified for any regulation (see "What this is not"). Please don't trust real regulated data to it yet.

What's in v0.8

Two independent rotations, each built so that the old key or CA is provably and observably out of use before it is dropped.

Master-key rotation (cluster rotate-key)

  • Rewrap, never re-encrypt. Each object's data key (DEK) is wrapped under the cluster master key (KEK). Rotation rewraps every DEK from the old KEK to a new one — metadata only. The object bytes and erasure-coded shards never move, so a rotation is cheap and never touches the data path. It is COMPLIANCE-safe: an object-locked version is rewrapped with its lock and bytes intact.
  • Observable and provably complete. Each version records the fingerprint of the key that wrapped it, and cluster status shows the count of objects still on the old key. The rotation closes only when that count reaches zero, so retiring the old key is proven safe rather than assumed. A node that loads the wrong master key is refused at write time (the split-key guard).
  • The key never travels the wire. Provision the next key on every node (cluster run -new-master-key-file), then cluster rotate-key drives the rewrap to convergence on the leader. One rotation at a time; never more than two keys held.
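
The rewrap loop above can be sketched in Go. This is a minimal illustration under stated assumptions, not Hamster's implementation: the release notes do not say which wrapping algorithm or fingerprint format is used, so AES-GCM wrapping and a truncated SHA-256 fingerprint are stand-ins, and every name here (Version, Wrap, Unwrap, RewrapAll, Remaining) is hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Version is a hypothetical metadata record for one object version.
type Version struct {
	WrappedDEK []byte // nonce || AES-GCM ciphertext of the DEK
	KEKFP      string // fingerprint of the KEK that wrapped it
}

// Fingerprint names a KEK without revealing it (truncated SHA-256: an assumption).
func Fingerprint(kek []byte) string {
	sum := sha256.Sum256(kek)
	return hex.EncodeToString(sum[:8])
}

func aeadFor(kek []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(kek) // kek must be 16, 24, or 32 bytes
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Wrap seals a DEK under a KEK and records the KEK's fingerprint.
func Wrap(kek, dek []byte) (Version, error) {
	aead, err := aeadFor(kek)
	if err != nil {
		return Version{}, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return Version{}, err
	}
	return Version{WrappedDEK: aead.Seal(nonce, nonce, dek, nil), KEKFP: Fingerprint(kek)}, nil
}

// Unwrap recovers the DEK and refuses a KEK whose fingerprint does not match.
func Unwrap(kek []byte, v Version) ([]byte, error) {
	if v.KEKFP != Fingerprint(kek) {
		return nil, fmt.Errorf("version was wrapped under a different KEK")
	}
	aead, err := aeadFor(kek)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(v.WrappedDEK) < n {
		return nil, fmt.Errorf("wrapped DEK too short")
	}
	return aead.Open(nil, v.WrappedDEK[:n], v.WrappedDEK[n:], nil)
}

// Remaining counts versions still wrapped under the KEK with fingerprint fp.
func Remaining(fp string, versions []Version) int {
	n := 0
	for _, v := range versions {
		if v.KEKFP == fp {
			n++
		}
	}
	return n
}

// RewrapAll moves every version still on oldKEK to newKEK, skipping versions
// already rewrapped, and returns how many remain on the old key.
func RewrapAll(oldKEK, newKEK []byte, versions []Version) (int, error) {
	oldFP := Fingerprint(oldKEK)
	for i, v := range versions {
		if v.KEKFP != oldFP {
			continue // already on the new key: a re-run resumes here
		}
		dek, err := Unwrap(oldKEK, v)
		if err != nil {
			return Remaining(oldFP, versions), err
		}
		nv, err := Wrap(newKEK, dek)
		if err != nil {
			return Remaining(oldFP, versions), err
		}
		versions[i] = nv // metadata only: object bytes are never touched
	}
	return Remaining(oldFP, versions), nil
}

func main() {
	oldKEK, newKEK := make([]byte, 32), make([]byte, 32)
	newKEK[0] = 1 // fixed demo keys; real KEKs must be random
	v, _ := Wrap(oldKEK, []byte("example data key"))
	fmt.Println(RewrapAll(oldKEK, newKEK, []Version{v})) // remaining count, error
}
```

In this sketch the rotation is finished only when RewrapAll reports zero versions on the old fingerprint, which mirrors the count that cluster status is described as showing. Because already-rewrapped versions are skipped, running it again after an interruption picks up where it left off.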

CA rotation (cluster rotate-ca)

  • A dual-trust rollover, no downtime, no trust gap. Mint a new CA, trust it alongside the old one across the whole cluster, reissue every node's certificate onto the new CA, then drop the old — the same dual-trust rollover etcd, CockroachDB, and Vault use. The transport reads each node's current certificate and trust set per handshake, so reissuing a node and widening trust take effect with no restart.
  • The CA private key never enters the Raft log. Only CA certificates (public) are replicated, in a generational trust bundle. The new CA key lives only on the node driving the rotation; each node's new certificate is delivered over the existing mutual-TLS channel, the same as a join.
  • Observable and provably complete, like the key rotation: each member records which CA signed its current certificate, and the old CA is dropped only once every member is on the new one. cluster status shows the trust generation and the count still on the old CA. A leadership blip during the rollover is ridden out, and a re-run converges from any partial state.
  • Planned rotation and lost-CA-key recovery are one flow — rotation never needs the old CA key, so recovering from a lost key is just a rotation.
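
The dual-trust stages above can be illustrated with Go's standard crypto/x509 package. This is a sketch under assumptions, not Hamster's code: the Issuer type and the NewCA, IssueNode, and Trusted helpers are hypothetical names, the certificates are short-lived throwaways, and the node's private key is discarded because only chain verification is being shown.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// Issuer is a hypothetical in-memory CA: its public certificate and private key.
type Issuer struct {
	Cert *x509.Certificate
	Key  *ecdsa.PrivateKey
}

// sign creates a certificate from tmpl for pub, signed by parent with parentKey.
func sign(tmpl, parent *x509.Certificate, pub *ecdsa.PublicKey, parentKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	tmpl.NotBefore = time.Now().Add(-time.Minute)
	tmpl.NotAfter = time.Now().Add(time.Hour)
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, parentKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// NewCA mints a self-signed CA certificate and key.
func NewCA(name string) (Issuer, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return Issuer{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: name},
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	cert, err := sign(tmpl, tmpl, &key.PublicKey, key)
	return Issuer{Cert: cert, Key: key}, err
}

// IssueNode signs a node certificate under ca. Only ca's own key is needed.
func IssueNode(ca Issuer, node string, serial int64) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(serial),
		Subject:      pkix.Name{CommonName: node},
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
	}
	return sign(tmpl, ca.Cert, &key.PublicKey, ca.Key)
}

// Trusted reports whether cert chains to any CA in the given trust set.
func Trusted(cert *x509.Certificate, cas ...*x509.Certificate) bool {
	pool := x509.NewCertPool()
	for _, c := range cas {
		pool.AddCert(c)
	}
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err == nil
}

func main() {
	oldCA, _ := NewCA("old-ca")
	newCA, _ := NewCA("new-ca")
	before, _ := IssueNode(oldCA, "node-1", 2)
	after, _ := IssueNode(newCA, "node-1", 3) // reissue uses only the new CA key

	// Dual trust: certificates from either CA verify, so there is no trust gap.
	fmt.Println(Trusted(before, oldCA.Cert, newCA.Cert), Trusted(after, oldCA.Cert, newCA.Cert))
	// Old CA dropped: only the reissued certificate still verifies.
	fmt.Println(Trusted(before, newCA.Cert), Trusted(after, newCA.Cert))
}
```

The sketch shows why the old CA can be dropped only after every member has been reissued: a certificate still signed by the old CA stops verifying the moment trust narrows to the new CA alone. For the no-restart behaviour, Go's crypto/tls offers per-handshake hooks such as GetCertificate and GetConfigForClient on tls.Config; whether Hamster's transport uses those particular hooks is not stated in these notes.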

How it's verified

  • Under the deterministic simulation harness (the durability bar every metadata-path change clears): a master-key rotation rewraps every object and they decrypt under the new key alone with the old one retired; a partial rotation resumes — skipping the already-rewrapped, finishing the rest, then closing; a COMPLIANCE-locked version is rewrapped lock- and byte-intact.
  • Real-process cluster tests rotate both the key and the CA over loopback mutual TLS: the trust bundle advances, every member converges, and a leader survives the rollover.
  • End-to-end over the real binary. For the key: write objects, rotate, then read every one from a node restarted with the new key as its only master key. For the CA: rotate, keep serving, then restart a node that now trusts only the new CA — it rejoins and serves every object, proving the rollover reached it. Both run with encryption on, exercising the data path across the rotation.
  • The aws CLI, rclone, restic, s3cmd, the race detector, and the simulation harness all keep passing.

What this is not

  • It is a cluster feature. The single-node serve preview has no cluster KEK and no cluster CA, so neither rotation applies there.
  • The external-PKI issuer is not built yet. The self-managed CA is implemented; pointing issuance at an operator's PKI (Vault, an offline/HSM root, a corporate CA) is the additive next step, the CA analogue of the master key's pluggable source. After a CA rotation, the rotating node becomes the issuer for future joins. Reissuance also delivers a node's new key over mutual TLS, matching how a join already works; a request flow that keeps the key on the node is a future refinement for both.
  • Not assessed or certified. Hamster rotates keys and trust cleanly, but it has not been assessed for any regulation by anyone, and it is v0, not production-ready, with formats that may still change. Key and trust hygiene are necessary for many regimes but nowhere near sufficient on their own.

Binaries below are static (CGO_ENABLED=0), version-stamped (hamster version), with SHA-256 checksums in SHA256SUMS. Next up, v0.9: upgrade machinery — feature gates, the health interlock, and the end-to-end upgrade test suite that earns zero-downtime rolling upgrades.
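
For the checksums mentioned above, a downloaded binary can be checked against SHA256SUMS with a few lines of Go. This sketch assumes the file uses the common sha256sum layout (a hex digest, whitespace, then the file name on each line), which the notes do not confirm; ExpectedSum, Verify, and the file name in main are hypothetical.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// ExpectedSum finds the digest listed for name in SHA256SUMS-style text.
// Assumed layout per line: "<hex digest>  <filename>" (optionally "*filename").
func ExpectedSum(sums, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(sums))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// Verify reports whether data hashes to the digest listed for name.
func Verify(sums, name string, data []byte) bool {
	want, ok := ExpectedSum(sums, name)
	if !ok {
		return false // a file with no listed digest is never accepted
	}
	got := sha256.Sum256(data)
	return hex.EncodeToString(got[:]) == want
}

func main() {
	// Stand-in bytes; in practice read the downloaded binary and SHA256SUMS from disk.
	data := []byte("pretend this is a release binary")
	sum := sha256.Sum256(data)
	sums := hex.EncodeToString(sum[:]) + "  hamster-example\n"

	fmt.Println("intact:", Verify(sums, "hamster-example", data))
	fmt.Println("tampered:", Verify(sums, "hamster-example", append(data, 'x')))
}
```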