Problem
docs/INSTALLATION.md lists NATS + Consul as prerequisites:
- NATS server running
- Consul server running
But the README clearly states the production path is the P2P ZAP consensus mode — 'No external dependencies.' A new operator following the installation guide would set up the deprecated legacy transport.
Additionally there is no operator runbook covering the day-to-day operations a production cluster needs:
- Key rotation procedure (add a share, reshare, retire old share)
- Health endpoint spec (what should a liveness probe hit, what should a readiness probe assert)
- Backup / restore from shares
- Node add / remove without resharing
- Deployment-validation smoke test (
keygen → sign → verify)
Proposed changes
- INSTALLATION.md rewrite — lead with consensus-mode deployment (docker compose, k8s StatefulSet with
--peer flags). Move NATS/Consul to a 'Legacy transport' appendix with a deprecation banner.
- docs/RUNBOOK.md (new) — operator checklist covering the items above. We'll land a stub with sections + TODOs; maintainers fill in the protocol-specific steps.
- docs/HEALTH.md (new) — health endpoint spec: what each node exposes, what a K8s probe should check, recommended intervals.
- scripts/smoke-test.sh — open a session, keygen a throwaway key, sign a nonce, verify the signature. Exit 0 if healthy, 1 otherwise. Callable from a K8s readiness init-container.
Context
This came up while reviewing the luxfi/mpc deployment shape used by downstream integrators running 5-node CGGMP21 clusters in Kubernetes. The gap between the README (consensus-first) and the installation guide (legacy-first) is the #1 onboarding friction.
Happy to contribute the PR — LMK if there's a style / location preference for the new docs.
Problem
docs/INSTALLATION.mdlists NATS + Consul as prerequisites:But the README clearly states the production path is the P2P ZAP consensus mode — 'No external dependencies.' A new operator following the installation guide would set up the deprecated legacy transport.
Additionally there is no operator runbook covering the day-to-day operations a production cluster needs:
keygen → sign → verify)Proposed changes
--peerflags). Move NATS/Consul to a 'Legacy transport' appendix with a deprecation banner.Context
This came up while reviewing the
luxfi/mpcdeployment shape used by downstream integrators running 5-node CGGMP21 clusters in Kubernetes. The gap between the README (consensus-first) and the installation guide (legacy-first) is the #1 onboarding friction.Happy to contribute the PR — LMK if there's a style / location preference for the new docs.