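An open-source reference that demonstrates the design principles of distributed object storage through live demos. Go 1.26 · Apache 2.0 · 59 ADRs · 55 blog episodes · 48 live demos · 190 unit tests

kvfs implements the core principles of distributed object storage in small, understandable code. Each principle is validated by an ADR (design decision), a blog episode, and a live demo.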

| Season | Track | Status | Highlights |
|---|---|---|---|
| 1 | MVP | ✅ closed | 2-daemon · 3-way replication · UrlKey · CA chunk |
| 2 | Distributed algorithms | ✅ closed | HRW placement · rebalance · GC · chunking · EC |
| 3 | Operability | ✅ closed | auto-trigger · EC repair · meta backup · heartbeat · multi-edge HA |
| 4 | Performance and efficiency | ✅ closed | streaming · CDC · WAL · election · sync repl · transactional Raft · micro-opts |
| 5 | coord separation | ✅ closed (Ep.1~7) | ADR-015·038~042. skeleton → edge meta client → HA → WAL sync → txn commit → placement RPC → cli inspect |
| 6 | coord operational migration | ✅ Ep.1~7 done | ADR-043~049. rebalance · GC · repair · DN registry · URLKey rotation/propagation, all on coord |
| 7 | textbook primitives | ✅ closed (Ep.1~4) | ADR-051~054. failure domain · degraded read · tunable consistency · anti-entropy/Merkle |
| P8 | self-heal polish | ✅ P8-16 done | ADR-050·055~063. chaos hardening · 4-channel self-heal · continuous repair · Prometheus observability |

This is the simplified core of what Ceph, MinIO, and S3 do. The goal is an understandable reference, not a production system.

    git clone https://github.com/HardcoreMonk/kvfs
    cd kvfs
    # 1. Start the cluster (edge × 1 + dn × 3)
    ./scripts/up.sh
    # 2. α: 3-way replication durability (GET still succeeds with DN-1 killed)
    ./scripts/demo-alpha.sh
    # 3. ε: UrlKey presigned URL (401 after expiry)
    ./scripts/demo-epsilon.sh

All 48 demos pass live (`scripts/demo-*.sh`). For the per-season mapping, see the ADR tables below. Greek letters (α to ω, 21 demos) cover S1 to S4, Hebrew letters (aleph to nun, 14 demos) cover S5 to S6, and the remainder are the S7 and P8 anti-entropy specials.

                                     ┌─ kvfs-coord (× N) ─────┐  (optional, S5 onward)
                                     │ placement / meta owner │
                                     │ Raft + WAL replication │
    Client ─HTTP+UrlKey─► kvfs-edge ─┴─ HTTP REST ──► kvfs-dn (× N)
                            │                           │
                            ├─ thin gateway             └─ chunks/<sha256[0:2]>/<rest>
                            ├─ chunker / EC encoder
                            ├─ urlkey verify (multi-kid)
                            └─ (coord-proxy mode OR inline coordinator)
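
The `chunks/<sha256[0:2]>/<rest>` layout in the diagram can be sketched as follows. This is a minimal illustration rather than the project's code: `chunkPath` is a hypothetical helper, and it assumes that `[0:2]` refers to the first two hex characters of the digest.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// chunkPath maps chunk bytes to a content-addressed relative path:
// chunks/<first 2 hex chars of sha256>/<remaining 62 hex chars>.
// Assumption: the 2-character prefix is taken from the hex digest.
func chunkPath(data []byte) string {
	sum := sha256.Sum256(data)
	id := hex.EncodeToString(sum[:])
	return path.Join("chunks", id[:2], id[2:])
}

func main() {
	// Identical content always yields the same path, so duplicate
	// chunks collapse into one file on the DN.
	fmt.Println(chunkPath([]byte("hello")))
}
```

The two-character prefix fans chunks out over at most 256 subdirectories, which keeps any single directory from growing too large.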

- 2-daemon mode (compatible with S1 to S4): the edge runs the coordinator inline, so placement, rebalance, GC, and repair all happen inside the edge.
- 3-daemon mode (S5 onward): when `EDGE_COORD_URL` is set, the edge acts as a thin gateway and coord owns placement, metadata, and consistency. coord can provide HA and zero RPO through Raft (`COORD_PEERS`), WAL replication (`COORD_WAL_PATH`), and transactional commit (`COORD_TRANSACTIONAL_RAFT`). The CLI administers coord directly (`--coord URL`).

- First read: `docs/GUIDE.md` (or `docs/guide.html` for the browser), a 13-chapter walkthrough
- Short reference: `docs/ARCHITECTURE.md`
- Agent working conventions: `AGENTS.md`, written for Codex; `CLAUDE.md` is a compatibility shim

Each decision is recorded as a standalone document under `docs/adr/` and kept as an immutable record.

| ADR | Topic | Blog |
|---|---|---|
| 001 | Independent project identity (clean-slate) | 01 |
| 002 | 2-daemon MVP (edge + dn × 3) | - |
| 003 | HTTP REST transport (curl- and tcpdump-friendly) | - |
| 004 | bbolt metadata (pure Go, 1 external dependency) | - |
| 005 | sha256 content-addressable | - |
| 006 | 1 object = 1 chunk (superseded by 011) | - |
| 007 | UrlKey HMAC-SHA256 | - |
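
As a rough illustration of the idea behind ADR-007, a presigned URL can carry an expiry timestamp and an HMAC-SHA256 tag that the gateway recomputes on each request. The sketch below is not kvfs's actual format: the canonical string, the function names, and the example path are all assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"time"
)

// sign returns a hex HMAC-SHA256 tag over "METHOD\nPATH\nEXPIRES".
// This canonical string is an assumption made for the sketch.
func sign(secret []byte, method, path string, expires int64) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(method + "\n" + path + "\n" + strconv.FormatInt(expires, 10)))
	return hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether sig matches and the URL has not expired at nowUnix.
func verify(secret []byte, method, path string, expires int64, sig string, nowUnix int64) bool {
	if nowUnix > expires {
		return false // expired; a gateway would typically answer 401
	}
	want := sign(secret, method, path, expires)
	return hmac.Equal([]byte(want), []byte(sig)) // constant-time comparison
}

func main() {
	secret := []byte("demo-secret")  // hypothetical secret
	const objPath = "/objects/photo" // hypothetical path
	exp := time.Now().Add(time.Minute).Unix()
	sig := sign(secret, "GET", objPath, exp)

	fmt.Println(verify(secret, "GET", objPath, exp, sig, time.Now().Unix())) // within the window
	fmt.Println(verify(secret, "GET", objPath, exp, sig, exp+1))            // after expiry
}
```

Binding the method and path into the signed string means a URL issued for one object cannot be replayed against another.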

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 009 | Rendezvous Hashing (HRW) | ζ | 02 |
| 010 | Rebalance worker (copy-then-update) | η | 03 |
| 012 | Surplus chunk GC (claimed-set + min-age) | θ | 04 |
| 011 | Chunking (supersedes ADR-006) | ι | 05 |
| 008 | Reed-Solomon EC (GF(2^8) from-scratch) | κ | 06 |
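
Rendezvous (HRW) hashing, the subject of ADR-009, scores every node against a chunk ID and keeps the highest-scoring ones. The sketch below shows the general technique under assumed names (`score`, `place`) and an assumed hash construction; it is not taken from the kvfs source.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// score derives a pseudo-random weight for a (chunk, node) pair.
// Using sha256 with a "|" separator is an assumption of this sketch.
func score(chunkID, node string) uint64 {
	sum := sha256.Sum256([]byte(chunkID + "|" + node))
	return binary.BigEndian.Uint64(sum[:8])
}

// place returns the r nodes with the highest score for chunkID.
func place(chunkID string, nodes []string, r int) []string {
	ranked := append([]string(nil), nodes...) // do not mutate the caller's slice
	sort.Slice(ranked, func(i, j int) bool {
		si, sj := score(chunkID, ranked[i]), score(chunkID, ranked[j])
		if si != sj {
			return si > sj
		}
		return ranked[i] < ranked[j] // deterministic tie-break
	})
	if r > len(ranked) {
		r = len(ranked)
	}
	return ranked[:r]
}

func main() {
	nodes := []string{"dn1", "dn2", "dn3", "dn4"} // hypothetical DN IDs
	fmt.Println(place("chunk-42", nodes, 3))
}
```

Because each node's score is independent of the others, removing a node that was not chosen for a chunk leaves that chunk's placement untouched, which is what keeps rebalance traffic small.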

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 013 | Auto-trigger (in-edge ticker) | λ | 07 |
| 024 | EC stripe rebalance (set-based minimal movement) | μ | 08 |
| 025 | EC repair queue (K survivors → reconstruct) | ν | 09 |
| 014 | Meta backup (snapshot + offline restore) | ξ | 10 |
| 030 | DN heartbeat (in-edge pull) | ο | 11 |
| 016 | Auto-snapshot scheduler (ticker + retention) | π | 12 |
| 022 | Multi-edge HA (read-replica + atomic.Pointer hot-swap) | ρ | 13 |
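
The DN heartbeat (ADR-030) marks a node unhealthy after a number of consecutive probe failures (`EDGE_HEARTBEAT_FAIL_THRESHOLD`, default 3). A minimal sketch of that counting rule follows. The `health` type is hypothetical, and the choice to restore a node to healthy after a single successful probe is an assumption rather than documented behavior.

```go
package main

import "fmt"

// health tracks consecutive probe failures for one DN.
type health struct {
	threshold int // consecutive failures that flip the DN to unhealthy
	fails     int
}

// observe records one probe result and reports whether the DN is
// still considered healthy afterwards.
func (h *health) observe(ok bool) bool {
	if ok {
		h.fails = 0 // assumption: one success resets the counter
	} else {
		h.fails++
	}
	return h.fails < h.threshold
}

func main() {
	h := &health{threshold: 3}
	for _, ok := range []bool{true, false, false, false, true} {
		fmt.Println(h.observe(ok)) // prints true, true, true, false, true
	}
}
```

Requiring consecutive failures avoids flapping on a single dropped probe.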

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 017 | Streaming PUT/GET (io.Reader 기반) | σ | 14 |
| 018 | Content-defined chunking (FastCDC, opt-in) | τ | 15 |
| 031 | Auto leader election (Raft-style, multi-edge HA) | υ | 16 |
| 019 | WAL of metadata mutations (audit + replay) | φ | 17 |

EC streaming = Ep.6 follow-up (demo-χ). EC+CDC = Ep.7 follow-up (demo-ψ). Sync WAL push = Ep.8 follow-up (demo-ω).

| ADR | Topic |
|---|---|
| 027 | Dynamic DN registry (admin endpoint + bbolt persistence) |
| 028 | UrlKey kid rotation (multi-key Signer) |
| 029 | Optional TLS / mTLS (env-driven opt-in) |
| 032 | NFS gateway: deferred (scope evaluation) |
| 033 | Strict replication (informational) |
| 034 | Transactional Raft (replicate-then-commit) |
| 035 | WAL group commit · 3-region CDC · chunker pool |
| 036 | WAL group commit observability gauges |
| 037 | Chunker scratch-pool soft cap |

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 015 | Coordinator daemon separation (supersedes ADR-002) | aleph (Ep.1) | 29 |
| - | edge → coord meta client (`EDGE_COORD_URL`) | bet (Ep.2) | 30 |
| 038 | Coord HA via Raft (ADR-031 reuse) | gimel (Ep.3) | 31 |
| 039 | Coord-to-coord WAL replication | dalet (Ep.4) | 32 |
| 040 | Coord transactional commit (ADR-034 port) | he (Ep.5) | 33 |
| 041 | Edge → coord placement RPC (single source of truth) | vav (Ep.6) | 34 |
| 042 | kvfs-cli direct coord admin (read-only inspect) | zayin (Ep.7) | 35 |

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 043 | Rebalance plan on coord | chet (Ep.1) | 36 |
| 044 | Rebalance apply on coord (`COORD_DN_IO`) | tet (Ep.2) | 37 |
| 045 | GC plan + apply on coord | yod (Ep.3) | 38 |
| 046 | EC repair on coord | kaf (Ep.4) | 39 |
| 047 | DN registry mutation on coord | lamed (Ep.5) | 40 |
| 048 | URLKey kid registry on coord | mem (Ep.6) | 41 |
| 049 | Edge urlkey.Signer polling propagation | nun (Ep.7) | 42 |

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 051 | Failure domain hierarchy | samekh (Ep.1) | 43 |
| 052 | Degraded read | ayin (Ep.2) | 44 |
| 053 | Tunable consistency | pe (Ep.3) | 45 |
| 054 | Anti-entropy / Merkle + scrubber | tsadi (Ep.4) | 46 |
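
Merkle-based anti-entropy (ADR-054) lets two parties compare their chunk inventories by exchanging a single root hash before looking at details. The sketch below shows the general technique only. `merkleRoot`, the leaf encoding, and the odd-node handling are assumptions, and a fuller implementation would descend into differing subtrees instead of stopping at the root.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// merkleRoot hashes a set of chunk IDs into one root. IDs are sorted
// first so that two holders of the same set agree regardless of order.
func merkleRoot(ids []string) [32]byte {
	sorted := append([]string(nil), ids...)
	sort.Strings(sorted)
	level := make([][32]byte, 0, len(sorted))
	for _, id := range sorted {
		level = append(level, sha256.Sum256([]byte(id)))
	}
	if len(level) == 0 {
		return sha256.Sum256(nil) // root of an empty inventory
	}
	for len(level) > 1 {
		next := make([][32]byte, 0, (len(level)+1)/2)
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd node is promoted unchanged
				continue
			}
			pair := append(append([]byte{}, level[i][:]...), level[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		level = next
	}
	return level[0]
}

func main() {
	expected := []string{"c1", "c2", "c3"} // what the metadata says a DN should hold
	sameSet := []string{"c3", "c1", "c2"}  // same inventory, different order
	missing := []string{"c1", "c3"}        // one chunk lost

	fmt.Println(merkleRoot(expected) == merkleRoot(sameSet)) // roots match
	fmt.Println(merkleRoot(expected) == merkleRoot(missing)) // roots differ
}
```

When the roots match, nothing else needs to be transferred, which is what makes a periodic audit cheap in the common case.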

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 050 | Raft stale-log protection + coord bootstrap | chaos-mixed | — |
| 055 | Anti-entropy auto-repair + scheduled audit | anti-entropy-repair | 47 |
| 056 | Corrupt repair + dry-run | anti-entropy-repair-corrupt | 48 |
| 057 | EC inline repair | anti-entropy-repair-ec | 49 |
| 058 | EC corrupt repair | anti-entropy-repair-ec-corrupt | 50 |
| 059 | Repair throttle + precision | anti-entropy-throttle | 51 |
| 060 | Concurrent EC repair | anti-entropy-concurrent | 52 |
| 061 | Resilience polishes | anti-entropy-resilience | 53 |
| 062 | Auto-repair scheduling + coord metrics | anti-entropy-auto-metrics | 54 |
| 063 | Anti-entropy observability completions | anti-entropy-observability | 55 |

| Env | Default | ADR | Purpose |
|---|---|---|---|
| `EDGE_ADDR` | `:8000` | 002 | HTTP bind |
| `EDGE_DNS` | required | 002 | comma-sep DN addrs |
| `EDGE_DNS_RESET` | 0 | 027 | re-seed bbolt `dns_runtime` from `EDGE_DNS` |
| `EDGE_DATA_DIR` | `./edge-data` | 004 | bbolt directory |
| `EDGE_URLKEY_SECRET` | required | 007 | HMAC secret |
| `EDGE_URLKEY_PRIMARY_KID` | (off) | 028 | primary URLKey kid override |
| `EDGE_QUORUM_WRITE` | auto | 002 | write quorum (0=auto) |
| `EDGE_CHUNK_SIZE` | 4 MiB | 011 | bytes per chunk (fixed mode) |
| `EDGE_CHUNK_MODE` | `fixed` | 018 | `fixed` or `cdc` (FastCDC, replication only) |
| `EDGE_AUTO` | 0 | 013 | auto rebalance/GC opt-in |
| `EDGE_AUTO_REBALANCE_INTERVAL` | 5m | 013 | - |
| `EDGE_AUTO_GC_INTERVAL` | 15m | 013 | - |
| `EDGE_AUTO_GC_MIN_AGE` | 60s | 013 | GC safety window |
| `EDGE_AUTO_CONCURRENCY` | 4 | 013 | auto worker concurrency |
| `EDGE_HEARTBEAT_INTERVAL` | 10s | 030 | DN probe interval (0s = off) |
| `EDGE_HEARTBEAT_FAIL_THRESHOLD` | 3 | 030 | consecutive failures → unhealthy |
| `EDGE_SNAPSHOT_DIR` | (off) | 016 | auto-snapshot directory |
| `EDGE_SNAPSHOT_INTERVAL` | 1h | 016 | - |
| `EDGE_SNAPSHOT_KEEP` | 7 | 016 | retention |
| `EDGE_ROLE` | primary | 022 | `primary` or `follower` |
| `EDGE_PRIMARY_URL` | (follower-only) | 022 | follower → primary URL |
| `EDGE_FOLLOWER_PULL_INTERVAL` | 30s | 022 | snapshot pull interval |
| `EDGE_PEERS` | (off) | 031 | comma-sep peer URLs (election opt-in) |
| `EDGE_SELF_URL` | (req for election) | 031 | this edge's own peer URL |
| `EDGE_ELECTION_HB_INTERVAL` | 500ms | 031 | leader heartbeat interval |
| `EDGE_ELECTION_TIMEOUT_MIN/MAX` | 1500ms / 3000ms | 031 | follower election timer (jitter range) |
| `EDGE_WAL_PATH` | (off) | 019 | metadata mutation WAL file (opt-in audit log) |
| `EDGE_WAL_BATCH_INTERVAL` | (off) | 035 | group commit interval (e.g. 5ms) |
| `EDGE_STRICT_REPL` | 0 | 033 | informational strict replication: 503 on quorum miss |
| `EDGE_TRANSACTIONAL_RAFT` | 0 | 034 | replicate-then-commit (PutObject only) |
| `EDGE_PLACEMENT_PREFER` | (off) | - | DN class bias (e.g. hot) |
| `EDGE_METRICS` | 1 | 036/037 | `/metrics` Prometheus endpoint |
| `EDGE_CHUNKER_POOL_CAP_BYTES` | (off) | 037 | chunker scratch-pool soft cap |
| `EDGE_TLS_CERT/KEY` | (off) | 029 | edge HTTPS server |
| `EDGE_DN_TLS_CA`, `EDGE_DN_TLS_CLIENT_CERT/KEY` | (off) | 029 | edge → DN TLS / mTLS client |
| `EDGE_SKIP_AUTH` | 0 | - | DEMO only (never in production) |
| `EDGE_COORD_URL` | (off) | 015 | enables coord proxy mode (delegates metadata and placement) |
| `EDGE_COORD_URLKEY_POLL_INTERVAL` | 30s | 049 | polling interval for coord urlkey changes |
| `EDGE_COORD_LOOKUP_CACHE_TTL` | (off) | - | opt-in cache TTL for coord lookup results (e.g. 2s) |
| `COORD_ADDR` | `:9000` | 015 | coord HTTP bind |
| `COORD_DATA_DIR` | `./coord-data` | 015 | coord bbolt directory |
| `COORD_DNS` | required | 015 | DN addrs known to coord (comma-sep) |
| `COORD_PEERS` / `COORD_SELF_URL` | (off) | 038 | coord-side election peer set |
| `COORD_WAL_PATH` | (off) | 039 | coord-to-coord WAL replication |
| `COORD_TRANSACTIONAL_RAFT` | 0 | 040 | coord replicate-then-commit (requires Elector + WAL) |
| `COORD_DN_IO` | 0 | 044 | coord performs chunk I/O: rebalance/gc/repair apply on coord |
| `COORD_ANTI_ENTROPY_INTERVAL` | (off) | 055 | leader-only scheduled audit |
| `COORD_AUTO_REPAIR_INTERVAL` | (off) | 062 | leader-only scheduled self-heal |
| `COORD_AUTO_REPAIR_MAX` | 100 | 062 | max repairs per auto-repair tick |
| `COORD_AUTO_REPAIR_CONCURRENCY` | 4 | 062 | auto-repair worker pool size |
| `COORD_METRICS` | 1 | 062/063 | coord `/metrics` Prometheus endpoint |
| `DN_ADDR`, `DN_DATA_DIR`, `DN_ID` | `:8080`, required, required | 002 | DN side |
| `DN_SCRUB_INTERVAL` | (off) | 054 | bit-rot scrubber pacing |
| `DN_TLS_CERT/KEY/CLIENT_CA` | (off) | 029 | DN-side TLS / mTLS |
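
A daemon might read settings like these along the following lines. This is a sketch only: the helper names (`envString`, `envDuration`) are hypothetical and kvfs's actual parsing may differ. The variable names and defaults are the ones listed in the table.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// lookup has the same shape as os.LookupEnv, so a map-backed fake can
// stand in for the real environment.
type lookup func(key string) (string, bool)

// envString returns the variable's value, or def when unset or empty.
func envString(get lookup, key, def string) string {
	if v, ok := get(key); ok && v != "" {
		return v
	}
	return def
}

// envDuration parses values such as "10s" or "500ms", falling back to
// def when the variable is unset or empty.
func envDuration(get lookup, key string, def time.Duration) (time.Duration, error) {
	v, ok := get(key)
	if !ok || v == "" {
		return def, nil
	}
	d, err := time.ParseDuration(v)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return d, nil
}

func main() {
	addr := envString(os.LookupEnv, "EDGE_ADDR", ":8000")
	hb, err := envDuration(os.LookupEnv, "EDGE_HEARTBEAT_INTERVAL", 10*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// A non-empty EDGE_COORD_URL switches the edge to coord proxy mode.
	coordProxy := envString(os.LookupEnv, "EDGE_COORD_URL", "") != ""
	fmt.Printf("addr=%s heartbeat=%s coord-proxy=%t\n", addr, hb, coordProxy)
}
```

Failing fast on a malformed duration surfaces typos at startup instead of silently falling back to a default.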

`docs/FOLLOWUP.md`: the single source of truth for pending work, ordered by priority.

Apache 2.0. PRs are welcome. Code review treats educational value as the top criterion. See `CONTRIBUTING.md` for the detailed workflow.