
HardcoreMonk/kvfs


kvfs — Key-Value File System

An open-source reference that demonstrates the design principles of distributed object storage through live demos. Go 1.26 · Apache 2.0 · 59 ADRs · 55 blog episodes · 48 live demos · 190 unit tests

What is this

kvfs implements the core principles of distributed object storage in small, understandable code. Each principle is verified by an ADR (design decision) + blog episode + live demo.

| Season | Track | Status | Highlights |
|---|---|---|---|
| 1 | MVP | ✅ closed | 2-daemon · 3-way replication · UrlKey · CA chunk |
| 2 | Distributed algorithms | ✅ closed | HRW placement · rebalance · GC · chunking · EC |
| 3 | Operability | ✅ closed | auto-trigger · EC repair · meta backup · heartbeat · multi-edge HA |
| 4 | Performance and efficiency | ✅ closed | streaming · CDC · WAL · election · sync repl · transactional Raft · micro-opts |
| 5 | coord separation | ✅ closed (Ep.1~7) | ADR-015·038~042. skeleton → edge meta client → HA → WAL sync → txn commit → placement RPC → cli inspect |
| 6 | coord operational migration | ✅ Ep.1~7 done | ADR-043~049. rebalance · GC · repair · DN registry · URLKey rotation/propagation, all on coord |
| 7 | textbook primitives | ✅ closed (Ep.1~4) | ADR-051~054. failure domain · degraded read · tunable consistency · anti-entropy/Merkle |
| P8 | self-heal polish | ✅ P8-16 done | ADR-050·055~063. chaos hardening · 4-channel self-heal · continuous repair · Prometheus observability |

This is a simplified core of what Ceph, MinIO, and S3 do. The goal is an understandable reference, not a production system.

5-minute demo

git clone https://github.com/HardcoreMonk/kvfs
cd kvfs

# 1. Start the cluster (edge × 1 + dn × 3)
./scripts/up.sh

# 2. α: 3-way replication durability (GET still succeeds after DN-1 is killed)
./scripts/demo-alpha.sh

# 3. ε: UrlKey presigned URL (401 after expiry)
./scripts/demo-epsilon.sh

All 48 demos pass live (scripts/demo-*.sh); see the ADR tables below for the per-season mapping. Greek letters (α~ω, 21 demos) = S1~S4, Hebrew letters (aleph~nun, 14 demos) = S5~S6, plus the S7 and P8 anti-entropy specials.

Architecture

                                    ┌─ kvfs-coord (× N) ───────┐  (optional, S5 onward)
                                    │  placement·meta owner    │
                                    │  Raft + WAL replication  │
   Client ─HTTP+UrlKey─► kvfs-edge ─┴─ HTTP REST ──► kvfs-dn (× N)
                            │                          │
                            ├─ thin gateway            └─ chunks/<sha256[0:2]>/<rest>
                            ├─ chunker / EC encoder
                            ├─ urlkey verify (multi-kid)
                            └─ (coord-proxy mode OR inline coordinator)
  • 2-daemon mode (S1~S4 compatible): the edge runs the coordinator inline, so placement, rebalance, GC, and repair all happen inside the edge.

  • 3-daemon mode (S5~): when EDGE_COORD_URL is set, the edge acts as a thin gateway and coord owns placement, metadata, and consistency. coord can provide HA + zero-RPO through Raft (COORD_PEERS) + WAL replication (COORD_WAL_PATH) + transactional commit (COORD_TRANSACTIONAL_RAFT). The cli administers coord directly (--coord URL).

  • First read: docs/GUIDE.md (or docs/guide.html for the browser), a 13-chapter walkthrough

  • Short reference: docs/ARCHITECTURE.md

  • Agent working conventions: AGENTS.md, written for Codex. CLAUDE.md is a compatibility shim
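
The `chunks/<sha256[0:2]>/<rest>` layout shown in the diagram can be illustrated with a short Go sketch. It assumes that `sha256[0:2]` means the first two hex characters of the digest and that `<rest>` is the remaining 62; `chunkPath` is a hypothetical helper for illustration, not a function taken from the kvfs source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// chunkPath maps chunk bytes to a content-addressed relative path of the form
// chunks/<first 2 hex chars>/<remaining 62 hex chars>.
// Both the helper name and the exact split are assumptions.
func chunkPath(data []byte) string {
	sum := sha256.Sum256(data)
	h := hex.EncodeToString(sum[:]) // 64 hex characters
	return path.Join("chunks", h[:2], h[2:])
}

func main() {
	// Identical bytes always map to the same path, so duplicate chunks
	// are stored once per DN.
	fmt.Println(chunkPath([]byte("hello kvfs")))
}
```

Fanning out on a short prefix keeps any single directory from accumulating every chunk file, which is a common reason for this kind of layout.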

Design decisions (full ADR text)

Each decision is recorded as a standalone document under docs/adr/ and kept as an immutable record.

Season 1 (MVP)

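| ADR | Topic | Blog |
|---|---|---|
| 001 | Independent project identity (clean-slate) | 01 |
| 002 | 2-daemon MVP (edge + dn × 3) | |
| 003 | HTTP REST communication (curl·tcpdump friendly) | |
| 004 | bbolt metadata (pure Go, 1 external dependency) | |
| 005 | sha256 content-addressable | |
| 006 | 1 object = 1 chunk (superseded by 011) | |
| 007 | UrlKey HMAC-SHA256 | |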

Season 2 (distributed algorithms)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 009 | Rendezvous Hashing (HRW) | ζ | 02 |
| 010 | Rebalance worker (copy-then-update) | η | 03 |
| 012 | Surplus chunk GC (claimed-set + min-age) | θ | 04 |
| 011 | Chunking (supersedes ADR-006) | ι | 05 |
| 008 | Reed-Solomon EC (GF(2^8) from-scratch) | κ | 06 |

Season 3 (operability)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 013 | Auto-trigger (in-edge ticker) | λ | 07 |
| 024 | EC stripe rebalance (set-based minimal movement) | μ | 08 |
| 025 | EC repair queue (K survivors → reconstruct) | ν | 09 |
| 014 | Meta backup (snapshot + offline restore) | ξ | 10 |
| 030 | DN heartbeat (in-edge pull) | ο | 11 |
| 016 | Auto-snapshot scheduler (ticker + retention) | π | 12 |
| 022 | Multi-edge HA (read-replica + atomic.Pointer hot-swap) | ρ | 13 |

Season 4 (performance and efficiency)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 017 | Streaming PUT/GET (io.Reader based) | σ | 14 |
| 018 | Content-defined chunking (FastCDC, opt-in) | τ | 15 |
| 031 | Auto leader election (Raft-style, multi-edge HA) | υ | 16 |
| 019 | WAL of metadata mutations (audit + replay) | φ | 17 |

EC streaming = Ep.6 follow-up (demo-χ). EC+CDC = Ep.7 follow-up (demo-ψ). Sync WAL push = Ep.8 follow-up (demo-ω).

Operational hardening + post-S4 wave (Accepted)

| ADR | Topic |
|---|---|
| 027 | Dynamic DN registry (admin endpoint + bbolt persistence) |
| 028 | UrlKey kid rotation (multi-key Signer) |
| 029 | Optional TLS / mTLS (env-driven opt-in) |
| 032 | NFS gateway: deferred (scope evaluation) |
| 033 | Strict replication (informational) |
| 034 | Transactional Raft (replicate-then-commit) |
| 035 | WAL group commit · 3-region CDC · chunker pool |
| 036 | WAL group commit observability gauges |
| 037 | Chunker scratch-pool soft cap |

Season 5 (coord separation)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 015 | Coordinator daemon separation (supersedes ADR-002) | aleph (Ep.1) | 29 |
| | edge → coord meta client (EDGE_COORD_URL) | bet (Ep.2) | 30 |
| 038 | Coord HA via Raft (ADR-031 reuse) | gimel (Ep.3) | 31 |
| 039 | Coord-to-coord WAL replication | dalet (Ep.4) | 32 |
| 040 | Coord transactional commit (ADR-034 port) | he (Ep.5) | 33 |
| 041 | Edge → coord placement RPC (single source of truth) | vav (Ep.6) | 34 |
| 042 | kvfs-cli direct coord admin (read-only inspect) | zayin (Ep.7) | 35 |

Season 6 (coord operational migration)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 043 | Rebalance plan on coord | chet (Ep.1) | 36 |
| 044 | Rebalance apply on coord (COORD_DN_IO) | tet (Ep.2) | 37 |
| 045 | GC plan + apply on coord | yod (Ep.3) | 38 |
| 046 | EC repair on coord | kaf (Ep.4) | 39 |
| 047 | DN registry mutation on coord | lamed (Ep.5) | 40 |
| 048 | URLKey kid registry on coord | mem (Ep.6) | 41 |
| 049 | Edge urlkey.Signer polling propagation | nun (Ep.7) | 42 |

Season 7 (textbook primitives)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 051 | Failure domain hierarchy | samekh (Ep.1) | 43 |
| 052 | Degraded read | ayin (Ep.2) | 44 |
| 053 | Tunable consistency | pe (Ep.3) | 45 |
| 054 | Anti-entropy / Merkle + scrubber | tsadi (Ep.4) | 46 |

P8 (chaos + anti-entropy self-heal polish)

| ADR | Topic | Demo | Blog |
|---|---|---|---|
| 050 | Raft stale-log protection + coord bootstrap | chaos-mixed | |
| 055 | Anti-entropy auto-repair + scheduled audit | anti-entropy-repair | 47 |
| 056 | Corrupt repair + dry-run | anti-entropy-repair-corrupt | 48 |
| 057 | EC inline repair | anti-entropy-repair-ec | 49 |
| 058 | EC corrupt repair | anti-entropy-repair-ec-corrupt | 50 |
| 059 | Repair throttle + precision | anti-entropy-throttle | 51 |
| 060 | Concurrent EC repair | anti-entropy-concurrent | 52 |
| 061 | Resilience polishes | anti-entropy-resilience | 53 |
| 062 | Auto-repair scheduling + coord metrics | anti-entropy-auto-metrics | 54 |
| 063 | Anti-entropy observability completions | anti-entropy-observability | 55 |

Environment variables

| Env | Default | ADR | Purpose |
|---|---|---|---|
| EDGE_ADDR | :8000 | 002 | HTTP bind |
| EDGE_DNS | required | 002 | comma-sep DN addrs |
| EDGE_DNS_RESET | 0 | 027 | reseed bbolt dns_runtime from EDGE_DNS |
| EDGE_DATA_DIR | ./edge-data | 004 | bbolt directory |
| EDGE_URLKEY_SECRET | required | 007 | HMAC secret |
| EDGE_URLKEY_PRIMARY_KID | (off) | 028 | primary URLKey kid override |
| EDGE_QUORUM_WRITE | auto | 002 | write quorum (0=auto) |
| EDGE_CHUNK_SIZE | 4 MiB | 011 | bytes per chunk (fixed mode) |
| EDGE_CHUNK_MODE | fixed | 018 | fixed \| cdc (FastCDC, replication only) |
| EDGE_AUTO | 0 | 013 | auto rebalance/GC opt-in |
| EDGE_AUTO_REBALANCE_INTERVAL | 5m | 013 | |
| EDGE_AUTO_GC_INTERVAL | 15m | 013 | |
| EDGE_AUTO_GC_MIN_AGE | 60s | 013 | GC safety window |
| EDGE_AUTO_CONCURRENCY | 4 | 013 | auto worker concurrency |
| EDGE_HEARTBEAT_INTERVAL | 10s | 030 | DN probe interval (0s = off) |
| EDGE_HEARTBEAT_FAIL_THRESHOLD | 3 | 030 | consecutive failures → unhealthy |
| EDGE_SNAPSHOT_DIR | (off) | 016 | auto-snapshot directory |
| EDGE_SNAPSHOT_INTERVAL | 1h | 016 | |
| EDGE_SNAPSHOT_KEEP | 7 | 016 | retention |
| EDGE_ROLE | primary | 022 | primary \| follower |
| EDGE_PRIMARY_URL | (follower-only) | 022 | follower → primary URL |
| EDGE_FOLLOWER_PULL_INTERVAL | 30s | 022 | snapshot pull interval |
| EDGE_PEERS | (off) | 031 | comma-sep peer URLs (election opt-in) |
| EDGE_SELF_URL | (req for election) | 031 | this edge's own peer URL |
| EDGE_ELECTION_HB_INTERVAL | 500ms | 031 | leader heartbeat interval |
| EDGE_ELECTION_TIMEOUT_MIN/MAX | 1500ms / 3000ms | 031 | follower election timer (jitter range) |
| EDGE_WAL_PATH | (off) | 019 | metadata mutation WAL file (opt-in audit log) |
| EDGE_WAL_BATCH_INTERVAL | (off) | 035 | group commit interval (e.g. 5ms) |
| EDGE_STRICT_REPL | 0 | 033 | informational strict replication, 503 on quorum miss |
| EDGE_TRANSACTIONAL_RAFT | 0 | 034 | replicate-then-commit (PutObject only) |
| EDGE_PLACEMENT_PREFER | (off) | | DN class bias (e.g. hot) |
| EDGE_METRICS | 1 | 036/037 | /metrics Prometheus endpoint |
| EDGE_CHUNKER_POOL_CAP_BYTES | (off) | 037 | chunker scratch-pool soft cap |
| EDGE_TLS_CERT/KEY | (off) | 029 | edge HTTPS server |
| EDGE_DN_TLS_CA, EDGE_DN_TLS_CLIENT_CERT/KEY | (off) | 029 | edge → DN TLS / mTLS client |
| EDGE_SKIP_AUTH | 0 | | demo only (never use in production) |
| EDGE_COORD_URL | (off) | 015 | enables coord proxy mode (delegates metadata and placement) |
| EDGE_COORD_URLKEY_POLL_INTERVAL | 30s | 049 | polling interval for coord urlkey changes |
| EDGE_COORD_LOOKUP_CACHE_TTL | (off) | | opt-in TTL for cached coord lookup results (e.g. 2s) |
| COORD_ADDR | :9000 | 015 | coord HTTP bind |
| COORD_DATA_DIR | ./coord-data | 015 | coord bbolt directory |
| COORD_DNS | required | 015 | DN addrs known to coord (comma-sep) |
| COORD_PEERS / COORD_SELF_URL | (off) | 038 | coord-side election peer set |
| COORD_WAL_PATH | (off) | 039 | coord-to-coord WAL replication |
| COORD_TRANSACTIONAL_RAFT | 0 | 040 | coord replicate-then-commit (requires Elector + WAL) |
| COORD_DN_IO | 0 | 044 | coord performs chunk I/O: rebalance/gc/repair apply on coord |
| COORD_ANTI_ENTROPY_INTERVAL | (off) | 055 | leader-only scheduled audit |
| COORD_AUTO_REPAIR_INTERVAL | (off) | 062 | leader-only scheduled self-heal |
| COORD_AUTO_REPAIR_MAX | 100 | 062 | max repairs per auto-repair tick |
| COORD_AUTO_REPAIR_CONCURRENCY | 4 | 062 | auto-repair worker pool size |
| COORD_METRICS | 1 | 062/063 | coord /metrics Prometheus endpoint |
| DN_ADDR, DN_DATA_DIR, DN_ID | :8080, required, required | 002 | DN side |
| DN_SCRUB_INTERVAL | (off) | 054 | bit-rot scrubber pacing |
| DN_TLS_CERT/KEY/CLIENT_CA | (off) | 029 | DN-side TLS / mTLS |

Next steps

docs/FOLLOWUP.md is the single source for pending work, ordered by priority.

Contributing

Apache 2.0. PRs are welcome. Code review treats educational value as the top criterion. See CONTRIBUTING.md for the detailed workflow.
