A from-scratch Go implementation of consistent hashing with virtual nodes. Built to show how distributed systems like Memcached, DynamoDB, and Cassandra route keys to servers without falling apart every time the topology changes.
Modulo hashing (`hash(key) % N`) is the obvious approach, but it breaks badly when N changes:
```
Before (3 nodes): hash("user:1") % 3 = 1  → Node B
After  (4 nodes): hash("user:1") % 4 = 2  → Node C   ← moved!
```
In a cache cluster with 100K keys, adding one node remaps ~80,000 keys. That's 80K cache misses hitting your database at once.
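You can measure the damage directly. The sketch below (key naming and the FNV hash choice are illustrative, not taken from this repo's code) counts how many of 100K keys land on a different node when `N` goes from 3 to 4 under modulo placement:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashKey returns a stable 32-bit FNV-1a hash for a key.
func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// countRemapped counts keys whose node assignment changes when the
// cluster grows from n to m nodes under modulo placement.
func countRemapped(n, m, numKeys int) int {
	moved := 0
	for i := 0; i < numKeys; i++ {
		h := hashKey(fmt.Sprintf("user:%d", i))
		if h%uint32(n) != h%uint32(m) {
			moved++
		}
	}
	return moved
}

func main() {
	moved := countRemapped(3, 4, 100_000)
	fmt.Printf("3 -> 4 nodes: %d of 100000 keys remapped (%.1f%%)\n",
		moved, 100*float64(moved)/100_000)
}
```

With a well-distributed hash, roughly three quarters of the keys move (a key stays put only when `h%3 == h%4`).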
Place nodes and keys on the same ring `[0, 2^32)`. A key belongs to the first node clockwise from its hash position:

```
         0 / 2^32 (wrap)
            │
Node-C ─────┤
            │
            ├── key:"user:1" → walks clockwise → Node-A
            │
Node-A ─────┤
            │
Node-B ─────┤
            │
            ▼  clockwise
```
Add or remove a node, and only ~K/N keys move. The rest stay put.
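The clockwise walk is just a binary search over sorted positions. Here is a minimal sketch of that idea — not the repo's `pkg/hashring` code, and without virtual nodes — using `sort.Search` to find the first position at or after the key's hash, wrapping to index 0 past the top of the ring:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// miniRing keeps node positions sorted; a key is owned by the first
// node clockwise of (at or after) its hash, wrapping around to 0.
type miniRing struct {
	positions []uint32          // sorted hash positions on the ring
	owner     map[uint32]string // position -> node name
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func newMiniRing(nodes ...string) *miniRing {
	r := &miniRing{owner: map[uint32]string{}}
	for _, n := range nodes {
		p := hash32(n)
		r.positions = append(r.positions, p)
		r.owner[p] = n
	}
	sort.Slice(r.positions, func(i, j int) bool { return r.positions[i] < r.positions[j] })
	return r
}

// Get walks clockwise: binary-search the first position >= hash(key).
func (r *miniRing) Get(key string) string {
	h := hash32(key)
	i := sort.Search(len(r.positions), func(i int) bool { return r.positions[i] >= h })
	if i == len(r.positions) { // past the last position: wrap to the start
		i = 0
	}
	return r.owner[r.positions[i]]
}

func main() {
	r := newMiniRing("node-a", "node-b", "node-c")
	fmt.Println("user:1 ->", r.Get("user:1"))
}
```

Lookups are O(log V) in the number of ring positions, and adding or removing a node only touches that node's own positions.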
A few physical nodes on the ring leave big gaps — some nodes end up with way more keys than others. Virtual nodes fix this by placing multiple copies of each node (e.g. Node-A#0 through Node-A#149):
| Virtual Nodes | Std Deviation | Max/Mean Ratio |
|---|---|---|
| 1 | 11,353 | 3.20x |
| 50 | 4,601 | 1.83x |
| 150 (default) | 2,824 | 1.47x |
| 500 | 976 | 1.17x |
150 is a reasonable default — good balance between memory and distribution quality.
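The vnode effect is easy to see empirically. This sketch (node names, key naming, and the `name#i` vnode convention are illustrative; the repo's `pkg/analysis` may compute this differently) hashes each node onto the ring many times and tallies per-node load:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

type vpoint struct {
	pos  uint32
	node string
}

// buildRing hashes vnodes replicas of each node ("a#0", "a#1", ...) onto
// the ring and sorts them by position.
func buildRing(nodes []string, vnodes int) []vpoint {
	var ring []vpoint
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			ring = append(ring, vpoint{hash32(fmt.Sprintf("%s#%d", n, i)), n})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })
	return ring
}

// countKeys assigns numKeys synthetic keys and tallies per-node load.
func countKeys(ring []vpoint, numKeys int) map[string]int {
	load := map[string]int{}
	for k := 0; k < numKeys; k++ {
		h := hash32(fmt.Sprintf("key:%d", k))
		i := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= h })
		if i == len(ring) {
			i = 0
		}
		load[ring[i].node]++
	}
	return load
}

func main() {
	nodes := []string{"a", "b", "c", "d", "e"}
	for _, v := range []int{1, 150} {
		fmt.Printf("vnodes=%d load=%v\n", v, countKeys(buildRing(nodes, v), 100_000))
	}
}
```

At `vnodes=1` the per-node counts are wildly uneven; at 150 they cluster near the fair share of 20,000.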
```sh
# see it in action
go run cmd/demo/main.go

# tests
make test               # unit tests
make test-race          # + race detector
make test-catastrophic  # chaos/integration tests
make test-all           # everything

# coverage report
make cover
```

The headline number — modulo vs consistent hashing with 100K keys:
| Scenario | Modulo (`hash % N`) | Consistent Hashing |
|---|---|---|
| 5 → 6 nodes (scale up) | 83,803 (83.8%) | 15,723 (15.7%) |
| 6 → 5 nodes (node failure) | 83,803 (83.8%) | 15,723 (15.7%) |
| 10 → 11 nodes | 90,856 (90.9%) | 4,667 (4.7%) |
It also includes a cache simulation where you can watch keys get lost when a "Redis node" goes down, and see that only that node's keys are affected.
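The scale-up side of that comparison can be reproduced in miniature. This sketch (150 vnodes, FNV hashing, and the naming are assumptions; the repo's analysis code may differ) builds a 5-node ring, adds a sixth node, and counts moved keys:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

type point struct {
	pos  uint32
	node string
}

// build places vnodes replicas of each node on the ring, sorted by position.
func build(nodes []string, vnodes int) []point {
	var ring []point
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			ring = append(ring, point{hash32(fmt.Sprintf("%s#%d", n, i)), n})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })
	return ring
}

// ownerOf finds the first vnode clockwise of the key's hash.
func ownerOf(ring []point, key string) string {
	h := hash32(key)
	i := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= h })
	if i == len(ring) {
		i = 0
	}
	return ring[i].node
}

// movedOnScaleUp counts keys that change owner when newNode joins.
func movedOnScaleUp(nodes []string, newNode string, vnodes, numKeys int) int {
	before := build(nodes, vnodes)
	after := build(append(append([]string{}, nodes...), newNode), vnodes)
	moved := 0
	for k := 0; k < numKeys; k++ {
		key := fmt.Sprintf("key:%d", k)
		if ownerOf(before, key) != ownerOf(after, key) {
			moved++
		}
	}
	return moved
}

func main() {
	moved := movedOnScaleUp([]string{"n1", "n2", "n3", "n4", "n5"}, "n6", 150, 100_000)
	fmt.Printf("5 -> 6 nodes: %d of 100000 keys moved (%.1f%%)\n",
		moved, 100*float64(moved)/100_000)
}
```

Every moved key lands on the new node, and the moved fraction hovers near the theoretical K/N = 1/6.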
```
consistent-hashing/
├── cmd/demo/main.go             # interactive demo (6 sections)
├── pkg/
│   ├── hasher/                  # Hasher interface + FNV, MD5, CRC32
│   ├── hashring/                # core ring: sorted positions, binary search, vnodes
│   ├── analysis/                # distribution stats, modulo vs consistent comparison
│   └── cache/                   # distributed cache simulator
├── test/
│   └── catastrophic_test.go     # chaos tests (mass failure, churn, concurrency)
├── Makefile
└── go.mod                       # zero external dependencies
```
The ring depends on a Hasher interface, not a concrete hash function — swap in a new one without touching the ring code. The Analyzer only needs NodeLocator (read-only lookups), not the full Ring interface. The cache takes a Ring, not a *HashRing.
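One plausible shape for that seam (the actual signatures in `pkg/hasher` and `pkg/hashring` may differ; this is a hedged sketch of the pattern, not the repo's API):

```go
package main

import (
	"fmt"
	"hash/crc32"
	"hash/fnv"
)

// Hasher is the interface the ring depends on instead of a concrete
// hash function — hypothetical signature for illustration.
type Hasher interface {
	Hash(key string) uint32
}

type fnvHasher struct{}

func (fnvHasher) Hash(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

type crc32Hasher struct{}

func (crc32Hasher) Hash(key string) uint32 {
	return crc32.ChecksumIEEE([]byte(key))
}

// NodeLocator is the read-only subset a consumer like an analyzer
// would need — again, a hypothetical narrow interface.
type NodeLocator interface {
	Get(key string) (node string, ok bool)
}

func main() {
	// Swapping hash functions means swapping values, not editing ring code.
	for _, h := range []Hasher{fnvHasher{}, crc32Hasher{}} {
		fmt.Printf("%T(user:1) = %d\n", h, h.Hash("user:1"))
	}
}
```

Narrow interfaces like `NodeLocator` keep consumers testable with tiny fakes and prevent read-only code from mutating the ring.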
Everything uses stdlib. No external deps.
These aren't just "does it work" tests — they verify that the mathematical properties of consistent hashing hold under pressure:
| Test | What Happens | What We Check |
|---|---|---|
| Mass Failure | Kill 5 of 10 nodes at once | Only dead nodes' keys move; 0 spurious remaps |
| Rapid Churn | 20 random add/remove cycles | Each op stays within 2.5x of K/N theoretical bound |
| No Keys Lost | Remove 3 of 8 cache nodes | Surviving keys are 100% intact |
| Scale-Up | Grow from 3 to 20 nodes | No node ever holds >3x its fair share |
| Concurrent Chaos | 20 readers + 5 writers, 2 seconds | Passes with -race, 0% error rate |
Run them with `make test-catastrophic`.
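The "0 spurious remaps" property in the Mass Failure row can be checked outside the repo's test suite too. This standalone sketch (node names, vnode count, and helpers are illustrative) records every key's owner, kills 5 of 10 nodes, and verifies that any key whose owner changed belonged to a dead node:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

type point struct {
	pos  uint32
	node string
}

func build(nodes []string, vnodes int) []point {
	var ring []point
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			ring = append(ring, point{hash32(fmt.Sprintf("%s#%d", n, i)), n})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })
	return ring
}

func ownerOf(ring []point, key string) string {
	h := hash32(key)
	i := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= h })
	if i == len(ring) {
		i = 0
	}
	return ring[i].node
}

// countSpurious returns how many keys changed owner even though their
// old owner survived. Consistent hashing guarantees this is zero: a
// surviving vnode's position is still the first one >= the key's hash.
func countSpurious(all, survivors []string, vnodes, numKeys int) int {
	dead := map[string]bool{}
	for _, n := range all {
		dead[n] = true
	}
	for _, n := range survivors {
		delete(dead, n)
	}
	before := build(all, vnodes)
	after := build(survivors, vnodes)
	spurious := 0
	for k := 0; k < numKeys; k++ {
		key := fmt.Sprintf("key:%d", k)
		old := ownerOf(before, key)
		if ownerOf(after, key) != old && !dead[old] {
			spurious++
		}
	}
	return spurious
}

func main() {
	all := []string{"n0", "n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8", "n9"}
	fmt.Println("spurious remaps after killing 5 of 10 nodes:",
		countSpurious(all, all[5:], 150, 10_000))
}
```

Removals only delete positions from the ring, so a key whose owning vnode survives still finds that same vnode first.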
- Karger et al. — Consistent Hashing and Random Trees (1997) — where it all started
- Amazon DynamoDB (2007) — consistent hashing in production at scale
- Jump Consistent Hash (2014) — faster alternative when you don't need named nodes
MIT