A fully lock-free hash table for Go, built on the split-ordered list
algorithm of Shalev & Shavit. It supports concurrent Set, Get, Contains,
and Delete from any number of goroutines with no locks anywhere, using only
atomic compare-and-swap.
All items live in a single shared, lock-free sorted linked list, ordered by a 64-bit split-order key derived from the user key's hash (the hash bits are bit-reversed so that splitting a bucket never reorders its elements). Buckets are just pointers to dummy sentinel nodes spliced into that list. Growing the table therefore never moves an item; it only adds new sentinels.
list.go: the shared lock-free ordered list. Each node'snextpointer packs a "logically deleted" mark into its low bit (an interior pointer), so deletion needs no extra allocation and the Go garbage collector defeats the ABA problem. Insertion, lookup, and the two-phase (mark-then-unlink) deletion are all CAS loops.hash_table.go: the split-ordered hash table. A lock-free two-level bucket directory (an atomically-swapped slice of fixed-size segments) maps buckets to their sentinels. The table grows by CASingsizetosize*2once the load factor passes3/4; buckets are initialized lazily.
import "splitorderedlist"
ht := splitorderedlist.NewHashTable[string, int](16, func(k string) uint64 {
h := fnv.New64a()
_, _ = h.Write([]byte(k))
return h.Sum64()
})
ht.Set("answer", 42)
if v, ok := ht.Get("answer"); ok {
fmt.Println(v) // 42
}
ht.Contains("answer") // true
ht.Delete("answer") // true
ht.Len() // number of stored itemsNewHashTable[K, V](segmentSize, hashFunc) takes:
segmentSize: bucket-directory segment size; must be a power of two.hashFunc: maps a keyKto auint64hash.
| Method | Description |
|---|---|
Set(key, value) |
Insert key→value, or overwrite if the key exists. |
Get(key) (V, bool) |
Return the stored value and whether the key was present. |
Contains(key) bool |
Report whether the key is present. |
Delete(key) bool |
Remove the key; returns true if it was present. |
Len() uint64 |
Number of stored items. |
All methods are safe to call concurrently from multiple goroutines.
go test ./... # full suite
go test -race ./... # with the race detector
go test -run TestConcurrent # concurrency tests onlyThe suite includes unit tests (hash_table_test.go), concurrent Set/Get
tests (concurrent_set_test.go), model-based and structural-invariant stress
tests plus a GC-churn test (stress_model_test.go, stress_test.go), and a
Porcupine linearizability check
(linearizability_test.go) that records a real concurrent history and verifies
it is equivalent to some valid sequential order. Benchmarks come in two flavors:
concurrent throughput vs. goroutine count (bench_test.go, plus the HTML report
in bench_report_test.go), and single-threaded latency swept over table size
with explicit hit / miss / cache-miss / insert / delete paths (bench_test.go).
The concurrent throughput benchmarks run for int, string, and composite
struct keys, so you can see how key hashing/equality cost shifts the results.
go test -bench . -benchmem # concurrent throughput (int/string/struct)
go test -bench BenchmarkSeq -benchmem # sequential latency vs. table size
go test -bench 'String|Struct' -benchmem # only the string/struct key variantsThe HTML report (report.html) is generated by the gated TestGenerateBenchReport
and covers all three key types behind a selector, plus the memory footprint:
BENCH_REPORT=1 go test -run TestGenerateBenchReportThe numbers below are throughput in millions of operations per second
(M ops/s), higher is better. Each value is how many Get/Set calls all
goroutines complete per second combined; e.g. 66.3 means 66.3 million ops/s
(≈ 15 ns per operation). It is not a latency or a time; a bigger number is
faster.
Measured at 32 goroutines on an Apple M3 Pro (11 cores), Go 1.25,
darwin/arm64; each figure is the median of 5 runs. SplitOrdered is this
library; RWMutexMap is the built-in map under a sync.RWMutex; SyncMap is
sync.Map. Best per row in bold.
| Key | Workload | SplitOrdered | RWMutexMap | SyncMap |
|---|---|---|---|---|
| (M ops/s) | (M ops/s) | (M ops/s) | ||
| int | Read | 66.3 | 12.7 | 68.9 |
| int | Write (uncontended) | 16.5 | 3.8 | 19.8 |
| int | Write (contended) | 136.9 | 8.7 | 49.7 |
| int | Mixed 90/10 | 57.4 | 11.6 | 53.9 |
| string | Read | 120.8 | 11.1 | 154.1 |
| string | Write (uncontended) | 40.5 | 3.7 | 26.7 |
| string | Write (contended) | 78.0 | 7.4 | 32.9 |
| string | Mixed 90/10 | 97.6 | 16.0 | 85.6 |
| struct | Read | 132.4 | 11.1 | 76.0 |
| struct | Write (uncontended) | 37.3 | 4.0 | 28.5 |
| struct | Write (contended) | 82.7 | 7.4 | 34.2 |
| struct | Mixed 90/10 | 108.7 | 13.2 | 69.7 |
Takeaways:
- Scales with cores.
SplitOrderedandSyncMapget faster with more goroutines;RWMutexMapgets slower, because its shared reader counter and exclusive writes serialize under contention (it leads only single-threaded, e.g. int Read at 1 goroutine: RWMutex 27.6 vs SplitOrdered 10.5). - Reads:
SyncMapwins for int/string;SplitOrderedwins for struct keys. - Writes & mixed:
SplitOrderedis fastest or tied almost everywhere, and dominates the contended-write case. - Memory (bytes/item, lower is better): int gives SplitOrdered 82.8,
RWMutexMap 36.1, SyncMap 126.7.
RWMutexMapis the leanest;SyncMapthe heaviest.
Numbers are hardware-specific, so regenerate report.html to measure your own
machine, and treat small gaps as ties (5-run medians, no confidence intervals).
Go 1.25 or newer (uses generics and sync/atomic typed pointers).