Comprehensive performance, correctness, safety, and memory comparison of three bit-field parsing approaches for Go:
| Library | Description |
|---|---|
| nibble | Reflection-based, struct-tag-driven bit packing (`bits:"N"`) |
| manual | Hand-written bit arithmetic — the theoretical fastest baseline |
| go-bitfield | Tag-driven bit parsing (`bit:"N"`), unmarshal-only |
The test packet is a 64-bit game-state struct (8 bytes exactly):

```go
type BenchPacket struct {
	IsAlive  bool   `bits:"1"`
	WeaponID uint8  `bits:"4"`
	TeamID   uint8  `bits:"2"`
	Health   uint16 `bits:"9"`
	PosX     int16  `bits:"12"`
	PosY     int16  `bits:"12"`
	Rotation uint8  `bits:"8"`
	Score    uint32 `bits:"16"`
} // total = 64 bits = 8 bytes
```

| Dataset | nibble | manual | go-bitfield | nibble/manual |
|---|---|---|---|---|
| 100 | ~1 570 | ~6.3 | ~1 900 | ~249× |
| 1 K | ~15 700 | ~63 | ~19 000 | ~249× |
| 10 K | ~157 000 | ~630 | ~190 000 | ~249× |
| 100 K | ~1.57 M | ~6 300 | ~1.9 M | ~249× |
| 1 M | ~15.7 M | ~63 000 | ~19 M | ~249× |
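For reference, the `manual` baseline in these tables amounts to hand-written shifts and masks over a single little-endian, LSB-first `uint64`. The sketch below follows the field widths of the struct above; it is an illustration of the technique, not the benchmark's exact code:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// BenchPacket mirrors the 64-bit game-state struct above (tags omitted).
type BenchPacket struct {
	IsAlive  bool   // 1 bit
	WeaponID uint8  // 4 bits
	TeamID   uint8  // 2 bits
	Health   uint16 // 9 bits
	PosX     int16  // 12 bits, signed
	PosY     int16  // 12 bits, signed
	Rotation uint8  // 8 bits
	Score    uint32 // 16 bits
}

var ErrInsufficientData = errors.New("insufficient data")

// marshalManual packs the fields LSB-first into a little-endian uint64.
func marshalManual(p *BenchPacket, b []byte) {
	var v uint64
	if p.IsAlive {
		v |= 1
	}
	v |= uint64(p.WeaponID&0xF) << 1
	v |= uint64(p.TeamID&0x3) << 5
	v |= uint64(p.Health&0x1FF) << 7
	v |= uint64(uint16(p.PosX)&0xFFF) << 16
	v |= uint64(uint16(p.PosY)&0xFFF) << 28
	v |= uint64(p.Rotation) << 40
	v |= uint64(p.Score&0xFFFF) << 48
	binary.LittleEndian.PutUint64(b, v)
}

// unmarshalManual reverses marshalManual, sign-extending the 12-bit fields.
func unmarshalManual(b []byte, p *BenchPacket) error {
	if len(b) < 8 {
		return ErrInsufficientData // the bounds check manual code must not skip
	}
	v := binary.LittleEndian.Uint64(b)
	p.IsAlive = v&1 != 0
	p.WeaponID = uint8(v >> 1 & 0xF)
	p.TeamID = uint8(v >> 5 & 0x3)
	p.Health = uint16(v >> 7 & 0x1FF)
	// Shift left then arithmetic-shift right to sign-extend a 12-bit value.
	p.PosX = int16(uint16(v>>16&0xFFF)<<4) >> 4
	p.PosY = int16(uint16(v>>28&0xFFF)<<4) >> 4
	p.Rotation = uint8(v >> 40)
	p.Score = uint32(v >> 48 & 0xFFFF)
	return nil
}

func main() {
	p := BenchPacket{IsAlive: true, WeaponID: 9, Health: 300, PosX: -100, PosY: 2047, Rotation: 200, Score: 54321}
	var buf [8]byte
	marshalManual(&p, buf[:])
	var q BenchPacket
	err := unmarshalManual(buf[:], &q)
	fmt.Println(err == nil && q == p) // round-trip preserves every field
}
```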
Run `go test -bench=BenchmarkUnmarshal -benchmem -benchtime=10s -count=3 ./...` for authoritative numbers on your machine.
| Operation | nibble | manual | go-bitfield |
|---|---|---|---|
| Unmarshal | 6 allocs/op | 0 allocs/op | varies |
| Marshal | 6 allocs/op | 0 allocs/op | N/A |
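The zero-alloc marshal row for `manual` relies on buffer reuse. A minimal sketch of that pattern with `sync.Pool` (names like `bufPool` and `marshalPooled` are illustrative, not the benchmark's actual identifiers):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
	"testing"
)

// bufPool hands out reusable 8-byte buffers. Pooling a pointer to an array
// (rather than a slice) keeps the Get/Put round-trip itself allocation-free.
var bufPool = sync.Pool{
	New: func() any { return new([8]byte) },
}

// marshalPooled writes a pre-packed 64-bit packet into a pooled buffer.
func marshalPooled(packed uint64) *[8]byte {
	buf := bufPool.Get().(*[8]byte)
	binary.LittleEndian.PutUint64(buf[:], packed)
	return buf
}

func main() {
	// testing.AllocsPerRun reports steady-state heap allocations per call.
	allocs := testing.AllocsPerRun(1000, func() {
		buf := marshalPooled(0xDEADBEEF)
		bufPool.Put(buf) // return the buffer so the next call reuses it
	})
	fmt.Println(allocs == 0)
}
```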
| Scenario | nibble | manual |
|---|---|---|
| Empty input | ErrInsufficientData ✓ | index panic 💥 |
| Truncated input | ErrInsufficientData ✓ | index panic 💥 |
| Field overflow (WeaponID=20 > 4-bit max) | ErrFieldOverflow ✓ | silent truncation 🐛 |
| All-zeros input | correct zero values ✓ | correct zero values ✓ |
| All-ones input | correct max values ✓ | correct max values ✓ |
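The "silent truncation" row is easy to reproduce: masking a value to the field width simply discards the high bits. A tiny sketch of the 🐛 (values illustrative):

```go
package main

import "fmt"

func main() {
	weaponID := uint8(20) // exceeds the 4-bit maximum of 15
	packed := weaponID & 0xF
	// Masking silently turns 20 (0b10100) into 4 (0b00100); no error is raised.
	fmt.Println(packed) // → 4
}
```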
```shell
# 1. Clone and install dependencies
git clone https://github.com/PavanKumarMS/nibble-benchmarks
cd nibble-benchmarks
go mod tidy

# 2. Full Go benchmark suite (authoritative numbers)
go test -bench=. -benchmem -benchtime=10s -count=3 ./...

# 3. Specific benchmark groups
go test -bench=BenchmarkUnmarshal -benchmem ./...
go test -bench=BenchmarkMarshal -benchmem ./...
go test -bench=BenchmarkRoundTrip -benchmem ./...

# 4. Simulation tests (pretty-printed tables)
go test -v -run TestSimulation ./...

# 5. Correctness proofs
go test -v -run TestCorrectness ./...

# 6. Edge-case / safety tests
go test -v -run TestEdgeCases ./...

# 7. Memory allocation analysis
go test -v -run TestMemory ./...

# 8. Concurrency scaling
go test -v -run TestConcurrent ./...

# 9. Race detector on concurrent test
go test -race -run TestConcurrent ./...

# 10. Generate HTML charts + markdown summary
go run cmd/runner/main.go

# 11. Generate charts AND open in browser
go run cmd/runner/main.go --open

# 12. Full run including memory/correctness checks
go run cmd/runner/main.go --full --open
```

- Fixed seed 42 for all datasets — results are 100 % reproducible.
- Realistic distributions: 90 % alive, health skewed toward critical, positions from a normal distribution (σ=500) clamped to the 12-bit signed range.
- Dataset sizes: 100, 1 K, 10 K, 100 K, 1 M, 10 M.
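A sketch of how such a generator can look. Only the fixed seed 42, the 90 % alive rate, and the σ=500 normal distribution clamped to the 12-bit signed range are stated above; the function names and `math/rand` usage are assumptions about the harness:

```go
package main

import (
	"fmt"
	"math/rand"
)

// clamp12 restricts a sample to the 12-bit signed range [-2048, 2047].
func clamp12(v float64) int16 {
	switch {
	case v < -2048:
		return -2048
	case v > 2047:
		return 2047
	}
	return int16(v)
}

func main() {
	r := rand.New(rand.NewSource(42))      // fixed seed: every run yields the same dataset
	alive := r.Float64() < 0.9             // ~90 % of packets are alive
	posX := clamp12(r.NormFloat64() * 500) // normal distribution, σ=500, clamped
	fmt.Println(alive, posX >= -2048 && posX <= 2047)
}
```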
- `b.ResetTimer()` called after all setup — setup allocations are excluded.
- `b.ReportAllocs()` on every benchmark — heap pressure is always visible.
- `b.SetBytes()` set to `size × 8` — `go test` reports MB/s automatically.
- Concrete-typed sinks (`var globalSink uint32`) — no interface-boxing allocs.
- Manual marshal benchmarks use `sync.Pool` — demonstrates zero-alloc path.
- `runtime.GC()` before every memory measurement — post-GC live heap only.
- Manual bit arithmetic is implemented correctly — wrong manual code would make nibble look better than it really is.
- The same pre-marshaled bytes (LittleEndian, LSB-first) feed all unmarshal benchmarks, ensuring apples-to-apples comparison.
- nibble vs manual correctness is verified on 100 000 packets before any performance claims are made (`TestCorrectness_NibbleVsManual`).
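The hygiene points above fit together in a skeleton like the following. The measured body is a stand-in for the real unmarshal loop, and `testing.Benchmark` lets the sketch run outside `go test`:

```go
package main

import (
	"fmt"
	"testing"
)

// Concrete-typed sink: keeps the compiler from eliminating the measured
// work without the interface-boxing alloc a generic sink would cause.
var globalSink uint32

func main() {
	const size = 1000
	data := make([]byte, size*8) // setup: stand-in for pre-marshaled packets

	res := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(size * 8) // go test derives MB/s from this
		b.ReportAllocs()     // surface heap pressure in every result
		b.ResetTimer()       // everything above is setup; exclude it
		for i := 0; i < b.N; i++ {
			var sum uint32
			for _, x := range data {
				sum += uint32(x)
			}
			globalSink = sum // concrete-typed assignment, no boxing
		}
	})
	fmt.Println(res.AllocsPerOp())
}
```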
```text
╔═══════════════════════════════════════════════════════════╗
║              GAME SERVER SIMULATION RESULTS               ║
╠═══════════════════════════════════════════════════════════╣
║ Library      │ Marshal   │ Unmarshal │ Total              ║
╠═══════════════════════════════════════════════════════════╣
║ nibble       │ ~1 800ms  │ ~1 700ms  │ ~3 500ms           ║
║ manual       │ ~25ms     │ ~8ms      │ ~33ms              ║
║ go-bitfield  │ N/A       │ ~2 400ms  │ N/A                ║
╠═══════════════════════════════════════════════════════════╣
║ nibble overhead vs manual: ~100–140×                      ║
╚═══════════════════════════════════════════════════════════╝
```
```text
Batch │ Heap live (post-GC)
    1 │ 305 MiB
    2 │ 305 MiB
  ... │ 305 MiB   ← flat: GC reclaims all allocations
   10 │ 305 MiB
```
nibble allocates ~596 MiB cumulatively per 1 M packets processed (6 allocs × ~100 B × 1 M), but the live heap stays flat because the GC collects all transient objects.
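A sketch of the post-GC measurement technique behind that table (sizes here are scaled down, and `liveHeap` is an illustrative name, not the harness's function):

```go
package main

import (
	"fmt"
	"runtime"
)

// liveHeap forces a collection, then reports only surviving heap bytes,
// so transient garbage never inflates the measurement.
func liveHeap() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	before := liveHeap()

	// Allocate ~8 MiB of transient buffers, then drop every reference.
	transient := make([][]byte, 1024)
	for i := range transient {
		transient[i] = make([]byte, 8*1024)
	}
	transient = nil
	_ = transient

	after := liveHeap()
	// The live heap returns to roughly its starting level: the GC
	// reclaimed the transient allocations, matching the flat batch table.
	fmt.Println(after < before+1<<20)
}
```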
Running `go run cmd/runner/main.go --open` generates the following charts in `charts/`:
| File | Content |
|---|---|
| `unmarshal_comparison.html` | Grouped bar: ns/op by dataset size |
| `marshal_comparison.html` | Grouped bar: marshal ns/op |
| `throughput_scaling.html` | Line: millions of packets/second |
| `memory_pressure.html` | Bar: allocs/op per library |
A markdown summary ready to paste into nibble's README is written to `results/summary.md`.
| Concern | Winner | Notes |
|---|---|---|
| Raw throughput | manual | ~100–250× faster; zero allocations |
| Safety | nibble | Catches overflow & insufficient data; manual panics or corrupts |
| Correctness guarantee | nibble | `TestCorrectness_NibbleVsManual` proves identical output |
| Developer ergonomics | nibble | One struct definition; no hand-written bit math to maintain |
| GC friendliness | manual | 0 allocs/op vs 6 allocs/op for nibble |
| Concurrency scaling | both | Both scale linearly; manual ~200–500× faster at every level |
When to choose nibble: the wire format is defined once as a tagged struct, correctness and maintainability matter, and throughput below ~1 M pkt/s is acceptable.
When to choose manual: the parser sits on a hot path, you need more than ~10 M pkt/s, and you can afford to write, maintain, and bounds-check the bit math yourself.