perf: add hex-simd benchmarks and optimize SIMD implementations by zerosnacks · Pull Request #35 · DaniPopes/const-hex

zerosnacks · 2026-01-24T21:40:07Z

Adds hex-simd to benchmarks (closes #25) and optimizes SIMD implementations to match/exceed hex-simd performance.

Benchmark results

CPU: Intel Core i9-13900K (x86_64, AVX2)

check

Size	const-hex (before)	const-hex (after)	hex-simd	faster-hex
32B	2.47 ns	2.44 ns	1.71 ns	3.36 ns
256B	17.58 ns	16.69 ns	13.50 ns	23.36 ns
2KB	150.6 ns	81.69 ns	113 ns	146.7 ns
16KB	1.13 µs	570.6 ns	885.6 ns	1.13 µs
128KB	8.96 µs	5.50 µs	6.99 µs	10.2 µs
1MB	68.73 µs	46.64 µs	53.29 µs	71.92 µs

encode

Size	const-hex (before)	const-hex (after)	hex-simd	faster-hex
32B	8.50 ns	8.98 ns	6.91 ns	12.03 ns
256B	23.69 ns	15.61 ns	16.22 ns	21.08 ns
2KB	88.69 ns	72.69 ns	107.3 ns	110.6 ns
16KB	599.6 ns	548.1 ns	845.6 ns	726.6 ns
128KB	4.42 µs	3.75 µs	6.64 µs	7.59 µs
1MB	47.43 µs	47.96 µs	53.70 µs	77.20 µs

encode_to_slice

Size	const-hex (before)	const-hex (after)	hex-simd	faster-hex
32B	1.90 ns	1.73 ns	1.71 ns	4.09 ns
256B	10.49 ns	8.28 ns	12.64 ns	13.38 ns
2KB	77.63 ns	60.10 ns	103.3 ns	85.44 ns
16KB	613.1 ns	539.9 ns	825.6 ns	653.9 ns
128KB	5.28 µs	5.11 µs	6.61 µs	5.22 µs
1MB	55.39 µs	55.40 µs	52.73 µs	51.51 µs

decode

Size	const-hex (before)	const-hex (after)	hex-simd	faster-hex
32B	28.69 ns	16.05 ns	9.46 ns	13.71 ns
256B	46.75 ns	26.16 ns	35.16 ns	43.66 ns
2KB	256.6 ns	161.8 ns	252.6 ns	270.4 ns
16KB	1.97 µs	1.28 µs	1.94 µs	2.03 µs
128KB	15.54 µs	9.62 µs	14.80 µs	15.98 µs
1MB	118.2 µs	82.16 µs	123.8 µs	135.8 µs

decode_to_slice

Size	const-hex (before)	const-hex (after)	hex-simd	faster-hex
32B	5.57 ns	7.20 ns	3.62 ns	6.49 ns
256B	30.77 ns	21.58 ns	28.66 ns	36.10 ns
2KB	249.3 ns	163.1 ns	235.1 ns	250.9 ns
16KB	1.96 µs	1.28 µs	1.97 µs	1.98 µs
128KB	15.55 µs	10.11 µs	14.85 µs	15.65 µs
1MB	118.2 µs	81.89 µs	124.5 µs	122 µs

format

Size	const-hex (before)	const-hex (after)	std
32B	13.95 ns	7.30 ns	549.6 ns
256B	20.35 ns	18.96 ns	3.81 µs
2KB	121.1 ns	108.5 ns	30.7 µs
16KB	1.22 µs	1.10 µs	135.5 µs
128KB	11.11 µs	11.04 µs	1.06 ms
1MB	151.1 µs	161.4 µs	8.45 ms

Changes

- Keep original 6-comparison SSE2 algorithm for small inputs (addresses, hashes)
- Add AVX2 check with signed overflow trick for larger inputs ≥128 bytes (Muła & Langdale)
- Double encode throughput (32→64 bytes/iter) via new AVX2 path
- Add 13 new edge case tests covering SIMD boundaries and all byte values
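
For readers unfamiliar with the "signed overflow trick", the following is a minimal scalar sketch of the general idea, not the code from this PR. It assumes the trick means shifting each byte range so that its upper bound lands on `i8::MAX`, which lets a single signed comparison replace the usual pair of bounds checks. SSE2 and AVX2 only provide signed byte comparisons, so this shape maps directly onto them. The helper names `in_range_signed`, `is_hex_byte`, and `check_scalar` are hypothetical and exist only for illustration.

```rust
/// Returns true when `x` lies in `lo..=hi`, using one wrapping add and one
/// signed comparison. Assumes `lo <= hi <= 127` (true for ASCII ranges).
fn in_range_signed(x: u8, lo: u8, hi: u8) -> bool {
    // Shift the range so that `hi` lands on i8::MAX (127). Wrapping add is a
    // bijection on bytes, so every byte outside the range ends up at a
    // smaller signed value than every byte inside it.
    let shifted = x.wrapping_add(127 - hi) as i8;
    let width = (hi - lo) as i8;
    shifted >= 127 - width
}

/// Hypothetical scalar stand-in for one SIMD lane of a hex validity check.
fn is_hex_byte(x: u8) -> bool {
    // Digits are checked on the raw byte. Letters are case-folded with
    // `| 0x20` first, so 'A'..='F' and 'a'..='f' share one range check.
    in_range_signed(x, b'0', b'9') || in_range_signed(x | 0x20, b'a', b'f')
}

/// Hypothetical scalar equivalent of a whole-input `check`.
fn check_scalar(input: &[u8]) -> bool {
    input.iter().all(|&b| is_hex_byte(b))
}
```

A vectorized version would apply the same add and compare to 16 or 32 bytes at once and combine the per-lane results with a movemask. The exact instruction sequence used in the PR may differ from this sketch.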

Adds hex-simd to benchmarks (closes DaniPopes#25) and significantly improves performance of check and encode operations.

Performance optimizations:
- Replace check algorithm with signed overflow trick (3x faster)
- Add AVX2 check path processing 32 bytes per iteration
- Double encode throughput (32→64 bytes/iter) via new AVX2 path

Testing:
- Add 13 new edge case tests covering SIMD boundaries and all byte values

References:
- http://0x80.pl/notesen/2022-01-17-validating-hex-parse.html

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf1ee-2643-76af-8139-99d7c2c02486

DaniPopes · 2026-02-25T07:55:00Z

Split into #38 (benchmarks) and #39 (optimizations). Thanks @zerosnacks!

zerosnacks force-pushed the zerosnacks/hex-simd-bench branch from 36017f8 to 64f908a · January 24, 2026 21:57

zerosnacks marked this pull request as ready for review January 24, 2026 22:01

This was referenced Feb 25, 2026

bench: add hex-simd benchmarks #38

Merged

perf: optimize SIMD check and encode implementations #39

Merged

DaniPopes closed this Feb 25, 2026

zerosnacks wants to merge 1 commit into DaniPopes:master from zerosnacks:zerosnacks/hex-simd-bench
