cppNexus/fast-hex-lite

fast-hex-lite

Ultra-fast hex encoding/decoding in Rust with zero allocations and #![no_std] support.


Why fast-hex-lite?

  • Zero allocations (except optional encode_to_string)
  • no_std by default
  • Precise error reporting (byte index)
  • Deterministic performance across input sizes
  • Optional SIMD acceleration
  • Stable Rust only (no nightly features)


Designed for performance-critical systems such as cryptography, networking stacks, blockchain infrastructure, and embedded environments where no_std and zero heap usage are mandatory.


Features

| Feature | Default | Description |
|---------|---------|-------------|
| (none)  | yes     | no_std, alloc-free scalar encoder/decoder |
| std     | no      | Implements std::error::Error for Error |
| simd    | no      | SIMD-accelerated decoder via architecture intrinsics (implies std) |

Feature interactions

  • simd implies std
  • Scalar path is always available
  • encode_to_string requires std
  • no_std builds exclude any allocation-based helpers

Installation

# Default: no_std, scalar only
[dependencies]
fast-hex-lite = "0.1"

# With SIMD acceleration
[dependencies]
fast-hex-lite = { version = "0.1", features = ["simd"] }

# Explicit no_std (same as default)
[dependencies]
fast-hex-lite = { version = "0.1", default-features = false }

Usage

All APIs shown below operate on caller-provided buffers and perform no heap allocation. The only allocating helper is the optional encode_to_string, which requires std.

Decode hex to bytes

use fast_hex_lite::decode_to_slice;

let hex = b"deadbeef";
let mut buf = [0u8; 4];
let n = decode_to_slice(hex, &mut buf).unwrap();
assert_eq!(&buf[..n], &[0xde, 0xad, 0xbe, 0xef]);

// Uppercase and mixed-case are accepted
decode_to_slice(b"DEADBEEF", &mut buf).unwrap();
decode_to_slice(b"DeAdBeEf", &mut buf).unwrap();

Decode in-place

Decodes ASCII hex in a mutable buffer into its own first half. No secondary buffer required.

use fast_hex_lite::decode_in_place;

let mut buf = *b"deadbeef";
let n = decode_in_place(&mut buf).unwrap();
assert_eq!(&buf[..n], &[0xde, 0xad, 0xbe, 0xef]);
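Decoding into the first half of the same buffer is safe because output byte i is built from input bytes 2i and 2i+1, so the write position never overtakes the read position. The following is a minimal, self-contained sketch of that idea and not the crate's actual implementation; `SketchError`, `hex_val`, and `decode_in_place_sketch` are hypothetical names. It validates the whole buffer before writing, so the buffer is left untouched on error.

```rust
/// Hypothetical error type for this sketch (not the crate's `Error`).
#[derive(Debug, PartialEq)]
enum SketchError {
    OddLength,
    InvalidByte { index: usize, byte: u8 },
}

/// Value of one ASCII hex digit, or None if the byte is not a hex digit.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decode the ASCII hex in `buf` into its own first half.
/// Returns the number of decoded bytes.
fn decode_in_place_sketch(buf: &mut [u8]) -> Result<usize, SketchError> {
    if buf.len() % 2 != 0 {
        return Err(SketchError::OddLength);
    }
    // Pass 1: validate everything first, so nothing is written on error.
    if let Some((index, &byte)) = buf
        .iter()
        .enumerate()
        .find(|(_, b)| hex_val(**b).is_none())
    {
        return Err(SketchError::InvalidByte { index, byte });
    }
    // Pass 2: write index i always trails read indices 2i and 2i + 1.
    let n = buf.len() / 2;
    for i in 0..n {
        let hi = hex_val(buf[2 * i]).unwrap();
        let lo = hex_val(buf[2 * i + 1]).unwrap();
        buf[i] = (hi << 4) | lo;
    }
    Ok(n)
}
```

After a successful call, only the first n bytes of the buffer are meaningful; the second half still holds leftover hex characters.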

Decode into a fixed-size array

use fast_hex_lite::decode_to_array;

let bytes: [u8; 4] = decode_to_array(b"deadbeef").unwrap();
assert_eq!(bytes, [0xde, 0xad, 0xbe, 0xef]);

Encode bytes to hex

use fast_hex_lite::encode_to_slice;

let src = [0xde, 0xad, 0xbe, 0xef];
let mut out = [0u8; 8];

encode_to_slice(&src, &mut out, true).unwrap();   // lowercase
assert_eq!(&out, b"deadbeef");

encode_to_slice(&src, &mut out, false).unwrap();  // uppercase
assert_eq!(&out, b"DEADBEEF");
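For readers curious how an allocation-free encoder can work, here is a minimal sketch based on a 16-entry nibble table. It is an illustration only and makes no claim about the crate's internals; `encode_sketch`, `LOWER`, and `UPPER` are hypothetical names, and the `Option` return type is a simplification of the crate's `Result`.

```rust
// Lookup tables mapping a 4-bit value to its ASCII hex digit.
const LOWER: &[u8; 16] = b"0123456789abcdef";
const UPPER: &[u8; 16] = b"0123456789ABCDEF";

/// Encode `src` as hex into `out`.
/// Returns the number of bytes written, or None if `out` is too small.
fn encode_sketch(src: &[u8], out: &mut [u8], lowercase: bool) -> Option<usize> {
    let needed = src.len().checked_mul(2)?;
    if out.len() < needed {
        return None;
    }
    let table = if lowercase { LOWER } else { UPPER };
    // Each input byte fills exactly one two-byte slot in the output.
    for (pair, &byte) in out.chunks_exact_mut(2).zip(src) {
        pair[0] = table[(byte >> 4) as usize];
        pair[1] = table[(byte & 0x0f) as usize];
    }
    Some(needed)
}
```

The size check happens before the loop, so a too-small output buffer is rejected without being modified.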

Length helpers

use fast_hex_lite::{decoded_len, encoded_len};

assert_eq!(decoded_len(8).unwrap(), 4);  // 8 hex chars -> 4 bytes
assert_eq!(encoded_len(4), 8);           // 4 bytes -> 8 hex chars

Error handling

use fast_hex_lite::{decode_to_slice, Error};

let mut buf = [0u8; 4];

// Odd-length input
assert_eq!(decode_to_slice(b"abc", &mut buf), Err(Error::OddLength));

// Output buffer too small
assert_eq!(decode_to_slice(b"deadbeef", &mut buf[..1]), Err(Error::OutputTooSmall));

// Invalid character: exact byte index reported
let err = decode_to_slice(b"deXd", &mut buf).unwrap_err();
assert!(matches!(err, Error::InvalidByte { index: 2, byte: b'X' }));

Errors are explicit and structured. InvalidByte reports the zero-based index of the first invalid byte in the source slice, together with the offending byte.


SIMD acceleration

Enable the simd feature to use a SIMD-accelerated decoder built on stable architecture intrinsics (SSE2 on x86_64, NEON on aarch64):

fast-hex-lite = { version = "0.1", features = ["simd"] }

The SIMD path processes 32 hex bytes per iteration. It is fully transparent: the public API, error types, and error index semantics are identical to the scalar path. Remaining tail bytes fall back to scalar automatically.
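The chunk-plus-tail structure described above can be sketched in portable Rust. This is an assumption-laden illustration, not the crate's code: the 32-byte "wide" step is a plain scalar stand-in where a real implementation would use intrinsics, and `nibble`, `decode_pairs`, and `decode_chunked_sketch` are hypothetical names. The point is how a base offset keeps error indices absolute across chunks and the tail. For brevity the sketch assumes even-length input and a large enough output, and it does not implement the no-partial-write guarantee.

```rust
const CHUNK: usize = 32; // hex bytes handled per wide iteration

/// Value of one ASCII hex digit, or None.
fn nibble(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decode hex pairs from `src` into `dst`. `base` is the offset of `src`
/// within the full input, so reported indices stay absolute.
fn decode_pairs(src: &[u8], dst: &mut [u8], base: usize) -> Result<(), usize> {
    for (i, (pair, out)) in src.chunks_exact(2).zip(dst.iter_mut()).enumerate() {
        let hi = nibble(pair[0]).ok_or(base + 2 * i)?;
        let lo = nibble(pair[1]).ok_or(base + 2 * i + 1)?;
        *out = (hi << 4) | lo;
    }
    Ok(())
}

/// Returns the decoded length, or the absolute index of the first invalid byte.
fn decode_chunked_sketch(src: &[u8], dst: &mut [u8]) -> Result<usize, usize> {
    assert!(src.len() % 2 == 0 && dst.len() >= src.len() / 2);
    let body_len = src.len() - src.len() % CHUNK;
    let (body, tail) = src.split_at(body_len);
    let (dst_body, dst_tail) = dst.split_at_mut(body_len / 2);
    // Wide path: full 32-byte chunks (scalar stand-in for a SIMD kernel).
    for (c, (chunk, out)) in body
        .chunks_exact(CHUNK)
        .zip(dst_body.chunks_exact_mut(CHUNK / 2))
        .enumerate()
    {
        decode_pairs(chunk, out, c * CHUNK)?;
    }
    // Tail: fewer than 32 bytes remain, handled by the scalar routine.
    decode_pairs(tail, dst_tail, body_len)?;
    Ok(src.len() / 2)
}
```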


Safety

  • Scalar path contains no unsafe
  • SIMD paths use architecture intrinsics behind feature gates
  • No panics on valid input
  • All bounds are checked
  • Error indices are deterministic and reproducible

Security & correctness philosophy

fast-hex-lite is designed with a conservative correctness-first mindset suitable for cryptography-adjacent and infrastructure workloads.

Deterministic semantics

  • All decoding paths (scalar and SIMD) share identical observable behavior.
  • Error indices are guaranteed to point to the first invalid byte.
  • Mixed-case input does not change control flow or error semantics.
  • No whitespace normalization or implicit acceptance of non-hex characters.

No partial mutation guarantees

  • decode_to_slice and decode_in_place never partially mutate the destination buffer on error.
  • If an error is returned, the caller's output buffer remains unchanged.

No hidden allocations

  • No heap allocation occurs in the default configuration.
  • All APIs operate on caller-provided memory.
  • encode_to_string is explicitly opt-in and requires std.

SIMD is an optimization, not a different implementation

  • SIMD is gated behind a feature flag.
  • Scalar fallback is always available.
  • All SIMD logic is covered by the same tests and error contracts.
  • Tail handling is verified to match scalar semantics byte-for-byte.

Audit-friendly design

  • Error types are explicit and structured.
  • No UB-prone pointer arithmetic in scalar code.
  • SIMD intrinsics are isolated and architecture-gated.
  • High test coverage across scalar and SIMD paths (~99% line coverage).

The goal is predictable, verifiable behavior under all inputs — including malformed or adversarial data — rather than maximum theoretical throughput at the cost of clarity or guarantees.



Testing & Coverage

The crate is validated with:

  • cargo test
  • cargo test --features simd
  • cargo clippy --all-targets --all-features -- -D warnings

Coverage is measured using llvm-cov.

Current coverage:

  • Total line coverage: ~99%
  • Functions: 100%
  • Scalar and SIMD paths both tested
  • All error variants covered
  • No-partial-write guarantees validated
  • Full 0x00–0xFF roundtrip tests

Benchmarks

Measured on Apple M3 Pro (macOS, cargo bench --features simd).

Numbers are median Criterion throughput values.

Throughput is over decoded output bytes for decode, input bytes for encode and validate, and decoded output bytes for decode_in_place.

Decode: scalar (hex to bytes)

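| Input | fast-hex-lite lower | fast-hex-lite mixed | hex crate lower | hex crate mixed |
|-------|---------------------|---------------------|-----------------|-----------------|
| 32 B  | 1.67 GiB/s | 1.66 GiB/s | 663 MiB/s | 696 MiB/s |
| 256 B | 1.57 GiB/s | 1.58 GiB/s | 636 MiB/s | 700 MiB/s |
| 4 KB  | 1.70 GiB/s | 1.70 GiB/s | 597 MiB/s | 621 MiB/s |
| 64 KB | 1.67 GiB/s | 1.68 GiB/s | 357 MiB/s | 370 MiB/s |
| 1 MB  | 1.67 GiB/s | 1.71 GiB/s | 207 MiB/s | 215 MiB/s |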

Decode: SIMD (hex to bytes)

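| Input | fast-hex-lite lower | fast-hex-lite mixed | hex crate lower | hex crate mixed |
|-------|---------------------|---------------------|-----------------|-----------------|
| 32 B  | 5.51 GiB/s | 5.49 GiB/s | 628 MiB/s | 681 MiB/s |
| 256 B | 6.10 GiB/s | 6.09 GiB/s | 608 MiB/s | 659 MiB/s |
| 4 KB  | 6.03 GiB/s | 6.04 GiB/s | 584 MiB/s | 617 MiB/s |
| 64 KB | 6.14 GiB/s | 6.15 GiB/s | 390 MiB/s | 391 MiB/s |
| 1 MB  | 6.09 GiB/s | 6.15 GiB/s | 201 MiB/s | 202 MiB/s |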

Encode (bytes to hex)

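| Input | fast-hex-lite lower | fast-hex-lite upper | hex crate lower |
|-------|---------------------|---------------------|-----------------|
| 32 B  | 2.50 GiB/s | 2.20 GiB/s | 2.03 GiB/s |
| 256 B | 2.50 GiB/s | 2.48 GiB/s | 2.01 GiB/s |
| 4 KB  | 2.61 GiB/s | 2.59 GiB/s | 2.06 GiB/s |
| 64 KB | 2.60 GiB/s | 2.60 GiB/s | 2.09 GiB/s |
| 1 MB  | 2.59 GiB/s | 2.59 GiB/s | 2.09 GiB/s |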

decode_in_place

| Input | scalar | simd |
|-------|--------|------|
| 32 B  | 655 MiB/s | 650 MiB/s |
| 256 B | 717 MiB/s | 709 MiB/s |
| 4 KB  | 764 MiB/s | 775 MiB/s |
| 64 KB | 765 MiB/s | 770 MiB/s |
| 1 MB  | 780 MiB/s | 785 MiB/s |

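Mixed-case input shows no measurable overhead versus lowercase. fast-hex-lite decode throughput is stable across all input sizes, whereas the hex crate slows down noticeably at 64 KB and above. The SIMD path delivers roughly a 3.5-3.7x uplift over scalar for decode at inputs of 4 KB and larger. decode_in_place shows no meaningful difference between the scalar and simd builds.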
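The uplift figure can be checked directly from the lowercase columns of the two decode tables above. The snippet below is just that arithmetic; `speedup`, `DECODE_LOWER`, and `speedups` are names chosen for this illustration.

```rust
/// Ratio of SIMD to scalar throughput (both in the same unit).
fn speedup(simd: f64, scalar: f64) -> f64 {
    simd / scalar
}

/// (input size, scalar GiB/s, SIMD GiB/s), copied from the
/// "fast-hex-lite lower" columns of the decode tables.
const DECODE_LOWER: [(&str, f64, f64); 5] = [
    ("32 B", 1.67, 5.51),
    ("256 B", 1.57, 6.10),
    ("4 KB", 1.70, 6.03),
    ("64 KB", 1.67, 6.14),
    ("1 MB", 1.67, 6.09),
];

/// Speedup per input size, in table order.
fn speedups() -> Vec<(&'static str, f64)> {
    DECODE_LOWER
        .iter()
        .map(|&(label, scalar, simd)| (label, speedup(simd, scalar)))
        .collect()
}
```

For the three largest sizes the ratios fall between 3.5 and 3.7; the 32 B row comes out lower (about 3.3x) and the 256 B row higher (about 3.9x).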


no_std support

The crate is #![no_std] by default. No allocator is required. All APIs work on caller-provided stack arrays or static buffers.

fast-hex-lite = { version = "0.1", default-features = false }

When to use

Use fast-hex-lite when:

  • You need deterministic performance
  • You run in no_std
  • You process large volumes of hex (RPC, blockchain, hashing)
  • You want explicit, index-aware error reporting

If you only need convenience APIs with heap allocation and minimal performance sensitivity, the hex crate may be sufficient.


Comparison

Crate no_std Alloc-free Precise error index SIMD ARM SIMD x86 Notes
fast-hex-lite ✅ NEON ✅ SSE2 Deterministic perf, zero-alloc by default
hex Convenience-focused
faster-hex ⚠ partial ✅ AVX/SSE x86-focused SIMD
const-hex Optimized for const-eval

Design goals

fast-hex-lite focuses on:

  • Zero heap usage by default
  • no_std compatibility
  • Deterministic throughput across sizes
  • Precise, reproducible error indices
  • Cross-architecture SIMD (x86_64 + aarch64)

Unlike x86-only SIMD crates, fast-hex-lite treats both aarch64 (including Apple Silicon) and x86_64 as first-class targets.


Architecture support

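| Architecture | Scalar | SIMD |
|--------------|--------|------|
| x86_64       | yes    | SSE2 |
| aarch64      | yes    | NEON |
| others       | yes    | none (scalar fallback) |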

Code structure

src/
  lib.rs      -- public API, Error type, feature gates
  decode.rs   -- scalar decoder, 256-entry compile-time LUT, in-place decode
  encode.rs   -- scalar encoder
  simd.rs     -- SIMD decoder (compiled only with feature `simd`)
benches/
  bench.rs    -- Criterion benchmarks vs hex crate
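To illustrate what a 256-entry compile-time LUT for hex decoding can look like, here is a small sketch. It is not taken from decode.rs; `INVALID`, `build_lut`, `HEX_LUT`, and `decode_pair` are hypothetical names, and the 0xff sentinel is an assumption of this example. Because upper- and lowercase digits are just different table entries, mixed-case input takes the same code path as lowercase.

```rust
/// Sentinel stored for bytes that are not hex digits (assumed for this sketch).
const INVALID: u8 = 0xff;

/// Build the table at compile time: byte value -> nibble value or INVALID.
const fn build_lut() -> [u8; 256] {
    let mut lut = [INVALID; 256];
    let mut i = 0usize;
    while i < 256 {
        let b = i as u8;
        lut[i] = match b {
            b'0'..=b'9' => b - b'0',
            b'a'..=b'f' => b - b'a' + 10,
            b'A'..=b'F' => b - b'A' + 10,
            _ => INVALID,
        };
        i += 1;
    }
    lut
}

/// Evaluated by the compiler; no runtime initialization is needed.
static HEX_LUT: [u8; 256] = build_lut();

/// Decode one pair of hex digits with two table lookups.
fn decode_pair(hi: u8, lo: u8) -> Option<u8> {
    let h = HEX_LUT[hi as usize];
    let l = HEX_LUT[lo as usize];
    if h == INVALID || l == INVALID {
        None
    } else {
        Some((h << 4) | l)
    }
}
```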

MSRV

Rust 1.88, edition 2021. Stable only, no nightly features required.


License
