Pure Rust zstd codec. Levels -8 through 4 (Fast and DFast strategies). Optimized for encode throughput in transfer pipelines that need standard zstd frames at high speed.
Fastest pure-Rust zstd encoder. Consistently outperforms other pure-Rust zstd implementations on both encode and decode across all supported levels. See the benchmarks below.
Negative levels (-8 through -1). Unlocks zstd's fastest compression tiers, useful when throughput matters more than ratio. Level -8 is zrip's own addition beyond C zstd's range: raw literals only, no Huffman table build, approaching LZ4-class encode speed while still producing standard zstd frames.
Memory safety. Unsafe code is minimized and confined to small, auditable primitives modules. The paranoid feature eliminates all remaining unsafe code. See SAFETY.md.
Small codebase. ~12k lines of Rust. Levels above 4 add complexity for compression ratios that only matter in archival storage, not transfer pipelines.
Dictionary compression. COVER and FastCOVER training built in for small-message workloads (log lines, JSON records, RPC payloads).
no_std + alloc. Works in embedded and kernel contexts with the alloc feature; frame requires std.
WebAssembly. Available as @paddor/zrip on JSR. Auto-detects WASM SIMD support. 15% faster encode than C zstd compiled to WASM.
// One-shot (allocating)
let compressed = zrip::compress(input, 1)?;
let original = zrip::decompress(&compressed)?;
// One-shot into caller buffer
let n = zrip::compress_into(input, &mut output_buf, 1)?;
zrip::decompress_into(&compressed, &mut output_vec)?;
// Reusable context (amortizes table allocation across calls)
let mut ctx = zrip::CompressContext::new(1)?;
let compressed = ctx.compress(input)?;
let mut dec = zrip::DecompressContext::new();
let original = dec.decompress(&compressed)?;

use std::io::Write;
let mut enc = zrip::FrameEncoder::new(Vec::new(), 1)?;
enc.write_all(b"hello")?;
enc.write_all(b" world")?;
let compressed = enc.finish()?;
use std::io::Read;
let mut dec = zrip::FrameDecoder::new(&compressed[..]);
let mut out = String::new();
dec.read_to_string(&mut out)?;

FrameEncoder and FrameDecoder own persistent hash tables and
workspace buffers. Call reset() to start a new frame while reusing
all allocations:
let mut enc = zrip::FrameEncoder::new(Vec::new(), 1)?;
enc.write_all(b"first frame")?;
let first = enc.reset(Vec::new())?; // finishes frame, keeps buffers
enc.write_all(b"second frame")?;
let second = enc.finish()?;

The normal match finder operates within a sliding window (512 KiB at level 1).
LDM (long distance matching) finds matches at distances up to 1 << window_log bytes by sampling
positions into a separate hash table. Useful for data with long-range
repeats: log files, database dumps, source archives. See
DESIGN.md for how LDM works.
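
To make the sampling idea concrete, here is a toy sketch in plain Rust. It is not zrip's implementation and calls no zrip API: the function names, the 16-byte window, the FNV-1a hash, and the rule "sample a position when the low bits of its hash are zero" are all assumptions chosen for illustration. DESIGN.md remains the reference for what zrip actually does.

```rust
use std::collections::HashMap;

/// Length of the byte window that gets hashed (hypothetical value for this sketch).
const MIN_MATCH: usize = 16;

/// FNV-1a over a byte window; a stand-in for whatever hash a real encoder uses.
fn hash_window(window: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in window {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Returns `(position, distance)` for every sampled window that repeats an
/// earlier sampled window. A window is sampled only when the low `sample_log`
/// bits of its hash are zero, so the table holds a fraction of all positions.
/// `sample_log` must be below 64; 0 samples every position.
fn find_long_matches(data: &[u8], sample_log: u32) -> Vec<(usize, usize)> {
    let mask = (1u64 << sample_log) - 1;
    let mut table: HashMap<u64, usize> = HashMap::new();
    let mut matches = Vec::new();
    if data.len() < MIN_MATCH {
        return matches;
    }
    for pos in 0..=data.len() - MIN_MATCH {
        let window = &data[pos..pos + MIN_MATCH];
        let h = hash_window(window);
        if h & mask != 0 {
            continue; // not a sampled position: neither looked up nor stored
        }
        if let Some(&prev) = table.get(&h) {
            // Verify the bytes so a hash collision is never reported as a match.
            if &data[prev..prev + MIN_MATCH] == window {
                matches.push((pos, pos - prev));
            }
        }
        table.insert(h, pos); // remember the most recent occurrence
    }
    matches
}
```

Because the sampling decision depends only on the window's content, both copies of a repeated region are sampled at the same relative positions, so a sparse table can still find repeats that lie far apart. A real encoder would then extend each hit forward and backward and emit it as a single long match.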
let opts = zrip::Options::default().window_log(24).ldm(true);
let compressed = zrip::compress_opts(input, 1, &opts)?;

Streaming with LDM:
use std::io::Write;
let opts = zrip::Options::default().window_log(24).ldm(true);
let mut enc = zrip::FrameEncoder::with_options(Vec::new(), 1, &opts)?;
enc.write_all(input)?;
let compressed = enc.finish()?;

window_log controls the maximum match distance (24 = 16 MiB, 27 = 128 MiB).
The main cost is memory: the encoder allocates an 8 MiB LDM hash table plus
a window buffer of 1 << window_log bytes, and the decoder allocates a
window buffer of the same size (declared in the frame header). At
window_log=27 that is ~136 MiB for the encoder and 128 MiB for the decoder.
On data without long-range repeats, LDM adds overhead with no ratio benefit.
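
The memory figures can be worked out for any window_log. The sketch below is plain arithmetic, not part of zrip's API, and the function names are made up for illustration. It takes the stated sizes at face value (an 8 MiB LDM table plus the window on the encode side, the window alone on the decode side) and ignores any other allocations such as the regular match-finder tables.

```rust
/// Size of the LDM hash table quoted in the text (8 MiB).
const LDM_TABLE_BYTES: u64 = 8 << 20;

/// Window buffer size implied by `window_log` (must be below 64).
fn window_bytes(window_log: u32) -> u64 {
    1u64 << window_log
}

/// Rough encoder footprint with LDM enabled: LDM table plus window buffer.
fn encoder_bytes(window_log: u32) -> u64 {
    LDM_TABLE_BYTES + window_bytes(window_log)
}

/// Rough decoder footprint: the window buffer declared in the frame header.
fn decoder_bytes(window_log: u32) -> u64 {
    window_bytes(window_log)
}
```

Under these assumptions, window_log=24 costs about 24 MiB to encode and 16 MiB to decode, and window_log=27 costs about 136 MiB to encode and 128 MiB to decode.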
let dict = zrip::Dictionary::from_bytes(&dict_bytes)?;
let compressed = zrip::compress_with_dict(input, 1, &dict)?;
let original = zrip::decompress_with_dict(&compressed, &dict)?;

Streaming with a dictionary:
use std::io::{Read, Write};
let mut enc = zrip::FrameEncoder::with_dict(Vec::new(), 1, dict.clone())?;
enc.write_all(input)?;
let compressed = enc.finish()?;
let mut dec = zrip::FrameDecoder::with_dict(&compressed[..], dict);
let mut output = Vec::new();
dec.read_to_end(&mut output)?;

CompressContext::with_dict() and DecompressContext::with_dict()
provide the same reuse for one-shot compression.
Build a dictionary from sample data using the built-in FastCOVER trainer.
Requires the dict_builder feature.
use zrip::dict::{train_dict_fastcover, fastcover::FastCoverParams};
let samples: Vec<&[u8]> = messages.iter().map(|m| m.as_bytes()).collect();
let dict = train_dict_fastcover(&samples, 16384, FastCoverParams::default());
// Use with compress_with_dict() / decompress_with_dict()

| Feature | Default | Description |
|---|---|---|
| std | yes | Enables CompressContext, DecompressContext |
| frame | yes | Frame header parsing and writing; implies std |
| alloc | yes | no_std + heap via alloc crate |
| ldm | yes | Long distance matching for large-window compression |
| dict_builder | no | COVER/FastCOVER dictionary training |
| simd | yes | fearless_simd runtime dispatch (AVX2+BMI2, NEON, SIMD128) |
| paranoid | no | Pure safe Rust: forbid(unsafe_code) on all crates |
| nightly | no | #[optimize] attributes on hot functions |

SAFETY.md documents the unsafe boundary and catalogs C zstd memory safety bugs that Rust prevents by construction.
All codec paths are fuzz-tested (16 targets, ~10.7M executions) and verified under Miri on both x86_64 and aarch64. Fuzz targets cover round-trip correctness, cross-validation against C zstd, streaming, dictionary modes, and corruption resistance (bitflip, splice, truncate, overwrite).
DESIGN.md covers the encode/decode pipeline, performance architecture, SIMD dispatch, compile-time specialization, and divergences from C zstd.
| Level | Strategy | Hash table | Min match | Literals | Sequences |
|---|---|---|---|---|---|
| -8 | Fast | 32 KB | 5 | Raw | Predefined FSE |
| -7..-1 | Fast | 32 KB | 5 | Huffman | Predefined/custom FSE |
| 1 | Fast | 64 KB | 4 | Huffman | Predefined/custom FSE |
| 2 | Fast | 512 KB | 4 | Huffman | Predefined/custom FSE |
| 3 | DFast | 2x 1 MB | 4 | Huffman | Predefined/custom FSE |
| 4 | DFast | 2x 2 MB | 4 | Huffman | Predefined/custom FSE |
Level 0 maps to the library default (currently level 1). See DESIGN.md for parameter details and pipeline behavior per level.
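
The level table can also be read as a lookup function. The following is a standalone sketch that mirrors the table and the level 0 rule; it is not zrip's internal representation, and every type and function name in it is hypothetical. It assumes the table's KB and MB are binary units (KiB and MiB).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    Fast,
    DFast,
}

/// One row of the level table (hypothetical struct for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LevelParams {
    strategy: Strategy,
    hash_tables: u32,        // DFast uses two tables
    hash_table_bytes: usize, // size of each table
    min_match: u32,
    huffman_literals: bool,  // false means raw literals
}

const KIB: usize = 1024;
const MIB: usize = 1024 * KIB;

/// Maps a level to its table row. Level 0 is treated as the default (level 1);
/// levels outside -8..=4 return `None`.
fn level_params(level: i32) -> Option<LevelParams> {
    let level = if level == 0 { 1 } else { level };
    let (strategy, hash_tables, hash_table_bytes, min_match, huffman_literals) = match level {
        -8 => (Strategy::Fast, 1, 32 * KIB, 5, false),
        -7..=-1 => (Strategy::Fast, 1, 32 * KIB, 5, true),
        1 => (Strategy::Fast, 1, 64 * KIB, 4, true),
        2 => (Strategy::Fast, 1, 512 * KIB, 4, true),
        3 => (Strategy::DFast, 2, MIB, 4, true),
        4 => (Strategy::DFast, 2, 2 * MIB, 4, true),
        _ => return None,
    };
    Some(LevelParams {
        strategy,
        hash_tables,
        hash_table_bytes,
        min_match,
        huffman_literals,
    })
}
```

For example, level_params(0) returns the same row as level_params(1), and level_params(5) returns None because levels above 4 are out of scope.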