lz4rip

Rust LZ4 compression. 5-8% faster than C lz4 end-to-end (compress + 1 GB/s transfer + decompress, geomean across 16 corpus files). Originally derived from lz4_flex.

Add it to Cargo.toml:

[dependencies]
lz4rip = "0.2"

Performance

x86_64 details (per-file, size sweep, dictionary)

aarch64 (Apple M4)
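
The end-to-end figure quoted in the introduction combines compression time, transfer time over a fixed-rate link, and decompression time, then takes a geometric mean across files. The following is a minimal sketch of that arithmetic, not the project's benchmark harness. The function names are hypothetical, and treating 1 GB/s as 1e9 bytes per second is an assumption.

```rust
/// Seconds to compress one file, ship the compressed bytes over a link,
/// and decompress on the other side.
/// `link_bytes_per_sec` is assumed to be 1e9 for a "1 GB/s" link.
fn end_to_end_secs(
    compress_secs: f64,
    compressed_bytes: u64,
    link_bytes_per_sec: f64,
    decompress_secs: f64,
) -> f64 {
    compress_secs + compressed_bytes as f64 / link_bytes_per_sec + decompress_secs
}

/// Geometric mean of positive per-file ratios (for example, C lz4 time
/// divided by lz4rip time for each corpus file).
fn geomean(ratios: &[f64]) -> f64 {
    let log_sum: f64 = ratios.iter().map(|r| r.ln()).sum();
    (log_sum / ratios.len() as f64).exp()
}
```

Under this model a compressor with a slightly worse ratio can still win end-to-end if the time it saves in compression and decompression exceeds the extra transfer time.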

Block format

use lz4rip::block::{compress, decompress_into};

let input = b"Hello people, what's up?";
let compressed = compress(input);

let mut output = vec![0u8; input.len()];
let n = decompress_into(&compressed, &mut output).unwrap();
assert_eq!(&output[..n], input);

The _into variants write into a caller-provided buffer. The plain variants allocate.

// One-shot (allocating)
fn compress(input: &[u8]) -> Vec<u8>;
fn decompress(input: &[u8], uncompressed_size: usize) -> Result<Vec<u8>>;

// One-shot (into caller buffer)
fn compress_into(input: &[u8], output: &mut [u8]) -> Result<usize>;
fn decompress_into(input: &[u8], output: &mut [u8]) -> Result<usize>;
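
To make the caller-provided-buffer contract concrete, here is a minimal decoder for the standard LZ4 block format written with safe slice operations only. It is an illustrative sketch, not lz4rip's implementation, which is considerably more optimized. The names decode_block and BlockError are hypothetical and exist only for this example.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum BlockError {
    Truncated,
    BadOffset,
    OutputTooSmall,
}

/// Reads a length: a 4-bit nibble, extended by extra bytes when it is 15.
/// Each extra byte is added; a byte below 255 ends the run.
fn read_len(input: &[u8], pos: &mut usize, nibble: usize) -> Result<usize, BlockError> {
    let mut len = nibble;
    if nibble == 15 {
        loop {
            let b = *input.get(*pos).ok_or(BlockError::Truncated)?;
            *pos += 1;
            len += b as usize;
            if b != 255 {
                break;
            }
        }
    }
    Ok(len)
}

/// Decodes one LZ4 block into `output` and returns the number of bytes written.
fn decode_block(input: &[u8], output: &mut [u8]) -> Result<usize, BlockError> {
    let (mut ip, mut op) = (0usize, 0usize);
    while ip < input.len() {
        // Token: high nibble is the literal length, low nibble the match length.
        let token = input[ip] as usize;
        ip += 1;

        let lit_len = read_len(input, &mut ip, token >> 4)?;
        let lits = input.get(ip..ip + lit_len).ok_or(BlockError::Truncated)?;
        output
            .get_mut(op..op + lit_len)
            .ok_or(BlockError::OutputTooSmall)?
            .copy_from_slice(lits);
        ip += lit_len;
        op += lit_len;

        // The last sequence in a block carries literals only.
        if ip == input.len() {
            break;
        }

        // Two-byte little-endian offset back into the already-decoded output.
        let off = input.get(ip..ip + 2).ok_or(BlockError::Truncated)?;
        let offset = u16::from_le_bytes([off[0], off[1]]) as usize;
        ip += 2;
        if offset == 0 || offset > op {
            return Err(BlockError::BadOffset);
        }

        // Stored match length is biased by the 4-byte minimum match.
        let match_len = read_len(input, &mut ip, token & 0x0F)? + 4;
        if op + match_len > output.len() {
            return Err(BlockError::OutputTooSmall);
        }
        // Byte-by-byte copy so overlapping matches (offset < length) repeat correctly.
        for i in op..op + match_len {
            output[i] = output[i - offset];
        }
        op += match_len;
    }
    Ok(op)
}
```

Every read and write goes through a bounds check, so malformed input produces an error instead of touching memory outside the two slices.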

Dictionary compression

Pre-seed the compressor and decompressor with shared context for better ratios on small messages (e.g. JSON records, log lines).

use lz4rip::block::{Compressor, Decompressor, get_maximum_output_size};

let dict = b"shared context bytes...";
let mut comp = Compressor::with_dict(dict);
let decomp = Decompressor::with_dict(dict);

let input = b"context bytes appear in messages";
let mut buf = vec![0u8; get_maximum_output_size(input.len())];
let n = comp.compress_into(input, &mut buf).unwrap();

let output = decomp.decompress(&buf[..n], input.len()).unwrap();
assert_eq!(&output[..], input);
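
In the LZ4 format, a dictionary behaves as if it sat immediately before the data being decoded, so a match offset may reach past the start of the output and into the dictionary. The sketch below shows that lookup in isolation. The function copy_match_with_dict is hypothetical and is not part of the lz4rip API.

```rust
/// Appends `len` bytes to `out`, copied from `offset` bytes behind the
/// current write position. Positions before the start of `out` resolve
/// into `dict`, which is treated as the data that preceded the output.
/// Returns None if the offset is zero or reaches before the dictionary.
fn copy_match_with_dict(dict: &[u8], out: &mut Vec<u8>, offset: usize, len: usize) -> Option<()> {
    if offset == 0 || offset > dict.len() + out.len() {
        return None;
    }
    for _ in 0..len {
        let byte = if offset > out.len() {
            // Source byte still lies inside the dictionary.
            dict[dict.len() - (offset - out.len())]
        } else {
            // Source byte lies in the output produced so far.
            out[out.len() - offset]
        };
        out.push(byte);
    }
    Some(())
}
```

This is why small messages benefit: the first bytes of a message can be encoded as matches against the dictionary instead of as literals.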

Frame format

The frame format (the frame feature, enabled by default) wraps block compression in the standard LZ4 frame container, adding checksums, content size, and streaming support. FrameEncoder and FrameDecoder implement Write and Read, respectively.

use lz4rip::frame::{FrameEncoder, FrameDecoder};
use std::io::{Write, Read};

// Compress
// FrameEncoder::with_dictionary(wtr, dict, dict_id) for dictionary support
let mut encoder = FrameEncoder::new(Vec::new());
encoder.write_all(b"Hello frame format!").unwrap();
let compressed = encoder.finish().unwrap();

// Decompress
let mut decoder = FrameDecoder::new(&compressed[..]);
let mut output = String::new();
decoder.read_to_string(&mut output).unwrap();
assert_eq!(output, "Hello frame format!");
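
For orientation, a standard LZ4 frame starts with a 4-byte little-endian magic number (0x184D2204) followed by a FLG byte whose bits announce the optional checksums, content size, and dictionary ID. The sketch below reads just those five bytes. It follows the LZ4 frame specification rather than lz4rip's internals, and FrameFlags and parse_frame_start are hypothetical names.

```rust
/// Magic number that opens every standard LZ4 frame (stored little-endian).
const LZ4F_MAGIC: u32 = 0x184D_2204;

/// Options announced by the FLG byte of the frame descriptor.
#[derive(Debug, PartialEq)]
struct FrameFlags {
    block_independence: bool,
    block_checksum: bool,
    content_size: bool,
    content_checksum: bool,
    dict_id: bool,
}

/// Checks the magic number and decodes the FLG byte.
/// Returns None for short input, a wrong magic number, or an unknown version.
fn parse_frame_start(buf: &[u8]) -> Option<FrameFlags> {
    if buf.len() < 5 {
        return None;
    }
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != LZ4F_MAGIC {
        return None;
    }
    let flg = buf[4];
    // The top two bits hold the format version, which must be 01.
    if flg >> 6 != 0b01 {
        return None;
    }
    Some(FrameFlags {
        block_independence: flg & 0x20 != 0,
        block_checksum: flg & 0x10 != 0,
        content_size: flg & 0x08 != 0,
        content_checksum: flg & 0x04 != 0,
        dict_id: flg & 0x01 != 0,
    })
}
```

A check like this can be handy for telling framed data apart from raw blocks before choosing between FrameDecoder and the block API.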

Design

Divergences from C lz4 and lz4_flex that explain the performance difference. See DESIGN.md for details.

Aggressive skip acceleration (8 vs C lz4's 64 misses before stepping)
Generational hash table (16-bit for small inputs, 32-bit for large)
5-byte PRIME5 hash (vs C lz4's 4-byte KNUTH)
Compile-time specialization over hash table type, dict mode, and sink type
No HC/OPT/MID. Use zstd for ratio.
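
Two of the points above can be sketched in a few lines. The hash constants are the ones C lz4 uses for its 4-byte and 5-byte hashes. The step formula mirrors C lz4's miss counter, and applying it with a trigger of 3 (8 misses) is an assumption about how lz4rip realizes the first bullet. See DESIGN.md for the actual implementation. All function names here are hypothetical.

```rust
/// Multiplier of C lz4's 5-byte hash on little-endian 64-bit targets.
const PRIME5: u64 = 889_523_592_379;
/// Knuth multiplicative constant used by C lz4's 4-byte hash.
const KNUTH: u32 = 2_654_435_761;

/// 4-byte hash: multiply and keep the top `hash_log` bits (1..=32).
fn hash4(seq: u32, hash_log: u32) -> u32 {
    seq.wrapping_mul(KNUTH) >> (32 - hash_log)
}

/// 5-byte hash of an 8-byte little-endian load. The left shift by 24
/// discards the top three bytes, so only the low five bytes contribute.
/// `hash_log` must be in 1..=32.
fn hash5(seq: u64, hash_log: u32) -> u32 {
    ((seq << 24).wrapping_mul(PRIME5) >> (64 - hash_log)) as u32
}

/// Distance to advance after `misses` consecutive failed match probes.
/// The counter starts at 1 << skip_trigger, so the step stays at 1 for the
/// first 2^skip_trigger misses and grows by 1 for each further 2^skip_trigger.
/// skip_trigger = 6 gives C lz4's 64 misses; 3 gives the 8 quoted above.
fn step_after(misses: u32, skip_trigger: u32) -> u32 {
    ((1 << skip_trigger) + misses) >> skip_trigger
}
```

A smaller trigger makes the search give up on incompressible regions sooner, trading a little ratio for speed.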

Safety

All compression and decompression logic is #[forbid(unsafe_code)]. See SAFETY.md for the unsafe boundary details and a catalog of C lz4 memory safety bugs that Rust prevents by construction.

Development

See DEVELOPMENT.md for benchmarking, fuzzing, and feature flag details.
