Fast, ergonomic compression library for Zig. Unified API across 7 codecs: LZ4, Snappy, Zstd, Gzip, Brotli, Zlib, Deflate. Pure Zig implementations with SIMD optimizations. Zero system dependencies—all C libraries vendored. Streaming, dictionaries, and archive support included.

compressionz

A fast, ergonomic compression library for Zig with a unified API across multiple codecs.

36 GB/s compression with LZ4 | 12 GB/s with Zstd | Zero dependencies | Pure Zig + vendored C


Why compressionz?

  • One API for everything — Learn once, use any codec. Switch from Gzip to Zstd with one line change.
  • Blazing fast — Pure Zig LZ4 and Snappy with SIMD optimizations. 36+ GB/s compression throughput.
  • Zero system dependencies — All C libraries vendored. No brew install, no apt-get. Just zig build.
  • Production ready — Streaming, dictionaries, checksums, decompression bomb protection.
  • Archive support — Read and write ZIP and TAR archives with simple APIs.

Quick Start

const cz = @import("compressionz");

// Compress
const compressed = try cz.compress(.zstd, data, allocator);
defer allocator.free(compressed);

// Decompress
const original = try cz.decompress(.zstd, compressed, allocator);
defer allocator.free(original);

That's it. Same API for all codecs — just change .zstd to .lz4, .gzip, .snappy, or .brotli.


Installation

build.zig.zon

.dependencies = .{
    .compressionz = .{
        .url = "https://github.com/NerdMeNot/compressionz/archive/refs/tags/v1.0.0-zig0.15.2.tar.gz",
        .hash = "...", // Run zig build to get the hash
    },
},

Or for local development:

.dependencies = .{
    .compressionz = .{
        .path = "../compressionz",
    },
},

build.zig

const compressionz = b.dependency("compressionz", .{
    .target = target,
    .optimize = optimize,
});
exe.root_module.addImport("compressionz", compressionz.module("compressionz"));

Codec Comparison

| Codec     | Compress  | Decompress | Ratio | Best For                      |
|-----------|-----------|------------|-------|-------------------------------|
| LZ4 Raw   | 36.6 GB/s | 8.1 GB/s   | 99.5% | Maximum speed, internal use   |
| Snappy    | 31.6 GB/s | 9.2 GB/s   | 95.3% | Real-time, message passing    |
| Zstd      | 12.0 GB/s | 11.6 GB/s  | 99.9% | General purpose (recommended) |
| LZ4 Frame | 4.8 GB/s  | 3.8 GB/s   | 99.3% | File storage with checksums   |
| Gzip      | 2.4 GB/s  | 2.4 GB/s   | 99.2% | HTTP, cross-platform          |
| Brotli    | 1.3 GB/s  | 1.9 GB/s   | 99.9% | Static web assets             |

Recommendation: Use Zstd unless you have specific requirements. It offers the best balance of speed and compression.

See BENCHMARKS.md for detailed performance analysis.


Supported Codecs

| Codec   | Implementation | Streaming | Dictionary | Auto-Detect | Checksum |
|---------|----------------|-----------|------------|-------------|----------|
| lz4     | Pure Zig       | Yes       | No         | Yes         | Yes      |
| lz4_raw | Pure Zig       | No        | No         | No          | No       |
| snappy  | Pure Zig       | No        | No         | Yes         | No       |
| zstd    | Vendored C     | Yes       | Yes        | Yes         | Yes      |
| gzip    | Vendored C     | Yes       | No         | Yes         | Yes      |
| zlib    | Vendored C     | Yes       | Yes        | Yes         | Yes      |
| deflate | Vendored C     | Yes       | Yes        | No          | No       |
| brotli  | Vendored C     | Yes       | No         | No          | No       |

API Reference

Basic Compression

const cz = @import("compressionz");

// Compress with default settings
const compressed = try cz.compress(.zstd, data, allocator);
defer allocator.free(compressed);

// Compress with options (a distinct name avoids shadowing the result above)
const compressed_best = try cz.compressWithOptions(.zstd, data, allocator, .{
    .level = .best,  // .fastest, .fast, .default, .better, .best
});
defer allocator.free(compressed_best);

Basic Decompression

// Decompress
const data = try cz.decompress(.zstd, compressed, allocator);
defer allocator.free(data);

// Decompress with safety limit (prevents decompression bombs)
const bounded = try cz.decompressWithOptions(.zstd, compressed, allocator, .{
    .max_output_size = 100 * 1024 * 1024,  // 100 MB limit
});
defer allocator.free(bounded);

Compression Levels

| Level    | Speed    | Ratio  | Use Case                             |
|----------|----------|--------|--------------------------------------|
| .fastest | Fastest  | Lowest | CPU-bound, ratio doesn't matter      |
| .fast    | Fast     | Good   | Real-time compression                |
| .default | Balanced | Good   | Recommended for most uses            |
| .better  | Slower   | Better | Storage, archival                    |
| .best    | Slowest  | Best   | Static content, one-time compression |

LZ4 Raw (Special Case)

LZ4 raw block format requires the original size for decompression:

// Compress
const compressed = try cz.compress(.lz4_raw, data, allocator);
defer allocator.free(compressed);

// Decompress — must provide expected_size
const original = try cz.decompressWithOptions(.lz4_raw, compressed, allocator, .{
    .expected_size = original_length,  // Required!
});
defer allocator.free(original);

Store the original size alongside the compressed data when using LZ4 raw.
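One common way to do that is to frame the raw block with a small length header. The sketch below is illustrative only; it assumes the `cz.compress`/`cz.decompressWithOptions` API shown above, and `packLz4Raw`/`unpackLz4Raw` are hypothetical helpers, not part of the library:

```zig
const std = @import("std");
const cz = @import("compressionz");

// Hypothetical helper: prefix the raw block with an 8-byte little-endian
// length so the original size travels with the compressed data.
fn packLz4Raw(allocator: std.mem.Allocator, data: []const u8) ![]u8 {
    const block = try cz.compress(.lz4_raw, data, allocator);
    defer allocator.free(block);

    const out = try allocator.alloc(u8, 8 + block.len);
    std.mem.writeInt(u64, out[0..8], @intCast(data.len), .little);
    @memcpy(out[8..], block);
    return out;
}

// Hypothetical helper: read the header back and hand it to the codec.
fn unpackLz4Raw(allocator: std.mem.Allocator, framed: []const u8) ![]u8 {
    const original_len = std.mem.readInt(u64, framed[0..8], .little);
    return cz.decompressWithOptions(.lz4_raw, framed[8..], allocator, .{
        .expected_size = @intCast(original_len),
    });
}
```

If you need checksums or self-describing frames rather than a hand-rolled header, the LZ4 frame format (`.lz4`) already provides both.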


Streaming API

Process large files without loading everything into memory:

Streaming Decompression

const cz = @import("compressionz");

// Decompress from file
var file = try std.fs.cwd().openFile("data.gz", .{});
defer file.close();

var decompressor = try cz.decompressor(.gzip, allocator, file.reader());
defer decompressor.deinit();

// Read decompressed data
const data = try decompressor.reader().readAllAlloc(allocator, max_size);
defer allocator.free(data);

Streaming Compression

var file = try std.fs.cwd().createFile("output.gz", .{});
defer file.close();

var compressor = try cz.compressor(.gzip, allocator, file.writer(), .{});
defer compressor.deinit();

// Write data (automatically compressed)
try compressor.writer().writeAll(data);
try compressor.finish();  // Flush and finalize

Streaming supported: .gzip, .zlib, .deflate, .zstd, .brotli, .lz4


Dictionary Compression

Dramatically improve compression ratios for small data with known patterns:

const cz = @import("compressionz");

// Dictionary containing common patterns in your data
const dictionary = @embedFile("my_dictionary.bin");

// Compress with dictionary
const compressed = try cz.compressWithOptions(.zstd, data, allocator, .{
    .dictionary = dictionary,
});
defer allocator.free(compressed);

// Decompress with same dictionary
const original = try cz.decompressWithOptions(.zstd, compressed, allocator, .{
    .dictionary = dictionary,  // Must match compression dictionary!
});
defer allocator.free(original);

Dictionary supported: .zstd, .zlib, .deflate

Tip: Train a dictionary on representative samples of your data for best results.
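For example, a zstd dictionary can be trained with the upstream `zstd` command-line tool (not part of this library; the paths and size below are illustrative), then embedded as shown above:

```shell
# Train a 16 KiB dictionary from representative sample files.
# Requires the zstd CLI to be installed.
zstd --train samples/*.json -o my_dictionary.bin --maxdict=16384
```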


Zero-Copy Compression

Compress into pre-allocated buffers for zero-allocation hot paths:

const cz = @import("compressionz");

var output_buffer: [65536]u8 = undefined;

// Compress into buffer
const compressed = try cz.compressInto(.lz4, data, &output_buffer, .{});

// Decompress into buffer
const original = try cz.decompressInto(.lz4, compressed, &output_buffer);

Zero-copy supported: .lz4, .lz4_raw, .snappy


Codec Detection

Automatically detect compression format from magic bytes:

const cz = @import("compressionz");

if (cz.Codec.detect(data)) |codec| {
    const decompressed = try cz.decompress(codec, data, allocator);
    defer allocator.free(decompressed);
    // Process decompressed data...
} else {
    // Unknown or uncompressed data
}

Auto-detection supported: .lz4, .zstd, .gzip, .zlib, .snappy

Codec Capabilities

Query codec features at runtime:

const codec: cz.Codec = .zstd;

if (codec.supportsStreaming()) { /* ... */ }
if (codec.supportsDictionary()) { /* ... */ }
if (codec.requiresExpectedSize()) { /* ... */ }
if (codec.hasBuiltinChecksum()) { /* ... */ }
if (codec.isFramed()) { /* ... */ }

Archive Support

Read and write ZIP and TAR archives:

Extract Archives

const cz = @import("compressionz");

// Extract ZIP
const files = try cz.archive.extractZip(allocator, zip_data);
defer {
    for (files) |f| {
        allocator.free(f.name);
        allocator.free(f.data);
    }
    allocator.free(files);
}

for (files) |file| {
    std.debug.print("{s}: {} bytes\n", .{ file.name, file.data.len });
}

// Extract TAR (same API)
const tar_files = try cz.archive.extractTar(allocator, tar_data);

Create Archives

const cz = @import("compressionz");

const files = [_]cz.archive.FileEntry{
    .{ .name = "hello.txt", .data = "Hello, World!" },
    .{ .name = "data.json", .data = "{\"key\": \"value\"}" },
};

// Create ZIP
const zip_data = try cz.archive.createZip(allocator, &files);
defer allocator.free(zip_data);

// Create TAR
const tar_data = try cz.archive.createTar(allocator, &files);
defer allocator.free(tar_data);

Manual Archive Iteration

For more control, use the Reader/Writer APIs directly:

const cz = @import("compressionz");

// Read ZIP file
var fbs = std.io.fixedBufferStream(zip_data);
var reader = try cz.archive.zip.Reader(@TypeOf(&fbs)).init(allocator, &fbs);
defer reader.deinit();

while (try reader.next()) |entry| {
    if (!entry.is_directory) {
        const data = try entry.readAll(allocator);
        defer allocator.free(data);
        // Process file...
    }
}

Error Handling

All operations return a unified error type:

const cz = @import("compressionz");

const data = cz.decompress(.zstd, compressed, allocator) catch |err| {
    switch (err) {
        error.InvalidData => {
            // Corrupted or invalid compressed data
        },
        error.ChecksumMismatch => {
            // Data integrity check failed
        },
        error.OutputTooLarge => {
            // Exceeds max_output_size (decompression bomb protection)
        },
        error.OutputTooSmall => {
            // Buffer too small for compressInto/decompressInto
        },
        error.UnexpectedEof => {
            // Input data truncated
        },
        error.UnsupportedFeature => {
            // Codec doesn't support requested feature
        },
        error.OutOfMemory => {
            // Allocation failed
        },
        else => {},
    }
    return err;
};
defer allocator.free(data);

Direct Codec Access

For advanced use cases, access codec implementations directly:

const cz = @import("compressionz");

// LZ4 frame with fine-grained options
const compressed = try cz.lz4.frame.compress(data, allocator, .{
    .level = .fast,
    .content_checksum = true,
    .content_size = data.len,
});

// LZ4 raw block (maximum speed)
const block = try cz.lz4.block.compress(data, allocator);

// Snappy
const snappy_out = try cz.snappy.compress(data, allocator);

// Zstd, Gzip, Brotli with specific levels
const zstd_out = try cz.zstd.compress(data, allocator, .fast);
const gzip_out = try cz.gzip.compress(data, allocator, .default);
const brotli_out = try cz.brotli.compress(data, allocator, .best);

Implementation Details

| Codec     | Source              | SIMD    | Notes                            |
|-----------|---------------------|---------|----------------------------------|
| LZ4       | Pure Zig            | Yes     | 16-byte vectorized match finding |
| Snappy    | Pure Zig            | Yes     | 16-byte vectorized match finding |
| Zstd      | Vendored zstd 1.5.7 | Yes     | SSE2/AVX2/NEON support           |
| Gzip/Zlib | Vendored zlib 1.3.1 | Partial |                                  |
| Brotli    | Vendored brotli     | Partial |                                  |

SIMD Optimizations

The pure Zig codecs use explicit SIMD via @Vector:

// 16-byte vectorized comparison for match extension
const v1: @Vector(16, u8) = src[pos..][0..16].*;
const v2: @Vector(16, u8) = src[match_pos..][0..16].*;
const eq = v1 == v2;
const mask = @as(u16, @bitCast(eq));

This enables competitive performance with hand-optimized C implementations.
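As an illustration of how such a mask is typically consumed (a sketch of the common pattern, not necessarily this library's exact code), the match length falls out of a count-trailing-zeros on the inverted mask:

```zig
// Each set bit in `mask` marks a pair of equal bytes, so the lowest
// zero bit marks the first mismatch within the 16-byte window.
const mismatch = ~mask;
const matched: usize = if (mismatch == 0) 16 else @ctz(mismatch);
// `matched` bytes extend the current match; if all 16 matched, the
// loop advances and compares the next 16-byte window.
```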


Building

# Build library
zig build

# Run tests
zig build test

# Run benchmarks
zig build bench -Doptimize=ReleaseFast

Performance Tips

  1. Use Zstd for general purpose — Best balance of speed and ratio
  2. Use LZ4 Raw for maximum speed — 36+ GB/s, but requires tracking size
  3. Use dictionary compression for small data — Dramatically improves ratios
  4. Use streaming for large files — Avoids loading entire file into memory
  5. Use .default level — The difference between levels is often minimal
  6. Use max_output_size — Protects against decompression bombs

Comparison with Alternatives

| Feature            | compressionz   | std.compress | zstd-c |
|--------------------|----------------|--------------|--------|
| Unified API        | Yes            | No           | No     |
| LZ4                | Yes (Pure Zig) | No           | No     |
| Snappy             | Yes (Pure Zig) | No           | No     |
| Zstd               | Yes            | No           | Yes    |
| Gzip               | Yes            | Yes          | No     |
| Brotli             | Yes            | No           | No     |
| Streaming          | Yes            | Yes          | Yes    |
| Dictionary         | Yes            | No           | Yes    |
| Zero dependencies  | Yes            | Yes          | No     |
| Archives (ZIP/TAR) | Yes            | No           | No     |

License

Apache 2.0 — See LICENSE

This project includes vendored third-party libraries that retain their original licenses:

| Library | License      | Copyright                    |
|---------|--------------|------------------------------|
| zstd    | BSD-3-Clause | Meta Platforms, Inc.         |
| zlib    | zlib         | Jean-loup Gailly, Mark Adler |
| brotli  | MIT          | Google Inc.                  |

All vendored licenses are permissive and compatible with Apache 2.0. See NOTICE for full attribution.


Contributing

Contributions are welcome! Please:

  1. Run zig build test before submitting
  2. Add tests for new functionality
  3. Update documentation as needed
  4. Run benchmarks if performance-related

Acknowledgments

  • LZ4 — Original LZ4 algorithm by Yann Collet
  • Snappy — Original Snappy algorithm by Google
  • Zstandard — Zstd library by Facebook
  • zlib — zlib library by Jean-loup Gailly and Mark Adler
  • Brotli — Brotli library by Google
