velopack/ripzip-rs
ripzip

A multi-threaded zip/unzip library and CLI for Rust.

Features

  • Parallel compression -- files are compressed concurrently with rayon + flate2 (zlib-rs), then assembled into a valid ZIP archive
  • Parallel extraction -- files are decompressed concurrently from mmap'd archives with zero-copy reads
  • CRC32 on every file -- SIMD-accelerated (crc32fast), validated on every extraction
  • Atomic archive writes -- compression writes to a tempfile, fsyncs, then renames; a crash mid-write never produces a corrupt archive
  • Path traversal prevention -- rejects ../ attacks, absolute paths, and Windows drive letters before any extraction begins
  • ZIP64 support -- automatic for >65,535 entries, >4 GB files, or >4 GB offsets
  • Zstd compression -- Zstandard (method 93) as an alternative to DEFLATE, with full interop
  • Incompressible data detection -- falls back to Stored when compression would inflate the data
  • Windows long path support -- \\?\ extended-length paths for paths exceeding MAX_PATH (260 chars)
  • Adaptive memory management -- dynamically sizes the in-memory compression threshold based on available system RAM (up to 400 MB budget), so small files stay in memory while large files stream through temp files
  • Deterministic output -- archives are byte-identical across runs (entries sorted by path)
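As an illustration of the incompressible-data detection above, the decision reduces to comparing the compressed and original sizes and falling back to Stored when compression would inflate the data. A minimal sketch (the `Method` enum and `choose_method` function are illustrative names, not ripzip's API):

```rust
// Illustrative sketch of incompressible-data detection: if compression
// output would be no smaller than the input, store the bytes raw instead.
// `Method` and `choose_method` are hypothetical names, not ripzip's API.

#[derive(Debug, PartialEq)]
enum Method {
    Stored,
    Deflate,
}

fn choose_method(original_len: u64, compressed_len: u64) -> Method {
    if compressed_len >= original_len {
        Method::Stored // writing the raw bytes is smaller (or equal)
    } else {
        Method::Deflate
    }
}

fn main() {
    // Already-compressed data (e.g. a PNG) typically inflates under DEFLATE.
    assert_eq!(choose_method(1_000, 1_024), Method::Stored);
    // Text usually shrinks, so DEFLATE is kept.
    assert_eq!(choose_method(1_000, 300), Method::Deflate);
}
```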

Benchmarks

ripzip (parallel, rayon + flate2/zlib-rs) vs the zip crate (single-threaded, miniz_oxide). Both at DEFLATE compression level 1. Best of 5 runs, filesystem caches warm.

CPU: Intel Core i7-14700K (20 cores / 28 threads) -- Windows 11 -- NVMe SSD

Compression

| Scenario | Files | Data | ripzip | zip crate | Speedup |
|---|---|---|---|---|---|
| 50k small source files | 50,000 | 14 MB | 378ms (38 MB/s) | 2.40s (6 MB/s) | 6.3x |
| 500 x 10 MB log files | 500 | 5 GB | 488ms (10.2 GB/s) | 2.20s (2.3 GB/s) | 4.5x |
| 100 x 50 MB binary blobs | 100 | 5 GB | 214ms (23.4 GB/s) | 2.29s (2.2 GB/s) | 10.7x |
| Mixed (10k src + 1 GB assets) | 10,050 | 1 GB | 531ms (1.9 GB/s) | 1.04s (967 MB/s) | 2.0x |

Extraction

| Scenario | Files | Data | ripzip | zip crate | Speedup |
|---|---|---|---|---|---|
| 50k small source files | 50,000 | 14 MB | 27.47s (1 MB/s) | 33.73s (0 MB/s) | 1.2x |
| 500 x 10 MB log files | 500 | 5 GB | 1.13s (4.4 GB/s) | 3.68s (1.4 GB/s) | 3.3x |
| 100 x 50 MB binary blobs | 100 | 5 GB | 1.18s (4.2 GB/s) | 4.45s (1.1 GB/s) | 3.8x |
| Mixed (10k src + 1 GB assets) | 10,050 | 1 GB | 4.24s (237 MB/s) | 6.20s (162 MB/s) | 1.5x |

Takeaway: ripzip compresses 2.0--10.7x faster and extracts 1.2--3.8x faster across all workloads. Speedup scales with individual file size -- the 5 GB binary blob corpus sees the biggest compression wins (10.7x) because all 28 threads are saturated with real DEFLATE work on large chunks. The 50k small files scenario is filesystem-metadata-bound, where parallelism still helps but the per-file overhead floor is higher.

Archive sizes are identical between the two -- same DEFLATE algorithm, same compression level.

Zstd vs Deflate (ripzip, both parallel, level 1)

| Scenario | Files | Data | Deflate | Zstd | Zstd speedup | Deflate archive | Zstd archive |
|---|---|---|---|---|---|---|---|
| 50k small source files | 50,000 | 14 MB | 378ms (38 MB/s) | 1.10s (13 MB/s) | 0.3x | 10 MB | 10 MB |
| 500 x 10 MB log files | 500 | 5 GB | 488ms (10.2 GB/s) | 213ms (23.5 GB/s) | 2.3x | 62 MB | 592 KB |
| 100 x 50 MB binary blobs | 100 | 5 GB | 214ms (23.4 GB/s) | 163ms (30.7 GB/s) | 1.3x | 64 MB | 495 KB |
| Mixed (10k src + 1 GB assets) | 10,050 | 1 GB | 531ms (1.9 GB/s) | 645ms (1.6 GB/s) | 0.8x | 36 MB | 24 MB |

Takeaway: Zstd achieves dramatically better compression ratios on large files (100x smaller archives for logs/blobs) while compressing as fast or faster on those workloads. On many small files, Deflate wins because Zstd's per-file initialization cost is higher. Extraction speeds are nearly identical -- both are I/O-bound at this level of parallelism.

Run benchmarks yourself

cargo bench -p ripzip

Library Usage

Add to your Cargo.toml:

[dependencies]
ripzip = { path = "ripzip" }

Then, in your code:

use std::path::Path;
use ripzip::{CompressionMethod, NoProgress, compress_directory, extract_to_directory};

// Compress a directory

compress_directory(
    Path::new("my_project/"),
    Path::new("my_project.zip"),
    1,                            // compression level (1=fastest, 9=smallest)
    CompressionMethod::Deflate,   // or CompressionMethod::Zstd
    &NoProgress,                  // or implement ProgressReporter for progress bars
)?;

// Extract an archive
extract_to_directory(
    Path::new("my_project.zip"),
    Path::new("output/"),
    &NoProgress,
)?;

Progress Reporting

Implement the ProgressReporter trait for real-time progress updates:

use ripzip::ProgressReporter;

struct MyReporter;

impl ProgressReporter for MyReporter {
    fn start(&self, total_files: u64, total_bytes: u64) {
        println!("Processing {total_files} files ({total_bytes} bytes)");
    }

    fn progress(&self, bytes_delta: u64) {
        // Called from worker threads -- use atomics for aggregation.
        // bytes_delta is uncompressed bytes just processed.
    }

    fn finish(&self) {
        println!("Done!");
    }
}

Progress callbacks fire at chunk granularity (256 KB), so even single large files show smooth progress.
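Since progress is called concurrently from worker threads, a reporter that aggregates with an AtomicU64 is a natural fit. The sketch below re-declares a minimal trait of the same shape so it compiles stand-alone; in real code you would implement ripzip's ProgressReporter instead:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Minimal re-declaration of the trait shape shown above, so this sketch
// is self-contained; real code implements ripzip's ProgressReporter.
trait ProgressReporter: Sync {
    fn progress(&self, bytes_delta: u64);
}

struct AtomicReporter {
    done: AtomicU64, // uncompressed bytes processed so far
}

impl ProgressReporter for AtomicReporter {
    fn progress(&self, bytes_delta: u64) {
        // Relaxed ordering is enough: we only need an eventually-consistent
        // running total for display, not synchronization between threads.
        self.done.fetch_add(bytes_delta, Ordering::Relaxed);
    }
}

fn main() {
    let reporter = AtomicReporter { done: AtomicU64::new(0) };
    // Simulate four worker threads each reporting ten 256 KB chunks.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..10 {
                    reporter.progress(256 * 1024);
                }
            });
        }
    });
    assert_eq!(reporter.done.load(Ordering::Relaxed), 4 * 10 * 256 * 1024);
}
```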

CLI

cargo install --path ripzip-cli
ripzip compress <DIR> -o <FILE> [--level 1-9] [--method deflate|zstd] [--quiet]
ripzip extract <ARCHIVE> [-o <DIR>] [--quiet]
ripzip list <ARCHIVE> [--verbose]

Aliases: c, x, l.

$ ripzip compress my_project/ -o my_project.zip --method zstd
 [00:00:00] [####################################] 142.3MB/142.3MB (1.8GB/s)
Created my_project.zip

$ ripzip extract my_project.zip -o output/
 [00:00:00] [####################################] 142.3MB/142.3MB (3.2GB/s)
Extracted to output/

$ ripzip list my_project.zip --verbose
Compressed   Original     Method   Name
------------------------------------------------------------
1234         5678         Deflate  src/main.rs
0            0            Stored   assets/

210 files, 142300000 bytes uncompressed

Safety Guarantees

  1. CRC32 on every file -- computed during compression, validated during extraction. Tampered or corrupt archives are rejected. On CRC mismatch during extraction, the corrupt output file is deleted.
  2. Atomic archive writes -- the archive is assembled into a tempfile, fsynced, then renamed. A crash or power loss mid-compression never produces a corrupt .zip file. (Extraction writes directly to destination for performance -- the archive is the source of truth and can always be re-extracted.)
  3. Path traversal prevention -- all archive paths are validated before any extraction. Paths containing .., absolute paths, and Windows drive letters are rejected.
  4. ZIP64 -- automatically used when entry counts exceed 65,535, file sizes exceed 4 GB, or offsets exceed 4 GB.
  5. fsync before rename -- data is flushed to disk before the atomic rename, ensuring durability.
  6. Incompressible data detection -- if compression produces output larger than the input, the file is stored uncompressed.
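Guarantee 3 amounts to a pure check over every archive path before any bytes hit the disk. A simplified version of such a check (an illustrative sketch, not ripzip's actual validation code) might look like:

```rust
// Simplified path-traversal check, in the spirit of guarantee 3 above.
// Illustrative sketch only -- not ripzip's actual validation code.
fn is_safe_entry_path(path: &str) -> bool {
    // Reject absolute paths (Unix and Windows forms).
    if path.starts_with('/') || path.starts_with('\\') {
        return false;
    }
    // Reject Windows drive letters like "C:...".
    let bytes = path.as_bytes();
    if bytes.len() >= 2 && bytes[1] == b':' && bytes[0].is_ascii_alphabetic() {
        return false;
    }
    // Reject any ".." component, on either separator.
    !path
        .split(|c| c == '/' || c == '\\')
        .any(|component| component == "..")
}

fn main() {
    assert!(is_safe_entry_path("src/main.rs"));
    assert!(!is_safe_entry_path("../etc/passwd"));
    assert!(!is_safe_entry_path("/etc/passwd"));
    assert!(!is_safe_entry_path("C:\\Windows\\system32"));
    assert!(!is_safe_entry_path("a/../../b"));
}
```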

Architecture

                     COMPRESSION PIPELINE

  walkdir ──> Vec<FileEntry> ──> rayon::par_iter ──> Vec<CompressedEntry>
                                      |
                              per-file: read + CRC32 + DEFLATE/Zstd
                              (adaptive threshold: in memory or via temp file)
                                      |
                                      v
                            sequential ZIP assembly
                        (local headers + data + central dir + EOCD)
                                      |
                                  fsync + rename
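The fsync + rename tail of the pipeline can be sketched with the standard library alone. This is a sketch under simplifying assumptions: the `.tmp` sibling path stands in for a proper tempfile, and `write_archive_atomically` is an illustrative name, not ripzip's API:

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Sketch of the atomic-write tail of the pipeline: write the assembled
// archive to a sibling temp path, flush it to disk, then rename into place.
// A same-directory rename is atomic on POSIX filesystems, so readers see
// either the old file or the complete new one, never a partial write.
fn write_archive_atomically(dest: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = dest.with_extension("zip.tmp"); // illustrative temp name
    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?; // fsync: data is durable before the rename
    fs::rename(&tmp, dest) // atomic swap into the final path
}

fn main() -> std::io::Result<()> {
    let dest = std::env::temp_dir().join("ripzip_demo.zip");
    write_archive_atomically(&dest, b"PK...")?;
    assert_eq!(fs::read(&dest)?, b"PK...");
    fs::remove_file(&dest)?;
    Ok(())
}
```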


                     EXTRACTION PIPELINE

  open archive ──> mmap (< 2GB) or per-thread file handles (>= 2GB)
       |
  parse EOCD ──> parse central directory ──> validate all paths
       |
  create directories (sequential)
       |
  rayon::par_iter (per file):
       zero-copy slice from mmap ──> DEFLATE/Zstd + CRC32 verify ──> write to destination
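The "parse EOCD" step relies on the End of Central Directory record starting with the signature `PK\x05\x06` and sitting within the last 22 + 65,535 bytes of the file (22 fixed bytes plus an optional comment). A minimal backward scan, illustrative rather than ripzip's actual parser:

```rust
// Find the End of Central Directory record by scanning backward from the
// end of the archive for its signature ("PK\x05\x06", i.e. 0x06054b50
// little-endian). The EOCD is 22 bytes plus an optional comment of up to
// 65,535 bytes, so real parsers only need to search that tail window.
const EOCD_SIG: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
const EOCD_MIN_LEN: usize = 22;

fn find_eocd(data: &[u8]) -> Option<usize> {
    if data.len() < EOCD_MIN_LEN {
        return None;
    }
    // Scan from the last possible EOCD start toward the front, so that a
    // trailing comment containing stray bytes does not hide the record.
    (0..=data.len() - EOCD_MIN_LEN)
        .rev()
        .find(|&i| data[i..i + 4] == EOCD_SIG)
}

fn main() {
    // A minimal empty-archive EOCD: signature followed by 18 zero bytes.
    let mut eocd = EOCD_SIG.to_vec();
    eocd.extend_from_slice(&[0u8; 18]);
    assert_eq!(find_eocd(&eocd), Some(0));

    // The same record preceded by 100 bytes of entry data.
    let mut archive = vec![0xAAu8; 100];
    archive.extend_from_slice(&eocd);
    assert_eq!(find_eocd(&archive), Some(100));
}
```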

Project Structure

ripzip-rs/
  ripzip/           # Library crate
    src/
      lib.rs              # Public API
      error.rs            # RipzipError enum
      progress.rs         # ProgressReporter trait
      fs_utils.rs         # Path validation, directory walking, long path support
      compress/           # Compression pipeline
        mod.rs            # Orchestrator
        parallel.rs       # Per-file compression
        zip_writer.rs     # ZIP format assembler
      extract/            # Extraction pipeline
        mod.rs            # Orchestrator
        parallel.rs       # Per-file extraction + CRC validation
        zip_reader.rs     # EOCD + central directory parser
      zip_format/         # ZIP binary format
        mod.rs            # Constants, helpers
        local_header.rs   # Local file header
        central_dir.rs    # Central directory entry
        eocd.rs           # End of Central Directory
        zip64.rs          # ZIP64 extensions
        crc.rs            # CRC32 helpers
    tests/
      integration/        # Integration test suite (see Testing below)
    benches/
      compare.rs          # ripzip vs zip crate benchmarks
  ripzip-cli/       # CLI binary (clap + indicatif)

Testing

117 tests: 35 unit tests + 82 integration tests (3 ZIP64 stress tests are #[ignore]).

cargo test

Integration test categories: round-trip, empty files, unicode filenames, large files, deep directories, progress callbacks, error handling, CRC validation, path traversal, parallel determinism, binary data, single files, interop with the zip crate (Deflate + Zstd), ZIP64, Windows long paths.

License

MIT
