zipstream

A CLI tool that downloads and extracts ZIP files in a single streaming pass — no temporary files, no download-then-extract.

Features

  • True streaming extraction — parses local file headers sequentially, extracts files as bytes arrive over HTTP
  • Store and Deflate compression (covers ~99% of ZIP files in the wild)
  • CRC32 verification for data integrity
  • Zip64 support for large archives
  • Path traversal protection — rejects ../ and absolute paths
  • Auto-wrapping — creates a containing folder when the archive has multiple top-level entries
  • --strip-components — strip leading path components (like tar)
  • Progress output to stderr with TTY-aware formatting
  • --json — NDJSON progress output for machine consumption
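The path traversal check and --strip-components behaviour listed above can be illustrated with a small sketch. This is not zipstream's actual code (the tool is written in Zig); the function name safe_entry_path, the treatment of backslashes as separators, and the choice to return None for entries that vanish after stripping are all assumptions made for illustration.

```python
from typing import Optional


def safe_entry_path(name: str, strip_components: int = 0) -> Optional[str]:
    """Return a relative path that is safe to join under the output directory.

    Raises ValueError for absolute paths and for names containing "..".
    Returns None when nothing is left after stripping (for example, the
    top-level directory entry itself).
    """
    # Assumption: treat backslashes as separators too, as a defensive measure.
    normalized = name.replace("\\", "/")
    if normalized.startswith("/"):
        raise ValueError(f"absolute path rejected: {name!r}")

    # Drop empty and "." components; reject any ".." component outright.
    parts = [p for p in normalized.split("/") if p not in ("", ".")]
    if ".." in parts:
        raise ValueError(f"path traversal rejected: {name!r}")

    # Strip N leading components, in the spirit of tar's --strip-components.
    parts = parts[strip_components:]
    if not parts:
        return None
    return "/".join(parts)
```

Under these assumptions, safe_entry_path("repo-main/src/main.zig", 1) yields "src/main.zig", while "../etc/passwd" and "/etc/passwd" are both rejected.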

Installation

Homebrew (macOS & Linux)

brew install refo/tap/zipstream

Linux / macOS (shell script)

curl -fsSL https://raw.githubusercontent.com/refo/zipstream/main/install.sh | sh

Or install to a custom directory:

curl -fsSL https://raw.githubusercontent.com/refo/zipstream/main/install.sh | INSTALL_DIR=~/.local/bin sh

Windows (Scoop)

scoop bucket add refo https://github.com/refo/scoop-bucket
scoop install zipstream

Download binaries

Pre-built binaries for all platforms are available on the Releases page.

Build from source

Requires Zig 0.15.2.

git clone https://github.com/refo/zipstream.git
cd zipstream
zig build -Doptimize=ReleaseFast

The binary will be at zig-out/bin/zipstream.

Usage

zipstream <url> [options]

Options

Flag                     Description
-o, --output <dir>       Extract to directory (default: .)
--strip-components <n>   Strip N leading path components
--json                   Output progress as NDJSON to stderr
-h, --help               Show help

Examples

# Extract a GitHub repo archive
zipstream https://github.com/user/repo/archive/main.zip

# Extract to a specific directory
zipstream https://example.com/data.zip -o /tmp/data

# Strip the top-level directory (common for GitHub archives)
zipstream https://github.com/user/repo/archive/main.zip --strip-components 1 -o ./repo

JSON output

With --json, progress is emitted as NDJSON (one JSON object per line) to stderr:

{"type":"progress","file":"repo-main/large-file.bin","bytes_downloaded":65536}
{"type":"extract","file":"repo-main/large-file.bin","bytes_downloaded":131072}
{"type":"done","files_extracted":42,"bytes_downloaded":1048576,"output":"/tmp/out"}

Event types:

  • progress — periodic update during file extraction (throttled)
  • extract — file extraction completed
  • warning — non-fatal issue (unsupported compression, bad filename)
  • error — fatal error with message field
  • done — extraction finished successfully

When the server sends a Content-Length header, progress and extract events also include bytes_total and percent fields.
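A script consuming this output might look roughly like the sketch below. It relies only on the fields shown in the sample events and the event list; the helper names parse_events, describe, and run are hypothetical, and the shape of warning events beyond their type is assumed rather than documented.

```python
import json
import subprocess
from typing import Iterable, Iterator


def parse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per NDJSON line, ignoring blank or non-JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output on stderr
        if isinstance(event, dict) and "type" in event:
            yield event


def describe(event: dict) -> str:
    """Render an event as a short human-readable line."""
    kind = event["type"]
    if kind == "done":
        return f"done: {event.get('files_extracted')} files -> {event.get('output')}"
    if kind in ("progress", "extract"):
        # percent is only present when the total size is known.
        pct = event.get("percent")
        suffix = f" ({pct}%)" if pct is not None else ""
        return f"{kind}: {event.get('file')}{suffix}"
    # warning / error: a "message" field is documented for error events only.
    return f"{kind}: {event.get('message', '')}"


def run(url: str, output_dir: str = ".") -> int:
    """Hypothetical wrapper: run zipstream with --json and print each event."""
    proc = subprocess.Popen(
        ["zipstream", url, "-o", output_dir, "--json"],
        stderr=subprocess.PIPE,
        text=True,
    )
    for event in parse_events(proc.stderr):
        print(describe(event))
    return proc.wait()
```

Reading stderr line by line keeps the consumer streaming as well, so progress can be displayed while the download is still running.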

Exit codes

Code   Meaning
0      Success
1      Usage error
2      Network/HTTP error
3      ZIP format error
4      I/O error
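For callers that wrap zipstream in a script, the exit codes can be mapped to messages as in the following sketch. The names EXIT_MEANINGS, explain_exit, and is_retryable are hypothetical, and the retry policy (retry only on network/HTTP errors) is an assumption of this example, not something zipstream prescribes.

```python
EXIT_MEANINGS = {
    0: "success",
    1: "usage error",
    2: "network/HTTP error",
    3: "ZIP format error",
    4: "I/O error",
}


def explain_exit(code: int) -> str:
    """Translate a zipstream exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def is_retryable(code: int) -> bool:
    # Assumption: only network/HTTP failures are worth retrying; usage,
    # format, and I/O errors are unlikely to succeed on a second attempt.
    return code == 2
```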

How it works

ZIP files store a central directory at the end of the archive, but each file entry also has a local file header immediately before its data. zipstream exploits this by parsing local headers sequentially from a forward-only HTTP stream:

[LFH₁][data₁] [LFH₂][data₂] ... [Central Dir] [End Record]
  ↑ read         ↑ read              ↑ stop

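When zipstream reads a signature that is not a local file header (in a well-formed archive, the start of the central directory), extraction stops. Because each entry is written out as its bytes arrive, the complete archive never needs to exist on disk.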
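The sequential parse can be sketched in Python as follows. This is an illustration of the technique under simplifying assumptions, not zipstream's implementation: it buffers each entry fully in memory instead of decompressing incrementally, it does not handle Zip64 sizes, and it refuses entries that use a trailing data descriptor (general-purpose flag bit 3). The names read_exact and iter_entries are hypothetical.

```python
import struct
import zlib
from typing import BinaryIO, Iterator, Tuple

LFH_SIGNATURE = 0x04034B50  # "PK\x03\x04", little-endian


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes from a forward-only stream or raise EOFError."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended mid-entry")
        buf += chunk
    return bytes(buf)


def iter_entries(stream: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, data) for each entry, reading strictly forward."""
    while True:
        (sig,) = struct.unpack("<I", read_exact(stream, 4))
        if sig != LFH_SIGNATURE:
            return  # central directory (or anything else): stop

        # Remaining 26 bytes of the fixed-size local file header.
        (_version, flags, method, _time, _date,
         crc, csize, _usize, name_len, extra_len) = struct.unpack(
            "<HHHHHIIIHH", read_exact(stream, 26))

        if flags & 0x0001:
            raise ValueError("encrypted entries are not supported")
        if flags & 0x0008:
            raise NotImplementedError("data descriptors are out of scope here")

        # Flag bit 11 marks UTF-8 names; the legacy default is CP437.
        encoding = "utf-8" if flags & 0x0800 else "cp437"
        name = read_exact(stream, name_len).decode(encoding)
        read_exact(stream, extra_len)  # extra field, ignored in this sketch
        payload = read_exact(stream, csize)

        if method == 0:    # Store
            data = payload
        elif method == 8:  # Deflate (raw stream, hence negative wbits)
            data = zlib.decompress(payload, -15)
        else:
            continue       # unsupported method: skip this entry

        if zlib.crc32(data) & 0xFFFFFFFF != crc:
            raise ValueError(f"CRC32 mismatch for {name!r}")
        yield name, data
```

Any object with a read method works as the stream, for instance an HTTP response body. A production version would feed chunks through zlib.decompressobj and write them to disk as they arrive rather than holding a whole entry in memory.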

Limitations

  • Encrypted ZIPs are rejected
  • Compression methods other than Store (0) and Deflate (8) are skipped with a warning
  • Multi-disk archives are not supported

License

MIT
