Adaptive Logic Stream (ALS) compression library for structured data (CSV, JSON).
ALS is a compression format that describes how to generate data rather than listing it, which yields high compression ratios for structured data with regular patterns. The library supports bidirectional conversion (CSV/JSON ↔ ALS), falls back to CTX compression when the ALS compression ratio is insufficient, and uses SIMD instructions, zero-copy operations, and concurrent data structures for performance.
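The fallback decision described above can be sketched as a simple ratio check. This is an illustration only: the `Format` enum, the `choose_format` function, and the `min_ratio` threshold are hypothetical names, and the library's actual criterion for "insufficient" is not specified here.

```rust
/// Hypothetical output formats; not the library's actual types.
#[derive(Debug, PartialEq)]
enum Format {
    Als,
    Ctx,
}

/// Ratio of original size to compressed size (higher is better).
/// An empty compressed output is treated as "no useful ratio".
fn compression_ratio(original_len: usize, compressed_len: usize) -> f64 {
    if compressed_len == 0 {
        return 0.0;
    }
    original_len as f64 / compressed_len as f64
}

/// Keep the ALS output only if it reaches the assumed minimum ratio;
/// otherwise fall back to CTX.
fn choose_format(original_len: usize, als_len: usize, min_ratio: f64) -> Format {
    if compression_ratio(original_len, als_len) >= min_ratio {
        Format::Als
    } else {
        Format::Ctx
    }
}

fn main() {
    // 1000 input bytes compressed to 100 bytes: ratio 10.0, ALS is kept.
    println!("{:?}", choose_format(1000, 100, 1.2));
    // 1000 input bytes compressed to 950 bytes: ratio is about 1.05, fall back.
    println!("{:?}", choose_format(1000, 950, 1.2));
}
```

The threshold of 1.2 is an arbitrary value chosen for the example.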
- CSV & JSON Compression: Convert CSV and JSON data to ALS format for superior compression
- Pattern Detection: Automatically detects and encodes patterns (ranges, repetitions, alternations)
- CTX Fallback: Automatically falls back to CTX compression when ALS provides insufficient compression
- SIMD Optimization: Leverages AVX2, AVX-512, and NEON instructions for maximum throughput
- Parallel Processing: Uses Rayon for multi-threaded compression and decompression
- Zero-Copy Operations: Minimizes memory allocations and copies using rkyv serialization
- Thread-Safe: Atomic operations and concurrent data structures for multi-threaded applications
- Multiple Bindings: Python (PyO3) and C FFI, with Go (CGO), WebAssembly, and Node.js bindings planned
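To make the idea of "describing how to generate data" concrete, here is a minimal sketch of detecting ranges and repetitions in an integer column. The `Pattern` enum and the `detect` and `expand` functions are hypothetical and written for illustration; they are not the library's actual types or algorithm.

```rust
/// A hypothetical pattern description; not the library's actual type.
#[derive(Debug, PartialEq)]
enum Pattern {
    /// Generates start, start + step, start + 2 * step, ...
    Range { start: i64, step: i64, count: usize },
    /// A single value repeated `count` times.
    Repeat { value: i64, count: usize },
    /// No pattern found; values are stored literally.
    Raw(Vec<i64>),
}

/// Classify a column as a repetition, an arithmetic range, or raw data.
fn detect(values: &[i64]) -> Pattern {
    if values.len() < 2 {
        return Pattern::Raw(values.to_vec());
    }
    // The first difference is the candidate step; overflow means no pattern.
    let step = match values[1].checked_sub(values[0]) {
        Some(s) => s,
        None => return Pattern::Raw(values.to_vec()),
    };
    // Every consecutive difference must equal the candidate step.
    let uniform = values
        .windows(2)
        .all(|w| w[1].checked_sub(w[0]) == Some(step));
    if !uniform {
        Pattern::Raw(values.to_vec())
    } else if step == 0 {
        Pattern::Repeat { value: values[0], count: values.len() }
    } else {
        Pattern::Range { start: values[0], step, count: values.len() }
    }
}

/// Regenerate the original values from a pattern description.
fn expand(pattern: &Pattern) -> Vec<i64> {
    match pattern {
        Pattern::Range { start, step, count } => {
            (0..*count).map(|i| *start + *step * i as i64).collect()
        }
        Pattern::Repeat { value, count } => vec![*value; *count],
        Pattern::Raw(values) => values.clone(),
    }
}

fn main() {
    let ids: [i64; 5] = [1, 2, 3, 4, 5];
    let pattern = detect(&ids);
    println!("{:?}", pattern);
    // The description round-trips back to the original column.
    assert_eq!(expand(&pattern), ids);
}
```

A five-element column collapses to three numbers here; the saving grows with column length, which is why sequential IDs and constant columns compress well under this kind of scheme.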
Add to your Cargo.toml:
[dependencies]
als-compression = "0.0.1"
Basic usage:
use als_compression::AlsCompressor;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let compressor = AlsCompressor::new();
    // Compress CSV
    let csv_data = "col1,col2,col3\n1,a,x\n2,b,y\n3,c,z\n";
    let compressed = compressor.compress_csv(csv_data)?;
    println!("Compressed: {}", compressed);
    Ok(())
}
Feature flags:
- simd (default): Enable SIMD optimizations
- parallel (default): Enable parallel processing with Rayon
- python: Build Python bindings with PyO3
- ffi: Build C FFI bindings
- wasm: Build WebAssembly bindings
- async: Enable async/await support with Tokio
Build the library:
cargo build --release
# Disable SIMD
cargo build --release --no-default-features --features parallel
# Enable all features
cargo build --release --all-features
# Python bindings
cargo build --release --features python
Run the test suite:
cargo test
Run property-based tests:
cargo test --test '*' -- --nocapture
Run benchmarks:
cargo bench
Build and view documentation:
cargo doc --open
The library is optimized for:
- Compression Ratio: Superior compression for structured data with patterns
- Throughput: High MB/s compression and decompression rates
- Memory Efficiency: Minimal allocations and zero-copy deserialization
- Latency: Low-latency decompression with streaming support
The library consists of several key components:
- Pattern Detection Engine: Identifies sequential ranges, repetitions, and alternations
- Dictionary Builder: Optimizes string dictionary for compression benefit
- SIMD Dispatcher: Runtime CPU feature detection and SIMD implementation selection
- Adaptive HashMap: Automatically selects HashMap or DashMap based on dataset size
- Streaming Support: Process large files without loading entirely into memory
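As an illustration of what "optimizing a string dictionary for compression benefit" can mean, the sketch below admits a string into the dictionary only when replacing its occurrences with references saves bytes. The cost model (a fixed `ref_cost` per reference plus one byte of overhead per dictionary entry) and the `build_dictionary` function are assumptions made for this example, not the library's actual logic.

```rust
use std::collections::HashMap;

/// Return the strings worth placing in a dictionary, sorted alphabetically.
///
/// Assumed cost model (hypothetical):
/// - storing a string inline costs `len` bytes per occurrence
/// - a dictionary entry costs `len + 1` bytes once
/// - each reference to an entry costs `ref_cost` bytes
fn build_dictionary<'a>(values: &[&'a str], ref_cost: usize) -> Vec<&'a str> {
    // Count how often each distinct string occurs.
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for v in values {
        *counts.entry(*v).or_insert(0) += 1;
    }

    // Keep a string only if inline cost exceeds dictionary cost.
    let mut dict: Vec<&'a str> = counts
        .into_iter()
        .filter(|&(s, n)| n * s.len() > n * ref_cost + s.len() + 1)
        .map(|(s, _)| s)
        .collect();

    // Sort for a deterministic result (HashMap iteration order is arbitrary).
    dict.sort();
    dict
}

fn main() {
    let column = ["engineering", "hr", "engineering", "ops", "engineering"];
    let dict = build_dictionary(&column, 2);
    println!("{:?}", dict);
}
```

With a reference cost of 2 bytes, only "engineering" qualifies: three inline copies cost 33 bytes, while one entry plus three references cost 18. Short or rare strings such as "hr" and "ops" stay inline because a dictionary entry would cost more than it saves.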
The library provides a C-compatible FFI for use from C and other languages with C interop:
# Build with FFI support
cd app/lib
cargo build --release --features ffi
See C_FFI_README.md for detailed documentation and examples.
Python bindings are available via PyO3:
cargo build --release --features python
Other bindings:
- Go: CGO bindings (planned)
- WebAssembly: WASM bindings (planned)
- Node.js: Native addon (planned)
Requirements:
- Rust 1.70 or later
- For Python bindings: Python 3.8+
- For C FFI: GCC or Clang
- For WebAssembly: wasm-pack
Licensed under the Apache License 2.0. See the LICENSE file for details.