A high-performance WebAssembly optimizer with formal verification support, featuring a comprehensive 12-phase optimization pipeline built with ISLE (Instruction Selection and Lowering Engine) and Z3 SMT solver integration.
- 12 optimization phases including constant folding, strength reduction, CSE, inlining, LICM, and DCE
- Declarative rules using ISLE DSL for maintainable optimizations
- Stateful dataflow analysis tracking locals and memory state
- Idempotent optimizations - safe to run multiple times
- Z3 SMT-based verification proves optimization correctness via translation validation
- Property-based testing ensures idempotence and validity
- Counterexample generation for debugging failed optimizations
- Optional verification feature (build with
--features verification)
- Ultra-fast: 10-30 µs optimization time for most modules
- Excellent compression: 80-95% binary size reduction
- Instruction optimization: 0-40% instruction count reduction (varies by code)
- Lightweight: Minimal dependencies, pure Rust implementation
- Component Model support for modern WebAssembly
- wasm32-wasip2 build target support
- Comprehensive benchmarking with Criterion
- Full WAT and WASM format support
Loom is built with a modular architecture featuring a shared foundation:
- loom-shared: Core ISLE definitions and WebAssembly IR (stable API)
- loom-core: Optimization pipeline implementation
- loom-cli: Command-line interface and tooling
- loom-testing: Differential testing framework
The loom-shared crate provides a stable, reusable foundation that other WebAssembly tools can build upon. It contains:
- ISLE term definitions for all WebAssembly instructions
- Module IR (Module, Function, Instruction types)
- WASM parsing and encoding utilities
- Z3 verification infrastructure (optional)
This architecture enables both rapid prototyping in Loom and potential use in safety-critical applications through derived tools.
📖 See LOOM_SYNTH_ARCHITECTURE.md for detailed architecture documentation.
# Build from source
git clone https://github.com/pulseengine/loom
cd loom
cargo build --release
# Binary at target/release/loom
./target/release/loom --help# Optimize WebAssembly module
loom optimize input.wasm -o output.wasm
# Show detailed statistics
loom optimize input.wat -o output.wasm --stats
# Run verification checks
loom optimize input.wasm -o output.wasm --verify
# Output as WAT text format
loom optimize input.wasm -o output.wat --wat🔧 LOOM Optimizer v0.1.0
Input: input.wasm
✓ Parsed in 234µs
⚡ Optimizing...
✓ Optimized in 0 ms
📊 Optimization Statistics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Instructions: 24 → 20 (16.7% reduction)
Binary size: 797 → 92 bytes (88.5% reduction)
Constant folds: 3
Optimization time: 0 ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ Optimization complete!
| Phase | Optimization | Example | Impact |
|---|---|---|---|
| 1 | Precompute | Global constant propagation | Enables folding |
| 2 | ISLE Folding | 10 + 20 → 30 |
Compile-time evaluation |
| 3 | Strength Reduction | x * 8 → x << 3 |
2-3x speedup |
| 4 | CSE | Cache duplicate computations | Reduces redundancy |
| 5 | Function Inlining | Inline small functions | Call overhead removal |
| 6 | ISLE (Post-inline) | Fold exposed constants | Cross-function optimization |
| 7 | Code Folding | Flatten blocks | Control flow simplification |
| 8 | LICM | Hoist loop invariants | Loop speedup |
| 9 | Branch Simplify | Simplify conditionals | Fewer branches |
| 10 | DCE | Remove unreachable code | Code cleanup |
| 11 | Block Merge | Merge consecutive blocks | Reduce overhead |
| 12 | Vacuum & Locals | Remove unused variables | Final cleanup |
Constant Folding: 8-11 µs
Strength Reduction: 10 µs
CSE: 9-14 µs
Function Inlining: 16-18 µs
Full Pipeline: 19-28 µs
Parser: 6.8 µs
Encoder: 183 ns (!)
| Fixture | Instructions | Binary Size | Constant Folds |
|---|---|---|---|
| bench_bitops | 24 → 20 (16.7%) | 88.5% reduction | 0 |
| test_input | 9 → 7 (22.2%) | 81.6% reduction | 1 |
| fibonacci | 12 → 12 (0%) | 92.6% reduction | 0 |
| quicksort | Complex | 92.5% reduction | 0 |
| game_logic | Complex | 92.5% reduction | 0 |
Key Insight: Binary size reductions are consistently excellent (80-93%), while instruction count improvements vary by code complexity.
LOOM supports two verification modes:
loom optimize input.wasm -o output.wasm --verify- Fast idempotence checks
- Constant folding validation
- ~5ms overhead
# Install Z3
brew install z3 # macOS
sudo apt install z3 # Linux
# Build with verification
cargo build --release --features verification
# Verify with formal proof
./target/release/loom optimize input.wasm -o output.wasm --verifyOutput:
🔬 Running Z3 SMT verification...
✅ Z3 verification passed: optimizations are semantically equivalent
Z3 verification proves mathematically that optimizations preserve program semantics via translation validation.
Input:
(module
(func $optimize_me (param $x i32) (result i32)
local.get $x
i32.const 8
i32.mul
)
)After Optimization:
(module
(func $optimize_me (param $x i32) (result i32)
local.get $x
i32.const 3
i32.shl ;; 2-3x faster than multiply!
)
)Input:
(func $calculate (result i32)
i32.const 10
i32.const 20
i32.add
i32.const 5
i32.mul
)After Optimization:
(func $calculate (result i32)
i32.const 150 ;; Computed at compile-time
)Input:
(func $duplicate (param $x i32) (result i32)
local.get $x
i32.const 4
i32.mul
local.get $x
i32.const 4
i32.mul ;; Duplicate computation!
i32.add
)After Optimization:
(func $duplicate (param $x i32) (result i32)
local.get $x
i32.const 4
i32.mul
local.tee $temp
local.get $temp ;; Reuse cached result
i32.add
) ┌──────────────────┐
│ WebAssembly │
│ (WAT or WASM) │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ wasmparser │
│ Parse to AST │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ ISLE Terms │
│ (IR) │
└────────┬─────────┘
│
▼
┌────────────────────────────────────────────┐
│ 12-Phase Optimization Pipeline │
│ ┌──────────────────────────────────────┐ │
│ │ 1. Precompute │ │
│ │ 2. ISLE Constant Folding │ │
│ │ 3. Strength Reduction │ │
│ │ 4. Common Subexpression Elimination │ │
│ │ 5. Function Inlining │ │
│ │ 6. ISLE (Post-inline) │ │
│ │ 7. Code Folding │ │
│ │ 8. Loop-Invariant Code Motion │ │
│ │ 9. Branch Simplification │ │
│ │ 10. Dead Code Elimination │ │
│ │ 11. Block Merging │ │
│ │ 12. Vacuum & Simplify Locals │ │
│ └──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────┐ │
│ │ Dataflow Analysis │ │
│ │ (locals, memory state tracking) │ │
│ └──────────────────────────────────────┘ │
└────────────────────────────────────────────┘
│
▼
┌──────────────────────────┐
│ Optional: Z3 SMT │
│ Verification │
└────────────┬─────────────┘
│
▼
┌──────────────────────────┐
│ wasm-encoder │
│ Encode to binary │
└────────────┬─────────────┘
│
▼
┌──────────────────────────┐
│ Optimized WebAssembly │
└──────────────────────────┘
- Usage Guide - Complete CLI reference, examples, and best practices
- Quick Reference - Cheat sheet for common tasks
- Performance Comparison - LOOM vs wasm-opt benchmarks and analysis
- Architecture - Deep dive into the 12-phase pipeline and implementation
- Formal Verification Guide - Z3 SMT verification internals
- WASM Build Guide - Building LOOM to WebAssembly (wasm32-wasip2)
- Implementation Details - Technical implementation notes
- Contributing Guide - How to contribute to LOOM
- Design Documents - Individual optimization pass designs (CSE, DCE, LICM, etc.)
# Run all tests
cargo test
# Run optimization-specific tests
cargo test --test optimization_tests
# Run benchmarks
cargo bench
# Test on real fixtures
./target/release/loom optimize tests/fixtures/quicksort.wat -o /tmp/out.wasm --stats --verifyTest Status:
- ✅ 20/20 optimization tests passing (100%)
- ✅ 54/57 unit tests passing (95%)
- ✅ All benchmarks complete successfully
loom/
├── loom-core/ # Core optimizer implementation
│ ├── src/lib.rs # 12-phase pipeline, optimizations
│ ├── src/verify.rs # Z3 verification module
│ ├── tests/ # Unit and integration tests
│ └── benches/ # Criterion performance benchmarks
├── loom-isle/ # ISLE term definitions and rules
│ ├── isle/ # ISLE DSL files
│ └── src/lib.rs # Rust integration
├── loom-cli/ # Command-line interface
│ ├── src/main.rs # CLI implementation
│ └── BUILD.bazel # Bazel build rules
├── tests/fixtures/ # Real-world test cases
│ ├── fibonacci.wat
│ ├── quicksort.wat
│ ├── matrix_multiply.wat
│ └── ...
└── docs/ # Comprehensive documentation
├── USAGE_GUIDE.md
├── QUICK_REFERENCE.md
└── ...
cargo build --releasecargo build --release --features verification# Using Cargo
cargo build --release --target wasm32-wasip2
# Using Bazel
bazel build //loom-cli:loom_wasm --platforms=@rules_rust//rust/platform:wasmSee WASM_BUILD.md for details.
# Optimize and verify before deployment
loom optimize app.wasm -o app.optimized.wasm --verify --stats- name: Optimize WebAssembly
run: |
loom optimize dist/*.wasm -o dist/*.wasm --stats --verify# Optimize during build
cargo build --target wasm32-unknown-unknown
loom optimize target/wasm32-unknown-unknown/release/app.wasm -o dist/app.wasm# Compare before/after
ls -lh original.wasm optimized.wasm
loom optimize original.wasm -o optimized.wasm --statsx * 2^n→x << n(2-3x faster)x / 2^n→x >> n(2-3x faster)x % 2^n→x & (2^n - 1)(2-3x faster)
x | 0→xx & -1→xx ^ 0→xx + 0→xx * 1→x
- Compile-time evaluation of all constant expressions
- Propagation through local variables
- Cross-function constant propagation via inlining
Contributions welcome! This project is under active development.
- Check existing issues
- Run tests:
cargo test && cargo clippy - Follow Rust conventions
- Add tests for new features
- Update documentation
Apache License 2.0
- ISLE DSL from Cranelift
- Z3 SMT Solver from Microsoft Research
- wasmparser & wasm-encoder from Bytecode Alliance
- Binaryen - Reference WebAssembly optimizer (C++)
- wasm-opt - Industry-standard optimizer
- Cranelift - High-performance code generator
- WRT - WebAssembly Component Model runtime
Current Status (v0.1.0):
- ✅ 12-phase optimization pipeline
- ✅ Z3 formal verification
- ✅ Comprehensive benchmarking
- ✅ Component Model support
- ✅ wasm32-wasip2 build target
Coming Soon:
- 🚧 More aggressive LICM (arithmetic operations, global reads)
- 🚧 Profile-guided optimization
- 🚧 SIMD-specific optimizations
- 🚧 Inter-procedural analysis
- 🚧 Custom optimization passes
Built with ❤️ by PulseEngine