Skip to content

pulseengine/loom

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

84 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LOOM - Formally Verified WebAssembly Optimizer

Status License Tests

A high-performance WebAssembly optimizer with formal verification support, featuring a comprehensive 12-phase optimization pipeline built with ISLE (Instruction Selection and Lowering Engine) and Z3 SMT solver integration.

✨ Features

🚀 Comprehensive Optimization Pipeline

  • 12 optimization phases including constant folding, strength reduction, CSE, inlining, LICM, and DCE
  • Declarative rules using ISLE DSL for maintainable optimizations
  • Stateful dataflow analysis tracking locals and memory state
  • Idempotent optimizations - safe to run multiple times

🔬 Formal Verification

  • Z3 SMT-based verification proves optimization correctness via translation validation
  • Property-based testing ensures idempotence and validity
  • Counterexample generation for debugging failed optimizations
  • Optional verification feature (build with --features verification)

⚡ Performance

  • Ultra-fast: 10-30 µs optimization time for most modules
  • Excellent compression: 80-95% binary size reduction
  • Instruction optimization: 0-40% instruction count reduction (varies by code)
  • Lightweight: Minimal dependencies, pure Rust implementation

🎯 Advanced Features

  • Component Model support for modern WebAssembly
  • wasm32-wasip2 build target support
  • Comprehensive benchmarking with Criterion
  • Full WAT and WASM format support

🏛️ Architecture

Loom is built with a modular architecture featuring a shared foundation:

  • loom-shared: Core ISLE definitions and WebAssembly IR (stable API)
  • loom-core: Optimization pipeline implementation
  • loom-cli: Command-line interface and tooling
  • loom-testing: Differential testing framework

The loom-shared crate provides a stable, reusable foundation that other WebAssembly tools can build upon. It contains:

  • ISLE term definitions for all WebAssembly instructions
  • Module IR (Module, Function, Instruction types)
  • WASM parsing and encoding utilities
  • Z3 verification infrastructure (optional)

This architecture enables both rapid prototyping in Loom and potential use in safety-critical applications through derived tools.

📖 See LOOM_SYNTH_ARCHITECTURE.md for detailed architecture documentation.

📦 Quick Start

Installation

# Build from source
git clone https://github.com/pulseengine/loom
cd loom
cargo build --release

# Binary at target/release/loom
./target/release/loom --help

Basic Usage

# Optimize WebAssembly module
loom optimize input.wasm -o output.wasm

# Show detailed statistics
loom optimize input.wat -o output.wasm --stats

# Run verification checks
loom optimize input.wasm -o output.wasm --verify

# Output as WAT text format
loom optimize input.wasm -o output.wat --wat

Example Output

🔧 LOOM Optimizer v0.1.0
Input: input.wasm
✓ Parsed in 234µs
⚡ Optimizing...
✓ Optimized in 0 ms

📊 Optimization Statistics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Instructions: 24 → 20 (16.7% reduction)
Binary size:  797 → 92 bytes (88.5% reduction)
Constant folds: 3
Optimization time: 0 ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✅ Optimization complete!

🏗️ 12-Phase Optimization Pipeline

Phase Optimization Example Impact
1 Precompute Global constant propagation Enables folding
2 ISLE Folding 10 + 2030 Compile-time evaluation
3 Strength Reduction x * 8x << 3 2-3x speedup
4 CSE Cache duplicate computations Reduces redundancy
5 Function Inlining Inline small functions Call overhead removal
6 ISLE (Post-inline) Fold exposed constants Cross-function optimization
7 Code Folding Flatten blocks Control flow simplification
8 LICM Hoist loop invariants Loop speedup
9 Branch Simplify Simplify conditionals Fewer branches
10 DCE Remove unreachable code Code cleanup
11 Block Merge Merge consecutive blocks Reduce overhead
12 Vacuum & Locals Remove unused variables Final cleanup

📊 Benchmark Results

Performance (Criterion benchmarks)

Constant Folding:        8-11 µs
Strength Reduction:      10 µs
CSE:                     9-14 µs
Function Inlining:       16-18 µs
Full Pipeline:           19-28 µs
Parser:                  6.8 µs
Encoder:                 183 ns (!)

Real-World Results

Fixture Instructions Binary Size Constant Folds
bench_bitops 24 → 20 (16.7%) 88.5% reduction 0
test_input 9 → 7 (22.2%) 81.6% reduction 1
fibonacci 12 → 12 (0%) 92.6% reduction 0
quicksort Complex 92.5% reduction 0
game_logic Complex 92.5% reduction 0

Key Insight: Binary size reductions are consistently excellent (80-93%), while instruction count improvements vary by code complexity.

🔬 Formal Verification

LOOM supports two verification modes:

1. Property-Based (Always Available)

loom optimize input.wasm -o output.wasm --verify
  • Fast idempotence checks
  • Constant folding validation
  • ~5ms overhead

2. Z3 SMT Formal Proof (Optional)

# Install Z3
brew install z3  # macOS
sudo apt install z3  # Linux

# Build with verification
cargo build --release --features verification

# Verify with formal proof
./target/release/loom optimize input.wasm -o output.wasm --verify

Output:

🔬 Running Z3 SMT verification...
  ✅ Z3 verification passed: optimizations are semantically equivalent

Z3 verification proves mathematically that optimizations preserve program semantics via translation validation.

💡 Examples

Example 1: Strength Reduction

Input:

(module
  (func $optimize_me (param $x i32) (result i32)
    local.get $x
    i32.const 8
    i32.mul
  )
)

After Optimization:

(module
  (func $optimize_me (param $x i32) (result i32)
    local.get $x
    i32.const 3
    i32.shl  ;; 2-3x faster than multiply!
  )
)

Example 2: Constant Folding

Input:

(func $calculate (result i32)
  i32.const 10
  i32.const 20
  i32.add
  i32.const 5
  i32.mul
)

After Optimization:

(func $calculate (result i32)
  i32.const 150  ;; Computed at compile-time
)

Example 3: CSE

Input:

(func $duplicate (param $x i32) (result i32)
  local.get $x
  i32.const 4
  i32.mul
  local.get $x
  i32.const 4
  i32.mul  ;; Duplicate computation!
  i32.add
)

After Optimization:

(func $duplicate (param $x i32) (result i32)
  local.get $x
  i32.const 4
  i32.mul
  local.tee $temp
  local.get $temp  ;; Reuse cached result
  i32.add
)

🏛️ Architecture

                    ┌──────────────────┐
                    │  WebAssembly     │
                    │  (WAT or WASM)   │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │   wasmparser     │
                    │  Parse to AST    │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │  ISLE Terms      │
                    │  (IR)            │
                    └────────┬─────────┘
                             │
                             ▼
    ┌────────────────────────────────────────────┐
    │     12-Phase Optimization Pipeline         │
    │  ┌──────────────────────────────────────┐  │
    │  │ 1. Precompute                        │  │
    │  │ 2. ISLE Constant Folding             │  │
    │  │ 3. Strength Reduction                │  │
    │  │ 4. Common Subexpression Elimination  │  │
    │  │ 5. Function Inlining                 │  │
    │  │ 6. ISLE (Post-inline)                │  │
    │  │ 7. Code Folding                      │  │
    │  │ 8. Loop-Invariant Code Motion        │  │
    │  │ 9. Branch Simplification             │  │
    │  │ 10. Dead Code Elimination            │  │
    │  │ 11. Block Merging                    │  │
    │  │ 12. Vacuum & Simplify Locals         │  │
    │  └──────────────────────────────────────┘  │
    │              │                              │
    │              ▼                              │
    │  ┌──────────────────────────────────────┐  │
    │  │   Dataflow Analysis                  │  │
    │  │   (locals, memory state tracking)    │  │
    │  └──────────────────────────────────────┘  │
    └────────────────────────────────────────────┘
                             │
                             ▼
              ┌──────────────────────────┐
              │   Optional: Z3 SMT       │
              │   Verification           │
              └────────────┬─────────────┘
                           │
                           ▼
              ┌──────────────────────────┐
              │   wasm-encoder           │
              │   Encode to binary       │
              └────────────┬─────────────┘
                           │
                           ▼
              ┌──────────────────────────┐
              │  Optimized WebAssembly   │
              └──────────────────────────┘

📚 Documentation

User Guides

Technical Documentation

For Contributors

🧪 Testing

# Run all tests
cargo test

# Run optimization-specific tests
cargo test --test optimization_tests

# Run benchmarks
cargo bench

# Test on real fixtures
./target/release/loom optimize tests/fixtures/quicksort.wat -o /tmp/out.wasm --stats --verify

Test Status:

  • ✅ 20/20 optimization tests passing (100%)
  • ✅ 54/57 unit tests passing (95%)
  • ✅ All benchmarks complete successfully

📁 Project Structure

loom/
├── loom-core/                # Core optimizer implementation
│   ├── src/lib.rs            # 12-phase pipeline, optimizations
│   ├── src/verify.rs         # Z3 verification module
│   ├── tests/                # Unit and integration tests
│   └── benches/              # Criterion performance benchmarks
├── loom-isle/                # ISLE term definitions and rules
│   ├── isle/                 # ISLE DSL files
│   └── src/lib.rs            # Rust integration
├── loom-cli/                 # Command-line interface
│   ├── src/main.rs           # CLI implementation
│   └── BUILD.bazel           # Bazel build rules
├── tests/fixtures/           # Real-world test cases
│   ├── fibonacci.wat
│   ├── quicksort.wat
│   ├── matrix_multiply.wat
│   └── ...
└── docs/                     # Comprehensive documentation
    ├── USAGE_GUIDE.md
    ├── QUICK_REFERENCE.md
    └── ...

🔧 Building

Standard Build

cargo build --release

With Z3 Verification

cargo build --release --features verification

WASM Build (wasm32-wasip2)

# Using Cargo
cargo build --release --target wasm32-wasip2

# Using Bazel
bazel build //loom-cli:loom_wasm --platforms=@rules_rust//rust/platform:wasm

See WASM_BUILD.md for details.

🚀 Use Cases

1. Production Deployment

# Optimize and verify before deployment
loom optimize app.wasm -o app.optimized.wasm --verify --stats

2. CI/CD Integration

- name: Optimize WebAssembly
  run: |
    loom optimize dist/*.wasm -o dist/*.wasm --stats --verify

3. Development Workflow

# Optimize during build
cargo build --target wasm32-unknown-unknown
loom optimize target/wasm32-unknown-unknown/release/app.wasm -o dist/app.wasm

4. Performance Analysis

# Compare before/after
ls -lh original.wasm optimized.wasm
loom optimize original.wasm -o optimized.wasm --stats

🎯 Optimization Patterns

Strength Reduction

  • x * 2^nx << n (2-3x faster)
  • x / 2^nx >> n (2-3x faster)
  • x % 2^nx & (2^n - 1) (2-3x faster)

Algebraic Simplification

  • x | 0x
  • x & -1x
  • x ^ 0x
  • x + 0x
  • x * 1x

Constant Folding

  • Compile-time evaluation of all constant expressions
  • Propagation through local variables
  • Cross-function constant propagation via inlining

🤝 Contributing

Contributions welcome! This project is under active development.

  1. Check existing issues
  2. Run tests: cargo test && cargo clippy
  3. Follow Rust conventions
  4. Add tests for new features
  5. Update documentation

📜 License

Apache License 2.0

🙏 Acknowledgments

🔗 Related Projects

  • Binaryen - Reference WebAssembly optimizer (C++)
  • wasm-opt - Industry-standard optimizer
  • Cranelift - High-performance code generator
  • WRT - WebAssembly Component Model runtime

📈 Roadmap

Current Status (v0.1.0):

  • ✅ 12-phase optimization pipeline
  • ✅ Z3 formal verification
  • ✅ Comprehensive benchmarking
  • ✅ Component Model support
  • ✅ wasm32-wasip2 build target

Coming Soon:

  • 🚧 More aggressive LICM (arithmetic operations, global reads)
  • 🚧 Profile-guided optimization
  • 🚧 SIMD-specific optimizations
  • 🚧 Inter-procedural analysis
  • 🚧 Custom optimization passes

Built with ❤️ by PulseEngine

About

WebAssembly optimizer using ISLE with formal verification - proof of concept

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •