Z3 Performance Research and Improvement Plan
Executive Summary
Z3 is a high-performance theorem prover from Microsoft Research with a complex architecture spanning multiple domains (SAT solving, SMT theories, quantifier instantiation, etc.). This research identifies key performance bottlenecks and the existing testing infrastructure, and establishes a systematic approach to performance improvements.
Current Performance Testing Infrastructure
Build System
- Primary: CMake-based build system with multiple configurations (Release, Debug, RelWithDebInfo)
- Legacy: Python-based build system (scripts/mk_make.py)
- Build Commands:

```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make -j$(nproc)
```
Testing Framework
- Unit Tests: the test-z3 executable runs comprehensive unit tests (./test-z3 -a)
- Regression Tests: uses the external z3test repository with benchmarks:

```bash
git clone https://github.com/z3prover/z3test
python z3test/scripts/test_benchmarks.py build/z3 z3test/regressions/smt2
```
- Performance Benchmarks: Phoronix Test Suite integration (4 min average runtime, 0.1% std dev)
Timing and Statistics Infrastructure
- Statistics: the statistics class in src/util/statistics.h for collecting metrics
- Timers: the timer, scoped_timer, and timeout classes for performance measurement
- Profiling Tools: contrib/qprofdiff/ - quantifier instance profiling differential tool
- Built-in statistics reporting (the -st flag)
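To illustrate the kind of microsecond-level measurement this infrastructure enables, here is a minimal, self-contained timing sketch. It uses std::chrono rather than Z3's internal timer class, and dummy_workload is a hypothetical stand-in for whatever code path is under test.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-in for the code path being measured.
static long dummy_workload(long n) {
    long acc = 0;
    for (long i = 0; i < n; ++i)
        acc += i % 7;          // cheap, branchy work to time
    return acc;
}

int main() {
    using clock = std::chrono::steady_clock;
    const int iterations = 1000;

    auto start = clock::now();
    long sink = 0;
    for (int i = 0; i < iterations; ++i)
        sink += dummy_workload(10000);
    auto stop = clock::now();

    double total_s = std::chrono::duration<double>(stop - start).count();
    // Report the average per-iteration time; printing 'sink' keeps the
    // compiler from optimizing the workload away.
    std::printf("avg: %.9f s (sink=%ld)\n", total_s / iterations, sink);
    return 0;
}
```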
Performance Bottlenecks and Characteristics
Key Performance Areas
- SAT Solver Core (src/sat/)
  - Clause learning and deletion
  - Decision heuristics
  - Propagation algorithms
  - Preprocessing (simplification, symmetry breaking)
- SMT Theory Solvers (src/smt/theory_*.cpp)
  - Arithmetic (linear/nonlinear)
  - Arrays, bitvectors, datatypes
  - String/sequence theory
  - Theory combination
- Quantifier Instantiation (src/smt/qi_*.cpp, src/muz/)
  - Pattern matching
  - E-matching
  - Model-based quantifier instantiation
- Memory Management
  - AST allocation/deallocation
  - Region-based allocation (sketched after this list)
  - Garbage collection
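The region-based allocation item above can be illustrated with a generic bump-allocator sketch. This is illustrative only, not the actual implementation in src/util/region.cpp: allocation is a pointer bump, and all objects are released at once when the region is reset.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal bump ("region") allocator: allocation is a pointer increment,
// and everything is freed in one shot when the region is reset.
class Region {
    static constexpr std::size_t CHUNK = 64 * 1024;
    std::vector<char*> m_chunks;
    char*       m_ptr  = nullptr;
    std::size_t m_left = 0;

public:
    void* allocate(std::size_t n) {
        // Round the request up to the maximum fundamental alignment.
        n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (n > m_left) {
            std::size_t sz = n > CHUNK ? n : CHUNK;
            m_chunks.push_back(static_cast<char*>(std::malloc(sz)));
            m_ptr  = m_chunks.back();
            m_left = sz;
        }
        void* r = m_ptr;
        m_ptr  += n;
        m_left -= n;
        return r;
    }

    void reset() {                       // free all objects at once
        for (char* c : m_chunks) std::free(c);
        m_chunks.clear();
        m_ptr  = nullptr;
        m_left = 0;
    }

    ~Region() { reset(); }
};
```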
Typical Workloads
- Industrial Software Verification: Large formulas with arrays, uninterpreted functions
- Compiler Optimization: Bitvector arithmetic, loop analysis
- Security Analysis: String constraints, cryptographic protocols
- Hardware Verification: Bitvector arithmetic, finite state machines
Performance Characteristics
- CPU-bound: most bottlenecks stem from algorithmic complexity
- Memory-intensive: Large AST structures, theory state
- Variable performance: 2x-15x differences based on configuration
- Timeout-sensitive: Many problems have exponential worst-case behavior
Micro-Benchmark Strategy
Quick Development Cycle Setup
- Target: Sub-minute benchmark runs for rapid iteration
- Scope: Focus on hottest code paths identified by profiling
- Approach:
  - Create minimal test cases exercising specific algorithms
  - Use scoped_timer for microsecond-level timing
  - Run multiple iterations for statistical significance
Micro-Benchmark Framework
```cpp
// Example micro-benchmark pattern
#include <iostream>
// 'timer' is Z3's utility wall-clock timer (src/util/timer.h)

void bench_sat_propagation() {
    timer t;
    for (int i = 0; i < 1000; ++i) {
        // Set up a minimal SAT instance
        // Run propagation
    }
    std::cout << "Avg time: " << t.get_seconds() / 1000 << "s\n";
}
```
Performance Engineering Goals
Round 1: Foundation (Immediate - 2 weeks)
- Establish reliable micro-benchmark infrastructure
- Profile typical workloads to identify hotspots
- Create before/after measurement framework
- Target: 5-10% improvements in core algorithms
Round 2: Algorithmic Optimizations (1-2 months)
- SAT solver decision heuristics improvements
- Theory propagation optimizations
- Memory allocation pattern improvements
- Target: 10-25% improvements in specific domains
Round 3: Advanced Optimizations (2-3 months)
- Parallel algorithm implementation
- SIMD vectorization where applicable
- Cache-friendly data structure reorganization
- Target: 25-50% improvements in targeted scenarios
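The cache-friendly reorganization mentioned in Round 3 is the classic array-of-structs versus struct-of-arrays trade-off. The sketch below is generic, and the ClauseInfo field names are illustrative rather than taken from Z3; it shows why a loop over one hot field touches far fewer cache lines in the struct-of-arrays layout.

```cpp
#include <vector>

// Array-of-structs: iterating over only 'activity' still pulls the whole
// record into cache, wasting bandwidth on the cold fields.
struct ClauseInfoAoS {
    double   activity;   // hot: read on every bump/decay pass
    unsigned size;       // cold
    unsigned glue;       // cold
    bool     learned;    // cold
};

double sum_activity_aos(const std::vector<ClauseInfoAoS>& v) {
    double s = 0.0;
    for (const auto& c : v) s += c.activity;
    return s;
}

// Struct-of-arrays: the hot field is stored contiguously, so the same loop
// streams through dense memory.
struct ClauseInfoSoA {
    std::vector<double>   activity;
    std::vector<unsigned> size;
    std::vector<unsigned> glue;
    std::vector<bool>     learned;
};

double sum_activity_soa(const ClauseInfoSoA& v) {
    double s = 0.0;
    for (double a : v.activity) s += a;
    return s;
}
```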
Specific Performance Improvement Areas
High-Impact Opportunities
- SAT Solver Optimizations
  - src/sat/sat_solver.cpp - core propagation loops
  - src/sat/sat_clause.cpp - clause management
  - Watch list optimization, better decision heuristics (see the sketch after this list)
- Theory Integration
  - src/smt/theory_arith.cpp - arithmetic theory
  - src/smt/theory_array.cpp - array theory
  - Faster theory propagation, better conflict analysis
- Memory Management
  - src/util/region.cpp - region allocator
  - src/ast/ast.cpp - AST allocation patterns
  - Reduce allocation overhead, improve cache locality
- Quantifier Handling
  - src/smt/mam.cpp - multi-pattern matching
  - src/smt/qi_queue.cpp - instantiation queue
  - Faster pattern matching algorithms
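For background on the watch-list idea referenced in the SAT solver item, here is a heavily simplified sketch of two-watched-literal propagation. It is a generic illustration under its own literal encoding and does not mirror the data structures in src/sat/sat_solver.cpp.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Literals encoded as 2*var (positive) and 2*var+1 (negative).
using Lit = int;
inline Lit negate_lit(Lit l) { return l ^ 1; }

enum class Val { Unassigned, True, False };

struct Clause { std::vector<Lit> lits; };   // lits[0] and lits[1] are watched

struct Solver {
    std::vector<Clause> clauses;
    std::vector<std::vector<int>> watches;  // watches[l] = clause indices watching literal l
    std::vector<Val> value;                 // per-literal value; l and ~l kept consistent

    bool is_true (Lit l) const { return value[l] == Val::True;  }
    bool is_false(Lit l) const { return value[l] == Val::False; }
    void assign(Lit l) { value[l] = Val::True; value[negate_lit(l)] = Val::False; }

    // Called after literal l has just become false; returns false on conflict.
    // Implied literals are assigned and appended to 'trail'.
    bool propagate(Lit l, std::vector<Lit>& trail) {
        std::vector<int>& ws = watches[l];
        std::size_t keep = 0;
        for (std::size_t i = 0; i < ws.size(); ++i) {
            Clause& c = clauses[ws[i]];
            if (c.lits[0] == l) std::swap(c.lits[0], c.lits[1]);      // falsified watch in slot 1
            if (is_true(c.lits[0])) { ws[keep++] = ws[i]; continue; } // already satisfied
            bool moved = false;
            for (std::size_t k = 2; k < c.lits.size(); ++k) {
                if (!is_false(c.lits[k])) {                 // found a replacement watch
                    std::swap(c.lits[1], c.lits[k]);
                    watches[c.lits[1]].push_back(ws[i]);
                    moved = true;
                    break;
                }
            }
            if (moved) continue;                            // clause now watched elsewhere
            ws[keep++] = ws[i];                             // stays watched here
            if (is_false(c.lits[0])) {                      // every literal false: conflict
                for (++i; i < ws.size(); ++i) ws[keep++] = ws[i];
                ws.resize(keep);
                return false;
            }
            assign(c.lits[0]);                              // unit clause: imply the other watch
            trail.push_back(c.lits[0]);
        }
        ws.resize(keep);
        return true;
    }
};
```

To use the sketch, watches and value must be sized to twice the variable count and each clause registered in the watch lists of its first two literals; production solvers add blocker literals, in-place watch compaction, and conflict-clause recording on top of this skeleton.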
Low-Hanging Fruit
- Compiler optimization flags verification
- Loop unrolling in hot paths
- Branch prediction optimization
- Constant folding improvements
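As a concrete form of the branch-prediction item above, GCC and Clang accept __builtin_expect hints (C++20 adds the [[likely]]/[[unlikely]] attributes). The example below is generic, the hot loop is hypothetical, and any actual benefit in Z3's hot paths would need to be confirmed by profiling.

```cpp
#include <cstddef>

// GCC/Clang-style hint macros; they degrade to plain conditions elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

// Hypothetical hot loop: the negative case is rare, so hint the compiler
// to lay out the common path as the fall-through.
long sum_positive(const int* data, std::size_t n, bool* saw_negative) {
    long acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (UNLIKELY(data[i] < 0)) {   // rare input, keep off the fast path
            *saw_negative = true;
            continue;
        }
        acc += data[i];
    }
    return acc;
}
```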
Development Environment Setup
Build Configuration for Performance Work
```bash
# Release build with debug symbols
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DZ3_ENABLE_TRACING_FOR_NON_DEBUG=TRUE \
      -DZ3_BUILD_LIBZ3_SHARED=FALSE \
      ../
```
Profiling Setup
```bash
# Build with profiling
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DCMAKE_CXX_FLAGS="-pg -fno-omit-frame-pointer" \
      ../
```
Quick Testing Commands
```bash
# Fast unit tests
cd build && ./test-z3 -a

# Targeted regression tests
python z3test/scripts/test_benchmarks.py build/z3 z3test/regressions/smt2-extra

# Statistics reporting
./z3 -st input.smt2
```
Success Metrics
Performance Measurements
- Regression benchmarks: Time to complete z3test/regressions/smt2
- Unit test performance: test-z3 execution time
- Memory usage: Peak RSS, allocation counts
- Cache performance: Cache miss rates (via perf)
Quality Gates
- All existing tests pass
- No performance regressions on standard benchmarks
- Statistically significant improvements (>5%)
- Memory usage neutral or improved
Implementation Strategy
Workflow
- Profile existing code to identify hotspots
- Design micro-benchmarks for targeted areas
- Measure baseline performance
- Implement optimization
- Verify improvement with before/after measurements
- Test regression test suite
- Document changes and performance impact
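For the "verify improvement with before/after measurements" step, the comparison can be scripted roughly as follows. This is a generic sketch rather than existing Z3 tooling, the timing numbers are made up for illustration, and the 5% threshold mirrors the quality gate above.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Stats { double mean; double stddev; };

// Sample mean and standard deviation of repeated wall-clock timings.
static Stats summarize(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    double mean = sum / xs.size();
    double var = 0.0;
    for (double x : xs) var += (x - mean) * (x - mean);
    var /= (xs.size() > 1 ? xs.size() - 1 : 1);
    return {mean, std::sqrt(var)};
}

int main() {
    // Hypothetical timings (seconds) from repeated runs of the same benchmark.
    std::vector<double> before = {12.4, 12.6, 12.3, 12.7, 12.5};
    std::vector<double> after  = {11.5, 11.6, 11.4, 11.7, 11.5};

    Stats b = summarize(before), a = summarize(after);
    double speedup = (b.mean - a.mean) / b.mean * 100.0;

    std::printf("before: %.3f s (+/- %.3f)\n", b.mean, b.stddev);
    std::printf("after:  %.3f s (+/- %.3f)\n", a.mean, a.stddev);
    std::printf("improvement: %.1f%%\n", speedup);

    // Apply the >5% quality gate, and require the gap to exceed the run-to-run noise.
    bool significant = speedup > 5.0 && (b.mean - a.mean) > 2.0 * (b.stddev + a.stddev);
    std::printf("passes gate: %s\n", significant ? "yes" : "no");
    return 0;
}
```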
Tools
- Profiler: perf, gprof, valgrind --tool=callgrind
- Benchmarking: custom timing harnesses, Phoronix Test Suite
- Testing: z3test repository, internal test-z3
- Memory: valgrind --tool=massif, AddressSanitizer
Risks and Mitigation
Technical Risks
- Complexity: Z3's codebase is highly optimized already
- Correctness: Performance changes may introduce bugs
- Regression: Optimizing one case may hurt others
Mitigation Strategies
- Comprehensive testing before/after changes
- Small, incremental improvements
- Extensive regression testing
- Conservative optimization with fallbacks
Next Steps
- Set up micro-benchmark framework
- Profile representative workloads
- Identify highest-impact optimization targets
- Begin with conservative algorithmic improvements
- Measure and document all changes
> AI-generated content by Daily Perf Improver may contain mistakes.
Generated by Agentic Workflow Run