Z3 Performance Research and Improvement Plan
Executive Summary
Z3 is a high-performance theorem prover from Microsoft Research with a complex architecture spanning multiple domains (SAT solving, SMT theories, quantifier instantiation, etc.). This research identifies key performance bottlenecks and the existing testing infrastructure, and establishes a systematic approach to performance improvements.
Current Performance Testing Infrastructure
Build System
- Primary: CMake-based build system with multiple configurations (Release, Debug, RelWithDebInfo)
- Legacy: Python-based build system (scripts/mk_make.py)
- Build Commands:

```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make -j$(nproc)
```
Testing Framework
- Unit Tests: the test-z3 executable runs comprehensive unit tests (./test-z3 -a)
- Regression Tests: uses the external z3test repository with benchmarks:

```bash
git clone https://github.com/z3prover/z3test
python z3test/scripts/test_benchmarks.py build/z3 z3test/regressions/smt2
```
- Performance Benchmarks: Phoronix Test Suite integration (4 min average runtime, 0.1% std dev)
Timing and Statistics Infrastructure
- Statistics: the statistics class in src/util/statistics.h for collecting metrics
- Timers: the timer, scoped_timer, and timeout classes for performance measurement
- Profiling Tools: contrib/qprofdiff/ - quantifier instance profiling differential tool
- Built-in statistics reporting (the -st flag)
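To illustrate the kind of microsecond-level measurement this infrastructure enables, here is a minimal, self-contained timing sketch. It uses std::chrono rather than Z3's internal timer class, and dummy_workload is a hypothetical stand-in for whatever code path is under test.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-in for the code path being measured.
static long dummy_workload(long n) {
    long acc = 0;
    for (long i = 0; i < n; ++i)
        acc += i % 7;          // cheap, branchy work to time
    return acc;
}

int main() {
    using clock = std::chrono::steady_clock;
    const int iterations = 1000;

    auto start = clock::now();
    long sink = 0;
    for (int i = 0; i < iterations; ++i)
        sink += dummy_workload(10000);
    auto stop = clock::now();

    double total_s = std::chrono::duration<double>(stop - start).count();
    // Report the average per-iteration time; printing 'sink' keeps the
    // compiler from optimizing the workload away.
    std::printf("avg: %.9f s (sink=%ld)\n", total_s / iterations, sink);
    return 0;
}
```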
Performance Bottlenecks and Characteristics
Key Performance Areas
- SAT Solver Core (src/sat/)
  - Clause learning and deletion
  - Decision heuristics
  - Propagation algorithms
  - Preprocessing (simplification, symmetry breaking)
- SMT Theory Solvers (src/smt/theory_*.cpp)
  - Arithmetic (linear/nonlinear)
  - Arrays, bitvectors, datatypes
  - String/sequence theory
  - Theory combination
- Quantifier Instantiation (src/smt/qi_*.cpp, src/muz/)
  - Pattern matching
  - E-matching
  - Model-based quantifier instantiation
- Memory Management
  - AST allocation/deallocation
  - Region-based allocation (sketched after this list)
  - Garbage collection
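The region-based allocation item above can be illustrated with a generic bump-allocator sketch. This is illustrative only, not the actual implementation in src/util/region.cpp: allocation is a pointer bump, and all objects are released at once when the region is reset.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal bump ("region") allocator: allocation is a pointer increment,
// and everything is freed in one shot when the region is reset.
class Region {
    static constexpr std::size_t CHUNK = 64 * 1024;
    std::vector<char*> m_chunks;
    char*       m_ptr  = nullptr;
    std::size_t m_left = 0;

public:
    void* allocate(std::size_t n) {
        // Round the request up to the maximum fundamental alignment.
        n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (n > m_left) {
            std::size_t sz = n > CHUNK ? n : CHUNK;
            m_chunks.push_back(static_cast<char*>(std::malloc(sz)));
            m_ptr  = m_chunks.back();
            m_left = sz;
        }
        void* r = m_ptr;
        m_ptr  += n;
        m_left -= n;
        return r;
    }

    void reset() {                       // free all objects at once
        for (char* c : m_chunks) std::free(c);
        m_chunks.clear();
        m_ptr  = nullptr;
        m_left = 0;
    }

    ~Region() { reset(); }
};
```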
Typical Workloads
- Industrial Software Verification: Large formulas with arrays, uninterpreted functions
- Compiler Optimization: Bitvector arithmetic, loop analysis
- Security Analysis: String constraints, cryptographic protocols
- Hardware Verification: Bitvector arithmetic, finite state machines
Performance Characteristics
- CPU-bound: most bottlenecks stem from algorithmic complexity
- Memory-intensive: Large AST structures, theory state
- Variable performance: 2x-15x differences based on configuration
- Timeout-sensitive: Many problems have exponential worst-case behavior
Micro-Benchmark Strategy
Quick Development Cycle Setup
- Target: Sub-minute benchmark runs for rapid iteration
- Scope: Focus on hottest code paths identified by profiling
- Approach:
  - Create minimal test cases exercising specific algorithms
  - Use scoped_timer for microsecond-level timing
  - Run multiple iterations for statistical significance
Micro-Benchmark Framework
```cpp
// Example micro-benchmark pattern
#include <iostream>
// 'timer' is Z3's utility wall-clock timer (src/util/timer.h)

void bench_sat_propagation() {
    timer t;
    for (int i = 0; i < 1000; ++i) {
        // Set up a minimal SAT instance
        // Run propagation
    }
    std::cout << "Avg time: " << t.get_seconds() / 1000 << "s\n";
}
```
Performance Engineering Goals
Round 1: Foundation (Immediate - 2 weeks)
- Establish reliable micro-benchmark infrastructure
- Profile typical workloads to identify hotspots
- Create before/after measurement framework
- Target: 5-10% improvements in core algorithms
Round 2: Algorithmic Optimizations (1-2 months)
- SAT solver decision heuristics improvements
- Theory propagation optimizations
- Memory allocation pattern improvements
- Target: 10-25% improvements in specific domains
Round 3: Advanced Optimizations (2-3 months)
- Parallel algorithm implementation
- SIMD vectorization where applicable
- Cache-friendly data structure reorganization
- Target: 25-50% improvements in targeted scenarios
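The cache-friendly reorganization mentioned in Round 3 is the classic array-of-structs versus struct-of-arrays trade-off. The sketch below is generic, and the ClauseInfo field names are illustrative rather than taken from Z3; it shows why a loop over one hot field touches far fewer cache lines in the struct-of-arrays layout.

```cpp
#include <vector>

// Array-of-structs: iterating over only 'activity' still pulls the whole
// record into cache, wasting bandwidth on the cold fields.
struct ClauseInfoAoS {
    double   activity;   // hot: read on every bump/decay pass
    unsigned size;       // cold
    unsigned glue;       // cold
    bool     learned;    // cold
};

double sum_activity_aos(const std::vector<ClauseInfoAoS>& v) {
    double s = 0.0;
    for (const auto& c : v) s += c.activity;
    return s;
}

// Struct-of-arrays: the hot field is stored contiguously, so the same loop
// streams through dense memory.
struct ClauseInfoSoA {
    std::vector<double>   activity;
    std::vector<unsigned> size;
    std::vector<unsigned> glue;
    std::vector<bool>     learned;
};

double sum_activity_soa(const ClauseInfoSoA& v) {
    double s = 0.0;
    for (double a : v.activity) s += a;
    return s;
}
```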
Specific Performance Improvement Areas
High-Impact Opportunities
- SAT Solver Optimizations
  - src/sat/sat_solver.cpp - core propagation loops
  - src/sat/sat_clause.cpp - clause management
  - Watch list optimization, better decision heuristics (see the sketch after this list)
- Theory Integration
  - src/smt/theory_arith.cpp - arithmetic theory
  - src/smt/theory_array.cpp - array theory
  - Faster theory propagation, better conflict analysis
- Memory Management
  - src/util/region.cpp - region allocator
  - src/ast/ast.cpp - AST allocation patterns
  - Reduce allocation overhead, improve cache locality
- Quantifier Handling
  - src/smt/mam.cpp - multi-pattern matching
  - src/smt/qi_queue.cpp - instantiation queue
  - Faster pattern matching algorithms
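For background on the watch-list idea referenced in the SAT solver item, here is a heavily simplified sketch of two-watched-literal propagation. It is a generic illustration under its own literal encoding and does not mirror the data structures in src/sat/sat_solver.cpp.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Literals encoded as 2*var (positive) and 2*var+1 (negative).
using Lit = int;
inline Lit negate_lit(Lit l) { return l ^ 1; }

enum class Val { Unassigned, True, False };

struct Clause { std::vector<Lit> lits; };   // lits[0] and lits[1] are watched

struct Solver {
    std::vector<Clause> clauses;
    std::vector<std::vector<int>> watches;  // watches[l] = clause indices watching literal l
    std::vector<Val> value;                 // per-literal value; l and ~l kept consistent

    bool is_true (Lit l) const { return value[l] == Val::True;  }
    bool is_false(Lit l) const { return value[l] == Val::False; }
    void assign(Lit l) { value[l] = Val::True; value[negate_lit(l)] = Val::False; }

    // Called after literal l has just become false; returns false on conflict.
    // Implied literals are assigned and appended to 'trail'.
    bool propagate(Lit l, std::vector<Lit>& trail) {
        std::vector<int>& ws = watches[l];
        std::size_t keep = 0;
        for (std::size_t i = 0; i < ws.size(); ++i) {
            Clause& c = clauses[ws[i]];
            if (c.lits[0] == l) std::swap(c.lits[0], c.lits[1]);      // falsified watch in slot 1
            if (is_true(c.lits[0])) { ws[keep++] = ws[i]; continue; } // already satisfied
            bool moved = false;
            for (std::size_t k = 2; k < c.lits.size(); ++k) {
                if (!is_false(c.lits[k])) {                 // found a replacement watch
                    std::swap(c.lits[1], c.lits[k]);
                    watches[c.lits[1]].push_back(ws[i]);
                    moved = true;
                    break;
                }
            }
            if (moved) continue;                            // clause now watched elsewhere
            ws[keep++] = ws[i];                             // stays watched here
            if (is_false(c.lits[0])) {                      // every literal false: conflict
                for (++i; i < ws.size(); ++i) ws[keep++] = ws[i];
                ws.resize(keep);
                return false;
            }
            assign(c.lits[0]);                              // unit clause: imply the other watch
            trail.push_back(c.lits[0]);
        }
        ws.resize(keep);
        return true;
    }
};
```

To use the sketch, watches and value must be sized to twice the variable count and each clause registered in the watch lists of its first two literals; production solvers add blocker literals, in-place watch compaction, and conflict-clause recording on top of this skeleton.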
Low-Hanging Fruit
- Compiler optimization flags verification
- Loop unrolling in hot paths
- Branch prediction optimization
- Constant folding improvements
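As a concrete form of the branch-prediction item above, GCC and Clang accept __builtin_expect hints (C++20 adds the [[likely]]/[[unlikely]] attributes). The example below is generic, the hot loop is hypothetical, and any actual benefit in Z3's hot paths would need to be confirmed by profiling.

```cpp
#include <cstddef>

// GCC/Clang-style hint macros; they degrade to plain conditions elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

// Hypothetical hot loop: the negative case is rare, so hint the compiler
// to lay out the common path as the fall-through.
long sum_positive(const int* data, std::size_t n, bool* saw_negative) {
    long acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (UNLIKELY(data[i] < 0)) {   // rare input, keep off the fast path
            *saw_negative = true;
            continue;
        }
        acc += data[i];
    }
    return acc;
}
```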
Development Environment Setup
Build Configuration for Performance Work
```bash
# Release build with debug symbols
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DZ3_ENABLE_TRACING_FOR_NON_DEBUG=TRUE \
      -DZ3_BUILD_LIBZ3_SHARED=FALSE \
      ../
```
Profiling Setup
```bash
# Build with profiling
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DCMAKE_CXX_FLAGS="-pg -fno-omit-frame-pointer" \
      ../
```
Quick Testing Commands
```bash
# Fast unit tests
cd build && ./test-z3 -a

# Targeted regression tests
python z3test/scripts/test_benchmarks.py build/z3 z3test/regressions/smt2-extra

# Statistics reporting
./z3 -st input.smt2
```
Success Metrics
Performance Measurements
- Regression benchmarks: Time to complete z3test/regressions/smt2
- Unit test performance: test-z3 execution time
- Memory usage: Peak RSS, allocation counts
- Cache performance: Cache miss rates (via perf)
Quality Gates
- All existing tests pass
- No performance regressions on standard benchmarks
- Statistically significant improvements (>5%)
- Memory usage neutral or improved
Implementation Strategy
Workflow
- Profile existing code to identify hotspots
- Design micro-benchmarks for targeted areas
- Measure baseline performance
- Implement optimization
- Verify improvement with before/after measurements
- Test regression test suite
- Document changes and performance impact
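For the "verify improvement with before/after measurements" step, the comparison can be scripted roughly as follows. This is a generic sketch rather than existing Z3 tooling, the timing numbers are made up for illustration, and the 5% threshold mirrors the quality gate above.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Stats { double mean; double stddev; };

// Sample mean and standard deviation of repeated wall-clock timings.
static Stats summarize(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    double mean = sum / xs.size();
    double var = 0.0;
    for (double x : xs) var += (x - mean) * (x - mean);
    var /= (xs.size() > 1 ? xs.size() - 1 : 1);
    return {mean, std::sqrt(var)};
}

int main() {
    // Hypothetical timings (seconds) from repeated runs of the same benchmark.
    std::vector<double> before = {12.4, 12.6, 12.3, 12.7, 12.5};
    std::vector<double> after  = {11.5, 11.6, 11.4, 11.7, 11.5};

    Stats b = summarize(before), a = summarize(after);
    double speedup = (b.mean - a.mean) / b.mean * 100.0;

    std::printf("before: %.3f s (+/- %.3f)\n", b.mean, b.stddev);
    std::printf("after:  %.3f s (+/- %.3f)\n", a.mean, a.stddev);
    std::printf("improvement: %.1f%%\n", speedup);

    // Apply the >5% quality gate, and require the gap to exceed the run-to-run noise.
    bool significant = speedup > 5.0 && (b.mean - a.mean) > 2.0 * (b.stddev + a.stddev);
    std::printf("passes gate: %s\n", significant ? "yes" : "no");
    return 0;
}
```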
Tools
- Profiler: perf, gprof, valgrind --tool=callgrind
- Benchmarking: custom timing harnesses, Phoronix Test Suite
- Testing: z3test repository, internal test-z3
- Memory: valgrind --tool=massif, AddressSanitizer
Risks and Mitigation
Technical Risks
- Complexity: Z3's codebase is highly optimized already
- Correctness: Performance changes may introduce bugs
- Regression: Optimizing one case may hurt others
Mitigation Strategies
- Comprehensive testing before/after changes
- Small, incremental improvements
- Extensive regression testing
- Conservative optimization with fallbacks
Next Steps
- Set up micro-benchmark framework
- Profile representative workloads
- Identify highest-impact optimization targets
- Begin with conservative algorithmic improvements
- Measure and document all changes
> AI-generated content by Daily Perf Improver may contain mistakes.
Generated by Agentic Workflow Run