perf(compiler): extend x86 cg peephole rules #437

Closed
abmcar wants to merge 6 commits into DTVMStack:main from abmcar:perf/x86-cg-peephole-rules

@abmcar abmcar commented Mar 29, 2026

Summary

  • extend the x86 CG peephole pass beyond the original cmp/setcc/test/jcc fold
  • add test/setcc simplification, zero-add elimination, adc-zero canonicalization, imm-zero no-op removal, and broader branch folding patterns seen in JIT output
  • keep the change localized to src/compiler/target/x86/x86_cg_peephole.cpp and .h
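
To illustrate the imm-zero no-op removal listed above, here is a minimal sketch on a toy instruction list. The `Inst` struct, the `Op` enum, and all function names are hypothetical and are not taken from x86_cg_peephole.cpp. The flags-liveness guard is an assumption about what makes the erase safe: on x86, `add r, 0`, `sub r, 0`, and `or r, 0` leave the register unchanged but still write EFLAGS.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy IR; the real pass works on the project's own CG IR.
enum class Op { Add, Sub, Or, Cmp, Jcc, Mov };

struct Inst {
    Op op;
    int dst;           // destination register id (ignored for Jcc)
    std::int64_t imm;  // immediate operand, valid only when hasImm is true
    bool hasImm;
};

// In this toy model only conditional jumps read EFLAGS.
static bool readsFlags(const Inst &i) { return i.op == Op::Jcc; }

// Arithmetic, logic, and compare instructions overwrite EFLAGS.
static bool writesFlags(const Inst &i) {
    return i.op == Op::Add || i.op == Op::Sub || i.op == Op::Or ||
           i.op == Op::Cmp;
}

// Flags produced at `idx` are dead if they are overwritten before any read.
static bool flagsDeadAfter(const std::vector<Inst> &block, std::size_t idx) {
    for (std::size_t j = idx + 1; j < block.size(); ++j) {
        if (readsFlags(block[j])) return false;
        if (writesFlags(block[j])) return true;
    }
    return false;  // conservative: flags may be live out of the block
}

// Assumed whitelist: opcodes whose zero immediate leaves the register unchanged.
static bool isZeroImmNoOp(const Inst &i) {
    return i.hasImm && i.imm == 0 &&
           (i.op == Op::Add || i.op == Op::Sub || i.op == Op::Or);
}

// Returns a copy of the block with erasable zero-immediate no-ops removed.
std::vector<Inst> eraseZeroImmNoOps(const std::vector<Inst> &block) {
    std::vector<Inst> out;
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (isZeroImmNoOp(block[i]) && flagsDeadAfter(block, i)) continue;
        out.push_back(block[i]);
    }
    return out;
}
```

For `add r0, 0; cmp r0, 5; jcc` the sketch drops the `add`, because `cmp` overwrites the flags before `jcc` reads them. For `add r0, 0; jcc` it keeps the `add`, since the branch consumes the flags that the `add` produced.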

Validation

  • tools/format.sh check
  • ./build/evmStateTests --gtest_brief=1

Benchmark

Latest reliable local external/total comparison against the previous stable candidate:

  • 166 improved
  • 22 regressed
  • 6 flat
  • geomean: 1.069x
  • main/*: 1.023x
  • micro/*: 1.012x
  • synth/*: 1.078x
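
The PR does not say how the geomean figures were computed. The sketch below assumes the usual definition: the geometric mean of per-benchmark speedups, where each speedup is baseline time divided by current time. The function name `geomeanSpeedup` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Geometric mean of per-benchmark speedups (baseline time / current time).
// Summing logarithms avoids overflow or underflow from a long product.
double geomeanSpeedup(const std::vector<double> &baseline,
                      const std::vector<double> &current) {
    assert(!baseline.empty() && baseline.size() == current.size());
    double logSum = 0.0;
    for (std::size_t i = 0; i < baseline.size(); ++i) {
        logSum += std::log(baseline[i] / current[i]);
    }
    return std::exp(logSum / static_cast<double>(baseline.size()));
}
```

Under this definition a value above 1.0 means the current build is faster on average, and a 2x win on one benchmark is exactly cancelled by a 2x loss on another.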

Largest wins in that run:

  • external/total/synth/GAS/a0,a1: about 2.36x to 2.41x
  • external/total/micro/memory_grow_mstore/by32: about 1.82x
  • external/total/micro/memory_grow_mload/by32: about 1.55x

Known regressions remain concentrated in a small set of micro cases:

  • memory_grow_mload/nogrow
  • memory_grow_mload/by1
  • memory_grow_mstore/nogrow
  • memory_grow_mstore/by1
  • memory_grow_mstore/by16
  • a few jump_around / loop_with_many_jumpdests / signextend cases

github-actions Bot commented Mar 29, 2026

⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 1.67 1.68 +0.4% PASS
total/main/blake2b_huff/empty 0.03 0.03 +0.2% PASS
total/main/blake2b_shifts/8415nulls 13.04 12.70 -2.5% PASS
total/main/sha1_divs/5311 5.81 5.66 -2.6% PASS
total/main/sha1_divs/empty 0.07 0.06 -2.0% PASS
total/main/sha1_shifts/5311 3.41 3.19 -6.3% PASS
total/main/sha1_shifts/empty 0.04 0.04 -7.4% PASS
total/main/snailtracer/benchmark 55.00 54.50 -0.9% PASS
total/main/structarray_alloc/nfts_rank 1.12 1.07 -3.6% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 -4.0% PASS
total/main/swap_math/received 0.01 0.01 -0.9% PASS
total/main/swap_math/spent 0.00 0.00 -2.8% PASS
total/main/weierstrudel/1 0.29 0.29 -0.2% PASS
total/main/weierstrudel/15 3.35 3.34 -0.3% PASS
total/micro/JUMPDEST_n0/empty 2.01 1.85 -7.8% PASS
total/micro/jump_around/empty 0.06 0.06 +0.3% PASS
total/micro/loop_with_many_jumpdests/empty 30.91 28.13 -9.0% PASS
total/micro/memory_grow_mload/by1 0.11 0.09 -18.6% PASS
total/micro/memory_grow_mload/by16 0.12 0.10 -18.3% PASS
total/micro/memory_grow_mload/by32 0.13 0.11 -10.8% PASS
total/micro/memory_grow_mload/nogrow 0.11 0.08 -22.6% PASS
total/micro/memory_grow_mstore/by1 0.11 0.10 -15.1% PASS
total/micro/memory_grow_mstore/by16 0.12 0.12 -5.7% PASS
total/micro/memory_grow_mstore/by32 0.14 0.14 +0.7% PASS
total/micro/memory_grow_mstore/nogrow 0.11 0.09 -15.7% PASS
total/micro/signextend/one 0.25 0.26 +2.2% PASS
total/micro/signextend/zero 0.25 0.25 -1.9% PASS
total/synth/ADD/b0 2.16 2.20 +1.9% PASS
total/synth/ADD/b1 2.30 2.28 -0.8% PASS
total/synth/ADDRESS/a0 5.56 5.54 -0.5% PASS
total/synth/ADDRESS/a1 6.16 5.81 -5.8% PASS
total/synth/AND/b0 1.87 1.86 -0.7% PASS
total/synth/AND/b1 1.90 1.88 -0.8% PASS
total/synth/BYTE/b0 6.85 6.79 -0.7% PASS
total/synth/BYTE/b1 5.44 5.42 -0.5% PASS
total/synth/CALLDATASIZE/a0 3.62 3.22 -10.8% PASS
total/synth/CALLDATASIZE/a1 4.19 3.85 -8.1% PASS
total/synth/CALLER/a0 5.55 5.53 -0.2% PASS
total/synth/CALLER/a1 6.17 5.95 -3.5% PASS
total/synth/CALLVALUE/a0 3.99 3.65 -8.5% PASS
total/synth/CALLVALUE/a1 4.16 3.47 -16.7% PASS
total/synth/CODESIZE/a0 4.23 3.59 -15.1% PASS
total/synth/CODESIZE/a1 4.25 4.04 -4.9% PASS
total/synth/DUP1/d0 1.57 1.38 -11.9% PASS
total/synth/DUP1/d1 1.59 1.48 -6.8% PASS
total/synth/DUP10/d0 1.57 1.10 -29.8% PASS
total/synth/DUP10/d1 1.58 1.11 -29.6% PASS
total/synth/DUP11/d0 1.57 1.38 -12.2% PASS
total/synth/DUP11/d1 1.58 0.86 -46.0% PASS
total/synth/DUP12/d0 1.57 1.29 -17.6% PASS
total/synth/DUP12/d1 1.58 1.11 -29.6% PASS
total/synth/DUP13/d0 1.57 1.11 -29.3% PASS
total/synth/DUP13/d1 1.58 1.11 -29.6% PASS
total/synth/DUP14/d0 1.57 1.29 -17.8% PASS
total/synth/DUP14/d1 1.58 1.11 -29.7% PASS
total/synth/DUP15/d0 1.57 1.11 -29.3% PASS
total/synth/DUP15/d1 1.58 1.39 -12.4% PASS
total/synth/DUP16/d0 1.57 1.29 -17.7% PASS
total/synth/DUP16/d1 1.58 1.39 -12.3% PASS
total/synth/DUP2/d0 1.57 1.38 -11.8% PASS
total/synth/DUP2/d1 1.58 1.11 -29.6% PASS
total/synth/DUP3/d0 1.57 1.38 -12.0% PASS
total/synth/DUP3/d1 1.58 1.12 -29.5% PASS
total/synth/DUP4/d0 1.57 1.11 -29.2% PASS
total/synth/DUP4/d1 1.58 1.12 -29.5% PASS
total/synth/DUP5/d0 1.57 1.29 -17.9% PASS
total/synth/DUP5/d1 1.58 0.85 -46.3% PASS
total/synth/DUP6/d0 1.57 1.38 -12.1% PASS
total/synth/DUP6/d1 1.58 1.39 -12.4% PASS
total/synth/DUP7/d0 1.57 1.38 -12.1% PASS
total/synth/DUP7/d1 1.58 1.13 -28.9% PASS
total/synth/DUP8/d0 1.57 1.29 -17.8% PASS
total/synth/DUP8/d1 1.58 0.85 -46.4% PASS
total/synth/DUP9/d0 1.57 1.30 -17.3% PASS
total/synth/DUP9/d1 1.58 1.39 -12.3% PASS
total/synth/EQ/b0 3.16 3.12 -1.5% PASS
total/synth/EQ/b1 1.49 1.47 -0.9% PASS
total/synth/GAS/a0 3.83 3.86 +0.9% PASS
total/synth/GAS/a1 4.26 3.85 -9.5% PASS
total/synth/GT/b0 2.99 3.02 +1.1% PASS
total/synth/GT/b1 1.86 1.58 -15.5% PASS
total/synth/ISZERO/u0 1.66 0.96 -42.4% PASS
total/synth/JUMPDEST/n0 2.03 1.85 -9.0% PASS
total/synth/LT/b0 3.04 3.00 -1.3% PASS
total/synth/LT/b1 1.87 1.49 -20.4% PASS
total/synth/MSIZE/a0 4.88 4.90 +0.3% PASS
total/synth/MSIZE/a1 5.45 5.08 -6.8% PASS
total/synth/MUL/b0 6.23 6.18 -0.8% PASS
total/synth/MUL/b1 6.18 6.26 +1.3% PASS
total/synth/NOT/u0 1.83 1.83 -0.0% PASS
total/synth/OR/b0 1.87 1.87 -0.1% PASS
total/synth/OR/b1 1.89 1.88 -0.8% PASS
total/synth/PC/a0 3.71 3.23 -12.8% PASS
total/synth/PC/a1 3.97 4.28 +7.6% PASS
total/synth/PUSH1/p0 1.48 0.87 -41.0% PASS
total/synth/PUSH1/p1 1.47 1.38 -6.5% PASS
total/synth/PUSH10/p0 1.20 1.02 -15.3% PASS
total/synth/PUSH10/p1 1.49 1.39 -6.7% PASS
total/synth/PUSH11/p0 1.20 1.01 -16.2% PASS
total/synth/PUSH11/p1 1.49 1.14 -23.8% PASS
total/synth/PUSH12/p0 1.20 1.02 -15.0% PASS
total/synth/PUSH12/p1 1.50 1.12 -25.3% PASS
total/synth/PUSH13/p0 1.20 1.02 -15.3% PASS
total/synth/PUSH13/p1 1.49 1.38 -7.5% PASS
total/synth/PUSH14/p0 1.20 1.02 -15.2% PASS
total/synth/PUSH14/p1 1.47 1.18 -20.0% PASS
total/synth/PUSH15/p0 1.21 1.02 -15.5% PASS
total/synth/PUSH15/p1 1.50 1.13 -24.9% PASS
total/synth/PUSH16/p0 1.21 1.20 -0.4% PASS
total/synth/PUSH16/p1 1.49 1.11 -25.3% PASS
total/synth/PUSH17/p0 1.20 1.02 -15.4% PASS
total/synth/PUSH17/p1 1.49 1.13 -23.8% PASS
total/synth/PUSH18/p0 1.21 1.02 -15.9% PASS
total/synth/PUSH18/p1 1.48 1.12 -24.4% PASS
total/synth/PUSH19/p0 1.20 1.29 +7.5% PASS
total/synth/PUSH19/p1 1.48 1.11 -24.7% PASS
total/synth/PUSH2/p0 1.20 1.29 +7.5% PASS
total/synth/PUSH2/p1 1.47 1.11 -24.9% PASS
total/synth/PUSH20/p0 1.20 1.02 -15.3% PASS
total/synth/PUSH20/p1 1.49 1.12 -24.7% PASS
total/synth/PUSH21/p0 1.20 0.89 -25.7% PASS
total/synth/PUSH21/p1 1.49 1.39 -6.9% PASS
total/synth/PUSH22/p0 1.20 1.02 -15.2% PASS
total/synth/PUSH22/p1 1.50 1.12 -25.3% PASS
total/synth/PUSH23/p0 1.20 1.29 +7.6% PASS
total/synth/PUSH23/p1 1.49 1.40 -6.0% PASS
total/synth/PUSH24/p0 1.22 1.02 -16.6% PASS
total/synth/PUSH24/p1 1.50 1.12 -25.5% PASS
total/synth/PUSH25/p0 1.20 1.02 -14.9% PASS
total/synth/PUSH25/p1 1.52 1.39 -8.2% PASS
total/synth/PUSH26/p0 1.20 1.29 +7.4% PASS
total/synth/PUSH26/p1 1.50 1.13 -24.9% PASS
total/synth/PUSH27/p0 1.48 1.02 -31.0% PASS
total/synth/PUSH27/p1 1.50 1.14 -23.6% PASS
total/synth/PUSH28/p0 1.20 0.93 -22.7% PASS
total/synth/PUSH28/p1 1.50 1.12 -25.3% PASS
total/synth/PUSH29/p0 1.20 1.02 -15.2% PASS
total/synth/PUSH29/p1 1.50 1.40 -7.3% PASS
total/synth/PUSH3/p0 1.20 1.29 +7.5% PASS
total/synth/PUSH3/p1 1.49 1.12 -24.9% PASS
total/synth/PUSH30/p0 1.20 1.30 +8.3% PASS
total/synth/PUSH30/p1 1.50 1.12 -24.9% PASS
total/synth/PUSH31/p0 1.24 1.02 -17.5% PASS
total/synth/PUSH31/p1 1.66 1.20 -27.4% PASS
total/synth/PUSH32/p0 1.20 0.93 -22.8% PASS
total/synth/PUSH32/p1 1.50 1.38 -7.5% PASS
total/synth/PUSH4/p0 1.22 1.02 -16.5% PASS
total/synth/PUSH4/p1 1.50 1.40 -6.3% PASS
total/synth/PUSH5/p0 1.20 0.89 -26.1% PASS
total/synth/PUSH5/p1 1.50 1.12 -25.8% PASS
total/synth/PUSH6/p0 1.20 1.03 -14.1% PASS
total/synth/PUSH6/p1 1.49 1.13 -24.1% PASS
total/synth/PUSH7/p0 1.20 1.29 +7.6% PASS
total/synth/PUSH7/p1 1.49 1.13 -24.2% PASS
total/synth/PUSH8/p0 1.20 1.02 -15.3% PASS
total/synth/PUSH8/p1 1.49 1.38 -7.2% PASS
total/synth/PUSH9/p0 1.20 0.93 -22.9% PASS
total/synth/PUSH9/p1 1.48 1.12 -24.6% PASS
total/synth/RETURNDATASIZE/a0 4.33 3.60 -16.8% PASS
total/synth/RETURNDATASIZE/a1 4.43 3.97 -10.4% PASS
total/synth/SAR/b0 4.34 4.33 -0.1% PASS
total/synth/SAR/b1 4.91 4.92 +0.4% PASS
total/synth/SGT/b0 3.04 3.01 -0.9% PASS
total/synth/SGT/b1 1.86 1.85 -0.9% PASS
total/synth/SHL/b0 3.51 3.50 -0.3% PASS
total/synth/SHL/b1 1.92 1.67 -12.7% PASS
total/synth/SHR/b0 3.43 3.51 +2.4% PASS
total/synth/SHR/b1 1.92 1.63 -15.3% PASS
total/synth/SIGNEXTEND/b0 4.19 3.65 -12.8% PASS
total/synth/SIGNEXTEND/b1 4.24 3.79 -10.7% PASS
total/synth/SLT/b0 3.04 3.08 +1.4% PASS
total/synth/SLT/b1 1.86 1.94 +4.1% PASS
total/synth/SUB/b0 2.24 2.20 -1.9% PASS
total/synth/SUB/b1 2.24 2.28 +2.1% PASS
total/synth/SWAP1/s0 1.68 1.84 +10.0% PASS
total/synth/SWAP10/s0 1.71 1.86 +8.7% PASS
total/synth/SWAP11/s0 1.71 1.86 +8.7% PASS
total/synth/SWAP12/s0 1.71 1.85 +8.4% PASS
total/synth/SWAP13/s0 1.71 1.86 +8.2% PASS
total/synth/SWAP14/s0 1.72 1.86 +8.3% PASS
total/synth/SWAP15/s0 1.72 1.86 +8.5% PASS
total/synth/SWAP16/s0 1.72 1.86 +8.4% PASS
total/synth/SWAP2/s0 1.68 1.85 +10.2% PASS
total/synth/SWAP3/s0 1.68 1.85 +10.0% PASS
total/synth/SWAP4/s0 1.69 1.67 -0.8% PASS
total/synth/SWAP5/s0 1.69 1.68 -0.8% PASS
total/synth/SWAP6/s0 1.74 1.86 +7.0% PASS
total/synth/SWAP7/s0 1.70 1.85 +9.1% PASS
total/synth/SWAP8/s0 1.70 1.72 +0.7% PASS
total/synth/SWAP9/s0 1.71 1.86 +8.7% PASS
total/synth/XOR/b0 1.74 1.74 +0.0% PASS
total/synth/XOR/b1 1.76 1.75 -0.5% PASS
total/synth/loop_v1 4.69 4.55 -3.0% PASS
total/synth/loop_v2 4.66 4.54 -2.6% PASS

Summary: 194 benchmarks, 0 regressions


✅ Performance Check Passed (multipass)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 0.86 0.83 -4.2% PASS
total/main/blake2b_huff/empty 0.01 0.01 -4.8% PASS
total/main/blake2b_shifts/8415nulls 4.49 4.30 -4.2% PASS
total/main/sha1_divs/5311 0.59 0.55 -6.8% PASS
total/main/sha1_divs/empty 0.01 0.01 -6.1% PASS
total/main/sha1_shifts/5311 0.55 0.51 -8.4% PASS
total/main/sha1_shifts/empty 0.01 0.01 -7.7% PASS
total/main/snailtracer/benchmark 31.63 30.53 -3.5% PASS
total/main/structarray_alloc/nfts_rank 0.30 0.26 -12.7% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 -3.5% PASS
total/main/swap_math/received 0.00 0.00 -3.4% PASS
total/main/swap_math/spent 0.00 0.00 -3.0% PASS
total/main/weierstrudel/1 0.24 0.24 +0.2% PASS
total/main/weierstrudel/15 2.59 2.61 +0.7% PASS
total/micro/JUMPDEST_n0/empty 0.00 0.00 -18.9% PASS
total/micro/jump_around/empty 0.05 0.07 +30.8% PASS
total/micro/loop_with_many_jumpdests/empty 0.00 0.00 -2.8% PASS
total/micro/memory_grow_mload/by1 0.01 0.01 -9.3% PASS
total/micro/memory_grow_mload/by16 0.01 0.01 -9.0% PASS
total/micro/memory_grow_mload/by32 0.01 0.01 -10.0% PASS
total/micro/memory_grow_mload/nogrow 0.01 0.01 -9.3% PASS
total/micro/memory_grow_mstore/by1 0.01 0.01 -9.4% PASS
total/micro/memory_grow_mstore/by16 0.02 0.01 -9.5% PASS
total/micro/memory_grow_mstore/by32 0.02 0.01 -10.5% PASS
total/micro/memory_grow_mstore/nogrow 0.01 0.01 -9.9% PASS
total/micro/signextend/one 0.07 0.07 +0.7% PASS
total/micro/signextend/zero 0.07 0.07 +0.3% PASS
total/synth/ADD/b0 0.00 0.00 -14.5% PASS
total/synth/ADD/b1 0.00 0.00 -7.4% PASS
total/synth/ADDRESS/a0 0.15 0.15 -0.1% PASS
total/synth/ADDRESS/a1 0.15 0.15 -0.0% PASS
total/synth/AND/b0 0.00 0.00 -14.5% PASS
total/synth/AND/b1 0.00 0.00 -7.4% PASS
total/synth/BYTE/b0 0.00 0.00 -14.3% PASS
total/synth/BYTE/b1 0.00 0.00 -7.4% PASS
total/synth/CALLDATASIZE/a0 0.07 0.07 +0.4% PASS
total/synth/CALLDATASIZE/a1 0.07 0.07 -0.5% PASS
total/synth/CALLER/a0 0.18 0.18 +0.1% PASS
total/synth/CALLER/a1 0.18 0.18 +0.1% PASS
total/synth/CALLVALUE/a0 0.26 0.24 -7.8% PASS
total/synth/CALLVALUE/a1 0.26 0.24 -7.9% PASS
total/synth/CODESIZE/a0 0.07 0.07 +0.3% PASS
total/synth/CODESIZE/a1 0.07 0.07 -0.4% PASS
total/synth/DUP1/d0 0.00 0.00 -14.5% PASS
total/synth/DUP1/d1 0.00 0.00 -7.5% PASS
total/synth/DUP10/d0 0.00 0.00 -14.5% PASS
total/synth/DUP10/d1 0.00 0.00 -7.4% PASS
total/synth/DUP11/d0 0.00 0.00 -14.4% PASS
total/synth/DUP11/d1 0.00 0.00 -7.6% PASS
total/synth/DUP12/d0 0.00 0.00 -14.4% PASS
total/synth/DUP12/d1 0.00 0.00 -7.4% PASS
total/synth/DUP13/d0 0.00 0.00 -14.7% PASS
total/synth/DUP13/d1 0.00 0.00 -7.3% PASS
total/synth/DUP14/d0 0.00 0.00 -14.5% PASS
total/synth/DUP14/d1 0.00 0.00 -7.3% PASS
total/synth/DUP15/d0 0.00 0.00 -14.5% PASS
total/synth/DUP15/d1 0.00 0.00 -7.5% PASS
total/synth/DUP16/d0 0.00 0.00 -14.5% PASS
total/synth/DUP16/d1 0.00 0.00 -7.7% PASS
total/synth/DUP2/d0 0.00 0.00 -14.5% PASS
total/synth/DUP2/d1 0.00 0.00 -7.3% PASS
total/synth/DUP3/d0 0.00 0.00 -14.5% PASS
total/synth/DUP3/d1 0.00 0.00 -7.5% PASS
total/synth/DUP4/d0 0.00 0.00 -14.5% PASS
total/synth/DUP4/d1 0.00 0.00 -7.3% PASS
total/synth/DUP5/d0 0.00 0.00 -14.4% PASS
total/synth/DUP5/d1 0.00 0.00 -7.4% PASS
total/synth/DUP6/d0 0.00 0.00 -14.5% PASS
total/synth/DUP6/d1 0.00 0.00 -7.3% PASS
total/synth/DUP7/d0 0.00 0.00 -14.5% PASS
total/synth/DUP7/d1 0.00 0.00 -7.4% PASS
total/synth/DUP8/d0 0.00 0.00 -14.4% PASS
total/synth/DUP8/d1 0.00 0.00 -7.3% PASS
total/synth/DUP9/d0 0.00 0.00 -14.4% PASS
total/synth/DUP9/d1 0.00 0.00 -7.7% PASS
total/synth/EQ/b0 0.00 0.00 -14.6% PASS
total/synth/EQ/b1 0.00 0.00 -7.3% PASS
total/synth/GAS/a0 0.76 0.62 -18.4% PASS
total/synth/GAS/a1 0.76 0.62 -18.4% PASS
total/synth/GT/b0 0.00 0.00 -14.4% PASS
total/synth/GT/b1 0.00 0.00 -7.4% PASS
total/synth/ISZERO/u0 0.00 0.00 -14.5% PASS
total/synth/JUMPDEST/n0 0.00 0.00 -18.5% PASS
total/synth/LT/b0 0.00 0.00 -14.3% PASS
total/synth/LT/b1 0.00 0.00 -7.4% PASS
total/synth/MSIZE/a0 0.00 0.00 -14.5% PASS
total/synth/MSIZE/a1 0.00 0.00 -7.5% PASS
total/synth/MUL/b0 0.00 0.00 -14.5% PASS
total/synth/MUL/b1 0.00 0.00 -7.4% PASS
total/synth/NOT/u0 0.00 0.00 -14.6% PASS
total/synth/OR/b0 0.00 0.00 -14.4% PASS
total/synth/OR/b1 0.00 0.00 -7.3% PASS
total/synth/PC/a0 0.00 0.00 -14.6% PASS
total/synth/PC/a1 0.00 0.00 -7.5% PASS
total/synth/PUSH1/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH1/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH10/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH10/p1 0.00 0.00 -7.3% PASS
total/synth/PUSH11/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH11/p1 0.00 0.00 -7.3% PASS
total/synth/PUSH12/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH12/p1 0.00 0.00 -7.8% PASS
total/synth/PUSH13/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH13/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH14/p0 0.00 0.00 -14.6% PASS
total/synth/PUSH14/p1 0.00 0.00 -7.5% PASS
total/synth/PUSH15/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH15/p1 0.00 0.00 -7.5% PASS
total/synth/PUSH16/p0 0.00 0.00 -14.7% PASS
total/synth/PUSH16/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH17/p0 0.00 0.00 -14.6% PASS
total/synth/PUSH17/p1 0.00 0.00 -7.5% PASS
total/synth/PUSH18/p0 0.00 0.00 -14.6% PASS
total/synth/PUSH18/p1 0.00 0.00 -6.9% PASS
total/synth/PUSH19/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH19/p1 0.00 0.00 -7.3% PASS
total/synth/PUSH2/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH2/p1 0.00 0.00 -7.5% PASS
total/synth/PUSH20/p0 0.00 0.00 -14.6% PASS
total/synth/PUSH20/p1 0.00 0.00 -7.9% PASS
total/synth/PUSH21/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH21/p1 0.00 0.00 -7.5% PASS
total/synth/PUSH22/p0 1.31 1.15 -12.5% PASS
total/synth/PUSH22/p1 1.43 1.30 -8.7% PASS
total/synth/PUSH23/p0 1.31 1.07 -18.6% PASS
total/synth/PUSH23/p1 1.43 1.32 -8.3% PASS
total/synth/PUSH24/p0 1.31 1.07 -18.6% PASS
total/synth/PUSH24/p1 1.43 1.29 -9.8% PASS
total/synth/PUSH25/p0 1.31 1.07 -18.6% PASS
total/synth/PUSH25/p1 1.43 1.29 -10.0% PASS
total/synth/PUSH26/p0 1.07 0.83 -22.3% PASS
total/synth/PUSH26/p1 1.43 1.30 -9.3% PASS
total/synth/PUSH27/p0 1.31 1.07 -18.6% PASS
total/synth/PUSH27/p1 1.42 1.29 -9.3% PASS
total/synth/PUSH28/p0 1.31 1.07 -18.6% PASS
total/synth/PUSH28/p1 1.42 1.30 -8.7% PASS
total/synth/PUSH29/p0 1.31 1.07 -18.7% PASS
total/synth/PUSH29/p1 1.43 1.29 -9.4% PASS
total/synth/PUSH3/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH3/p1 0.00 0.00 -7.3% PASS
total/synth/PUSH30/p0 1.35 1.10 -18.3% PASS
total/synth/PUSH30/p1 1.44 1.30 -9.4% PASS
total/synth/PUSH31/p0 1.31 1.10 -16.4% PASS
total/synth/PUSH31/p1 1.50 1.44 -4.1% PASS
total/synth/PUSH32/p0 1.31 1.13 -14.1% PASS
total/synth/PUSH32/p1 1.43 1.31 -8.1% PASS
total/synth/PUSH4/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH4/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH5/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH5/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH6/p0 0.00 0.00 -14.4% PASS
total/synth/PUSH6/p1 0.00 0.00 -7.0% PASS
total/synth/PUSH7/p0 0.00 0.00 -14.7% PASS
total/synth/PUSH7/p1 0.00 0.00 -7.6% PASS
total/synth/PUSH8/p0 0.00 0.00 -14.6% PASS
total/synth/PUSH8/p1 0.00 0.00 -7.4% PASS
total/synth/PUSH9/p0 0.00 0.00 -14.5% PASS
total/synth/PUSH9/p1 0.00 0.00 -7.4% PASS
total/synth/RETURNDATASIZE/a0 0.03 0.03 -0.5% PASS
total/synth/RETURNDATASIZE/a1 0.03 0.03 -0.2% PASS
total/synth/SAR/b0 6.00 5.79 -3.5% PASS
total/synth/SAR/b1 6.82 6.43 -5.7% PASS
total/synth/SGT/b0 0.00 0.00 -14.3% PASS
total/synth/SGT/b1 0.00 0.00 -7.5% PASS
total/synth/SHL/b0 13.06 12.81 -1.9% PASS
total/synth/SHL/b1 13.08 12.85 -1.8% PASS
total/synth/SHR/b0 11.24 11.11 -1.1% PASS
total/synth/SHR/b1 11.39 11.16 -2.0% PASS
total/synth/SIGNEXTEND/b0 0.00 0.00 -14.6% PASS
total/synth/SIGNEXTEND/b1 0.00 0.00 -7.2% PASS
total/synth/SLT/b0 0.00 0.00 -14.4% PASS
total/synth/SLT/b1 0.00 0.00 -7.4% PASS
total/synth/SUB/b0 0.00 0.00 -14.4% PASS
total/synth/SUB/b1 0.00 0.00 -7.4% PASS
total/synth/SWAP1/s0 0.00 0.00 -14.9% PASS
total/synth/SWAP10/s0 0.00 0.00 -14.4% PASS
total/synth/SWAP11/s0 0.00 0.00 -14.5% PASS
total/synth/SWAP12/s0 0.00 0.00 -14.5% PASS
total/synth/SWAP13/s0 0.00 0.00 -14.5% PASS
total/synth/SWAP14/s0 0.00 0.00 -14.4% PASS
total/synth/SWAP15/s0 0.00 0.00 -14.6% PASS
total/synth/SWAP16/s0 0.00 0.00 -14.2% PASS
total/synth/SWAP2/s0 0.00 0.00 -14.3% PASS
total/synth/SWAP3/s0 0.00 0.00 -14.5% PASS
total/synth/SWAP4/s0 0.00 0.00 -14.3% PASS
total/synth/SWAP5/s0 0.00 0.00 -14.4% PASS
total/synth/SWAP6/s0 0.00 0.00 -14.5% PASS
total/synth/SWAP7/s0 0.00 0.00 -14.4% PASS
total/synth/SWAP8/s0 0.00 0.00 -14.3% PASS
total/synth/SWAP9/s0 0.00 0.00 -14.5% PASS
total/synth/XOR/b0 0.00 0.00 -14.6% PASS
total/synth/XOR/b1 0.00 0.00 -7.4% PASS
total/synth/loop_v1 1.20 1.02 -15.3% PASS
total/synth/loop_v2 1.11 0.95 -15.1% PASS

Summary: 194 benchmarks, 0 regressions


@abmcar abmcar marked this pull request as draft March 30, 2026 09:08
@abmcar abmcar force-pushed the perf/x86-cg-peephole-rules branch 2 times, most recently from 5c5941f to a30d58d Compare April 10, 2026 14:05
abmcar and others added 6 commits April 13, 2026 18:12
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add comments for intentional design decisions (ZeroToErase asymmetry,
  TEST dual-trigger, NoOpImm whitelist)
- Use SmallVector for small peephole collections to avoid heap allocs
- Add function-level pattern descriptions for complex match functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- matchDirectBoolBranch: note optional or-with-zero and direct self-test path
- matchCmovBoolBranch: note order-independent mov defs, cmp/test output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@abmcar abmcar force-pushed the perf/x86-cg-peephole-rules branch from a30d58d to 3f4c8c4 Compare April 13, 2026 10:19

abmcar commented Apr 25, 2026

Closing in favor of PR #439 (feat/cgir-peephole-pr).

Both PRs branched from the same main commit, and both modify
src/compiler/target/x86/x86_cg_peephole.cpp in incompatible ways.
PR #439 replaces the hand-written framework (the one this PR extends) with a
declarative JSON rule table, Z3-verified synthesis, and a CI gate, which is
the strategic direction. The rule-by-rule coverage analysis and the
follow-up plan for the rules unique to this PR (bool→branch multi-instruction
fold, zero-imm no-op erase, ADC reg→imm form, ADD-zero erase) are
documented and tracked separately in the research direction notes.

The 988-line implementation from this branch was already ported to the
WISA 2026 paper submission branch as the D1 variant (paper benchmarks
are unaffected).
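
As a rough illustration of what verifying a peephole rule means, the sketch below checks one rewrite exhaustively on an 8-bit model. It is a brute-force stand-in and does not use Z3. The rule (`cmp r, 0` is interchangeable with `test r, r` for ZF, SF, CF, and OF) is chosen for illustration only and is not confirmed to be in either PR. The `Flags` struct and function names are hypothetical, and AF and PF are left out of the model.

```cpp
#include <cstdint>

// Subset of EFLAGS tracked by this toy model.
struct Flags {
    bool zf, sf, cf, of;
    bool operator==(const Flags &o) const {
        return zf == o.zf && sf == o.sf && cf == o.cf && of == o.of;
    }
};

// Flags after `test r, r` on an 8-bit register: computes r & r, clears CF and OF.
Flags testSelf(std::uint8_t r) {
    std::uint8_t res = static_cast<std::uint8_t>(r & r);
    return {res == 0, (res & 0x80) != 0, false, false};
}

// Flags after `cmp r, imm` on 8-bit operands: computes r - imm, discards the result.
Flags cmpImm(std::uint8_t r, std::uint8_t imm) {
    std::uint8_t res = static_cast<std::uint8_t>(r - imm);
    bool cf = r < imm;  // unsigned borrow
    // Signed overflow: operand signs differ and the result sign differs from r.
    bool of = (((r ^ imm) & (r ^ res)) & 0x80) != 0;
    return {res == 0, (res & 0x80) != 0, cf, of};
}

// Exhaustively compares both forms over every 8-bit register value.
bool ruleHoldsForAll8BitInputs() {
    for (int v = 0; v < 256; ++v) {
        std::uint8_t r = static_cast<std::uint8_t>(v);
        if (!(testSelf(r) == cmpImm(r, 0))) return false;
    }
    return true;
}
```

An SMT solver would establish the same equivalence symbolically for full-width registers instead of enumerating values, which is what makes such a check practical as a CI gate.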

@abmcar abmcar closed this Apr 25, 2026
@abmcar abmcar deleted the perf/x86-cg-peephole-rules branch May 7, 2026 10:43