Merge branch 'develop' into ad/cap-values-per-tile

ac24aaf
Merged

perf[gpu]: reduce register pressure in dyn dispatch #7489

CodSpeed HQ / CodSpeed Performance Analysis succeeded Apr 16, 2026 in 0s

Performance Gate Passed

⚡ 9 improved benchmarks
✅ 1154 untouched benchmarks
⏩ 1457 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | take_map[(0.1, 0.5)] | 1,154.5 µs | 980.3 µs | +17.77% |
| Simulation | take_map[(0.1, 1.0)] | 2 ms | 1.6 ms | +20.02% |
| Simulation | patched_take_10k_contiguous_patches | 258.1 µs | 227.7 µs | +13.32% |
| Simulation | patched_take_10k_dispersed | 316 µs | 285.8 µs | +10.58% |
| Simulation | patched_take_10k_contiguous_not_patches | 258.4 µs | 228.1 µs | +13.28% |
| Simulation | patched_take_10k_first_chunk_only | 302 µs | 271.8 µs | +11.14% |
| Simulation | take_10k_first_chunk_only | 270.6 µs | 225.7 µs | +19.89% |
| Simulation | patched_take_10k_random | 270.3 µs | 240 µs | +12.64% |
| Simulation | take_10k_dispersed | 284.4 µs | 239.5 µs | +18.76% |
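
The Efficiency figures above appear consistent with `BASE / HEAD - 1`, expressed as a percentage. That formula is inferred from the displayed numbers, not taken from CodSpeed's documentation, and the helper name `efficiency_pct` is hypothetical. Because the report shows rounded timings, recomputed values can differ from the listed ones in the last digit. A minimal sketch for checking a row:

```python
def efficiency_pct(base: float, head: float) -> float:
    """Relative speedup of HEAD over BASE, in percent.

    Assumed formula (inferred from the table, not from CodSpeed docs):
    (BASE / HEAD - 1) * 100. Both timings must use the same unit.
    """
    if head <= 0:
        raise ValueError("HEAD time must be positive")
    return (base / head - 1.0) * 100.0


# First table row: take_map[(0.1, 0.5)], both values in microseconds.
print(f"{efficiency_pct(1154.5, 980.3):+.2f}%")  # prints "+17.77%"
```

A positive result means HEAD ran faster than BASE; a negative result would indicate a regression.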

Comparing ad/cap-values-per-tile (ac24aaf) with develop (1169d84)

Footnotes

  1. 1457 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.