Revert "fix: build CUDA kernels as multi-arch fatbin with PTX fallbac…

68dc253
Merged

Revert "fix: build CUDA kernels as multi-arch fatbin with PTX fallback" #8055

CodSpeed HQ / CodSpeed Performance Analysis failed May 21, 2026 in 0s

Performance Regression: -1.68%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 1 regressed benchmark
✅ 1234 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | `baseline_eq[16, 65536]` | 287.6 µs | 259.6 µs | +10.76% |
| Simulation | `baseline_lt[16, 65536]` | 302.7 µs | 274.7 µs | +10.17% |
| Simulation | `fast_lt_out_of_range[4, 65536]` | 204.3 µs | 262.3 µs | -22.11% |
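
The Efficiency column can be approximately reproduced from the BASE and HEAD timings. The sketch below assumes the change is computed as `BASE / HEAD - 1`. This formula is inferred from the reported numbers and is not confirmed by the report itself. The helper name `efficiency_change` is hypothetical. Because the timings shown are rounded to one decimal place, the recomputed values for the two improved benchmarks can differ from the reported ones in the second decimal.

```python
def efficiency_change(base_us: float, head_us: float) -> float:
    """Relative change in percent; positive means HEAD is faster than BASE.

    Assumed formula (inferred from the table, not from CodSpeed docs):
    BASE / HEAD - 1.
    """
    return (base_us / head_us - 1.0) * 100.0


# (BASE, HEAD) timings in microseconds, copied from the table above.
rows = {
    "baseline_eq[16, 65536]": (287.6, 259.6),
    "baseline_lt[16, 65536]": (302.7, 274.7),
    "fast_lt_out_of_range[4, 65536]": (204.3, 262.3),
}

for name, (base, head) in rows.items():
    print(f"{name}: {efficiency_change(base, head):+.2f}%")
```

Under this assumption, a negative value means HEAD takes longer than BASE, which is how `fast_lt_out_of_range[4, 65536]` ends up flagged as the single regression.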

Tip

Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or use the CodSpeed MCP directly with your agent.


Comparing revert-8047-fix-cuda-ptx-gpu-invalidation (68dc253) with develop (c54ce7e)