Fix cosine similarity optimization bug #7724
Performance Regression: -42.11%
⚠️ Unknown Walltime execution environment detected
Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.
For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.
⚠️ Different runtime environments detected
Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.
⚡ 1 improved benchmark
❌ 4 regressed benchmarks
✅ 1184 untouched benchmarks
⏩ 9 skipped benchmarks¹
⚠️ Please fix the performance issues or acknowledge them on CodSpeed.
Performance Changes
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | WallTime | dynamic_dispatch_u32[10M] | 162.8 µs | 105.5 µs | +54.37% |
| ❌ | WallTime | for[10M_u8] | 73.8 µs | 127.4 µs | -42.11% |
| ❌ | WallTime | for[10M_u16] | 95.5 µs | 157.3 µs | -39.33% |
| ❌ | WallTime | mix[0%_in/100%_out] | 227.3 µs | 278.5 µs | -18.39% |
| ❌ | Simulation | new_bp_prim_test_between[i64, 32768] | 176.4 µs | 235.3 µs | -25.04% |
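
The report does not say how the Efficiency column is derived. The listed values appear consistent with `BASE / HEAD - 1`, presumably computed on unrounded timings. The sketch below is an inference from the table, not CodSpeed's documented formula; the helper name `efficiency_pct` is hypothetical. It recomputes each row from the rounded timings shown above, so small differences from the reported figures are expected.

```python
def efficiency_pct(base_us: float, head_us: float) -> float:
    """Relative change in percent, ASSUMING Efficiency = BASE / HEAD - 1."""
    return (base_us / head_us - 1.0) * 100.0


# (benchmark, BASE in µs, HEAD in µs, reported Efficiency in %), copied from the table.
rows = [
    ("dynamic_dispatch_u32[10M]", 162.8, 105.5, 54.37),
    ("for[10M_u8]", 73.8, 127.4, -42.11),
    ("for[10M_u16]", 95.5, 157.3, -39.33),
    ("mix[0%_in/100%_out]", 227.3, 278.5, -18.39),
    ("new_bp_prim_test_between[i64, 32768]", 176.4, 235.3, -25.04),
]

for name, base, head, reported in rows:
    computed = efficiency_pct(base, head)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

Under this reading, a negative value means HEAD is slower than BASE, and the headline regression of -42.11% corresponds to the `for[10M_u8]` row.
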
Comparing ct/fix-cosine-denorm-opt (3a54a19) with develop (0bb712b)
Footnotes
1. 9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.