Thread scope dtype through stats rewrites #7958

Performance Regression: -14.75%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments, which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 10 improved benchmarks
❌ 69 regressed benchmarks
✅ 1137 untouched benchmarks
⏩ 5 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	WallTime	`cuda/bitpacked_u8/unpack/3bw[100M]`	299.5 µs	353.9 µs	-15.36%
❌	Simulation	`chunked_varbinview_canonical_into[(100, 100)]`	308.1 µs	400.7 µs	-23.11%
❌	Simulation	`chunked_varbinview_canonical_into[(1000, 10)]`	197.9 µs	284.7 µs	-30.49%
❌	Simulation	`chunked_varbinview_into_canonical[(10, 1000)]`	1.9 ms	2.2 ms	-12.45%
❌	Simulation	`chunked_varbinview_into_canonical[(100, 100)]`	358.6 µs	461.6 µs	-22.31%
❌	Simulation	`chunked_varbinview_into_canonical[(1000, 10)]`	212.1 µs	300.6 µs	-29.44%
❌	Simulation	`chunked_varbinview_opt_canonical_into[(100, 100)]`	411.4 µs	501.5 µs	-17.96%
❌	Simulation	`chunked_varbinview_opt_canonical_into[(1000, 10)]`	188.2 µs	307.3 µs	-38.78%
❌	Simulation	`chunked_varbinview_opt_into_canonical[(100, 100)]`	467.2 µs	565.2 µs	-17.34%
❌	Simulation	`chunked_varbinview_opt_into_canonical[(1000, 10)]`	240.4 µs	324.7 µs	-25.95%
❌	Simulation	`encode_primitives[u8, (10000, 2)]`	313.9 µs	358.6 µs	-12.45%
❌	Simulation	`encode_primitives[u8, (10000, 32)]`	318.4 µs	360.8 µs	-11.75%
❌	Simulation	`encode_primitives[u8, (10000, 4)]`	314.3 µs	358 µs	-12.2%
❌	Simulation	`encode_primitives[u8, (10000, 512)]`	335.2 µs	377.1 µs	-11.11%
❌	Simulation	`encode_primitives[u8, (10000, 8)]`	315.2 µs	358.1 µs	-11.98%
❌	Simulation	`varbinview_large`	130.1 µs	174.5 µs	-25.46%
❌	Simulation	`execute_scalar_struct_simple`	407.6 µs	464.3 µs	-12.2%
❌	Simulation	`binary_search_vortex`	485 ns	727.2 ns	-33.31%
❌	Simulation	`take_search[(0.005, 0.05)]`	131 µs	168.5 µs	-22.26%
❌	Simulation	`take_search[(0.005, 0.1)]`	246.6 µs	320.5 µs	-23.07%
...	...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
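For reading the table: the Efficiency column appears consistent with `BASE / HEAD - 1`, so a negative value means `HEAD` is slower than `BASE`. The sketch below recomputes a few rows under that assumption. It is inferred from the displayed numbers, not taken from CodSpeed's documentation, and `relative_change` is a hypothetical helper name. Because the displayed timings are rounded, recomputed values can differ slightly in the last digit, and more so for rows shown in ms.

```python
def relative_change(base: float, head: float) -> float:
    """Percentage change implied by BASE and HEAD timings given in the same unit.

    Assumption: change = BASE / HEAD - 1 (inferred from the table, not documented).
    """
    if head <= 0:
        raise ValueError("head timing must be positive")
    return (base / head - 1.0) * 100.0


# (benchmark, BASE in µs, HEAD in µs, reported %) copied from the table above
rows = [
    ("chunked_varbinview_canonical_into[(100, 100)]", 308.1, 400.7, -23.11),
    ("chunked_varbinview_canonical_into[(1000, 10)]", 197.9, 284.7, -30.49),
    ("binary_search_vortex", 0.485, 0.7272, -33.31),
]

for name, base, head, reported in rows:
    computed = relative_change(base, head)
    print(f"{name}: computed {computed:.2f}%, reported {reported}%")
```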

Tip

Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or directly use the CodSpeed MCP with your agent.

Comparing `ngates/stats-7707/typed-stats-rewrite-api` (dd555d9) with `ngates/stats-7707/min-max-aggregate-fns` (30b42c6)

¹ 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.


Rename stats rewrite session
