
Skip validity dispatch for non-nullable arrays #7748

Merged: palaska merged 1 commit into develop from bp/shortcut-validity on May 1, 2026

@palaska palaska commented May 1, 2026

Two small optimizations:

  • Skip the is_invalid vtable dispatch when the dtype is non-nullable
  • Demote the dtype-equality post-check to a debug assertion: it is an encoding-correctness invariant, not runtime input validation
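
The two changes listed above can be illustrated with a minimal Rust sketch. It is not the code from this PR: `DType`, `EncodingVTable`, `CountingEncoding`, `Array`, and `produced_dtype` are hypothetical names chosen for illustration, and the only identifier taken from the description is `is_invalid`. The sketch assumes validity is normally answered through a trait object, and that the post-check compares the dtype an encoding reports against the array's own dtype.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical logical type: only nullability matters for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DType {
    pub nullable: bool,
}

/// Hypothetical encoding vtable; calls go through dynamic dispatch.
pub trait EncodingVTable {
    fn is_invalid(&self, index: usize) -> bool;
    /// The dtype the encoding reports for the values it produces.
    fn produced_dtype(&self) -> DType;
}

/// Toy encoding that counts how often `is_invalid` is dispatched to it.
pub struct CountingEncoding {
    dtype: DType,
    invalid: Vec<bool>,
    dispatches: Arc<AtomicUsize>,
}

impl CountingEncoding {
    pub fn new(dtype: DType, invalid: Vec<bool>, dispatches: Arc<AtomicUsize>) -> Self {
        Self { dtype, invalid, dispatches }
    }
}

impl EncodingVTable for CountingEncoding {
    fn is_invalid(&self, index: usize) -> bool {
        self.dispatches.fetch_add(1, Ordering::Relaxed);
        self.invalid[index]
    }

    fn produced_dtype(&self) -> DType {
        self.dtype
    }
}

pub struct Array {
    dtype: DType,
    encoding: Box<dyn EncodingVTable>,
}

impl Array {
    pub fn new(dtype: DType, encoding: Box<dyn EncodingVTable>) -> Self {
        Self { dtype, encoding }
    }

    pub fn is_invalid(&self, index: usize) -> bool {
        // Fast path: a non-nullable dtype cannot hold nulls, so skip the vtable call.
        if !self.dtype.nullable {
            return false;
        }
        self.encoding.is_invalid(index)
    }

    pub fn produced_dtype(&self) -> DType {
        let produced = self.encoding.produced_dtype();
        // Invariant check: compiled out when debug assertions are disabled.
        debug_assert_eq!(produced, self.dtype, "encoding produced an unexpected dtype");
        produced
    }
}
```

In this sketch a non-nullable array never reaches `CountingEncoding::is_invalid`, so its dispatch counter stays at zero. The dtype comparison still runs in debug builds, where an encoding bug would surface during testing, but release builds do not pay for it.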

Signed-off-by: Baris Palaska <barispalaska@gmail.com>
@palaska palaska added the changelog/performance A performance improvement label May 1, 2026
codspeed-hq Bot commented May 1, 2026

Merging this PR will degrade performance by 17.47%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.


⚡ 10 improved benchmarks
❌ 5 regressed benchmarks
✅ 1183 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| WallTime | bitunpack[3bw] | 300.5 µs | 353.1 µs | -14.9% |
| Simulation | case_when_fragmented[1000] | 501.8 µs | 450.8 µs | +11.32% |
| Simulation | extend_from_array_non_zctl_overlapping[(10000, 8)] | 6 ms | 5.5 ms | +10.18% |
| Simulation | execute_scalar_struct_simple | 628.9 µs | 515.3 µs | +22.06% |
| Simulation | execute_scalar_struct_wide | 2.6 ms | 2.1 ms | +25.18% |
| Simulation | decompress_rd[f32, (100000, 0.01)] | 495.1 µs | 582.6 µs | -15.02% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 138.5 µs | 122.3 µs | +13.27% |
| Simulation | decompress_rd[f32, (100000, 0.1)] | 495.1 µs | 582.6 µs | -15.03% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 583.5 µs | 495.7 µs | +17.72% |
| Simulation | decompress_rd[f32, (10000, 0.0)] | 94.5 µs | 85.7 µs | +10.28% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 90.2 µs | 81.9 µs | +10.11% |
| Simulation | decompress_rd[f64, (10000, 0.01)] | 138.6 µs | 122.1 µs | +13.53% |
| Simulation | decompress_rd[f64, (100000, 0.01)] | 842.6 µs | 1,020.7 µs | -17.45% |
| Simulation | decompress_rd[f64, (100000, 0.1)] | 842.5 µs | 1,020.9 µs | -17.47% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 138.7 µs | 122.2 µs | +13.45% |

Comparing bp/shortcut-validity (bbba3ca) with develop (c4feed7)


@palaska palaska merged commit 4c1ae92 into develop May 1, 2026
70 of 72 checks passed
@palaska palaska deleted the bp/shortcut-validity branch May 1, 2026 14:06

Labels

changelog/performance A performance improvement


2 participants