Add Sparse pushdown kernels for is_constant, sum, and compare by joseph-isaacs · Pull Request #8028 · vortex-data/vortex

joseph-isaacs · 2026-05-20T07:37:26Z

Sparse arrays previously had no aggregate or compare pushdown, so `is_constant`, `sum`, and `<column> op <scalar>` on a Sparse column all fell through to full canonical materialization: O(N) work regardless of patch density.

Each new kernel pushes the operation into the patches:

- `SparseIsConstantKernel` checks `is_constant(patch_values)` and whether the common patch value equals the fill scalar.
- `SparseSumKernel` folds `fill * (N - P) + sum(patch_values)` through the existing `Sum` accumulator so overflow saturation is preserved.
- `CompareKernel for Sparse` maps a constant-RHS comparison through `patches.map_values` and rebuilds a `Sparse<Bool>` with `scalar_cmp` applied to the fill, preserving downstream sparsity (the filter parent kernel already handles `Sparse<Bool>` masks).

All three are O(P) instead of O(N). Benchmarks on a 1M-element Sparse i32 with non-null fill show:

- `is_constant`: 78-93x speedup (137us -> 1.7us at P=10..1000)
- `sum`: 109-581x speedup (768us -> 1.3us at P=10)
- `compare`: 19-84x speedup (777us -> 9us at P=10 with downstream canonicalization; bigger when consumers stay sparse)

Aggregate kernels are wired through the session-scoped registry via a new `vortex_sparse::initialize` (called from `vortex-file`'s default encodings). Compare is wired through `PARENT_KERNELS` so it fires during `execute_parent` on `ScalarFn(Binary, cmp)` nodes whose child is Sparse.

Signed-off-by: Claude <noreply@anthropic.com>
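For readers unfamiliar with the encoding, the three O(P) computations can be sketched on a toy model. This is not the `vortex_sparse` API: `ToySparse`, `ToySparseBool`, and their methods are hypothetical names invented for illustration. The sketch assumes unique, in-range patch indices and a non-null fill, and it uses checked `i64` arithmetic as a stand-in for the real `Sum` accumulator's overflow handling.

```rust
/// Toy stand-in for a Sparse i32 array: one fill value plus (index, value) patches.
/// Hypothetical type for illustration only; not the vortex_sparse API.
pub struct ToySparse {
    pub len: usize,                 // N
    pub fill: i32,                  // value at every unpatched position
    pub patches: Vec<(usize, i32)>, // P entries, unique indices < len
}

/// Comparison result: same patch positions, boolean fill and boolean patch values.
#[derive(Debug, PartialEq)]
pub struct ToySparseBool {
    pub len: usize,
    pub fill: bool,
    pub patches: Vec<(usize, bool)>,
}

impl ToySparse {
    /// O(P): constant iff all patch values agree and, when at least one
    /// unpatched position exists (P < N), the shared value equals the fill.
    pub fn is_constant(&self) -> bool {
        let Some(&(_, first)) = self.patches.first() else {
            return true; // no patches: every element is the fill
        };
        let patches_constant = self.patches.iter().all(|&(_, v)| v == first);
        let fill_reachable = self.patches.len() < self.len;
        patches_constant && (!fill_reachable || first == self.fill)
    }

    /// O(P): fill * (N - P) + sum(patch_values); None if the i64 total overflows.
    pub fn sum(&self) -> Option<i64> {
        let unpatched = i64::try_from(self.len - self.patches.len()).ok()?;
        let mut acc = i64::from(self.fill).checked_mul(unpatched)?;
        for &(_, v) in &self.patches {
            acc = acc.checked_add(i64::from(v))?;
        }
        Some(acc)
    }

    /// O(P): apply `cmp(value, rhs)` to the fill and to each patch value,
    /// so the boolean result keeps the same sparse shape.
    pub fn compare(&self, rhs: i32, cmp: impl Fn(i32, i32) -> bool) -> ToySparseBool {
        ToySparseBool {
            len: self.len,
            fill: cmp(self.fill, rhs),
            patches: self.patches.iter().map(|&(i, v)| (i, cmp(v, rhs))).collect(),
        }
    }
}
```

None of the three methods iterates over `len`, which is why the cost tracks the number of patches rather than the array length.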

CodSpeed tracked 24 entries (canonical+kernel × 4 args × 3 ops). Collapse to exactly three benchmarks, one per kernel, single config each, sized so each lands in the ~10-100µs range:

- `sparse_is_constant`: ~87µs (150k constant patches, worst case: full scan)
- `sparse_sum`: ~33µs (100k patches)
- `sparse_compare`: ~41µs (10k patches, materialized result)

The canonical baselines are dropped; CodSpeed only needs to track the kernel path going forward.

Signed-off-by: Claude <noreply@anthropic.com>

codspeed-hq · 2026-05-20T07:44:27Z

Merging this PR will improve performance by 19.73%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 1 improved benchmark
✅ 1236 untouched benchmarks
🆕 5 new benchmarks

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
| --- | --- | --- | --- | --- | --- |
| ⚡ | Simulation | `chunked_varbinview_opt_canonical_into[(1000, 10)]` | 224.6 µs | 187.6 µs | +19.73% |
| 🆕 | Simulation | `sparse_null_count` | N/A | 34.5 µs | N/A |
| 🆕 | Simulation | `sparse_compare` | N/A | 393.2 µs | N/A |
| 🆕 | Simulation | `sparse_min_max` | N/A | 233.5 µs | N/A |
| 🆕 | Simulation | `sparse_is_constant` | N/A | 250.7 µs | N/A |
| 🆕 | Simulation | `sparse_sum` | N/A | 455.6 µs | N/A |

Comparing `claude/sparse-pushdown-kernels-qxU4x` (56d5689) with `develop` (f97805d)

Extends the Sparse pushdown set with the remaining high-value kernels, all O(num_patches) instead of O(N):

Aggregates (session-registered in `initialize`):

- `SparseMinMaxKernel`: folds min/max(patch_values) with the fill scalar when the fill is reachable (P < N) and non-null.
- `SparseNullCountKernel`: null_count(patch_values) + (fill null ? N-P : 0); O(1) when the patch null-count stat is cached.
- `SparseNanCountKernel`: nan_count(patch_values) + (fill NaN ? N-P : 0); declines for non-float dtypes.

Filter pushdowns (wired via `PARENT_KERNELS`):

- `BetweenKernel`: range predicate with constant bounds → `Sparse<Bool>`, same shape as the compare kernel.
- `FillNullKernel`: replaces null fill/patches with the constant, stays sparse.

MinMax and NullCount in particular are the zone-map/pruning kernels that Dict and RunEnd already had and Sparse lacked.

Deliberately skipped: Mask (a dense mask masks unpatched fill positions, so the result can't stay sparse), IsSorted (rarely true for sparse, position-dependent and error-prone), Like/Zip/ListContainsElement (niche string/list cases), and Mean (already free via `Combined<Sum, Count>`).

Benches: added `sparse_min_max` and `sparse_null_count` alongside the existing three (skipping between/fill_null/nan_count, which mirror compare/null_count cost profiles). All five single-config, ~50-80µs.

Tests compare each kernel against the canonical baseline (aggregates via an unregistered session; parent kernels by canonicalizing the input first).

Signed-off-by: Claude <noreply@anthropic.com>
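The aggregate formulas above can be sketched on a nullable toy model. As before, this is not the `vortex_sparse` API: `ToySparseF64` and its methods are hypothetical names, patch indices are assumed unique and in range, and skipping NaN in `min_max` is a choice made for this sketch rather than something the commit message specifies.

```rust
/// Toy nullable sparse f64 array: `None` stands for null.
/// Hypothetical type for illustration only; not the vortex_sparse API.
pub struct ToySparseF64 {
    pub len: usize,                         // N
    pub fill: Option<f64>,                  // value at every unpatched position
    pub patches: Vec<(usize, Option<f64>)>, // P entries, unique indices < len
}

impl ToySparseF64 {
    /// Number of positions that still hold the fill value (N - P).
    fn unpatched(&self) -> usize {
        self.len - self.patches.len()
    }

    /// null_count(patch_values) + (fill null ? N - P : 0)
    pub fn null_count(&self) -> usize {
        let patch_nulls = self.patches.iter().filter(|(_, v)| v.is_none()).count();
        patch_nulls + if self.fill.is_none() { self.unpatched() } else { 0 }
    }

    /// nan_count(patch_values) + (fill NaN ? N - P : 0)
    pub fn nan_count(&self) -> usize {
        let patch_nans = self
            .patches
            .iter()
            .filter(|(_, v)| matches!(v, Some(x) if x.is_nan()))
            .count();
        let fill_is_nan = matches!(self.fill, Some(x) if x.is_nan());
        patch_nans + if fill_is_nan { self.unpatched() } else { 0 }
    }

    /// Min/max over non-null patch values, folding in the fill only when it
    /// is reachable (P < N) and non-null. NaN is skipped (sketch assumption).
    pub fn min_max(&self) -> Option<(f64, f64)> {
        let fill = if self.unpatched() > 0 { self.fill } else { None };
        self.patches
            .iter()
            .filter_map(|&(_, v)| v)
            .chain(fill)
            .filter(|x| !x.is_nan())
            .fold(None, |acc, x| match acc {
                None => Some((x, x)),
                Some((lo, hi)) => Some((lo.min(x), hi.max(x))),
            })
    }
}
```

The reachability check matters when every position is patched: the fill then never appears in the logical array and must not influence the result.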

claude added 2 commits May 19, 2026 23:35

joseph-isaacs added the `changelog/performance` label (a performance improvement) on May 20, 2026

joseph-isaacs wants to merge 3 commits into `develop` from `claude/sparse-pushdown-kernels-qxU4x`
