vortex-row: compute_sizes helper and RowSize ScalarFn #7991
joseph-isaacs wants to merge 1 commit
Add the size-pass machinery used by both RowSize and the upcoming
RowEncode pipeline. `compute_sizes` walks the N input columns once,
classifying each via `row_width_for_dtype` and accumulating
fixed-width-prefix sums in `fixed_per_row` while pushing per-row sums
of variable-length columns into a lazily allocated `var_lengths` vec.
The classification result (`ColKind` + `SizePassResult`) is private to
the crate; RowEncode consumes it in a later commit to choose between
the arithmetic and cursor encode paths.
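The single-pass accumulation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ColumnWidths`, `SizePass`, and `compute_sizes_sketch` are hypothetical stand-ins, not the crate's actual `ColKind`, `SizePassResult`, or `compute_sizes`, and the real code classifies columns via `row_width_for_dtype` rather than receiving pre-classified input.

```rust
/// Hypothetical, pre-classified view of one input column (not the crate's `ColKind`).
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnWidths {
    /// Every row contributes the same number of bytes.
    Fixed { width: u32 },
    /// Each row contributes its own number of bytes.
    VarLen { lengths: Vec<u32> },
}

/// Hypothetical result of the size pass (not the crate's `SizePassResult`).
#[derive(Debug, Default, PartialEq)]
pub struct SizePass {
    /// Sum of the fixed widths; identical for every row.
    pub fixed_per_row: u32,
    /// Per-row sum of variable-length widths; stays `None` until the
    /// first variable-length column is seen.
    pub var_lengths: Option<Vec<u32>>,
}

/// Walk the columns once, accumulating the fixed sum and the per-row
/// variable-length sums.
pub fn compute_sizes_sketch(columns: &[ColumnWidths], nrows: usize) -> SizePass {
    let mut result = SizePass::default();
    for col in columns {
        match col {
            ColumnWidths::Fixed { width } => result.fixed_per_row += *width,
            ColumnWidths::VarLen { lengths } => {
                assert_eq!(lengths.len(), nrows, "column length mismatch");
                // Lazy allocation: the buffer is created only on the first
                // variable-length column.
                let acc = result
                    .var_lengths
                    .get_or_insert_with(|| vec![0u32; nrows]);
                for (total, len) in acc.iter_mut().zip(lengths) {
                    *total += *len;
                }
            }
        }
    }
    result
}
```

With this shape, an input consisting only of fixed-width columns never allocates the per-row buffer, which is the property the lazily allocated `var_lengths` vec is described as providing.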
`RowSize` returns a `Struct { fixed: U32, var: U32 }` so callers can
read the per-row width without realizing the constant `fixed` slot as
a per-row buffer (it's a `ConstantArray`); the `var` slot is a
`ConstantArray(0)` when no varlen column is present.
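To illustrate why a constant slot avoids a per-row buffer, here is a minimal sketch of a caller reading a row's total width from the two slots. The `U32Slot` and `RowSizes` types are assumptions made for this example; they only mimic the idea of a `ConstantArray` next to a materialized array and are not Vortex APIs.

```rust
/// Hypothetical u32 column: either one value repeated `len` times
/// (standing in for a constant array) or fully materialized values.
#[derive(Debug, Clone, PartialEq)]
pub enum U32Slot {
    Constant { value: u32, len: usize },
    Values(Vec<u32>),
}

impl U32Slot {
    /// Number of rows the slot covers.
    pub fn len(&self) -> usize {
        match self {
            U32Slot::Constant { len, .. } => *len,
            U32Slot::Values(values) => values.len(),
        }
    }

    /// Value at `row`, or `None` when `row` is out of bounds.
    pub fn get(&self, row: usize) -> Option<u32> {
        match self {
            U32Slot::Constant { value, len } => (row < *len).then_some(*value),
            U32Slot::Values(values) => values.get(row).copied(),
        }
    }
}

/// Hypothetical stand-in for the `{ fixed, var }` struct result.
pub struct RowSizes {
    pub fixed: U32Slot,
    pub var: U32Slot,
}

impl RowSizes {
    /// Build the pair of slots; `var` becomes a constant zero when no
    /// variable-length column contributed.
    pub fn from_parts(fixed_per_row: u32, var_lengths: Option<Vec<u32>>, nrows: usize) -> Self {
        let var = match var_lengths {
            Some(values) => U32Slot::Values(values),
            None => U32Slot::Constant { value: 0, len: nrows },
        };
        RowSizes {
            fixed: U32Slot::Constant { value: fixed_per_row, len: nrows },
            var,
        }
    }

    /// Total encoded width of `row`: fixed part plus variable part.
    /// Returns `None` when `row` is out of bounds or the sum overflows.
    pub fn row_width(&self, row: usize) -> Option<u32> {
        self.fixed.get(row)?.checked_add(self.var.get(row)?)
    }
}
```

In this sketch the `fixed` slot stores a single `u32` regardless of row count, so a pure fixed-width input costs no per-row memory for either slot.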
`dispatch_size` is the fallback-only path for PR 1 (canonicalize, then
codec::field_size). The `RowSizeKernel` trait exists but is unused; per-
encoding fast paths and the inventory registry arrive in PR 3.
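The fallback shape (canonicalize first, then size each field) might look roughly like the sketch below. Everything here is assumed for illustration: the `Encoded` enum, the `_sketch` function names, and in particular the size rule of one marker byte plus the payload length, which is an invented placeholder and not the rule used by `codec::field_size`.

```rust
/// Hypothetical encoded column of optional byte strings: either plain
/// values or one value repeated `len` times.
pub enum Encoded {
    Plain(Vec<Option<Vec<u8>>>),
    Constant { value: Option<Vec<u8>>, len: usize },
}

impl Encoded {
    /// Decode into the plain (canonical) representation.
    pub fn canonicalize(&self) -> Vec<Option<Vec<u8>>> {
        match self {
            Encoded::Plain(values) => values.clone(),
            Encoded::Constant { value, len } => vec![value.clone(); *len],
        }
    }
}

/// ASSUMED size rule for this sketch only: one marker byte, plus the
/// payload length for non-null values.
pub fn field_size_sketch(value: Option<&[u8]>) -> u32 {
    1 + value.map_or(0, |bytes| bytes.len() as u32)
}

/// Fallback-only path: no per-encoding fast path is consulted; every
/// column is canonicalized and then sized field by field.
pub fn dispatch_size_sketch(col: &Encoded) -> Vec<u32> {
    col.canonicalize()
        .iter()
        .map(|value| field_size_sketch(value.as_deref()))
        .collect()
}
```

A per-encoding kernel could, for instance, size the `Constant` case once instead of expanding it to `len` values, which is the kind of fast path the description defers to PR 3.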
`initialize()` does NOT register RowSize yet - that lands once
RowEncode is in place, so the session-registered pair appears together.
Signed-off-by: Claude <noreply@anthropic.com>
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 197.6 µs | 161.9 µs | +22.08% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 358.2 µs | 325 µs | +10.19% |
| ❌ | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 187.6 µs | 224.9 µs | -16.56% |
| ❌ | Simulation | new_alp_prim_test_between[f32, 16384] | 103.8 µs | 118.3 µs | -12.3% |
| ❌ | Simulation | new_alp_prim_test_between[f32, 32768] | 153.1 µs | 182.1 µs | -15.89% |
Comparing claude/row-c06-rowsize-scalarfn (5374f3b) with claude/row-c05-codec-nested (570d358)
Part 6 of 25 in the stacked PR series adding `vortex-row`. This PR contains exactly one commit; review just that diff in isolation.
What this commit does
Adds the size-pass machinery used by both RowSize and the upcoming RowEncode pipeline.
`compute_sizes` walks the N input columns once, classifying each via `row_width_for_dtype` and accumulating fixed-width-prefix sums in `fixed_per_row` while pushing per-row sums of variable-length columns into a lazily allocated `var_lengths` vec. The classification result (`ColKind` + `SizePassResult`) is private to the crate; RowEncode consumes it in a later commit to choose between the arithmetic and cursor encode paths.
`RowSize` returns a `Struct { fixed: U32, var: U32 }` so callers can read the per-row width without realizing the constant `fixed` slot as a per-row buffer.
`dispatch_size` is the fallback-only path here (canonicalize, then codec::field_size). The `RowSizeKernel` trait exists but is unused; per-encoding fast paths and the inventory registry arrive in PR 3.
`initialize()` does NOT register RowSize yet; that lands once RowEncode is in place.
Stack
- claude/row-c01-crate-scaffolding
- claude/row-c02-sortfield-options
- claude/row-c03-codec-fixed-width
- claude/row-c04-codec-varlen
- claude/row-c05-codec-nested
- claude/row-c06-rowsize-scalarfn
- claude/row-c07-rowencode-scalarfn
- claude/row-c08-convert-columns-tests-bench
- claude/row-c09-skip-listview-validation
- claude/row-c10-validity-fast-path
- claude/row-c11-skip-zero-init
- claude/row-c12-vectorize-pure-fixed-offsets
- claude/row-c13-vectorize-mixed-offsets
- claude/row-c14-varlen-block-copy-nonoverlapping
- claude/row-c15-walk-varbinview-directly
- claude/row-c16-arith-write-fast-path
- claude/row-c17-specialize-constant-arith
- claude/row-c18-kernel-dispatch-helpers
- claude/row-c19-inventory-registry
- claude/row-c20-constant-kernel
- claude/row-c21-dict-kernel
- claude/row-c22-patched-kernel
- claude/row-c23-runend-kernel
- claude/row-c24-bitpacked-kernel
- claude/row-pr3-kernels
Base of this PR: #7990 (claude/row-c05-codec-nested)
Next in stack: #7992 (claude/row-c07-rowencode-scalarfn)
Combined context
For the full design + rationale, see PR #7985 (top of stack).