feat[vortex-array]: add stepped pipeline execution #4312
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
many_single_char_names = "deny"
mem_forget = "deny"
multiple_crate_versions = "allow"
needless_range_loop = "allow"
Why allow?
We talked about this: https://spiraldb.slack.com/archives/C07BV3GKAJ2/p1755518451455489
gatesn left a comment
Let's gooooo
Vortex Vector Pipeline: Stepped Execution Engine for Columnar Compute
This PR introduces a vectorized pipeline execution engine for Vortex that processes data in fixed-size chunks to maximize cache locality and enable efficient query planning. The design
implements a DAG-based execution model where compute operations are broken into simple, composable kernels that can be optimized at planning time and executed efficiently at runtime.
This should be the new way of implementing compute functions going forward.
Solution: Stepped Execution with Cache-Resident Chunks
Core Concept
Instead of processing entire arrays operation by operation, we process small chunks through the entire operation pipeline while data remains cache-resident:
chunk[0..1024]
→ filter → expr_1 → expr_2 → expr_3 → output[0..n]
chunk[1024..2048]
→ filter → expr_1 → expr_2 → expr_3 → output[n..m]
...
Each 1024-element chunk flows through all operations while staying in L1 cache, dramatically improving memory bandwidth utilization.
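As a rough illustration of this execution order (the function names and the filter/expression bodies here are invented stand-ins, not the actual Vortex API), the key point is that each chunk is driven through every stage before the next chunk is touched:

```rust
/// Hypothetical sketch: process each fixed-size chunk through the whole
/// pipeline while it is cache-resident, rather than running each operation
/// over the entire array in turn.
const CHUNK_LEN: usize = 1024;

fn run_pipeline(input: &[i64], output: &mut Vec<i64>) {
    for chunk in input.chunks(CHUNK_LEN) {
        // filter: keep even values (stands in for a predicate kernel)
        let filtered: Vec<i64> = chunk.iter().copied().filter(|v| v % 2 == 0).collect();
        // expr_1: x + 1
        let step1: Vec<i64> = filtered.iter().map(|v| v + 1).collect();
        // expr_2: x * 2
        let step2: Vec<i64> = step1.iter().map(|v| v * 2).collect();
        // this chunk's results are emitted before the next chunk is read
        output.extend(step2);
    }
}
```

In a real engine the intermediate buffers would be reused across chunks instead of reallocated, but the loop structure is the same.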
Concepts
Each kernel implements a single, simple operation; this simplicity keeps kernels small and composable.
Operations work on canonical physical representations, not logical types, which eliminates encoding complexity from kernel implementations.
Operations use trait-based dispatch for zero-overhead abstraction.
The DAG structure enables powerful optimizations before execution.
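A minimal sketch of what trait-based kernel dispatch could look like (the `Kernel` trait and kernel names here are assumptions for illustration, not the PR's actual types; a real pipeline would reuse buffers and hold a rewritable DAG rather than a flat list):

```rust
/// Each kernel is one simple operation over a chunk of canonical values.
trait Kernel {
    fn execute(&self, input: &[i64], output: &mut Vec<i64>);
}

struct AddScalar(i64);
impl Kernel for AddScalar {
    fn execute(&self, input: &[i64], output: &mut Vec<i64>) {
        output.clear();
        output.extend(input.iter().map(|v| v + self.0));
    }
}

struct MulScalar(i64);
impl Kernel for MulScalar {
    fn execute(&self, input: &[i64], output: &mut Vec<i64>) {
        output.clear();
        output.extend(input.iter().map(|v| v * self.0));
    }
}

/// Drive one chunk through an ordered list of kernels, ping-ponging
/// between two buffers. Planning can rewrite this kernel list (fuse,
/// reorder) before any chunk is executed.
fn run(kernels: &[Box<dyn Kernel>], chunk: &[i64]) -> Vec<i64> {
    let mut cur = chunk.to_vec();
    let mut next = Vec::new();
    for k in kernels {
        k.execute(&cur, &mut next);
        std::mem::swap(&mut cur, &mut next);
    }
    cur
}
```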
Implementation Status
Implemented Operators & Kernels
Future Work
- Convert vortex-expr to operators and operator DAGs
- Replace to_canonical for specific encodings (BitPacking, FOR, Primitive)
- Transparent handling of encoded data
- In-place operations for unary functions
Performance Impact
For typical analytical workloads (filter + projections):
Integration Path
Handling other arrays:
This can be done with multiple pipelines, where each one can be composed with either materialise nodes or IO nodes.
A Dict(codes, values) operator would be defined as decompressing the array in the usual way. However, we would likely want to either decompress all the values at once [or take the values optimally]. This would be modelled as two pipelines: one to materialise the values into a full array [this can be stepped or a compute function], and another to take values using the codes in a stepped pipeline.
Missing elements: