RunArray::slice() should align run_ends and values with the logical slice #10017

@Rich-T-kid

Description

RunArray::slice() should trim the underlying values array

RunArray::slice() currently only tracks the slice logically and does not adjust the underlying values array, so a sliced RunArray still references the full original values. As a result, code that operates on array.values() processes every value in the original array, not just the values covered by the slice.

Example

use arrow::array::{Int32Array, RunArray};

let run_ends = Int32Array::from(vec![3, 6, 9]);
let values = Int32Array::from(vec![1, 2, 3]);
// Logical contents: [1, 1, 1, 2, 2, 2, 3, 3, 3]
let array = RunArray::try_new(&run_ends, &values).unwrap();

// Logically [2, 2, 2], but still references the original arrays:
// {1, 1, 1, [2, 2, 2], 3, 3, 3}
let array_sliced = array.slice(3, 3);

Ideally array_sliced would be { run_ends: [3], values: [2] }, but the current implementation of RunArray::slice() preserves the full values array.
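
To make the desired result concrete, here is a minimal sketch of the trimming logic on plain slices. It does not use the arrow-rs types, and `trim_runs` is a hypothetical helper name, not an existing API. It assumes `i32` run ends and that the requested range lies within the array:

```rust
/// Hypothetical helper (not part of arrow-rs): returns the run ends and values
/// that cover only the logical range [offset, offset + len), with the run ends
/// rebased so that the slice starts at logical index 0.
fn trim_runs<T: Clone>(
    run_ends: &[i32],
    values: &[T],
    offset: usize,
    len: usize,
) -> (Vec<i32>, Vec<T>) {
    assert_eq!(run_ends.len(), values.len());
    if len == 0 {
        return (Vec::new(), Vec::new());
    }
    let end = offset + len;
    let total = run_ends.last().copied().unwrap_or(0) as usize;
    assert!(end <= total, "slice out of bounds");

    // First physical run that contains logical index `offset`.
    let start_phys = run_ends.partition_point(|&e| (e as usize) <= offset);
    // Last physical run that contains logical index `end - 1`.
    let end_phys = run_ends.partition_point(|&e| (e as usize) <= end - 1);

    // Clamp each run end to the slice end, then shift it by `offset`.
    let new_run_ends = run_ends[start_phys..=end_phys]
        .iter()
        .map(|&e| ((e as usize).min(end) - offset) as i32)
        .collect();
    let new_values = values[start_phys..=end_phys].to_vec();
    (new_run_ends, new_values)
}
```

For the example above, `trim_runs(&[3, 6, 9], &[1, 2, 3], 3, 3)` returns `(vec![3], vec![2])`. A slice that starts or ends in the middle of a run keeps that run but shortens it, e.g. offset 2 and length 5 gives run ends `[1, 4, 5]` with all three values retained.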

Why it matters

Patterns like the following do extra work on values outside the slice:

let values = date_part(array.values(), part)?;
let new_array = array.with_values(values);

This is not a correctness bug, but the extra work scales with the size of the original values array, not the slice, so it is worth avoiding.

Regression checks

Each call site of RunArray::slice() needs to be reviewed to ensure this change doesn't introduce breaking behavior.
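
One property a regression test could assert is that the trimmed representation decodes to the same logical values as the untrimmed one. The sketch below models this on plain slices. `decode_slice` is a hypothetical helper for illustration only, and it panics if the range is out of bounds:

```rust
/// Hypothetical helper (not part of arrow-rs): expands the logical range
/// [offset, offset + len) of a run-end encoded pair into plain values.
fn decode_slice<T: Clone>(
    run_ends: &[i32],
    values: &[T],
    offset: usize,
    len: usize,
) -> Vec<T> {
    assert_eq!(run_ends.len(), values.len());
    let mut out = Vec::with_capacity(len);
    // Physical index of the run that contains logical index `offset`.
    let mut phys = run_ends.partition_point(|&e| (e as usize) <= offset);
    for logical in offset..offset + len {
        // Advance to the run that contains `logical`.
        while (run_ends[phys] as usize) <= logical {
            phys += 1;
        }
        out.push(values[phys].clone());
    }
    out
}
```

With this helper, the untrimmed slice `decode_slice(&[3, 6, 9], &[1, 2, 3], 3, 3)` and the trimmed form `decode_slice(&[3], &[2], 0, 3)` should produce the same vector, which is the invariant each call site relies on.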

Reference

Original comment thread: #9959 (comment)
