
Implement logical comparison for run encoded array #3747

@askoa

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Part of #3520

The equals method for run encoded arrays implemented in PR #3662 is incomplete: it compares only the underlying physical arrays, not the logical values they represent.

Describe the solution you'd like
Implement a logical comparison for run encoded arrays. Update the run_equal function below so that it performs a full logical comparison.

/// The current implementation of comparison of run array support physical comparison.
/// Comparing run encoded array based on logical indices (`lhs_start`, `rhs_start`) will
/// be time consuming as converting from logical index to physical index cannot be done
/// in constant time. The current comparison compares the underlying physical arrays.
pub(super) fn run_equal(
    lhs: &ArrayData,
    rhs: &ArrayData,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
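
The doc comment above points out that mapping a logical index to a physical index is not a constant-time operation. As a rough illustration only, and assuming run_ends is a strictly increasing slice of cumulative (exclusive) end offsets as in the Arrow run-end encoded layout, a binary search can find the containing run in O(log n). The name physical_index is hypothetical and is not part of arrow-data.

```rust
/// Hypothetical helper (not an arrow-data API): returns the physical index of
/// the run that contains `logical`, or `None` if `logical` is out of range.
/// Assumes `run_ends[i]` is the exclusive logical end of run `i` and that
/// `run_ends` is strictly increasing.
fn physical_index(run_ends: &[i32], logical: usize) -> Option<usize> {
    // Binary search for the first run whose end is strictly greater than `logical`.
    let idx = run_ends.partition_point(|&end| (end as usize) <= logical);
    (idx < run_ends.len()).then_some(idx)
}
```

For example, with run_ends = [2, 5, 6], logical indices 0 and 1 fall in run 0, indices 2 to 4 in run 1, index 5 in run 2, and index 6 is out of range.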

Describe alternatives you've considered
We cannot do it.

Additional context
Implementing a full logical comparison in the arrow-data crate would be somewhat challenging because the crate does not depend on arrow-array. The arrow-array crate has functions that are useful for parsing the run_ends of a run encoded array. Either arrow-array has to be added as a dependency of arrow-data, or some of the code in arrow-array has to be duplicated in arrow-data.
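
To make the idea concrete, here is a minimal sketch of a logical comparison that needs neither crate. It is not the arrow-data implementation: RunArray and logical_run_equal are hypothetical names, the run ends are assumed to be i32 and strictly increasing, the values are plain i64, and nulls are ignored. It locates the starting run on each side with a binary search, then walks both run lists in lockstep.

```rust
/// Hypothetical stand-in for a run-end encoded array (not `ArrayData`).
/// `run_ends[i]` is the exclusive logical end of run `i`; `values[i]` is its value.
struct RunArray<'a> {
    run_ends: &'a [i32],
    values: &'a [i64],
}

/// Compares `len` logical elements starting at `lhs_start` and `rhs_start`.
/// Returns `false` if either range is out of bounds (a choice made for this sketch).
fn logical_run_equal(
    lhs: &RunArray,
    rhs: &RunArray,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
    if len == 0 {
        return true;
    }
    let lhs_len = lhs.run_ends.last().map_or(0, |&e| e as usize);
    let rhs_len = rhs.run_ends.last().map_or(0, |&e| e as usize);
    if lhs_start + len > lhs_len || rhs_start + len > rhs_len {
        return false;
    }

    // Physical index of the run containing each logical start (O(log n)).
    let mut li = lhs.run_ends.partition_point(|&e| (e as usize) <= lhs_start);
    let mut ri = rhs.run_ends.partition_point(|&e| (e as usize) <= rhs_start);

    // Number of logical elements already known to be equal.
    let mut done = 0usize;
    while done < len {
        if lhs.values[li] != rhs.values[ri] {
            return false;
        }
        // Elements left in the current run on each side.
        let l_rem = lhs.run_ends[li] as usize - (lhs_start + done);
        let r_rem = rhs.run_ends[ri] as usize - (rhs_start + done);
        // Skip ahead to the nearest run boundary on either side.
        let step = l_rem.min(r_rem);
        done += step;
        if step == l_rem {
            li += 1;
        }
        if step == r_rem {
            ri += 1;
        }
    }
    true
}
```

With this approach, run ends [2, 5] with values [7, 9] and run ends [1, 2, 4, 5] with values [7, 7, 9, 9] compare as equal over all five logical elements, even though their physical arrays differ. After the initial binary searches, the cost is proportional to the number of runs that overlap the compared ranges rather than to len.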

Labels: enhancement (any new improvement worthy of an entry in the changelog)
