Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Part of #3520
The equality check for run-encoded arrays implemented in PR #3662 is incomplete: it compares only the underlying physical arrays, so two arrays with the same logical contents but different run boundaries are reported as unequal.
Describe the solution you'd like
Implement a logical comparison for run-encoded arrays by updating the run_equal function below to perform a full logical comparison.
arrow-rs/arrow-data/src/equal/run.rs
Lines 22 to 32 in 61ea9f2
    /// The current implementation of comparison of run array support physical comparison.
    /// Comparing run encoded array based on logical indices (`lhs_start`, `rhs_start`) will
    /// be time consuming as converting from logical index to physical index cannot be done
    /// in constant time. The current comparison compares the underlying physical arrays.
    pub(super) fn run_equal(
        lhs: &ArrayData,
        rhs: &ArrayData,
        lhs_start: usize,
        rhs_start: usize,
        len: usize,
    ) -> bool {
Describe alternatives you've considered
We cannot do it.
Additional context
Implementing a full logical comparison in the arrow-data crate is somewhat challenging because the crate does not depend on arrow-array, which contains the functions that are useful for parsing the run_ends of a run-encoded array. Either arrow-array has to be added as a dependency of arrow-data, or some of the code in arrow-array has to be duplicated in arrow-data.
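As a rough illustration of what a logical comparison might involve, here is a minimal sketch that does not use the arrow-rs API at all. The `Runs` struct, `physical_index`, and `logical_run_equal` are hypothetical names introduced only for this example; the sketch assumes that `run_ends` holds strictly increasing exclusive logical end positions, that `values` has the same length as `run_ends`, and it ignores null handling entirely. The logical start index is mapped to a physical run with a binary search, and both sides are then walked run by run.

```rust
/// Hypothetical stand-in for a run-encoded array (not an arrow-rs type).
/// `run_ends[i]` is the exclusive logical end of run `i`, and `values[i]`
/// is the value repeated throughout that run.
struct Runs<'a, T> {
    run_ends: &'a [usize],
    values: &'a [T],
}

impl<'a, T> Runs<'a, T> {
    /// Physical index of the run containing logical index `logical`,
    /// found by binary search, so O(log n) rather than constant time.
    /// Returns `run_ends.len()` if `logical` is past the end.
    fn physical_index(&self, logical: usize) -> usize {
        self.run_ends.partition_point(|&end| end <= logical)
    }
}

/// Compares `len` logical elements of `lhs` starting at `lhs_start` with
/// `len` logical elements of `rhs` starting at `rhs_start`.
fn logical_run_equal<T: PartialEq>(
    lhs: &Runs<'_, T>,
    rhs: &Runs<'_, T>,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
    let mut l = lhs.physical_index(lhs_start);
    let mut r = rhs.physical_index(rhs_start);
    let mut offset = 0; // logical elements compared so far

    while offset < len {
        // A requested range that runs past either array is treated as unequal.
        if l >= lhs.values.len() || r >= rhs.values.len() {
            return false;
        }
        if lhs.values[l] != rhs.values[r] {
            return false;
        }
        // Logical elements left in the current run on each side.
        let l_left = lhs.run_ends[l] - (lhs_start + offset);
        let r_left = rhs.run_ends[r] - (rhs_start + offset);
        // Skip ahead to the nearest run boundary (or the end of the range).
        offset += l_left.min(r_left).min(len - offset);
        if lhs_start + offset == lhs.run_ends[l] {
            l += 1;
        }
        if rhs_start + offset == rhs.run_ends[r] {
            r += 1;
        }
    }
    true
}
```

With this approach, only the two start indices need a binary search; the rest of the comparison is linear in the number of runs that overlap the compared range. Since it needs nothing beyond the run-ends and values buffers, something along these lines could in principle live in arrow-data without a dependency on arrow-array, though the real implementation would also have to handle the different run-end integer types and nested value arrays.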