-
Notifications
You must be signed in to change notification settings - Fork 658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add divide_opt kernel which produce null values on division by zero error #2710
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,7 +18,7 @@ | |
//! Defines kernels suitable to perform operations to primitive arrays. | ||
|
||
use crate::array::{ | ||
Array, ArrayData, ArrayRef, BufferBuilder, DictionaryArray, PrimitiveArray, | ||
Array, ArrayData, ArrayIter, ArrayRef, BufferBuilder, DictionaryArray, PrimitiveArray, | ||
}; | ||
use crate::buffer::Buffer; | ||
use crate::compute::util::combine_option_bitmap; | ||
|
@@ -257,6 +257,50 @@ where | |
Ok(unsafe { build_primitive_array(len, buffer.finish(), null_count, null_buffer) }) | ||
} | ||
|
||
/// Applies the provided binary operation across `a` and `b`, collecting the optional results | ||
/// into a [`PrimitiveArray`]. If any index is null in either `a` or `b`, the corresponding | ||
/// index in the result will also be null. The binary operation could return `None` which | ||
/// results in a new null in the collected [`PrimitiveArray`]. | ||
/// | ||
/// The function is only evaluated for non-null indices | ||
/// | ||
/// # Panic | ||
/// | ||
/// Panics if the arrays have different lengths | ||
pub(crate) fn binary_opt<A, B, F, O>( | ||
a: &PrimitiveArray<A>, | ||
b: &PrimitiveArray<B>, | ||
op: F, | ||
) -> PrimitiveArray<O> | ||
where | ||
A: ArrowPrimitiveType, | ||
B: ArrowPrimitiveType, | ||
O: ArrowPrimitiveType, | ||
F: Fn(A::Native, B::Native) -> Option<O::Native>, | ||
{ | ||
assert_eq!(a.len(), b.len()); | ||
|
||
if a.is_empty() { | ||
return PrimitiveArray::from(ArrayData::new_empty(&O::DATA_TYPE)); | ||
} | ||
|
||
let iter_a = ArrayIter::new(a); | ||
let iter_b = ArrayIter::new(b); | ||
|
||
let values = iter_a | ||
.into_iter() | ||
.zip(iter_b.into_iter()) | ||
.map(|(item_a, item_b)| { | ||
if let (Some(a), Some(b)) = (item_a, item_b) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what if both There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You mean two input arrays are all non-null? Is there a fast-path? Because There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea, I'm thinking in that case we can avoid this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, you said iterating inputs from two arrays without There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the fast-path which simply iterates all values without |
||
op(a, b) | ||
} else { | ||
None | ||
} | ||
}); | ||
|
||
values.collect() | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use super::*; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a note for other reviewer even post-review, we need such divide kernel which doesn't return Error on overflow (wrapping) or division by zero (getting null instead). We may consider to change
divide
's division by zero behavior but it will downgradedivide
performance. So currently the best option seems to have this separate kernel.