-
Notifications
You must be signed in to change notification settings - Fork 113
feat: add BinaryNumericFn for array arithmetic #1640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Why ScalarNumeric instead of using is_constant as with all other compute functions? |
|
We could remove |
|
Alright, scalar_subtract and scalar_numeric are completely gone. Four convenience functions survive: |
557f141 to
2f9fc40
Compare
I did not implement any binary numeric functions because it is not clear that there are any cases where we can out run decompression. Two run end arrays might be a happy path? Two dictionaries, maybe, if the dictionaries are much smaller than the decompressed arrays? Scalar numeric functions are more obviously valuable: clickbench includes several uses of scalar add or subtract.
18a0e3a to
6e171f9
Compare
|
@danking none of our compute functions have internal casting. Let's just check LHS and RHS are exactly equal (including null-ability) and fail if not. The caller has to decide casting rules, e.g. the compute engine, else it gets very confusing to keep track of coercion semantics |
|
@gatesn the old subtract_scalar casted the constant (which is what this PR preserves and extends to add, multiply, and divide), we seem to rely on this for shifting usize indices around. I could push the cast into the call sites though? |
|
Yeah I think the call site is best, albeit I bit more annoying |
|
@gatesn done. |
gatesn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it's taken me a while to get to this
| other: PrimitiveScalar<'_>, | ||
| op: NumericOperator, | ||
| ) -> VortexResult<Scalar> { | ||
| if !self.dtype().eq_ignore_nullability(other.dtype()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we ignoring nullability? Not saying we shouldn't be (although maybe we shouldn't be), but if we do this it should be commented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting different nullabilities isn't difficult and does not seem to me likely to affect speed much since we're already working with Scalar rather than primitives. Compare works similarly. What kind of comment are you looking for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only because we of the general approach in Vortex that a compute function should never perform type coercion.
danking
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I resolved all the threads that I think are uncontroversially resolved. I think the unresolved ones still need confirmation or more discussion.
| other: PrimitiveScalar<'_>, | ||
| op: NumericOperator, | ||
| ) -> VortexResult<Scalar> { | ||
| if !self.dtype().eq_ignore_nullability(other.dtype()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting different nullabilities isn't difficult and does not seem to me likely to affect speed much since we're already working with Scalar rather than primitives. Compare works similarly. What kind of comment are you looking for?
vortex-scalar/src/primitive.rs
Outdated
| let lhs = self.typed_value::<$P>(); | ||
| let rhs = other.typed_value::<$P>(); | ||
| match (lhs, rhs) { | ||
| (_, None) | (None, _) => Some(Scalar::null(self.dtype().clone().as_nullable())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the bug I think is still here. If (_, None) is true, and lhs is non-nullable, then you're going to try to create a Scalar::null that's non-nullable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is correct (b/c of the as_nullable) but now every case uses the same (least viable) nullability.
vortex-scalar/src/primitive.rs
Outdated
| vortex_bail!("types must match: {} {}", self.dtype(), other.dtype()); | ||
| } | ||
|
|
||
| let nullability = self.dtype().nullability(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think this should be let nullability = self.dtype.is_nullable() || other.dtype.is_nullable()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…me way regardless of values
I did not implement any binary numeric functions because it is not clear that there are any cases where we can out run decompression. Two run end arrays might be a happy path? Two dictionaries, maybe, if the dictionaries are much smaller than the decompressed arrays?
Binary scalar numeric functions are more obviously valuable: clickbench includes several uses of scalar add or subtract.