@danking
Contributor

@danking danking commented Dec 10, 2024

I did not implement any binary numeric functions because it is not clear that there are any cases where we can outrun decompression. Two run-end arrays might be a happy path? Two dictionaries, maybe, if the dictionaries are much smaller than the decompressed arrays?

Binary scalar numeric functions are more obviously valuable: ClickBench includes several uses of scalar add or subtract.
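
To make the scalar case concrete, here is a minimal sketch of why a scalar add can skip decompression on a run-end encoded array. The `RunEnd` type and its methods are hypothetical stand-ins for illustration, not Vortex's actual types: the operation touches one value per run and leaves the run boundaries alone.

```rust
/// A toy run-end encoded array of i64 values (hypothetical, not Vortex's type).
/// `ends[i]` is the exclusive end index of run `i`; `values[i]` is its value.
#[derive(Debug, Clone, PartialEq)]
pub struct RunEnd {
    pub ends: Vec<usize>,
    pub values: Vec<i64>,
}

impl RunEnd {
    /// Adds a scalar by touching one value per run; run boundaries are unchanged.
    /// Returns `None` if any addition overflows.
    pub fn add_scalar(&self, rhs: i64) -> Option<RunEnd> {
        let values = self
            .values
            .iter()
            .map(|v| v.checked_add(rhs))
            .collect::<Option<Vec<_>>>()?;
        Some(RunEnd {
            ends: self.ends.clone(),
            values,
        })
    }

    /// Expands the runs into a plain vector, for comparison with the encoded result.
    pub fn decode(&self) -> Vec<i64> {
        let mut out = Vec::with_capacity(self.ends.last().copied().unwrap_or(0));
        let mut start = 0;
        for (&end, &value) in self.ends.iter().zip(&self.values) {
            out.extend(std::iter::repeat(value).take(end - start));
            start = end;
        }
        out
    }
}
```

The work is proportional to the number of runs rather than the decoded length, which is the property that makes the scalar case attractive.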

@danking danking marked this pull request as ready for review December 10, 2024 22:21
@gatesn
Contributor

gatesn commented Dec 10, 2024

Why ScalarNumeric instead of using is_constant as with all other compute functions?

@danking
Contributor Author

danking commented Dec 10, 2024

subtract_scalar was already present and this was the natural generalization.

We could remove subtract_scalar and friends and replace them with functions that create constant arrays of the right length and apply the binary operator, with the binary operators handling constant RHSes. What was the reasoning for subtract_scalar?

@danking danking added the benchmark Run benchmarks on this branch label Dec 11, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 11, 2024
@danking danking added the benchmark Run benchmarks on this branch label Dec 11, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 11, 2024
@danking
Contributor Author

danking commented Dec 11, 2024

Alright, subtract_scalar and scalar_numeric are completely gone. Four convenience functions survive (sub_scalar, etc.), which delegate to binary_numeric.
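
As a rough sketch of that shape, assuming a toy `Array` enum with only primitive and constant variants (the names `Array`, `apply`, and the signatures here are illustrative assumptions, not the PR's actual code), a convenience function can wrap the scalar in a constant array and delegate:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumericOperator {
    Add,
    Sub,
    Mul,
    Div,
}

/// Toy array type (hypothetical): either materialized values or one repeated value.
#[derive(Debug, Clone, PartialEq)]
pub enum Array {
    Primitive(Vec<i64>),
    Constant { value: i64, len: usize },
}

impl Array {
    pub fn len(&self) -> usize {
        match self {
            Array::Primitive(values) => values.len(),
            Array::Constant { len, .. } => *len,
        }
    }

    fn get(&self, i: usize) -> i64 {
        match self {
            Array::Primitive(values) => values[i],
            Array::Constant { value, .. } => *value,
        }
    }
}

/// Applies one checked operation; overflow and division by zero become errors.
fn apply(op: NumericOperator, l: i64, r: i64) -> Result<i64, String> {
    let out = match op {
        NumericOperator::Add => l.checked_add(r),
        NumericOperator::Sub => l.checked_sub(r),
        NumericOperator::Mul => l.checked_mul(r),
        NumericOperator::Div => l.checked_div(r),
    };
    out.ok_or_else(|| format!("{op:?} failed for {l} and {r}"))
}

/// Single entry point; constant operands are handled without materializing them.
pub fn binary_numeric(lhs: &Array, rhs: &Array, op: NumericOperator) -> Result<Array, String> {
    if lhs.len() != rhs.len() {
        return Err(format!("length mismatch: {} vs {}", lhs.len(), rhs.len()));
    }
    match (lhs, rhs) {
        // Constant op constant stays constant: one operation in total.
        (Array::Constant { value: l, len }, Array::Constant { value: r, .. }) => {
            Ok(Array::Constant { value: apply(op, *l, *r)?, len: *len })
        }
        // Constant RHS: read the scalar once, never expand it.
        (Array::Primitive(values), Array::Constant { value: r, .. }) => values
            .iter()
            .map(|&l| apply(op, l, *r))
            .collect::<Result<Vec<_>, _>>()
            .map(Array::Primitive),
        // General element-wise fallback.
        _ => (0..lhs.len())
            .map(|i| apply(op, lhs.get(i), rhs.get(i)))
            .collect::<Result<Vec<_>, _>>()
            .map(Array::Primitive),
    }
}

/// Convenience wrapper: build a constant RHS of the right length and delegate.
pub fn sub_scalar(lhs: &Array, rhs: i64) -> Result<Array, String> {
    let rhs = Array::Constant { value: rhs, len: lhs.len() };
    binary_numeric(lhs, &rhs, NumericOperator::Sub)
}
```

With this layout the scalar helpers carry no logic of their own, so any encoding-specific fast path only has to be written once, inside the binary function.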

github-actions[bot]

This comment was marked as outdated.

@gatesn
Contributor

gatesn commented Dec 13, 2024

@danking none of our compute functions have internal casting.

Let's just check that LHS and RHS are exactly equal (including nullability) and fail if not. The caller, e.g. the compute engine, has to decide casting rules; otherwise it gets very confusing to keep track of coercion semantics.

@danking
Contributor Author

danking commented Dec 13, 2024

@gatesn the old subtract_scalar cast the constant (this PR preserves that and extends it to add, multiply, and divide). We seem to rely on this for shifting usize indices around. I could push the cast into the call sites though?

@gatesn
Contributor

gatesn commented Dec 13, 2024

Yeah, I think the call site is best, albeit a bit more annoying.

@danking
Contributor Author

danking commented Dec 13, 2024

@gatesn done.
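
The division of labor agreed on in this thread might look roughly like the sketch below. Everything here (`PType`, `DType`, `Scalar`, `fits`, `sub`, `shift_index`) is a hypothetical toy model rather than Vortex's API: the compute function rejects mismatched dtypes outright, and the caller performs an explicit, fallible cast before calling it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PType {
    U32,
    U64,
    I64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DType {
    pub ptype: PType,
    pub nullable: bool,
}

/// Toy scalar (hypothetical): the value is widened to i128 so range checks are easy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Scalar {
    pub dtype: DType,
    pub value: Option<i128>,
}

/// Returns true if `v` is representable in `ptype`.
fn fits(ptype: PType, v: i128) -> bool {
    match ptype {
        PType::U32 => (0..=(u32::MAX as i128)).contains(&v),
        PType::U64 => (0..=(u64::MAX as i128)).contains(&v),
        PType::I64 => ((i64::MIN as i128)..=(i64::MAX as i128)).contains(&v),
    }
}

impl Scalar {
    /// Explicit, fallible cast. In this sketch it is the only place coercion happens.
    pub fn cast(self, to: DType) -> Result<Scalar, String> {
        match self.value {
            None if !to.nullable => Err("cannot cast null to a non-nullable dtype".to_string()),
            Some(v) if !fits(to.ptype, v) => Err(format!("{v} does not fit in {:?}", to.ptype)),
            value => Ok(Scalar { dtype: to, value }),
        }
    }
}

/// Strict compute function: refuses mismatched dtypes instead of coercing.
pub fn sub(lhs: Scalar, rhs: Scalar) -> Result<Scalar, String> {
    if lhs.dtype != rhs.dtype {
        return Err(format!("types must match: {:?} {:?}", lhs.dtype, rhs.dtype));
    }
    match (lhs.value, rhs.value) {
        (Some(l), Some(r)) if fits(lhs.dtype.ptype, l - r) => {
            Ok(Scalar { dtype: lhs.dtype, value: Some(l - r) })
        }
        (Some(l), Some(r)) => Err(format!("{l} - {r} overflows {:?}", lhs.dtype.ptype)),
        // A null operand implies the (shared) dtype is nullable for well-formed scalars.
        _ => Ok(Scalar { dtype: lhs.dtype, value: None }),
    }
}

/// Call site: the caller owns the coercion rule (here: cast the usize offset
/// to the index's dtype) and only then invokes the strict function.
pub fn shift_index(index: Scalar, offset: usize) -> Result<Scalar, String> {
    let offset = Scalar {
        dtype: DType { ptype: PType::U64, nullable: false },
        value: Some(offset as i128),
    }
    .cast(index.dtype)?;
    sub(index, offset)
}
```

The call site is slightly noisier, but the coercion rule is visible where it is chosen rather than buried inside the compute function.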

@danking danking requested review from gatesn and lwwmanning December 13, 2024 20:07
@danking danking changed the title from "feat: add BinaryNumericFn and ScalarNumericFn for array arithmetic" to "feat: add BinaryNumericFn for array arithmetic" Dec 13, 2024
Contributor

@gatesn gatesn left a comment

Sorry it's taken me a while to get to this

    other: PrimitiveScalar<'_>,
    op: NumericOperator,
) -> VortexResult<Scalar> {
    if !self.dtype().eq_ignore_nullability(other.dtype()) {
Contributor

Why are we ignoring nullability? Not saying we shouldn't be (although maybe we shouldn't be), but if we do this it should be commented.

Contributor Author

Supporting different nullabilities isn't difficult and does not seem to me likely to affect speed much since we're already working with Scalar rather than primitives. Compare works similarly. What kind of comment are you looking for?

Contributor

Only because of the general approach in Vortex that a compute function should never perform type coercion.

Contributor Author

@danking danking left a comment

OK, I resolved all the threads that I think are uncontroversially resolved. I think the unresolved ones still need confirmation or more discussion.

@danking danking requested a review from gatesn December 17, 2024 18:13
@danking danking enabled auto-merge (squash) December 17, 2024 18:35
let lhs = self.typed_value::<$P>();
let rhs = other.typed_value::<$P>();
match (lhs, rhs) {
    (_, None) | (None, _) => Some(Scalar::null(self.dtype().clone().as_nullable())),
Contributor

This is the bug I think is still here. If (_, None) is true, and lhs is non-nullable, then you're going to try to create a Scalar::null that's non-nullable

Contributor Author

This case is correct (b/c of the as_nullable) but now every case uses the same (least viable) nullability.

    vortex_bail!("types must match: {} {}", self.dtype(), other.dtype());
}

let nullability = self.dtype().nullability();
Contributor

@gatesn gatesn Dec 17, 2024

I also think this should be `let nullability = self.dtype.is_nullable() || other.dtype.is_nullable()`

Contributor Author

Done.
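
For reference, the rule settled on in this thread can be sketched as follows, using a hypothetical toy `Scalar` (not Vortex's type): the result is nullable if either operand is, so a null result can never be built with a non-nullable type.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nullability {
    NonNullable,
    Nullable,
}

/// Toy scalar (hypothetical): an optional i64 plus the nullability of its dtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Scalar {
    pub value: Option<i64>,
    pub nullability: Nullability,
}

impl Scalar {
    /// Rejects the invalid combination of a null value with a non-nullable dtype.
    pub fn new(value: Option<i64>, nullability: Nullability) -> Result<Scalar, String> {
        if value.is_none() && nullability == Nullability::NonNullable {
            return Err("null scalar requires a nullable dtype".to_string());
        }
        Ok(Scalar { value, nullability })
    }
}

/// Adds two scalars. The result is nullable if either operand is nullable.
pub fn add(lhs: Scalar, rhs: Scalar) -> Result<Scalar, String> {
    let either_nullable = lhs.nullability == Nullability::Nullable
        || rhs.nullability == Nullability::Nullable;
    let nullability = if either_nullable {
        Nullability::Nullable
    } else {
        Nullability::NonNullable
    };
    let value = match (lhs.value, rhs.value) {
        (Some(l), Some(r)) => Some(
            l.checked_add(r)
                .ok_or_else(|| "overflow in add".to_string())?,
        ),
        // Any null operand yields a null result; at least one side is nullable here.
        _ => None,
    };
    Scalar::new(value, nullability)
}
```

Deriving the result nullability from both operands, rather than from the left-hand side alone, is what keeps the `(_, None)` case valid when the left operand is non-nullable.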

@danking danking requested a review from gatesn December 17, 2024 20:16
@danking danking merged commit fa08a07 into develop Dec 17, 2024
20 checks passed
@danking danking deleted the dk/arithmetic branch December 17, 2024 20:17

4 participants