Make previous unchecked division kernel available #6970

kazuyukitanimura · 2023-07-14T21:15:02Z

Which issue does this PR close?

Rationale for this change

With #6792, now divisions check dividing by zero.
That is an API change and we are (and some other users may be) looking for an old behavior.

What changes are included in this PR?

This PR implements DivideUnchecked operation that behaves like the previous Division operator (return null for dividing by zero)

Are these changes tested?

Yes

Are there any user-facing changes?

No, but users have a choice explicitly to use the previous division behavior.

alamb

Makes sense to me -- thank you for the contribution @kazuyukitanimura

cc @liukun4515 and @tustvold

tustvold · 2023-07-15T12:34:16Z

This operator has actually been removed upstream... Perhaps this could be implemented as a nullif zero prior to division? This would be consistent with postgres.

As written this PR would reopen #6791

alamb · 2023-07-15T21:15:15Z

This operator has actually been removed upstream... Perhaps this could be implemented as a nullif zero prior to division? This would be consistent with postgres.

Confirming you mean something like https://docs.rs/datafusion/latest/datafusion/logical_expr/fn.nullif.html so like

let expr = nullif(expr, lit(0.0f64))

What do you think @kazuyukitanimura ?

ozankabak · 2023-07-17T11:52:17Z

Let's make this preference (exception vs. special value as the default behavior) configurable. There are many use cases where one wants the output of division to be nullable for integers, or NaN for floats. IIRC, we follow IEEE 754 for floats, so there is no issue there.

For integers, PostgreSQL goes the exception route, but others (e.g. Spark IIRC) proceed with execution in terms of default behavior. Applications involving numerics (custom fixed point data etc.) may prefer the latter default behavior and handle such situations downstream. The entire family of streaming use cases is also another example where exception behavior is problematic.

If we make nullable divisions a configurable behavior for integer types and act according to user's preferences, I think everyone will be happy.

tustvold · 2023-07-17T12:30:03Z

I would be fine with making this configurable and use the nullif kernel as suggested.

That being said it does seem inconsistent to special case integer division by zero and not all the other sources of errors in DF, from casting to arithmetic overflow. I wonder if there is a higher level issue if not producing errors is important? FWIW in ANSI mode, which I think is what we should be aiming to replicate, Spark will return errors.

ozankabak · 2023-07-17T12:57:20Z

That being said it does seem inconsistent to special case integer division by zero and not all the other sources of errors in DF, from casting to arithmetic overflow.

This is true. We will probably have a small family of flags to control these (adding them as we make progress in the project and needs arise) and have an accompanying general (exception-vs-special-value) flag to collectively handle them. Most users will simply deal with that one flag and be happy. Advanced users will be able to set that one flag to establish their baseline and fine tune the behavior according to their own requirements by toggling individual flags. The experience will be in line with compilers' approach to these things so we would have POLS even for advanced users.

kazuyukitanimura · 2023-07-17T22:12:45Z

Thank you all for the reviews. I am currently testing the performance of the nullif approach.

alamb · 2023-07-18T20:49:15Z

Marking as a draft so I don't accidentally merge this PR

kazuyukitanimura · 2023-07-31T20:26:57Z

It seems no performance degradation for the nullif solution. Closing.
Thanks

Make previous unchecked division kernel available

3f01130

github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates substrait Changes to the substrait crate labels Jul 14, 2023

fix

6fcbc4f

alamb approved these changes Jul 15, 2023

View reviewed changes

alamb marked this pull request as draft July 18, 2023 20:49

kazuyukitanimura closed this Jul 31, 2023

tustvold mentioned this pull request Aug 8, 2023

Make unchecked division kernel available #6967

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Make previous unchecked division kernel available #6970

Make previous unchecked division kernel available #6970

Uh oh!

kazuyukitanimura commented Jul 14, 2023

Uh oh!

alamb left a comment

Uh oh!

tustvold commented Jul 15, 2023 •

edited

Loading

Uh oh!

alamb commented Jul 15, 2023 •

edited

Loading

Uh oh!

ozankabak commented Jul 17, 2023 •

edited

Loading

Uh oh!

tustvold commented Jul 17, 2023

Uh oh!

ozankabak commented Jul 17, 2023

Uh oh!

kazuyukitanimura commented Jul 17, 2023

Uh oh!

alamb commented Jul 18, 2023

Uh oh!

kazuyukitanimura commented Jul 31, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Make previous unchecked division kernel available #6970

Make previous unchecked division kernel available #6970

Uh oh!

Conversation

kazuyukitanimura commented Jul 14, 2023

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

alamb left a comment

Choose a reason for hiding this comment

Uh oh!

tustvold commented Jul 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alamb commented Jul 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ozankabak commented Jul 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tustvold commented Jul 17, 2023

Uh oh!

ozankabak commented Jul 17, 2023

Uh oh!

kazuyukitanimura commented Jul 17, 2023

Uh oh!

alamb commented Jul 18, 2023

Uh oh!

kazuyukitanimura commented Jul 31, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

tustvold commented Jul 15, 2023 •

edited

Loading

alamb commented Jul 15, 2023 •

edited

Loading

ozankabak commented Jul 17, 2023 •

edited

Loading