feat: Add ansi enable parameter for execution config
#18635
@Omega359 FYI
datafusion/common/src/config.rs
Outdated
/// semantics for expressions, casting, and error handling. This includes:
/// - Strict type coercion rules: implicit casts between incompatible types are disallowed.
/// - Standard SQL arithmetic behavior: operations like division by zero or overflow
///   result in runtime errors instead of returning `NULL` or silently truncating values.
(nit) In abs(), the value that results in overflow is returned as is.
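To make the nit concrete, here is a minimal sketch of the two behaviours being contrasted for `abs()` on a 32-bit integer. The helper name `abs_i32` and its `ansi` parameter are hypothetical and exist only for illustration; this is not the code in the PR or in the `datafusion-spark` crate.

```rust
/// Hypothetical helper (not part of DataFusion): absolute value of an `i32`
/// under the two modes discussed in this thread.
///
/// - `ansi == true`: overflow is reported as an error.
/// - `ansi == false`: the overflowing input is returned as is.
fn abs_i32(value: i32, ansi: bool) -> Result<i32, String> {
    if ansi {
        // `checked_abs` returns `None` only for `i32::MIN`, whose absolute
        // value does not fit in an `i32`.
        value
            .checked_abs()
            .ok_or_else(|| format!("integer overflow in abs({})", value))
    } else {
        // `wrapping_abs` maps `i32::MIN` back to `i32::MIN`.
        Ok(value.wrapping_abs())
    }
}
```

With this sketch, `abs_i32(i32::MIN, false)` yields `Ok(i32::MIN)`, while `abs_i32(i32::MIN, true)` yields an error.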
Thanks @hsiang-c, let me think about how to rephrase it better.
hsiang-c
left a comment
LGTM
Jefffrey
left a comment
Should we leave a note saying it has experimental support, considering it will need to be gradually introduced to functions?
Especially as I think default behaviour for some DataFusion functions lines up more with true than false (e.g. abs overflow will return compute error which is the true behaviour described here)
Thanks @Jefffrey, I think I overwrote the changes saying this is experimental and relevant for the Spark crate only. Redoing it.
Jefffrey
left a comment
Next steps would be to create an issue to retrofit this config onto existing Spark functions 👍
That's what @hsiang-c is working on with
Just saw this. I was thinking of a 'spark' extension option to handle this myself.
Could you please elaborate? Do you mean to have a set of spark related config options?
Essentially yes. Spark seems to have a - let's put this politely - abundance of configuration options. I'm sure more than just the ansi one will apply to DF spark functions at some point. The only real concern I have with having an I'm not against that at all but it's a concern for me because I have my doubts DF itself would (or perhaps even should?) support anything but what spark deems strict ansi mode (ansi = true). It's a topic that likely should get much more community involvement if this were to be used outside of the spark crate.
Which issue does this PR close?
ansi enable parameter for execution config #18634.
Rationale for this change
Adding an ansi flag to expand coverage for Spark built-in functions. Spark 4.0 sets ansi mode to true by default.
Currently the flag is planned to be used in the datafusion-spark crate via ScalarConfigArgs; however, it can also be used for DF if ansi mode is on the roadmap.
What changes are included in this PR?
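As a rough illustration of the rationale above, the sketch below shows how a function might read an ansi flag from the configuration it is invoked with and switch between returning `NULL` and raising an error. Everything here is an assumption made for illustration: `ExecutionOptions`, `FunctionArgs`, the field name `enable_ansi_mode`, and `divide` are hypothetical stand-ins, not DataFusion's actual types or the option name added by this PR. Treating overflow the same way as division by zero is also a simplification.

```rust
/// Hypothetical stand-in for the execution config; not DataFusion's real type.
#[derive(Debug, Clone, Copy, Default)]
struct ExecutionOptions {
    /// Assumed flag name; defaults to `false` in this sketch.
    enable_ansi_mode: bool,
}

/// Hypothetical stand-in for the per-invocation arguments that carry config
/// into a scalar function.
struct FunctionArgs<'a> {
    config: &'a ExecutionOptions,
}

/// Integer division that consults the ansi flag.
///
/// `Ok(None)` models a SQL `NULL` result; `Err` models a runtime error.
fn divide(lhs: i64, rhs: i64, args: &FunctionArgs<'_>) -> Result<Option<i64>, String> {
    // `checked_div` returns `None` on division by zero and on overflow
    // (`i64::MIN / -1`).
    match lhs.checked_div(rhs) {
        Some(quotient) => Ok(Some(quotient)),
        // ANSI mode: surface the failure as an error.
        None if args.config.enable_ansi_mode => {
            Err(format!("division failed for {} / {}", lhs, rhs))
        }
        // Non-ANSI mode: fall back to NULL.
        None => Ok(None),
    }
}
```

With the flag off, `divide(1, 0, ..)` returns `Ok(None)`; with it on, the same call returns an error, which is the distinction the doc comment under review is trying to describe.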
Are these changes tested?
Are there any user-facing changes?