bug: many Operations can't handle NULL optional arguments #8833
Comments
None indicates that the argument wasn't provided, NULL is something else. We should avoid conflating these two uses.
The failures notwithstanding, the duckdb behavior looks correct to me in this case: null inputs produce null outputs, NULL doesn't mean "no argument". The representation of "argument not provided" is unrelated to whether an input is NULL.
hmmm, I think you are right. We are victims of the SQL standard here. I don't really like this footgun:
I think there are other approaches to the problem. I'll state the problem so that we have it written somewhere. Ibis uses Users explicitly passing Since
One approach that will break user code, but is probably the right approach IMO is to use some other sentinel value to mean "argument wasn't provided". Bit of a hack, but fairly common is to create a dummy object like
I find the NO_ARGUMENT solution fairly good, I think it might be worth going through a deprecation cycle to get this right.
wait, would this then mean that for
What happened?
consider the implementation of
This allows you to leave out the length argument, meaning "until the end of the string". But this is only represented on the python side. If the length is evaluated at runtime to be NULL, then this errors on some backends, like postgres, or gives the wrong result on duckdb.
For example,
ibis.literal("abcde").substr(2, ibis.literal(1).nullif(1))
results in NULL in duckdb, but I would expect it to be "cde".
During compilation, we are naive and do the NULL checking only on the python side:
I discovered this when adding more tests to #8832.
I think what we should do is make Substring more like
and then use sql CASE statements if we can't determine the nullness statically.
but I wasn't able to get this to work. Does this seem like the right direction? Any tips on what to do here?
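To make the CASE idea concrete, here is a rough sketch outside of ibis. It assumes a hypothetical helper, `substr_sql`, that builds SQL text from fragments, and it runs against SQLite from the Python standard library rather than duckdb or postgres, so it only illustrates the shape of the fallback. Note that SQL `substr` is 1-based, so ibis's `substr(2, ...)` corresponds to a start of 3 here.

```python
import sqlite3
from typing import Optional


def substr_sql(arg: str, start: str, length: Optional[str] = None) -> str:
    """Build a SQL substring expression from SQL fragments (hypothetical helper).

    `length=None` means the argument was not given on the Python side.
    """
    if length is None:
        # Statically known: no length was passed, so use the two-argument form.
        return f"substr({arg}, {start})"
    # Nullness is only known at runtime: fall back to the two-argument form
    # when the length expression evaluates to NULL.
    return (
        f"CASE WHEN ({length}) IS NULL "
        f"THEN substr({arg}, {start}) "
        f"ELSE substr({arg}, {start}, {length}) END"
    )


con = sqlite3.connect(":memory:")

# Naive compilation: a runtime-NULL length makes the whole result NULL.
naive = con.execute("SELECT substr('abcde', 3, nullif(1, 1))").fetchone()[0]

# Guarded compilation: a runtime-NULL length means "until the end of the string".
guarded = con.execute(
    "SELECT " + substr_sql("'abcde'", "3", "nullif(1, 1)")
).fetchone()[0]

# A non-NULL length still goes through the three-argument form.
bounded = con.execute("SELECT " + substr_sql("'abcde'", "3", "2")).fetchone()[0]
```

Here `naive` comes back as NULL while `guarded` is "cde". Whether a runtime NULL should mean "until the end of the string" at all is the open question in the comments on this issue.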
What version of ibis are you using?
main
What backend(s) are you using, if any?
No response
Relevant log output
No response