Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: different behavior of StringConcat with NULLs depending on backend #8302

Closed
1 task done
NickCrews opened this issue Feb 10, 2024 · 1 comment
Closed
1 task done
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Copy link
Contributor

NickCrews commented Feb 10, 2024

What happened?

import ibis

NULL = ibis.literal(None, type=str)
assert ibis.duckdb.connect().execute(NULL + NULL) == ""
assert ibis.duckdb.connect().execute(NULL + "abc") == "abc"

assert ibis.polars.connect().execute(NULL + NULL) == None
assert ibis.polars.connect().execute(NULL + "abc") == None

assert ibis.sqlite.connect().execute(NULL + NULL) == None
assert ibis.sqlite.connect().execute(NULL + "abc") == None

# TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'
# assert ibis.pandas.connect().execute(NULL + NULL) == None
# assert ibis.pandas.connect().execute(NULL + "abc") == None

# in DuckDB, NULL propagation depends on the exact operator used.
# postgres has the same semantics as DuckDB.
# sqlite only has the || operator, which always propagates NULL.
import duckdb

conn = duckdb.connect()
assert conn.execute("SELECT NULL::TEXT || NULL::TEXT;").fetchone()[0] == None
assert conn.execute("SELECT NULL::TEXT || 'abc'::TEXT;").fetchone()[0] == None
assert conn.execute("SELECT CONCAT(NULL::TEXT, NULL::TEXT);").fetchone()[0] == ""
assert conn.execute("SELECT CONCAT(NULL::TEXT, 'abc'::TEXT);").fetchone()[0] == "abc"


# pandas always propagates NULL
import pandas as pd

a = pd.Series([None, "abc"], dtype="string")
b = pd.Series([None, None], dtype="string")
assert (a + b).tolist() == [pd.NA, pd.NA]

# polars always propagates NULL
import polars as pl

a = pl.Series([None, "abc"])
b = pl.Series([None, None])
assert (a + b).to_list() == [None, None]

I haven't tested other backends so IDK what they do.

This is currently the cause of the failing duckdb test case in #8270, where currently ibis.literal(None, "string").capitalize() results in "", not NULL.

We need to

  1. fix duckdb
  2. add pandas implementation
  3. add some test cases
  4. add docstring examples to show it off.

What version of ibis are you using?

8.0.0

What backend(s) are you using, if any?

No response

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Feb 10, 2024
@NickCrews NickCrews changed the title bug: Specify behavior of StringConcat with NULLs bug: different behavior of StringConcat with NULLs depending on backend Feb 10, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Feb 10, 2024
partially fixes ibis-project#8302.
Still need to make pandas actually handle NULLs at all.
kszucs pushed a commit that referenced this issue Feb 12, 2024
@cpcloud
Copy link
Member

cpcloud commented Feb 12, 2024

Fixed by #8305.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior inside of ibis
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants