bug: several example datasets broken on postgres #8792
Comments
I tried this out on 8.0.0 and it's also broken there, so not a new regression related to the sqlglot shift.
I was poking at this a little bit, just trying to narrow down the problem. The immediate cause appears to be the pyarrow CSV reader -- if the backend isn't one of duckdb or polars, then we load the csv using pyarrow.csv, then make it into a memtable, then pass it to
I will note that this impacts most SQL backends that have strong opinions about types (so all of them except sqlite).
@gforsyth Thanks for the pointer! I dug a little deeper into what you said above, and I believe the issue arises when there are null values in a column. In the
Can we replace these NaN values with None/NULL? I wasn't able to test all of the backends (I have trouble setting some of them up), but this is working for bigquery, clickhouse, dask, datafusion, pandas, sqlite, and trino. It is broken in mysql and pyspark. I can raise a PR to fix this if we're on the same page.
I think it's definitely fine to convert NaNs to NULL -- the important thing here is for the backends to load some example data, so anything in service of that seems good to me!
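For anyone following along, here is a rough sketch of the NaN-to-NULL conversion being discussed. It is not the actual ibis code path: the helper names `nan_to_none` and `clean_rows` and the sample rows are made up for illustration, and it assumes the data has already been turned into plain Python tuples before being handed to a backend.

```python
import math


def nan_to_none(value):
    """Return None for a float NaN; return any other value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def clean_rows(rows):
    """Replace NaN with None in every cell of every row (hypothetical helper)."""
    return [tuple(nan_to_none(v) for v in row) for row in rows]


# Illustrative rows only (not taken from the real example datasets):
# a name column and a numeric column with one missing value stored as NaN.
rows = [("a", 172.0), ("b", float("nan"))]
cleaned = clean_rows(rows)
```

The idea is that a database driver will typically send None as SQL NULL, whereas a NaN may be rejected by a strictly typed integer column such as bigint.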
What happened?
penguins is also broken. For starwars, at least, the issue is that the height column is defined as bigint when it should definitely be a float.
What version of ibis are you using?
main @ b3e27eb
What backend(s) are you using, if any?
postgres
Relevant log output
No response