Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(polars): add limited support for table dot sql #8528

Merged
merged 2 commits into from
Mar 20, 2024

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Mar 3, 2024

Adds limited support for .sql to the Polars backend. Closes #8525.

@cpcloud cpcloud added this to the 9.0 milestone Mar 3, 2024
@cpcloud cpcloud added feature Features or general enhancements polars The polars backend labels Mar 3, 2024
@cpcloud cpcloud marked this pull request as draft March 3, 2024 12:44
@lostmygithubaccount
Copy link
Member

this at least works for the simple example I want to use in docs:

[ins] In [1]: import ibis

[ins] In [2]: ibis.options.interactive = True

[ins] In [3]: ibis.set_backend("polars")

[ins] In [4]: t = ibis.examples.penguins.fetch()

[ins] In [5]: t
Out[5]:
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ speciesislandbill_length_mmbill_depth_mmflipper_length_mmbody_mass_gsexyear  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ AdelieTorgersen39.118.71813750male2007 │
│ AdelieTorgersen39.517.41863800female2007 │
│ AdelieTorgersen40.318.01953250female2007 │
│ AdelieTorgersenNULLNULLNULLNULLNULL2007 │
│ AdelieTorgersen36.719.31933450female2007 │
│ AdelieTorgersen39.320.61903650male2007 │
│ AdelieTorgersen38.917.81813625female2007 │
│ AdelieTorgersen39.219.61954675male2007 │
│ AdelieTorgersen34.118.11933475NULL2007 │
│ AdelieTorgersen42.020.21904250NULL2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

[ins] In [6]: t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY species, island")
Out[6]:
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┓
┃ speciesislandcount  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━┩
│ stringstringuint32 │
├───────────┼───────────┼────────┤
│ ChinstrapDream68 │
│ AdelieTorgersen52 │
│ GentooBiscoe124 │
│ AdelieBiscoe44 │
│ AdelieDream56 │
└───────────┴───────────┴────────┘

@cpcloud
Copy link
Member Author

cpcloud commented Mar 3, 2024

@lostmygithubaccount That only works because the examples fetch method creates a table/view named penguins. If you call .alias(something that isn't "penguins") you won't be able to refer to that new name in subsequent SQL strings.

@cpcloud
Copy link
Member Author

cpcloud commented Mar 3, 2024

I guess for the docs example it's fine. It might be good to call out the lack of SQL support in .sql for Polars in the docs somewhere.

@lostmygithubaccount
Copy link
Member

yeah it's fine for the docs...if users run into issues we can point to the Polars issue(s) on SQL support

@cpcloud
Copy link
Member Author

cpcloud commented Mar 6, 2024

Ugh, this fails because Polars doesn't respect/allow/handle quoted identifiers.

@cpcloud
Copy link
Member Author

cpcloud commented Mar 6, 2024

CTEs are implemented but a table with name "t" somehow doesn't equal a table with name t.

@cpcloud cpcloud marked this pull request as ready for review March 12, 2024 11:01
@cpcloud cpcloud force-pushed the polars-dot-sql branch 3 times, most recently from 1c4306b to 8a868a3 Compare March 13, 2024 11:23
Copy link
Member

@lostmygithubaccount lostmygithubaccount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want to have a proper engineer review the code but looks fine to me :)

@cpcloud cpcloud requested review from kszucs and gforsyth March 13, 2024 13:22
Copy link
Member

@gforsyth gforsyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, question about dependencies (our favorite!)

ibis/backends/polars/__init__.py Show resolved Hide resolved
@cpcloud cpcloud requested a review from gforsyth March 19, 2024 13:48
@cpcloud cpcloud enabled auto-merge (squash) March 19, 2024 13:51
@cpcloud cpcloud added the sql Backends that generate SQL label Mar 19, 2024
@cpcloud cpcloud disabled auto-merge March 19, 2024 13:52
Copy link
Member

@gforsyth gforsyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few thoughts, but they can be done in a follow-up:

  1. I think we should have some kind of try except import check, so we can raise a helpful error message since a user using the polars backend might get thrown by having an "unrelated" import error for psycopg2.
  2. Because psycopg2 can be a pain to install (for pip users, at least), does this have to make use of the postgres compiler? Would duckdb work? Because duckdb would always be installed anyway. And that would remove the need for point 1 above, now that I'm thinking about it.

@cpcloud
Copy link
Member Author

cpcloud commented Mar 19, 2024

The psycopg2 import should never be hit when compiling using dialect="postgres".

Given that we're not installing the postgres extra in the polars CI job, I think we can be reasonably sure we don't have an implicit dependency on it for this functionality.

With that, is there anything to follow up with?

@gforsyth
Copy link
Member

You've got an XPASS in the exasol test suite, but other than that :shipit:

@cpcloud cpcloud enabled auto-merge (squash) March 20, 2024 15:24
@cpcloud cpcloud merged commit b2a4fbb into ibis-project:main Mar 20, 2024
82 checks passed
@cpcloud cpcloud deleted the polars-dot-sql branch March 20, 2024 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Features or general enhancements polars The polars backend sql Backends that generate SQL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: support table.sql() for Polars
3 participants