Support Ibis Backend #1105
@cosmicBboy Hey 👋🏻! This looks pretty interesting! Is there anything we can do over in ibis to help enable this? Happy to help!
Thanks @cpcloud ! The pandera internals re-write is still happening (the last PR should be merged soon #1109), after which I'm gonna start chipping away at a
At this stage some conceptual help would be much appreciated! The main uncertainty in my mind is how well the current pandera abstractions fit into ibis. It would be awesome if the ibis team could take a look at the Schema and Schema Components classes described here and answer this high-level question:
For example: pandas -> ibis
And a follow-up to this would be:
And finally, because pandera relies a lot on user-defined validation Checks:
@gforsyth this tracks the ibis integration! I'll circle back when I have capacity to get started on an integration in earnest.
Hello all, my team and I are working on a couple of data projects and for quite a few months we have been using Lately, we have moved to sourcing all our tables from a data warehouse and our data size has also grown (some tables have >50 columns ranging from 100-250 GB). And since we did not want to refactor most of our transformation steps, earlier written in We did try and explore some hand-rolled alternatives by implementing a thin wrapper around Then, to my surprise, I found this thread! I know that a few core developers from the
Nonetheless, I am eagerly waiting for this feature to be rolled out so that my team can get their hands on it. P.S. I am not an expert in Thanks!
Yes!
Also yes! At least, the goal is to offload as much computation as possible to the Ibis backend.
I will need to look into this; I just tested using the DuckDB backend for the example in #1451, and Hope that helps a bit! And sorry I have left the Ibis backend work in a dangling state; a number of priorities keep coming up, but I do hope to resume progress on it soon!
Thanks a lot @deepyaman for your insights! Looking forward 🤞🏽 😃
Is your feature request related to a problem? Please describe.
Pandera currently doesn't support validating data in a persistent datastore (e.g. MySQL, Postgres, etc). It would benefit users to be able to write pandera schemas that can then be compiled to a query language (like SQL), executed on a remote DB, that either:
A high-leverage integration to enable this behavior would be with ibis, a data analytics framework that hooks into various backends (duckdb, mysql, postgres, etc).
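To make the idea of "compile a schema to a query and run it where the data lives" concrete, here is a minimal sketch. It is not pandera's or ibis's API: the `SCHEMA` dict, `compile_check`, and `validate` are hypothetical names invented for illustration, and the standard-library `sqlite3` module stands in for a remote database so the example has no dependencies. An actual integration would presumably build ibis expressions instead of hand-written SQL strings.

```python
import sqlite3

# Hypothetical, simplified "schema": column name -> SQL predicate that every
# valid row must satisfy. This is an illustration, not pandera's schema model.
SCHEMA = {
    "age": "age >= 0",
    "email": "email LIKE '%@%'",
}


def compile_check(table: str, predicate: str) -> str:
    """Build a query that counts rows violating the predicate.

    COALESCE treats a NULL result (e.g. a NULL column value) as a failure.
    """
    return f"SELECT COUNT(*) FROM {table} WHERE NOT COALESCE(({predicate}), 0)"


def validate(conn: sqlite3.Connection, table: str, schema: dict) -> dict:
    """Run each compiled check inside the database; return failure counts."""
    failures = {}
    for column, predicate in schema.items():
        (n_bad,) = conn.execute(compile_check(table, predicate)).fetchone()
        if n_bad:
            failures[column] = n_bad
    return failures


# In-memory database standing in for MySQL, Postgres, etc.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (age INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(31, "a@example.com"), (-4, "b@example.com"), (27, "not-an-email")],
)

failures = validate(conn, "users", SCHEMA)
print(failures)
```

Only the failure counts travel back to the client; the rows themselves never leave the database, which is the property that matters for tables too large to load into memory.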
Describe the solution you'd like
For the MVP integration with ibis:
Describe alternatives you've considered
NA