Is your feature request related to a problem? Please describe.
It's a small issue, but in a repo that is transitioning from Pandas to Polars over time, there is a mix of Pandas and Polars dataframes that share the same basic schema. Currently, it seems like I need to define two schemas for each: one for Pandas using `pa.DataFrameModel`, and one for Polars using `pa.polars.DataFrameModel`.
Describe the solution you'd like
Ideally, the top-level `pa.DataFrameModel` and `pa.DataFrameSchema` classes would use something like `@singledispatch` to delegate to the appropriate backend version based on the input dataframe. This is similar to an Ibis Table, where you rarely need to reach into a specific backend to call a backend-specific function.
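To make the idea concrete, here is a rough sketch of what type-based dispatch could look like. This is not how pandera works today: `GenericSchema`, `_validate`, `_check_columns`, and the two `Fake*Frame` classes are hypothetical names, and the fake frames only stand in for `pandas.DataFrame` and `polars.DataFrame` so the snippet runs without either library installed.

```python
from dataclasses import dataclass
from functools import singledispatch


# Hypothetical stand-ins for pandas.DataFrame and polars.DataFrame.
class FakePandasFrame:
    def __init__(self, columns):
        self.columns = list(columns)


class FakePolarsFrame:
    def __init__(self, columns):
        self.columns = list(columns)


@dataclass(frozen=True)
class GenericSchema:
    """Hypothetical backend-agnostic schema: only required column names."""

    required_columns: tuple

    def validate(self, df):
        # Dispatch on the runtime type of `df`, not on the schema class.
        return _validate(df, self)


def _check_columns(df, schema, backend):
    """Shared check; returns the backend name so callers can see the route taken."""
    missing = [c for c in schema.required_columns if c not in df.columns]
    if missing:
        raise ValueError(f"{backend}: missing columns {missing}")
    return backend


@singledispatch
def _validate(df, schema):
    # Fallback for dataframe types with no registered backend.
    raise TypeError(f"no backend registered for {type(df).__name__}")


@_validate.register(FakePandasFrame)
def _(df, schema):
    return _check_columns(df, schema, "pandas")


@_validate.register(FakePolarsFrame)
def _(df, schema):
    return _check_columns(df, schema, "polars")


schema = GenericSchema(required_columns=("id", "value"))
pandas_route = schema.validate(FakePandasFrame(["id", "value"]))
polars_route = schema.validate(FakePolarsFrame(["id", "value", "extra"]))
```

With something like this, one schema definition serves both kinds of dataframe, and supporting another backend means registering one more function rather than defining another schema class.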
Describe alternatives you've considered
What I'm currently doing is just being more verbose and defining multiple schemas. It works fine! It just seems a bit strange as a workflow. Obviously if we were always in Polars it wouldn't be an issue, but that'll take a while.
I've thought about this a lot, and I think we're getting closer to this world. However, my main concern is that a generic dataframe schema would have to include a superset of all the options for every supported dataframe library. I think eventually we'll nail down a "common dataframe schema API to rule them all", in which case this concern becomes less of an issue.
If folks engage with this issue (👍 or comment/discuss), we can prioritize this effort. In the meantime, @DavidSlayback, if you can write down a spec for how this would all work, perhaps with a code snippet sketching how dispatching would work, that would get the ball rolling.