Remove DataFrame assumptions from Expr #142

Closed
mrocklin opened this issue Jun 12, 2023 · 3 comments · Fixed by #470 · May be fixed by #158

Comments

@mrocklin
Member

Eventually we'll have all sorts of Exprs like arrays, bags, etc. Today we're mostly focused on dataframes, and that's good I think.

However, even today we have non-DataFrame exprs like scalars, and the results of to_parquet calls. Maybe it's time to pull out some of the dataframe metadata assumptions like divisions and maybe even meta from the Expr class and make some new Frame class (or some better name) from which dataframe-like-things (dataframe, series, index) inherit.

@phofl
Collaborator

phofl commented Jun 14, 2023

I think it makes sense to do this now if we're already running into other expressions. I can take a look at this.

@mrocklin
Member Author

Just to make my current thinking concrete, I'm guessing that we'd end up with something like the following:

# Sketch only: arguments and bodies are elided.
class Expr:
    def simplify(self): ...
    def operands(self): ...
    def __dask_graph__(self): ...
    # ... other things that have to do with graph manipulation

class Frame(Expr):
    def divisions(self): ...
    # ... other things that have to do with dataframe metadata
    def groupby(self): ...
    # ... other dataframe API methods

It probably gets more complicated once someone digs into this more deeply, though.
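
To illustrate the split described above, here is a minimal, self-contained sketch. It is a toy, not the dask-expr implementation: `FromRange`, `Sum`, `dependencies`, `_layer`, and the key names are hypothetical, and the `divisions` computed here only loosely imitate how dataframe partition boundaries work. The point is just that the base `Expr` carries graph behavior only, while anything partition-related lives on `Frame`, so a scalar-like result never needs to pretend it has divisions.

```python
class Expr:
    """Graph-level behavior only; nothing here assumes a dataframe."""

    def __init__(self, *operands):
        self.operands = list(operands)

    def dependencies(self):
        # Operands that are themselves expressions.
        return [op for op in self.operands if isinstance(op, Expr)]

    def simplify(self):
        # Placeholder: a real optimizer would rewrite the expression tree.
        return self

    def _layer(self):
        # Subclasses return the tasks they contribute (hypothetical hook).
        raise NotImplementedError

    def __dask_graph__(self):
        # Merge the tasks of all dependencies, then add our own.
        graph = {}
        for dep in self.dependencies():
            graph.update(dep.__dask_graph__())
        graph.update(self._layer())
        return graph


class Frame(Expr):
    """Dataframe-like expressions: adds partitioning metadata."""

    @property
    def divisions(self):
        raise NotImplementedError

    @property
    def npartitions(self):
        return len(self.divisions) - 1


class FromRange(Frame):
    """Hypothetical leaf: integers 0..n split into chunks of a fixed size."""

    def __init__(self, n, chunk):
        super().__init__(n, chunk)

    @property
    def divisions(self):
        n, chunk = self.operands
        return tuple(range(0, n, chunk)) + (n,)

    def _layer(self):
        d = self.divisions
        return {
            ("from-range", i): (range, d[i], d[i + 1])
            for i in range(self.npartitions)
        }


class Sum(Expr):
    """Hypothetical scalar result: an Expr, but not a Frame."""

    def __init__(self, frame):
        super().__init__(frame)

    def _layer(self):
        (frame,) = self.operands
        keys = [("from-range", i) for i in range(frame.npartitions)]
        return {("sum", 0): (sum, [(sum, key) for key in keys])}


frame = FromRange(10, 4)
total = Sum(frame)
graph = total.__dask_graph__()
```

With this layout, `frame.divisions` and `frame.npartitions` are available, while `total` has neither attribute, yet both still produce a task graph through the shared `Expr` machinery.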

@phofl
Collaborator

phofl commented Jun 14, 2023

Yeah that was my understanding as well.

I'll take a look and see whether I can come up with something useful.

@phofl phofl mentioned this issue Jun 16, 2023
1 task
mrocklin added a commit to mrocklin/dask-expr that referenced this issue Dec 5, 2023
@phofl phofl closed this as completed in #470 Dec 8, 2023