@milenkovicm milenkovicm commented Oct 13, 2025

Which issue does this PR close?

Closes #1270.

Rationale for this change

Expose the DataFrame.select_exprs method, which DataFusion already supports. It is similar to PySpark's DataFrame.selectExpr.

The method parses string expressions into logical plan expressions, which makes it easier to write select statements that involve expressions:

df_3 = df.select_exprs(
    "abs(a + b)",
    "abs(a - b)",
)

It would be ideal if df.select(...) could support such expressions directly, but that change looked a bit too complicated to me.

What changes are included in this PR?

  • new method exposed
  • test to cover it

Are there any user-facing changes?

  • additional method has been added

Comment on lines 408 to 415
def select_exprs(self, *args: str) -> DataFrame:
    """Project arbitrary list of expression strings into a new DataFrame. Method will parse string expressions into logical plan expressions.
    The output DataFrame has one column for each element in exprs.
    Returns:
        DataFrame only containing the specified columns.
    """
    return self.df.select_exprs(*args)
Member

I think this will fail the ruff linter

Author

changed, will follow up later if it fails

Member

@timsaucer timsaucer left a comment

Looks great. We need CI to show green and then I'll merge.

@milenkovicm
Author

thanks @timsaucer

@timsaucer
Member

I fixed the lint error. Thank you for the PR! I will merge as soon as I see CI finish.

@milenkovicm
Author

you were three minutes faster than me fixing it :) thanks @timsaucer
as a follow-up I filed #1273

maybe we could expose

DataFrame.parse_sql_expr(e)

as a first step. What do you think?

Development

Successfully merging this pull request may close these issues.

Expose select_exprs on DataFrame

2 participants