Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Allow to pass in a string as filter #6657

Merged
merged 8 commits into from
Feb 7, 2023

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented Jan 24, 2023

Often I have to look up the name of the operator, I think it would be nice to allow the end user to provide a string that we'll parse with the excellent parser that we already have.

Works both:

scan = table.scan(
    row_filter=GreaterThanOrEqual("trip_distance", 10.0),
)

# Or filter using a string predicate
scan = table.scan(
    row_filter="trip_distance > 10.0",
)

Often I have to look up the exact of the operator, I think it would
be nice to allow the end user to provide a string that we'll parse
with the excellent parser that we already have.
@Fokko Fokko added this to the Python 0.4.0 release milestone Jan 25, 2023
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit comment, overall this makes sense to me, it would be really nice to just pass in a string predicate. Thanks @Fokko !

Comment on lines 195 to 198
if row_filter is None:
self.row_filter = AlwaysTrue()
else:
self.row_filter = parser.parse(row_filter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe we could use a ternary operator here?

self.row_filter = parser.parse(row_filter) if row_filter else AlwaysTrue()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, I've split it into a separate method for now, that also takes care of the parsing.

@@ -232,6 +233,9 @@ def handle_or(result: ParseResults) -> Or:
).set_name("expr")


def parse(expr: str) -> BooleanExpression:
def parse(expr: Union[str, BooleanExpression]) -> BooleanExpression:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why you'd do this, but it seems better to make the caller check before calling parse. It's a bit confusing to accept an expression here and do nothing.

@rdblue rdblue merged commit 7dc4800 into apache:master Feb 7, 2023
@rdblue
Copy link
Contributor

rdblue commented Feb 7, 2023

Thanks for the update, @Fokko! Looks good now so I merged.

krvikash pushed a commit to krvikash/iceberg that referenced this pull request Mar 16, 2023
@Fokko Fokko deleted the fd-string-expression branch May 18, 2023 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants