Skip to content

Conversation

@findepi
Copy link
Member

@findepi findepi commented Sep 29, 2025

This allows writing a function that accepts &[Expr] or &[&Expr], thus allowing less cloning when inspecting expression trees.

This allows writing a function that accepts `&[Expr]` or `&[&Expr]`,
thus allowing less cloning when inspecting expression trees.
Clone lazily, only when needed
@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate labels Sep 29, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just feels wrong to me because if Rust meant to implict allow converting a type to a reference the would have built it into the language

I googled around , however, and didn't find many compelling reasons, and even found a few where it looked like maybe it would be automatically implemented but couldn't for some other reasons

However, given this PR avoids some clones and I don't think there is anything wrong , I think it could be merged as is

I have an alternate suggestion which avoids the clone() as well here:

But realistically if I am honest that is a matter of preference

Thank you @findepi

// If the filters can be resolved using only partition cols, there is no need to
// pushdown it to TableScan, otherwise, `unhandled` pruning predicates will be generated
let (partition_filters, filters): (Vec<_>, Vec<_>) =
// TODO avoid .cloned() here by accepting &[&Expr] or equivalent downstream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to do this in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Should i revert?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, that is fine, I was just confused

// Split the parent predicate into individual conjunctive parts.
let predicates = parent_predicate
.map_or_else(Vec::new, |pred| split_conjunction_owned(pred.clone()));
let predicates =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this certainly looks nicer but it seems like the expressions are still cloned at the end. Though maybe in this case only the conjunctions are cloned (not the BinaryExprs with AND) so it would be an improvement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this certainly looks nicer but it seems like the expressions are still cloned at the end.

Correct. The intention was to delay cloning, but I admit the Avoid clones in push_down_join commit is not ideal. I realized that we avoid cloning only in case there wasn't much to clone in the first place. I will revert the commit.

What I wanted to show with this commit is that implementing AsRef<Expr> for Expr allows writing APIs that are impossible to model otherwise. Note also that downstream projects have no way to add such impl or workaround lack of it.

@findepi
Copy link
Member Author

findepi commented Sep 30, 2025

I have an alternate suggestion which avoids the clone() as well here:

This works fine today, but the IntoIterator-based API is significantly less powerful.
You can iterate only once (so eg you cannot call 2+ downstream functions that you want to pass the collection into) and you cannot check the length upfront (eg to size the buffer or check two params are same length).

This reverts commit d886be5.
Revert per PR discussion. The change didn't buy much.
@github-actions github-actions bot removed optimizer Optimizer rules core Core DataFusion crate labels Sep 30, 2025
@alamb
Copy link
Contributor

alamb commented Sep 30, 2025

I have an alternate suggestion which avoids the clone() as well here:

This works fine today, but the IntoIterator-based API is significantly less powerful. You can iterate only once (so eg you cannot call 2+ downstream functions that you want to pass the collection into) and you cannot check the length upfront (eg to size the buffer or check two params are same length).

Yeah, I am fine with this change

(though note #17831 the additional flexibility of passing in a slice is not actually used)

@findepi
Copy link
Member Author

findepi commented Sep 30, 2025

merging per #17831 (comment)

@findepi findepi added this pull request to the merge queue Sep 30, 2025
Merged via the queue into apache:main with commit 0104822 Sep 30, 2025
28 checks passed
@alamb
Copy link
Contributor

alamb commented Oct 1, 2025

Thanks @findepi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

logical-expr Logical plan and expressions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants