Description
Is your feature request related to a problem or challenge?
Filing on behalf of @colinmarc and @drin so we can get more community help
It is much easier for table providers, such as the built-in Parquet provider as well as Iceberg, to push down predicates of the form <col> op <constant>. They often do not push a predicate down when a scalar function wraps the column.
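For example, the predicate
date_trunc(part, column) <= constant_rhs
typically cannot be pushed down.
However, this equivalent expression can be:
column < date_trunc(part, date_add(constant_rhs, INTERVAL 1 part))
The comparison on the rewritten side has to be strict: a column value that sits exactly on the next part boundary truncates to itself, which is greater than constant_rhs.
Describe the solution you'd like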
It would be good if DataFusion could do this type of rewrite
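To make the intended equivalence concrete, here is a minimal sketch that models timestamps as seconds since the epoch and a fixed-width part (a day) as a number of seconds. The names `trunc`, `original`, `rewritten`, and `predicates_agree` are hypothetical and are not DataFusion APIs. Calendar parts such as month or year are not fixed-width and would need calendar-aware arithmetic, so this only illustrates the idea.

```rust
/// Width of one "day" part in seconds. Assumption: only fixed-width parts
/// are modeled here; month or year would need calendar arithmetic.
const DAY: i64 = 86_400;

/// Truncate a timestamp (seconds since the epoch) down to a multiple of `part`.
/// `rem_euclid` keeps the result correct for negative (pre-epoch) timestamps.
fn trunc(ts: i64, part: i64) -> i64 {
    ts - ts.rem_euclid(part)
}

/// Original predicate: the function wraps the column.
fn original(column: i64, constant_rhs: i64, part: i64) -> bool {
    trunc(column, part) <= constant_rhs
}

/// Rewritten predicate: the column is bare, and the right-hand side depends
/// only on the constant, so it can be evaluated once at planning time.
/// For a fixed-width part, `trunc(c, part) + part == trunc(c + part, part)`.
fn rewritten(column: i64, constant_rhs: i64, part: i64) -> bool {
    column < trunc(constant_rhs, part) + part
}

/// Returns true if both predicates agree for every column value in `range`.
fn predicates_agree(range: std::ops::Range<i64>, constant_rhs: i64, part: i64) -> bool {
    range
        .into_iter()
        .all(|c| original(c, constant_rhs, part) == rewritten(c, constant_rhs, part))
}
```

For instance, `predicates_agree(-3 * DAY..3 * DAY, 100, DAY)` exhaustively compares the two forms over six days of second-resolution values. A non-strict `<=` in `rewritten` would disagree with `original` at the single value `trunc(constant_rhs, part) + part`.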
Describe alternatives you've considered
This seems to fit well into the existing simplify_expressions framework:
https://github.com/apache/datafusion/tree/main/datafusion/optimizer/src/simplify_expressions
And we already do something very similar for casts in unwrap casts: https://github.com/apache/datafusion/blob/e12ef3ae90677fe4b1bc548feea2b3082eecdaa2/datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs#
For example,
CAST(x AS float) = 5
is rewritten to
x = CAST(5 AS <type of x>)
so the cast moves from the column to the literal, where it can be evaluated once.
The code for that is here:
Perhaps we can follow a similar model for date_trunc
Note: There is already a ScalarUDFImpl::simplify method for simplifying functions. However, that method does not see any part of the enclosing expression (such as the comparison operator in the examples above), so we might have to extend the API somewhat.
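The point about needing the enclosing expression can be illustrated with a toy expression tree. This is only a sketch under stated assumptions: the `Expr` enum and `unwrap_date_trunc` below are hypothetical stand-ins, not DataFusion's `Expr` type or any existing optimizer rule, and the part is again modeled as a fixed width in seconds. The rewrite has to match on the comparison node, because both the operator and the constant on the other side determine the new bound.

```rust
/// Toy expression tree (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// Timestamp literal, in seconds since the epoch.
    Literal(i64),
    /// `date_trunc` with a fixed-width part given in seconds (assumption).
    DateTrunc { part: i64, arg: Box<Expr> },
    LtEq(Box<Expr>, Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
}

/// Rewrites `date_trunc(part, col) <= lit` into `col < bound`, where `bound`
/// is the start of the part following the one that contains `lit`.
/// Any other expression is returned unchanged.
fn unwrap_date_trunc(expr: Expr) -> Expr {
    match expr {
        // The rule must see the comparison itself, not just the function call.
        Expr::LtEq(lhs, rhs) => match (*lhs, *rhs) {
            (Expr::DateTrunc { part, arg }, Expr::Literal(c))
                if part > 0 && matches!(*arg, Expr::Column(_)) =>
            {
                // Constant-fold the new right-hand side at rewrite time.
                let bound = c - c.rem_euclid(part) + part;
                Expr::Lt(arg, Box::new(Expr::Literal(bound)))
            }
            // Not the pattern we handle: rebuild the comparison as it was.
            (lhs, rhs) => Expr::LtEq(Box::new(lhs), Box::new(rhs)),
        },
        other => other,
    }
}
```

A function-level hook that only receives the arguments of `date_trunc` could not perform this rewrite, since neither the `<=` nor the literal is visible to it. That is the gap an extended API would need to close.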
Additional context
No response