Add function origin API to replace name-based function checks in the optimizer#20868
Add function origin API to replace name-based function checks in the optimizer#20868alexandreyc wants to merge 6 commits intoapache:mainfrom
Conversation
|
This seems like a major change in the DF change. Perhaps we could break this into smaller PRs (if that it is even possible? ) |
|
Thank you for the help!
I think I suggest we first wait several days to see if there are other opinions. |
|
Thanks @2010YOUY01 for your reply. I updated the PR replacing Also, I added a default implementation for the new method that returns |
…name to be more robust
`UDFOrigin::UserDefined`
Which issue does this PR close?
Rationale for this change
See the issue.
What changes are included in this PR?
I followed the proposal made by @2010YOUY01 in #18643.
is_builtin() -> boolmethod toAggregateUDF/AggregateUDFImpland toWindowUDF/WindowUDFImplAre these changes tested?
Not directly but I can't see a relevant way to test this. Suggestions are welcome.
Are there any user-facing changes?
Currently yes, but the change could be made non-breaking, see questions below.
Request for advices
I'm new to the codebase so feel free to challenge this PR. In particular, I'd like to have your opinion on the following items:
is_builtinhave a default implementation that returnsfalse? That would make this change non-breaking for users and slightly simplify this PR. But in return it would be more error-prone when implementing built-in functions.is_builtintoScalarUDF/ScalarUPDImpl? I didn't do it because it seems there doesn't exist any scalar UDF name matching in the codebase. It might be desirable to add it for the sake of consistency across all kinds of UDF.is_builtin() -> boolbyorigin() -> UDFOrigin?UDFOriginwould be something likeenum { BuiltIn, Spark, UserDefined }. Asking because it's not clear to me if functions in thedatafusion_sparkcrates should be considered built-in or not.