Draft: Extend Expr::ScalarUDF to support Expr for ScalarUDF #8222

2010YOUY01 · 2023-11-15T20:16:26Z

Which issue does this PR close?

POC for #8157

Rationale for this change

The rationale is described in #8180.
This PR only demonstrates approach 1 from #8180.

What changes are included in this PR?

When an Expr for a function is created with call_fn(), it first stores a ScalarUDF that carries only the function name and a stub implementation.
During the first analyzer rule, the name-only UDF is resolved to the real implementation from the function registry.
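The two steps above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: `SimpleUdf`, `placeholder_udf`, and `resolve` are hypothetical names, the registry is assumed to be a plain `HashMap`, and arguments are simplified to `i64` values instead of DataFusion's real types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for a scalar UDF: a name plus an implementation.
type ScalarImpl = Arc<dyn Fn(&[i64]) -> Result<i64, String> + Send + Sync>;

#[derive(Clone)]
struct SimpleUdf {
    name: String,
    fun: ScalarImpl,
}

/// Step 1: build a placeholder that carries only the name.
/// The stub implementation returns an error if it is ever called unresolved.
fn placeholder_udf(name: &str) -> SimpleUdf {
    let owned = name.to_string();
    let stub: ScalarImpl = Arc::new(move |_args: &[i64]| -> Result<i64, String> {
        Err(format!("function '{}' was never resolved", owned))
    });
    SimpleUdf {
        name: name.to_string(),
        fun: stub,
    }
}

/// Step 2: replace the placeholder with the registered implementation,
/// looked up by name (assumed to run in the first analyzer pass).
fn resolve(
    udf: &SimpleUdf,
    registry: &HashMap<String, SimpleUdf>,
) -> Result<SimpleUdf, String> {
    registry
        .get(&udf.name)
        .cloned()
        .ok_or_else(|| format!("unknown function '{}'", udf.name))
}
```

Note that in this sketch nothing in the type distinguishes a placeholder from a resolved function, so `resolve` has to look up every UDF it sees.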

This approach requires fewer changes to the existing code.
(Do you have any thoughts on this one, @alamb?)

Are these changes tested?

Are there any user-facing changes?

alamb · 2023-11-15T21:05:41Z

datafusion/expr/src/expr_fn.rs

-        Err(e) => Err(e),
+        Err(_) => {
+            // Constructing a `ScalarUDF` with only name and stub impl/return_type...
+            // This unresolved UDF will be resolved during analyzing using registered functions


If you can make this work, it seems like a good idea to me.

However, it is not clear to me how the subsequent passes will know they have to resolve this particular scalar UDF 🤔, so I am not sure this approach is viable.

Specifically, what would the code look like that the analysis pass uses to test whether a ScalarUDF is unresolved?

That's a great point; without an enum the implementation seems prone to bugs.
I think we can make either Expr::ScalarFunction or Expr::ScalarUDF an enum. Since the long-term goal is to remove BuiltinScalarFunction, the two should end up the same.
I'll check which option makes the future migration easier.

pub enum ScalarFunctionDefinition {
    /// Resolved to a user defined function
    UDF(ScalarUDF),
    /// A scalar function that will be called by name
    Name(Arc<str>),
}

#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub struct ScalarUDF {
    /// The function
    pub fun: ScalarFunctionDefinition,
    /// List of expressions to feed to the functions as arguments
    pub args: Vec<Expr>,
}

pub enum ScalarFunctionDefinition {
    /// Resolved to a built in scalar function
    /// (will be removed long term)
    BuiltIn(built_in_function::BuiltinScalarFunction),
    /// Resolved to a user defined function
    UDF(ScalarUDF),
    /// A scalar function that will be called by name
    Name(Arc<str>),
}

#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub struct ScalarFunction {
    /// The function
    pub fun: ScalarFunctionDefinition,
    /// List of expressions to feed to the functions as arguments
    pub args: Vec<Expr>,
}
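With an enum like the ones above, the "is this function unresolved?" check becomes a pattern match. The following is a rough, standalone sketch and not DataFusion code: `RegisteredUdf`, `FunctionDefinition`, and `resolve_definition` are hypothetical names, and the registry is assumed to be a simple `HashMap` keyed by function name.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a registered UDF; only the name matters here.
#[derive(Clone, Debug, PartialEq)]
struct RegisteredUdf {
    name: String,
}

/// Simplified version of the proposed enum: resolved, or known by name only.
#[derive(Clone, Debug, PartialEq)]
enum FunctionDefinition {
    /// Resolved to a user defined function
    Udf(Arc<RegisteredUdf>),
    /// A scalar function that will be called by name
    Name(Arc<str>),
}

/// What an analyzer rule might do: resolve `Name` via the registry
/// and pass already-resolved definitions through unchanged.
fn resolve_definition(
    def: FunctionDefinition,
    registry: &HashMap<String, Arc<RegisteredUdf>>,
) -> Result<FunctionDefinition, String> {
    match def {
        FunctionDefinition::Name(name) => registry
            .get(&*name)
            .map(|udf| FunctionDefinition::Udf(Arc::clone(udf)))
            .ok_or_else(|| format!("no function named '{}' is registered", name)),
        resolved @ FunctionDefinition::Udf(_) => Ok(resolved),
    }
}
```

Because the unresolved state is its own variant, later passes cannot mistake a name-only function for a resolved one without the compiler forcing them to handle the `Name` case.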

Demonstrate extending ScalarUDF

3be6a1a

github-actions bot added the logical-expr Logical plan and expressions label Nov 15, 2023

alamb reviewed Nov 15, 2023

2010YOUY01 closed this Nov 17, 2023
