Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add plugable handler for CREATE FUNCTION #9333

Merged
merged 25 commits into from
Mar 5, 2024

Conversation

milenkovicm
Copy link
Contributor

@milenkovicm milenkovicm commented Feb 24, 2024

Which issue does this PR close?

Closes #9332.

Rationale for this change

explained in #9332

What changes are included in this PR?

this is minimum viable product, with some open question which i hope to resolve in the process of PR review (note TODOs)

Are these changes tested?

There is a test in place but would add more if community aggress to accept the change.
Usage example can be found at https://github.com/milenkovicm/torchfusion

Are there any user-facing changes?

There are using facing changes, but they are backward compatible. Would update docs if community aggress to accept the change.

A problem I see is from perspective of user experience, as generic parser does not support CREATE FUNCTION statement, user should change parser configuration to PostgreSQL. Maybe it would make sense after this change is merged adding CREATE FUNCTION to generic flavour

@github-actions github-actions bot added sql logical-expr Logical plan and expressions core Core datafusion crate labels Feb 24, 2024
@milenkovicm milenkovicm changed the title Add plugable function factory Add plugable handler for CREATE FUNCTION Feb 24, 2024
@milenkovicm
Copy link
Contributor Author

Drop function will be added as well

@alamb
Copy link
Contributor

alamb commented Feb 28, 2024

Hi @milenkovicm -- I plan to review this PR shortly

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR @milenkovicm -- I think the idea and the PR is really neat and the code was easy to follow. ❤️

I left some API suggestions, and comments in regards to your questions. Let me know what you think.

datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
datafusion/expr/src/logical_plan/statement.rs Outdated Show resolved Hide resolved
datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
datafusion/core/tests/sql/sql_api.rs Outdated Show resolved Hide resolved
@@ -39,6 +52,99 @@ async fn unsupported_ddl_returns_error() {
ctx.sql_with_options(sql, options).await.unwrap();
}

struct MockFunctionFactory;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool -- I think it would be great (as a follow on PR) if we turned this into an example (with docs, etc like the others in https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples/examples)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this functionality looks important and definitely needs to be covered with docs, examples in following PR, looking forward for it

datafusion/core/tests/sql/sql_api.rs Outdated Show resolved Hide resolved
@milenkovicm
Copy link
Contributor Author

Comments totally make sense thanks @alamb,
it would involve a bit of creativity to nail the trait interface, let's try. I'm a bit tied up over next few days will follow up over the weekend probably

@milenkovicm
Copy link
Contributor Author

milenkovicm commented Feb 29, 2024

Items to follow up (new issues or discussion)

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very 👨‍🍳 👌 nice @milenkovicm

datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @milenkovicm -- I think this PR is almost good to go from my perspective. I had a few minor comments, but nothing major.

I think we should have at least one other maintainer give it a look. Perhaps @comphead or @universalmind303 ?

One interesting implication of this feature is that once we migrate all functions to be "udfs" (e.g. #8045) it means we can now drop functions that weren't explicitly made with CREATE FUNCTION

For example, on this branch I can drop the abs function:

DataFusion CLI v36.0.0
❯ DROP FUNCTION abs;
0 rows in set. Query took 0.009 seconds.

Although I do get an appropriate error when I try to drop a function that is still a BuiltInScalarFunction like add:

❯ DROP FUNCTION add;
Execution error: Function does not exist

People could

datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
datafusion/expr/src/logical_plan/ddl.rs Outdated Show resolved Hide resolved
datafusion/expr/src/logical_plan/ddl.rs Show resolved Hide resolved
datafusion/expr/src/logical_plan/ddl.rs Show resolved Hide resolved
datafusion/sql/src/statement.rs Outdated Show resolved Hide resolved
datafusion/sql/src/statement.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @milenkovicm -- the only thing left from my perspective is to "error on multi-part identifiers" which I think we could do as a follow on PR if needed.

However, I do think we should leave this open for a few more days to let other people have a chance to comment on it before we merge.

Thanks again. This is a really neat feature that I think will let people build all sorts of cool things using DataFusion 🙏

datafusion/core/src/execution/context/mod.rs Show resolved Hide resolved
datafusion/core/src/execution/context/mod.rs Outdated Show resolved Hide resolved
datafusion/expr/src/logical_plan/ddl.rs Show resolved Hide resolved
datafusion/expr/src/logical_plan/ddl.rs Show resolved Hide resolved
@milenkovicm
Copy link
Contributor Author

Thanks again @milenkovicm -- the only thing left from my perspective is to "error on multi-part identifiers" which I think we could do as a follow on PR if needed.

However, I do think we should leave this open for a few more days to let other people have a chance to comment on it before we merge.

Thanks again. This is a really neat feature that I think will let people build all sorts of cool things using DataFusion 🙏

@alamb multi-part identifiers fixed as well, please have a look.

also, can you please let me know once new version of sqlparser lands, will remove dialect configuration from test.

thanks a lot for all your help

@milenkovicm
Copy link
Contributor Author

and please squash commit on merge, thanks

@alamb
Copy link
Contributor

alamb commented Mar 2, 2024

and please squash commit on merge, thanks

Yes, this will be done (as on all commits)

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @milenkovicm

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @milenkovicm
I would say it is nice to get also tests what will happen if the function is invalid/not compilable or something

another question if function can call another function?

@milenkovicm
Copy link
Contributor Author

Thanks @milenkovicm I would say it is nice to get also tests what will happen if the function is invalid/not compilable or something

another question if function can call another function?

Thanks for your comments @comphead ,
can you please elaborate your comment, thanks

@comphead
Copy link
Contributor

comphead commented Mar 2, 2024

Thanks @milenkovicm I would say it is nice to get also tests what will happen if the function is invalid/not compilable or something
another question if function can call another function?

Thanks for your comments @comphead , can you please elaborate your comment, thanks

Its great to have a test with invalid function, the function which is not parseable or compilable.

And the next question if the user created functions f1 and f2, can f1 be called inside f2?

milenkovicm and others added 16 commits March 5, 2024 09:07
... `DROP FUNCTION` will look for function name
in all available registries (udf, udaf, udwf).

`remove` may be necessary if UDaF and UDwF do not
get `simplify` method from apache#9304.
FunctionDefinition already exists, DefinitionStatement makes more sense.
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ns.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@alamb
Copy link
Contributor

alamb commented Mar 5, 2024

I merged up from main to pick up a fix for the CI (I hope 🤞 )

@milenkovicm
Copy link
Contributor Author

Thanks @alamb

@alamb
Copy link
Contributor

alamb commented Mar 5, 2024

Merged up from main again to resolve conflicts

@alamb alamb merged commit ea01e56 into apache:main Mar 5, 2024
23 checks passed
@alamb
Copy link
Contributor

alamb commented Mar 5, 2024

Thanks again @milenkovicm -- this is a pretty epic feature

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core datafusion crate logical-expr Logical plan and expressions sql sqllogictest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plugable handler for CREATE FUNCTION
4 participants