[Python] Add scalar UDF #6862
Comments
@gforsyth what do you think of this syntax?
Here is sqlite3's syntax as another reference point! I have not used it though...
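For reference, a minimal sketch of how the standard-library `sqlite3` module registers a scalar function with `Connection.create_function`. The function name `add_one` and the query are illustrative only, and nothing here implies the DuckDB API will look the same.

```python
import sqlite3


def add_one(x):
    # sqlite3 passes plain Python scalars; a SQL NULL arrives as None
    return None if x is None else x + 1


con = sqlite3.connect(":memory:")
# create_function(name, narg, func): narg is the number of arguments
# the SQL function accepts
con.create_function("add_one", 1, add_one)

rows = con.execute("SELECT add_one(41), add_one(NULL)").fetchall()
print(rows)  # [(42, None)]
```

Note that sqlite3 calls the function once per row with scalar values, which is the per-row model discussed in this thread.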
Yeah, I need to look at the binding code for varargs. We support it for other scalar functions; I'm just not familiar enough with the internals to envision it. But we can probably support varargs. I also had some ideas for extra options: null handling is one, and exception handling (return null or rethrow?) is another.
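To make the two option ideas concrete, here is a rough pure-Python sketch. The wrapper `wrap_udf` and the option names `null_handling` and `exception_handling` are hypothetical and chosen only for illustration; they are not an existing DuckDB API.

```python
from functools import wraps


def wrap_udf(func, null_handling="default", exception_handling="rethrow"):
    """Hypothetical wrapper; the option names are assumptions, not a real API."""

    @wraps(func)
    def wrapper(*args):
        # "default": any NULL (None) input short-circuits to a NULL result
        if null_handling == "default" and any(a is None for a in args):
            return None
        try:
            return func(*args)
        except Exception:
            # "return_null": swallow the error and yield NULL for this row
            if exception_handling == "return_null":
                return None
            # otherwise rethrow so the query fails
            raise

    return wrapper


def safe_div(a, b):
    return a / b


strict = wrap_udf(safe_div)
lenient = wrap_udf(safe_div, exception_handling="return_null")

print(strict(4, 2))      # 2.0
print(strict(1, None))   # None
print(lenient(1, 0))     # None
```

With the rethrow setting, `strict(1, 0)` raises `ZeroDivisionError` instead of returning a null.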
The |
@cpcloud We can definitely support varargs, in which case it's not required to provide the arguments explicitly.
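As a point of comparison for varargs, the standard-library `sqlite3` module accepts `narg=-1` to mean "any number of arguments". The sketch below uses that existing API with an illustrative function name, `concat_all`; how DuckDB would expose varargs is not settled here.

```python
import sqlite3


def concat_all(*args):
    # Variadic UDF: receives however many arguments the SQL call supplies,
    # skipping NULLs (None)
    return "".join(str(a) for a in args if a is not None)


con = sqlite3.connect(":memory:")
# narg=-1 tells sqlite3 the function may take any number of arguments
con.create_function("concat_all", -1, concat_all)

row = con.execute(
    "SELECT concat_all('a', 1), concat_all('x', NULL, 'y', 'z')"
).fetchone()
print(row)  # ('a1', 'xyz')
```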
In this implementation, what object types are passed to the function when called? Scalars (an |
+1 from me on that too. I'd strongly prefer pyarrow objects, since they can easily be converted to pandas objects with a single method call.
This first implementation will not be, the idea here is that when you need a UDF, you likely don't need to do operations that can efficiently be done on pyarrow/pandas/numpy, as those are all database operations - which could just as easily be done with SQL. But we could definitely add something like a |
I disagree with this. In python there are many things that operate on those containers that aren't necessarily database operations. For our use case, we'd often want to be hooking in functionality that's already inherently vectorized, but not expressible using duckdb native operations. For example, calling |
That makes sense, I'll adjust the focus to the vectorized pyarrow-backed version instead then 👍
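The difference between the two calling conventions discussed above can be sketched in plain Python. This is only an illustration: a Python list stands in for a pyarrow array, and the names `scalar_udf` and `vectorized_udf` are hypothetical.

```python
def scalar_udf(x):
    # Scalar model: called once per row with a single Python value
    return x * 2


def vectorized_udf(column):
    # Vectorized model: called once per batch with a whole column.
    # A plain list stands in for a pyarrow array (assumption for illustration).
    return [None if v is None else v * 2 for v in column]


batch = [1, 2, None, 4]

# The engine would loop over rows for the scalar model...
per_row = [None if v is None else scalar_udf(v) for v in batch]
# ...but hand over the whole batch in the vectorized model.
per_batch = vectorized_udf(batch)

print(per_row)    # [2, 4, None, 8]
print(per_batch)  # [2, 4, None, 8]
```

Both produce the same result; the vectorized form simply lets the function body use operations that already work on whole containers.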
We want to add the ability to register UDFs in the Python client.
Proposed syntax: