Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: extensible function registry #2699

Merged
merged 12 commits into from
Feb 29, 2024
Merged

Conversation

universalmind303
Copy link
Contributor

@universalmind303 universalmind303 commented Feb 23, 2024

I didn't want to create 1 massive undecipherable PR, so I wanted to submit this first.

This ties a FunctionRegistry instance to a session instead of the previously global FUNCTION_REGISTRY.

Right now, the entries are written to both the catalog and the function registry.

I'd prefer writing to just one or the other, but havent quite figured that out yet as our udfs are not serializable so they cant be written directly to the catalog.

Some other the notable changes so far

  • some catalog changes to allow for adding new user defined functions to the catalog.
  • internally to expose a new method register_function on the session.
  • rework cosine_similarity to the new datafusion udf impl pattern for easier usage with the new register_function method.

I manually tested this by removing cosine_similarity from the builtin functions and registering it manually

let mut sess = engine
    .new_local_session_context(SessionVars::default(), SessionStorageConfig::default())
    .await?;

let udf = Arc::new(CosineSimilarity::new());
sess.register_function(udf).await?;

This works as expected in local sessions only. We'll have to modify the planner to handle udfs in a hybrid context at a later date.


Next steps.

I'm going to try to integrate the work I did with this datafusion <--> polars POC in a subsequent PR.

https://github.com/universalmind303/polar-fusion

@universalmind303 universalmind303 changed the title wip: extensible function registry feat: extensible function registry Feb 23, 2024
Copy link
Collaborator

@tychoish tychoish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable.

  • this needs automated tests.
  • should our other functions migrate over as well?

crates/protogen/src/metastore/types/service.rs Outdated Show resolved Hide resolved
crates/sqlbuiltins/src/functions/mod.rs Show resolved Hide resolved
@universalmind303
Copy link
Contributor Author

this needs automated tests.

There's really nothing to test so far that's not already covered by all of tests. This is all internal. If things were to have broke, our existing tests would have caught known any regressions.

should our other functions migrate over as well?
yeah we have an issue open for that already.

@universalmind303 universalmind303 changed the title feat: extensible function registry refactor: extensible function registry Feb 26, 2024
Copy link
Member

@scsmithr scsmithr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add an SLT for the cosine similarity function just to make sure it's all wired up correctly (even if it needs to be skipped for rpc/flight).

@universalmind303
Copy link
Contributor Author

Could we add an SLT for the cosine similarity function just to make sure it's all wired up correctly (even if it needs to be skipped for rpc/flight).

@scsmithr we already have a bunch of tests for it. testdata/sqllogictests/functions/cosine_similarity.slt

@tychoish
Copy link
Collaborator

Let's hold off on merging this until we decide if #2715 is worth cutting a quick 0.9.1 over.

@greyscaled
Copy link
Contributor

ps: this is cool!

@universalmind303 universalmind303 merged commit 6a03e0e into main Feb 29, 2024
25 checks passed
@universalmind303 universalmind303 deleted the universalmind303/ffi-udfs branch February 29, 2024 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants