Add Async UDFs #483
a510b56 to 0fb3132
A couple of stylistic things I'd like to see addressed.
This query is panicking in the planner:
create table logs (
ip TEXT,
city TEXT GENERATED ALWAYS AS (get_city(ip))
) with (
connector = 'sse',
endpoint = 'http://127.0.0.1:9563/sse',
format = 'json'
);
select city
from logs;
with
It also panics (in a different way) if you move the computation into the query:
create table logs (
ip TEXT
) with (
connector = 'sse',
endpoint = 'http://127.0.0.1:9563/sse',
format = 'json'
);
select get_city(ip)
from logs;
It works if you alias it (
Also panics if you try to group by an async udf:
select
get_city(logs.ip) as city,
count(*)
from logs
group by 1;
or use it in a WHERE clause:
select
get_city(logs.ip) as city
from logs
where get_city(logs.ip) = 'San Francisco';
arroyo-datastream/src/lib.rs
Outdated
output_assignments: String,
null_handlers: String,
return_nullable: bool,
timeout_seconds: u64,
Could this be a Duration to avoid unit issues?
I don't think so, because it gets deserialized from the TOML in the UDF definition.
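One possible middle ground, sketched here only as an illustration: keep the field as a plain `u64` so it still maps directly onto the TOML value, and convert it to a `Duration` in one place where it is consumed. The `AsyncUdfOptions` struct and its `timeout()` method are hypothetical names made up for this sketch; only the `timeout_seconds: u64` field comes from the diff above.

```rust
use std::time::Duration;

/// Hypothetical options struct for illustration; not the struct in this PR.
/// Only the `timeout_seconds: u64` field mirrors the diff under review.
#[derive(Debug, Clone, PartialEq)]
pub struct AsyncUdfOptions {
    /// Kept as a plain integer so it can be read straight from a TOML value.
    pub timeout_seconds: u64,
}

impl AsyncUdfOptions {
    /// Convert to a `Duration` at the point of use, so downstream code
    /// never has to remember which unit the raw integer is in.
    pub fn timeout(&self) -> Duration {
        Duration::from_secs(self.timeout_seconds)
    }
}
```

With this shape, callers would pass `opts.timeout()` to whatever enforces the deadline instead of handling the raw seconds value themselves.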
This is really great! I'm so excited to see how people use this.
665094f to 9085f50
Add a new 'AsyncMapOperator' that applies an async UDF to its input.
Async UDFs use the existing interface for defining the UDF and are called in SQL the same as non-async UDFs. Options are specified as a TOML configuration block in the special comment with the dependencies.
Here's an example of an Async UDF:
which can be used in a query like this:
I've left these things for a future commit: