feat(query): 14925, support udf wasm #15107
Signed-off-by: shamb0 <r.raajey@gmail.com>
Integration of Wasm User-Defined Functions (UDFs), Draft, Code Drop for Initial Review
Overview of the strategy used for integrating Wasm UDF (User-Defined Function) functionality. The integration is a two-stage process, outlined below:
Compilation and Storage
Execution
Testing
The integration includes test code for the Wasm UDF functionality, located at …
To run the it tests, use the following CLI commands:
RUST_LOG=info \
  cargo test \
  --test it ddl_basic::ddl_basic_01_create_udf::test_udf_wasm_gcd
RUST_LOG=info \
  cargo test \
  --test it ddl_basic::ddl_basic_01_create_udf::test_udf_py_gcd
clear && \
  RUST_LOG=info \
  cargo test \
  --test it ddl_basic::ddl_basic_01_create_udf::test_udf_js_gcd
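All three tests above exercise a `gcd` UDF. For readers unfamiliar with the function under test, here is a minimal plain-Rust sketch of the Euclidean algorithm. It is only an illustration of the logic: the Wasm export and UDF registration used by the actual test module are not shown, and the signature is an assumption based on the `int4` handler that appears later in this thread.

```rust
/// Greatest common divisor via the Euclidean algorithm.
/// Plain Rust only: no Wasm export or UDF registration is shown here.
/// Note: `abs()` overflows for `i32::MIN`; a production version would
/// need to decide how to handle that input.
pub fn gcd(mut a: i32, mut b: i32) -> i32 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a.abs()
}
```

Calling `gcd(12, 18)` walks through the remainders 12, 6, 0 and returns 6.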
src/query/service/src/pipelines/processors/transforms/transform_udf_script.rs
src/query/service/src/pipelines/processors/transforms/transform_udf_script.rs
…naging script execution Signed-off-by: shamb0 <r.raajey@gmail.com>
…compressed and uncompressed WASM modules Signed-off-by: shamb0 <r.raajey@gmail.com>
Hi @sundy-li, please find a quick update below ...
Here's a code snippet illustrating the issue:
let blocking_operator = DataOperator::instance().operator().blocking();
blocking_operator
    .write_with(&wasm_module_path, code_blob)
    .content_type("application/wasm")
    .call()?;
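Before a blob is written with the `application/wasm` content type, it can be useful to check that it really is an uncompressed module. The commits above mention handling both compressed and uncompressed Wasm modules but do not show how the two are told apart, so the following is only a sketch. It relies on the fact that every Wasm binary begins with the four bytes `\0asm`; the function name `looks_like_raw_wasm` is hypothetical and not part of this PR.

```rust
/// The 4-byte magic number that opens every WebAssembly binary: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6D];

/// Returns true when `blob` starts with the Wasm magic number, i.e. it looks
/// like an uncompressed module. Anything else would have to be decompressed
/// (or rejected) by the caller; that part is not shown here.
pub fn looks_like_raw_wasm(blob: &[u8]) -> bool {
    blob.starts_with(&WASM_MAGIC)
}
```

A caller could branch on this check and only attempt decompression when it returns false.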
src/query/service/src/pipelines/processors/transforms/transform_udf_script.rs
Others LGTM. Let's continue:
@sundy-li, Thank you for your review feedback. I am currently working on understanding the logical test, and I will provide an update soon. Before proceeding with the implementation, I need to clarify the approach for preparing the "wasm module". Specifically, should we prepare the WASM module within the test application itself, or do we need to integrate it into the Physical layer under the
we should read the wasm binary codes inside
Hi @sundy-li, I am resuming work on this task after a long break; I was busy with my bootcamp capstone project, which I completed yesterday. I would like to discuss a few things and get some clarification from you. Here's the context:
let command = format!(
    r#"CREATE FUNCTION wasm_gcd (INT, INT) RETURNS BIGINT LANGUAGE wasm HANDLER = 'wasm_gcd(int4,int4)->int4' AS $${wasm_module_path}$$"#
);
log::info!("Create UDF DDL command: {}", command);
fixture.execute_command(&command).await?;
let command = format!(
    r#"CREATE FUNCTION wasm_gcd (INT, INT) RETURNS BIGINT LANGUAGE wasm HANDLER = 'wasm_gcd(int4,int4)->int4' AS $${:#?}$$"#,
    &code_blob
);
log::info!("Create UDF DDL command:");
fixture.execute_command(&command).await?;
My questions are:
I hope I've provided enough details. Please let me know if you need any additional information that I may have missed.
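The second command above embeds `code_blob` with `{:#?}`, the pretty debug formatter. If `code_blob` is a byte vector, that produces a multi-line list of decimal numbers rather than a compact payload. Purely as an illustration of one alternative for carrying binary data inside SQL text, here is a std-only hex round trip. This is an assumption for discussion, not the approach the PR settled on, and `to_hex` and `from_hex` are hypothetical helper names.

```rust
/// Encode bytes as lowercase hex so they can be embedded in SQL text.
pub fn to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        out.push_str(&format!("{:02x}", b));
    }
    out
}

/// Decode a hex string back into bytes; returns None on malformed input
/// (odd length or any character that is not an ASCII hex digit).
pub fn from_hex(text: &str) -> Option<Vec<u8>> {
    if text.len() % 2 != 0 || !text.bytes().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // Safe to slice by byte offsets: the input is known to be ASCII here.
    (0..text.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&text[i..i + 2], 16).ok())
        .collect()
}
```

Hex doubles the payload size, so for large modules, storing the module and referencing it by path (as in the first command) avoids inflating the DDL text.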
That will be better. Maybe we can support a new syntax for Wasm, e.g.:
We need to refactor the struct UDFScript into:
During
Fine, thank you.
Hi @sundy-li, please find an update on the latest changes ...
#[derive(Clone, Debug, Hash, Eq, PartialEq, serde::Serialize, serde::Deserialize, EnumAsInner)]
pub enum UDFType {
    Server(String),
    Script((String, String, String)),
    WasmScript((String, String, Vec<u8>)),
}
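To show how code downstream might branch on this enum, here is a small self-contained sketch. It redefines `UDFType` with std-only derives so it compiles without serde or enum-as-inner, and the helpers `kind` and `payload_len` are hypothetical names introduced for illustration. The meaning of the individual tuple fields is not stated in this thread, so the sketch only assumes that the `Vec<u8>` in `WasmScript` holds the binary module.

```rust
/// Simplified copy of the enum above, with only std derives so the sketch
/// compiles without serde or enum-as-inner.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum UDFType {
    Server(String),
    Script((String, String, String)),
    WasmScript((String, String, Vec<u8>)),
}

/// Hypothetical helper: a short label for each kind of UDF.
pub fn kind(udf: &UDFType) -> &'static str {
    match udf {
        UDFType::Server(_) => "server",
        UDFType::Script(_) => "script",
        UDFType::WasmScript(_) => "wasm",
    }
}

/// Hypothetical helper: size of the binary payload, if the UDF carries one.
/// Assumes the `Vec<u8>` in `WasmScript` is the module blob.
pub fn payload_len(udf: &UDFType) -> Option<usize> {
    match udf {
        UDFType::WasmScript((_, _, blob)) => Some(blob.len()),
        _ => None,
    }
}
```

Keeping the binary variant separate from `Script` means text-based scripts never have to round-trip through a byte vector, and the match stays exhaustive if another variant is added later.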
LGTM, and I believe that this pull request does not require support for
Thank you for the review comments. I value the feedback. I will go through the comments carefully and get back to you soon if I have any questions or updates.
I have posted our current questions to the pyo3 Discord discussion, to see if there are any targeted suggestions from the pyo3 official team later on:
@hanxuanliang, as per @sundy-li's suggestion, we will create a separate pull request (PR) for integrating the 'arrow-udf-python' library. I have a backup of the changes, and they work fine with Python 3.12.2. However, a few commands in the 'make lint' process are failing, which requires further investigation.
------------
Summary [ 191.363s] 1623 tests run: 1621 passed, 2 failed, 9 skipped
SIGABRT [ 32.188s] databend-enterprise-query::it aggregating_index::index_scan::test_fuzz
SIGABRT [ 32.737s] databend-enterprise-query::it aggregating_index::index_scan::test_fuzz_with_spill
error: test run failed
Error: Process completed with exit code 100.
Could you please provide guidance on how to investigate and resolve this stack overflow issue? Thank you for your assistance.
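One general way to investigate a suspected stack overflow in Rust is to rerun the deep code path on a thread with a larger, explicit stack and see whether the abort disappears. The sketch below uses only `std::thread::Builder::stack_size`; the helper name `run_with_stack` and the recursive `depth` function are hypothetical, and nothing in this thread confirms that stack size is the root cause of the two `test_fuzz` failures. For threads spawned without an explicit size, the `RUST_MIN_STACK` environment variable offers a similar experiment without code changes.

```rust
use std::thread;

/// Recursive helper used only to consume stack space in this sketch.
pub fn depth(n: u64) -> u64 {
    if n == 0 { 0 } else { 1 + depth(n - 1) }
}

/// Run `f` on a new thread with an explicit stack size (in bytes) and
/// return its result. Panics if the thread cannot be spawned or panics.
pub fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

If a failing path passes once it is wrapped as `run_with_stack(16 * 1024 * 1024, || ...)`, that points to recursion depth rather than a logic error, and the next step would be to find which call chain grew.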
LGTM, thanks @shamb0 for your contribution in this impressive PR.
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Integration of Wasm User-Defined Functions (UDFs)
Close Feature: support wasm udf #14925
Fixes #14925
Tests
Type of change