Allow registering and executing WebAssembly functions #45
This is only a draft for a multitude of reasons, the most important ones being:
|
Great work! Perhaps you could consider CNCF's WasmEdge, which has a well-maintained C SDK with LLVM-based AOT support for embedding. :) https://github.com/wasmedge/wasmedge https://wasmedge.org/book/en/sdk/c.html Disclaimer: I am a maintainer at WasmEdge. We helped Nebula Graph and TiDB to support similar Wasm UDFs in their SQL DBs.
@juntao I actually looked it up earlier today, we're definitely interested in giving it a go! And, eventually, in making the implementation runtime-agnostic by relying on the Wasm C API (https://github.com/WebAssembly/wasm-c-api) that @losfair mentioned in another issue. I remember from my morning research that the C dynamic library from the WasmEdge release page was ~50MB, which is quite heavy compared to libwasmtime's 17 - are you aware of any thinner versions of it?
Yes. I believe WasmEdge supports the standard C API -- I will confirm. The WasmEdge dynamic library really should not be that big. The distribution binary of WasmEdge is only 8MB. I think the large version contains LLVM so that it can do AOT compilation w/o external dependency. Let me double check and revert. Thank you! |
Hi,
|
Thank you @hydai @psarna I think the 1.8MB WasmEdge runtime library is sufficient for your use case. Developers can compile their functions to regular Wasm in any tool they choose. They can further use the Ref: https://wasmedge.org/book/en/quick_start/run_in_aot_mode.html Also, we do not yet support the proposed "standard" C API. But it could be supported if there is user demand. :) |
Splendid, thanks guys! 1.8MiB sounds way more aligned with edge use cases indeed, will give it a try |
v2:
Still to do: automated tests |
v3:
TODO:
|
Great work. We also made a demo that runs Wasm code in the openGauss database, supplied with a Docker image to try it out. We hope to keep in touch and exchange ideas further.
The routine creates the libsql_wasm_func_table table, responsible for storing WebAssembly source code for dynamically added Wasm functions.
It will be used to drop functions via the DROP FUNCTION statement.
The table will be created on startup in order to allow registering Wasm functions dynamically.
The new experimental syntax loosely follows SQL's CREATE FUNCTION. It still lacks the OR REPLACE keywords, which would allow overriding an already existing function.
The suite is wrapped in a feature flag, because support for user-defined functions is opt-in and must be compiled into libSQL explicitly.
The new command runs the Rust tests with the udf feature enabled, which assumes that libSQL was compiled with --enable-wasm-runtime.
The previous dynamic lookup of Wasm functions was lazy and only performed on first use. This is redundant, and the logic is much clearer when the functions are initialized on startup and when they're registered dynamically.
The newly covered cases also check operations on strings, blobs and null.
This document will serve as an entrypoint for various extensions added to libSQL and not necessarily compatible with SQLite. Eventually it might grow to become a separate directory.
Before the fix, single quotes were not properly loaded from the database.
If .init_wasm_func_table is the initial call to the shell, call open_db() first to initialize the connection.
By accepting compiled wasm blobs as well as .wat files, we allow skipping the wasm2wat translation and save some storage, as the function source code is also stored as a binary blob, which is way more concise.
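One way to tell the two accepted forms apart is the 4-byte magic number that opens every WebAssembly binary module (`\0asm`). The sketch below is only an illustration of that idea; `WasmSource` and `classify_body` are hypothetical names, and whether libSQL uses this exact check is an assumption.

```rust
/// Magic number that opens every WebAssembly binary module: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6d];

/// The two forms a stored function body can take.
#[derive(Debug, PartialEq)]
enum WasmSource {
    /// Compiled binary module (.wasm).
    Binary,
    /// WebAssembly text format (.wat).
    Text,
}

/// Hypothetical helper: classify a function body by its leading bytes.
/// Anything that does not start with the binary magic is treated as text.
fn classify_body(body: &[u8]) -> WasmSource {
    if body.starts_with(&WASM_MAGIC) {
        WasmSource::Binary
    } else {
        WasmSource::Text
    }
}
```

A body beginning with `00 61 73 6d` would be classified as `Binary`, while a body such as `(module)` would fall through to `Text`.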
The previous ad-hoc solution of registering functions during parsing was not in line with libSQL's layers: execution should happen in VDBE. Therefore, 2 new opcodes are added for registering and dropping user-defined functions.
The test verifies that the EXPLAIN command can be successfully run on CREATE FUNCTION and DROP FUNCTION statements.
The error can be either freely ignored or used in order to print an error message.
We don't need rowid, as name is already the primary key.
This would allow easier integration with other runtimes later. The interface only needs two functions right now:
1. try_instantiate_wasm_function, responsible for registering a new function dynamically
2. run_wasm, responsible for executing a given function
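To illustrate what a two-function, runtime-agnostic interface could look like, here is a minimal Rust sketch. The trait name, the `Value` enum, the signatures and the `MockRuntime` are all hypothetical; only the two function names come from the commit message, and the real interface is exported as C-compatible functions rather than a trait.

```rust
use std::collections::HashMap;

/// Values exchanged with a Wasm function (hypothetical; loosely mirrors SQLite's value types).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Real(f64),
    Text(String),
    Blob(Vec<u8>),
}

/// Hypothetical runtime-agnostic interface with the two operations named above.
trait WasmRuntime {
    /// Compile `body` and register it under `name`.
    fn try_instantiate_wasm_function(&mut self, name: &str, body: &[u8]) -> Result<(), String>;
    /// Execute a previously registered function.
    fn run_wasm(&mut self, name: &str, args: &[Value]) -> Result<Value, String>;
}

/// Stand-in runtime for illustration: it stores bodies, but "executes" a
/// function by returning its argument count instead of running WebAssembly.
struct MockRuntime {
    functions: HashMap<String, Vec<u8>>,
}

impl WasmRuntime for MockRuntime {
    fn try_instantiate_wasm_function(&mut self, name: &str, body: &[u8]) -> Result<(), String> {
        if body.is_empty() {
            return Err(format!("empty body for function {}", name));
        }
        self.functions.insert(name.to_string(), body.to_vec());
        Ok(())
    }

    fn run_wasm(&mut self, name: &str, args: &[Value]) -> Result<Value, String> {
        if !self.functions.contains_key(name) {
            return Err(format!("no such function: {}", name));
        }
        Ok(Value::Integer(args.len() as i64))
    }
}
```

Swapping runtimes would then mean providing another implementation of the same two operations, while the calling code stays unchanged.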
Instead of binding to the Wasmtime C API library, the support is now moved entirely to Rust, with only the minimal set of C-compatible functions exported to be callable from libSQL main code. The new code produces a libwblibsql.so dynamic library which contains the implementation of all functions required by our ext/udf/wasm_bindings.h header. The stripped library weighs 6.4MiB, which is quite heavy, but already much better than Wasmtime's default C API library, which weighed ~17MiB.
Why not - users may prefer to create a static binary without having to worry about library paths.
Dynamic linking translates to smaller binaries, but static linking makes it more ergonomic to quickly try the shell, so let's go with static by default.
The Dockerfile can be used to build a container with precompiled sqlite3 shell inside, with WebAssembly user-defined function support.
Is there any plan to support SQLite aggregate or window functions? |
It's all already possible via the C API (https://www.sqlite.org/c3ref/create_function.html), we don't have any support for the SQL syntax (e.g. |
That said, contributions are most welcome!!! I'd be glad to help/guide if need be |
45: Full support for query parameters r=penberg a=MarinPostma

This PR introduces full support for query parameters. Both positional and named parameters are supported. The supported syntax is the same as the one described in https://www.sqlite.org/c3ref/bind_blob.html. Unbound parameters are interpreted as NULL.

## HTTP query parameters

Parameters can also be bound in HTTP requests. The syntax is quite flexible:

* Request without params:
```json
{
    "statements": ["select * from users where name = 'adhoc'"]
}
```
or (syntaxes can be mixed in the same array):
```json
{
    "statements": [{"q": "select * from users where name = 'adhoc'"}]
}
```
* Request with params:
  - positional:
```json
{
    "statements": [
        {"q": "select * from users where name = ?", "params": ["adhoc"]},
        {"q": "select * from users where name = ?1", "params": ["adhoc"]},
        {"q": "select * from users where name = $1", "params": ["adhoc"]}
    ]
}
```
  - named:
```json
{
    "statements": [
        {"q": "select * from users where name = $name", "params": {"name": "adhoc"}},
        {"q": "select * from users where name = :name", "params": {"name": "adhoc"}},
        {"q": "select * from users where name = @name", "params": {"name": "adhoc"}},
        {"q": "select * from users where name = $1", "params": {"name": "adhoc"}}, # object is order sensitive
        {"q": "select * from users where name = ?", "params": {"name": "adhoc"}}
    ]
}
```

### Handling of BLOB

BLOBs are handled as base64-encoded strings (standard alphabet, no padding), and are nested into an object for disambiguation:
```json
{
    "statements": [
        {"q": "select * from users where name = $name", "params": {"name": {"blob": "984HG3e"}}}
                                                                     # some b64 blob --^
    ]
}
```

Co-authored-by: ad hoc <postma.marin@protonmail.com>
This series implements a mechanism for registering and running Wasm functions. The current runtime of choice is wasmtime and its libwasmtime.so library with C bindings (but a switch to Rust should be considered, because that's the native language of wasmtime and the only interface which offers all of its features).
It operates on a very crude ABI (ref: #16), where ints and doubles are passed to/from WebAssembly as is, and for strings/blobs/null it passes a pointer to a structure:

- strings: `[1 byte for type specification][data]`
- blobs: `[1 byte for type specification][4 bytes of size][data]`
- null: `[1 byte for type specification]`
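The three layouts above can be sketched as small Rust encoders, reading them in the order strings, blobs, null. Several details are assumptions not stated in this description: the type tags are taken to be SQLite's datatype codes (`SQLITE_TEXT` = 3, `SQLITE_BLOB` = 4, `SQLITE_NULL` = 5), the string data is assumed to be NUL-terminated, and the 4-byte size is assumed to be big-endian. The function names are hypothetical.

```rust
// Type tags: assumed here to match SQLite's fundamental datatype codes.
const TYPE_TEXT: u8 = 3; // SQLITE_TEXT
const TYPE_BLOB: u8 = 4; // SQLITE_BLOB
const TYPE_NULL: u8 = 5; // SQLITE_NULL

/// String layout: [1 byte type][data].
/// The trailing NUL terminator is an assumption.
fn encode_text(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len() + 2);
    out.push(TYPE_TEXT);
    out.extend_from_slice(s.as_bytes());
    out.push(0);
    out
}

/// Blob layout: [1 byte type][4 bytes of size][data].
/// Big-endian byte order for the size is an assumption.
fn encode_blob(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() + 5);
    out.push(TYPE_BLOB);
    out.extend_from_slice(&(data.len() as u32).to_be_bytes());
    out.extend_from_slice(data);
    out
}

/// Null layout: [1 byte type].
fn encode_null() -> Vec<u8> {
    vec![TYPE_NULL]
}
```

The host would copy such a buffer into the Wasm module's linear memory and pass its address as the argument; integers and doubles skip this step entirely.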
The way it's implemented now is twofold:

1. a `run_wasm` function, capable of running WebAssembly and translating the parameter types from and to the Wasm module
2. a meta-table, `CREATE TABLE libsql_wasm_func_table(name text PRIMARY KEY, body text)`. The table can be initialized from C code by calling `libsql_try_initialize_wasm_func_table()` or from the shell by using the `.init_wasm_func_table` command.

After creating and filling the new meta-table, when a function call is used in a statement, e.g. `SELECT id, fib(id) FROM t`, and function `fib` is neither built-in nor user-defined, it will be looked up in the table. If found, its body will be assumed to hold valid WebAssembly code, compiled and run.

In order to enable WebAssembly integration, run configure with the `./configure --enable-wasm-runtime` parameter.

A few examples of WebAssembly-based user-defined functions coded in Rust can be found here: https://github.com/psarna/libsql_bindgen
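The lookup order described above (built-in first, then user-defined, then the meta-table) can be sketched as follows. This is only an illustration: `Resolver` and `Resolution` are hypothetical names, and in-memory collections stand in for libSQL's function registry and for the rows of `libsql_wasm_func_table`.

```rust
use std::collections::{HashMap, HashSet};

/// Outcome of resolving a function name used in a statement.
#[derive(Debug, PartialEq)]
enum Resolution {
    BuiltIn,
    UserDefined,
    /// Found in the meta-table; carries the stored body to compile and run.
    Wasm(String),
    NotFound,
}

/// Hypothetical resolver mirroring the described lookup order.
struct Resolver {
    built_in: HashSet<String>,
    user_defined: HashSet<String>,
    /// Stand-in for rows of libsql_wasm_func_table: name -> body.
    wasm_func_table: HashMap<String, String>,
}

impl Resolver {
    fn resolve(&self, name: &str) -> Resolution {
        if self.built_in.contains(name) {
            Resolution::BuiltIn
        } else if self.user_defined.contains(name) {
            Resolution::UserDefined
        } else if let Some(body) = self.wasm_func_table.get(name) {
            // Only reached when the name is neither built-in nor user-defined.
            Resolution::Wasm(body.clone())
        } else {
            Resolution::NotFound
        }
    }
}
```

With this ordering, a row in the meta-table never shadows an existing built-in or user-defined function of the same name.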
Here's an inline demo for testing purposes, with a WebAssembly fibonacci sequence already compiled from Rust and copied in-place:
This series also comes with syntactic sugar for registering and deregistering Wasm functions dynamically via SQL: `CREATE FUNCTION` and `DROP FUNCTION`.

Fixes #18
Fixes #17