Add Java vectorized scalar function support (1.5) #648
Merged
staticlibs merged 1 commit into duckdb:v1.5-variegata on Apr 12, 2026
This is a backport of the PR duckdb#630 to the `v1.5-variegata` stable branch.

## Summary

This PR adds the implementation of Java Scalar Functions (UDFs) in duckdb-java, using a vectorized callback model for execution. It introduces function registration, callback bridging, typed vector read/write APIs, documentation, and test coverage for supported types.

## What this PR adds

- New public API on DuckDBConnection:
  - registerScalarFunction(String name, String[] parameterTypes, String returnType, DuckDBVectorizedScalarFunction function)
- New callback contract and vector APIs:
  - DuckDBVectorizedScalarFunction
  - DuckDBDataChunkReader
  - DuckDBReadableVector
  - DuckDBWritableVector
- JNI/C bridge needed to connect Java callbacks to DuckDB native scalar callback execution
- SQL type parsing helper used by the string-based Java registration API
- Scalar UDF documentation (UDF.MD) and README reference
- Dedicated test suite (TestScalarFunctions) plus binding-level regression tests

## Main design decisions

### 1) Prioritize Java-side logic

The design keeps most registration and type wiring logic in Java, with JNI used only for unavoidable callback bridging responsibilities.

### 2) Keep JNI additions minimal and essential

JNI is limited to:

- native callback pointer/state installation
- JVM thread attach/detach from DuckDB execution threads
- callback lifecycle and error propagation
- required helpers for logical type parsing and safe VARCHAR extraction

### 3) Performance-focused vector path

The UDF execution path uses dedicated typed vector classes (DuckDBReadableVector/DuckDBWritableVector) instead of generic JDBC row/object paths, to reduce overhead in callback hot loops:

- primitive typed access/write APIs
- direct output vector writes
- explicit null-mask handling
- reduced boxing/unboxing and object allocation

## Correctness and hardening included

- DECIMAL output validates declared precision/scale
- VARCHAR helper validates row bounds
- VARCHAR null rows are guarded in Java and JNI
- Vector code uses ByteOrder.nativeOrder() consistently
- UBIGINT read/write is endian-correct

## Testing

- Added broad scalar UDF coverage in TestScalarFunctions

Co-Authored-By: Luis Fernando Kauer <lfkauer@yahoo.com.br>
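To illustrate the ideas behind the vector path and the hardening notes, here is a minimal, self-contained sketch in plain Java. It does not use the duckdb-java classes: `VectorSketch`, `ChunkFunction`, `allocateBigIntVector`, `readUBigInt`, and `ADD_ONE` are hypothetical names invented for this example, and a `boolean[]` stands in for the validity (null) mask. It only shows the general pattern of one callback per chunk, primitive access through a `ByteOrder.nativeOrder()` buffer, explicit null handling, and reading a 64-bit value as unsigned.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; none of these names exist in duckdb-java.
final class VectorSketch {

    /** Stand-in for a vectorized callback: invoked once per chunk, not once per row. */
    interface ChunkFunction {
        void apply(ByteBuffer in, boolean[] inValid, ByteBuffer out, boolean[] outValid, int rowCount);
    }

    /** Allocates a native-order buffer holding rowCount 8-byte values. */
    static ByteBuffer allocateBigIntVector(int rowCount) {
        return ByteBuffer.allocateDirect(rowCount * Long.BYTES).order(ByteOrder.nativeOrder());
    }

    /** Reads the 8 bytes at the given row and interprets them as an unsigned 64-bit value. */
    static BigInteger readUBigInt(ByteBuffer vec, int row) {
        long raw = vec.getLong(row * Long.BYTES);
        return new BigInteger(Long.toUnsignedString(raw));
    }

    /** Adds one to every valid row using primitive reads and writes; null rows stay null. */
    static final ChunkFunction ADD_ONE = (in, inValid, out, outValid, rowCount) -> {
        for (int row = 0; row < rowCount; row++) {
            outValid[row] = inValid[row];
            if (inValid[row]) {
                out.putLong(row * Long.BYTES, in.getLong(row * Long.BYTES) + 1);
            }
        }
    };

    public static void main(String[] args) {
        int rowCount = 3;
        ByteBuffer in = allocateBigIntVector(rowCount);
        ByteBuffer out = allocateBigIntVector(rowCount);
        in.putLong(0, 41L);
        in.putLong(Long.BYTES, -2L); // same bit pattern as unsigned 2^64 - 2
        boolean[] inValid = {true, true, false}; // third row is NULL
        boolean[] outValid = new boolean[rowCount];

        ADD_ONE.apply(in, inValid, out, outValid, rowCount);

        System.out.println(out.getLong(0));      // prints 42
        System.out.println(readUBigInt(out, 1)); // prints 18446744073709551615
        System.out.println(outValid[2]);         // prints false
    }
}
```

The loop touches only `long` and `boolean` primitives, which is the kind of allocation-free hot path the design section describes. The actual callback signature and vector methods are defined by this PR and documented in UDF.MD.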
staticlibs added a commit to staticlibs/duckdb-java that referenced this pull request on Apr 12, 2026
This is a backport of the PR duckdb#648 to `v1.5-variegata` stable branch. This PR is a continuation of duckdb#630, it adds support for writing DuckDB table functions in Java. Documentation is added to UDF.md. Testing: new test added