-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement java bindings #1928
Changes from all commits
08853d1
da9e820
1e9e42f
d03978f
71c676a
4bac79c
b891c13
19575e3
32c3d16
fed49df
5a63331
e4ca66d
fd1dd55
05ec38e
2a1bf4b
13f1800
285b723
5518d1e
4cff35a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
name: Build and Run Java JNI Tests | ||
on: | ||
push: | ||
branches: | ||
- main | ||
pull_request: | ||
paths: | ||
- java/** | ||
- rust/** | ||
- .github/workflows/java.yml | ||
env: | ||
# This env var is used by Swatinem/rust-cache@v2 for the cache | ||
# key, so we set it to make sure it is always consistent. | ||
CARGO_TERM_COLOR: always | ||
# Disable full debug symbol generation to speed up CI build and keep memory down | ||
# "1" means line tables only, which is useful for panic tracebacks. | ||
RUSTFLAGS: "-C debuginfo=1" | ||
RUST_BACKTRACE: "1" | ||
# according to: https://matklad.github.io/2021/09/04/fast-rust-builds.html | ||
# CI builds are faster with incremental disabled. | ||
CARGO_INCREMENTAL: "0" | ||
CARGO_BUILD_JOBS: "1" | ||
jobs: | ||
linux-build: | ||
runs-on: ubuntu-22.04 | ||
name: ubuntu-22.04 + Java 11 & 17 | ||
defaults: | ||
run: | ||
working-directory: ./java | ||
steps: | ||
- name: Checkout repository | ||
uses: actions/checkout@v4 | ||
- uses: Swatinem/rust-cache@v2 | ||
with: | ||
workspaces: java/java-jni | ||
- name: Run cargo fmt | ||
run: cargo fmt --check | ||
working-directory: ./java/lance-jni | ||
- name: Install dependencies | ||
run: | | ||
sudo apt update | ||
sudo apt install -y protobuf-compiler libssl-dev | ||
- name: Install Java 17 | ||
uses: actions/setup-java@v4 | ||
with: | ||
distribution: temurin | ||
java-version: 17 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should be 11 here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed |
||
cache: "maven" | ||
- run: echo "JAVA_17=$JAVA_HOME" >> $GITHUB_ENV | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we use a matrix to split it into two jobs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you look at the CI job, testing another Java version only took 6s. If we split to another job, it would have to recompile the Rust (and also do the installation steps). I think we should keep as-is so that our CI usage is kept minimal. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you for the comments @wjones127 ! Yes, that's why I merge these two together as recompiling the rust part really takes a lot of time. |
||
- name: Install Java 11 | ||
uses: actions/setup-java@v4 | ||
with: | ||
distribution: temurin | ||
java-version: 11 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should be 17 instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed |
||
cache: "maven" | ||
- name: Java Style Check | ||
run: mvn checkstyle:check | ||
- name: Rust Clippy | ||
working-directory: java/lance-jni | ||
run: cargo clippy --all-targets -- -D warnings | ||
- name: Build with Maven with Java 11 | ||
run: mvn package -DskipTests=true | ||
- name: Running tests with Java 11 | ||
run: mvn test | ||
- name: Running tests with Java 17 | ||
run: JAVA_HOME=$JAVA_17 JAVA_TOOL_OPTIONS="--add-opens=java.base/java.nio=ALL-UNNAMED" mvn test |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,9 +35,6 @@ jobs: | |
toolchain: | ||
- stable | ||
- nightly | ||
defaults: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think https://github.com/lancedb/lance/blob/main/.github/workflows/bump-version/action.yml also needs to be updated There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed. |
||
run: | ||
working-directory: ./rust | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- uses: Swatinem/rust-cache@v2 | ||
|
@@ -73,9 +70,6 @@ jobs: | |
linux-arm: | ||
runs-on: buildjet-4vcpu-ubuntu-2204-arm | ||
timeout-minutes: 30 | ||
defaults: | ||
run: | ||
working-directory: ./rust | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- uses: Swatinem/rust-cache@v2 | ||
|
@@ -94,9 +88,6 @@ jobs: | |
clippy: | ||
runs-on: ubuntu-22.04 | ||
timeout-minutes: 30 | ||
defaults: | ||
run: | ||
working-directory: ./rust | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- uses: Swatinem/rust-cache@v2 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
*.iml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
[package] | ||
name = "lance-jni" | ||
version.workspace = true | ||
edition.workspace = true | ||
repository.workspace = true | ||
readme.workspace = true | ||
license.workspace = true | ||
description = "JNI bindings for Lance Columnar format" | ||
keywords.workspace = true | ||
categories.workspace = true | ||
|
||
[lib] | ||
crate-type = ["cdylib"] | ||
|
||
[dependencies] | ||
lance.workspace = true | ||
arrow = { workspace = true, features = ["ffi"] } | ||
tokio = { workspace = true } | ||
jni = "0.21.1" | ||
snafu.workspace = true | ||
lazy_static.workspace = true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
// Copyright 2023 Lance Developers. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
use arrow::array::RecordBatchReader; | ||
|
||
use lance::dataset::{Dataset, WriteParams}; | ||
use lance::Result; | ||
|
||
use crate::RT; | ||
|
||
pub struct BlockingDataset { | ||
inner: Dataset, | ||
} | ||
|
||
impl BlockingDataset { | ||
pub fn write( | ||
reader: impl RecordBatchReader + Send + 'static, | ||
uri: &str, | ||
params: Option<WriteParams>, | ||
) -> Result<Self> { | ||
let inner = RT.block_on(Dataset::write(reader, uri, params))?; | ||
Ok(Self { inner }) | ||
} | ||
|
||
pub fn open(uri: &str) -> Result<Self> { | ||
let inner = RT.block_on(Dataset::open(uri))?; | ||
Ok(Self { inner }) | ||
} | ||
|
||
pub fn count_rows(&self) -> Result<usize> { | ||
RT.block_on(self.inner.count_rows()) | ||
} | ||
|
||
pub fn close(&self) {} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So input and output are There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the input of the write path can be either a FFI_ArrowArrayStream or FFI_ArrowArray (I have implement the stream api for write, I'm thinking to implement append api for both structure). For the output of the read path, I think it will be something very similar but I didn't get a chance to try to implement it. |
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
// Copyright 2023 Lance Developers. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
use jni::objects::{JMap, JObject, JString}; | ||
use jni::JNIEnv; | ||
|
||
pub fn throw_java_exception(env: &mut JNIEnv, err_msg: &str) { | ||
env.throw_new("java/lang/RuntimeException", err_msg) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: should we make a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, will do as follow up. |
||
.expect("Error throwing exception"); | ||
} | ||
|
||
pub fn extract_path_str(env: &mut JNIEnv, path: &JString) -> Result<String, String> { | ||
match env.get_string(path) { | ||
Ok(path) => Ok(path.into()), | ||
Err(err) => Err(format!("Invalid path string: {}", err)), | ||
} | ||
} | ||
|
||
pub trait ExtractJniValue { | ||
// extract JNI JObject to rust type | ||
fn extract(env: &mut JNIEnv, obj: JObject) -> Result<Self, String> | ||
where | ||
Self: Sized; | ||
} | ||
|
||
impl ExtractJniValue for i32 { | ||
fn extract(env: &mut JNIEnv, obj: JObject) -> Result<Self, String> { | ||
env.call_method(obj, "intValue", "()I", &[]) | ||
.and_then(|jvalue| jvalue.i()) | ||
.map_err(|e| e.to_string()) | ||
} | ||
} | ||
|
||
impl ExtractJniValue for i64 { | ||
fn extract(env: &mut JNIEnv, obj: JObject) -> Result<Self, String> { | ||
env.call_method(obj, "longValue", "()J", &[]) | ||
.and_then(|jvalue| jvalue.j()) | ||
.map_err(|e| e.to_string()) | ||
} | ||
} | ||
|
||
impl ExtractJniValue for String { | ||
fn extract(env: &mut JNIEnv, obj: JObject) -> Result<Self, String> { | ||
env.get_string(&JString::from(obj)) | ||
.map(|jstr| jstr.into()) | ||
.map_err(|e| e.to_string()) | ||
} | ||
} | ||
|
||
pub fn extract_and_process_jni_map_value<T, F>( | ||
env: &mut JNIEnv, | ||
map: &JMap, | ||
key: &str, | ||
mut process: F, | ||
) -> Result<(), String> | ||
where | ||
T: ExtractJniValue, | ||
F: FnMut(T) -> Result<(), String>, // func to apply to return value | ||
{ | ||
let key_obj = env | ||
.new_string(key) | ||
.map(JObject::from) | ||
.map_err(|e| e.to_string())?; | ||
|
||
match map.get(env, &key_obj) { | ||
Ok(Some(value_obj)) => { | ||
if !value_obj.is_null() { | ||
let value = T::extract(env, value_obj)?; | ||
process(value)?; | ||
} | ||
Ok(()) | ||
} | ||
Ok(None) => Ok(()), // Key is not present in the map, so we do nothing | ||
Err(e) => Err(e.to_string()), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do expect to run this everytime with rust change, could we change this package as a workspace package with the rest of rust code, and share the same workspace Cargo.toml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds great to me!