a10y commented Mar 24, 2025

Switch from JNA -> JNI. On some microbenchmarks I've seen this result in a ~3x speedup for simple string-passing.

I wired this into the Iceberg fork for Vortex and am seeing an immediate ~40% speedup on Citibike scan queries.

Subsequently, #2781 gives us another 2x speedup on Citibike.

a10y force-pushed the aduffy/jni-crate branch from d3b4da1 to 5c4197e, Mar 24, 2025 19:53
a10y changed the title from "[WIP] feat(java): use JNI instead of JNA" to "feat(java): use JNI instead of JNA", Mar 24, 2025
a10y marked this pull request as ready for review, Mar 24, 2025 19:54
a10y requested a review from lwwmanning, Mar 24, 2025 19:55

Spark column vectors return `UTF8String`s, which can wrap one of:

- JVM heap-allocated `String`
- JVM heap-allocated `byte[]`
- A native pointer + len to native off-heap memory

Previously we've been using the String pathway. This PR changes our Java
scans to canonicalize on read (`.with_canonicalize(true)`) and sends a
ptr + len pair back to Java.

This patch, when ferried down into Iceberg, gives us another 2x speedup on
Citibike scans.
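
To make the ptr + len idea concrete, here is a minimal sketch of a string view over off-heap bytes that only allocates a heap `String` when asked. It is not the Vortex or Spark implementation: the class `OffHeapUtf8View` and its methods are hypothetical, and a direct `ByteBuffer` with an integer offset stands in for real native memory and a raw pointer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration; a direct ByteBuffer stands in for native off-heap memory. */
final class OffHeapUtf8View {
    private final ByteBuffer memory; // off-heap region holding strings back to back
    private final int offset;        // plays the role of the native pointer
    private final int length;        // length of this string in bytes

    OffHeapUtf8View(ByteBuffer memory, int offset, int length) {
        this.memory = memory;
        this.offset = offset;
        this.length = length;
    }

    /** Copies and decodes the bytes; this is the cost the ptr + len path can defer. */
    String materialize() {
        byte[] bytes = new byte[length];
        ByteBuffer cursor = memory.duplicate(); // independent position, shared contents
        cursor.position(offset);
        cursor.get(bytes, 0, length);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Compares raw bytes in place, without allocating a String. */
    boolean sameBytes(OffHeapUtf8View other) {
        if (length != other.length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (memory.get(offset + i) != other.memory.get(other.offset + i)) {
                return false;
            }
        }
        return true;
    }

    /** Helper for the demo: packs UTF-8 strings contiguously into a direct buffer. */
    static ByteBuffer pack(String... values) {
        int total = 0;
        for (String v : values) {
            total += v.getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buffer = ByteBuffer.allocateDirect(total);
        for (String v : values) {
            buffer.put(v.getBytes(StandardCharsets.UTF_8));
        }
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = pack("Citibike", "scan");
        OffHeapUtf8View first = new OffHeapUtf8View(buffer, 0, 8);
        OffHeapUtf8View second = new OffHeapUtf8View(buffer, 8, 4);
        System.out.println(first.materialize());
        System.out.println(first.sameBytes(second));
    }
}
```

In this sketch, comparisons and byte-level operations never touch the heap, and decoding happens only in `materialize()`. Whether that accounts for the speedup reported above is not something the sketch can show.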
NativeLoader.loadJni();
}

private OptionalLong pointer;
Contributor

is this ever absent?

Contributor Author

Yeah, the idea here is that on close I set it to empty, and then I unwrap the optional before calling into JNI. The goal is to turn something that would be a segfault into a Java NPE. I do this for all of the wrapped classes that implement close.
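
A minimal sketch of the pattern described in this reply, assuming a wrapper that holds its native pointer in an `OptionalLong`. The class `NativeHandle` and the method `pointerOrThrow` are hypothetical names, not the classes in this PR. Note that in this sketch the unwrap uses `OptionalLong.getAsLong()`, which throws `NoSuchElementException` rather than a `NullPointerException`. Either way, use after close fails as a Java exception instead of a crash in native code.

```java
import java.util.NoSuchElementException;
import java.util.OptionalLong;

/** Hypothetical wrapper around a native pointer; not the actual Vortex class. */
final class NativeHandle implements AutoCloseable {
    // Present while the native object is alive, empty once close() has run.
    private OptionalLong pointer;

    NativeHandle(long rawPointer) {
        this.pointer = OptionalLong.of(rawPointer);
    }

    /** Unwraps the pointer before a native call; throws if the handle is closed. */
    long pointerOrThrow() {
        // getAsLong() throws NoSuchElementException when no value is present.
        return pointer.getAsLong();
    }

    @Override
    public void close() {
        if (pointer.isPresent()) {
            // A real implementation would call the native free function here.
            pointer = OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        NativeHandle handle = new NativeHandle(0x1000L);
        System.out.println("pointer while open: " + handle.pointerOrThrow());
        handle.close();
        try {
            handle.pointerOrThrow();
        } catch (NoSuchElementException e) {
            System.out.println("use after close rejected on the Java side");
        }
    }
}
```

Because `close()` only clears the field when a value is present, calling it twice is harmless, which also guards against a double free in the native layer.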

}

tasks.withType<Test>().all {
classpath +=
Contributor

this warms my heart

}

#[unsafe(no_mangle)]
pub extern "system" fn Java_dev_vortex_jni_NativeArrayStreamMethods_free(
Contributor

🤮

a10y merged commit 014c958 into develop, Mar 25, 2025
27 checks passed
a10y deleted the aduffy/jni-crate branch, Mar 25, 2025 13:19
joseph-isaacs pushed a commit that referenced this pull request Mar 26, 2025