-
Notifications
You must be signed in to change notification settings - Fork 113
feat(java): use JNI instead of JNA #2773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Spark column vectors return `UTF8String`s, which can wrap one of - JVM heap-allocated `String` - JVM heap-allocated `byte[]` - A native pointer + len to native off-heap memory Previously we've been using the String pathway, this PR changes our Java scans to canonicalize on read (`.with_canonicalize(true)`), and sends back to Java a ptr + len pair. This patch when ferried down into Iceberg gives us another 2x speedup on Citibike scan
| NativeLoader.loadJni(); | ||
| } | ||
|
|
||
| private OptionalLong pointer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this ever absent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, the idea here is that on close I set it to empty. And then I unwrap the optional before calling into JNI. The goal is to turn something that would be a SEG into a Java NPE. I do this for all of the wrapped classes that have close implemented.
| } | ||
|
|
||
| tasks.withType<Test>().all { | ||
| classpath += |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this warms my heart
java/vortex-spark/src/test/java/dev/vortex/spark/VortexScanTest.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| #[unsafe(no_mangle)] | ||
| pub extern "system" fn Java_dev_vortex_jni_NativeArrayStreamMethods_free( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤮
Switch from JNA -> JNI. On some microbenchmarks I've seen this result in a ~3x speedup for simple string-passing. I wired this into the Iceberg fork for Vortex and am seeing an immediate ~40% speedup on Citibike scan queries Subsequently #2781 gives us another 2x speedup on Citibike.
Switch from JNA -> JNI. On some microbenchmarks I've seen this result in a ~3x speedup for simple string-passing.
I wired this into the Iceberg fork for Vortex and am seeing an immediate ~40% speedup on Citibike scan queries
Subsequently #2781 gives us another 2x speedup on Citibike.