Passing heap-allocated byte[]s #68

Closed
benalexau opened this issue Jul 16, 2016 · 12 comments · Fixed by #219

@benalexau

We are using JNR-FFI to wrap the C LMDB library in LmdbJava. LMDB requires an MDB_val with a size and a pointer to the data. This works fine with direct ByteBuffers (where we can fetch the memory address and capacity from the buffer and put them directly into the memory we allocated for the struct), but we've had a request to support heap-allocated byte[]s. We can copy the byte[] to a direct buffer, but this has considerable cost, and performance is a major consideration for LMDB users.

Is there some way we can fetch the on-heap byte[] address and protect it from being moved for the duration of a JNR-FFI native call? GetByteArrayElements in JNINativeInterface might be suitable, but I cannot see any information on how to use it. Any suggestions appreciated.
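
For reference, the copy-based workaround described above can be sketched in plain Java as follows. This is only an illustration of the extra copy under discussion, not LmdbJava's actual code; the class and method names (`DirectCopyExample`, `toDirect`) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectCopyExample {
    /** Copies a heap array into a freshly allocated direct (off-heap) buffer. */
    public static ByteBuffer toDirect(byte[] heap) {
        ByteBuffer direct = ByteBuffer.allocateDirect(heap.length);
        direct.put(heap); // this is the per-call copy the issue would like to avoid
        direct.flip();    // position = 0, limit = heap.length, ready for reading
        return direct;
    }

    public static void main(String[] args) {
        byte[] key = "hello".getBytes(StandardCharsets.UTF_8);
        ByteBuffer direct = toDirect(key);
        // A direct buffer has a stable native address that can be handed to C code.
        System.out.println(direct.isDirect() + " " + direct.remaining());
    }
}
```

Both the allocation and the `put` happen on every call unless the direct buffer is pooled, which is why the cost matters for small, frequent operations.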

@headius
Member

headius commented Sep 26, 2016

There's no way through standard JNI to get a direct reference to memory on the heap. Some of the APIs that allow you to access such memory (such as the GetByteArrayElements you mentioned) say that the VM may give you a pinned reference directly into the heap, but that this is not guaranteed. I believe HotSpot, the most-deployed JVM, always chooses to copy instead.

There are JDK improvements coming up that will make it easier to blur the lines between on-heap and off-heap memory, but for the foreseeable future there's no safe way to expose heap memory directly to native code.

headius closed this as completed Sep 26, 2016
@phraktle

Hi @headius,

I believe GetPrimitiveArrayCritical should provide direct access to the array without copying, on HotSpot as well (with the caveat that the operation shouldn't block for long, since that could delay VM housekeeping).

@headius
Member

headius commented Sep 26, 2016

@phraktle I can't find information as to whether HotSpot will actually pin these days; most discussions of that function are old and refer to the now-defunct, non-moving CMS GC. From what I can gather, it did do this at some point in the past, you pay a locking/unlocking cost in addition to normal JNI overhead, and you still might not actually get the real array anyway. So in the best case, you have to acquire a lock and block the GC and other critical JVM subsystems. In the worst case, you're no better (or worse) than copying the data out.

It's probably worth looking into. If someone would like to do that, I'd be happy to open this and look forward to a PR :-)

@phraktle

phraktle commented Sep 26, 2016

The difference in GCs re GetCritical is mostly about whether it completely suspends GC operations or only partially (as is the case with CMS, which is still very much alive, and still works best for many use cases :). Several popular native bindings use GetCritical where performance was a concern (e.g. Netty networking layer or LZ4). Of course a benchmark would be best to demonstrate it.

Not familiar w/ JNR-FFI internals, but if you can outline a sketch of where one would need to apply the hammer, I can take a look.

@headius
Member

headius commented Sep 26, 2016

@benalexau Thoughts on using GetCritical? I haven't dug into the logic for passing heap byte[] out but perhaps the right direction would be tagging the param as "direct" or "critical" and modifying the value-marshaling logic to use GetCritical for those parameters?
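
To make the "tag the param" idea concrete, here is a minimal sketch of how a binding layer could choose a marshaling strategy per parameter from an annotation. Everything in it is hypothetical: `@Critical`, `Strategy`, and `Lib` are names invented for this example and are not part of jnr-ffi, and the actual call to GetPrimitiveArrayCritical would live in native code that is not shown.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Arrays;

public class CriticalParamSketch {
    /** Hypothetical opt-in marker; not a jnr-ffi annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    public @interface Critical {}

    /** DEFAULT = existing marshaling path; CRITICAL = pass the array via Get*Critical. */
    public enum Strategy { DEFAULT, CRITICAL }

    /** Hypothetical binding interface used only for this illustration. */
    public interface Lib {
        int put(@Critical byte[] key, byte[] value, int flags);
    }

    /** Only annotated primitive arrays are eligible for the critical path. */
    public static Strategy strategyFor(Parameter p) {
        Class<?> t = p.getType();
        boolean primitiveArray = t.isArray() && t.getComponentType().isPrimitive();
        return primitiveArray && p.isAnnotationPresent(Critical.class)
                ? Strategy.CRITICAL : Strategy.DEFAULT;
    }

    /** Builds the per-parameter plan for the first method with the given name. */
    public static Strategy[] plan(Class<?> iface, String methodName) {
        for (Method m : iface.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return Arrays.stream(m.getParameters())
                        .map(CriticalParamSketch::strategyFor)
                        .toArray(Strategy[]::new);
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(plan(Lib.class, "put")));
        // prints "[CRITICAL, DEFAULT, DEFAULT]"
    }
}
```

Making the behavior opt-in per parameter keeps the existing copy semantics as the default, so only calls known to be short and non-blocking pay the GC-related cost of a critical section.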

@headius
Member

headius commented Sep 26, 2016

@phraktle Thanks for the info. I'm not sure the right location myself, at the moment, but you may be on to something. I'll reopen this.

headius reopened this Sep 26, 2016
@phraktle

Some more details on how HotSpot implements Get*Critical without actually locking on the fast path: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/58d961f47dd4/src/share/vm/memory/gcLocker.hpp#l127

@Spasi

Spasi commented Sep 27, 2016

You may want to have a look at Hotspot Critical Natives. A random example from LWJGL's lmdb bindings, here. (compare Java_org_lwjgl_util_lmdb_LMDB_nmdb_1version___3I_3I_3I to JavaCritical_org_lwjgl_util_lmdb_LMDB_nmdb_1version___3I_3I_3I)

Pros:

  • Works and is as fast as passing ByteBuffer addresses.

Cons:

  • Undocumented and supported unofficially on Hotspot only.
  • The critical natives are ignored in C0 (that's why having both standard and critical versions of the same function is required). This is important for functions that are called infrequently with big arrays, you risk always paying the array copy cost.
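
The two symbol names quoted above differ only in their prefix. As a rough illustration, the following sketch derives both from the class name, method name, and argument signature using the standard JNI name-escaping rules. It assumes, based solely on the two names in this comment, that the critical variant uses the same escaping with a `JavaCritical` prefix; `JniNames`, `mangle`, and `symbol` are names invented for this example.

```java
public class JniNames {
    /** Applies the JNI escapes: '.' or '/' -> "_", '_' -> "_1", ';' -> "_2", '[' -> "_3". */
    public static String mangle(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '.' || c == '/') {
                out.append('_');
            } else if (c == '_') {
                out.append("_1");
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                out.append(c);
            } else {
                out.append(String.format("_0%04x", (int) c)); // other characters as _0xxxx
            }
        }
        return out.toString();
    }

    /** Builds the long (overload-qualified) symbol name for the given prefix. */
    public static String symbol(String prefix, String className, String method, String argSig) {
        return prefix + "_" + mangle(className) + "_" + mangle(method) + "__" + mangle(argSig);
    }

    public static void main(String[] args) {
        String cls = "org.lwjgl.util.lmdb.LMDB";
        // Three int[] arguments have the JVM argument signature "[I[I[I".
        System.out.println(symbol("Java", cls, "nmdb_version", "[I[I[I"));
        System.out.println(symbol("JavaCritical", cls, "nmdb_version", "[I[I[I"));
    }
}
```

Run as written, this should reproduce the two names from the LWJGL example, which is why a library has to export both symbols for every function that has a critical variant.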

@headius
Member

headius commented Sep 28, 2016

👍 from me...this is potentially huge news!

@Spasi That is incredibly interesting! I did not know about JavaCritical, but this could allow us to improve the perf of JNR significantly by allowing users to opt-in to critical function binding. Libraries like jnr-posix could start using it immediately for known non-blocking calls.

So yeah I think this needs to happen, and soon.

@headius
Member

headius commented Sep 28, 2016

@phraktle Thank you also for the information on locking and critical array references from JNI.

I think we will focus in this issue on the possibility of marking primitive arrays as "critical" or "pass-through" and I will open a separate issue for the more ambitious use of JavaCritical across jnr-*.

@headius
Member

headius commented Sep 28, 2016

I believe supporting byte[] pass-through will still require new jffi native binaries, so this bug is likely to stall while we work on #86 and jnr/jffi#34 (since we don't want to have to re-build the native bits again later).

bkgood added a commit to bkgood/jnr-ffi that referenced this issue Feb 21, 2021
Pinning primitive arrays enables access to their contents in native code
without requiring copying.

The contents of the array are accessed via the GetPrimitiveArrayCritical
JNI method. Different JVMs and different JVM configurations have
different implementations and consequences of this method. In G1 (on
Hotspot in OpenJDK 11) it currently acquires a lock that prevents
garbage collection ("GCLocker") and thus can negatively impact forward
progress of mutators if a collection is required. Native methods are
thus expected to not hold these arrays for long, as is documented in JNI
docs [1] and in the `@Pinned` javadocs.

In the case that a VM is unable for some reason to avoid a copy,
GetPrimitiveArrayCritical will return a copy. This will cause test
failure: I've included a test to verify that pinning 'works': this is
the best strategy I've come up with to verify that
GetPrimitiveArrayCritical is being called as a result of the param
annotation + flag. Access to some sort of counter, if available, might
be preferable, but I'm unaware of any such portable and accessible
counters.

Fortunately, OpenJDK 11 and OpenJ9's JDK 11-compatible distribution both
seem to support pinned access, as does Dalvik/ART. Given that all the
popular implementations seem to support pinning, I've left the test
enabled.

Performance benefits of pinning are significant: the attached test
displays this, as does the example at
bkgood/jnr-ffi-array-pinning-tests. I embarked on this due to
surprisingly limited performance I got in a WIP libsnappy binding: with
copying (including judicious use of `@In` and `@Out`) I get about 90
MB/s but enabling array pinning gets it closer to 250 MB/s.

This was largely implemented in jnr/jffi@a61b1fc42aa7; the requisite
flag just needs to make it to the native stubs.

Fixes jnr#68.

[1]
https://docs.oracle.com/en/java/javase/11/docs/specs/jni/functions.html#getprimitivearraycritical-releaseprimitivearraycritical
@headius
Member

headius commented Feb 23, 2021

Fixed by @bkgood in #219, for at least the asm-generated stub logic. We will look at getting releases out based on this change soon.

headius added this to the 2.2.2 milestone Feb 23, 2021