Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Array wrapper types with lifetimes #400

Merged
merged 10 commits into from
Feb 1, 2023
Merged

Conversation

rib
Copy link
Contributor

@rib rib commented Jan 18, 2023

Overview

Instead of all the array-based APIs being based on sys types like
jbyteArray (which can easily lead to the use of invalid local
references) this introduces types like JByteArray<'local> that
have a lifetime parameter to ensure the local reference can't
be read if it is deleted or goes out of scope.

JByteArray and JIntArray etc are based on a generic
JPrimitiveArray<T> which takes any of the primitive types that
may be put into arrays, like jbyte or jint.

JObjectArray is handled separately, considering that it can't be used
with AutoArray or AutoPrimitiveArray.

The implementations of AutoArray and AutoPrimitiveArray were updated
so they can be created based on a JPrimitiveAarray.

While updating the lifetime parameters to the AutoArray and
AutoPrimitiveArray APIs numerous implementation inconsistencies were
found, even though they should only really differ in how they acquire /
release the pointer to array elements. This patch also normalizes the
implementation of both of these APIs, so they are easily comparable. It
may even be possible to unify these at some point.

Closes: #396
Closes #41

Definition of Done

  • There are no TODOs left in the code
  • Change is covered by automated tests
  • The coding guidelines are followed
  • Public API has documentation
  • User-visible changes are mentioned in the Changelog
  • The continuous integration build passes

@rib
Copy link
Contributor Author

rib commented Jan 18, 2023

@argv-minus-one this has a bit of overlap with the kind of thing that was update in #398 and it would be great if you'd be able to take a look over this too if possible

@rib rib added this to the 0.21 milestone Jan 18, 2023
@argv-minus-one
Copy link
Contributor

Nice. My project mostly uses Java byte arrays to move data between Java and Rust, so this is just what I needed. I'll give this a try tomorrow.

@rib
Copy link
Contributor Author

rib commented Jan 18, 2023

just to add; I had intended to open this as a "draft" PR but just noticed I must have clicked the wrong button in the end when opening the PR. This was a first pass that I did fairly quickly yesterday but wanted to post here to get some initial feedback. Thanks for taking a look.

@rib
Copy link
Contributor Author

rib commented Jan 18, 2023

Something I had meant to highlight too was that I ended up removing the AsRef<JObject<'local>> implementation for AutoArray and AutoPrimitiveArray.

On the one hand I wanted to remove the owned JObjects that were being created for these structs which looked amiss but, more generally, I think of .as_ref() a bit like a type cast and since I don't think either of these types really represent the array itself it feels like a misuse to use .as_ref() to query a kind of property instead of acting like a type cast.

I currently feel like AutoArray and AutoPrimitiveArray are poorly named types because they don't so much represent arrays as much as they represent an auto-release guard for a temporary pointer for accessing the elements of an array.

I think of them more like AutoElements and AutoElementsCritical.

I could see that it might be desirable to be able to query the link between these and the array that's being accessed but I'd think of it like querying a relationship/link to a parent object and I'd expect that to have a more explicit API instead of using .as_ref().

@argv-minus-one
Copy link
Contributor

Here are my thoughts so far. Let me know if you'd like me to work on any of them.

Merge AutoArray and AutoPrimitiveArray

I wonder if we could make JNIEnv::get_primitive_array_critical return AutoArray too, and remove AutoPrimitiveArray. They're pretty similar in API and behavior.

There is one important difference, though: AutoPrimitiveArray holds a &mut JNIEnv, to prevent the caller from making JNI calls inside the critical region, whereas AutoArray contains its own unsafe_cloned JNIEnv. I think this can be dealt with by adding a lifetime parameter 'env to AutoArray that represents:

  • If critical, the lifetime of the &mut JNIEnv. This prevents other use of the JNIEnv while the AutoArray is live.
  • If not critical, 'static. This allows other use of the JNIEnv while the AutoArray is live.
pub struct AutoArray<'local, 'other_local, 'env, T> {
    env: AutoArrayKind<'local, 'env>,
    ..
}

enum AutoArrayKind<'local, 'env> {
    // Not a critical `AutoArray`. Contains a `JNIEnv` created by `JNIEnv::unsafe_clone`.
    Normal(JNIEnv<'local>),
    
    // A critical `AutoArray`. Mutably borrows the `JNIEnv` it was created with.
    Critical(&'env mut JNIEnv<'local>),
}

impl<'local> JNIEnv<'local> {
    pub fn get_array_elements<'other_local, T>(
        &mut self,
        ..
    ) -> Result<AutoArray<'local, 'other_local, 'static, T>> {_}

    pub fn get_primitive_array_critical<'env, 'other_local, T>(
        &'env mut self,
        ..
    ) -> Result<AutoArray<'local, 'other_local, 'env, T>> {_}
}

The destructor for AutoArray would need to look at what variant self.env is to decide which JNI function to release the array with. For AutoArrayKind::Normal, use Release<Type>ArrayElements; for AutoArrayKind::Critical, use ReleasePrimitiveArrayCritical.

Deref to primitive slice

I'd really like to see an implementation of Deref<Target = [T]> and DerefMut for AutoArray<T> (and AutoPrimitiveArray<T>, if we're keeping it). Currently, there is no way to actually use AutoArray without unsafely dereferencing the raw array pointer.

To make that work, AutoArray::new would need to get the size of the array, convert it to usize, and store that where the Deref implementation can find it along with the array pointer.

Use after free

I see that AutoArray and AutoPrimitiveArray no longer take ownership of the local reference to the array. I agree with you that this is a good idea, but in that case, they should store a borrowed &JPrimitiveArray, not just the raw jarray. Currently, it looks like you can delete the local reference to the array while an AutoArray is still live, causing use-after-free in AutoArray's methods and destructor:

let mut env: JNIEnv;
let array_ref: JPrimitiveArray;

let auto_array = env.get_int_array_elements(&array_ref, _)?;
env.delete_local_ref(array_ref)?;
auto_array.size()?; // UB: uses the local reference after it is freed!

JNIEnv::get_<type>_array_elements

Should the JNIEnv::get_<type>_array_elements methods be removed? I don't see any need for them now that there's a generic JNIEnv::get_array_elements method.

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

For reference, I've made a second pass over this now.

For now I've tried to use a more consistent lifetime name of 'ptr to name the lifetime of the pointer to the array elements.

I'll look at splitting out the change to add as_raw() methods to types like JString which I added for consistency.

This iteration also ends up making quite a lot of changes to both AutoArray and AutoPrimitiveArray for the sake of normalizing their implementation, and I was also starting to consider that they could potentially be unified later through the TypeArray trait.

@argv-minus-one
Copy link
Contributor

JString already has an as_raw method from Deref<Target = JObject>.

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

Merge AutoArray and AutoPrimitiveArray

I wonder if we could make JNIEnv::get_primitive_array_critical return AutoArray too, and remove AutoPrimitiveArray. They're pretty similar in API and behavior.

There is one important difference, though: AutoPrimitiveArray holds a &mut JNIEnv, to prevent the caller from making JNI calls inside the critical region, whereas AutoArray contains its own unsafe_cloned JNIEnv. I think this can be dealt with by adding a lifetime parameter 'env to AutoArray that represents:

Yeah, I was having similar thoughts.

I was also pondering whether the API to access elements with a "critical" section should simply be marked as unsafe (along with clearer docs explaining that JNI can't be used in the critical section) instead of holding mutable env reference to try and block any use of JNI. In the documentation for GetPrimitiveArrayCritical there is a notable exception that allows nested critical sections in case you want to access multiple arrays, and if we take the mutable reference to the env then we limit that use case.

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

JString already has an as_raw method from Deref<Target = JObject>.

ah, thanks for the reminder. I guess I was initially adding as_raw() implementations that would return the more specific sys types for the arrays and then thought it was maybe an oversight that we hadn't added as_raw() methods for JString and JThrowable. I've dropped those now.

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

I see that AutoArray and AutoPrimitiveArray no longer take ownership of the local reference to the array. I agree with you that this is a good idea, but in that case, they should store a borrowed &JPrimitiveArray, not just the raw jarray. Currently, it looks like you can delete the local reference to the array while an AutoArray is still live, causing use-after-free in AutoArray's methods and destructor:

Ah yep, makes sense.

@argv-minus-one
Copy link
Contributor

I notice that this PR also contains 910ee0d and eb82c38, which don't have anything to do with arrays. Is that intentional? If not, you can remove them from this branch with git rebase --onto master eb82c38 (assuming your local master is the same as what's in this GitHub repository).

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

I notice that this PR also contains 910ee0d and eb82c38, which don't have anything to do with arrays. Is that intentional? If not, you can remove them from this branch with git rebase --onto master eb82c38 (assuming your local master is the same as what's in this GitHub repository).

Yeah, sorry for the distraction - I originally noted this in the PR description. When I first made the PR it was intended to be a draft to test out the feasibility of adding these wrapper types and I just did that based on the branch for #399. I used that branch as a base mainly for the sake of being based on the non-Copy local ref changes which would have led to lots of conflicts to rebase. I've just rebased on master.

@rib
Copy link
Contributor Author

rib commented Jan 19, 2023

@argv-minus-one I've just added a patch that implements Deref and DerefMut and updated the tests to check these. This makes it so the array length is implicitly called during ::new() to ensure we always know the array length upfront. These internally call a new_with_len() constructor in case we ever want to create an AutoArray with a pre-determined length.

.size() is replaced with .len() that doesn't return a Result.

@rib
Copy link
Contributor Author

rib commented Jan 20, 2023

Regarding the idea to try and consolidate AutoArray and AutoPrimitiveArray I'm currently a little reluctant to look at that here currently (though still open to experimenting later via a separate PR) my current gut feeling is that it may end up kind of klunky / over complicated having to deal with the env ownership differences. It's also notable that .commit() is another difference since JNI_COMMIT is fraught with issues across different JNI implementations when using critical sections.

That said - I'd be very tempted to rename AutoPrimitiveArray to something like AutoArrayCritical to at least try and better represent the difference between these two APIs.

@argv-minus-one
Copy link
Contributor

I've changed my project to use these new features, and they work very nicely.

I did need to use the bytemuck library to get a [u8] from a Java byte array, which normally gives [i8]. I imagine this will be a common need whenever using Java byte arrays to move serialized data across JNI, so it might be worth recommending bytemuck in the documentation.

@maurolacy
Copy link
Contributor

maurolacy commented Jan 20, 2023

Regarding the idea to try and consolidate AutoArray and AutoPrimitiveArray I'm currently a little reluctant to look at that here currently (though still open to experimenting later via a separate PR) my current gut feeling is that it may end up kind of klunky / over complicated having to deal with the env ownership differences.

I remember trying to do that at the time, and in the end deciding that it was simpler / clearer to have just two different impls.

It's also notable that .commit() is another difference since JNI_COMMIT is fraught with issues across different JNI implementations when using critical sections.

Yes. JNI_COMMIT is not even consistent in openjdk, depending on the flags, in particular -Xcheck:jni.

That said - I'd be very tempted to rename AutoPrimitiveArray to something like AutoArrayCritical to at least try and better represent the difference between these two APIs.

Go ahead. I tried to come up with representative names at the time, which is not easy; having Critical in the name sounds like a good idea.

@rib
Copy link
Contributor Author

rib commented Jan 22, 2023

Okey, not really wanting to bike shed naming but still; @argv-minus-one + @maurolacy can you let me know if you'd be ok with the names AutoElements / AutoElementsCritical or would you rather go with the more similar AutoArray / AutoArrayCritical names?

Personally I think that the use of Array in the name is a little misleading - more so now that this PR adds JPrimitiveArray + JByteArray/JIntArray etc as a wrapper type for Array references.

The AutoArray type doesn't own the array itself - it's an auto-release guard that gives temporary access to the elements of an array so I tend to think that AutoElements / AutoElementsCritical help to intuitively differentiate what these are for compared to JPrimitiveArray + JByteArray etc.

Please react with 🎉 for AutoArray / AutoArrayCritical or 🚀 for AutoElements / AutoElementsCritical

@maurolacy
Copy link
Contributor

Please react with 🎉 for AutoArray / AutoArrayCritical or 🚀 for AutoElements / AutoElementsCritical

Both work for me, but I tend to stick with 'Array' for historical reasons.

@rib
Copy link
Contributor Author

rib commented Jan 22, 2023

I've changed my project to use these new features, and they work very nicely.

I did need to use the bytemuck library to get a [u8] from a Java byte array, which normally gives [i8]. I imagine this will be a common need whenever using Java byte arrays to move serialized data across JNI, so it might be worth recommending bytemuck in the documentation.

It could even be worth adding a sys type for casting bytes as unsigned. If we added a jubyte that would allow Rust code to interpret any Java byte as unsigned if that's what makes sense.

@rib rib force-pushed the array-wrapper-types branch 2 times, most recently from b742a21 to 003dbe1 Compare January 31, 2023 01:23
@rib
Copy link
Contributor Author

rib commented Jan 31, 2023

Okey, I've rebased on master, squashed the doc fixes and dropped the patch that made get_array_elements[_critical] borrow the array mutably (and updated the CHANGELOG, MIGRATION guide and docs accordingly)

Hopefully this is ok to land now @argv-minus-one

@argv-minus-one
Copy link
Contributor

I've pushed two commits to your branch:

  • 1b56c74 adds a missing unsafe to the JNIEnv::get_array_elements example in the migration guide.
  • de909b1 changes the changelog and migration guide to avoid implying that AutoElements can be used without unsafe, which is unfortunately no longer true.

I've also added a comment at src/wrapper/jnienv.rs line 1750.

Everything else looks good to me.

@rib rib force-pushed the array-wrapper-types branch 2 times, most recently from 47dc986 to 36dd212 Compare February 1, 2023 18:16
rib and others added 10 commits February 1, 2023 21:16
Instead of all the array-based APIs being based on sys types like
`jbyteArray` (which can easily lead to the use of invalid local
references) this introduces types like `JByteArray<'local>` that
have a lifetime parameter to ensure the local reference can't
be read if it is deleted or goes out of scope.

`JByteArray` and `JIntArray` etc are based on a generic
`JPrimitiveArray<T>` which takes any of the primitive types that
may be put into arrays, like `jbyte` or `jint`.

Since types like `JByteArray` are simply aliases for `JPrimitiveArray<T>`
there's no need to keep separate `JNIEnv::get_<type>_array_elements`
methods since they can all be passed to `JNIEnv::get_array_elements`.

`JObjectArray` is handled separately, considering that it can't be used
with `AutoArray` or `AutoPrimitiveArray`.

The implementations of `AutoArray` and `AutoPrimitiveArray` were updated
so they can be created based on a `JPrimitiveAarray`.

While updating the lifetime parameters to the `AutoArray` and
`AutoPrimitiveArray` APIs numerous implementation inconsistencies were
found, even though they should only really differ in how they acquire /
release the pointer to array elements. This patch also normalizes the
implementation of both of these APIs, so they are easily comparable. It
may even be possible to unify these at some point.

For now the `test_get_array_elements` tests have been kept at least for
the sake of checking the data integrity from updating array elements of
different types/sizes, even though they now utilize the same
`get_array_elements` code.

Closes; jni-rs#396
Closes: jni-rs#41
This is to make it clear that `TypeArray` trait is only intended to
implemented internally.
This makes it possible to read and write the elements of a Java array
without resorting to unsafe code.

Both AutoArray and AutoPrimitiveArray now need to be constructed with
a known array length (::new() will query the length and an unsafe
::new_with_len() API could be used in case the length was already known)

The `.size()` API has been replace with `.len()` and `.is_empty()` and
`.len()` doesn't return a Result since it's no longer possible to
create an `Auto[Primitive]Array` without a known length.
So that `JNIEnv::get_array_length()` can be used with `JPrimitiveArray`
or `JObjectArray` this adds a `AsJArrayRaw` trait that has a
`.as_jarray_raw()` method that is more-or-less equivalent to
`JObject::as_raw()` but in addition to returning a `jarray` type pointer
the trait serves as a Java type specification so that we know
its safe for `.get_array_length()` to assume the reference corresponds
to a Java array.
This makes get_primitive_array_critical unsafe because we can't ensure
the safety of the critical section just by taking a mutable reference on
the environment (to block JNI calls).

Users of the API also need to make sure they avoid any system calls that
could depend on other threads (which could lead to a deadlock) and
should generally be aware of the side effects from temporarily disabling
the garbage collector.

This also makes get_array_elements unsafe since there is no built-in
synchronization to avoid data races (also applies to
get_primitive_array_critical)
These names aim to:
1. better highlight the connection between AutoElements and
   AutoElementsCritical (both give temp. access to array elements)
2. better highlight the difference (AutoElementsCritical is an unsafe
   alternative which defines a "critical" section with strict rules)
3. better differentiate these from the new JPrimitiveArray API (and
   type aliases)

Accordingly `JNIEnv::get_primitive_array_critical` has been renamed to
`JNIEnv::get_array_elements_critical`, consistent with
`JNIEnv::get_array_elements` which returns an `AutoArray`.
@rib rib merged commit 6690b6c into jni-rs:master Feb 1, 2023
rib added a commit to jni-rs/jni-sys that referenced this pull request Aug 31, 2023
JNI only strictly defines two valid values for a `jboolean`
and there's no consensus on whether other values greater than one will
be interpreted as TRUE in all situations.

The safest interpretation is to say that it will lead to undefined
behaviour to pass any value besides zero or one as a `jboolean`.

Addresses jni-rs/jni-rs#400
Closes #19
@rib rib mentioned this pull request Jan 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants