Make JObject and JNIEnv !Copy to prevent use-after-free #398
Cool, thanks for creating the PR for this work. I think I'll try and port the Android backend for Bluey (https://github.com/rib/bluey) to this branch to help get a feel for this and I have a work-in-progress Winit branch I may update too (which I was getting frustrated with due to the terrible ergonomics of AutoLocal). If you have any real-world code using jni-rs that you can port over to this too it would be good if you're able to try an initial port to see if it helps us catch something we've overlooked. A few minor things initially:
Considering the foot gun with the |
So far in updating the Android backend for Bluey it's been a fair number of changes, which have been fairly straightforward. One place where I did scratch my head for a while was with updating some native JNI functions that had a nested closure for catching
impl BleSessionNative {
    extern "C" fn on_companion_device_select(
        mut env: JNIEnv,
        session: JObject,
        session_handle: JHandle<AndroidSession>, // (jlong wrapper)
        device: JObject,                         // BluetoothDevice
        address: JString,
        name: JString,
    ) {
        // Like a try{} block since the JNI func doesn't return Result
        let result = (|| -> Result<()> {
            let address = env
                .get_string(&address)?
                .to_str()
                .map_err(|err| {
                    Error::Other(anyhow!("JNI: invalid utf8 for returned String: {:?}", err))
                })?
                .to_string();
            Ok(())
        })();
        // ...
    }
}
and getting a compiler error like:
and it took me a while to realize that I can no longer rely on the default lifetime elision behaviour, which binds each unnamed lifetime to its own unique lifetime. I had to update the signature to look like this instead:
extern "C" fn on_companion_device_select<'local>(
    mut env: JNIEnv<'local>,
    session: JObject<'local>,
    session_handle: JHandle<AndroidSession>, // (jlong wrapper)
    device: JObject<'local>,                 // BluetoothDevice
    address: JString<'local>,
    name: JString<'local>,
) {
}
to make sure all the object references were bound to the same I think I saw that you updated the docs to highlight that JNI functions will need to declare the
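To make the elision issue concrete, here is a minimal sketch that uses std-only stand-ins rather than the real jni-rs types. `Env`, `Obj`, `read_id` and `native_named` are hypothetical names, and the `read_id` signature is an assumption chosen to reproduce the effect: a `&mut self` method whose argument has to share the environment's lifetime.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `JNIEnv<'local>` (not the real jni-rs type).
pub struct Env<'local> {
    _frame: PhantomData<&'local ()>,
}

/// Hypothetical stand-in for `JObject<'local>`; deliberately not `Copy`.
pub struct Obj<'local> {
    id: u32,
    _frame: PhantomData<&'local ()>,
}

impl<'local> Obj<'local> {
    pub fn new(id: u32) -> Self {
        Obj { id, _frame: PhantomData }
    }
}

impl<'local> Env<'local> {
    pub fn new() -> Self {
        Env { _frame: PhantomData }
    }

    /// Assumed signature: the object must belong to the same frame as `self`.
    /// `&mut Env<'local>` is invariant in `'local`, so the compiler cannot
    /// adjust the environment's lifetime to match an unrelated one.
    pub fn read_id(&mut self, obj: &Obj<'local>) -> u32 {
        obj.id
    }
}

// With elided lifetimes every parameter gets its own anonymous lifetime:
//
//     fn native_elided(mut env: Env, obj: Obj) -> u32 {
//         env.read_id(&obj) // rejected: the two lifetimes are unrelated
//     }
//
// This is the same situation as `fn f(mut v: Vec<&str>, s: &str) { v.push(s) }`.

/// Naming one `'local` and using it everywhere ties the references together.
pub fn native_named<'local>(mut env: Env<'local>, obj: Obj<'local>) -> u32 {
    env.read_id(&obj)
}
```

The sketch leaves out `extern "C"` because the stand-in types are not FFI-safe; the lifetime reasoning is the same either way.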
Updates JNI code for non-Copy object references + plumbing mutable JNIEnv references for all APIs that may create new local references. Ref: jni-rs/jni-rs#398 Ref: jni-rs/jni-rs#392
Will do, but I have some other work on my project that I need to finish first. I'm hoping to have it done this week, and then I'll migrate it over. I'll let you know when I'm done or if anything comes up.
Done and done.
Sounds like a good idea. I never used
I think this was caused by the lifetimes on |
I've ported over my project to use this new Nested
|
Okay, yeah, this makes sense. Conceptually it feels a bit more correct to tie the lifetimes together explicitly in the signature of the native function, but at the same time it also makes sense that
Right, this looks good too. |
While looking at updating the Winit branch I have then I'm reminded that the ergonomics for In this case I have code that's running in a thread that doesn't just return to Java (so there's no implicit local frame) and instead of creating lots of The poor ergonomics (imho) come from not having an elegant way to generically return whatever you'd like from the I'm wondering if the Just thinking out loud at the moment, this probably doesn't need to be addressed as part of this PR. |
Is this what you have in mind?
pub fn with_local_frame_2<T, F>(&mut self, capacity: i32, f: F) -> Result<(T, JObject<'local>)>
where
    F: for<'new_local> FnOnce(&mut JNIEnv<'new_local>) -> Result<(T, JObject<'new_local>)>,
{
    unsafe {
        self.push_local_frame(capacity)?;
        let res = f(self);
        match res {
            Ok((t, obj)) => Ok((t, self.pop_local_frame(&obj)?)),
            Err(e) => {
                self.pop_local_frame(&JObject::null())?;
                Err(e)
            }
        }
    }
}
This would have been a big footgun before this PR, because you could easily return an invalid local reference through
fn example(env: &mut JNIEnv) -> Result<()> {
    let (obj, _): (JString, JObject) = env.with_local_frame_2(1, |env| {
        let s = env.new_string("hi")?;
        Ok((s, JObject::null()))
    })?;
    Ok(())
}
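As a rough illustration of why the higher-ranked `for<'new_local>` bound stops a frame-local reference from escaping through the generic return type, here is a std-only model. `Env`, `Obj`, `new_obj`, `live_refs` and `sum_ids` are hypothetical stand-ins, and the "frame" is only a counter, not a real JNI local frame.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `JNIEnv<'local>`; it only counts live local refs.
pub struct Env<'local> {
    live_refs: usize,
    _frame: PhantomData<&'local ()>,
}

/// Hypothetical stand-in for a local reference; neither `Copy` nor `Clone`.
pub struct Obj<'local> {
    id: usize,
    _frame: PhantomData<&'local ()>,
}

impl<'local> Obj<'local> {
    pub fn id(&self) -> usize {
        self.id
    }
}

impl<'local> Env<'local> {
    pub fn new() -> Self {
        Env { live_refs: 0, _frame: PhantomData }
    }

    /// Creating a reference needs `&mut self`, mirroring the rule in this PR.
    pub fn new_obj(&mut self) -> Obj<'local> {
        self.live_refs += 1;
        Obj { id: self.live_refs, _frame: PhantomData }
    }

    pub fn live_refs(&self) -> usize {
        self.live_refs
    }

    /// `T` is fixed before `'new_local` is introduced, so `T` cannot mention
    /// `'new_local` and an `Obj<'new_local>` cannot be returned through it.
    pub fn with_local_frame<T, F>(&mut self, f: F) -> T
    where
        F: for<'new_local> FnOnce(&mut Env<'new_local>) -> T,
    {
        let saved = self.live_refs; // models PushLocalFrame
        let ret = f(&mut *self);
        self.live_refs = saved; // models PopLocalFrame: frame refs are freed
        ret
    }
}

/// Returns plain data from the frame; the two references die with it.
pub fn sum_ids(env: &mut Env) -> usize {
    env.with_local_frame(|env| {
        let a = env.new_obj();
        let b = env.new_obj();
        // Ending the closure with `a` instead of a number would not compile.
        a.id() + b.id()
    })
}
```

Passing a local reference back out on purpose needs a separate channel whose type does mention `'new_local`, which is the role of the extra `JObject<'new_local>` in the tuple above.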
At least something along these lines. Instead of returning a tuple from the closure I was wondering about a second argument that would act as an out parameter for optionally returning a local reference. At least so far in the places I've wanted to use Another consideration is that even without that feature it's still possible to send global references back (and since those release on drop I probably wouldn't be worried about leaking global references) - so the feature is mainly just an optimization (though at least for me it's an optimization for something I haven't needed yet) Maybe there could even be a separate version that would support passing a local reference back. |
Maybe it wouldn't really be unreasonable to just not expose that feature of being able to pass back a local reference - and instead just suggest that a global reference can be used instead. This could simplify the API for common cases perhaps. Another API could be added later if there really are some use cases where it's critical to optimize the efficiency of passing back a single local reference without going via a global reference. |
So thinking of simplifying it to something like:
/// Executes the given function in a new local reference frame, in which at least a given number
/// of references can be created. Once this method returns, all references allocated
/// in the frame are freed.
///
/// If a frame can't be allocated with the requested capacity for local
/// references, returns `Err` with a pending `OutOfMemoryError`.
///
/// Local references created within this frame won't be accessible to the calling
/// frame, so if you need to pass an object back to the caller you can do that via a
/// [`GlobalRef`] / [`Self::make_global`].
pub fn with_local_frame<F, R>(&mut self, capacity: i32, f: F) -> Result<R>
where
    F: FnOnce(&mut JNIEnv) -> R,
{
    unsafe {
        self.push_local_frame(capacity)?;
        let ret = f(self);
        self.pop_local_frame(&JObject::null())?;
        Ok(ret)
    }
}
So then code like:
let mut result = None;
jenv.with_local_frame(10, |env| {
    result = Some(Self::handle_io_request_with_jni_local_frame(env, state, session, request, peripheral_handle, ble_device));
    Ok(JObject::null())
})?;
result.unwrap()
can become:
jenv.with_local_frame(10, |env| {
    Self::handle_io_request_with_jni_local_frame(env, state, session, request, peripheral_handle, ble_device)
})?
I can do that, sure. Seems like a shame to give up the performance gain of returning a local reference, though. Remember that global references are about an order of magnitude slower to create and delete. The
We could support both:
pub fn with_local_frame<F, R>(&mut self, capacity: i32, f: F) -> Result<R>
where
    F: for<'new_local> FnOnce(&mut JNIEnv<'new_local>) -> Result<R>,
{
    let (r, _) = self.with_local_frame_returning_local(capacity, |env| {
        let r = f(env)?;
        Ok((r, JObject::null()))
    })?;
    Ok(r)
}

pub fn with_local_frame_returning_local<F, R>(&mut self, capacity: i32, f: F) -> Result<(R, JObject<'local>)>
where
    F: for<'new_local> FnOnce(&mut JNIEnv<'new_local>) -> Result<(R, JObject<'new_local>)>,
{
    unsafe {
        self.push_local_frame(capacity)?;
        let res = f(self);
        match res {
            Ok((r, obj)) => Ok((r, self.pop_local_frame(&obj)?)),
            Err(e) => {
                self.pop_local_frame(&JObject::null())?;
                Err(e)
            }
        }
    }
}
What do you think?
Even though there's a performance difference between global/local refs in normal usage, there are other factors here too:
Those two things together make me tend to think you'd be hard pushed to find a use case that would ever notice the performance difference, but maybe if there are enough times where it's exactly what you want it might still be good to have an alternative API for that case. Instead of stacking the APIs, though, I think it could be better to just keep them separate and simplify them both. I also think the closures shouldn't really be returning a If we have a version for returning a local reference, that one probably doesn't need to return a tuple, but it could be good if it returned a (I think these same issues with the
Here's an initial patch with the |
Making a micro benchmark to roughly quantify the difference between returning a global reference that is then converted into a local vs directly returning a local, I get numbers like:
So apparently it's about 26% slower to return a global and then create a local from that vs returning a local. At least for me, that tends to make me think it's super unlikely for that kind of difference to matter in practice, but it doesn't really do any harm to have two versions.
Just to avoid getting too side-tracked with the Any initial thoughts on https://github.com/rib/jni-rs/tree/general-with-local-frame would be great to hear (esp if we think we should change something related to making references non-Copy) but hopefully we can aim to land this soon without making this discussion a dependency for that. In terms of porting code over to this branch for smoke testing I'm currently also using the above I'll probably try to make another review pass over everything - mainly trying to think about the pov of people porting code from 0.20 - but I think atm that this is probably good to land soon. |
Ok. I've opened #399 to continue the discussion about In the mean time, is there anything else you'd like me to do with this PR? |
…#392). Fix local reference leaks (jni-rs#109). Extensive breaking API changes (see changelog). See jni-rs#392 for the discussion that led to this commit. Closes jni-rs#392 Closes jni-rs#384 Closes jni-rs#381 Closes jni-rs#109 Co-authored-by: Robert Bragg <robert@sixbynine.org>
I've changed the commit message to mention that this also fixes #384. (I didn't notice that issue until just now.) |
Not atm, and hopefully it'll be good to land as is. I was just aiming to get time to take another pass over it. I'm a little busy this week but hopefully we can land this pretty soon and then tie up a few loose ends before hopefully looking at a 0.21 release too. |
use log::{debug, warn};

use crate::{errors::Result, objects::JObject, sys, JNIEnv, JavaVM};

// Note: `GlobalRef` must not implement `Into<JObject>`! If it did, then it would be possible to
// wrap it in `AutoLocal`, which would cause undefined behavior upon drop as a result of calling
// the wrong JNI function to delete the reference.
Instead of having this documented special-case consideration that we need to be careful to not forget, I wonder if we should look at adding an IsLocalRef trait (or similar) that can be used like a label/tag for all the types that can be safely wrapped via AutoLocal.
We can consider this in a follow up issue / PR.
I don't think that's possible. It's unsafe to wrap JObject<'static> in AutoLocal, but it's not unsafe to wrap JObject with any other lifetime in AutoLocal, and I don't think there's any way to implement a trait for all lifetimes except 'static.
We could add another wrapper type, Local<'local, T>, where T is JObject or something that wraps it, and 'local is the local reference frame it was created in. Then, JNIEnv::auto_local accepts Local<'local, T> and returns AutoLocal<'local, T>. JObject and the other object wrapper types would no longer have a lifetime parameter and would become borrowed-only types like str, never appearing in owned form.
This would also make it possible to generify GlobalRef and WeakRef without using a GAT to change the lifetime.
This would be quite the footgun, though, since every extern fn that takes a parameter of type JObject/JClass/etc has to be changed to take Local<JObject>/Local<JClass>/etc, and if you forget any, you get undefined behavior! Rust has a way to solve this problem, extern type, but it's unstable and has been for a long time.
Okay, I had another read through the PR and also gave one of my projects (Bluey-UI) a smoke test on Android after updating it to use the latest branch based on this work (from #399). There will be a few things to follow up on related to this before making the next 0.21 release, and it'll probably be good to think about having some guidance / hints to add to the release notes about some of the churn that downstream users should expect when updating. Big thanks @argv-minus-one for working through all of the knock-on details that arose while making this change! This definitely feels like a step in the right direction for making
This was recently marked as deprecated since landing jni-rs#398 meant we could also get a reference to a JObject via an implementation of `AsRef<JObject>`. Since there are multiple `AsRef<T>` implementations for `GlobalRef` there are times when the compiler isn't able to infer which implementation of `.as_ref()` you are trying to call, and so it's still convenient to have a less ambiguous `.as_obj()` API that can be used instead. Closes: jni-rs#408
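The inference problem described in that commit message can be reproduced with plain std traits. In this sketch `JObject` and `GlobalRef` are hypothetical stand-ins, and the second `AsRef` impl is assumed purely to create the ambiguity.

```rust
/// Hypothetical stand-in for `JObject`.
pub struct JObject {
    raw: usize,
}

/// Hypothetical stand-in for `GlobalRef`.
pub struct GlobalRef {
    obj: JObject,
}

impl AsRef<JObject> for GlobalRef {
    fn as_ref(&self) -> &JObject {
        &self.obj
    }
}

// A second `AsRef` impl, assumed here only for illustration.
impl AsRef<GlobalRef> for GlobalRef {
    fn as_ref(&self) -> &GlobalRef {
        self
    }
}

impl GlobalRef {
    pub fn new(raw: usize) -> Self {
        GlobalRef { obj: JObject { raw } }
    }

    /// Inherent accessor with a fixed return type, so nothing needs inferring.
    pub fn as_obj(&self) -> &JObject {
        &self.obj
    }
}

pub fn raw_via_as_obj(global: &GlobalRef) -> usize {
    // `global.as_ref().raw` would need a type annotation here, because a field
    // access alone does not tell the compiler which `AsRef` target is meant.
    global.as_obj().raw
}

pub fn raw_via_as_ref(global: &GlobalRef) -> usize {
    // The trait route still works once the target type is spelled out.
    let obj: &JObject = global.as_ref();
    obj.raw
}
```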
Overview
The over-arching change here is that local reference types (like JObject) are !Copy and any APIs that may allocate new local references must take a mutable reference to the JNIEnv.
These rules then mean you can no longer easily access a copy of a local reference after it has been deleted, and also mean you can't easily copy local references outside of a local reference frame created using with_local_frame (where they would be invalid).
An inter-related change also addresses the leak described in #109, as the Desc trait was updated to account for non-Copy reference types.
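A small std-only model of the first rule, under the assumption that deleting a reference consumes it by value. `Env`, `Obj`, `new_obj`, `live` and `create_and_delete` are hypothetical stand-ins, not the jni-rs API.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `JNIEnv<'local>`; it only counts live local refs.
pub struct Env<'local> {
    live: usize,
    _frame: PhantomData<&'local ()>,
}

/// Hypothetical stand-in for `JObject<'local>`: no `Copy`, no `Clone`.
pub struct Obj<'local> {
    id: usize,
    _frame: PhantomData<&'local ()>,
}

impl<'local> Env<'local> {
    pub fn new() -> Self {
        Env { live: 0, _frame: PhantomData }
    }

    /// May allocate a new local reference, so it takes `&mut self`.
    pub fn new_obj(&mut self) -> Obj<'local> {
        self.live += 1;
        Obj { id: self.live, _frame: PhantomData }
    }

    /// Takes the reference by value: after this call the caller no longer has it.
    pub fn delete_local_ref(&mut self, obj: Obj<'local>) {
        drop(obj);
        self.live -= 1;
    }

    pub fn live(&self) -> usize {
        self.live
    }
}

pub fn create_and_delete(env: &mut Env) -> usize {
    let obj = env.new_obj();
    let id = obj.id;
    env.delete_local_ref(obj);
    // Reading `obj.id` at this point would be rejected as a use of a moved
    // value. If `Obj` were `Copy`, a stale copy could still be used here.
    id
}
```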
This PR has the changes discussed in #392.
This PR is the result of squashing a branch into a single commit. Here is the original, unsquashed branch.
Closes #392
Closes #384
Closes #381
Closes #109
Definition of Done