Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mobile: Rewrite Java/JNI stream callbacks #34234

Merged
merged 2 commits into from
May 24, 2024

Conversation

fredyw
Copy link
Member

@fredyw fredyw commented May 17, 2024

This PR rewrites the Java/JNI implementation of the stream callbacks by removing a lot of indirection. This should reduce the number of translation layers, copies, as well as making the code easier to read and maintain. Prior to this PR, the filter and callback implementation was somewhat shared, but it makes the code confusing to read because there's some logic that's only relevant for the filters and vice versa. With the new implementation, the stream callback implementation is separate from the filter implementation, but some common helpers have been written so that the code can still be shared between the stream callback and filter implementations. In the next PR, I will update the filter implementation, so that we can finally get rid of the C wrapper types.

Risk Level: medium
Testing: unit & integration tests
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: mobile

Copy link

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #34234 was opened by fredyw.

see: more, trace.

@fredyw fredyw force-pushed the java_callback_rewrite branch 12 times, most recently from c4c1365 to 687fa1c Compare May 23, 2024 22:19
@fredyw fredyw changed the title [WIP] mobile: Rewrite Java stream callbacks mobile: Rewrite Java stream callbacks May 23, 2024
@fredyw
Copy link
Member Author

fredyw commented May 23, 2024

/retest

@fredyw fredyw changed the title mobile: Rewrite Java stream callbacks mobile: Rewrite Java/JNI stream callbacks May 23, 2024
@fredyw fredyw force-pushed the java_callback_rewrite branch 3 times, most recently from 6593e0e to 5534231 Compare May 24, 2024 01:37
Signed-off-by: Fredy Wijaya <fredyw@google.com>
@fredyw
Copy link
Member Author

fredyw commented May 24, 2024

/retest

@fredyw fredyw marked this pull request as ready for review May 24, 2024 05:56
@fredyw
Copy link
Member Author

fredyw commented May 24, 2024

/assign @abeyad

Copy link
Contributor

@abeyad abeyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is awesome!

mUserCurrentReadBuffer = null; // Avoid the reference to a potentially large buffer.
int dataRead = data.remaining();
// It is important to copy the `data` into the `userBuffer` outside the thread execution
// because the `data` is backed by a direct `ByteBuffer` and it will be destroyed upon once
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/destroyed upon once/destroyed once

Also, was this a problem before this PR? If not, why is it a problem now? I think I'm missing the correlation between this particular change and the rest of the PR

Copy link
Member Author

@fredyw fredyw May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/destroyed upon once/destroyed once

Fixed the typo.

Also, was this a problem before this PR? If not, why is it a problem now? I think I'm missing the correlation between this particular change and the rest of the PR

With the new change, we are passing the direct ByteBuffer into the callback without any copy (no more envoy_data (in C++) to byte[] (in Java)). The direct ByteBuffer is backed by the native C++ Envoy::Buffer::Instance. That also means it's tied to the lifetime of Envoy::Buffer::Instance. In Cronvoy, we run the onData on a separate thread, so we have to make a copy of the ByteBuffer, which is what the existing code does (the old is also making a copy). The new code is simply moving the copy outside the thread because making a copy inside the thread poses a risk of having a dangling ByteBuffer.

int dataRead = data.remaining();
// It is important to copy the `data` into the `userBuffer` outside the thread execution
// because the `data` is backed by a direct `ByteBuffer` and it will be destroyed upon once
// the `onData` completes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is this the only place we'll have buffer copies once everything is in the final desired end state?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup!

* See https://developer.android.com/training/articles/perf-jni#jclass,-jmethodid,-and-jfieldid
*
* Another reason for caching the `jclass` object is to able to find a non-built-in class when the
* native code creates a thread and then attaching it with `AttachCurrentThread`, i.e. calling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/attaching/attaches

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* Another reason for caching the `jclass` object is to able to find a non-built-in class when the
* native code creates a thread and then attaching it with `AttachCurrentThread`, i.e. calling
* `getThreadLocalEnv()->getEnv()->FindClass`. This is because there are no stack frames from the
* application. When calling `FindClass` from the thread, the `JavaVM` will start in the "system"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this comment, very helpful to know! One thing that wasn't clear is what stack frames have to do with the reset of the explanation about FindClass?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://developer.android.com/training/articles/perf-jni#faq:-why-didnt-findclass-find-my-class

If the JNIEnv is attached to the current thread, there's no stack frame and FindClass needs the stack frame to find the class. For built-in classes, e.g. java.util.*, etc this is fine because we look in the system classloader, but for non-built-in classes, this will result in a class not found. Specific to this PR, we need to find EnvoyStreamIntel and EnvoyFinalStreamIntel, which are custom classes. In the next PR, I will update the rest of the code to cache the jclass-es since that's also the recommendation from https://developer.android.com/training/articles/perf-jni#jclass,-jmethodid,-and-jfieldid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another note for this, on Android each app has a separate ClassLoader (for sandboxing purposes). The old code does this:

LocalRefUniquePtr<jclass> findClass(const char* class_name) {
JniHelper jni_helper(JniHelper::getThreadLocalEnv());
LocalRefUniquePtr<jclass> class_loader = jni_helper.findClass("java/lang/ClassLoader");
jmethodID find_class_method = jni_helper.getMethodId(class_loader.get(), "loadClass",
"(Ljava/lang/String;)Ljava/lang/Class;");
LocalRefUniquePtr<jstring> str_class_name = jni_helper.newStringUtf(class_name);
LocalRefUniquePtr<jclass> clazz = jni_helper.callObjectMethod<jclass>(
getClassLoader(), find_class_method, str_class_name.get());
return clazz;
}

It basically tries to read any custom code from the app ClassLoader, which is what the third recommendation says in https://developer.android.com/training/articles/perf-jni#faq:-why-didnt-findclass-find-my-class

Cache a reference to the ClassLoader object somewhere handy, and issue loadClass calls directly. This requires some effort.

My new solution is to actually go with the first recommendation with caching (caching is also better for performance as mentioned in https://developer.android.com/training/articles/perf-jni#jclass,-jmethodid,-and-jfieldid).

Do your FindClass lookups once, in JNI_OnLoad, and cache the class references for later use. Any FindClass calls made as part of executing JNI_OnLoad will use the class loader associated with the function that called System.loadLibrary (this is a special rule, provided to make library initialization more convenient). If your app code is loading the library, FindClass will use the correct class loader.

*
* https://docs.oracle.com/en/java/javase/17/docs/specs/jni/functions.html#findclass
*/
[[nodiscard]] jclass findClassFromCache(const char* class_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't this static if addClassToCache is static?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong reason for it not to be static other than it's a bit nicer being able to call jni_helper.findClass similar like the rest of the code. The JniHelper is meant to be a tiny wrapper around JNI API and findClass is a function in the JNI API, e.g. (env->FindClass) so having jni_helper.findClass makes it more consistent with the JNI API.

In the next PR, I plan to remove findClassFromCache and simply replace findClass implementation with the cache implementation, so the existing code doesn't need to change.

@@ -10,6 +12,7 @@ constexpr jint JNI_VERSION = JNI_VERSION_1_6;
constexpr const char* THREAD_NAME = "EnvoyMain";
std::atomic<JavaVM*> java_vm_cache_;
thread_local JNIEnv* jni_env_cache_ = nullptr;
absl::flat_hash_map<absl::string_view, jclass> JCLASS_CACHES;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how large would this map grow? I'm concerned about keeping it unbounded in size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We control the cache via addClassToCache in the JNI_OnLoad. Right now we only cache 2 classes, but in the future it'll be whatever Envoy Mobile needs, but it'll be around 10-20.

auto java_stream_callbacks_class = jni_helper.getObjectClass(java_stream_callbacks_global_ref);
auto java_on_headers_method_id = jni_helper.getMethodId(
java_stream_callbacks_class.get(), "onHeaders",
"(Ljava/util/Map;ZLio/envoyproxy/envoymobile/engine/types/EnvoyStreamIntel;)V");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we're looking a lot of methods up now by their exact argument types, whereas before it seems we'd look up the same methods without that... is this something required by the new jclass cache? I think I'm not following the new logic here, but I knew very little about JNI so I'm mostly asking for educational purposes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the old code, basically we are passing byte[] key, byte[] value (

void passHeader(byte[] key, byte[] value, boolean start) {
bridgeUtility.passHeader(key, value, start);
}
) and converting them into a Map (
Map<String, List<String>> retrieveHeaders() {
final Map<String, List<String>> headers = headerAccumulator;
headerAccumulator = null;
headerCount = 0;
return headers;
}
).

The old code uses intermediate types and has more translation layers, which this PR is trying to eliminate. For the onHeader case, it will first create a pair of byte[]s and then convert them into Java Map, which I personally think is wasteful and not to mention the code isn't very easy to read.

override fun onHeaders(
headers: Map<String, List<String>>,
endStream: Boolean,
streamIntel: EnvoyStreamIntel
) {
callbacks.onHeaders?.invoke(ResponseHeaders(headers), endStream, StreamIntel(streamIntel))
if (executor != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it looks like we're making the executor optional - what's the advantage of that? Seems to put more onus on the caller. What about having a default executor of DirectExecutor?

Copy link
Member Author

@fredyw fredyw May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are defaulting it to the direct executor, we always have to make a copy since there's no way we can omit the copy or else we'll risk having a dangling ByteBuffer. For Cronvoy, this would result in two copies (one from here and the other one on the Cronvoy side) and this is what we want to avoid. Basically we want to eliminate the copy when it's not necessary, so making it optional is a better design IMO.

Another side note, the C++ API doesn't actually enforce some kind of threading model. It's basically up to the callback implementer to decide whether they need some kind of threading. Although the onus is on the caller, it provides more flexibility IMO.

Signed-off-by: Fredy Wijaya <fredyw@google.com>
@fredyw
Copy link
Member Author

fredyw commented May 24, 2024

/retest

1 similar comment
@fredyw
Copy link
Member Author

fredyw commented May 24, 2024

/retest

Copy link
Contributor

@abeyad abeyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanations!

@fredyw fredyw merged commit c38670b into envoyproxy:main May 24, 2024
35 checks passed
@fredyw fredyw deleted the java_callback_rewrite branch May 24, 2024 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants