Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

android platform channels: moved to direct buffers for c <-> java interop #26331

Merged

Conversation

gaaclarke
Copy link
Member

@gaaclarke gaaclarke commented May 21, 2021

This provides the ability to delete the extra data copy when sending platform channel data from C++ to Java. In local testing, platform channels benchmarks with binary codecs and payloads of 1MB this increases speed from 2637.8 µs to 1670.7 µs (~37%).

This work was outlined in the design doc 2021 Platform Channels Tuneup. We can do this safely now because of the refactoring work done in #25860, now we know we have exclusive ownership of the buffers and they won't be modified by someone else.

Since moving everyone to direct ByteBuffers could break code, particularly when using BasicMessageChannels in conjunction with the BinaryCodec, we provide the ability for users to specify if they want direct ByteBuffers or not via the MessageCodec.wantsDirectByteBufferForDecoding method.

issue: flutter/flutter#81559

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on
    writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.
  • The reviewer has submitted any presubmit flakes in this PR using the engine presubmit flakes form before re-triggering the failure.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@google-cla google-cla bot added the cla: yes label May 21, 2021
@gaaclarke gaaclarke changed the title android platform channels: moved to direct buffers android platform channels: moved to direct buffers for c -> java May 21, 2021
@gaaclarke gaaclarke changed the title android platform channels: moved to direct buffers for c -> java android platform channels: moved to direct buffers for c <-> java interop May 21, 2021
@gaaclarke
Copy link
Member Author

Hey @Hixie do you think we should go through the breaking change protocol for this change? It could technically break someone's code but they'd have to be using the Java API for ByteBuffer's incorrectly (caching a ByteBuffer without checking if it is direct) and doing something improbable with the Flutter API (caching the supplied ByteBuffer in a custom implementation of MessageCodec).

@Hixie
Copy link
Contributor

Hixie commented May 21, 2021

The breaking change protocol defines breaking change as something that required a change to tests in flutter/tests or in google3. Did this change require that?

@gaaclarke
Copy link
Member Author

@Hixie nope, thanks.

@gaaclarke gaaclarke force-pushed the platform-channels-direct-buffer branch from 9cb34ba to 3a5e448 Compare May 21, 2021 21:23
@jason-simmons
Copy link
Member

This is risky - embedders may cause random native code crashes if they use these ByteBuffers outside the call to FlutterJNI.handlePlatformMessage.

Hopefully this will work in practice because embedders typically would not do that. But if the consensus is that the performance improvement is worth the risk, then at the very least I'd like to add a big noticeable warning somewhere in the docs so that users know that this ByteBuffer has special restrictions.

Users of ByteBuffers are not expected to call isDirect. Even if they did, normal direct ByteBuffers do not have this restriction. Consumers of a ByteBuffer expect that the contents of the buffer will be valid for as long as they are holding onto a reference to the buffer. But in this case, the memory backing the ByteBuffer will be freed after handlePlatformMessage completes. Anyone who continues to use the ByteBuffer after that will get unpredictable results.

@gaaclarke
Copy link
Member Author

gaaclarke commented May 21, 2021

Thanks Jason, that's certainly an important consideration. Here's my thoughts on it:

This is risky - embedders may cause random native code crashes if they use these ByteBuffers outside the call to FlutterJNI.handlePlatformMessage.

They won't be so random, they will pertain to using the direct ByteBuffer that has been stored beyond the invocation of decode. People with experience with ByteBuffer's would realize they are dealing with a direct ByteBuffer after that crash. The error will be very local to their usage of the ByteBuffer and should be apparent. On Android you'll get a segfault for reading out of memory, or you'll get junk memory (the chance that wouldn't cause an error is very small), or it will work fine in the off chance that the entirety of the memory is malloc'd to someone else that hasn't written to it by the time you use the cached ByteBuffer.

Napkin math calculation of the risk that someone would introduce a bug they didn't notice right away: I suppose maybe 10 our Flutter's 2 million users might actually write a custom MessageCodec. Let's say 1/10 implementers of MessageCodec wouldn't think to look to see if it is a direct ByteBuffer before caching it, maybe 1 out of 1000 custom MessageCodecs would attempt to actually cache the encoded message (just a guess since no one has compelling reason why someone might do that since the method is synchronous). Then the chance that when they read from that memory and it has been reallocated entirely but not written to, 1/1,000,000.

(10 / 2,000,000) * (1/10) * (1 / 1,000) * (1 / 1,000,000) = 5e-16

The chances that someone wouldn't see it the second time they run the code is much lower and we get lower still when considering someone shipping the bug.

Hopefully this will work in practice because embedders typically would not do that. But if the consensus is that the performance improvement is worth the risk, then at the very least I'd like to add a big noticeable warning somewhere in the docs so that users know that this ByteBuffer has special restrictions.

I can add that warning, that's a good idea.

Users of ByteBuffers are not expected to call isDirect. Even if they did, normal direct ByteBuffers do not have this restriction. Consumers of a ByteBuffer expect that the contents of the buffer will be valid for as long as they are holding onto a reference to the buffer. But in this case, the memory backing the ByteBuffer will be freed after handlePlatformMessage completes. Anyone who continues to use the ByteBuffer after that will get unpredictable results.

You don't need to call isDirect to use the ByteBuffer. If you want to store the ByteBuffer beyond the lifetime of the call you do. That is a consideration that is inherent to using ByteBuffers, otherwise there wouldn't be a need to have isDirect. The java documentation says that method is is provided so that explicit buffer management can be done in performance-critical code.

@Hixie
Copy link
Contributor

Hixie commented May 21, 2021

I have no opinion on the specific issue being discussed here, but those numbers seem optimistic. In my experience, if an API can be misused, it absolutely will be misused, and it is guaranteed that that misuse will have the worst possible outcome, and somehow we will end up having to deal with it.

@jason-simmons
Copy link
Member

No - this is not normal for users of direct ByteBuffers. A standard direct ByteBuffer is valid until it is GCed.

Direct ByteBuffers can have performance benefits because JNI APIs can access the raw buffer address of a direct ByteBuffer. So some performance sensitive code might want to call isDirect to check whether that feature is usable. But direct ByteBuffers do not ordinarily violate Java's memory safety guarantees.

@gaaclarke
Copy link
Member Author

No - this is not normal for users of direct ByteBuffers. A standard direct ByteBuffer is valid until it is GCed.

I don't believe so. The reason why is because the JNI api provides no means of communicating when a ByteBuffer is garbage collected. What you say is true only if you only ever use ByteBuffers for regions of memory that are guaranteed to be valid for the full duration of your application. So, someone can't reasonably believe that the direct ByteBuffer will always be valid.

If you really feel strongly about it we could make the MessageCodec specify if it wants a direct byte buffer or not. It's a bit more work and unnecessary at my estimation but if this is a sticking point we could do that.

@gaaclarke
Copy link
Member Author

gaaclarke commented May 22, 2021

Here's what it would look like to make this more robust if you want to get a better picture:

  1. We add boolean wantsDirectByteBufferForDecoding(); to the java MessageCodec interface. Remember there are 2 codecs involved, Dart and Java, the Dart one won't know anything about the preference.
  2. Update all of our codecs to say true, with the exception of BinaryCodec. We'd add a new codec that is called DirectBinaryCodec, it is the same as the BinaryCodec but returns true from wantsDirectByteBufferForDecoding.
  3. When a channel's handler is registered with the DartMessenger, DartMessenger will call into JNI with the result of that method and it will be stored in a static set of channel names that want non direct bytebuffers (since that will be the exception the performance will be better if we are usually checking against an empty set).
  4. When JNI passes up the ByteBuffer to java it will look up the preference via the channel name and either send a direct or indirect ByteBuffer to the MessageCodec.
  5. When the DartMessenger's handler is cleared out it will call down to JNI and potentially erase the field from the static set.

It's kind of funky but it will work. In most cases it will just add a lookup into an empty set with the channel name.

Or we could just put a warning in the docstring in the unlikely event that someone tries this without checking the documentation or checking isDirect.

I think the answer to how long is a direct ByteBuffer suppose to be valid is: it depends on your intention. In this case it is unlikely but possible that it gets in the hands of a users and they don't know the semantics and they mishandle it. There really isn't an inherent semantics for direct ByteBuffers. The added method makes them state their desired semantics.

Let me know what you think @jason-simmons.

@gaaclarke
Copy link
Member Author

One thing I didn't consider is the usage of the BinaryCodec, that passes the ByteBuffer to the user directly. We wouldn't want that ByteBuffer to get deleted under the covers. That leaves us having to create a BinaryCodec and a DirectBinaryCodec and solving Jason's issue. Now that I say this I think I came to this conclusion previously but got sidetracked fighting Android's profiling tools for so long I forgot.

@gaaclarke
Copy link
Member Author

I stubbed in the work I described in #26331 (comment), it isn't all hooked up but you should be able to see the big picture. It worked out a bit cleaner than I had hoped with the set. I'll wrap it up next week.

@jason-simmons
Copy link
Member

I'd like to understand more about any applications that are finding the copying of message contents to be a significant performance issue. Is there something else besides the platform message API that the engine should offer to solve their needs?

What APIs are these applications using to register their message handlers? Are they calling the DartMessenger.setMessageHandler API that gives a reference to the potentially unsafe ByteBuffer directly to the app? Are they using BasicMeessageChannel with BinaryCodec or a custom MessageCodec implementation (which also passes the ByteBuffer)? Or are they using one of the higher level APIs that parses the message and does not provide access to the ByteBuffer?

If the user is aware of the special rules for these buffers, then they can copy the data out of the unsafe buffer into something that will persist beyond the call to handlePlatformMessage. But the user needs to know to do this.
We could special case BinaryCodec and have it copy the buffer before passing it to the app. But the embedding can not detect whether some other implementation of MessageCodec provided by the app is handling these buffers safely.

Another potential option would be giving the BinaryMessageHandler implementations an instance of a ByteBuffer subclass that delegates to the unsafe buffer and can be invalidated before the underlying buffer is freed. But looking at ByteBuffer I don't think this is feasible - ByteBuffer has no public constructors.

So if we want to ensure that Java code can no longer hold references to the native buffer after it is freed, then it looks like that would require changing APIs like BinaryMessageHandler to provide something other than a ByteBuffer. That is probably too big of a breaking change.

@gaaclarke
Copy link
Member Author

I'd like to understand more about any applications that are finding the copying of message contents to be a significant performance issue. Is there something else besides the platform message API that the engine should offer to solve their needs?

Here's the benchmark where we get the speedup:
https://github.com/flutter/flutter/blob/efdbe403307b2e63e14eda78c8d517f8675207e3/dev/benchmarks/platform_channels_benchmarks/lib/main.dart#L157

There is some work looking into synchronous channels that could potentially make FFI easier to use and avoid thread hops but the work is uncertain right now. Kaushik is looking into that and I'm looking into the specific case of Dart->C/Java. I haven't shared my design doc but for a class of users that would be preferable to our current platform channels. I don't think there is anything else.

What APIs are these applications using to register their message handlers? Are they calling the DartMessenger.setMessageHandler API that gives a reference to the potentially unsafe ByteBuffer directly to the app? Are they using BasicMeessageChannel with BinaryCodec or a custom MessageCodec implementation (which also passes the ByteBuffer)? Or are they using one of the higher level APIs that parses the message and does not provide access to the ByteBuffer?

Yea, the test in question is using BasicMessageChannel with the BinaryCodec. For other codecs the removal of the copy doesn't make as big of a difference because the cost of encoding and decoding is so high. FWIW we could reduce that cost when using pigeon since most of the cost is using reflection recursively.

If the user is aware of the special rules for these buffers, then they can copy the data out of the unsafe buffer into something that will persist beyond the call to handlePlatformMessage. But the user needs to know to do this.
We could special case BinaryCodec and have it copy the buffer before passing it to the app. But the embedding can not detect whether some other implementation of MessageCodec provided by the app is handling these buffers safely.

Yep as part of this code review we'll have to audit that they handle it correctly. It looked good to me when I went through. They all parse the ByteBuffer directly. I was thinking this weekend it might just be better to provide a parameter to the constructor for BinaryCodec if you want a direct buffer. That's probably more discoverable.

Another potential option would be giving the BinaryMessageHandler implementations an instance of a ByteBuffer subclass that delegates to the unsafe buffer and can be invalidated before the underlying buffer is freed. But looking at ByteBuffer I don't think this is feasible - ByteBuffer has no public constructors.

Yea, you can't create a subclass of ByteBuffer and allocate it as a direct buffer from JNI. We would have to create some class that has a ByteBuffer, but that would be a massive breaking change.

So if we want to ensure that Java code can no longer hold references to the native buffer after it is freed, then it looks like that would require changing APIs like BinaryMessageHandler to provide something other than a ByteBuffer. That is probably too big of a breaking change.

Yea, exactly.

I'll continue to pull on this thread, this sounds promising and it solves the points you raised. Thanks for pushing on this.

@gaaclarke gaaclarke force-pushed the platform-channels-direct-buffer branch 6 times, most recently from 204e029 to 88aa46c Compare May 25, 2021 00:26
@gaaclarke gaaclarke force-pushed the platform-channels-direct-buffer branch from 88aa46c to 8945151 Compare May 25, 2021 01:10
@gaaclarke
Copy link
Member Author

Existing tests are passing locally. I still have to give it a glance for testing / documentation tomorrow, but all the moving pieces should be in place and operating.

@gaaclarke gaaclarke force-pushed the platform-channels-direct-buffer branch from 852925c to 081f281 Compare May 27, 2021 01:06
@gaaclarke gaaclarke force-pushed the platform-channels-direct-buffer branch from 081f281 to 81d632f Compare May 27, 2021 01:06
@gaaclarke
Copy link
Member Author

I removed wantsDirectByteBufferForDecoding. I still have to do a bit of cleanup thinking about BinaryCodec to make sure everything is good, I'll wrap that up tomorrow.

@gaaclarke gaaclarke added the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label May 27, 2021
@fluttergithubbot fluttergithubbot merged commit a6eb224 into flutter:master May 27, 2021
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request May 27, 2021
fluttergithubbot pushed a commit to flutter/flutter that referenced this pull request May 27, 2021
zanderso added a commit that referenced this pull request May 28, 2021
zanderso added a commit to flutter/flutter that referenced this pull request May 28, 2021
zanderso added a commit to flutter/flutter that referenced this pull request May 28, 2021
zanderso added a commit that referenced this pull request May 28, 2021
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request May 28, 2021
gaaclarke added a commit to gaaclarke/engine that referenced this pull request Jun 1, 2021
@gaaclarke
Copy link
Member Author

The performance tests for this have run now: https://flutter-flutter-perf.skia.org/e/?begin=1621972627&end=1622677144&queries=sub_result%3Dplatform_channel_basic_binary_2host_1MB%26test%3Dlinux_platform_channels_benchmarks&requestType=0

It is only showing a 15% increase instead of a 37% increase. I'll have to double check that I didn't mess up the performance in the process of responding to code reviews.

@gaaclarke
Copy link
Member Author

gaaclarke commented Jun 3, 2021

I ran the tests locally with a Pixel 4a:

non-direct bytebuffer:

I/flutter (29557): ================ FORMATTED ==============
I/flutter (29557): BasicMessageChannel/StandardMessageCodec/Flutter->Host/Small: 205.8 µs
I/flutter (29557): BasicMessageChannel/StandardMessageCodec/Flutter->Host/Large: 2899.1 µs
I/flutter (29557): BasicMessageChannel/BinaryCodec/Flutter->Host/Large: 155.1 µs
I/flutter (29557): BasicMessageChannel/BinaryCodec/Flutter->Host/1MB: 2005.0 µs

direct bytebuffer:

I/flutter (29732): ================ FORMATTED ==============
I/flutter (29732): BasicMessageChannel/StandardMessageCodec/Flutter->Host/Small: 215.3 µs
I/flutter (29732): BasicMessageChannel/StandardMessageCodec/Flutter->Host/Large: 2843.7 µs
I/flutter (29732): BasicMessageChannel/BinaryCodec/Flutter->Host/Large: 133.6 µs
I/flutter (29732): BasicMessageChannel/BinaryCodec/Flutter->Host/1MB: 961.5 µs

That's a 52% reduction. I'm not sure why the devicelab performance tests are showing such wildly different results.

@Hixie
Copy link
Contributor

Hixie commented Jun 3, 2021

Different hardware, different OS, difference in how long since someone touched the display, different temperature in the room, the engine built with slightly different alignment causing the CPU to use different microcode to decode the instructions... any number of things can affect performance numbers unfortunately.

naudzghebre pushed a commit to naudzghebre/engine that referenced this pull request Sep 2, 2021
naudzghebre pushed a commit to naudzghebre/engine that referenced this pull request Sep 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes platform-android waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants