Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Android] Cache GPU resources using HardwareBuffer's id as key #50028

Merged
merged 18 commits into from
Jan 27, 2024

Conversation

jonahwilliams
Copy link
Member

@jonahwilliams jonahwilliams commented Jan 25, 2024

Once a hardware buffer has been imported (a VkImage created for it), we don't ever need to re-create a VkImage, even when the contents change. The same hardware buffer can be identified by ID. Part of flutter/flutter#142153

Otherwise we spend a lot of time re-creating VkImages:

image

Draft is here, but is currently leaky: #50028
We only need something like a LRU with the max image size (seems to be 3 for me). This does log locally that I'm not calling close correctly:

E/flutter ( 5580): [ERROR:flutter/shell/platform/android/image_external_texture_vk.cc(51)] Size: 3
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 

FYI @johnmccutchan

@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact "@test-exemption-reviewer" in the #hackers channel in Chat (don't just cc them here, they won't see it! Use Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@jonahwilliams
Copy link
Member Author

This is a bit crashy right now. I think the stack itself is probably bogus but potentially we're holding onto some of these resources for too long now:

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/cheetah/cheetah:13/TQ3A.230901.001/10750268:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm64'
Timestamp: 2024-01-24 22:35:11.448716124-0500
Process uptime: 42s
Cmdline: io.flutter.examples.hello_world
pid: 15610, tid: 15810, name: DartWorker  >>> io.flutter.examples.hello_world <<<
uid: 10259
tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
Abort message: 'Scudo ERROR: corrupted chunk header at address 0x2000073e9eb8390'
    x0  0000000000000000  x1  0000000000003dc2  x2  0000000000000006  x3  000000733d6bd6d0
    x4  0080808080808080  x5  0080808080808080  x6  0080808080808080  x7  8080808080808000
    x8  00000000000000f0  x9  00000075fe3bb9e0  x10 0000000000000001  x11 00000075fe3fd370
    x12 0101010101010101  x13 000000007fffffff  x14 0000000005692542  x15 0000000000000050
    x16 00000075fe46ad50  x17 00000075fe445eb0  x18 00000072b5464000  x19 0000000000003cfa
    x20 0000000000003dc2  x21 00000000ffffffff  x22 0000000000000000  x23 00000072d6b1d79c
    x24 b400007479eb5a30  x25 00000072d6f61b40  x26 b400007419ea7cc8  x27 00000000004c4b40
    x28 00000072d6fde000  x29 000000733d6bd750
    lr  00000075fe3ed1c8  sp  000000733d6bd6b0  pc  00000075fe3ed1f4  pst 0000000000001000
backtrace:
      #00 pc 00000000000531f4  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: dc4001c2ef2dfc23467040797a96840c)
      #01 pc 00000000000418b8  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::die()+8) (BuildId: dc4001c2ef2dfc23467040797a96840c)
      #02 pc 0000000000042260  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::ScopedErrorReport::~ScopedErrorReport()+32) (BuildId: dc4001c2ef2dfc23467040797a96840c)
      #03 pc 00000000000423b0  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::reportHeaderCorruption(void*)+96) (BuildId: dc4001c2ef2dfc23467040797a96840c)
      #04 pc 0000000000044158  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::Allocator<scudo::AndroidConfig, &(scudo_malloc_postinit)>::deallocate(void*, scudo::Chunk::Origin, unsigned long, unsigned long)+296) (BuildId: dc4001c2ef2dfc23467040797a96840c)
      #05 pc 0000000001c8b914  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (std::_fl::__shared_weak_count::__release_shared[abi:v15000]()+52) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)
      #06 pc 00000000020fdd98  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (flutter::CanvasPath::~CanvasPath()+32) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)
      #07 pc 00000000020fddbc  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (flutter::CanvasPath::~CanvasPath()+8) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)
      #08 pc 0000000002498cfc  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (dart::FinalizablePersistentHandle::Finalize(dart::IsolateGroup*, dart::FinalizablePersistentHandle*)+100) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)
      #09 pc 00000000021a266c  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (dart::IsolateGroup::VisitWeakPersistentHandles(dart::HandleVisitor*)+80) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)
      #10 pc 00000000022eccd8  /data/app/~~O5JBwnW-KJeDqYg0DC8ADQ==/io.flutter.examples.hello_world-E_y0zhD3xVStGkB4FW3MGw==/lib/arm64/libflutter.so (dart::Scavenger::IterateWeak()+196) (BuildId: 492348a712f22b2437d66e4e3550b962bab98e6a)

@johnmccutchan
Copy link
Contributor

One meta thought about this change:

Does it need to be Vulkan specific? I wonder if we can just cache the DlImages in the base class and reuse the eglImageKHRs too.

@jonahwilliams
Copy link
Member Author

This appears to approximately work for GL too, with the same caveat that I still get crashes sometimes:

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/coral/coral:13/TP1A.220905.004/8927612:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm64'
Timestamp: 2024-01-25 13:28:57.317064593-0500
Process uptime: 159s
Cmdline: io.flutter.examples.hello_world
pid: 7909, tid: 7936, name: 1.raster  >>> io.flutter.examples.hello_world <<<
uid: 10289
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x00001ee500000011
    x0  00001ee500000011  x1  000000005c000000  x2  b400007b59621b20  x3  00000079e82be1b3
    x4  000000796e2fc978  x5  00000079e70d4b57  x6  0000000000000002  x7  000000796e2fc6a8
    x8  468cdac489e4036e  x9  468cdac489e4036e  x10 468cdac489e4036e  x11 0000000000000000
    x12 0000000000000040  x13 00000079e86368a0  x14 000000796e2fbe80  x15 0000000000000000
    x16 0000000000000001  x17 0000000000000000  x18 000000796dcc8000  x19 0000000000000001
    x20 b400007a3965c5f8  x21 000000796e2fcce0  x22 b400007a796251e0  x23 b400007a99661870
    x24 0000000000000000  x25 b400007aa96508b0  x26 0000000000000000  x27 b400007a996204a8
    x28 0000000000000000  x29 000000796e2fdb40
    lr  0000007971133ca0  sp  000000796e2fcb60  pc  00000079711222a4  pst 0000000060000000
backtrace:
      #00 pc 0000000001d162a4  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (OUTLINED_FUNCTION_3+0) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)
      #01 pc 0000000001d27c9c  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (flutter::DlImage::bounds() const+4) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)
      #02 pc 0000000001c921bc  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (flutter::ImageExternalTexture::Paint(flutter::Texture::PaintContext&, SkRect const&, bool, flutter::DlImageSampling)+136) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)
      #03 pc 0000000001fcf508  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (flutter::TextureLayer::Paint(flutter::PaintContext&) const+124) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)
      #04 pc 0000000001fc9e0c  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (flutter::ContainerLayer::PaintChildren(flutter::PaintContext&) const+84) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)
      #05 pc 0000000001fc8e2c  /data/app/~~X7SInOCGXQBEe3A3NomBTw==/io.flutter.examples.hello_world-YU7EV4Rr6c_2P7COj_BzJA==/lib/arm64/libflutter.so (flutter::ClipShapeLayer<SkPath>::Paint(flutter::PaintContext&) const+212) (BuildId: 6e1612df1780bcd83b80b6653ba3a49146945dfd)

I suspect there is some lifecycle where the texture or dl_image is becoming invalid without the texture view becoming aware of it

@jonahwilliams
Copy link
Member Author

Ahh I need to clear the cache when we reset the dl image.

@jonahwilliams
Copy link
Member Author

Waiting to rebase on the other image reader patch.

@jonahwilliams jonahwilliams marked this pull request as ready for review January 26, 2024 01:02
@jonahwilliams
Copy link
Member Author

Open question is how many different images we'll get from an Image/Camera implementation. This code works fine with a different # of swapchain images than the cache, though if its too small we'll obviously be dropping images unnecessarily.

@jonahwilliams
Copy link
Member Author

I plan to pull the LRU code out into a helper so I can test it more easily, since getting that wrong makes the cache ineffective.

@jonahwilliams
Copy link
Member Author

The difference in behavior is due to the fact that impeller::Textures (which DlImages wrap) own their gl texture handle, whereas SkImages do not. The ImageExternalTextureGLSkia manages a texture handle and that needs to be 1-1 with the DlImage.

@johnmccutchan
Copy link
Contributor

Can confirm that there was more of a flicker on GL when resuming without the code in attach, but now it seems closer to ToT. That said, there is still a slight flicker even without my patch, I'm not sure what the cause of that is.

there is a flicker issue that impacts all Flutter apps on resume. I need to dig into that.

@johnmccutchan johnmccutchan changed the title [Android] re-use already imported hardware buffers. [Android] Cache GPU resources using HardwareBuffer's id as key Jan 26, 2024
@johnmccutchan johnmccutchan added the autosubmit Merge PR when tree becomes green via auto submit App label Jan 26, 2024
@auto-submit auto-submit bot removed the autosubmit Merge PR when tree becomes green via auto submit App label Jan 26, 2024
Copy link
Contributor

auto-submit bot commented Jan 26, 2024

auto label is removed for flutter/engine/50028, due to - The status or check suite Linux linux_clang_tidy has failed. Please fix the issues identified (or deflake) before re-applying this label.

@jonahwilliams jonahwilliams added the autosubmit Merge PR when tree becomes green via auto submit App label Jan 27, 2024
@auto-submit auto-submit bot merged commit bb9a6d1 into flutter:main Jan 27, 2024
30 checks passed
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 27, 2024
@jonahwilliams jonahwilliams deleted the reuse_vk_image branch January 27, 2024 03:23
auto-submit bot pushed a commit to flutter/flutter that referenced this pull request Jan 27, 2024
…142362)

flutter/engine@2687ddb...2adad88

2024-01-27 skia-flutter-autoroll@skia.org Roll Skia from 6279c88b9e29 to b9b80230c87b (4 revisions) (flutter/engine#50110)
2024-01-27 jonahwilliams@google.com [Impeller] add missing barrier to compute tessellator. (flutter/engine#50108)
2024-01-27 jonahwilliams@google.com [Android] Cache GPU resources using HardwareBuffer's id as key (flutter/engine#50028)
2024-01-27 54558023+keyonghan@users.noreply.github.com Move Mac builder_cache to prod (flutter/engine#50044)

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC jacksongardner@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human
is aware of the problem.

To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
@jonahwilliams jonahwilliams added the revert Label used to revert changes in a closed and merged pull request. label Jan 27, 2024
auto-submit bot pushed a commit that referenced this pull request Jan 27, 2024
@auto-submit auto-submit bot removed the revert Label used to revert changes in a closed and merged pull request. label Jan 27, 2024
auto-submit bot added a commit that referenced this pull request Jan 27, 2024
…ey" (#50114)

Reverts #50028
Initiated by: jonahwilliams
This change reverts the following previous change:
Original Description:
Once a hardware buffer has been imported (a VkImage created for it), we don't ever need to re-create a VkImage, even when the contents change. The same hardware buffer can be identified by ID. Part of flutter/flutter#142153

Otherwise we spend a lot of time re-creating VkImages:

![image](https://github.com/flutter/flutter/assets/8975114/700bc0e2-ab00-417e-89c5-04abe7e1db96)

Draft is here, but is currently leaky: #50028
We only need something like a LRU with the max image size (seems to be 3 for me).  This does log locally that I'm not calling close correctly:

```
E/flutter ( 5580): [ERROR:flutter/shell/platform/android/image_external_texture_vk.cc(51)] Size: 3
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
W/System  ( 5580): A resource failed to call HardwareBuffer.close. 
```

FYI @johnmccutchan
auto-submit bot added a commit to flutter/flutter that referenced this pull request Jan 27, 2024
…isions)" (#142366)

Reverts #142362
Initiated by: jonahwilliams
This change reverts the following previous change:
Original Description:

flutter/engine@2687ddb...2adad88

2024-01-27 skia-flutter-autoroll@skia.org Roll Skia from 6279c88b9e29 to b9b80230c87b (4 revisions) (flutter/engine#50110)
2024-01-27 jonahwilliams@google.com [Impeller] add missing barrier to compute tessellator. (flutter/engine#50108)
2024-01-27 jonahwilliams@google.com [Android] Cache GPU resources using HardwareBuffer's id as key (flutter/engine#50028)
2024-01-27 54558023+keyonghan@users.noreply.github.com Move Mac builder_cache to prod (flutter/engine#50044)

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC jacksongardner@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human
is aware of the problem.

To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 27, 2024
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 27, 2024
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 27, 2024
auto-submit bot pushed a commit to flutter/flutter that referenced this pull request Jan 27, 2024
…142369)

flutter/engine@2687ddb...95e9a15

2024-01-27 flar@google.com Cache Impeller paths in the DisplayList to amortize conversion (flutter/engine#50076)
2024-01-27 98614782+auto-submit[bot]@users.noreply.github.com Reverts "[Android] Cache GPU resources using HardwareBuffer's id as key" (flutter/engine#50114)
2024-01-27 skia-flutter-autoroll@skia.org Roll Dart SDK from 58665e3dee42 to 7ae508ee09a3 (1 revision) (flutter/engine#50112)
2024-01-27 skia-flutter-autoroll@skia.org Roll Fuchsia Linux SDK from WHlwlOwznFknNm5IS... to GBTh3gOOgmndwT70X... (flutter/engine#50111)
2024-01-27 skia-flutter-autoroll@skia.org Roll Skia from 6279c88b9e29 to b9b80230c87b (4 revisions) (flutter/engine#50110)
2024-01-27 jonahwilliams@google.com [Impeller] add missing barrier to compute tessellator. (flutter/engine#50108)
2024-01-27 jonahwilliams@google.com [Android] Cache GPU resources using HardwareBuffer's id as key (flutter/engine#50028)
2024-01-27 54558023+keyonghan@users.noreply.github.com Move Mac builder_cache to prod (flutter/engine#50044)

Also rolling transitive DEPS:
  fuchsia/sdk/core/linux-amd64 from WHlwlOwznFkn to GBTh3gOOgmnd

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC jacksongardner@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human
is aware of the problem.

To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
@jonahwilliams jonahwilliams restored the reuse_vk_image branch January 27, 2024 18:06
auto-submit bot pushed a commit that referenced this pull request Jan 30, 2024
Reland of #50028 , which was reverted because I missed the different API level requirement of getId. getID now returns an optional which is nullopt on API level < 31.

Fixes flutter/flutter#142365
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autosubmit Merge PR when tree becomes green via auto submit App e: impeller platform-android
Projects
No open projects
Status: ✅ Done
2 participants