RenderRepaintBoundary.toImage() occasionally returns a blank image #43085

ianpark · 2019-10-20T14:01:45Z

Problem

RenderRepaintBoundary.toImage() occasionally returns a blank image. It happens very often in my app when the app is busy and many other widgets are being rendered together. In the demo app for reproducing the problem it is much rarer, but still reproducible by repeating the test. I've created a package:

https://github.com/ianpark/flutter_capture_bug_demo

A previously closed issue that seems to be the same problem:
#17687

Google team, please raise the priority for this issue:

This issue is a launch blocker for many Flutter projects, including mine, and I am still using v1.7.8+hotfix.4, which is the last stable build where this bug does not exist.

Any app depending on this feature must not upgrade to the next stable build. But if you upgrade your Mac to Catalina, you can no longer run a prod build with v1.7.8+hotfix.4, you cannot upgrade to v1.9.1x because of this issue, and you cannot downgrade to Mojave. So you get stuck.

I really hope the Google team takes this issue seriously and finds a solution.

Steps to Reproduce

As this is a race condition, it does not reproduce in the demo app as often as it does in real apps. However, my demo app can still reproduce the problem, with around 10% of captures failing on a real device.

When toImage() fails, the length of the returned bytes is exceptionally small, usually a few thousand bytes (several KB). When it captures partially, it can be bigger.

In the demo there is a yellow console where you can easily tell when a failure happens, because [nth-try byte-size] is appended to it. There is also a fail % figure, which shows that the results are actually worse on more constrained devices. I have never seen a failure with the prod build of this demo app, but I have seen the problem in my prod app.
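The byte-length check described above can be illustrated outside the app. This is a minimal Python sketch, not the demo app's code: the 10,000-byte threshold, the function names, and the sample sizes are all assumptions chosen for illustration.

```python
from typing import Sequence

# Assumed cutoff: captures smaller than this are treated as blank.
# A sensible value depends on the image content and resolution.
BLANK_THRESHOLD_BYTES = 10_000


def is_blank_capture(byte_length: int, threshold: int = BLANK_THRESHOLD_BYTES) -> bool:
    """Return True when an encoded capture is suspiciously small."""
    return byte_length < threshold


def failure_percent(byte_lengths: Sequence[int], threshold: int = BLANK_THRESHOLD_BYTES) -> float:
    """Percentage of captures classified as blank; 0.0 for an empty run."""
    if not byte_lengths:
        return 0.0
    failures = sum(1 for n in byte_lengths if is_blank_capture(n, threshold))
    return 100.0 * failures / len(byte_lengths)


# Made-up sizes for five save attempts; the second one looks blank.
sample_sizes = [2_400_000, 3_100, 2_390_000, 2_410_000, 2_405_000]
print(f"fail: {failure_percent(sample_sizes):.0f}%")
```

A fixed threshold is only a heuristic. Since partial captures can be bigger than fully blank ones, a size check like this may miss them.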

Please follow these steps to reproduce the problem.

Clone https://github.com/ianpark/flutter_capture_bug_demo
Make sure you are on the stable channel and on 1.9.1+hotfix.2 or a higher version.
Launch the app on an emulator or a real device in debug mode (e.g. flutter run).
Press the Load button and select a large image. 3MB should probably be enough, but it depends on the device performance and memory status. I can even reproduce it with much smaller images.
Press the blue save button multiple times until you see the fail% increase. Pressing the button too fast won't help, because the presses are queued up and handled one by one. A moderate speed is fine.

I intentionally did not display the resulting image in the app, as that might obscure the real problem. Checking the byte length is clearly enough here.

Note that this problem can also be reproduced with the Loop/Stop buttons, which call the function in a loop. However, the chance is very low: it happens about once in a thousand times in my testing. So please use your finger :) This suggests that the race condition is triggered by user interaction handling or animations.

Target Platform:
Android, iOS

Target OS version/browser:
macOS (reproduced on Sierra, Mojave, Catalina)

Devices:
Emulator: Google Pixel 3, Nexus5
Devices: Samsung Galaxy Note, iPhone7

Logs

While reproducing the problem, `flutter run --verbose` does not leave any extra log at all.

$ flutter analyze
Analyzing capture_error_demo...
No issues found! (ran in 1.5s)

$ flutter doctor -v
[✓] Flutter (Channel stable, v1.9.1+hotfix.2, on Mac OS X 10.15 19A602, locale ko-KR)
    • Flutter version 1.9.1+hotfix.2 at /Users/ian/Dev/flutter
    • Framework revision 2d2a1ffec9 (6 weeks ago), 2019-09-06 18:39:49 -0700
    • Engine revision b863200c37
    • Dart version 2.5.0

[✓] Android toolchain - develop for Android devices (Android SDK version 29.0.2)
    • Android SDK at /Users/ian/Library/Android/sdk
    • Android NDK location not configured (optional; useful for native profiling support)
    • Platform android-29, build-tools 29.0.2
    • Java binary at: /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)
    • All Android licenses accepted.

[!] Xcode - develop for iOS and macOS (Xcode 11.0)
    • Xcode at /Applications/Xcode.app/Contents/Developer
    • Xcode 11.0, Build version 11A420a
    ✗ CocoaPods not installed.
        CocoaPods is used to retrieve the iOS and macOS platform side's plugin code that responds to your plugin usage on the Dart side.
        Without CocoaPods, plugins will not work on iOS or macOS.
        For more info, see https://flutter.dev/platform-plugins
      To install:
        sudo gem install cocoapods
        pod setup

[!] Android Studio (version 3.5)
    • Android Studio at /Applications/Android Studio.app/Contents
    ✗ Flutter plugin not installed; this adds Flutter specific functionality.
    ✗ Dart plugin not installed; this adds Dart specific functionality.
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)

Adding the previous audiences:
@tvolkert @aliyigitbireroglu @andreidiaconu @gisinator @benneca @hariprasadiit

gisinator · 2019-10-20T14:11:25Z

Ian, thank you very much for opening this new thread and taking the time to create this demo project. As you say, this is a critical issue that prevents you, me and, it seems, many more people from upgrading to the current stable version. Hopefully the Google team can take care of this soon. Thank you guys in advance for looking into this.

ianpark · 2019-10-21T10:34:24Z

Please ignore the flutter doctor result regarding CocoaPods. It was wiped out by the Catalina upgrade, but I have since fixed it.

ianpark · 2019-10-21T10:36:02Z

@gisinator I really hope this issue is addressed asap. My app is already in prod and I am unable to ship updates for the outstanding issues.

tvolkert · 2019-10-21T13:41:33Z

@cbracken this is a pretty bad regression.

SpiciedCrab · 2019-10-23T14:08:46Z

This also happened to me, so is there any solution for it?

cbracken · 2019-10-23T15:53:07Z

Thanks for reporting this. This is not likely to get worked on by the team in the near future.

We work through issues in priority order starting with TODAY, then customer: blocker then customer: critical. Once those have been burnt through we'll be chasing issues further down the list.

Aside from the above labels, we use thumbs-up reactions as a means of measuring priority so if you're affected by this, adding thumbs-up reactions will help ensure this moves up our list.

cbracken · 2019-10-23T15:57:19Z

FWIW -- if this is a regression, then bisecting where it was introduced would be massively helpful in helping the team produce a fix (or just revert the offending commit).
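For readers unfamiliar with bisecting, the idea is a binary search over the commit history. The sketch below is a generic Python illustration and not the team's tooling: the commit names are placeholders, and `is_bad` stands in for whatever build-and-test procedure decides whether a commit shows the bug.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit after
    the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: placeholder names, with "c5" introducing the bug.
history = [f"c{i}" for i in range(10)]
bad_from = history.index("c5")
print(first_bad_commit(history, lambda c: history.index(c) >= bad_from))
```

Because this bug is intermittent, a real `is_bad` check would need to repeat the capture many times per commit before declaring that commit good.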

xssxxssx · 2019-10-25T06:40:55Z

This also happened to me, so is there any solution for it?

jason-simmons · 2019-10-25T18:55:04Z

Bisected this to flutter/engine@78a8ca0

@gaaclarke

gaaclarke · 2019-10-25T20:33:54Z

I tried to reproduce this on iPhone SE (thanks for the repro code). What I found was more like an 80% failure rate.
I put a breakpoint on OpenGL ES errors and there isn't one getting raised.
Changing the pixelRatio to 1.0 didn't seem to have an effect on the failure rate. If anything it made things worse.
Seen on Engine version 0d43469.
I tried to capture our render frame, but that broke at some point, since Xcode waits forever with a "No capture boundary detected..." message. If we fix that, it should be apparent what is happening.

gaaclarke · 2019-10-25T21:57:30Z

@jason-simmons and I were able to track down the issue using gapid.

If you look at line 22640 you can see we are doing a glClear before we render to our frame buffer, then in 22649 we actually do the draw call. This is the case where everything worked and the glReadPixels on line 22652 gets correct output.

If you look at line 19931 we are performing a glClear, again to prepare for rendering to our frame buffer. Then if you look at line 20012 you'll notice there is another glReadPixels, but between the glClear and the glReadPixels, there is no actual draw call. That is why when we do the glReadPixels, it returns an all black image.

My diagnosis is that for some reason Skia is not performing the glDrawRangeElements call. Perhaps it is detecting some internal error state and is aborting the call? It's not clear from the OpenGL calls why it wouldn't perform the draw.
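The pattern described above (a glClear followed by a glReadPixels with no draw call in between) can be searched for mechanically. The following Python sketch assumes the trace has been exported as a flat list of call names, which is an assumption about tooling rather than a gapid feature, and the two sample traces are made up.

```python
from typing import List, Sequence

DRAW_CALLS = {"glDrawRangeElements", "glDrawElements", "glDrawArrays"}


def reads_without_draw(calls: Sequence[str]) -> List[int]:
    """Return indices of glReadPixels calls with no draw since the last glClear."""
    suspicious = []
    drawn_since_clear = True  # nothing is known before the first glClear
    for i, name in enumerate(calls):
        if name == "glClear":
            drawn_since_clear = False
        elif name in DRAW_CALLS:
            drawn_since_clear = True
        elif name == "glReadPixels" and not drawn_since_clear:
            suspicious.append(i)
    return suspicious


# Made-up traces: one healthy capture and one that reads back a cleared buffer.
good = ["glClear", "glBindTexture", "glDrawRangeElements", "glReadPixels"]
bad = ["glClear", "glBindTexture", "glReadPixels"]
print(reads_without_draw(good), reads_without_draw(bad))
```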

jason-simmons · 2019-10-25T23:28:12Z

When drawing the image, Skia's draw_texture_producer calls GrTextureProducer::refTextureProxyForParams, which invokes SkImage_Lazy::lockTextureProxy.

The image draw succeeds when SkImage_Lazy can obtain a GrTextureProxy from GrProxyProvider::findOrCreateProxyByUniqueKey. If findOrCreateProxyByUniqueKey returns null, then the image draw fails silently, and the bitmap rendered in Flutter's MakeRasterSnapshot will be empty.

SkImage_Lazy::lockTextureProxy has several code paths for obtaining a texture if the GrProxyProvider cache lookup fails. One of these paths calls GrBackendTextureImageGenerator::onGenerateTexture, but that fails because of the fRefHelper->fBorrowingContextID != context->priv().contextID() check.

Possibly this is a sign that the engine is doing something wrong related to how we share image textures between the IO thread's GrContext and the GPU thread's context?

ianpark · 2019-10-26T15:28:59Z

@jason-simmons @gaaclarke thanks for looking into this issue.

ianpark · 2019-10-29T15:44:12Z

This issue needs more love. @gaaclarke do you think using v1.9.1+hotfix.6 with your commit (flutter/engine@78a8ca0) reverted would be too risky?

gaaclarke · 2019-10-29T17:05:29Z

@ianpark Things are slowed down a bit since we've tracked down the issue to Skia. We are working with them to get it triaged and addressed.

Assuming the patch reverts cleanly it should be safe. It is a performance PR so hopefully the only downside would be performance. However, since it changes threading models there is a risk that the code has shifted its assumptions enough already to need to be run with the new threading model.

gaaclarke · 2019-10-29T17:49:49Z

Linked Skia Bug: https://bugs.chromium.org/p/skia/issues/detail?id=9581

ianpark · 2019-10-30T16:00:42Z

@gaaclarke thanks for the update. Sorry, I am getting desperate: I found that the new versions of some other dependencies fail to build with 1.7.8, so my app is becoming more and more isolated over time. I hope the Skia team can turn this around quickly.

brianosman · 2019-10-30T18:12:49Z

Okay, I think the fix may need to happen in Flutter. What's happening:

Cross-context images are designed so that they can only be used by one GrContext at a time. Internally, the image has the actual GL texture. When the image is referenced by a draw with a particular GrContext, we wrap the GL texture in a temporary Skia texture object, and remember the ID of the GrContext that's using it. Once that draw has finished and the Skia texture object is disposed of, we reset the ID on the cross-context image so that a different GrContext can start using it (or the same one, again). This is necessary, because some OpenGL texture state is tied to the texture object itself, so there's no safe way to be using the same texture on two threads at the same time.

The error that's happening suggests that the image is still in use by another GrContext when the raster snapshot is being made. The regressing change moved the raster snapshot logic to the IO thread, so we can assume that the GPU thread has the image in use. The simplest safe fix is to synchronize with the GPU thread when doing a raster snapshot, waiting for it to flush the current frame before trying to do anything.
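The ownership rule described above can be modeled in a few lines. This is a toy Python model, not Skia code: the class, the function names, and the context IDs are hypothetical, and `release` is an explicit call here, whereas in Skia the reset happens when the temporary texture object is disposed of.

```python
from typing import Optional


class CrossContextImage:
    """Toy stand-in for an image that one context may borrow at a time."""

    def __init__(self) -> None:
        self.borrowing_context_id: Optional[int] = None

    def borrow(self, context_id: int) -> bool:
        """Try to start using the image; fail if another context holds it."""
        if self.borrowing_context_id not in (None, context_id):
            return False
        self.borrowing_context_id = context_id
        return True

    def release(self, context_id: int) -> None:
        """Finish a draw, letting any context borrow the image again."""
        if self.borrowing_context_id == context_id:
            self.borrowing_context_id = None


def draw(image: CrossContextImage, context_id: int) -> str:
    """Return 'drawn', or 'skipped' when the image is held elsewhere."""
    if not image.borrow(context_id):
        return "skipped"  # the silent no-op that yields a blank snapshot
    return "drawn"


GPU_CONTEXT, IO_CONTEXT = 1, 2  # arbitrary IDs for this sketch
image = CrossContextImage()

print(draw(image, GPU_CONTEXT))  # GPU thread starts a frame using the image
print(draw(image, IO_CONTEXT))   # snapshot on the IO thread while still in use
image.release(GPU_CONTEXT)       # GPU thread flushes the frame
print(draw(image, IO_CONTEXT))   # snapshot after waiting for the flush
```

The middle call corresponds to the failing case in this thread. The last call corresponds to the proposed fix of waiting for the GPU thread to flush before taking the snapshot.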

gaaclarke · 2019-10-30T18:45:44Z

Thanks @brianosman for looking into it, that makes sense. We can look into that and it shouldn't be too difficult.

Are you sure this isn't something we'd want at the Skia level? Basically a lock around glReadPixels for reading and glBindFramebuffer for writing when it comes to cross-context images.

gaaclarke · 2019-10-30T18:47:09Z

@brianosman At the very least we should make that noop have a warning message, instead of just drawing nothing silently.

brianosman · 2019-10-30T20:13:16Z

@gaaclarke Skia CL to warn in this situation has landed: https://skia.googlesource.com/skia/+/6e1d51a2c74ce26c499cb141920007d3dda11435

jason-simmons · 2019-10-31T00:14:41Z

I think the same SkImage instance is being referred to by the SkPicture passed to Picture.toImage() and by an SkPicture in the layer tree that is rendered onscreen.

If the first SkPicture is rasterized on the IO thread and the second SkPicture is rasterized on the GPU thread, then it's possible for that SkImage to be consumed by both threads concurrently. I don't think we can prevent that without moving Picture.toImage() rendering back to the GPU thread.

brianosman · 2019-10-31T13:10:33Z

As far as I know, the only cross-context images are those explicitly created on the IO thread (for decoding large images that are assets in the application). It would have to be something that went through here: https://github.com/flutter/engine/blob/646b594d5e03e1873bbb021d0f4b2994777808de/lib/ui/painting/image_decoder.cc#L177

gaaclarke · 2019-10-31T17:25:25Z

Here is my theory of what is happening and the bug. If "render to fbo" and "access for drawing" happen at the same time we get the collision we were experiencing. My theory is that the SkPicture and the Texture are generated just for toImage. If we wait for the GPU to finish rendering to that SkPicture we can safely read from it. Let me know if I'm missing something.

gaaclarke · 2019-10-31T17:51:07Z

Okay, I talked with @jason-simmons offline. My whole assumption is wrong. The image in contention is the actual decoded image created by the IO thread, not one that was created as a result of the toImage logic.

This problem occurs because 2 different threads are trying to read from the same texture at the same time, not because one is trying to write to it while another is reading. Here is a Stack Overflow question asking about reading from shared objects and getting a crash. So, it is definitely a thing that could be a problem depending on the OpenGL driver.

Since Skia gives us no visibility into the usage of the texture (it is hidden down in SkPicture internals), there is no meaningful way we can protect usage of the texture. I believe that Skia should implement the mutual exclusion, but it sounds like there is an objection to that which I don't understand yet. I'll follow up with Skia, @brianosman?

In the meantime the only safe thing Flutter can do is to move toImage back onto the GPU thread.

That is a shame because executing toImage on the GPU thread interferes with latency of the first frame, since ShaderWarmUp uses toImage before we render the first frame. There are some benefits of performing it on the GPU thread though, like more efficient shader caching (how good we are at predicting shaders as part of ShaderWarmUp is another question).
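The reason moving toImage back to the GPU thread is safe can be shown with a single-threaded task runner: if frame rendering and snapshots are both posted to the same thread, they cannot touch the shared texture concurrently. The Python sketch below illustrates that idea only. `TaskRunner` is a hypothetical class, not the engine's task runner, and the lambdas stand in for real rendering work.

```python
import queue
import threading
from concurrent.futures import Future
from typing import Any, Callable


class TaskRunner:
    """Runs posted tasks one at a time on a single dedicated thread."""

    def __init__(self, name: str) -> None:
        self._tasks: "queue.Queue" = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            item = self._tasks.get()
            if item is None:  # shutdown sentinel
                return
            task, future = item
            try:
                future.set_result(task())
            except Exception as exc:  # hand failures back to the caller
                future.set_exception(exc)

    def post(self, task: Callable[[], Any]) -> Future:
        """Queue a task and return a Future for its result."""
        future: Future = Future()
        self._tasks.put((task, future))
        return future

    def shutdown(self) -> None:
        self._tasks.put(None)
        self._thread.join()


gpu_runner = TaskRunner("gpu")  # hypothetical stand-in for the GPU thread

# Both the frame render and the snapshot are posted to the same runner,
# so they execute one after the other on the same thread.
frame = gpu_runner.post(lambda: threading.current_thread().name)
snapshot = gpu_runner.post(lambda: threading.current_thread().name)
print(frame.result(), snapshot.result())
gpu_runner.shutdown()
```

The cost is the one noted above: snapshot work queued on that thread delays whatever is queued behind it, such as the first frame.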

liyuqian · 2019-10-31T20:19:00Z

I concur that moving toImage back to the GPU thread seems to be the best solution right now. Sorry for not catching this issue while we were reviewing flutter/engine#9813. For that, I really appreciate your reproduction app, @ianpark ! Let's put that into our golden test to safeguard similar issues in the future, @gaaclarke .

cmkweber · 2019-10-31T20:27:31Z

Would moving back to the GPU thread allow an easier fix to: #40990?

This would just involve adding another endpoint to return the Picture on the GPU thread without proceeding to rasterization.

tvolkert · 2019-11-01T00:33:58Z

@gaaclarke was this meant to be closed?

ianpark · 2019-11-01T04:46:36Z

@gaaclarke I really appreciate you diving deep into this problem with the Skia guys. I learnt quite a bit from your comments. It's a shame that your performance fix is reverted, but it will unblock many other devs and apps, so I am very happy. :)

I now clearly understand the root cause, and I think reverting the code was the best move, not just to unblock users but also to avoid misleading fellow devs who will work in this area. I too got the impression from your code that calling a Skia rendering API from the IO thread was somehow safeguarded by Skia.

@tvolkert I think this issue should remain open until another stable release with this patch is available to the public; that would help prevent duplicate bug reports. Do you plan to release another stable release of 1.9.1 with this patch, or would it be part of the next version?

@liyuqian happy to hear that my repro code was useful :) thx!

gaaclarke · 2019-11-01T17:14:43Z

@tvolkert Yep, the merged PR fixes it.

gaaclarke · 2019-11-01T17:19:58Z

@ianpark You're welcome. We close out issues once they are fixed on master.

ianpark · 2019-11-06T13:58:08Z

@tvolkert could you briefly explain the release plan of the fix for this issue?

ianpark · 2019-11-14T00:22:50Z

@tvolkert @gaaclarke any update on releasing the fix for this issue? I am happy to patch my local Flutter but not sure how to build the stable Flutter 1.9.1 with the patched engine that is 100% compatible with Flutter 1.9.1. Is there a good guideline for doing that?

tvolkert · 2019-11-14T05:31:18Z

@ianpark the fix is on versions v1.10.16 and higher, which means it's currently on the dev channel. It should land on the beta channel in a few weeks and the stable channel a few weeks after that.

ianpark · 2019-11-17T20:00:06Z

@tvolkert thanks for the information. I was trying to find a way to patch the fix on top of the latest stable release. Probably I should just wait for now.

rockingdice · 2019-11-26T02:26:23Z

With this fix, can I capture an image without any workarounds?
Like described in the SO question: https://stackoverflow.com/questions/57645037/unable-to-take-screenshot-in-flutter

They use debugNeedsPaint to check whether the image is safe to obtain, or just add a delay before getting the image.
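The workaround pattern mentioned here (check a needs-paint flag, wait, then capture) looks roughly like the following. This is a language-neutral sketch in Python, not Flutter code: `needs_paint` stands in for reading debugNeedsPaint, `capture` stands in for the toImage call, and the attempt count and delay are arbitrary assumptions. Whether this pattern avoids the thread race discussed in this issue is not established here.

```python
import time
from typing import Callable, Optional


def capture_when_painted(
    needs_paint: Callable[[], bool],
    capture: Callable[[], bytes],
    attempts: int = 5,
    delay_seconds: float = 0.02,
) -> Optional[bytes]:
    """Capture once painting has finished; give up after `attempts` checks."""
    for _ in range(attempts):
        if not needs_paint():
            return capture()
        time.sleep(delay_seconds)  # wait for the pending paint, then re-check
    return None


# Hypothetical stand-ins: the boundary still needs painting for two checks.
pending = iter([True, True, False])
result = capture_when_painted(lambda: next(pending), lambda: b"png-bytes")
print(result)
```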

gisinator · 2019-12-16T19:13:04Z

@ianpark the fix is on versions v1.10.16 and higher, which means it's currently on the dev channel. It should land on the beta channel in a few weeks and the stable channel a few weeks after that.

May I ask if someone with current stable v1.12.13+hotfix.5 could verify that this issue was resolved?

benneca · 2019-12-16T21:59:00Z

@ianpark the fix is on versions v1.10.16 and higher, which means it's currently on the dev channel. It should land on the beta channel in a few weeks and the stable channel a few weeks after that.

May I ask if someone with current stable v1.12.13+hotfix.5 could verify that this issue was resolved?

It appears to be working in stable v1.12.13+hotfix.6, but the same issue still persists with Google Maps.

klaszlo8207 · 2020-01-15T11:10:03Z

I need to screenshot my GoogleMaps widget, but I get only a black/blank image via Image.memory. Any help with this?

benneca · 2020-01-15T17:42:19Z

I need to screenshot my GoogleMaps widget, but I get only a black/blank image via Image.memory. Any help with this?

The only workaround that I am aware of at this time is to create your own "screenshot" plugin that captures the flutterView. Of course, it requires that you manually crop the result to get only the area of the Google Maps widget in your final image. It is far from ideal, but to my knowledge it is the only option if you want to use Google Maps. Alternatively, you can use flutter maps, and your standard RepaintBoundary will work fine.
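The manual crop step amounts to cutting a rectangle out of the full-view pixel grid. Below is a minimal Python sketch of that step, assuming the screenshot is available as rows of RGBA tuples. The function name, the tiny sample image, and the rectangle are made up, and this is not the plugin's code.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int, int]  # RGBA


def crop(pixels: Sequence[Sequence[Pixel]], left: int, top: int,
         width: int, height: int) -> List[List[Pixel]]:
    """Return the width x height sub-rectangle whose top-left is (left, top)."""
    if min(left, top, width, height) < 0:
        raise ValueError("crop rectangle must be non-negative")
    rows = pixels[top:top + height]
    if len(rows) < height or any(len(row) < left + width for row in rows):
        raise ValueError("crop rectangle exceeds the image bounds")
    return [list(row[left:left + width]) for row in rows]


# A made-up 4x3 "screenshot" whose pixel values encode their coordinates.
screenshot = [[(x, y, 0, 255) for x in range(4)] for y in range(3)]
map_area = crop(screenshot, left=1, top=1, width=2, height=2)
print(map_area)
```

On a real device the widget's bounds would likely need scaling by the device pixel ratio before cropping, since a native screenshot is measured in physical pixels.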

klaszlo8207 · 2020-01-16T08:41:49Z

@benneca How? Please share the source code.

I tried this:

https://gist.github.com/slightfoot/8eeadd8028c373df87f3a47bd4a35e36

It did not work.

benneca · 2020-01-16T14:22:04Z

@klaszlo8207 take a look at this plugin, this is very similar to the approach I took
https://pub.dev/packages/screenshot_and_share

haifang12 · 2020-12-16T12:54:57Z

Try calling RenderRepaintBoundary.toImage() twice.

github-actions · 2021-08-08T08:01:31Z

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

ianpark changed the title ~~RenderRepaintBoundary.toImage() occasionally returns blank image~~ RenderRepaintBoundary.toImage() occasionally returns a blank image Oct 20, 2019

janmoppel added the a: images, customer: crowd, and framework labels Oct 21, 2019

tvolkert added the engine and c: regression labels and removed the framework label Oct 21, 2019

tvolkert added this to the Goals milestone Oct 23, 2019

gaaclarke added the dependency: skia label Oct 25, 2019

Joao-b4 mentioned this issue Oct 31, 2019

sometimes not save with the background image, just the drawing ja2375/painter2#3

Closed

This was referenced Oct 31, 2019

Moved toImage back to the GPU thread. flutter/engine#13465

Closed

Revert 78a8ca0f62b04fa49030ecdd2d91726c0639401f flutter/engine#13467

Merged

gaaclarke closed this as completed Nov 1, 2019

jason-simmons mentioned this issue Nov 1, 2019

Canvas.drawImageRect() briefly blanks out UI.Image data #30697

Closed

chinmaygarde mentioned this issue Nov 4, 2019

Perform device to host texture transfer on the IO thread. #44148

Open

github-actions bot locked as resolved and limited conversation to collaborators Aug 8, 2021
