Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

removing Image widgets does not reduce Graphics memory allocation #56482

Closed
pcsosinski opened this issue May 6, 2020 · 45 comments · Fixed by #66688 or #67177
Closed

removing Image widgets does not reduce Graphics memory allocation #56482

pcsosinski opened this issue May 6, 2020 · 45 comments · Fixed by #66688 or #67177
Assignees
Labels
a: images Loading, displaying, rendering images c: performance Relates to speed or footprint issues (see "perf:" labels) customer: money (g3) engine flutter/engine repository. See also e: labels. framework flutter/packages/flutter repository. See also f: labels. P0 Critical issues such as a build break or regression perf: memory Performance issues related to memory waiting for PR to land (fixed) A fix is in flight

Comments

@pcsosinski
Copy link

pcsosinski commented May 6, 2020

Internal: b/157467549

If the app was put to background and then resumed, the GPU memory went back down.

not sure what the right approach for this. Changing the widget tree and not seeing an expected memory behavior is hard to explain. Is this a bug? Is it something that should be in documentation?

/cc @cbracken @mehmetf

@liyuqian
Copy link
Contributor

liyuqian commented May 8, 2020

This issue looks similar to #19558. The next steps of fixing this would also be similar:

  1. Put the code of a minimal reproduction app here so we can manually reproduce the issue
  2. Turn that minimal reproduction to a memory perf test similar to "Memory test on scrolling large images quickly Memory test on scrolling large images quickly #46033"
  3. Once we can use the perf test to consistently and quickly reproduce and debug this issue, I'm confident that we'll come up with a fix like "Cleanup the IO thread GrContext App turn into totally black all of a sudden on iPhone X, iOS11 #14265" soon.

@chinmaygarde
Copy link
Member

@liyuqian Isn't this WAI? Even if the images are not referenced in the widget tree, they may be referenced in the image cache in the framework. Putting the application in the background clears the cache. If the application released all ui.Image references and we did not observe a GPU memory decrease, then that would be a problem. That would be the more accurate perf test.

@mehmetf
Copy link
Contributor

mehmetf commented May 9, 2020

Totally fine to close this as WAI but it is surprising from developer's point of view so we might want to think about how to expose this in tooling to guide them.

Image cache gets repeated many times in these conversations; perhaps it deserves its own metrics in the memory section of DevTools @terrylucas

@liyuqian
Copy link
Contributor

liyuqian commented May 9, 2020

@chinmaygarde : if it's image cache that's retaining those images, and there's a way for developers to clear those image cache, then this is totally WAI. A reproduction app would clarify whether it's image cache WAI or some unexpected issues.

@chinmaygarde
Copy link
Member

and there's a way for developers to clear those image cache

There are definitely ways to manage the images in the cache and its limits.

but it is surprising from developer's point of view so we might want to think about how to expose this in tooling to guide them.

This is true. Also, accessing the image caches can be implicit so users may not even be aware that this is whats going on. Also, the image caches are simple LRU caches with arbitrary limits that have no bearing on actual device capabilities. So it is totally possible for users to unknowingly use the image cache all the way to device memory exhaustion. From the engine's perspective, these caches are particularly frustrating because the references to the images in this cache are strong and there aren't too many opportunities to optimize textures in this cache. For example, if the images were to be speculatively evicted from device memory, rendering them again would lead to a massive increase in frame time because the host to device transfer would need to be carried out again. Also, making these images evictable from device memory means that the engine would need to maintain a host side copy of the allocation which would sharply increase heap memory usage. ui.Images were meant to be references to images that are extremely fast to render by referencing them in the widget tree. But they are also expensive in device memory. However, image caches keeps these references around for quite periods well beyond their intended use. In the past, we have done a lot of work in the engine to make it so that allocating and collecting these images is as fast/performant as possible. But till the application/framework holds a strong reference to these images, the issue of device memory usage will keep coming up. If a long term solution to the frequent discussions surrounding device memory usage of image caches is to be devised, it probably involves an alternative to image caches that maintains image references more closely tied to their use in widgets hierarchies. Till then, exposing the cost of images in the cache from tooling seems like a good idea.

To answer the question: "removing Image widgets does not reduce Graphics memory allocation, why?", The TL;DR is widgets don't own the Graphics (GPU/Device) memory allocations for the images they display, strong references to ui.Images do. So not using a widget does not imply that the graphics memory will get collected.

A reproduction app would clarify whether it's image cache WAI or some unexpected issues.

Yes, this issue currently seems like asking a question instead of raising an issue. Lets prepare a reproduction case to ensure we aren't talking about a construction issue.

@liyuqian
Copy link
Contributor

liyuqian commented May 13, 2020

@mehmetf : can you or the "customer: money" team please provide us a reproduction case so we can figure out if it's WAI or an issue of ours?

@liyuqian liyuqian added the waiting for customer response The Flutter team cannot make further progress on this issue until the original reporter responds label May 13, 2020
@mehmetf
Copy link
Contributor

mehmetf commented May 13, 2020

Download a large PNG to your application assets and name it very_large.png. Then run the following code:

import 'package:flutter/material.dart';
import 'package:flutter/services.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        primarySwatch: Colors.blue,
        visualDensity: VisualDensity.adaptivePlatformDensity,
      ),
      home: MyHomePage(title: 'Flutter Demo Home Page'),
    );
  }
}

class MyHomePage extends StatefulWidget {
  MyHomePage({Key key, this.title}) : super(key: key);
  final String title;

  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  int _counter = 0;

  @override
  void initState() {
    imageCache.maximumSize = 0;
    super.initState();
  }

  void _incrementCounter() {
    if (_counter == 10) {
      imageCache.clear();
    }
    setState(() {
      _counter++;
    });
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text(widget.title),
      ),
      body: Column(
        mainAxisSize: MainAxisSize.max,
        children: [
          Text('Images will go here $_counter'),
          if (_counter > 2 && _counter <= 5)
            Image.asset('assets/very_large.png'),
        ],
      ),
      floatingActionButton: FloatingActionButton(
        onPressed: _incrementCounter,
        child: Icon(Icons.ac_unit),
      ),
    );
  }
}

I tried to set the image cache size to 0 and also clear it when the counter hits 10 but none of that helped get the graphics size back down to "normal" levels. In my experiment, before the image is shown graphics memory reported by adb is 55M. After image is shown it is 178M and it stays there no matter what.

In fact, I misspoke earlier. After I put it the app to background and foreground again, the graphics memory goes down but not to initial levels. It stays at 140M.

@liyuqian liyuqian removed the waiting for customer response The Flutter team cannot make further progress on this issue until the original reporter responds label May 14, 2020
@liyuqian liyuqian assigned dnfield and unassigned liyuqian May 18, 2020
@dnfield
Copy link
Contributor

dnfield commented May 18, 2020

I'm going to take a quick look at this - I have a couple ideas about things to try here.

@dnfield
Copy link
Contributor

dnfield commented May 18, 2020

Something strange here. The image cache is not holding a reference to this image for sure - it's not holding anything. Something is though. Trying to figure out what.

@dnfield
Copy link
Contributor

dnfield commented May 18, 2020

By something is, I mean the Dart VM still seems to believe it's live, even though it seems like it shouldn't be.

@dnfield dnfield added dependency: dart Dart team may need to help us framework flutter/packages/flutter repository. See also f: labels. labels May 18, 2020
@mkustermann
Copy link
Member

/cc @rmacnak-google

@dnfield
Copy link
Contributor

dnfield commented May 19, 2020

With an Image.asset, it does eventually get GC'd once the counter increments high enough. The issue is not with image caching. I'm seeing Image.network get retained and not GC'd, filed separate bug for that.

I'd like to think it would get GC'd more quickly due to the size of the image I was working with (it was taking close to 200mb of memory by itself). It was taking several seconds and many many frames after it was gone before getting GC'd. I'm not sure if that's just WAI or if that's reasonable behavior.

/cc @jason-simmons who was also looking at this

@chinmaygarde
Copy link
Member

Can you capture and attach a heap dump from the observatory for analysis here?

@dnfield
Copy link
Contributor

dnfield commented May 19, 2020

@chinmaygarde - it's not showing up in the heap snapshots, but is showing up in the old generation of the allocation profile. Here are the CSV dumps. One from app startup, (1) after the image loads, (2) after the image has been disposed by the framework but you can still see the native memory in the old generation, and (3) after I press the button 20 or 30 more times which eventually triggers a GC and the old gen is cleaned up.

Again, the main concern is that native memory is remaining resident for too long. In a real application, if I had to show that large image and then got rid of it, I need to be able to show more large images again without potentially OOMing in between because Dart has failed to GC the unreferenced memory.

heap_profiles.zip

@dnfield
Copy link
Contributor

dnfield commented May 20, 2020

And in case it's not clear, the memory showing in the old generation is also visible in Xcode - e.g. the SkImage_lazy and its backing texture are still around in memory until the GC finally grabs it.

@chinmaygarde
Copy link
Member

With an Image.asset, it does eventually get GC'd ... . I'm seeing Image.network get retained

Did you also disable the image cache (not sure if you are using Mehmets reduced test case)? I forget which variants consult the cache.

And in case it's not clear, the memory showing in the old generation is also visible in Xcode.

That is to be expected. As long as there is a reference to the object in the Dart VM (even if that is a garbage reference the collector hasn't bothered collecting yet), the engine cannot collect the texture from graphics memory.

eventually triggers a GC and the old gen is cleaned up.

Ok, this is good. It means that when a reference is finally collected, all native allocations including ones resident in graphics memory are collected.

I need to be able to show more large images again without potentially OOMing in between because Dart has failed to GC the unreferenced memory.

Have you actually seen this happening? Before growing the old space, the VM will attempt to release garbage references. Attempting to allocate more large objects will ensure that the old garbage references are released first. Can you setup an example where one large image is loaded after another (make sure the old one is actually collectable otherwise you'll just run out of memory)?

Again, the main concern is that native memory is remaining resident for too long.

Without manual memory management, you are never going to be in complete control of when something gets collected. Some things you can try though:

  • Shell::NotifyLowMemoryWarning should perform more aggressive collection of unused references (in the VM and otherwise). Maybe call that before going into the background.
  • We could add a new method to image (say dispose) that the framework can call manually that makes the engine pre-emptively get rid of the texture allocation and not wait for the garbage collector to finally come along and collect the reference. So semi-manual memory management for textures.
  • Actually kill the Dart VM when going into the background. You'll incur the cost of restarting the VM when you get back which is not fast (thats why we had to add the the Settings::leak_vm flag).

Only the second suggestion will help when not going into the background. But then again, the collector taking a while to actually collect references seems like expected behavior.

After attempting that a reduced test case where very large images are loaded, displayed and unloaded one after another runs fine, I think we can safe say that the answer to the question "removing Image widgets does not reduce Graphics memory allocation, why?" is that the image is actually referenced in a cache or is awaiting collection by the GC, and, no actual bug has been identified.

@dnfield
Copy link
Contributor

dnfield commented May 20, 2020

In the demo app, the image cache is definitely disabled.

The ui.Image has a dispose method, that does this:

https://github.com/flutter/engine/blob/d0b69d076560ba0be6a828bf37fc9ba93722a75e/lib/ui/painting/image.cc#L39-L41

It looks like the framework never calls this. Adding a call to it also doesn't currently cause the native textures to get released (until a GC happens). It's also a little hard to be 100% sure of when that would be safe to call anyway.

Have you actually seen this happening?

I changed the simple demo app from Mehmet so that it displays the image when the counter is at an even number. I occasionally see memory spikes in Xcode that indicate two SkImages are still resident - in fact, I managed to get it to pause at one point where that was still the case:
image

I'm fairly sure that one of those is in the old gen heap and one is the new incoming image - the memory drops down quickly. However, with sufficiently large images this would definitely fail, even if there was lots of wall time between the taps.

@dnfield
Copy link
Contributor

dnfield commented May 20, 2020

I would have expected a call to NotifyIdle to trigger the GC we want here - but I'm not entirely clear on whether that's getting called. At the very least, I'd expect it to get called when we go background if it's not already.

@dnfield
Copy link
Contributor

dnfield commented Aug 31, 2020

This had to get bumped to september to accomodate relanding a new fix. Those patches are currently in progress and should land "soon".

@dnfield
Copy link
Contributor

dnfield commented Sep 8, 2020

Engine changes have landed. #64582 is still under review.

@dnfield
Copy link
Contributor

dnfield commented Sep 22, 2020

Framework side PR has been closed in favor of new engine change to create a disposable image wrapper. That is under review here: flutter/engine#21057

It is currently blocked by #66379 and blocked by general difficulty in keeping the tree open of late.

@dnfield dnfield moved this from To do to In progress in Memory Improvements 2020 Sep 24, 2020
@dnfield
Copy link
Contributor

dnfield commented Sep 25, 2020

engine patches have all landed, final framework patch is posted and under review.

@dnfield dnfield moved this from In progress to Review in progress in Memory Improvements 2020 Sep 25, 2020
@dnfield
Copy link
Contributor

dnfield commented Sep 28, 2020

Framework patch is awaiting another engine roll before it passes CI, but is still available for review. I expect to land it this week, which may mean it slips into the first week of October. Leaving it in September for now.

Memory Improvements 2020 automation moved this from Review in progress to Done Oct 1, 2020
@dnfield
Copy link
Contributor

dnfield commented Oct 1, 2020

Here's a quick overview of the problem before this bug was closed:

  1. dart:ui Image objects are backed natively by SkImage. SkImage is a C++ ref-counted pointer and it is considered immutable internally in Skia.
  2. Although Image has a dispose method, that method only releases one reference on the SkImage. The SkImage will not be destroyed until the refcount is zero on the image, and potentially until we call performDeferredCleanup on the GrContext(s) related to objects referring to the image.
  3. We bump the refcount on an image any time we draw it into a picture.
  4. Pictures are similarly backed by SkPicture objects, which are similarly refcounted. They get retained by Layer objects.
  5. Layer objects have both a Dart and a C++ representation. Some of the C++ representations end up getting decoupled from any specific Dart object, e.g. an engine's idea of a PictureLayer has no direct Dart corresponding layer, but ends up being indirectly retained via various EngineLayer objects, which are retained by various framework Layer objects.
  6. Framework Layer objects are retained by RenderObjects. RenderObjects to not have a clearly defined endpoint in their lifecycle - can get detached, but may be re-attached.
  7. At the core, we rely on the Dart GC to get rid of stale RenderObjects, Layers, and Pictures. Once these things are gone, as long as the Image is not held in e.g. the ImageCache, it can also be collected and the native memory can be collected.
  8. Previously, the Dart VM's GC heuristics relied on us creating new objects ("upward pressure") to know when to run a collection. There was no way to signal it that "downward pressure" existed (i.e., the program believes it has released references even if it doesn't immediately intend to create new ones). This was compounded by the fact that, as more memory gets allocated, the amount memory needed to trigger a collection increases, and so more new memory needs to get allocated.

Here's an overview of what had to happen to fix this, and some of the failed/reverted attempts.

  1. Add a HintFreed API to the Dart VM so that we can signal it when we think memory might be freed. https://flutter.dev/go/downward-memory-pressure, https://dart-review.googlesource.com/c/sdk/+/150503
  2. Use HintFreed at appropriate times in the engine.
  • An initial failed attempt involved trying to call this at every idle period with whatever garbage we thought we freed from the layer tree - Hint freed engine#19842. This would have had the advantage of not needing any changes to the framework or application code, at the cost of potentially doing more GCs than necessary.
  • This freed more memory, but became very CPU intensive as it triggered GCs way too frequently. Reverted here: Revert hint_freed engine#20746
  • A second attempt was landed in Use hint freed specifically for image disposal engine#20754, which focused specifically on using it when Image.dispose is called. This has the advantage of being more focused with GCs (although we will still see some upticks on GCs if many images are used and disposed of), but the disadvantage of requiring changes to the framework to actually dispose of images.
  1. This meant that we had to get more serious about making the framework/applications dispose image objects. But the problem is, you never know if your own reference to an image is exclusive, and the framework was not built with any ownership model of image objects. This is particularly difficult as well because image objects have to be passed into canvas methods to draw, rather than calling a method on an image to draw to a canvas.
  • Initial attempt at doing this only on the framework side. Image dispose #64582 Hacky, doesn't actually prevent framework or application code from disposing image objects out from under holders.
  • Rework dart:ui Image so that it can be safely cloned and disposed instead: Create an ImageHandle wrapper engine#21057
  • Rework the image stream/completer/provider/cache framework to clone handles and dispose of them when done: Dispose of images after using them #66688. A key part of this is a behavioral change in ImageStreamCompleters so that they are now considered disposed if they lose all listeners and no one is specifically asking to keep them alive.

This solution isn't perfect and has room for improvements:

  1. LiveImage tracking can go wrong now, we can and should fix that: The image cache can lose track of a live image that it probably shouldn't lose track of #67069
  2. We should probably more highly prioritize excluding specific images from the cache (It should be possible to exclude an image from the ImageCache #47380), since customers now have very good reasons to avoid caching large images that could be disposed immediately after they're done being used (as long as that large image will not be used again).
  3. The dispose API feels a bit clunky, especially on ImageInfo. ImageInfo is a const-creatable object that contains this stateful image which must get dispsoed.
  4. A dart:ui Image handle should have a debugDispose check. There are places in the framework code that would be helped by this. Add debugDisposed to dart:ui Images #67070

@dnfield
Copy link
Contributor

dnfield commented Oct 1, 2020

This broke CanvasKit and some internal targets. Reverting and reopening while I investigate a reland.

@dnfield dnfield reopened this Oct 1, 2020
Memory Improvements 2020 automation moved this from Done to In progress Oct 1, 2020
Memory Improvements 2020 automation moved this from In progress to Done Oct 5, 2020
@dnfield
Copy link
Contributor

dnfield commented Oct 5, 2020

Fix has relanded, I think fo rreal this time :)

@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 11, 2021
@flutter-triage-bot flutter-triage-bot bot added P0 Critical issues such as a build break or regression and removed P2 labels Jun 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
a: images Loading, displaying, rendering images c: performance Relates to speed or footprint issues (see "perf:" labels) customer: money (g3) engine flutter/engine repository. See also e: labels. framework flutter/packages/flutter repository. See also f: labels. P0 Critical issues such as a build break or regression perf: memory Performance issues related to memory waiting for PR to land (fixed) A fix is in flight
Projects
9 participants