Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rendering in background isolates #21

Closed
blaugold opened this issue Jan 3, 2022 · 26 comments
Closed

Rendering in background isolates #21

blaugold opened this issue Jan 3, 2022 · 26 comments

Comments

@blaugold
Copy link

blaugold commented Jan 3, 2022

After looking at how tiles are rendered and thinking a bit about how to reduce jank, I feel that vector tiles probably should not be rendered on the UI thread.

  1. Jank is not only caused by the UI isolate, but also by the raster thread.

    I suspect one factor for this is shader compilation jank. Because maps are so irregular, I'm not sure how much SkSL warm up can help here.

    The other problem is that, if multiple complex enough tiles are painted for the first time in the same frame, it just takes too long to raster them.

  2. Rendering tiles to images often blocks the UI thread.

    This could be offloaded to a background isolate, but we would need to have copies of themes, tilesets and all the cached data in the UI isolate and the background isolate.

Proposal

Each Tileset gets its own isolate, which is used to read it, preprocess it and render it to images at different zoom levels.
This parallelizes all the stages of processing.

ui.Images cannot be passed between isolates. They have to be encoded in one isolate and decoded in the other.
The raw TypedData itself can be transferred between isolates without copying. It is an open question how much of a performance penalty encoding and decoding ui.Images imposes.

The UI isolate just schedules tiles to be rendered when necessary and composes the images.

The TileImageProvider provides a simple interface for the UI isolate to access cached images and request rendering of new ones:

/// The data that describes the contents of a rendered tile image.
///
/// Two tile images with the same [TileImageDescriptor] must have identical contents.
abstract class TileImageDescriptor {
  /// The id of the theme the tile was rendered with.
  String get themeId;

  /// The zoom level at which the tile was rendered.
  ///
  /// The zoom level must be >= 0.
  double get zoom;

  /// The x coordinate of the rendered tile.
  int get x;

  /// The y coordinate of the rendered tile.
  int get y;

  /// The width and hight of the rendered image, in physical pixels.
  int get size;
}

/// The image of a rendered tile.
abstract class TileImage {
  /// Describes the contents of this image.
  TileImageDescriptor get descriptor;

  /// The actual image of the tile.
  ///
  /// Don't dispose this image. This is handled by [release].
  ui.Image get image;

  /// Retains this image.
  ///
  /// An image must only used while it is retained.
  void retain();

  /// Releases this image.
  ///
  /// An image must be released when it is no longer needed.
  void release();
}

/// Provides images of rendered tiles.
abstract class TileImageProvider {
  /// Returns the immediately available rendered tile images.
  ///
  /// An image from the returned list must be retained before it can be used.
  List<TileImage> get images;

  /// Returns the rendered tile image with given [descriptor].
  ///
  /// The returned image is already retained and must be released by the caller.
  Future<TileImage> requestImage(TileImageDescriptor descriptor);
}

Both background isolates and rendered images can be cached. The caching policy can make decisions based on which tile images are being retained/released.

I have not though it all through, but this approach would make it practically impossible for map rendering to cause jank.

@FaFre
Copy link
Contributor

FaFre commented Jan 3, 2022

There was recently a discussion about the use of isolates in another issue here.
It seems like right now there is no possibility to even import dart:ui in non-main isolates. There are several open issues for this: flutter/flutter#10647 (comment) flutter/flutter#75755

But maybe this has changed recently through the new isolates features?

@blaugold
Copy link
Author

blaugold commented Jan 3, 2022

@FaFre No, you're right :(
For some reason, I thought the only limitation is not being able to pass objects from dart:ui directly between isolates.

But if I understand this comment correctly, this can still work. Background isolates just cannot be launched directly from the UI isolate. The platform implementation of a Flutter plugin can launch an arbitrary number of Flutter views, each with its own UI isolate. They are probably not cheap, so they would have to be pooled. The UI isolate that is actually rendering the app just needs to be able to communicate with the background Flutter views.

@FaFre
Copy link
Contributor

FaFre commented Jan 3, 2022

Yeah, I was suggesting something similar a couple of weeks ago, until @filipproch told us about the current state. It was discussed on the other open issue here, in case you are interested.

Wow, the comment you found is really interesting and would be maybe the only approach working around the current limitations.

@greensopinion
Copy link
Owner

Great to see the discussion and ideas!

I'm not sure if you've seen it, but I created a branch with an isolate experiment just yesterday.
It doesn't go so far as to try to render on a separate isolate, but it is a proof-of-concept that some processing can occur off of the UI thread.

Feel free to brainstorm and try out different ideas.

@blaugold
Copy link
Author

blaugold commented Jan 4, 2022

I've seen that branch. It looks promising and more doable in the near term. 👍

@ibrierley
Copy link

Just a few thoughts, as I've been following this thread with interest.

For what it's worth, I dabbled with Isolates about 6 months ago (but only for processing a vector tile to png), with my old hacky vector tiles which I'm not really updating atm (relevant code is at https://github.com/ibrierley/flutter_map_vector_tiles/blob/filters/lib/VectorTileWidget.dart lines 169-246 & 320-370. Having a quick glace at David code, mine is obviously a lot messier :D.

I experimented with both passing data in/out of the isolate via the Port, and also simply saving the created Image as a file, and decoded layers as json file (and just sending a confirmation via the Port that its been created). That way the main isolate could just load in image (which would probably be in the o.s disk/mem cache, so probably quite fast. Not sure I found any particular noticable difference between the two visually. Naturally this helped the most when rendering vector->image, rather than pure vector.

Overally it kinda felt it helped a bit, but I was never quite sure if I was pulling resources away somehow and adding some other complexity where more could go wrong and it felt a bit harder to understand then what was happening sometimes with delays.

It's also worth reminding what the main UI janks are down to (or even if its creating an image in an isolate, why is may be slow).

By far for me, the main issue was drawing lines more than 1px thick in busy area tiles like capital cities. This introduces a pretty resource heavy set of calculations which hairlines can optimise away. It also may be worth batching multiple lines of the same class all into one big superpath that get drawn with one call (if not already done), but I think the gains on that are smaller than getting around fat lines somehow.

I tended to feel that the things that felt the best for me, were detecting a drag and then only drawing hairline lines until end of the drag, or eliminating thick lines altogether where possible, but naturally that may not be possible depending on use case.

@blaugold
Copy link
Author

blaugold commented Jan 4, 2022

Good to see that it's possible to use dart:ui in background isolates, through the flutter_isolate package.

I experimented with both passing data in/out of the isolate via the Port, and also simply saving the created Image as a file, and decoded layers as json file (and just sending a confirmation via the Port that its been created). That way the main isolate could just load in image (which would probably be in the o.s disk/mem cache, so probably quite fast. Not sure I found any particular noticable difference between the two visually. Naturally this helped the most when rendering vector->image, rather than pure vector.

Saving the rendered images to disk is probably a good idea. That should work well with caching.

It's also worth reminding what the main UI janks are down to (or even if its creating an image in an isolate, why is may be slow).

When all the heavy processing is handled somewhere other than the UI isolate, there really should not be any jank, meaning dropped frames. Tiles might render with a delay when panning and zooming, but that's the case anyway when loading tiles from the network.

There are still things to optimize in the way tiles are rendered. It might be better to implement those first and see if there is still a need to go further. Currently, tiles are fully redrawn when inputs change. It should be possible to use composition layers to redraw theme layers individually, for example.

@FaFre
Copy link
Contributor

FaFre commented Jan 4, 2022

I use Isolates in kinda the same use case. I parse incoming MVT tiles and post process incoming points and linestrings using some heavy algorithms.

So I did some (naive and rudimentary) tests with isolates, that could help here. In my tests I didn't gain any performance improvements from spinning up an isolate for the purpose of processing a single task, that took less than around 200ms to process. While reducing jank, the time spent for spinning up an isolate is pretty high, at least in debug. Release may put things in a different light.

My solution was dedicating a single isolate to my Stream, bound to the lifetime of the Stream. New events are getting communicated over the SendPortto the isolate, so that basically .map() is happening inside the isolate. Thats working pretty decent and I'm happy with the performance for now.

However, knowing a bit more about isolates now I would probably try a different approach.
With the new isolate features introduced in the last stable update, it's now possible to pass return values between isolates without copying the data (This is happening in my Stream approach all the time, as far as I know).

So I would try something like this:

  1. The amount of tiles to process is deterministic, depending on screen resolution. So I would spin up maxTiles * 3 isolates and put them into an stack, that they are ready and waiting for an incoming task.
  2. Map is loaded or user panning around, each incoming tile pops an isolate from the stack, executes, and kills afterwards, returning the result.
  3. When processing the tile batch is done, the stack gets filled up again with fresh isolates, so everything is ready when user is panning around again.

Spinning up even hundreds of isolates is no problem and due to the last update also very easy on memory. Spinning up can probably also be done by an isolate in case spinning up costs time on main-isolate.

But there is another downside working with short-running isolates: Debugging. I experienced lots of weird side effects, for example isolates are not returning when awaited, or need to get pinged with pause/resume via debugger, to finish their jobs. In my experience, with my setup (Linux + vscode) it lead to additional debugging time. Running tests with isolates is also pretty unstable at the moment (flutter/flutter#95331). However, in release I didn't encounter any issues so far.

@blaugold
Copy link
Author

blaugold commented Jan 4, 2022

The downside of using a fresh isolate for each render pass is that you can't build up caches. A tile might need to be rendered many times while zooming between zoom levels.

Maybe the state can be persisted efficiently, but I'm not sure you gain all that much through Isolate.exit, when passing straight binary data. There already is TransferableTypedData for that purpose.

I think Isolate.exit is primarily for large object graphs.

@FaFre
Copy link
Contributor

FaFre commented Jan 4, 2022

Caching could be done using the identity of each tile (x, y ,z) and done even before processing. For zooming in between full tile zooms, canvas.scale() should do the job. But I'm not sure, if this is applicable to the current rendering.

TransferableTypedData is very interesting, didn't know that.

@blaugold
Copy link
Author

blaugold commented Jan 4, 2022

You're right, caching the rendered images shouldn't be a problem.

But to render a tile, feature geometry and properties need to be decoded. For each theme layer, the features to render have to be resolved as well. Depending on a few factors, all this can take more time than to do the actual drawing.
The TilesetPreprocessor allows us to do those things only once.

@greensopinion
Copy link
Owner

There are some great ideas in this thread.

I've reworked how tile data is provided to tiles in commit bb955ab with the intention of making it easier to introduce isolate-based offloading and other optimizations.

Here's my rough plan going forward:

  1. fix eliminate flicker when zooming #22 since it's visually disorienting
  2. introduce some kind of executor, probably based on executor.dart so that we can drop in different processing models
  3. introduce a direct executor - i.e. one that doesn't use isolates
  4. add a switch so that a direct executor can be used in debugging
  5. look at which processing could be moved to isolates - current thinking is protobuf decoding and tile preprocessing
  6. for each item from step 5, reorganize the code to go through an executor

If you have any feedback on the above or if you want to be involved, let me know!

Another place to look is Paragraph.layout() because it's a big contributor to profiler flame graphs. It seems unreasonably expensive for the relatively simple labels that we are rendering on map tiles. Anyone have any ideas how to make that faster? I was looking at the native code for Paragraph and there's way more to it than I expected. You can find the relevant code in paragraph_txt.cc and Layout.cpp.

@ibrierley
Copy link

R.e paragraph layout, not sure where that is used, is that part of a normal TextPainter layout ? If so, I know a TextPainter layout is expensive, and I think I cached each labels textPainter in mine (not sure if you're doing that already).

@greensopinion
Copy link
Owner

I've just pushed a commit that puts protobuf decoding on an isolate (but only when not debugging) 4cf1dd1

I'd appreciate any feedback!

Next step is to put tile preprocessing on an isolate.

@greensopinion
Copy link
Owner

R.e paragraph layout, not sure where that is used, is that part of a normal TextPainter layout ? If so, I know a TextPainter layout is expensive, and I think I cached each labels textPainter in mine (not sure if you're doing that already).

Yes, it's part of initializing a TextPainter. With your caching, what was the cache key?

@ibrierley
Copy link

Apologies, thinking about it, I maybe leading you astray with that thought, I was thinking more in terms of when I was drawing pure vectors every frame (rather than drawing vectors to an image) I simply stored the textPainter in a tile cache with other stuff, rather than some clever rejigging of the textPainter somehow. This wouldn't help at all with speeding up the process of rendering to an image.

@FaFre
Copy link
Contributor

FaFre commented Jan 9, 2022

I also hit some performance problems with TextPainter's layout() and used to cache it right after laying out.

My key looks like this, but is optimized for my use case:

class TextRenderParameter with FastEquatable {
  final double maxWidth;
  final int? maxLines;
  final TextDirection textDirection;
  final String? elipsis;

  TextRenderParameter(
      {this.maxWidth = double.infinity,
      this.maxLines,
      this.textDirection = TextDirection.ltr,
      this.elipsis});

  @override
  bool get cacheHash => true;

  @override
  List<Object?> get hashParameters =>
      [maxWidth, maxLines, textDirection, elipsis];
}

class _RenderCacheKey with FastEquatable {
  final String text;
  final Size constraints;
  final TextStyle defaultStyle;
  final TextRenderParameter renderParameter;

  _RenderCacheKey(
      this.text, this.constraints, this.defaultStyle, this.renderParameter);

  @override
  bool get cacheHash => true;

  @override
  List<Object?> get hashParameters =>
      [text, constraints, defaultStyle, renderParameter];
}

You may notice the FastEquatable mixin, it's a small library I developed to boost equality performance for "immutable" objects (not immutable per language definition). It uses an optimized equality comparison together with hash code caching. Use case is especially larger sets or maps, or/with complex keys like above.

In case you are interested, take a look here: https://github.com/FaFre/fast_equatable it will decimate a lot of hash code operations in performance profiling :)
I also added a small and very basic benchmark in examples, comparing it the well known equality package, which it is able to outperform by nearly 40% (almost linear) for very simple objects. The compared performance will even increase the more data there is to compare.

However, its not pushed to pub.dev yet. Let me know when you see use in it and I will get the package ready to publish it.

@greensopinion
Copy link
Owner

greensopinion commented Jan 10, 2022 via email

@greensopinion
Copy link
Owner

With the recent changes, we now have tile preprocessing and tile loading on background isolates.
There is an issue with memory usage that may be related to recent changes: sometimes when zooming, memory usage spikes temporarily. Sometimes this can be severe enough for the example app to crash on ios. I'm going to close this issue based on recent changes, open a new one for memory usage. Feel free to help investigate if you're interested. If you can see other opportunities for isolate-based processing or other optimizations, feel free to open a new issue.

Thanks for all of your help to date, I appreciate it.

@greensopinion
Copy link
Owner

I just realized that closing this issue effectively closed off the conversation. Reopening so that we can continue!

I've been doing some work on #24 to reduce memory consumption and will merge to main soon. Part of that work resulted in deduplication of processing - which can reduce overhead in some cases.

I've got isolates working pretty well, but will have them disabled for now. It's easy enough to change the isolate model by editing https://github.com/greensopinion/flutter-vector-map-tiles/blob/optimizations/lib/src/executor/executor.dart#L36

Any feedback on the current approach would be helpful.

@greensopinion greensopinion reopened this Jan 21, 2022
@greensopinion
Copy link
Owner

I've been working with some experimental changes on a branch that significantly reduce memory usage and improve performance. flutter-vector-map-tiles/tree/memory-investigation
If you're using this with your own codebase, you'll want to set renderMode: RenderMode.layered_vector to take advantage of some of these changes.

I'd love to get some feedback, specifically:

  • how is rendering performance - any jank while loading tiles or zooming?
  • what do you think of the new layered rendering mode - it renders labels on a slight delay while interacting with the map
  • how is memory usage, have you been able to reproduce issue investigate excessive memory usage while zooming #24?

@blaugold
Copy link
Author

blaugold commented Feb 7, 2022

how is rendering performance - any jank while loading tiles or zooming?

For me, zooming has the most notable jank. I noticed that tiles are always rendered as 256x256 pixels. Mapbox vector tiles are supposed to be rendered as 512x512 pixels at an integer zoom level, and contain the according number of details. This means that for Mapbox tiles, 4 times more memory and computation is used than necessary. I experimented with setting all tileSize constants to 512 and it makes a huge difference.

what do you think of the new layered rendering mode - it renders labels on a slight delay while interacting with the map

I think that's a good idea. You could even apply that to more layers.

how is memory usage, have you been able to reproduce issue #24?

I have not been able to reproduce this, but I have submitted a PR with which I see a reduction of the memory occupied by lists from ~60MB to ~15MB, in the example app.

@greensopinion
Copy link
Owner

Thanks for the feedback!

I experimented with setting all tileSize constants to 512 and it makes a huge difference.

Do we need a setting for this? How do other mapping libraries handle this?

I think that's a good idea. You could even apply that to more layers.

The way that it works now, labels are not rendered while zooming. The user experience of disappearing labels seems fine (in my limited testing of 2 people, no once noticed) but having other map features disappear might be a bit disorienting. What do you have in mind?

but I have submitted a PR

Very cool! I've commented on the PR

@blaugold
Copy link
Author

blaugold commented Feb 7, 2022

Some jank is coming from the raster thread, when it is not able to render everything fast enough. From my experience, this often happens during zooming because many tiles are re-rendered at once. When a map area becomes visible for the first time, layers could be rendered in groups in multiple frames.

@greensopinion
Copy link
Owner

In case anyone is interested in providing feedback, I have a proposal on #31

@greensopinion
Copy link
Owner

Closing since isolates are now used by default and can be controlled with the concurrency option

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants