Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support partial rendering of off-screen pictures / render tasks. #2382

Merged
merged 1 commit into from Feb 7, 2018

Conversation

@glennw
Copy link
Member

glennw commented Feb 6, 2018

This allows drawing a render task where the allocated space in
the surface cache is smaller than the required size if we were
to draw the complete contents of a render task.

The primary motivation of this task is to fix some existing bugs.
Specifically, if an off-screen render task is intersecting with
the main screen rect, we would only allocate enough space in the
surface cache texture for the visible portion. Then, the render
task would be drawn, and in some cases could extend outside the
bounds of the allocated rect for the task. This can result in
image corruption - for instance, a long scrolling region which
is drawn to an off-screen target (due to a filter) could draw
outside its region, affecting other tasks in the same surface
cache texture layer.

A second motivation is as a performance and power saving optimization.
In the future, we may want to render and cache "tiles" that can be
cached and provided to the OS compositor interfaces, to avoid work
during scrolling.

Finally, the implementation also contains the infrastructure necessary
to enable us to completely skip texture-cache sub-render tasks when
a frame is re-rendered, which is a performance gain in some cases.

To achieve this:

  • Instead of one alpha-batcher per target, there is one per render task.
  • Each alpha-batcher tracks the task-relative, screen-space bounding
    rect of primitives added to it.
  • Once each alpha-batcher is complete, we check if the allocated rect
    for this task completely contains the combined bounding rect of the
    primitives inside it.
  • If it does contain it (common case) then we run a batch merging
    step where this set of batches is merged with any other alpha batch
    tasks that are also self-contained. This merging step can be very
    simple, since we know there is no overlap ordering constraints.
  • If it doesn't contain it, then we will submit this set of batches
    separately to the merged batch list, applying a scissor rect set to
    the size of the render task allocation.

The batching results should be as good or better than previously, except
in the genuine case where this fixes a bug, in which case there may be
a small number of extra draw calls.

Since the batching of each render task is independent, and also only
references read-only shared state, we can easily run each batch creation
task on a worker thread, if profiling shows any benefit to that.


This change is Reviewable

@glennw
Copy link
Member Author

glennw commented Feb 6, 2018

This has some outstanding Gecko try failures - I need to fix those before this is ready for review / merge.

This also removes the alpha primitive debug option - the information is no longer available in the renderer, and we can always view this in RenderDoc, Frame Analyzer etc now.

When the remaining issues are fixed, https://bugzilla.mozilla.org/show_bug.cgi?id=1429537 will be resolved.

@glennw glennw force-pushed the glennw:rt-scissor3 branch from af8a7a1 to 49df332 Feb 6, 2018
@glennw glennw changed the title [WIP] Support partial rendering of off-screen pictures / render tasks. Support partial rendering of off-screen pictures / render tasks. Feb 7, 2018
@glennw
Copy link
Member Author

glennw commented Feb 7, 2018

Gecko try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=dcb7f90ba3475e3a8216cb58cf72b51e2c44e355

There's a couple of orange results, but they appear to be unrelated intermittents (although @staktrace may like to verify).

r? @kvark

@bors-servo
Copy link
Contributor

bors-servo commented Feb 7, 2018

The latest upstream changes (presumably #2391) made this pull request unmergeable. Please resolve the merge conflicts.

@staktrace
Copy link
Contributor

staktrace commented Feb 7, 2018

Yup, the oranges on the try push are unrelated intermittents.

@kvark
Copy link
Member

kvark commented Feb 7, 2018

The reasoning sounds good to me.
A few notes here since Reviewable appears to bug out and block me from adding some comments:

  • why is AlphaBatchList no longer serializable?
  • could we have a Vec<Rect> per batch (or just one rect?) as opposed to having a separate item_rects: Vec<Vec<Rect>>

Reviewed 12 of 13 files at r1.
Review status: 12 of 13 files reviewed at latest revision, all discussions resolved, some commit checks failed.


webrender/examples/common/boilerplate.rs, line 231 at r1 (raw file):

                    Some(glutin::VirtualKeyCode::B),
                ) => {
                    renderer.toggle_debug_flags(webrender::DebugFlags::ALPHA_PRIM_DBG);

why are we removing this debug toggle?


webrender/src/picture.rs, line 549 at r1 (raw file):

                        let picture_task = RenderTask::new_picture(
                            RenderTaskLocation::Dynamic(None, cache_size),

back to the future? :) I recall us changing from the RenderTaskLocation, and now it's back


webrender/src/renderer.rs, line 3851 at r1 (raw file):

            }

            if let Some(..) = alpha_batch_container.target_rect {

nit: use is_some()?


Comments from Reviewable

@glennw glennw force-pushed the glennw:rt-scissor3 branch from 49df332 to 4b016ee Feb 7, 2018
This allows drawing a render task where the allocated space in
the surface cache is smaller than the required size if we were
to draw the complete contents of a render task.

The primary motivation of this task is to fix some existing bugs.
Specifically, if an off-screen render task is intersecting with
the main screen rect, we would only allocate enough space in the
surface cache texture for the visible portion. Then, the render
task would be drawn, and in some cases could extend outside the
bounds of the allocated rect for the task. This can result in
image corruption - for instance, a long scrolling region which
is drawn to an off-screen target (due to a filter) could draw
outside its region, affecting other tasks in the same surface
cache texture layer.

A second motivation is as a performance and power saving optimization.
In the future, we may want to render and cache "tiles" that can be
cached and provided to the OS compositor interfaces, to avoid work
during scrolling.

Finally, the implementation also contains the infrastructure necessary
to enable us to completely skip texture-cache sub-render tasks when
a frame is re-rendered, which is a performance gain in some cases.

To achieve this:
 * Instead of one alpha-batcher per target, there is one per render task.
 * Each alpha-batcher tracks the task-relative, screen-space bounding
   rect of primitives added to it.
 * Once each alpha-batcher is complete, we check if the allocated rect
   for this task completely contains the combined bounding rect of the
   primitives inside it.
 * If it *does* contain it (common case) then we run a batch merging
   step where this set of batches is merged with any other alpha batch
   tasks that are also self-contained. This merging step can be very
   simple, since we know there is no overlap ordering constraints.
 * If it *doesn't* contain it, then we will submit this set of batches
   separately to the merged batch list, applying a scissor rect set to
   the size of the render task allocation.

The batching results should be as good or better than previously, except
in the genuine case where this fixes a bug, in which case there may be
a small number of extra draw calls.

Since the batching of each render task is independent, and also only
references read-only shared state, we can easily run each batch creation
task on a worker thread, if profiling shows any benefit to that.
@glennw glennw force-pushed the glennw:rt-scissor3 branch from 4b016ee to df6e685 Feb 7, 2018
@glennw
Copy link
Member Author

glennw commented Feb 7, 2018

@kvark Thanks for the review, answers below:

AlphaBatchList is no longer serializable since it's never stored - it's only used during batch building. So I just made AlphaBatchContainer serializable, since that's what is built and stored in a frame.

The Vec<Vec<Rect>> code isn't ideal. I thought it might be a lesser evil than having a Vec<Rect> in the alpha batch, since that would make the merge step more complex / expensive. But it's definitely worth doing some experiments here. Having one rect altogether which is a union may well provide just as good batching as we get now, with less cost. We need to do some testing of this though.

The debug toggle is removed because we no longer have the screen-space bounding rects of alpha items. Once batches are merged, we no longer have the offsets to get the true bounding rect of each item. I figured we probably don't need this anymore, since we can use RenderDoc and friends, and since it would be more CPU cost / memory to re-calculate this information.

I updated the nit to use is_some().

Thanks!

@kvark
Copy link
Member

kvark commented Feb 7, 2018

Ok, sounds good.
I think the alpha debug toggle was quite useful though in some cases, but I agree it's not required.
@bors-servo r+

@bors-servo
Copy link
Contributor

bors-servo commented Feb 7, 2018

📌 Commit df6e685 has been approved by kvark

@bors-servo
Copy link
Contributor

bors-servo commented Feb 7, 2018

Testing commit df6e685 with merge 7ae075a...

bors-servo added a commit that referenced this pull request Feb 7, 2018
Support partial rendering of off-screen pictures / render tasks.

This allows drawing a render task where the allocated space in
the surface cache is smaller than the required size if we were
to draw the complete contents of a render task.

The primary motivation of this task is to fix some existing bugs.
Specifically, if an off-screen render task is intersecting with
the main screen rect, we would only allocate enough space in the
surface cache texture for the visible portion. Then, the render
task would be drawn, and in some cases could extend outside the
bounds of the allocated rect for the task. This can result in
image corruption - for instance, a long scrolling region which
is drawn to an off-screen target (due to a filter) could draw
outside its region, affecting other tasks in the same surface
cache texture layer.

A second motivation is as a performance and power saving optimization.
In the future, we may want to render and cache "tiles" that can be
cached and provided to the OS compositor interfaces, to avoid work
during scrolling.

Finally, the implementation also contains the infrastructure necessary
to enable us to completely skip texture-cache sub-render tasks when
a frame is re-rendered, which is a performance gain in some cases.

To achieve this:
 * Instead of one alpha-batcher per target, there is one per render task.
 * Each alpha-batcher tracks the task-relative, screen-space bounding
   rect of primitives added to it.
 * Once each alpha-batcher is complete, we check if the allocated rect
   for this task completely contains the combined bounding rect of the
   primitives inside it.
 * If it *does* contain it (common case) then we run a batch merging
   step where this set of batches is merged with any other alpha batch
   tasks that are also self-contained. This merging step can be very
   simple, since we know there is no overlap ordering constraints.
 * If it *doesn't* contain it, then we will submit this set of batches
   separately to the merged batch list, applying a scissor rect set to
   the size of the render task allocation.

The batching results should be as good or better than previously, except
in the genuine case where this fixes a bug, in which case there may be
a small number of extra draw calls.

Since the batching of each render task is independent, and also only
references read-only shared state, we can easily run each batch creation
task on a worker thread, if profiling shows any benefit to that.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/2382)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Feb 7, 2018

☀️ Test successful - status-appveyor, status-taskcluster, status-travis
Approved by: kvark
Pushing 7ae075a to master...

@bors-servo bors-servo merged commit df6e685 into servo:master Feb 7, 2018
3 of 4 checks passed
3 of 4 checks passed
code-review/reviewable 6 files, 3 discussions left (glennw, kvark)
Details
Taskcluster (pull_request) TaskGroup: success
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
homu Test successful
Details
@glennw glennw deleted the glennw:rt-scissor3 branch Feb 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.