Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scattered GPU cache updates #2162

Merged
merged 13 commits into from Dec 9, 2017
Merged

Scattered GPU cache updates #2162

merged 13 commits into from Dec 9, 2017

Conversation

@kvark
Copy link
Member

kvark commented Dec 4, 2017

Fixes #2110

This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to transfer to the GPU in order to reduce the stalls and make us scale better with content (see #2132 (comment)).

Note: in terms of efficiency, rasterizing lines would be much better than points. Leaving this for the follow up. Doing so would require us to upload the source data into a texture as opposed to a buffer, and slightly (but non-critically) complicate things.

WIP, because:

  • benchmarking is TODO
  • to be squashed eventually

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed

r? @glennw


This change is Reviewable

@glennw
Copy link
Member

glennw commented Dec 4, 2017

I like the general idea of this, but I think that F32 targets are not renderable on ES3 - which would mean this isn't usable on ANGLE?

@pcwalton
Copy link
Collaborator

pcwalton commented Dec 5, 2017

Half precision float should be part of GLES3. It's part of WebGL 2. If you can get away with f16 render targets, you should be good.

@kvark
Copy link
Member Author

kvark commented Dec 5, 2017

@pcwalton I don't see RGBA16F supported as color-renderable in GLES 3.1. However, it appears that we can use RGBA32UI and cast/reinterpret the floats to/from it, given that we need no filtering.

@pcwalton
Copy link
Collaborator

pcwalton commented Dec 5, 2017

Oh, I guess not. Well, all hardware I've tested supports half-float as an extension (the straggler being PowerVR, as it only supports R16F, not R32F).

@glennw
Copy link
Member

glennw commented Dec 5, 2017

Unfortunately I don't think ANGLE exposes any extensions for that, which is primarily what this is for.

@pcwalton
Copy link
Collaborator

pcwalton commented Dec 5, 2017

Really? Pathfinder relies on that extension and works on my Windows laptop in WebGL…

@glennw
Copy link
Member

glennw commented Dec 5, 2017

Huh, interesting! Which extension is it you're relying on? I'll check on my Windows laptop if it's present with ANGLE.

@glennw
Copy link
Member

glennw commented Dec 5, 2017

OK, I can see that https://chromium.googlesource.com/chromium/src/gpu/+/master/GLES2/extensions/CHROMIUM/CHROMIUM_color_buffer_float_rgba.txt is available on my Windows laptop. @kvark So it looks like we can render to F32 on ANGLE...

@kvark kvark force-pushed the kvark:gpu-upload branch 6 times, most recently from 27aff45 to 0db9a19 Dec 5, 2017
@kvark kvark changed the title [WIP] Scattered GPU cache updates Scattered GPU cache updates Dec 8, 2017
@kvark
Copy link
Member Author

kvark commented Dec 8, 2017

Here is some good news, bad news, and a somewhat ugly conclusion.

Good

I managed to fix the correctness/Angle issues, making it run now on all platforms. 🎉
Angle is quite picky about the shader code, it decided to ignore my scatter draw call because the VS didn't assign anything to gl_PointSize...

Bad

Benchmarking with MotionMark on a Windows/Angle machine didn't show any speedup. In fact, the scores were even ~10 lower, within the margin of measurement error. On the other hand, we don't appear to be nearly bottlenecked by the GPU cache updates. Compositor time I've seen is either mostly waiting, or doing texture uploads (which have been optimized to be Angle-friendly).

Perhaps, we can come up with a stronger test case that stresses the hell out of GPU cache, e.g. by using a thousand of borders?..

Ugly

The PR brings a bit of a nicer code overall, and it would be nice to reserve this option (to minimize the GPU cache transfers) for the future. So we have a tentative agreement with @glennw to disable it by default and merge.

@kvark kvark force-pushed the kvark:gpu-upload branch from 0db9a19 to 8eeeae0 Dec 8, 2017
@glennw
Copy link
Member

glennw commented Dec 8, 2017

Reviewed 2 of 4 files at r1, 4 of 5 files at r3, 1 of 1 files at r4.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


webrender/res/gpu_cache_update.glsl, line 14 at r4 (raw file):

void main() {
	vData = aValue;

nit: tabs here instead of spaces


webrender/src/device.rs, line 411 at r4 (raw file):

    }

    pub fn streaw_with<'a>(&self, attributes: &'a [VertexAttribute]) -> Stream<'a> {

nit: spelling


webrender/src/renderer.rs, line 74 at r4 (raw file):

pub const MAX_VERTEX_TEXTURE_WIDTH: usize = 1024;
const GPU_CACHE_RESIZE_TEST: bool = false;

Let's add a comment here describing what this is for, if we still need it.


Comments from Reviewable

@glennw
Copy link
Member

glennw commented Dec 8, 2017

@kvark Looks good! Added a few very minor nits, r=me once those are resolved. I leave it to your judgment if we need a try run here :)

Address review notes.
@kvark kvark force-pushed the kvark:gpu-upload branch from 8eeeae0 to 720c58c Dec 8, 2017
@kvark
Copy link
Member Author

kvark commented Dec 8, 2017

Thanks @glennw !
Notes are addressed. The try run has 1 unexpected pass and 1 unexpected failure (in same-filter): https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed&selectedJob=150711942
Not sure if those are expected (will check later, unless you know out of hand).

Also, I believe the scatter-enabled Gecko is a little bit snappier at start, for it uses blitting for GPU texture cache resizing, which helps to grow the height from 512 to 2048 without hitches.

@glennw
Copy link
Member

glennw commented Dec 8, 2017

@kvark Yes, those are expected differences with current WR master.

@glennw
Copy link
Member

glennw commented Dec 8, 2017

@bors-servo
Copy link
Contributor

bors-servo commented Dec 8, 2017

📌 Commit 720c58c has been approved by glennw

@bors-servo
Copy link
Contributor

bors-servo commented Dec 8, 2017

Testing commit 720c58c with merge cb8d8d9...

bors-servo added a commit that referenced this pull request Dec 8, 2017
Scattered GPU cache updates

Fixes #2110

This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to transfer to the GPU in order to reduce the stalls and make us scale better with content (see #2132 (comment)).

Note: in terms of efficiency, rasterizing lines would be much better than points. Leaving this for the follow up. Doing so would require us to upload the source data into a texture as opposed to a buffer, and slightly (but non-critically) complicate things.

WIP, because:
  - benchmarking is TODO
  - to be squashed eventually

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed

r? @glennw

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/2162)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Dec 9, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: glennw
Pushing cb8d8d9 to master...

@bors-servo bors-servo merged commit 720c58c into servo:master Dec 9, 2017
3 of 4 checks passed
3 of 4 checks passed
code-review/reviewable 3 files, 3 discussions left (glennw, kvark)
Details
Taskcluster (pull_request) TaskGroup: success
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
homu Test successful
Details
@kvark kvark deleted the kvark:gpu-upload branch Dec 9, 2017
@staktrace
Copy link
Contributor

staktrace commented Dec 9, 2017

This PR caused panics on windows reftest runs: https://bugzilla.mozilla.org/show_bug.cgi?id=1424280#c8

@staktrace
Copy link
Contributor

staktrace commented Dec 9, 2017

I filed #2208 for this regression with a more useful backtrace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.