Scattered GPU cache updates #2162
|
I like the general idea of this, but I think that F32 targets are not renderable on ES3 - which would mean this isn't usable on ANGLE? |
|
Half precision float should be part of GLES3. It's part of WebGL 2. If you can get away with f16 render targets, you should be good. |
|
@pcwalton I don't see RGBA16F supported as color-renderable in GLES 3.1. However, it appears that we can use RGBA32UI and cast/reinterpret the floats to/from it, given that we need no filtering. |
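To illustrate the reinterpretation idea above, here is a minimal CPU-side sketch in Rust. It is not code from this PR; the function names `pack_texel` and `unpack_texel` are made up for illustration. It only shows that moving floats through unsigned 32-bit integers is a pure bit-for-bit copy, which is why it only works when no filtering or blending touches the data. On the shader side, GLSL ES 3.00 provides `floatBitsToUint` and `uintBitsToFloat` for the same purpose.

```rust
/// Reinterpret four f32 values as the u32 bit patterns that an
/// RGBA32UI texel would hold. No numeric conversion takes place.
/// (Hypothetical helper, not part of WebRender.)
fn pack_texel(values: [f32; 4]) -> [u32; 4] {
    values.map(f32::to_bits)
}

/// Inverse of `pack_texel`: recover the original floats from raw bits.
fn unpack_texel(bits: [u32; 4]) -> [f32; 4] {
    bits.map(f32::from_bits)
}

fn main() {
    let original = [0.5_f32, -1.25, 1024.0, f32::MIN_POSITIVE];
    let texel = pack_texel(original);
    let restored = unpack_texel(texel);
    // The round trip is exact because only the interpretation changes.
    assert_eq!(original, restored);
    println!("{:?} -> {:x?}", original, texel);
}
```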
|
Oh, I guess not. Well, all hardware I've tested supports half-float as an extension (the straggler being PowerVR, as it only supports R16F, not R32F). |
|
Unfortunately I don't think ANGLE exposes any extensions for that, which is primarily what this is for. |
|
Really? Pathfinder relies on that extension and works on my Windows laptop in WebGL… |
|
Huh, interesting! Which extension is it you're relying on? I'll check on my Windows laptop if it's present with ANGLE. |
|
OK, I can see that https://chromium.googlesource.com/chromium/src/gpu/+/master/GLES2/extensions/CHROMIUM/CHROMIUM_color_buffer_float_rgba.txt is available on my Windows laptop. @kvark So it looks like we can render to F32 on ANGLE... |
Branch updated: 27aff45 to 0db9a19
|
Here is some good news, some bad news, and a somewhat ugly conclusion.

Good: I managed to fix the correctness/ANGLE issues, so it now runs on all platforms.

Bad: Benchmarking with MotionMark on a Windows/ANGLE machine didn't show any speedup. In fact, the scores were even ~10 lower, which is within the margin of measurement error. On the other hand, we don't appear to be anywhere near bottlenecked by the GPU cache updates. The compositor time I've seen is mostly either waiting or doing texture uploads (which have been optimized to be ANGLE-friendly). Perhaps we can come up with a stronger test case that stresses the hell out of the GPU cache, e.g. by using a thousand borders?

Ugly: The PR makes the code a bit nicer overall, and it would be nice to keep this option (to minimize the GPU cache transfers) available for the future. So we have a tentative agreement with @glennw to disable it by default and merge. |
|
Reviewed 2 of 4 files at r1, 4 of 5 files at r3, 1 of 1 files at r4.

webrender/res/gpu_cache_update.glsl, line 14 at r4:
nit: tabs here instead of spaces

webrender/src/device.rs, line 411 at r4:
nit: spelling

webrender/src/renderer.rs, line 74 at r4:
Let's add a comment here describing what this is for, if we still need it. |
|
@kvark Looks good! Added a few very minor nits, r=me once those are resolved. I leave it to your judgment if we need a try run here :) |
Address review notes.
|
Thanks @glennw! Also, I believe the scatter-enabled Gecko is a little bit snappier at startup, because it uses blitting for GPU cache texture resizing, which helps to grow the height from 512 to 2048 without hitches. |
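As a rough picture of why a blit-based resize can avoid hitches, here is a small CPU-side sketch in Rust. It is an illustration under assumptions, not the PR's code: `CacheTexture` and `grown` are hypothetical names, and a `Vec` stands in for GPU memory. The point is that when only the height grows, the old rows can be copied across as they are, so nothing has to be rebuilt and re-uploaded from the CPU.

```rust
/// CPU-side stand-in for a row-major cache texture (hypothetical type).
struct CacheTexture {
    width: usize,
    height: usize,
    texels: Vec<[f32; 4]>,
}

impl CacheTexture {
    fn new(width: usize, height: usize) -> Self {
        CacheTexture { width, height, texels: vec![[0.0; 4]; width * height] }
    }

    /// Build a taller texture and copy the old rows into it unchanged,
    /// which is the role a GPU-side blit plays during a resize.
    fn grown(&self, new_height: usize) -> CacheTexture {
        assert!(new_height >= self.height);
        let mut bigger = CacheTexture::new(self.width, new_height);
        bigger.texels[..self.texels.len()].copy_from_slice(&self.texels);
        bigger
    }
}

fn main() {
    let mut small = CacheTexture::new(16, 512);
    small.texels[511 * 16 + 5] = [1.0, 2.0, 3.0, 4.0];
    let big = small.grown(2048);
    // Width is unchanged, so every old (x, y) keeps the same linear address.
    assert_eq!(big.texels[511 * 16 + 5], [1.0, 2.0, 3.0, 4.0]);
    println!("grew from {} to {} rows", small.height, big.height);
}
```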
|
@kvark Yes, those are expected differences with current WR master. |
|
@bors-servo r+ |
|
Scattered GPU cache updates

Fixes #2110

This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to transfer to the GPU in order to reduce the stalls and make us scale better with content (see #2132 (comment)).

Note: in terms of efficiency, rasterizing lines would be much better than points. Leaving this for the follow up. Doing so would require us to upload the source data into a texture as opposed to a buffer, and slightly (but non-critically) complicate things.

WIP, because:
- benchmarking is TODO
- to be squashed eventually

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed

r? @glennw
|
This PR caused panics on Windows reftest runs: https://bugzilla.mozilla.org/show_bug.cgi?id=1424280#c8 |
|
I filed #2208 for this regression with a more useful backtrace. |
kvark commented Dec 4, 2017 (edited)
Fixes #2110
This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to transfer to the GPU in order to reduce the stalls and make us scale better with content (see #2132 (comment)).
Note: in terms of efficiency, rasterizing lines would be much better than points. Leaving this for the follow up. Doing so would require us to upload the source data into a texture as opposed to a buffer, and slightly (but non-critically) complicate things.
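To make the data-transfer argument above concrete, here is a small CPU-side model in Rust. It is a sketch under stated assumptions, not the PR's implementation: the `Update` and `Cache` types are hypothetical, the 4-byte address size is assumed, and the "re-upload every dirty row" baseline is an assumed point of comparison. Each update is written to its own address, the way one rasterized point per texel would, and the byte counts of the two strategies are compared.

```rust
use std::collections::BTreeSet;

/// Bytes per RGBA32F texel.
const TEXEL_BYTES: usize = 16;
/// Assumed size of the per-update destination address (two u16 coordinates).
const ADDRESS_BYTES: usize = 4;

/// One pending write: a destination texel and its new value (hypothetical type).
#[derive(Clone, Copy)]
struct Update {
    x: usize,
    y: usize,
    value: [f32; 4],
}

/// CPU-side stand-in for the GPU cache texture (hypothetical type).
struct Cache {
    width: usize,
    height: usize,
    texels: Vec<[f32; 4]>,
}

impl Cache {
    fn new(width: usize, height: usize) -> Self {
        Cache { width, height, texels: vec![[0.0; 4]; width * height] }
    }

    /// Write each update to its own address, as a point-per-texel draw would,
    /// and return the number of bytes that had to be transferred.
    fn apply_scattered(&mut self, updates: &[Update]) -> usize {
        for u in updates {
            assert!(u.x < self.width && u.y < self.height);
            self.texels[u.y * self.width + u.x] = u.value;
        }
        updates.len() * (TEXEL_BYTES + ADDRESS_BYTES)
    }
}

/// Bytes transferred if every row containing an update were re-uploaded whole
/// (assumed baseline for comparison).
fn whole_row_bytes(width: usize, updates: &[Update]) -> usize {
    let dirty_rows: BTreeSet<usize> = updates.iter().map(|u| u.y).collect();
    dirty_rows.len() * width * TEXEL_BYTES
}

fn main() {
    let mut cache = Cache::new(1024, 512);
    let updates = [
        Update { x: 3, y: 0, value: [1.0; 4] },
        Update { x: 700, y: 17, value: [2.0; 4] },
        Update { x: 12, y: 300, value: [3.0; 4] },
    ];
    let scattered = cache.apply_scattered(&updates);
    let rows = whole_row_bytes(cache.width, &updates);
    println!("scattered: {} bytes, whole rows: {} bytes", scattered, rows);
}
```

The gap shrinks when updates cluster into long contiguous runs, which is the case where rasterizing lines instead of points would pay off.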
WIP, because:
- benchmarking is TODO
- to be squashed eventually
Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed
r? @glennw