PBO copies are slow on Angle #2110
Comments
|
This is showing up during motionmark bouncing circles. |
|
I should also note that to reproduce this slowness I need to run the test case in ramp mode first and then reload with constant complexity. So it does seem like this problem is being made worse through some invisible persistent state (the webrender texture cache? the ANGLE buffer cache?) |
|
@jrmuizel we should have a separate issue for the (suspected) cache pollution |
|
Thanks to the Angle team, I got some answers! This is not about orphaning, but rather about some texture formats not being supported by the fast path of buffer->texture copies: https://cs.chromium.org/chromium/src/third_party/angle/src/libANGLE/renderer/d3d/d3d11/Renderer11.cpp?type=cs&q=supportsFastCopyBufferToTexture&sq=package:chromium&l=3006 In particular, Angle doesn't like our RGB8 and A8. |
|
Awesome, thanks for following this up! |
|
The profile shows us hitting the fast path, not missing it.
|
|
@jrmuizel Are you using the Gecko profiler for this or something else? |
|
@kvark From your investigations of angle, would it be worth considering alternative update strategies (e.g. render target with float points, or perhaps map/unmap or something else?) |
|
Yeah. If you look at the perfht.ml link in the first comment
|
|
When I originally set up the GPU cache, the idea was to be able to use an unsynchronized map to update the texture, on platforms where that made sense (i.e. a persistent pointer into the texture data). It does two things to enable this:
In theory, this means we don't need any synchronization here - these two policies should guarantee that we never write to a location that is < 10 frames old, which should thus guarantee that the GPU is never reading incorrect data. Admittedly, I've never actually tested this in practice - I intended to revisit it at a later time. But perhaps we can signal to ANGLE that it doesn't need to do any blocking and see how it goes? Another possibility (more of a temporary quick fix / hack) could be to round-robin a series of backing textures for the GPU cache, and update / upload the entire texture each frame. This sounds bad, but that data is actually quite small, and may be perfectly fine as an interim solution, if most of the time is spent blocking on a GPU fence or similar. @kvark @jrmuizel These might be worth pursuing if we're not able to get the current path running well on ANGLE. |
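The reuse invariant described in this comment can be modelled on the CPU side. What follows is only a minimal sketch: `GpuCacheModel`, `Slot` and `MIN_REUSE_AGE` are hypothetical names, not WebRender types, and the threshold of 10 frames is simply taken from the figure quoted above. It shows the check that would have to hold before a slot is overwritten through an unsynchronized map.

```rust
/// Hypothetical CPU-side model of one GPU cache slot (not a WebRender type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Slot {
    /// Last frame in which draw calls may have read this slot, if any.
    last_used_frame: Option<u64>,
}

/// Assumed reuse threshold, taken from the "10 frames" figure in the comment.
const MIN_REUSE_AGE: u64 = 10;

/// Hypothetical model of the cache: a list of slots plus a frame counter.
struct GpuCacheModel {
    slots: Vec<Slot>,
    current_frame: u64,
}

impl GpuCacheModel {
    fn new(slot_count: usize) -> Self {
        GpuCacheModel {
            slots: vec![Slot { last_used_frame: None }; slot_count],
            current_frame: 0,
        }
    }

    /// Advance to the next frame.
    fn begin_frame(&mut self) {
        self.current_frame += 1;
    }

    /// Record that the slot is referenced by the current frame.
    fn touch(&mut self, index: usize) {
        self.slots[index].last_used_frame = Some(self.current_frame);
    }

    /// A slot is safe to overwrite without synchronization if it was never
    /// used, or if it was last used at least MIN_REUSE_AGE frames ago.
    fn can_write_unsynchronized(&self, index: usize) -> bool {
        match self.slots[index].last_used_frame {
            None => true,
            Some(frame) => self.current_frame.saturating_sub(frame) >= MIN_REUSE_AGE,
        }
    }
}
```

Whether the GPU is really finished with a slot after that many frames depends on how many frames the driver keeps in flight, which this model does not capture.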
Damn, that is correct. Back to square one!
Will see after this issue is resolved.
TBH, that sounds perfect :) We definitely need to check out persistent mapping for the GPU cache texture. However, this is quite specific (GPU cache only, not the texture cache or other data) and will probably need to wait till we resolve the outstanding performance issues first (assuming we can resolve the Angle problem without rewriting the upload path too much). |
|
Had a long discussion today with Angle peers. Ended up filing an upstream issue - https://bugs.chromium.org/p/angleproject/issues/detail?id=2268 TL;DR: |
|
We can quite easily switch to using |
|
Yes, that's my plan :) We'll definitely need separate paths on Windows/Angle versus the world :) |
Texture update strategies ~~Fixes #2110, hopefully: still to be benchmarked on Windows.~~ Provide the settings for Gecko to fix ^^ Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=12fecf240aa3440764cd64205b17bfcfe7eb252c r? @glennw
|
Oh, didn't mean to close this just yet. |
Scattered GPU cache updates
Fixes #2110
This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to transfer to the GPU in order to reduce the stalls and make us scale better with content (see #2132 (comment)).
Note: in terms of efficiency, rasterizing lines would be much better than points. Leaving this for the follow up. Doing so would require us to upload the source data into a texture as opposed to a buffer, and slightly (but non-critically) complicate things.
WIP, because:
- benchmarking is TODO
- to be squashed eventually
Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f583b2deae1c7312437638bdda6ddc65693d53ed r? @glennw
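To make the "minimizes the amount of data" claim concrete, here is a rough back-of-the-envelope sketch. It is not code from the PR. It assumes that one cache block is 16 bytes (one RGBA float texel), that each scattered point carries a 4-byte destination address, and that the alternative is re-uploading every touched row in full. The function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Assumed size of one cache block: four 32-bit floats.
const BLOCK_BYTES: usize = 16;
/// Assumed per-point overhead: destination (x, y) stored as two u16 values.
const ADDRESS_BYTES: usize = 4;

/// Bytes transferred if every row containing a dirty block is uploaded whole.
fn row_upload_bytes(dirty: &[(u16, u16)], texture_width: usize) -> usize {
    // Collect the distinct row indices (the y coordinate of each block).
    let rows: BTreeSet<u16> = dirty.iter().map(|&(_, y)| y).collect();
    rows.len() * texture_width * BLOCK_BYTES
}

/// Bytes transferred if each dirty block is sent as one point
/// (destination address plus payload) and written by a shader.
fn scattered_upload_bytes(dirty: &[(u16, u16)]) -> usize {
    // Deduplicate so a block dirtied twice is only counted once.
    let unique: BTreeSet<(u16, u16)> = dirty.iter().copied().collect();
    unique.len() * (BLOCK_BYTES + ADDRESS_BYTES)
}
```

With a few dirty blocks spread over several rows of a wide texture, the scattered estimate is far smaller. When most of a row is dirty, the per-point address overhead makes the row upload cheaper, which is in line with the note about rasterizing lines instead of points.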
It appears that Windows/Angle spends tons of time in `update_texture_from_pbo`: https://perfht.ml/2zJnkg4
Angle decided to defer mapping and filling an actual D3D11 staging buffer up until this call. Since the copy itself is done in a draw call, it has to stall the GPU until the copy is done. This is one issue, but the actual current one is different: `Map` itself is waiting. I suppose it's waiting for the GPU considering the texture is still in use, which implies our PBO orphaning doesn't work as expected (see `Device::orphan_pbo`).
cc @jrmuizel @glennw
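For readers unfamiliar with orphaning: in OpenGL it is commonly done by calling `glBufferData` with a null data pointer before writing, so the driver can hand back fresh storage instead of waiting for the GPU to finish with the old one. The toy model below is only an illustration of that expectation. `BufferModel` and `MapResult` are hypothetical, and nothing here reflects how ANGLE or `Device::orphan_pbo` is actually implemented.

```rust
/// Outcome of a map request in this toy model.
#[derive(Debug, PartialEq)]
enum MapResult {
    /// The CPU got a pointer right away.
    Immediate,
    /// The CPU had to wait for pending GPU reads to complete.
    Stalled,
}

/// Hypothetical model of a driver-side buffer object.
struct BufferModel {
    /// Identifier of the storage currently backing the buffer.
    storage_id: u32,
    /// Whether the GPU still has pending reads from the current storage.
    gpu_busy: bool,
    /// Identifier to hand out for the next fresh storage.
    next_storage_id: u32,
}

impl BufferModel {
    fn new() -> Self {
        BufferModel { storage_id: 0, gpu_busy: false, next_storage_id: 1 }
    }

    /// Record a GPU command (such as a buffer-to-texture copy) that reads
    /// from the current storage.
    fn submit_gpu_read(&mut self) {
        self.gpu_busy = true;
    }

    /// Orphan the buffer: detach the old storage and attach fresh, idle storage.
    fn orphan(&mut self) {
        self.storage_id = self.next_storage_id;
        self.next_storage_id += 1;
        self.gpu_busy = false;
    }

    /// Map for writing. If the GPU is still reading the current storage,
    /// the model waits for it (and reports a stall) before returning.
    fn map_for_write(&mut self) -> MapResult {
        if self.gpu_busy {
            self.gpu_busy = false;
            MapResult::Stalled
        } else {
            MapResult::Immediate
        }
    }
}
```

In this model, mapping right after `submit_gpu_read` stalls, while calling `orphan` first does not. The profile in this issue suggests the real path behaves like the first case.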