Add support for GL_EXT_robustness #2705

emersion · 2021-02-01T17:17:06Z

Would allow us to recover from GPU resets. We'll need to trash all of our renderer state and re-create it.

See:

wlroots has migrated to gitlab.freedesktop.org. This issue has been moved to:

https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/2705

emersion · 2021-04-21T07:26:04Z

On amdgpu, a manual GPU reset can be triggered with /sys/kernel/debug/dri/0/amdgpu_gpu_recover.

zzag · 2021-04-21T07:50:01Z

You will have to trash renderer state and also client buffer textures. This means that you will need to keep a wl-shm client buffer referenced even after uploading data to the opengl texture.

emersion · 2021-04-21T07:50:10Z

Support for robustness could look like this:

- In wlr_renderer_bind_buffer, check glGetGraphicsResetStatusEXT. If it doesn't return GL_NO_ERROR, busy-wait with some sleep interval and a timeout until the GPU is ready again.
- Re-initialize the renderer's internal state: EGL contexts, shaders.
- Destroy all wlr_gles2_buffer.
- Re-import all wlr_texture from their original wlr_buffer, if any (needs "Cache and re-use DMA-BUF textures", #2851). Otherwise make the texture "inert": destroy the GL texture but keep the wlr_texture alive to avoid crashing compositors.
- Emit a wlr_renderer.events.reset event so that compositors can re-upload any texture that they created, and re-create their own GL state if any.
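
The first step above (poll the reset status with a sleep interval and a timeout) could be sketched roughly as follows. This is only an illustration, not wlroots code: `wait_for_gpu`, `reset_query_fn`, the `reset_status` enum and `fake_gpu` are all hypothetical names, and the fake query stands in for a wrapper around glGetGraphicsResetStatusEXT so the sketch runs without a GL context. The timeout is expressed as a maximum number of attempts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Stand-ins for GL_NO_ERROR and the *_CONTEXT_RESET_EXT values; a real
 * implementation would use the GL enums instead. */
enum reset_status { RESET_NONE = 0, RESET_IN_PROGRESS = 1 };

/* Hypothetical hook that would wrap glGetGraphicsResetStatusEXT(). */
typedef enum reset_status (*reset_query_fn)(void *data);

/* Fake GPU that reports a reset for the first `pending` queries. */
struct fake_gpu {
	int pending;
	int queries;
};

static enum reset_status fake_gpu_query(void *data) {
	struct fake_gpu *gpu = data;
	gpu->queries++;
	if (gpu->pending > 0) {
		gpu->pending--;
		return RESET_IN_PROGRESS;
	}
	return RESET_NONE;
}

/* Polls until the query reports no reset, sleeping interval_ms between
 * attempts. Returns false once max_attempts queries have all failed. */
static bool wait_for_gpu(reset_query_fn query, void *data,
		long interval_ms, int max_attempts) {
	assert(query != NULL && interval_ms >= 0 && interval_ms < 1000);
	struct timespec ts = { .tv_sec = 0, .tv_nsec = interval_ms * 1000000L };
	for (int i = 0; i < max_attempts; i++) {
		if (query(data) == RESET_NONE) {
			return true;
		}
		nanosleep(&ts, NULL);
	}
	return false;
}
```

If `wait_for_gpu` returns false, the renderer would presumably have to give up rather than continue with the remaining steps.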

emersion · 2021-04-21T07:58:04Z

> This means that you will need to keep a wl-shm client buffer referenced even after uploading data to the opengl texture.

Indeed. We can't re-upload the buffer after we've released the buffer, because the client might be in the process of rendering to it, so its contents can be garbage.
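
To make the consequence concrete, here is a minimal sketch of what per-texture reset handling might look like under that constraint. The `buffer` and `texture` structs, `upload` and `texture_handle_reset` are hypothetical stand-ins, not the real wlr_buffer or wlr_texture API. A texture whose source buffer is still held gets re-uploaded; one whose buffer was already released becomes inert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real wlr_buffer / wlr_texture. */
struct buffer {
	int locks;
};

struct texture {
	unsigned gl_id;        /* 0 means the texture is inert */
	struct buffer *source; /* NULL once the client buffer has been released */
};

/* Stand-in for the real GL upload; just hands out fresh IDs. */
static unsigned next_gl_id = 1;

static unsigned upload(struct buffer *buf) {
	(void)buf;
	return next_gl_id++;
}

/* Called for each texture after the context has been re-created.
 * Returns true if the texture is usable afterwards. */
static bool texture_handle_reset(struct texture *tex) {
	if (tex->source != NULL) {
		/* The buffer is still referenced, so its contents are stable. */
		tex->gl_id = upload(tex->source);
		return true;
	}
	/* The buffer was released and the client may be drawing into it,
	 * so keep the object alive but drop the GL texture. */
	tex->gl_id = 0;
	return false;
}
```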

emersion · 2021-04-21T08:00:50Z

> Emit a wlr_renderer.events.reset event so that compositors can […] re-create their own GL state if any

This might be slightly more complicated: we need compositors to destroy their old GL state before we destroy the EGL context, and re-create it after we've established a new EGL context. We might need two events (meh API), or to keep the old EGL context alive up to wlr_renderer.events.reset.
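
For illustration, the two-event variant might be shaped like the sketch below. Everything here is hypothetical (wlroots has no `reset_listener`, `pre_reset` or `post_reset`), and an integer `context_generation` stands in for the EGL context so the ordering can be observed without any GL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-phase listener. */
struct reset_listener {
	void (*pre_reset)(void *data);  /* old context is still alive */
	void (*post_reset)(void *data); /* new context has been created */
	void *data;
};

/* Hypothetical renderer; context_generation stands in for the EGL context. */
struct renderer {
	int context_generation;
	struct reset_listener *listeners;
	size_t n_listeners;
};

static void renderer_reset(struct renderer *r) {
	/* Phase 1: let compositors destroy GL state tied to the old context. */
	for (size_t i = 0; i < r->n_listeners; i++) {
		r->listeners[i].pre_reset(r->listeners[i].data);
	}
	/* Destroy the old context and create the new one (simulated). */
	r->context_generation++;
	/* Phase 2: let compositors re-create their GL state. */
	for (size_t i = 0; i < r->n_listeners; i++) {
		r->listeners[i].post_reset(r->listeners[i].data);
	}
}

/* Example compositor state that records which context each phase saw. */
struct compositor_state {
	struct renderer *renderer;
	int destroyed_in;
	int created_in;
};

static void compositor_pre_reset(void *data) {
	struct compositor_state *s = data;
	s->destroyed_in = s->renderer->context_generation;
}

static void compositor_post_reset(void *data) {
	struct compositor_state *s = data;
	s->created_in = s->renderer->context_generation;
}
```

The single-event alternative would instead keep the old context alive until the one event fires, which avoids the second callback at the cost of two contexts coexisting briefly.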

zzag · 2021-04-21T08:07:02Z

> Indeed. We can't re-upload the buffer after we've released the buffer, because the client might be in the process of rendering to it, so its contents can be garbage.

Beware though, there are applications in the wild that assume the compositor will release a shm buffer after uploading its data to an opengl texture. The most prominent example is Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1693472

emersion · 2021-04-21T08:17:09Z

Oh, wow. That's pretty gross.

dnkl · 2021-05-06T08:55:27Z

> This means that you will need to keep a wl-shm client buffer referenced even after uploading data to the opengl texture.

Is there no way around this? I completely agree that clients cannot assume a buffer is released immediately. But it does enable very nice optimizations when it is released immediately, especially considering rendering is done on the CPU.

Being able to recover from GPU resets is of course an extremely nice thing to have. It just seems somewhat weird to incur this kind of performance penalty for something that should, ideally, never happen.

But I guess one could argue that performance critical applications should use GL, not shm...

emersion · 2021-05-06T09:00:26Z

One way around this would be to add a protocol to let the compositor ask the client to submit a new frame. wl_surface.frame isn't enough, because it (1) is requested by the client and (2) doesn't require the client to submit a new buffer (e.g. if nothing changed on screen, there's no need to redraw).

Non-immediate release shouldn't have too much of a CPU usage cost, I think? It does have a memory cost though.

kennylevinsen · 2021-05-06T09:22:03Z

How about sending a configure to all surfaces? That is likely to provoke a new frame.

Granted, it's not guaranteed to work, but clients might not use a dedicated recovery protocol either, so a backup plan might be in order...

dnkl · 2021-05-06T09:31:15Z

> Non-immediate release shouldn't have too much of a CPU usage cost, I think? It does have a memory cost though.

Clients can choose between a couple of strategies I think.

1. Re-render the new frame from scratch. Potentially very expensive.
2. memcpy() the old frame, then apply current frame's damage.
3. Try to re-apply the damage from the old frame, before applying the new frame's damage.

("damage" referring to the client's internal damage tracking)

For reference, foot currently does 2), but I'm going to test 3). I should have some performance numbers after that (see https://codeberg.org/dnkl/foot/issues/478).
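
To illustrate the difference between strategies 2 and 3, here is a rough sketch in C. It is not foot's actual code: `rect`, `copy_rect`, `prepare_full_copy` and `prepare_from_damage` are hypothetical names, buffers are treated as plain byte arrays with a fixed stride, and strategy 3 is only correct under the assumption that the destination buffer already holds the frame before the previous one (two buffers used alternately).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A damaged region, in bytes within a stride-wide buffer. */
struct rect {
	size_t x, y, w, h;
};

/* Copies one rectangle from src to dst, row by row. */
static void copy_rect(unsigned char *dst, const unsigned char *src,
		size_t stride, struct rect r) {
	for (size_t row = r.y; row < r.y + r.h; row++) {
		memcpy(dst + row * stride + r.x, src + row * stride + r.x, r.w);
	}
}

/* Strategy 2: copy the whole previous frame into the new buffer. */
static void prepare_full_copy(unsigned char *dst, const unsigned char *prev,
		size_t stride, size_t height) {
	memcpy(dst, prev, stride * height);
}

/* Strategy 3: bring over only what the previous frame changed. Assumes dst
 * already matches the frame before prev everywhere outside prev's damage. */
static void prepare_from_damage(unsigned char *dst, const unsigned char *prev,
		size_t stride, const struct rect *damage, size_t n_damage) {
	for (size_t i = 0; i < n_damage; i++) {
		copy_rect(dst, prev, stride, damage[i]);
	}
}
```

Either function would run before the current frame's own damage is rendered; the second touches only the previously damaged bytes, while the first always touches the whole buffer.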

> One way around this would be to add a protocol to let the compositor ask the client to submit a new frame.

> How about sending a configure to all surfaces? That is likely to provoke a new frame.

Both alternatives crossed my mind as well. I think a new protocol would be more robust? But like @kennylevinsen said, perhaps a configure event can be used as a fallback mechanism for clients not implementing the new protocol?

emersion · 2021-05-06T09:56:42Z

> How about sending a configure to all surfaces? That is likely to provoke a new frame.

Yeah, but some clients might just realize the configure event doesn't change anything, and ack it without attaching a new buffer.

For clients that don't support the "please redraw" protocol, we could always hold the wl_buffer longer, just as described above. The protocol could be designed with a per-surface addon object, so that the compositor knows exactly which surface can be redrawn on-demand.
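
A compositor-side policy along these lines might look like the following sketch. The names are hypothetical and no such protocol exists yet; `supports_redraw_request` stands for "the client has created the per-surface addon object".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-surface state. */
struct surface {
	bool supports_redraw_request; /* addon object exists for this surface */
};

enum release_policy {
	RELEASE_AFTER_UPLOAD, /* release the wl_buffer right after the upload */
	HOLD_UNTIL_REPLACED,  /* keep it until the client attaches a new one */
};

enum recovery_action {
	REQUEST_REDRAW,       /* ask the client to submit a new frame */
	REUPLOAD_HELD_BUFFER, /* re-upload from the buffer we still hold */
};

/* Decides when a shm buffer can be released for this surface. */
static enum release_policy shm_release_policy(const struct surface *s) {
	return s->supports_redraw_request ?
		RELEASE_AFTER_UPLOAD : HOLD_UNTIL_REPLACED;
}

/* Decides how to restore the surface's texture after a GPU reset. */
static enum recovery_action reset_recovery_action(const struct surface *s) {
	return s->supports_redraw_request ?
		REQUEST_REDRAW : REUPLOAD_HELD_BUFFER;
}
```

With this split, only clients that opt in get the immediate release, and every other client keeps working through the held buffer.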

> (3) Try to re-apply the damage from the old frame, before applying the new frame's damage.

This seems like the best solution.

dnkl · 2021-05-08T09:46:53Z

Here are some initial numbers from the foot PR that implements (3). While the PR still needs more testing, I believe the performance numbers are close to what we'll see in the end:

Sway 1.6, wlroots 0.13.0
Terminal size: 135x67 cells
Surface size: 953x1024
(CPU: i5-8250U CPU @ 1.60GHz 4/8 cores/threads, 6MB L3)

Times are in microseconds (µs).

Numbers in parentheses are the time taken to “prepare” the buffer before applying the current frame’s damage (hence it’s always zero in the “Immediate release” column).

Not covered here: ignoring old buffer content and instead re-rendering the entire frame (on this setup, that takes roughly 3000-5000µs per frame).

edit: since this wasn't mentioned anywhere; this table shows the average rendering time for a single frame.

Benchmark	Immediate release¹	Damage tracking²	Copy last frame³
typing⁴	143.6 ±40.1 (0.0 ±0.0)	181.6 ±35.3 (35.2 ±3.2)	1404.2 ±135.2 (1212.2 ±120.7)
cursor movement⁵	165.9 ±39.9 (0.0 ±0.0)	231.5 ±33.8 (74.2 ±5.2)	1246.6 ±113.9 (1094.6 ±103.4)
scrolling⁶	540.6 ±106.7 (0.0 ±0.0)	854.5 ±91.1 (350.6 ±29.1)	1677.8 ±282.8 (1082.8 ±213.2)

Observations:

- double buffering damage is a very clear improvement over doing a dumb copy of the last frame (which in turn is a huge improvement over re-rendering the frame from scratch).
- scrolling is a fairly expensive operation. This is exacerbated when we’re forced to do it twice (350µs vs. 35-75µs).
- a simple memcpy() is a fairly expensive operation on buffers as large as these. Also, in addition to taking time, it can easily thrash the cache, slowing things down further.

¹ no double buffering, foot re-uses previous frame’s buffer
² foot re-applies last frame’s damage before applying current frame’s damage
³ foot copies the old buffer (all of it) before applying current frame’s damage
⁴ running cat - in the shell, at the bottom of the screen, typing a single letter at a time
⁵ large C file in vim, moving cursor with arrow keys without scrolling the content
⁶ large C file in vim, scrolling content by holding down arrow key

emersion added enhancement renderer labels Feb 1, 2021

emersion mentioned this issue Feb 1, 2021

Occasional full desktop freeze/crash swaywm/sway#5988

Closed

emersion mentioned this issue May 22, 2021

Crash with amdgpu swaywm/sway#6290

Closed

emersion mentioned this issue Sep 8, 2021

[do not merge] Surface state refactor #3143

Closed
