recover from lost/outdated wgpu surface on Wayland #1074

Merged
jrmoulton merged 2 commits into lapce:main from rageknuff:fix/niri-freeze-on-start
May 11, 2026
@inanc-g (Contributor) commented May 11, 2026

On some Wayland compositors like Niri, floem apps freeze on startup.

The first frame renders, then the window is visually frozen.
Inputs still work under the hood but nothing visually changes.
Resizing the window fixes it.

To reproduce, simply run any example on niri (I haven't tried any other compositor).
-> The window is frozen; after a resize, it works again.

This is the same bug Zed had: Freezing on Start in Niri #50574

Their fix in their internal gpui_wgpu renderer: Fix handling of surface.configure on Linux #50640

In floem, for the vello and vger renderers, finish() doesn't handle an Err result from surface.get_current_texture(). On some Wayland compositors, the surface might be "lost" or "outdated" right after initial configuration, so get_current_texture() returns an error on the very first frame.

SurfaceError in wgpu doesn't auto-recover, so all future frames also fail silently and the window appears frozen.

Resizing unfreezes because the resize calls surface.configure() and rebuilds.

The fix is simply to handle SurfaceError::Lost and SurfaceError::Outdated by calling surface.configure() again, basically what Zed did:

Err(wgpu::SurfaceError::Lost | wgpu::SurfaceError::Outdated) => 
 self.surface.configure(&self.device, &self.config)
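
To illustrate why the reconfigure matters, here is a minimal sketch with a mock surface. `MockSurface`, `run_frames`, and the local `SurfaceError` enum are hypothetical stand-ins, not wgpu types or floem code; the mock simply assumes a stale surface keeps returning `Outdated` until `configure()` is called again.

```rust
// Hypothetical stand-in for wgpu::SurfaceError (only the two variants handled in this PR).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SurfaceError {
    Lost,
    Outdated,
}

/// Mock surface (assumption): reports `Outdated` until it is configured again.
struct MockSurface {
    stale: bool,
    configure_calls: u32,
}

impl MockSurface {
    fn get_current_texture(&self) -> Result<u32, SurfaceError> {
        if self.stale {
            Err(SurfaceError::Outdated)
        } else {
            Ok(0) // placeholder for a real frame
        }
    }

    fn configure(&mut self) {
        self.stale = false;
        self.configure_calls += 1;
    }
}

/// Attempts `frames` frames and returns how many were actually presented.
/// With `recover == false`, errors are ignored, as in the old behavior described above.
fn run_frames(surface: &mut MockSurface, frames: u32, recover: bool) -> u32 {
    let mut presented = 0;
    for _ in 0..frames {
        match surface.get_current_texture() {
            Ok(_frame) => presented += 1,
            // The fix: reconfigure and skip this frame; the next frame can succeed.
            Err(SurfaceError::Lost | SurfaceError::Outdated) if recover => {
                surface.configure();
            }
            // Without recovery, every later frame fails the same way.
            Err(_) => {}
        }
    }
    presented
}
```

In this model, a stale surface without recovery presents no frames at all, while with recovery only the one frame that triggered the reconfigure is dropped.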

Note that this is also the cause of an open bug in Lapce: Lapce freezing on niri at startup #3897

Comment thread vger/src/lib.rs
}
let frame = match self.surface.get_current_texture() {
Ok(frame) => frame,
Err(wgpu::SurfaceError::Lost | wgpu::SurfaceError::Outdated) => {

@0xNULLderef 0xNULLderef May 11, 2026


The WebGPU examples also choose to handle wgpu::SurfaceError::Other and wgpu::SurfaceError::OutOfMemory. Not relevant on niri (from my testing), but it might help get a frame presented instead of a frozen mess in some other specific edge cases.

They also choose to attempt the get again, but I'm not thoroughly versed enough in Vulkan to decide whether this or the other approach is better.
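
For comparison, a rough sketch of the "reconfigure, then attempt the get again" variant, using a mock rather than real wgpu calls. `MockSurface`, its `recoverable` flag, and `acquire_with_retry` are hypothetical names made up for illustration; the local `SurfaceError` enum only mirrors the two variant names discussed in this thread.

```rust
// Hypothetical stand-in for wgpu::SurfaceError (two variants only).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SurfaceError {
    Lost,
    Outdated,
}

/// Mock surface (assumption): stale until configured, unless `recoverable` is false.
struct MockSurface {
    stale: bool,
    // When false, configure() does not help; models a failure reconfiguring cannot fix.
    recoverable: bool,
}

impl MockSurface {
    fn get_current_texture(&self) -> Result<u32, SurfaceError> {
        if self.stale {
            Err(SurfaceError::Outdated)
        } else {
            Ok(7) // placeholder for a real frame
        }
    }

    fn configure(&mut self) {
        if self.recoverable {
            self.stale = false;
        }
    }
}

/// Reconfigures on Lost/Outdated and retries the acquisition exactly once.
/// A second failure is returned to the caller instead of looping.
fn acquire_with_retry(surface: &mut MockSurface) -> Result<u32, SurfaceError> {
    match surface.get_current_texture() {
        Err(SurfaceError::Lost | SurfaceError::Outdated) => {
            surface.configure();
            surface.get_current_texture()
        }
        other => other,
    }
}
```

The trade-off in this sketch: retrying can present a frame on the same tick, while reconfiguring and skipping (as in this PR) waits for the next frame. Retrying only once avoids spinning if the surface stays unusable.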

Contributor Author


That's interesting. In the wgpu docs, it's described a bit differently:

Lost and Outdated are the two variants where wgpu itself explicitly recommends recreating the surface: https://docs.rs/wgpu/27.0.1/wgpu/enum.SurfaceError.html#variants

It seems in their example they opted for a broad "just reconfigure will fix it" catch-all approach, but it might make sense to only handle what's actually a surface configuration issue.

OOM shouldn't really try to recreate the surface if memory is starved (it wouldn't be able to create a new frame anyway - the comment in their example even says that), and Other would need to be inspected to decide what to do.
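
The per-variant reasoning above could be written down roughly like this. It is only a sketch: the `Action` enum and `action_for` are hypothetical names, the local `SurfaceError` enum mirrors the variant names from the wgpu docs rather than using the real type, and skipping the frame on `Timeout` is my assumption since it wasn't discussed here.

```rust
// Hypothetical mirror of the wgpu::SurfaceError variant names (not the real type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum SurfaceError {
    Timeout,
    Outdated,
    Lost,
    OutOfMemory,
    Other,
}

/// What a renderer might do in response to an acquisition error (illustrative only).
#[derive(Debug, PartialEq)]
enum Action {
    /// Call surface.configure() again; this is what the PR does.
    Reconfigure,
    /// Drop this frame and try again on the next one.
    SkipFrame,
    /// Give up; reconfiguring would not free any memory.
    Abort,
    /// Log and inspect before deciding.
    Report,
}

fn action_for(err: SurfaceError) -> Action {
    match err {
        // The two variants that point at a surface configuration issue.
        SurfaceError::Lost | SurfaceError::Outdated => Action::Reconfigure,
        // Memory is exhausted, so a new frame could not be created anyway.
        SurfaceError::OutOfMemory => Action::Abort,
        // Unknown cause: needs inspection rather than a blind reconfigure.
        SurfaceError::Other => Action::Report,
        // Assumption: a timeout is transient, so just skip the frame.
        SurfaceError::Timeout => Action::SkipFrame,
    }
}
```

Keeping the match exhaustive (no `_` arm) means a new variant would force a deliberate decision instead of silently falling into a catch-all.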

@jrmoulton (Collaborator)

This looks good. Thanks!

@jrmoulton jrmoulton merged commit bcafaaa into lapce:main May 11, 2026
6 of 7 checks passed
@inanc-g inanc-g deleted the fix/niri-freeze-on-start branch May 11, 2026 20:51