Examples crash WindowServer on macOS #268
Comments
Oh, and this was on a MacBook Pro (M1 Pro) running the latest macOS Ventura 13.2.
This repros, thanks for the report. It was inadequately tested on Mac, ironically because I moved my development to Windows because it's so painful to debug when the computer is constantly crashing and hard-locking. I suspect this is not a simple problem. I naively assumed that the behavior of
I can confirm that #269 fixes the problem on my machine. That was fast! :-) Out of curiosity, could the actual problem be described as a deadlock in a compute shader? Whether as a logic error or a miscompilation.
Running this:

An empty window is shown, and then after a second or two the `WindowServer` process seems to freeze, requiring a hard reboot. Unfortunately, no crash report is available in the system console, but if I leave it frozen for a while and reboot, a watchdog event seems to have been reported, noting that the `WindowServer` process is unresponsive and that the `with_winit` example is still running. No stack trace, no nothing.

Tried the following:
- The `with_bevy` example is the same.

I tried inserting `println!()`s and commenting things out to try to understand what's going on.

Commenting out `surface.present()` seems to make it not freeze the system outright, but does still seem to cause some mysterious jankiness even after the process is terminated (like scrolling in VSCode becomes jittery).

Then I looked at `block_on_wgpu()`, which has a note about deadlocking if it is "awaiting anything other than GPU progress". As I understand `wgpu::Device::poll()`, it defines "GPU progress" as work having been submitted to a queue, and it returns `true` if the queue is empty. I'm not sure, but I think that means that when `wgpu::Device::poll()` returns `true`, that is exactly the situation where there would be a deadlock. So I tried panicking when `poll()` returns `true`, and indeed it happens.

(Oddly enough, after crashing the process in a separate run with that panic, the jankiness in VSCode stopped...)
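
For anyone who wants to reproduce the experiment above without touching the GPU, here is a minimal sketch of the pattern in plain std Rust. It is not the project's `block_on_wgpu()` and it does not call wgpu at all: `block_on_with_poll`, `Driven`, `ReadyAfter` and the `poll_device` closure are hypothetical names made up for illustration. The closure stands in for `device.poll(Maintain::Wait)` and is assumed to return `true` when the queue is empty, which is only my reading of the behavior described above.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: the loop below re-polls unconditionally.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Outcome of driving a future with a device-poll loop (hypothetical type).
#[derive(Debug, PartialEq)]
enum Driven<T> {
    /// The future completed with this value.
    Ready(T),
    /// The stand-in device reported "queue empty" this many times in a row
    /// while the future was still pending.
    StalledAfter(usize),
}

/// Poll `fut`, and call `poll_device` between polls.
/// ASSUMPTION: `poll_device` returns `true` when the queue is empty.
/// Instead of spinning forever, give up after `max_idle` consecutive
/// "queue empty" results with the future still pending.
fn block_on_with_poll<F: Future>(
    fut: F,
    mut poll_device: impl FnMut() -> bool,
    max_idle: usize,
) -> Driven<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let mut idle = 0;
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return Driven::Ready(value);
        }
        if poll_device() {
            idle += 1;
            if idle >= max_idle {
                return Driven::StalledAfter(idle);
            }
        } else {
            idle = 0;
        }
    }
}

/// Toy future that is pending for the first `n` polls, then ready.
/// It stands in for a buffer mapping that eventually completes.
struct ReadyAfter(usize);
impl Future for ReadyAfter {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("mapped")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}
```

With this, `block_on_with_poll(ReadyAfter(3), || false, 2)` finishes with `Driven::Ready("mapped")`, while `block_on_with_poll(std::future::pending::<()>(), || true, 5)` gives up with `Driven::StalledAfter(5)` instead of looping forever. Returning a value rather than panicking is just a choice for the sketch; whether "queue empty while still pending" really means a deadlock in wgpu is exactly the open question here.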

The `render_to_texture_async()` function does `.await` a buffer mapping after some stuff has been submitted to the queue (through `run_recording()`), and I'm honestly not sure what to make of wgpu's documentation here, because `device.poll()` is also meant to be called when awaiting buffer mapping events, but what exactly happens when the queue is fully emptied by the driver, but there are still outstanding buffer mappings?

To clarify: I'm not certain that `poll()` returning `true` while the future passed to `block_on_wgpu()` is still `Pending` is actually a reliable indication of a deadlock, because I'm not sure it captures pending buffer mappings. But if it is, it seems plausible that the device is being spammed by `poll()`s (with the `Maintain::Wait` argument no less), which somehow ends up crashing either `WindowServer` or the GPU driver (which is concerning in itself).

I'm sorry if I'm missing something and this is a wild goose chase - I'm really struggling (and I know I'm not alone here) with understanding how exactly `wgpu::Device::poll()` should be used, and I know that the `wgpu` team has iterated a bit on its behavior.