This repository has been archived by the owner on Jun 18, 2021. It is now read-only.

Compute test hangs #237

Closed
kvark opened this issue Apr 5, 2020 · 12 comments · Fixed by gfx-rs/wgpu#553
Labels: bug (Something isn't working)
Milestone: Version 0.5

Comments

kvark (Member) commented Apr 5, 2020

Just run cargo test and observe the hang. Mine was on macOS, 100% reproducible.

kvark added the bug label Apr 5, 2020
kvark added this to the Version 0.5 milestone Apr 5, 2020
kvark (Member, Author) commented Apr 5, 2020

Could be related to #214, cc @lachlansneff

lachlansneff (Contributor) commented

By compute test, you mean hello-compute?

It worked on my machine, so I’m not sure what’s going on here.

kvark (Member, Author) commented Apr 5, 2020

Also, unfortunately the test was not built or run for a while, so it's not clear which change broke it.
Just doing cargo run --example hello-compute sometimes runs for me and sometimes hangs.
Running cargo test always hangs so far; it runs a test in examples/hello-compute.

aloucks (Contributor) commented Apr 5, 2020

It's always going to deadlock because there isn't anything driving it. I tried to explain this on the PR.

Right now if I try to map a buffer asynchronously and use .await on that future, it's going to immediately deadlock because there isn't anything that can wake the waker (and we should not call Device::maintain / wgpu_device_poll from the future for the reasons described above).
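
For illustration, a minimal sketch of the deadlocking pattern described above, assuming hello-compute's staging_buffer and size and the wgpu 0.5-era API:

// Nothing calls device.poll(...) after this point, so the mapping
// callback never fires and the waker is never woken.
let buffer_future = staging_buffer.map_read(0, size);

// This .await never completes: the future can only resolve once the
// device is maintained, but nothing is maintaining it.
let mapping = buffer_future.await;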

lachlansneff (Contributor) commented

@aloucks, in hello-compute, there is something driving it. The future isn’t awaited until after the device is polled in a blocking fashion.
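
A minimal sketch of that ordering, assuming hello-compute's names and the wgpu 0.5-era API (command submission elided):

// Create the mapping future, but do not await it yet.
let buffer_future = staging_buffer.map_read(0, size);

// Blocking maintenance: by the time this returns, the mapping callback
// should already have fired.
device.poll(wgpu::Maintain::Wait);

// Awaiting now should resolve immediately.
let mapping = buffer_future.await;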

kvark (Member, Author) commented Apr 5, 2020

Debugging async code is somewhat painful, unless I'm doing it wrong :) The call stack has pretty much nothing useful.

aloucks (Contributor) commented Apr 5, 2020

This is only calling Device::poll once. If that one crank didn't cause your future to resolve, it's going to deadlock.

Because there is nothing driving the event loop here, you'll need to call Device::poll repeatedly until buffer_future.poll() returns Poll::Ready.

// Note that we're not calling `.await` here.
let buffer_future = staging_buffer.map_read(0, size);
// Poll the device in a blocking manner so that our future resolves.
// In an actual application, `device.poll(...)` should
// be called in an event loop or on another thread.
device.poll(wgpu::Maintain::Wait);
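
For comparison, a minimal sketch of the repeated-polling approach, assuming the futures crate is available and using hello-compute's names (a busy-loop for demonstration only; a real application would drive this from an event loop or another thread):

use futures::FutureExt; // assumed dependency; provides now_or_never

// Box::pin makes the future Unpin so it can be polled by reference.
let mut buffer_future = Box::pin(staging_buffer.map_read(0, size));

let mapping = loop {
    // Crank the device so pending mapping callbacks can fire.
    device.poll(wgpu::Maintain::Poll);

    // Check, without blocking, whether the future has resolved yet.
    if let Some(result) = (&mut buffer_future).now_or_never() {
        break result.expect("failed to map staging buffer");
    }
};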

Can we rename Device::poll to Device::tick or Device::maintain -- something that indicates that it's not the same as future polling? The difference is subtle, but the underlying concepts are different: Future::poll checks state, executes an action, and returns, whereas Device::maintain is a cranking mechanism.

lachlansneff (Contributor) commented Apr 5, 2020

In this case, the device is polled in a blocking fashion, which should block the thread until all currently installed asynchronous mappings complete.

Regardless, it's not working, so something is going on here.

kvark (Member, Author) commented Apr 5, 2020

fwiw, I agree that poll(true) is not really a poll()

lachlansneff (Contributor) commented

@kvark Maybe something along the lines of device.process(Poll/Wait)?

kvark (Member, Author) commented Apr 5, 2020

Actually, it's already device.poll(wgpu::Maintain::xxx). Perhaps poll needs to be renamed, or a separate wait function introduced.

aloucks (Contributor) commented Apr 5, 2020

Perhaps poll needs to be renamed

Yes, this was my suggestion. Dawn uses tick. Maybe use the same name? maintain would also work and that matches the core function name.
