Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make vkDeviceWaitIdle stop a pending vkAcquireNextImageKHR Semaphore #1059

Closed
krOoze opened this issue Oct 16, 2019 · 13 comments
Closed

Make vkDeviceWaitIdle stop a pending vkAcquireNextImageKHR Semaphore #1059

krOoze opened this issue Oct 16, 2019 · 13 comments

Comments

@krOoze
Copy link
Contributor

krOoze commented Oct 16, 2019

I suggest to make vkDeviceWaitIdle be explicitly able to stop a pending vkAcquireNextImageKHR Semaphore and Fence. I.e. make valid:

VkSemaphore s;
vkCreateSemaphore( s );
vkAcquireNextImageKHR( s );
vkDeviceWaitIdle();
vkDestroySemaphore( s );
  1. It is not immediatelly 100 % obvious to me it is actually invalid.

All vkDeviceWaitIdle says is:

wait on the host for the completion of outstanding queue operations for all queues on a given logical device

vkAcquireNextImageKHR only says:

semaphore signal operation referenced by semaphore has been submitted for execution.

In of itself IMO not sufficient to judge one way or other.

  1. It is a general expectation of the command to stop all work on the given VkDevice.

Cons:

  • Might be an incompatible spec change, and I don't know what drivers are actually doing (except open-source).

Motivation\Use Case:

VkResult r = vkAcquireNextImageKHR();
if( r == VK_SUBOPTIMAL_KHR ){
    vkDeviceWaitIdle();
    vkDestroyEverything();
    createSwapchain();
}
@cubanismo
Copy link

I don't think this is a good idea. AcquireNextImage is explicitly not a queue operation. DeviceWaitIdle() expicitly waits only for all device queues to idle, regardless of what naive users expect. Vulkan often surprises naive users, so I don't see why this case is special. The above motiviation can trivially be satisfied by inserting a QueueSubmit() waiting on the semaphore on any device queue, or by waiting on a fence from AcquireNextImage. I think the spec hole you're trying to plug is very unlikely to be an issue in practice, and certainly doesn't warrant special-casing the behavior in a way that has such broad implementation implications.

@krOoze
Copy link
Contributor Author

krOoze commented Oct 16, 2019

DeviceWaitIdle() expicitly waits only for all device queues to idle

But vkAcquireNextImageKHR says the semaphore signal was submitted for execution. That is a language usually used with a queue. That the specific queue is not stated does not necessarily mean it is not one of the queues subject to vkDeviceWaitIdle().

At least under an "as-if" clause that could always be the case. The implementation can have whatever internal foreign queues as much as it wants as long as it behaves in a healthy way towards the user of the API. I mean, how does the vkQueueSubmit knows the Semaphore is "special", not normal Vulkan semaphore? It too simply keeps it an implementation detail instead of introducing another parameter pWaitForForeignSemaphores.

spec:
To ensure that no work is active on the device, vkDeviceWaitIdle can be used to gate the destruction of the device.

which says it is not just what "naive users expect". It is practically spec recommendation.
"No active work". vkAcquireNextImageKHR is work on the device (first parameter).

The above motiviation can trivially be satisfied by inserting a QueueSubmit() waiting on the semaphore on any device queue, or by waiting on a fence from AcquireNextImage.

One require to always demand otherwisely unnecessary Fence of vkAcquireNextImage. Other is trivially equivalent to what the driver can do, minus the hacky-looking code required of the Application. It is Implementation detail of the Driver, so it should be responsible for cleaning it. Meanwhile no one expects vkDeviceWaitIdle to be super explicit and performant -- they expect it to be safe and debug choice (there used to be a Note to that effect, can't find it ATM).

doesn't warrant special-casing

That's somewhat my point. vkAcquireNextImage is being the exception(, assuming it is not subject to vkDeviceWaitIdle). What I want is explicitly remove that special case. There's no other command, I think, that behaves this way.

@cubanismo
Copy link

vkAcquireNextImageKHR() was designed to be special from the start. It's an artifact of decoupling acquiring the next image to render to from presenting the prior image, one of the major goals of the WSI design (We saw this as a big improvement over SwapBuffers()). Trying to pretend it's not special doesn't improve the design. Trying to pretend it operates on a device queue when it explicitly does not doesn't improve the design. Asking Vulkan drivers to do special things under the hood that applications can do (and understand they must do, if reading the spec closely) to accomplish these goals doesn't improve the design either.

We discussed this in the WSI group and the agreement was these semaphore signals are essentially the same as external semaphore signal operations, as I walked through in my comment on #152, and hence shouldn't be covered by vkDeviceWaitIdle(). Hence, closing the issue.

@krOoze
Copy link
Contributor Author

krOoze commented Oct 24, 2019

Hm, shame. The above concerns sound too theoretical for me TBH. Meanwhile it will be a hassle and an error prone gotcha...

I mean the logic is bit self-defeating. Why is there a vkDeviceWaitIdle if application can just do vkQueueWaitIdle. Why is there a vkQueueWaitIdle if application can just use a Fence. Those are not commands used for explicitness...

Thanks. Had to try...

@mokafolio
Copy link

I agree with @krOoze on this one . Every real world vulkan application will deal with swapchain recreation and inevitably run into this issue. It's not intuitive that vkDeviceWaitIdle essentially waits on everything with this being an exception.
99% of the online examples get this wrong and I am sure a lot of applications out there suffer from window resizing issues because of it. As a (worse) alternative I think the fence in vkAcquireNextImage can't be optional and needs to have a big fat note as to why that is.

@krOoze
Copy link
Contributor Author

krOoze commented Dec 2, 2022

@mokafolio Assuming the same problem does not apply to the fence in reality.

Want me to reopen the Issue so you can try taking over to champion the issue?

@mokafolio
Copy link

@krOoze Regarding the fence, I was thinking that it can't be optional and you'd have to explicitly wait for it when recreating the swap chain (as an alternative to relying on vkDeviceWaitIdle). But yeah, I don't think that's very clean.

Let me think about this a little more, I will come back to you if I think this should be reopened!

@krOoze
Copy link
Contributor Author

krOoze commented Dec 2, 2022

You can use the semaphore to later make a fence out of it. So the vkAcquire fence would not be that mandatory.

@mokafolio
Copy link

Could you elaborate what you mean with making a fence out of it? When not using a fence, the only way I could find (In your Hello_Triangle example by the way :) ) is to do an empty queue submission to make the vkAcquireNextImageKHR semaphore visible to vkDeviceWaitIdle.

@krOoze
Copy link
Contributor Author

krOoze commented Dec 2, 2022

Yea. You take the semaphore. You put it into vkQueueSubmit to be waited on. You put a fence to the vkQueueSubmit. You now have a fence out of a semaphore that transitively has the same workload releasing it. So the fence is not entirely mandatory.

PS: So the issue is at least not without a workaround unlike the #1678

@mokafolio
Copy link

mokafolio commented Dec 2, 2022

Awesome, thanks for all the great input. Just to wrap things up for the time being:

While the existing documentation for vkAcquireNextImageKHR and vkDeviceWaitIdle might be technically correct, it does not prepare you for the impending doom you are inevitably gonna encounter once you try to properly react to VK_ERROR_OUT_OF_DATE_KHR and VK_SUBOPTIMAL_KHR. I assume that's why most online examples (including some of the official ones as far as I can tell) don't properly deal with it.

In my mind there are two solutions:

  1. Prioritize user friendliness of the API just like @krOoze suggests and make vkDeviceWaitIdle wait on vkAcquireNextImageKHR semaphores
  2. Put some form of a NOTICEABLE warning/note into the documentation/reference that makes you aware of this issue (as it is highly unintuitive)

I prefer the former.

@luxalpa
Copy link

luxalpa commented Mar 26, 2023

I'm trying to understand how to resolve this issue but I am not sure (and things I tried didn't work). There seems to be a new extension except it's so new it's not supported by the driver so I can't even test it...

@krOoze
Copy link
Contributor Author

krOoze commented Mar 26, 2023

@luxalpa The workaround is described above. If you get the dirty semaphore from vkAcquireNextImage, you can convert it to sanitized fence or semaphore (or even the act of waiting on it itself will make it subject to vk*WaitIdle) with empty vkQueueSubmit (as stupid as having to do that is).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants