Vulkan Backend #3935

Merged
merged 17 commits into from
Sep 30, 2016

Conversation

stenzek
Contributor

@stenzek stenzek commented Jun 25, 2016

This thread will be strictly moderated; any off-topic or pointless comments will be deleted.

  • Not going to implement XFB, waiting on TextureCache-based XFB first.
  • Same with screenshot/frame dumping
  • You can probably play games now.
  • Performance in my brief testing is slightly above OpenGL on NVIDIA
  • AMD seems roughly 25% faster compared to OpenGL in some scenarios (apart from needing the dualsrc blend fallback)
  • Can't really test on Intel, my most recent iGPU is a Haswell and Vulkan support is buggy

Rough overall todo:

  • Basic framework
  • Ability to draw stuff
  • Texture/vertex uploading
  • Update to new source styling
  • Sort out shader compiler
  • Having games render a decent amount of stuff
  • Not crash regularly (somewhat improved)
  • EFB to texture
  • EFB to ram
  • EFB format changes
  • GPU texture conversion
  • CPU EFB access
  • Bounding box
  • Perf queries
  • MSAA support
  • Stereoscopy support
  • Anisotropic filtering support
  • Logic ops for blending
  • Vsync
  • Texture dumping
  • Support for mobile/android
  • Support for drivers without coherent buffer mapping (needed?)
  • Support for drivers without dual source blending (looking at you, Adreno)
  • Pipeline state caching

Now feature-complete, time for clean-ups/bug-fixing/performance work.



@phire
Member

phire commented Jun 25, 2016

Looks like you got further than me.

My attempt puts GLSLang in externals - commit
It also puts vulkan headers inside externals and includes a dynamic loader - commit

@ratchetfreak

https://github.com/stenzek/dolphin/blob/vulkan-pr/Source/Core/VideoBackends/Vulkan/SwapChain.cpp#L248

This is not allowed. You have to acquire the images before you can transition them.

https://www.khronos.org/registry/vulkan/specs/1.0-wsi_extensions/xhtml/vkspec.html#_wsi_swapchain

Use of a presentable image must occur only after the image is returned by vkAcquireNextImageKHR, and before it is presented by vkQueuePresentKHR. This includes transitioning the image layout and rendering commands.

Instead you can always treat the image as starting as UNDEFINED when acquired. Though that implies that you don't care about the data still in the image (which you don't because you clear it on renderpass begin). The other option is to track the images to see which one has been presented already.

See http://stackoverflow.com/q/37524032/731620 for more info.

Note that the renderpass can do this transition by setting the initialLayout and finalLayout during renderpass creation. You do need to add a subpass dependency from external to 0 with srcStageMask matching the waitStageMask of the acquire semaphore and a dstStageMask of COLOR_ATTACHMENT_OUTPUT.
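The render pass setup described above can be sketched roughly as follows. To keep it compilable without the Vulkan SDK, the types here (`ImageLayout`, `AttachmentDesc`, `SubpassDependency`, `MakeSwapChainPassInfo`) are simplified stand-ins invented for illustration; real code would fill `VkAttachmentDescription` and `VkSubpassDependency` with the corresponding Vulkan enums.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins named after Vulkan concepts. They are NOT the real
// Vulkan definitions; they only model the fields discussed in the comment.
enum class ImageLayout { Undefined, ColorAttachmentOptimal, PresentSrcKHR };
enum class LoadOp { Load, Clear, DontCare };

constexpr uint32_t kSubpassExternal = ~0u;           // models VK_SUBPASS_EXTERNAL
constexpr uint32_t kStageColorAttachmentOutput = 1;  // models the COLOR_ATTACHMENT_OUTPUT stage bit

struct AttachmentDesc
{
  LoadOp load_op;
  ImageLayout initial_layout;  // layout the image is assumed to be in at pass begin
  ImageLayout final_layout;    // layout the pass transitions the image to at pass end
};

struct SubpassDependency
{
  uint32_t src_subpass;
  uint32_t dst_subpass;
  uint32_t src_stage_mask;
  uint32_t dst_stage_mask;
};

struct SwapChainPassInfo
{
  AttachmentDesc color;
  SubpassDependency acquire_dependency;
};

// Builds the description of a render pass that draws to a freshly acquired
// swap chain image. acquire_wait_stage_mask is the stage mask the queue
// submission waits on for the acquire semaphore.
SwapChainPassInfo MakeSwapChainPassInfo(uint32_t acquire_wait_stage_mask)
{
  assert(acquire_wait_stage_mask != 0 && "the acquire semaphore must wait at some stage");

  SwapChainPassInfo info{};
  // The old contents are cleared anyway, so UNDEFINED is a valid starting layout.
  info.color.load_op = LoadOp::Clear;
  info.color.initial_layout = ImageLayout::Undefined;
  // The image leaves the pass ready to be handed to the presentation engine.
  info.color.final_layout = ImageLayout::PresentSrcKHR;
  // External -> subpass 0: the layout transition must not start before the
  // acquire semaphore wait has completed.
  info.acquire_dependency = {kSubpassExternal, 0, acquire_wait_stage_mask,
                             kStageColorAttachmentOutput};
  return info;
}
```

With this arrangement no explicit barrier is needed before the image has been acquired, which is the problem the linked line ran into.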

@lioncash lioncash added WIP / do not merge Work in progress (do not merge) RFC Request for comments labels Jun 25, 2016
@stenzek
Contributor Author

stenzek commented Jun 26, 2016

Good to know @ratchetfreak, thanks for the explanation.

For now I'm just transitioning from UNDEFINED to COLOR_ATTACHMENT after beginning the render pass, and setting the final layout to PRESENT_SRC. Overall the barriers in their current state could definitely be more fine grained than they are, will be something to look into later on.

VkClearAttachment clear_attachments[2];
uint32_t num_clear_attachments = 0;
// Native -> EFB coordinates
TargetRectangle target_rc = Renderer::ConvertEFBRectangle(rc);

// Clearing must occur within a render pass.
m_state_tracker->BeginRenderPass();


@davidbepo

Good work. I tested Sonic Colors on an Intel HD 5500 under Linux and it worked perfectly (but a little slower than OpenGL),
but Dragon Ball Z BT3 rendered incorrectly.
[Screenshot from 2016-07-12 19-07-43]
[Screenshot from 2016-07-12 19-08-05]
[Screenshot from 2016-07-12 19-08-11]

Keep up the good work.

@davidbepo

The DBZ failures still happen with the latest commit from today.

@stenzek
Contributor Author

stenzek commented Jul 20, 2016

@davidbepo does this issue also manifest on D3D11? I vaguely recall the issues with that game being caused by depth inaccuracy.

Unfortunately we're not able to use the viewport depth trick like we do in the GL backend for Vulkan (according to the spec it's allowed, it works on NVIDIA, but breaks on AMD), so this won't be solvable in the backend without a different solution (which will also solve it for D3D).

@davidbepo

@stenzek I can't test D3D11 because I am on Linux (as I said in my previous comment).

If you think it is useful, I can test SSBB, One Piece UC2, Sonic Colors and Inazuma Eleven GO Strikers 2013.

@Enverex
Contributor

Enverex commented Jul 20, 2016

Would it be worth implementing it and pushing the issue back upstream to AMD? As it sounds more like an issue they need to fix.

@stenzek
Contributor Author

stenzek commented Jul 20, 2016

@davidbepo Any feedback for Intel is definitely helpful! I'm unable to test it since my most recent CPU with an iGPU is a Haswell, and the Anvil driver completely breaks when running Dolphin (need to narrow that one down further at some point).

@Enverex I'm not entirely sure if the bugs I'm experiencing on AMD are still present in their latest hotfix driver, and didn't want to whinge in case they have been fixed (I'm limited to stable-ish drivers, since my AMD test system has an APU, their latest hotfix drivers only support discrete GPUs).

The main issue for AMD users is that shaders with an indexed output (dual source blending) cause graphics pipeline creation to fail. The driver advertises support, but just silently fails (it also fails in CodeXL with a simple shader). At the moment it just falls back to the two-pass hack, which is better than those objects not rendering at all.
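The fallback decision described here might look something like the sketch below. `BlendPath`, `ChooseBlendPath` and the `create_dual_source_pipeline` callback are hypothetical names, not taken from the PR; the point is only that the advertised feature bit cannot be trusted on its own, so the result of pipeline creation decides which path is used.

```cpp
#include <cassert>
#include <functional>

enum class BlendPath { DualSource, TwoPassFallback };

// create_dual_source_pipeline is a hypothetical callback that attempts to
// build the pipeline with an indexed (dual source) output and returns false
// if the driver fails to create it.
BlendPath ChooseBlendPath(bool driver_advertises_dual_source,
                          const std::function<bool()>& create_dual_source_pipeline)
{
  assert(create_dual_source_pipeline != nullptr);

  // Only try the fast path when the feature is advertised, and only keep it
  // when pipeline creation actually succeeded.
  if (driver_advertises_dual_source && create_dual_source_pipeline())
    return BlendPath::DualSource;

  // Otherwise draw the object twice with different blend state.
  return BlendPath::TwoPassFallback;
}
```

A driver that advertises the feature but fails creation ends up on `TwoPassFallback`, the same path as a driver that never advertised it.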

@davidbepo

@stenzek
All the games I tested on the Vulkan backend (except DBZ) work correctly, but a bit slower and with a bit more stuttering.
This could, however, be caused by the anv driver lacking performance features such as HiZ, fast color clears, guardband clipping...
See https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/vulkan/TODO for more details.

Should I also test your branch with the OpenGL backend?

@JMC47
Contributor

JMC47 commented Jul 20, 2016

@davidbepo DBZ has bugs with the depth emulation method we use for D3D (and now Vulkan too). It's expected until we come up with a solution to our depth problems.

@Fallcrest

@stenzek, I would be happy to test some games on Intel, the few that I have. Do you know if Broadwell GPUs have similar issues to Haswell before I even try?

@davidbepo

davidbepo commented Jul 20, 2016

@Fallcrest If you are on Linux, Broadwell works fine; I got most games rendering correctly.
I don't know about Windows, though.

@stenzek
Contributor Author

stenzek commented Jul 20, 2016

@davidbepo You'll likely get more stuttering than GL if it's due to shader compiling (since we now have num_vertex_shaders * num_pixel_shaders * num_primitives * num_vertex_formats, ... combinations of pipelines instead of num_vertex_shaders * num_pixel_shaders programs, with variations hidden by the driver). This may be reduced on further runs depending on what the driver gives us back for pipeline caches (AMD gives us pretty much nothing, nvidia seems to work). We're "caching" the generated SPIR-V from the shaders themselves, so worst case your driver just has to do the last "compilation" stage each run.
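To illustrate why the number of compiled objects grows, here is a minimal sketch of a pipeline cache keyed on the full state tuple. `PipelineKey` and `PipelineCache` are hypothetical names and the "handle" is just a counter standing in for a real pipeline object; the actual backend's key likely contains more state than these four fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical key: any field that differs forces a separate pipeline, so the
// worst case is the product of the number of distinct values of each field.
struct PipelineKey
{
  uint32_t vertex_shader_id;
  uint32_t pixel_shader_id;
  uint32_t primitive_type;
  uint32_t vertex_format_id;

  bool operator<(const PipelineKey& rhs) const
  {
    return std::tie(vertex_shader_id, pixel_shader_id, primitive_type, vertex_format_id) <
           std::tie(rhs.vertex_shader_id, rhs.pixel_shader_id, rhs.primitive_type,
                    rhs.vertex_format_id);
  }
};

class PipelineCache
{
public:
  // Returns the handle for this key, "compiling" it on first use only.
  uint32_t GetOrCreate(const PipelineKey& key)
  {
    auto it = m_pipelines.find(key);
    if (it != m_pipelines.end())
      return it->second;  // cache hit: no compile, no stutter

    ++m_compile_count;  // cache miss: this is where a stutter would occur
    const uint32_t handle = static_cast<uint32_t>(m_pipelines.size()) + 1;
    m_pipelines.emplace(key, handle);
    assert(m_pipelines.size() == m_compile_count);  // one entry per compile
    return handle;
  }

  std::size_t CompileCount() const { return m_compile_count; }

private:
  std::map<PipelineKey, uint32_t> m_pipelines;
  std::size_t m_compile_count = 0;
};
```

With the same vertex and pixel shader pair, changing only the primitive type already produces a second entry, whereas a GL-style program cache keyed on the shader pair alone would have produced one.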

@Fallcrest AFAIK, Broadwell is much better supported than Haswell on the Anvil driver, but this is for Linux; on Windows, only Skylake has a Vulkan driver.

@Fallcrest

Fallcrest commented Jul 20, 2016

@stenzek Oh well, then. That is unfortunate, as I am on Windows.

@jerbmega

This Vulkan backend has been a life saver.

I've been following the progress (and compiling the github repo) over the past week. My situation is that I have an AMD FX-8120, so I have very weak single core performance, bottlenecking my 970. Vulkan seems to help a lot since it uses as many cores as it can get and has a lower overhead. Talos Principle went from struggling to do 40 FPS on ultra to 60 FPS on ultra, and that's just wrapped around DX11.

I couldn't do much with Dolphin due to the CPU bottleneck. A lot of the more CPU intensive games wouldn't run full speed at all. Vulkan remedied that for a lot of things.

Thank you stenzek for your hard work!

@stenzek
Contributor Author

stenzek commented Jul 20, 2016

@jerbear64 Unfortunately, in this case I wouldn't expect much of a performance difference. Compared to D3D11, definitely, but probably not much over NVIDIA's GL driver. If you're seeing some wins, great!

We can't do much in the way of using multi-threaded rendering for Dolphin, since the work for the renderer is "bottlenecked" by the CPU emulation feeding it (simplified: we don't know all the data upfront, so it's hard to split that up across several threads/cores, whereas a PC game has all that information upfront).

@jerbmega

I'm seeing some pretty big performance gains (especially in Sega titles)

I guess my processor is just that bad.


@cammelspit

I use Windows and not Linux, but if my Skylake's iGPU would be useful for testing I would be more than happy to give some tests a go. With my R9 380, the Vulkan backend is almost twice as fast as OpenGL and beats DX12 in terms of both stability and speed, though it is only about 10 FPS faster than DX12. Either way, amazing job, and let me know whether my testing anything would be useful to you, given that I use Windows.

@JMC47
Contributor

JMC47 commented Jul 20, 2016

On NVIDIA, performance is really close to OpenGL: depending on the game it is slightly slower, slightly faster, or way faster (Twilight Princess map).

EFB2RAM games seem slower, while EFB2Tex ones edge on faster.

@FIX94

FIX94 commented Jul 20, 2016

Just gave this a try on my AMD R9 280 and it does seem to perform very close to (but a little better than) DX12 in quite a few cases, so that is pretty cool! Still, some games do run better in plain DX11 for me; I don't know why.

@Anti-Ultimate
Contributor

The EFB to RAM implementation seems to have some trouble. In Super Mario Sunshine in Noki Bay, there are platforms which are supposed to appear after the goop is cleared. This doesn't work on Vulkan.

@JMC47
Contributor

JMC47 commented Jul 20, 2016

Wouldn't that be perf queries, which aren't implemented?

@davidbepo

@stenzek One question: does the Vulkan backend have a second-level FIFO?

I ask because it is one of the reasons the DX12 backend is so fast
(that's what the 5.0 release notes said).

@stenzek
Contributor Author

stenzek commented Jul 21, 2016

@davidbepo No, it doesn't. Command buffer execution is pushed to a worker thread to reduce latency on the CPU/Video thread, but pushing everything to a worker thread at the API command level causes a whole lot of issues (e.g. after an image resize, acquire calls fail when the next one has already been issued; this is the reason that exclusive fullscreen cannot be implemented on our D3D12 backend).
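As a rough illustration of the "push command buffer execution to a worker thread" part, the sketch below queues whole submissions (not individual API calls) for a single worker. `SubmitWorker` is a hypothetical class, and each job is a plain `std::function` standing in for the actual queue submission; it is not how the PR implements it.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class SubmitWorker
{
public:
  SubmitWorker() : m_thread([this] { Run(); }) {}

  ~SubmitWorker()
  {
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      m_stop = true;
    }
    m_cv.notify_all();
    m_thread.join();  // remaining jobs are drained before the thread exits
  }

  // Called from the video thread: hands off a finished command buffer.
  void Submit(std::function<void()> job)
  {
    assert(job != nullptr);
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      m_jobs.push(std::move(job));
    }
    m_cv.notify_all();
  }

  // Blocks until every queued submission has finished executing.
  void WaitIdle()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cv.wait(lock, [this] { return m_jobs.empty() && !m_busy; });
  }

private:
  void Run()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    for (;;)
    {
      m_cv.wait(lock, [this] { return m_stop || !m_jobs.empty(); });
      if (m_jobs.empty())
        return;  // only reachable when stopping with nothing left to do

      std::function<void()> job = std::move(m_jobs.front());
      m_jobs.pop();
      m_busy = true;
      lock.unlock();
      job();  // the slow submission runs without holding the lock
      lock.lock();
      m_busy = false;
      m_cv.notify_all();  // wake WaitIdle() callers
    }
  }

  std::mutex m_mutex;
  std::condition_variable m_cv;
  std::queue<std::function<void()>> m_jobs;
  bool m_busy = false;
  bool m_stop = false;
  std::thread m_thread;  // declared last so the members above exist before Run() starts
};
```

Operations that depend on earlier submissions having completed, such as recreating the swap chain on a resize, would call `WaitIdle()` first; that is the kind of synchronization point that gets harder once individual API calls are deferred.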

IMO, the D3D12 method is definitely not worth the added complexity and issues that it causes for a very slight increase in performance (even without it, vulkan is pretty close to matching D3D12 perf).

The better solution here would be to create our own command list (e.g. draw x indices from buffer y using shader z), and do it at the VideoCommon level rather than the backend level. This way the benefits are shared across backends, and the API overhead is completely removed from the video thread (which is not the case in D3D12, where API calls are issued on both the main thread and the worker thread).
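A backend-independent command list of the kind proposed here could be sketched like this. All names (`DrawCommand`, `CommandList`, `Backend`, `CountingBackend`) are hypothetical and only model the "draw x indices from buffer y using shader z" example; nothing like this exists in the PR itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One recorded command: "draw x indices from buffer y using shader z".
struct DrawCommand
{
  uint32_t index_count;
  uint32_t buffer_id;
  uint32_t shader_id;
};

// What each backend (GL, D3D, Vulkan, ...) would implement.
class Backend
{
public:
  virtual ~Backend() = default;
  virtual void Draw(const DrawCommand& cmd) = 0;
};

// Recorded on the video thread without touching any graphics API.
class CommandList
{
public:
  void RecordDraw(uint32_t index_count, uint32_t buffer_id, uint32_t shader_id)
  {
    assert(index_count > 0);
    m_commands.push_back({index_count, buffer_id, shader_id});
  }

  // Replays the list on whichever thread calls this, then empties it.
  void Execute(Backend& backend)
  {
    for (const DrawCommand& cmd : m_commands)
      backend.Draw(cmd);
    m_commands.clear();
  }

  std::size_t Size() const { return m_commands.size(); }

private:
  std::vector<DrawCommand> m_commands;
};

// Trivial backend used only to show that replay reaches the backend.
class CountingBackend final : public Backend
{
public:
  void Draw(const DrawCommand& cmd) override
  {
    total_indices += cmd.index_count;
    ++draw_calls;
  }

  uint64_t total_indices = 0;
  uint32_t draw_calls = 0;
};
```

Because recording only appends small structs, all actual API calls can happen inside `Execute` on a different thread, which is the property the comment is after.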

@davidbepo

@stenzek thanks for the explanation.
hoping to see the videocommon improvement

@Anti-Ultimate
Contributor

On Ubuntu 16.04, with NVIDIA 367.35, resizing the render window does not work, similar to the bug that existed under Windows.

6a1594f

Should this be done under Linux too?

@degasus
Member

degasus commented Sep 30, 2016

Reviewed 8 of 31 files at r27, 1 of 1 files at r28.
Review status: 45 of 95 files reviewed at latest revision, 39 unresolved discussions.



@lioncash lioncash merged commit 46b9383 into dolphin-emu:master Sep 30, 2016
@lioncash lioncash removed RFC Request for comments WIP / do not merge Work in progress (do not merge) labels Sep 30, 2016
@mbc07
Contributor

mbc07 commented Oct 1, 2016

Aaand the buildbot apparently goes down just after merging, so, no Windows version 😞

@psennermann

With my GTX 950 on Windows 10, DX12 is slightly better than Vulkan in copy to texture, while Vulkan definitely outperforms DX12 in copy to RAM. Most importantly, shader cache stuttering seems better with Vulkan (noticeably less than with DX and OGL; tested with Mario Kart Wii).
Anyway, I wanted to report a bug with the game Fatal Frame 4: as you can see, Vulkan doesn't render the scene correctly (maybe a problem with the flashlight?)

FF4 Vulkan
r4zj01-2

FF4 Dx12
r4zj01-4

@stenzek
Contributor Author

stenzek commented Oct 1, 2016

Hi @psennermann

Are you using the newest NVIDIA drivers? 370+ are broken if the game is using destination alpha, unfortunately until this is resolved you have to use an earlier driver (I'm using 368.81).

If it's not a driver problem, any chance you could record a fifolog of this issue? If you haven't done this before, boot the game and move to the point where the issue is happening (loading a save state works too), and select FIFO Player under the Tools menu. Switch to the Record tab, choose the number of frames (one is probably sufficient unless the issue is intermittent), hit Record, then Save, and upload the file.

Thanks!

@psennermann

Now I'm using 372.90, as soon as possible I'll try to roll back to 368.81 to see if it's really a Driver issue...
Thanks

@psennermann

Confirmed that it is a driver issue: I just uninstalled 372.90, then at reboot Windows 10 automatically installed 369.09, and now the problem is gone.

@psennermann

Don't know if someone already mentioned it, but sometimes the Vulkan backend seems less smooth. Every now and then, although frames still display at 60 fps, there's some visible "stuttering" (not due to the shader cache) that doesn't happen with other backends. If you try Mega Man X Collection you'll notice immediately what I mean (sometimes lateral scrolling becomes jerky). Speaking of Mega Man X Collection, there are video artifacts on screen, as you can see below:

gxge08-3

@7oxicshadow

Built on Linux. Couple of things I have found so far:

When using the CLI version I seem to have problems when going full screen. (I probably need to investigate further at my end)

F-Zero GX - Menus work but in game is black showing only the boosters

Rogue Squadron - Screen just flashes whilst the audio plays in the background

@psennermann

psennermann commented Oct 1, 2016

The "stuttering" problem that I mentioned before seems to be fixed for me by setting the monitor refresh rate to 59 Hz...

@gfxstrand

Jason Ekstrand from the Intel mesa team here... First off, good work guys! I'm glad to see yet another app using the Vulkan API and exercising our driver. I'm also glad to hear it's mostly working on Intel. Please, if you encounter any driver bugs, file them on bugs.freedesktop.org rather than just giving up and saying "doesn't run because of bugs". It's very hard to fix bugs we don't know about and, with as few Vulkan apps as are floating around these days, every bug report we get is important. We'll try and address things as quickly as we can.

@stenzek Please feel free to test on Haswell and report any bugs you find. Haswell support is at about 85% these days and I expect most stuff to more-or-less work.

@Enverex Please don't use RetroArch and paraLLel as a driver quality metric. We did have an exotic compiler bug that affected them fairly seriously but it's been fixed for a while now. More to the point, they use Vulkan in a very non-standard way (effectively doing a SW implementation in compute shaders) so "it renders like garbage" for them doesn't mean you should expect that for a normal Vulkan client.

Keep up the good work!

@Enverex
Contributor

Enverex commented Oct 1, 2016

@jekstrand That comment was actually based on issues with programs like The Talos Principle rather than ParaLLel in RetroArch given that ParaLLel was/is very experimental anyway.

@MayImilae
Contributor

@psennermann Please report issues to our issue tracker. https://bugs.dolphin-emu.org/issues/

@gfxstrand

@Enverex Fair enough. However, the quality of our driver is actually higher than most people with blogs or on forums think it is. Many of the issues have been with apps. Talos, for instance, shipped with a glslang bug that was causing us (but no one else; yes, I know that seems strange) problems. It took them a good month or so after the release before they shipped the update that fixed it. The Dota2 crashing people liked to complain about was due to a bug in the steam overlay.

The moral of the story is: File bugs rather than just blaming everything on bad drivers and moving on.

@JMC47
Contributor

JMC47 commented Oct 2, 2016

Vulkan doesn't have exclusive fullscreen support, so it may look choppier than OpenGL on Linux for quite a while.

@Sunderland93

Windows 8.1, Nvidia 368.81, Dolphin 5.0-753. After loading a game, the error "Failed to submit command buffer" occurs.


@stenzek stenzek deleted the vulkan-pr branch November 13, 2016 07:35
@dolphin-emu dolphin-emu deleted a comment Aug 8, 2017
@dolphin-emu dolphin-emu locked and limited conversation to collaborators Aug 8, 2017