Ubershaders 2.0 #5702

Merged
merged 19 commits into dolphin-emu:master from the ubershaders branch on Jul 30, 2017

stenzek
Contributor

@stenzek stenzek commented Jun 27, 2017

If users want to test, use this link: https://dl.dolphin-emu.org/prs/pr-5702-dolphin-latest-x64.7z

This pull request completes the implementation of ubershaders in the graphics backends, started by @phire. Most of the hard work was already done by them; I just had to write the vertex ubershaders, integrate them into the backends, and fix bugs.

Last time I checked, ubershaders are fifoci regression-free on GL. Haven't tried D3D or Vulkan.

In the graphics options under the enhancements tab, there is a new drop-down field, "Ubershader mode". The options are:

  • Disabled: "Classic" mode with normal shader generation. Stuttering will still exist, same as before. Recommended for low-end systems.
  • Hybrid: Compiles specialized shaders asynchronously; while this occurs, ubershaders are used instead. Best balance of performance and stuttering. This is not the same as the Ishiiruka async shaders! Game objects will continue to render as normal while shaders are being compiled. Not guaranteed to remove stuttering completely, as drivers often defer some work until the first time a program is used and/or GL_LINK_STATUS is checked (cough NVIDIA), which we do on the main thread.
  • Exclusive: Only use ubershaders for rendering. Largest performance hit. Don't expect to hit full speed at high resolutions, even on high-end systems. Least possible amount of compilation "stutter".
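The three modes above can be read as a per-draw decision. The following is a minimal sketch of that decision, not the code from this PR: `UbershaderMode`, `ShaderChoice`, and `ChooseShader` are hypothetical names introduced only for illustration, and the real backends track far more state than a single "ready" flag.

```cpp
// Hypothetical names throughout; this is an illustration, not Dolphin's code.
enum class UbershaderMode { Disabled, Hybrid, Exclusive };

enum class ShaderChoice
{
  Specialized,    // draw with the shader generated for this exact state
  Ubershader,     // draw with the general-purpose ubershader
  CompileAndWait  // block the draw until the specialized shader is compiled
};

// Decide what to draw with, given the selected mode and whether the
// specialized shader for the current state has finished compiling.
ShaderChoice ChooseShader(UbershaderMode mode, bool specialized_ready)
{
  switch (mode)
  {
  case UbershaderMode::Exclusive:
    // Specialized shaders are never used in this mode.
    return ShaderChoice::Ubershader;
  case UbershaderMode::Hybrid:
    // Fall back to the ubershader only while the background compile runs.
    return specialized_ready ? ShaderChoice::Specialized : ShaderChoice::Ubershader;
  case UbershaderMode::Disabled:
    break;
  }
  // "Classic" behaviour: a missing shader stalls the draw, which is the stutter.
  return specialized_ready ? ShaderChoice::Specialized : ShaderChoice::CompileAndWait;
}
```

Only the Disabled path can return `CompileAndWait`, which matches the description of it as the mode where stuttering still exists.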

There are also a few hidden options, which you can modify via the ini:

  • BackgroundShaderCompiling: Enables aforementioned "hybrid" mode.
  • DisableSpecializedShaders: Enables aforementioned "exclusive" mode.
  • PrecompileUberShaders: Precompiles all ubershader combinations at boot time. For drivers that support a shader cache, this will only take time on the first boot; subsequent boots should be very fast. You want to leave this enabled for the best experience, as the ubershader compile time is much longer than the specialized shader compile time.
  • ShaderCompilerThreads: Sets the number of worker threads created for asynchronous shader compilation. Most drivers have some sort of lock involved in shader creation, so this will only scale up to a certain point. It defaults to 1, which should hopefully be sufficient for background compiling in most cases without oversubscribing systems with fewer CPU cores. This can also be set to -1, which determines the number of threads based on the system it is running on, or 0, which disables asynchronous compilation.
  • ForceVertexUberShaders: Replaces specialized vertex shaders with uber shaders. Only really useful for debugging issues in the uber shaders.
  • ForcePixelUberShaders: Replaces specialized pixel shaders with uber shaders. Only really useful for debugging issues in the uber shaders.
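As a rough illustration of the ShaderCompilerThreads values described above (a positive count, 0 to disable, -1 for automatic), here is a sketch of how such a setting could be resolved. `ResolveCompilerThreads` is a hypothetical helper, and the "hardware threads minus one, at least one" rule for -1 is an assumption; the description only says the count is based on the system.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper, not Dolphin's actual implementation.
//   setting > 0  : use exactly that many worker threads
//   setting == 0 : asynchronous compilation is disabled
//   setting < 0  : pick a count from the number of hardware threads
//                  (only -1 is documented; treating every negative value
//                  as "automatic" is an assumption of this sketch)
int ResolveCompilerThreads(int setting, unsigned hardware_threads)
{
  if (setting >= 0)
    return setting;

  // Assumed policy: leave one hardware thread free, but always use at least
  // one worker. This also covers hardware_threads == 0, which is what
  // std::thread::hardware_concurrency() reports when the count is unknown.
  return std::max(static_cast<int>(hardware_threads) - 1, 1);
}

// Convenience overload that queries the running system.
int ResolveCompilerThreads(int setting)
{
  return ResolveCompilerThreads(setting, std::thread::hardware_concurrency());
}
```

Passing the hardware thread count as a parameter keeps the policy testable without depending on the machine it runs on.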

Feel free to report bugs at this point. Please attach a fifolog where possible, and provide as much detail as you can, as this will allow me to get the issue fixed faster.

For the best experience depending on your operating system and GPU vendor:

  • Intel on Windows
    • Use D3D for Exclusive or Hybrid modes.
    • Driver generates variants with OpenGL -> stuttering.
    • The Vulkan driver only supports Skylake+, and is buggy anyway.
  • Intel on Linux
    • Use Vulkan for Exclusive or Hybrid modes.
    • anv works quite well.
    • i965 doesn't share compiled code between contexts, which means the main context will always recompile and stutter.
  • AMD on Windows
    • Use D3D for Hybrid mode.
    • Use D3D or Vulkan for Exclusive mode.
    • The AMD GL driver is just slow in general.
    • Vulkan doesn't work too badly, but the shader cache is ineffective, leading to long boot times.
  • AMD on Linux
    • Use Vulkan for Exclusive or Hybrid modes.
    • I haven't tested radeonsi, but radv likely behaves similarly to anv. If radeonsi shares GPU code between contexts, GL may be an option.
  • NVIDIA on Windows
    • Use D3D for Hybrid mode. GL isn't too bad with the latest changes, but may still have some stutter.
    • Use D3D or GL or Vulkan for Exclusive mode. D3D will get you the best performance.
  • NVIDIA on Linux
    • Use GL for Hybrid mode.
    • Use GL or Vulkan for Exclusive mode. GL performs slightly better.

A few notes:

  • There is a large performance hit when using ubershaders. This is most noticeable in the exclusive mode. In the hybrid mode, ubershaders likely aren't used for every object being rendered, meaning the overall performance hit will be lower. There's still room for optimizations, but these can come later.
  • Per-pixel lighting is not currently compatible with ubershaders. If you enable per-pixel lighting, ubershaders won't be used, and you will still experience compilation stutter.
  • The ubershader caches are shared between games, hence the dependency on PR #5679 (ShaderGen: Decouple host state from shader UIDs). The compile times can be pretty long, so it makes sense to share them where possible.
  • D3D11 currently offers the best experience in regards to compilation stutter. The NV GL driver still generates variants behind our back, which sometimes creates a noticeable hitch.
  • AMD's Vulkan driver is garbage and doesn't use a pipeline cache, so every variant is expensive to create. This means when we generate ubershader variants, it'll still stutter. Not much we can do about this, unless they implement derived pipelines, which we could potentially make use of. They also fail at arrays in shader input/output interfaces, forcing an ugly workaround.
  • Progress dialogs for shader compilation at boot are implemented for D3D and Vulkan only.
  • OpenGL asynchronous compilation is currently done via the ARB_parallel_shader_compile extension, currently only implemented by NVIDIA. We may consider a multi-context approach in the future, but for other vendors, you may wish to use one of the other backends.

@stenzek stenzek added the "WIP / do not merge" label Jun 27, 2017
@stenzek stenzek force-pushed the ubershaders branch 3 times, most recently from a7006ff to 89fc5a5 Compare June 27, 2017 08:26
@@ -1279,8 +1280,12 @@ void ShaderCache::PrecompileUberShaders()

 void ShaderCache::WaitForBackgroundCompilesToComplete()
 {
-  m_async_shader_compiler->WaitUntilCompletion();
+  m_async_shader_compiler->WaitUntilCompletion([](size_t completed, size_t total) {
+    Host_UpdateProgressDialog("Compiling shaders...", static_cast<int>(completed),

This comment was marked as off-topic.
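The diff hunk above passes a progress callback taking `(completed, total)` to `WaitUntilCompletion`. The sketch below shows the general shape of that pattern under simplifying assumptions: `PendingWorkQueue` is a hypothetical, single-threaded stand-in that runs the remaining work on the calling thread, whereas the compiler in this PR uses worker threads. Only the callback signature is taken from the hunk.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical, single-threaded stand-in for an asynchronous compiler queue.
class PendingWorkQueue
{
public:
  using WorkItem = std::function<void()>;
  using ProgressCallback = std::function<void(std::size_t completed, std::size_t total)>;

  void QueueWorkItem(WorkItem item) { m_pending.push_back(std::move(item)); }

  // Runs every pending item on the calling thread and, if a callback was
  // supplied, reports progress after each item finishes.
  void WaitUntilCompletion(const ProgressCallback& progress = nullptr)
  {
    const std::size_t total = m_pending.size();
    std::size_t completed = 0;
    while (!m_pending.empty())
    {
      WorkItem item = std::move(m_pending.front());
      m_pending.pop_front();
      item();
      ++completed;
      if (progress)
        progress(completed, total);
    }
  }

private:
  std::deque<WorkItem> m_pending;
};
```

A caller would forward `completed` and `total` to whatever UI it has, much as the hunk forwards them to a progress dialog.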

@iwubcode
Contributor

iwubcode commented Jun 27, 2017

i7 6700k here with an amd rx480 graphics card playing at 4k with the default number of threads using Windows 10.

Vulkan

Hybrid mode is not playable, tried a few games and get the following error on each:

VulkanLoader.cpp:314 E[Video]: (Vulkan::CommandBufferManager::SubmitCommandBuffer) vkQueueSubmit failed: (-3: VK_ERROR_INITIALIZATION_FAILED)

Oddly, exclusive mode sees no error but as you mentioned is really slow (50% decrease in speed) so it's hard to gauge stuttering.

D3D

Hybrid mode works and removes stuttering completely in my tests!

Exclusive mode also works but is very slow.

OGL

Seemed to be mostly the same as D3D but being an AMD user I never use it so I have little to compare it to!

@fantesykikachu

AMD RX460 user on Windows 10.
First, Vulkan with hybrid ubershaders worked fine for me.
Second, I got no progress dialog for DX11, no matter whether I'm using hybrid or exclusive (this leads me to believe it is not running).
Third, I found a really weird bug: in Final Fantasy Crystal Chronicles (GC) with DX11, MSAA anti-aliasing and either hybrid or disabled (yes, disabled) ubershaders, the main menu becomes corrupted (turns solid purple). If you click a button, the next menu appears fine; however, in game a lot of 3D textures don't show up.

@Tilka
Member

Tilka commented Jun 27, 2017

GT Cube (GTCJBL): When you're in the garage menu, switching between Disabled/Hybrid and Exclusive noticeably changes lighting. I think a light disappears with ubershaders. (on Nvidia/Windows using either OpenGL or Direct3D, Vulkan errors out with "Failed to submit command buffer.")

fifolog

@mimimi085181
Contributor

More AMD results: Windows 7, AMD Radeon HD 7790, driver version 17.4.4. I have only briefly tested New Super Mario Bros with a portable.txt and only changing the IR to 3x:

  • OpenGL has no progress dialog for the compilation, but otherwise seems to work.
  • Vulkan seems to work; I can't say anything about stutter, as I'm not that sensitive to it.
  • DirectX works in hybrid mode after 2 shader compilation errors; exclusive does not, and freezes after the 3rd compilation error. I have uploaded the 2 error messages and the bad shaders in the user folder. There were only 2 unique shaders; the rest were duplicates of those 2.

bad_ps_0000.txt
bad_ps_0001.txt
shader error 1
shader error 2

@gourdcaptain

Arch Linux x64, Intel Core i7 6800k, AMD Radeon RX 460 on the Mesa 17.1.3 open-source drivers. Did a quick test of the ubershaders (with mostly Super Smash Bros Brawl, the game I have that suffers the most from shader stuttering on an empty cache) and got the following results:
OpenGL, Hybrid: I think it might be a slight improvement over OpenGL without Ubershaders, but it honestly didn't feel that great.
OpenGL, Exclusive: No stuttering, but too slow to be what I want.
Vulkan, Hybrid or Exclusive: With the RADV Vulkan drivers, Dolphin just crashes as soon as a game is launched (this game works fine in Vulkan on a normal Dolphin build or this one without Ubershaders). Attaching a log, although it doesn't seem especially useful.
dolphin-ubershaders-radv-crash.txt

@JMC47
Contributor

JMC47 commented Jun 28, 2017

@gourdcaptain unfortunately we rely on drivers a lot. Your card is more than good enough, but, the open source drivers aren't there yet to be able to run this at full speed. AMD always seems to struggle on OpenGL for performance.

Hopefully we can make ubershaders more efficient and hopefully driver devs care enough to help to make our use case more efficient as well. Thanks for the crash report, that will get looked into immediately.

@gourdcaptain

When I say it's too slow, Hybrid OpenGL runs full speed at 3x IR and still stutters a bit as noted it might. Exclusive runs full speed at 1x IR, but as noted, can't do larger IRs. Just want to make sure it didn't sound worse than it actually was. (The card in general has decent performance on recent stable Mesa under Vulkan or on most games in OpenGL.)

@JMC47
Contributor

JMC47 commented Jun 28, 2017

3x IR on Vulkan/OpenGL can't happen on almost anything right now. 3x/4x IR is possible on D3D. Hopefully we can increase performance on OpenGL/Vulkan soon to at least match D3D's performance on ubershaders.

Thanks for the clarification, that makes more sense. I'm really impressed the open source drivers can do 1x IR full speed.

@emmauss

emmauss commented Jun 28, 2017

Using an Intel i5 4210U and a GeForce 840M on Windows 10, with the latest drivers. Ubershaders work well, esp. in Final Fantasy Crystal Chronicles - The Crystal Bearers, a game notorious for its heavy use of async shader compilation. It stutters like heck, esp. in the desert area, with no ubershaders. With them on, after an initial run through an area, repeated runs stutter less and less. There will still be some stuttering even in a place you have been lots of times, but it does not hurt gameplay. Tested with D3D11, Vulkan and OGL.
The following were done on hybrid mode, native resolution with default enhancements. The area tested is the eastern wildlands.
Vulkan - smoother framerate on hybrid than D3D and OGL. Has better performance than D3D and OGL. 28-30fps. Little to no stuttering
OGL - worse performance on hybrid compared to D3D and vulkan, but better than when disabled. about 20-22fps on average.
D3D11 - good performance with some stuttering. 26-30fps
On exclusive, no stuttering, but heavy performance hit as expected
Vulkan - 18-22FPS
D3D11 - 19-23FPS, sometimes going to 28FPS
OGL - 15-18FPS

@theboy181

Is there a build for this available somewhere?

@stenzek
Contributor Author

stenzek commented Jun 28, 2017

@Tilka Do you have per-pixel lighting on? That's the only setting that let me reproduce what you're describing with the fifolog. Per-pixel lighting should disable ubershaders though, so there's a bug if that's not happening, I'll look into it. Edit: Should be fixed now

@mimimi085181 I have a feeling the older D3DCompiler that ships with Win7 may be choking on our shaders. Try grabbing d3dcompiler_47.dll from "C:\Program Files (x86)\Windows Kits\10\bin\10.0.15063.0\x64", and copying it to the Dolphin directory.

@theboy181 https://dl.dolphin-emu.org/prs/pr-5702-dolphin-latest-x64.7z

@ligfx
Contributor

ligfx commented Jun 28, 2017

Graphics on macOS w/ an Intel card are messed up regardless of which ubershader option (Disabled, Hybrid, or Exclusive) I select:

[Screenshot: screen shot 2017-06-27 at 10 02 37 pm]

(I bisected the issue down to... "OGL: Uber shader support". Womp!)

@phire
Member

phire commented Jun 28, 2017

Yeah.... I think that's an OSX driver issue.

@MayImilae
Contributor

Confirmed, macOS is doing reeaaally weird things with this PR! Even with ubershaders off.

That's the Brawl intro video, with ubershaders off. In addition to being black and white and upside down, Brawl's intro video was weirdly slow (something pr5679 shares with it, weirdly enough!) Taiko no Tatsujin Wii only shows cyan, and changing the ubershaders options on or off changes nothing. Changing the setting, closing the emulator, and then running a game doesn't affect anything either.

Setting "PrecompileUberShaders = False" in the INI options made Brawl's intro video full speed again, but didn't change the visuals.

So um, drivers. Yay! Unfortunately since turning off ubershaders is broken it's kind of a big deal...

Tested on:
MacBook Pro 13in mid-2012
MacOS 10.12.5
Core i5-3210M @ 2.5ghz
Intel HD Graphics 4000
16GB DDR3-800

@mimimi085181
Contributor

@stenzek: That .dll fixes the compilation error.

@JMC47
Contributor

JMC47 commented Jun 28, 2017

Should we ship that dll?

@stenzek
Contributor Author

stenzek commented Jun 28, 2017

@JMC47 potentially. From what I've read, you're allowed to distribute it: https://blogs.msdn.microsoft.com/chuckw/2012/05/07/hlsl-fxc-and-d3dcompile/

@phire
Member

phire commented Jun 28, 2017

Though perhaps we should just precompile the ubershaders instead.

@shuffle2
Contributor

Or drop downlevel support \o/

@tabnk

tabnk commented Jun 28, 2017

Tatsunoko vs. Capcom: Ultimate All-Stars. 5xIR with Per-pixel lighting disabled.

Runs great, with smooth gameplay. Almost no noticeable stuttering. :)

Windows 10 insider build 16199.
Nvidia GTX 960. Latest driver.
Direct3D 11 with Hybrid mode.

@stenzek
Contributor Author

stenzek commented Jun 28, 2017

@gourdcaptain The latest push fixes the crash on anv, hopefully will do the same for radv. If not, I'll set up my AMD box again and investigate.

For those curious, seems switch (x) { default: return float3(0.0, 0.0, 0.0); } causes the SPV->NIR compiler to crash. Might be worth reporting.

@stenzek stenzek force-pushed the ubershaders branch 3 times, most recently from 66a9b8d to b45c85e Compare June 28, 2017 14:24
@Helios747 Helios747 merged commit ba57605 into dolphin-emu:master Jul 30, 2017
@Sarkie

Sarkie commented Jul 30, 2017

Congrats!

@stenzek
Contributor Author

stenzek commented Jul 30, 2017

Please note there is currently a known issue with primus, which causes Dolphin to crash on boot in OpenGL mode regardless of whether ubershaders are enabled. I'm going to look at this as soon as possible, but in the meantime, setting ShaderCompilerThreads=0 and ShaderPrecompilerThreads=0 in GFX.ini (Settings section) should allow you to continue to use Dolphin.
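As a config fragment, the workaround described above would look roughly like this in GFX.ini. The key names and the Settings section come from the comment itself; the spacing around `=` is an assumption about formatting, and any other keys already present in that section would stay as they are.

```ini
[Settings]
ShaderCompilerThreads = 0
ShaderPrecompilerThreads = 0
```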

@ghost

ghost commented Jul 30, 2017

This might sound a bit strange, but is force 24-bit color always enabled in exclusive mode? There's a game that actually needs it disabled called cyber sled since it causes color artifacts on the flat shaded polygons in game. Disabling it works with none or hybrid, but not exclusive.

@stenzek
Contributor Author

stenzek commented Jul 30, 2017

@spring568 sounds like a bug. Could you please record and upload a fifolog of the issue? It may not be related to only the 24-bit color setting.

@ghost

ghost commented Jul 30, 2017

@stenzek
Here you go

cybersled.zip

edit: reuploaded, it had a dud fifo log

@stenzek
Contributor Author

stenzek commented Jul 30, 2017

@spring568 Thanks, I'll have a look when I next get a chance.

@ghost

ghost commented Jul 30, 2017

Sure no problem! Ah I also forgot you need ignore format changes and store efb copies to texture unchecked for proper graphics.

@avindra

avindra commented Jul 30, 2017

Incredible work, all! The hybrid mode seems to work well on MacOS on an integrated chip (Intel HD Iris 6100), can't wait to try this on AMD with Mesa / Vulkan.

@bb010g
Contributor

bb010g commented Jul 30, 2017

#3163 (original PR) and #3185 (fifoci) should be closed now.

@stenzek
Contributor Author

stenzek commented Jul 31, 2017

@spring568 PR #5858 should sort out this issue, or, at least it does for me when playing back the fifolog.

@stenzek stenzek deleted the ubershaders branch February 19, 2018 15:36