Add a post-cache shader UID fixup pass #10747
Oh yeah, is there something I'm supposed to do when I change the layout of uids? The uid cache doesn't seem to be invalidating properly.
Yeah, you need to bump this: dolphin/Source/Core/VideoCommon/GXPipelineTypes.h Lines 18 to 22 in 431d757
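The referenced lines are not reproduced in this thread. Assuming they hold a version number that is stored alongside the UID cache, the sketch below shows why bumping such a number invalidates old entries. All names here (`kUidCacheVersion`, `ExampleUid`, `SaveCache`, `LoadCache`) are hypothetical and are not Dolphin's actual cache format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical version constant: bump it whenever the UID layout changes.
constexpr std::uint32_t kUidCacheVersion = 2;

// Hypothetical stand-in for a serialized pipeline UID.
struct ExampleUid
{
  std::uint32_t blend_bits;
  std::uint32_t shader_bits;
};

// Writes a version header followed by the raw UID bytes.
std::vector<std::uint8_t> SaveCache(const std::vector<ExampleUid>& uids,
                                    std::uint32_t version = kUidCacheVersion)
{
  std::vector<std::uint8_t> out(sizeof(version) + uids.size() * sizeof(ExampleUid));
  std::memcpy(out.data(), &version, sizeof(version));
  if (!uids.empty())
    std::memcpy(out.data() + sizeof(version), uids.data(), uids.size() * sizeof(ExampleUid));
  return out;
}

// Returns std::nullopt when the stored version differs from the current one,
// so entries written with an older UID layout are discarded instead of misread.
std::optional<std::vector<ExampleUid>> LoadCache(const std::vector<std::uint8_t>& data)
{
  std::uint32_t version = 0;
  if (data.size() < sizeof(version))
    return std::nullopt;
  std::memcpy(&version, data.data(), sizeof(version));
  if (version != kUidCacheVersion)
    return std::nullopt;

  const std::size_t payload = data.size() - sizeof(version);
  if (payload % sizeof(ExampleUid) != 0)
    return std::nullopt;

  std::vector<ExampleUid> uids(payload / sizeof(ExampleUid));
  if (!uids.empty())
    std::memcpy(uids.data(), data.data() + sizeof(version), payload);
  return uids;
}
```

Without the bump, a cache written with the old layout would pass the version check and its bytes would be reinterpreted under the new layout, which matches the "not invalidating properly" symptom described above.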
Ahh okay, should be fixed up now.
The point of Ubershaders is to remove stuttering, so whatever performance loss comes from using fbfetch shouldn't be a problem.
Adds a pass to process driver deficiencies between UID caching and use, allowing a full view of the whole pipeline, since some bugs/workarounds involve interactions between blend modes and the pixel shader
Reduce the number of different pipelines needed. Also works around drivers that break when you combine fbfetch with dual source blending
It's not supported by any PC graphics API, and therefore completely unused
Code-wise, LGTM. I tested a handful of games on my AMD card.


Adds a fixup pass that runs between the pipeline uid cache and actual pipeline generation that applies fixups based on the current backend's supported features.
A number of bugs (in particular BUG_BROKEN_DUAL_SOURCE_BLENDING) found themselves spread across VideoCommon and the backends, with each place having to guess at what the other would do, risking a failed pipeline compile if any disagreed. With the new implementation, a single pass runs over the entire pipeline at a point where all other decisions have been finalized, so it can make decisions based on what is actually about to happen, rather than trying to guess.
I'm not super happy with the new location either though. Other recommendations welcome.
Reasons for the new location:
Drawbacks of the new location:
I also deleted dstalpha from the BlendingState struct, since none of the backends use it anymore. I assume this is fine? Also, should I adjust the positions of all the other bitfields to fill in the gap?
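As a rough illustration of the single-pass idea described above, here is a minimal sketch of a fixup function that sees both the blend state and the shader configuration at once. It uses one example rule: a driver that breaks when the shader writes a second color output (src1) that the blend state never reads. Every type and name here (`BackendCaps`, `BlendConfig`, `ShaderConfig`, `PipelineConfig`, `ApplyDriverFixups`) is a hypothetical stand-in, not one of Dolphin's real structures.

```cpp
// Hypothetical description of what the current backend/driver can do.
struct BackendCaps
{
  bool bug_broken_dual_source_blending;
};

// Hypothetical, heavily simplified blend state.
struct BlendConfig
{
  bool blend_enable;
  bool uses_src1;  // true if any blend factor references the second color output
};

// Hypothetical, heavily simplified pixel shader configuration.
struct ShaderConfig
{
  bool outputs_src1;  // true if the shader writes a second color output
};

// The whole pipeline, as seen after every other decision has been made.
struct PipelineConfig
{
  BlendConfig blend;
  ShaderConfig shader;
};

// Runs after the cached UID is loaded and before the pipeline is compiled.
// Because it sees blend and shader state together, it does not need to guess
// what another layer will decide.
PipelineConfig ApplyDriverFixups(PipelineConfig cfg, const BackendCaps& caps)
{
  // The blend unit only reads src1 when blending is on and a factor uses it.
  const bool blend_reads_src1 = cfg.blend.blend_enable && cfg.blend.uses_src1;

  // Example rule: on affected drivers, drop a src1 output that nothing reads.
  if (caps.bug_broken_dual_source_blending && cfg.shader.outputs_src1 && !blend_reads_src1)
    cfg.shader.outputs_src1 = false;

  return cfg;
}
```

Taking the configuration by value and returning the adjusted copy keeps the cached UID untouched, so the same cache entry can be fixed up differently per backend. That is an assumption of this sketch, not a statement about how the PR implements it.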
Testing needed
I changed the meaning of BUG_BROKEN_DUAL_SOURCE_BLENDING to "dual source breaks when the shader outputs src1 but the blending configuration doesn't use it". I can confirm (from testing for PCSX2) that this is the actual cause for the Intel graphics listed (and it means less brokenness for dstalpha, which previously still enabled DSB), but need verification that I didn't break things for the AMD GPUs with the flag. If I did, maybe we should split it for the two different bug types. I tried running on OpenGL+AMD+macOS and didn't notice anything weird, but maybe I didn't try the right game.
Mario Kart Double Dash uses dstalpha without blending, and is improved by this PR on Intel+MVK.
Images
I reduced BUG_BROKEN_DISCARD_WITH_EARLY_Z to only apply fbfetch when early z was in use. I assume this won't break anything but you never know. Please test on M1.
I fully enabled all fbfetch things at all times on ubershaders when fbfetch is supported, which should reduce the number of different pipelines needed, but may slow down the shader a bit. (It was also a lazy way for me to get around a bug in the unofficial Intel Metal fbfetch support, where fbfetch + dual source blend freezes the GPU.) I assume this is fine since reducing compilation time is kind of the point of ubershaders, but it would be good to know others' opinions. For reference, my UHD 630 goes from 24 to 20 fps at 2x resolution on the Wind Waker title screen flyover when fbfetch is enabled with exclusive ubershaders.
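To make the "fewer pipelines" claim concrete, here is a small sketch that forces every fbfetch-related bit on when the backend supports fbfetch and then counts the distinct ubershader UIDs that remain. The bit names and layout are invented for illustration; the real ubershader UID layout is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical fbfetch-related UID bits (assumed layout, for illustration only).
constexpr std::uint32_t kFbFetchBlend = 1u << 0;
constexpr std::uint32_t kFbFetchLogicOp = 1u << 1;
constexpr std::uint32_t kFbFetchDstAlpha = 1u << 2;
constexpr std::uint32_t kAllFbFetchBits = kFbFetchBlend | kFbFetchLogicOp | kFbFetchDstAlpha;

// With fbfetch available, always take the fbfetch path so that UIDs differing
// only in these bits collapse into one. Otherwise leave the UID unchanged.
std::uint32_t CanonicalizeUberUid(std::uint32_t uid, bool backend_supports_fbfetch)
{
  return backend_supports_fbfetch ? (uid | kAllFbFetchBits) : uid;
}

// Counts how many distinct pipelines would need compiling for a set of draws.
std::size_t CountDistinctPipelines(const std::vector<std::uint32_t>& uids,
                                   bool backend_supports_fbfetch)
{
  std::set<std::uint32_t> distinct;
  for (const std::uint32_t uid : uids)
    distinct.insert(CanonicalizeUberUid(uid, backend_supports_fbfetch));
  return distinct.size();
}
```

In this toy model, UIDs that differ only in the three fbfetch bits collapse into one pipeline when fbfetch is supported. The cost is that every such shader carries the fbfetch path even when a given draw would not need it, which is the trade-off behind the fps figures quoted above.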
I haven't actually tested this outside of macOS. I don't see why it would break, but just in case.