[WIP] [Not for Merge] Vertex Ubershaders #3185

Closed
wants to merge 47 commits into from

Conversation

phire
Member

@phire phire commented Oct 18, 2015

This is a second PR for tracking the Vertex Ubershaders, so we can get fifoci runs without pixel ubershaders enabled.

This PR is more or less the same as #3163 but with ShaderGen for pixel shaders.


@mbc07
Contributor

mbc07 commented Oct 19, 2015

I know it's not ready for public consumption, but I would like to thank you anyway, phire. Tatsunoko Vs Capcom stuttering is hugely improved. It still has a little stutter here and there with this PR, but compared to master, where every single action you do in the game causes shader compilation stuttering, it's thousands of times better...

@JMC47
Contributor

JMC47 commented Oct 19, 2015

Someone's reports about that game made it a testing candidate for the ubershaders. If you were the one who reported the severe shader generation issues in the game, I thank you for bringing it to our attention.

@phire
Member Author

phire commented Oct 19, 2015

I hope you tested the other PR. This one currently has no ubershaders at all enabled.

@JMC47
Contributor

JMC47 commented Oct 19, 2015

Yeah, a majority of the stuttering in TvC is actually pixel shaders. The comment by @mbc07 points toward them using the other PR, as I have the same experience.

@mbc07
Contributor

mbc07 commented Oct 19, 2015

I tested the main Ubershaders PR as well; it also improves TvC (despite some funny colors in some characters' faces, probably because it's not finished)...

@JMC47
Contributor

JMC47 commented Oct 19, 2015

The funny colors on faces should be fixed. I tried a few random characters and it was working. The rest of the stuttering is vertex shaders being generated, so once it's done the game should NEVER stutter!

@phire phire mentioned this pull request Oct 22, 2015
10 tasks
@phire phire force-pushed the ubershaders-vertex branch 2 times, most recently from 6539c23 to 5d9549e Compare November 8, 2015 10:23
The only code which touches xfmem is code which writes directly into
uid_data.

All the rest now read their parameters out of uid_data.

I also simplified the lighting code so it always generates separate
codepaths for alpha and color channels instead of trying to combine
them on the off-chance that the same equation works for all 4 channels.

As modern (post 2008) GPUs generally don't calculate all 4 channels
in a single vector, this optimisation is pointless. The shader compiler
will undo it during the GLSL/HLSL to IR step.

Bug Fix: The above optimisation was also broken, applying the color light
         equation to the alpha light channel instead of the alpha light
         equation. But it doesn't look like anything triggered this bug.
This frees up 21 bits and allows us to shorten the UID struct by an entire
32 bits.

It's not strictly needed (as it's encoded into the length) but I added a
bit for per-pixel lighting to make my life easier in the following
commits.
Bug Fix: The normal stage UIDs were randomly overwriting indirect
         stage texture map UID fields. It was possible for multiple
         shaders with different indirect texture targets to map to
         the same UID.
         Once again, it doesn't look like this bug was ever triggered.
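
To illustrate how overlapping UID fields let distinct shader configurations collide, here is a minimal C++ sketch. The `ExampleUid` struct, the bit positions, and the helper names are assumptions made up for illustration; they are not Dolphin's actual UID layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified UID. NOT the real pixel shader UID layout.
struct ExampleUid
{
  std::uint32_t bits = 0;
};

// Writes `value` into the `width`-bit field that starts at bit `shift`.
inline void SetField(ExampleUid& uid, unsigned shift, unsigned width, std::uint32_t value)
{
  assert(width > 0 && width < 32 && shift + width <= 32);
  const std::uint32_t mask = ((1u << width) - 1u) << shift;
  uid.bits = (uid.bits & ~mask) | ((value << shift) & mask);
}

// Buggy layout: the normal-stage field reuses the bits of the indirect field,
// so the indirect texture map is lost before the UID is ever compared.
inline ExampleUid MakeUidOverlapping(std::uint32_t indirect_texmap, std::uint32_t stage_texmap)
{
  ExampleUid uid;
  SetField(uid, 0, 3, indirect_texmap);
  SetField(uid, 0, 3, stage_texmap);  // clobbers the indirect field
  return uid;
}

// Fixed layout: every field owns its own bits.
inline ExampleUid MakeUidDisjoint(std::uint32_t indirect_texmap, std::uint32_t stage_texmap)
{
  ExampleUid uid;
  SetField(uid, 0, 3, indirect_texmap);
  SetField(uid, 3, 3, stage_texmap);
  return uid;
}
```

With the overlapping layout, two shaders that differ only in their indirect texture map produce the same `bits` value, so a cache keyed on the UID would hand back the wrong shader. With disjoint fields the two UIDs differ.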
Bug Fix: It was theoretically possible for a shader with depth writes
         disabled to map to the same UID as a shader with late depth
	 writes.
	 No known test cases trigger this.
Bug Fix: Previously vertex shaders and geometry shaders didn't track
         antialiasing state in their UIDs, which could cause AA bugs
         on DirectX.
As much as possible, the asserts have been moved out of the GetUID
function. But there are some places where asserts depend on variables
that aren't stored in the shader UID.
Note: It's not 100% perfect, as some of the GPU capabilities leak into the
pixel shader UID.

Currently our UIDs don't get exported, so there is no issue. But someone
might want to fix this in the future.
Kind of pointless now that multiple shaders with the same UID are
fundamentally impossible.
Or anything else which doesn't use textures (Basically nothing)
This allows a large number of games to be semi-playable.
Also fixed up which registers Konst was being written to.
I did this mostly so it would work on llvmpipe and fifoci.
This fixes up the remaining alpha problems, particularly in N64 games.
See Source/Core/VideoCommon/DriverDetails.h:140 for the horrible details
Until now we have generated 1 ubershader for one PixelShaderGen UID,
which meant it would compile exactly as many shaders as before.
Oh and these shaders were way more complex than the old shadergen
shaders, so we were spending way more time compiling.

The llvmpipe based FifoCI's runtime had jumped from 6min to 36min.

With this commit we generate much simpler ubershader UIDs with
only 4 bits of state (16 shaders). We still generate them at
runtime when needed, but only ever 16 of them.

16 is not the final number.
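
As a rough sketch of why 4 bits of state bound the number of compiled shaders at 16, the C++ below collapses a larger draw state into a 4-bit key and caches one "shader" per key. The four flags, `ExampleDrawState`, `MakeUberUid`, and `UberShaderCache` are all hypothetical names; the commit does not say which 4 bits are used, and the string stands in for a real compiled shader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-draw state. Only the four flags are baked into the UID;
// everything else is assumed to be read by the ubershader at runtime.
struct ExampleDrawState
{
  std::uint32_t tev_stage_count;  // NOT part of the ubershader UID
  bool flag_a;
  bool flag_b;
  bool flag_c;
  bool flag_d;
};

// Packs the four baked flags into a 4-bit key in the range 0..15.
inline std::uint8_t MakeUberUid(const ExampleDrawState& s)
{
  return static_cast<std::uint8_t>((s.flag_a ? 1u : 0u) | (s.flag_b ? 2u : 0u) |
                                   (s.flag_c ? 4u : 0u) | (s.flag_d ? 8u : 0u));
}

// Compiles lazily at runtime, but can never hold more than 16 entries.
class UberShaderCache
{
public:
  const std::string& Get(std::uint8_t uid)
  {
    assert(uid < 16);
    auto it = m_shaders.find(uid);
    if (it == m_shaders.end())
    {
      // Placeholder for an expensive shader compile.
      it = m_shaders.emplace(uid, "ubershader variant " +
                                      std::to_string(static_cast<unsigned>(uid))).first;
    }
    return it->second;
  }

  std::size_t Size() const { return m_shaders.size(); }

private:
  std::map<std::uint8_t, std::string> m_shaders;
};
```

However many distinct values `tev_stage_count` takes across draws, the cache only ever sees 16 distinct keys.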
So I pulled up a profiler, and found that UberShaders were completely
memory bound, because the ColorInput and AlphaInput arrays were
stored in main memory. Worked fine at 1xIR, but by 2xIR it was
trying to write 2GB every frame to main memory and read back 500MB
for a test scene on Wind Waker's Outset island.

So we rewrite everything to use switch statements so it compiles to
uniform control flow selecting registers instead of indexed
reads/writes to main memory.

The result is impressively fast.
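
A minimal sketch of the idea, written as a C++ generator since the shaders are emitted as source text: instead of filling an array and indexing it dynamically, the generated GLSL picks a value with a `switch`. The function name `EmitSelector`, the `ivec3` return type, and the source expressions are assumptions for illustration, not the code from this PR; the expressions are assumed to be visible at the scope where the helper is emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Emits a GLSL helper that selects one of `sources` with a switch statement.
// No array is declared, so the shader compiler has nothing it needs to spill
// to memory for a dynamically indexed read.
inline std::string EmitSelector(const std::string& func_name,
                                const std::vector<std::string>& sources)
{
  assert(!sources.empty());
  std::ostringstream out;
  out << "ivec3 " << func_name << "(uint index) {\n";
  out << "  ivec3 result = ivec3(0, 0, 0);\n";
  out << "  switch (index) {\n";
  for (std::size_t i = 0; i < sources.size(); ++i)
    out << "  case " << i << "u: result = " << sources[i] << "; break;\n";
  out << "  }\n";
  out << "  return result;\n";
  out << "}\n";
  return out.str();
}
```

Calling `EmitSelector("selectColorInput", {"prev.rgb", "tex.rgb", "ras.rgb"})` yields a function with one `case` per input. If the index comes from a uniform, every invocation in a draw takes the same branch.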
They aren't needed and hide errors from DirectX.
Oh and mesa gets annoyed if you use the wrong type.
Well OK, I admit it was a pretty major fail. We weren't uploading
ksel, so we were using the wrong konsts like all the time.

Wind Waker now renders almost perfectly apart from cel shading.
I assume most other games will be pretty close to correct too.
Self-shadowing in Rogue Squadron 2 works.
Swizzling - Character lighting now shows the correct color (not red)
            in Wind Waker.

Konsts - Had an off-by-2 error, so we were using the wrong Konsts.
         Fixes the black borders around the end of the water in
         Wind Waker (yes, that is the only game I'm currently testing).
Now the shader always outputs the second color, as dual source
blending is turned on/off in ogl/dx state.
Should fix those single bit errors with pixel colors.
We were indexing the texture coordinates by sampler_num rather than
tex_coord. This happened to line up for a lot of games.

Fixes the red-tinted videos in Mario Sunshine (and other games) and
Clouds in the distance for Wind Waker.
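
The indexing mistake can be sketched in C++ as follows. `ExampleStageOrder` and both lookup functions are hypothetical simplifications (one float per coordinate rather than a full vector), intended only to show why the bug was invisible whenever the two indices happened to match.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified per-stage texture order.
struct ExampleStageOrder
{
  std::size_t sampler_num;  // which texture (sampler) the stage reads
  std::size_t tex_coord;    // which generated texture coordinate it uses
};

using TexCoords = std::array<float, 8>;  // one scalar per coordinate, for brevity

// Buggy lookup: only correct when sampler_num == tex_coord.
inline float LookupBuggy(const TexCoords& coords, const ExampleStageOrder& order)
{
  assert(order.sampler_num < coords.size());
  return coords[order.sampler_num];
}

// Fixed lookup: the coordinate is chosen by tex_coord.
inline float LookupFixed(const TexCoords& coords, const ExampleStageOrder& order)
{
  assert(order.tex_coord < coords.size());
  return coords[order.tex_coord];
}
```

A stage that pairs sampler 2 with coordinate 2 gets the same result from both lookups, which matches the note that this lined up for a lot of games. A stage that pairs sampler 0 with coordinate 3 does not.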
This should also give a nice speedup.
So Rogue Squadron works.
So we can test vertex changes without our incomplete pixel ubershader
getting in the way for fifoci.
Currently only for DirectX. Only implements per-vertex colors.
I'm really surprised this works as well as it does. It assumes
that each texcoordX is connected to texgenX and all texgens are in
the simplest mode.
@dolphin-emu-bot
Contributor

FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:

  • aeon-charge-attack on ogl-lin-intel: diff
  • chibi-robo-fastdepth on ogl-lin-intel: diff
  • chibi-robo-zfighting on ogl-lin-intel: diff
  • custom-brawl-char on ogl-lin-intel: diff
  • djfny-menu on ogl-lin-intel: diff
  • DKCR-Char on ogl-lin-intel: diff
  • ea-vp6 on ogl-lin-intel: diff
  • fifa-street on ogl-lin-intel: diff
  • find-mii on ogl-lin-intel: diff
  • fortune-street on ogl-lin-intel: diff
  • fortune-street-fog on ogl-lin-intel: diff
  • fortune-street-white-box on ogl-lin-intel: diff
  • fsa-layers on ogl-lin-intel: diff
  • f-zero-rain on ogl-lin-intel: diff
  • goldeneye-depth on ogl-lin-intel: diff
  • inverted-depth-range on ogl-lin-intel: diff
  • kirby-shadows on ogl-lin-intel: diff
  • luigi-shadows on ogl-lin-intel: diff
  • mario-sluggers-bar on ogl-lin-intel: diff
  • mario-tennis-menu on ogl-lin-intel: diff
  • medabots-crash on ogl-lin-intel: diff
  • megaman-heat on ogl-lin-intel: diff
  • melee-depth on ogl-lin-intel: diff
  • melee-lighting on ogl-lin-intel: diff
  • mii-channel on ogl-lin-intel: diff
  • milotic-texture on ogl-lin-intel: diff
  • mini-ninjas on ogl-lin-intel: diff
  • mkdd-efb on ogl-lin-intel: diff
  • mkwii-bluebox on ogl-lin-intel: diff
  • monkeyball-fuse on ogl-lin-intel: diff
  • mp3-bloom on ogl-lin-intel: diff
  • mp7-text on ogl-lin-intel: diff
  • mtennis-zfreeze on ogl-lin-intel: diff
  • my-word-coach on ogl-lin-intel: diff
  • nddemo-bumpmapping on ogl-lin-intel: diff
  • nes-vc on ogl-lin-intel: diff
  • nfsu-purplerect on ogl-lin-intel: diff
  • nfsu-reflections on ogl-lin-intel: diff
  • nsmbw-coins on ogl-lin-intel: diff
  • nsmbw-intro on ogl-lin-intel: diff
  • rs2-glass on ogl-lin-intel: diff
  • rs2-skybox on ogl-lin-intel: diff
  • rs2-zfreeze on ogl-lin-intel: diff
  • sadx-ui on ogl-lin-intel: diff
  • sf-assault-flashing on ogl-lin-intel: diff
  • simpsons-tev on ogl-lin-intel: diff
  • smg2-fog on ogl-lin-intel: diff
  • smg-marioeyes on ogl-lin-intel: diff
  • sms-bubbles on ogl-lin-intel: diff
  • sms-gc on ogl-lin-intel: diff
  • soa-black on ogl-lin-intel: diff
  • soniccolors-mm on ogl-lin-intel: diff
  • sonicriderszg-gb on ogl-lin-intel: diff
  • spyro-bloom on ogl-lin-intel: diff
  • ssbm-pointsize on ogl-lin-intel: diff
  • ss-map on ogl-lin-intel: diff
  • ss-timestone on ogl-lin-intel: diff
  • super-sluggers-white-out on ogl-lin-intel: diff
  • sw3-dt on ogl-lin-intel: diff
  • thps3-earlyz on ogl-lin-intel: diff
  • thps4-shadow on ogl-lin-intel: diff
  • tos-invis-char on ogl-lin-intel: diff
  • tsp3-pinkgrass on ogl-lin-intel: diff
  • vegas-party-depth on ogl-lin-intel: diff
  • xenoblade-menu on ogl-lin-intel: diff
  • zelda1-vc on ogl-lin-intel: diff
  • ztp-grass on ogl-lin-intel: diff
  • zww-armos on ogl-lin-intel: diff
  • zww-water on ogl-lin-intel: diff
  • zww-waves on ogl-lin-intel: diff
  • aeon-charge-attack on ogl-lin-mesa: diff
  • chibi-robo-fastdepth on ogl-lin-mesa: diff
  • chibi-robo-zfighting on ogl-lin-mesa: diff
  • custom-brawl-char on ogl-lin-mesa: diff
  • djfny-menu on ogl-lin-mesa: diff
  • DKCR-Char on ogl-lin-mesa: diff
  • ea-vp6 on ogl-lin-mesa: diff
  • fifa-street on ogl-lin-mesa: diff
  • find-mii on ogl-lin-mesa: diff
  • fortune-street on ogl-lin-mesa: diff
  • fortune-street-fog on ogl-lin-mesa: diff
  • fortune-street-white-box on ogl-lin-mesa: diff
  • fsa-layers on ogl-lin-mesa: diff
  • f-zero-rain on ogl-lin-mesa: diff
  • inverted-depth-range on ogl-lin-mesa: diff
  • kirby-shadows on ogl-lin-mesa: diff
  • luigi-shadows on ogl-lin-mesa: diff
  • mario-sluggers-bar on ogl-lin-mesa: diff
  • mario-tennis-menu on ogl-lin-mesa: diff
  • medabots-crash on ogl-lin-mesa: diff
  • megaman-heat on ogl-lin-mesa: diff
  • melee-depth on ogl-lin-mesa: diff
  • melee-lighting on ogl-lin-mesa: diff
  • mii-channel on ogl-lin-mesa: diff
  • milotic-texture on ogl-lin-mesa: diff
  • mini-ninjas on ogl-lin-mesa: diff
  • mkdd-efb on ogl-lin-mesa: diff
  • mkwii-bluebox on ogl-lin-mesa: diff
  • monkeyball-fuse on ogl-lin-mesa: diff
  • mp3-bloom on ogl-lin-mesa: diff
  • mp7-text on ogl-lin-mesa: diff
  • mtennis-zfreeze on ogl-lin-mesa: diff
  • my-word-coach on ogl-lin-mesa: diff
  • nddemo-bumpmapping on ogl-lin-mesa: diff
  • nes-vc on ogl-lin-mesa: diff
  • nfsu-purplerect on ogl-lin-mesa: diff
  • nfsu-reflections on ogl-lin-mesa: diff
  • nsmbw-coins on ogl-lin-mesa: diff
  • nsmbw-intro on ogl-lin-mesa: diff
  • rs2-glass on ogl-lin-mesa: diff
  • rs2-skybox on ogl-lin-mesa: diff
  • rs2-zfreeze on ogl-lin-mesa: diff
  • sadx-ui on ogl-lin-mesa: diff
  • sf-assault-flashing on ogl-lin-mesa: diff
  • simpsons-tev on ogl-lin-mesa: diff
  • smg2-fog on ogl-lin-mesa: diff
  • smg-marioeyes on ogl-lin-mesa: diff
  • sms-bubbles on ogl-lin-mesa: diff
  • sms-gc on ogl-lin-mesa: diff
  • soa-black on ogl-lin-mesa: diff
  • soniccolors-mm on ogl-lin-mesa: diff
  • sonicriderszg-gb on ogl-lin-mesa: diff
  • spyro-bloom on ogl-lin-mesa: diff
  • ssbm-pointsize on ogl-lin-mesa: diff
  • ss-map on ogl-lin-mesa: diff
  • ss-timestone on ogl-lin-mesa: diff
  • super-sluggers-white-out on ogl-lin-mesa: diff
  • sw3-dt on ogl-lin-mesa: diff
  • thps3-earlyz on ogl-lin-mesa: diff
  • thps4-shadow on ogl-lin-mesa: diff
  • tos-invis-char on ogl-lin-mesa: diff
  • tsp3-pinkgrass on ogl-lin-mesa: diff
  • xenoblade-menu on ogl-lin-mesa: diff
  • zelda1-vc on ogl-lin-mesa: diff
  • ztp-grass on ogl-lin-mesa: diff
  • zww-armos on ogl-lin-mesa: diff
  • zww-water on ogl-lin-mesa: diff
  • zww-waves on ogl-lin-mesa: diff
  • aeon-charge-attack on ogl-lin-nv: diff
  • chibi-robo-fastdepth on ogl-lin-nv: diff
  • chibi-robo-zfighting on ogl-lin-nv: diff
  • custom-brawl-char on ogl-lin-nv: diff
  • djfny-menu on ogl-lin-nv: diff
  • DKCR-Char on ogl-lin-nv: diff
  • DKCR-fast-depth on ogl-lin-nv: diff
  • ea-vp6 on ogl-lin-nv: diff
  • ed-updated on ogl-lin-nv: diff
  • fifa-street on ogl-lin-nv: diff
  • find-mii on ogl-lin-nv: diff
  • fortune-street on ogl-lin-nv: diff
  • fortune-street-fog on ogl-lin-nv: diff
  • fortune-street-white-box on ogl-lin-nv: diff
  • fsa-layers on ogl-lin-nv: diff
  • f-zero-rain on ogl-lin-nv: diff
  • goldeneye-depth on ogl-lin-nv: diff
  • inverted-depth-range on ogl-lin-nv: diff
  • kirby-shadows on ogl-lin-nv: diff
  • luigi-shadows on ogl-lin-nv: diff
  • mario-sluggers-bar on ogl-lin-nv: diff
  • mario-tennis-menu on ogl-lin-nv: diff
  • medabots-crash on ogl-lin-nv: diff
  • megaman-heat on ogl-lin-nv: diff
  • melee-depth on ogl-lin-nv: diff
  • melee-lighting on ogl-lin-nv: diff
  • mii-channel on ogl-lin-nv: diff
  • milotic-texture on ogl-lin-nv: diff
  • mini-ninjas on ogl-lin-nv: diff
  • mkdd-efb on ogl-lin-nv: diff
  • mkwii-bluebox on ogl-lin-nv: diff
  • monkeyball-fuse on ogl-lin-nv: diff
  • mp3-bloom on ogl-lin-nv: diff
  • mp7-text on ogl-lin-nv: diff
  • mtennis-zfreeze on ogl-lin-nv: diff
  • my-word-coach on ogl-lin-nv: diff
  • nddemo-bumpmapping on ogl-lin-nv: diff
  • nddemo-lighting on ogl-lin-nv: diff
  • nes-vc on ogl-lin-nv: diff
  • nfsu-purplerect on ogl-lin-nv: diff
  • nfsu-reflections on ogl-lin-nv: diff
  • nsmbw-coins on ogl-lin-nv: diff
  • nsmbw-intro on ogl-lin-nv: diff
  • rs2-glass on ogl-lin-nv: diff
  • rs2-skybox on ogl-lin-nv: diff
  • rs2-zfreeze on ogl-lin-nv: diff
  • sadx-ui on ogl-lin-nv: diff
  • sf-assault-flashing on ogl-lin-nv: diff
  • simpsons-tev on ogl-lin-nv: diff
  • smg2-fog on ogl-lin-nv: diff
  • smg-marioeyes on ogl-lin-nv: diff
  • sms-bubbles on ogl-lin-nv: diff
  • sms-gc on ogl-lin-nv: diff
  • soa-black on ogl-lin-nv: diff
  • soniccolors-mm on ogl-lin-nv: diff
  • sonicriderszg-gb on ogl-lin-nv: diff
  • spyro-bloom on ogl-lin-nv: diff
  • spyro-depth on ogl-lin-nv: diff
  • ssbb-mod-lloyd on ogl-lin-nv: diff
  • ssbm-pointsize on ogl-lin-nv: diff
  • ss-map on ogl-lin-nv: diff
  • ss-timestone on ogl-lin-nv: diff
  • super-sluggers-white-out on ogl-lin-nv: diff
  • sw3-dt on ogl-lin-nv: diff
  • thps3-earlyz on ogl-lin-nv: diff
  • thps4-shadow on ogl-lin-nv: diff
  • tos-invis-char on ogl-lin-nv: diff
  • tsp3-pinkgrass on ogl-lin-nv: diff
  • xenoblade-menu on ogl-lin-nv: diff
  • zelda1-vc on ogl-lin-nv: diff
  • ztp-grass on ogl-lin-nv: diff
  • zww-armos on ogl-lin-nv: diff
  • zww-water on ogl-lin-nv: diff
  • zww-waves on ogl-lin-nv: diff

automated-fifoci-reporter

@Parlane Parlane added the WIP / do not merge Work in progress (do not merge) label May 3, 2016
@bb010g bb010g mentioned this pull request Jul 30, 2017
@Helios747
Contributor

Closed in favor of #5702

@Helios747 Helios747 closed this Jul 30, 2017
@phire phire deleted the ubershaders-vertex branch February 2, 2023 15:10
Labels
WIP / do not merge Work in progress (do not merge)
6 participants