Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VideoCommon: Manually handle texture wrapping and sampling #9956

Merged
merged 15 commits into from Nov 18, 2021

Conversation

Pokechu22
Copy link
Contributor

@Pokechu22 Pokechu22 commented Jul 25, 2021

This PR reimplements texture sampling logic in the pixel shaders, rather than using the video backend's built-in texture sampling functionality. This fixes a number of bugs that occur in various drivers (such as the EA VP6 issue), and also allows implementing a few additional features that were not handled before (diagonal LOD and the jank wrap logic used by actual hardware and implemented for the software renderer in #9921). Currently, anisotropic filtering is not implemented, but that can be handled in a later PR.

@Pokechu22 Pokechu22 force-pushed the non-power-of-2-wrap-2 branch 3 times, most recently from 51d60a2 to 4a84138 Compare July 25, 2021 06:44
@JMC47
Copy link
Contributor

JMC47 commented Jul 25, 2021

This does fix the EA VP6 issues on my NVIDIA.

Fixes an issue where there's a one pixel line in the middle of certain Wii EA sports games as well, but due to the tmem regressions they still aren't playable.

@Pokechu22 Pokechu22 force-pushed the non-power-of-2-wrap-2 branch 10 times, most recently from a3ae608 to 8e7afdf Compare July 30, 2021 01:55
@Pokechu22 Pokechu22 force-pushed the non-power-of-2-wrap-2 branch 2 times, most recently from d83c704 to d802a00 Compare August 23, 2021 04:48
@Pokechu22 Pokechu22 marked this pull request as ready for review August 23, 2021 04:49
@iwubcode
Copy link
Contributor

@Pokechu22 - just to confirm, is this ready to go?

@Pokechu22
Copy link
Contributor Author

It's ready to be reviewed, yes. There still are some particularly bad performance decreases with mesa that I haven't investigated (fifoci takes ~45 minutes instead of a more normal 10-20 minutes for ogl-lin-mesa (compare this to ogl-lin-radeon, which went from ~4 minutes to ~7 minutes)), which may require the fifoci configuration to be changed. But apart from that, all features are implemented and the code has been reorganized in a way that shouldn't be too bad to review.

@iwubcode
Copy link
Contributor

iwubcode commented Aug 25, 2021

I did some preliminary testing to see if the issues I've seen on a variety of games were resolved. Unfortunately not.

That being said, I didn't notice any major regressions either. I'm curious what the incompatibility is with 'Anisotropic Filtering'? I had it on and off with both the GPU sampling and the manual sampling and didn't notice any issues.

I'll do some more testing tomorrow, as well as start reviewing the code in more depth.

@Pokechu22
Copy link
Contributor Author

Pokechu22 commented Aug 25, 2021

I looked through all of the fifoci changes above. Only a few had actually interesting differences that could be checked on hardware (though it's still hard to see due to analog signal stuff and the hardware fifoplayer having some vertex explosion issues).

TestNewHardwareOld
Melty Motlen Galaxy00000000_2021-08-24_21-05-15MMGCrop00000000_2021-08-24_21-28-23
Baby Park (diagonal LOD)00000000_2021-08-24_21-11-38_BabyPark00000000_2021-08-24_21-28-55
Wario's Gold Mine (diagonal LOD)00000000_2021-08-24_21-13-55_mkwii_bluebox00000000_2021-08-24_21-29-22
Need for Speed00000000_2021-08-24_21-16-19_nfsu00000000_2021-08-24_21-29-34
RS200000000_2021-08-24_21-18-26_RS2_frame100000000_2021-08-24_21-29-41
Galaxy 2 clouds00000000_2021-08-24_21-22-10_marioeyes00000000_2021-08-24_21-29-48
Tony Hawk00000000_2021-08-24_21-26-00_THPS00000000_2021-08-24_21-30-06

Most things seem to be improvements. However, RS2 has a red color on the tube (which is very hard to see), old / new. Need for speed underground has some new blue dots in the headlights which are odd. One triangle in Tony Hawk changed in brightness (but I can't tell what's right). Also the shadows in Tony Hawk (both 3 and 4) flicker now which seems like a bigger problem - this happens even when looping frame 0 only.


I'm curious what the incompatibility is with 'Anisotropic Filtering'? I had it on and off with both the GPU sampling and the manual sampling and didn't notice any issues.

Anisotropic filtering is a form of texture filtering (related to mipmapping). It applies when performing texture sampling. I haven't implemented it for manual texture sampling, so it basically acts as if the value is set to 1x.


enum class AddressMode : StorageType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you ultimately made the right choice here but I was planning on taking advantage of the fact that we had separate enum types for our VideoCommon RenderState and add a 'Border' property. I suppose now I'll have to add it to BPMemory and document that it's not a real value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably could add a separate field for it into SamplerState; it's got space to spare since it uses 32-bit values while BP is limited to 24-bit values (though a few extra bits are already used for a higher maximum bias value for instance).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking some more about this, from a generic high level, I do wish we hadn't bled the emulation state into our somewhat generic graphics api. Obviously there is some overlap and this is relatively minor but I'm not sure this was necessary. My assumption is the point of Generate was to make that conversion from emulation to graphics backend. However, not suggesting you revert or modify this, just something I think we should avoid in the future.

To your point, adding a separate field might work but would make for a very klunky solution (if both are set, which do I choose?). For this particular use case, wouldn't we have enough bits to add a new enum flag? Again, no big deal, I will likely drop the feature (it's not critical) but thought it'd be worth discussing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an unused value in the enum (0, 1, and 2 correspond to valid values, while 3 is invalid), but hardware testing indicated that its behavior matches clamp (0).

Leaving the separate AddressMode enum would work, but then there would be an enum value that's not implemented in the actual shaders (since it doesn't exist on console), which feels weird to me.

@iwubcode
Copy link
Contributor

Just did a rough pass. This looks really good @Pokechu22 ! I'll take another pass later in the week if someone else doesn't approve this in the meantime.

This was added in 0b9a72a but became irrelevant in 70f9fc4 as the check is now self-explanatory due to a rejiggering of the bitfields.
The benefit to exposing this over the raw BP state is that adjustments Dolphin makes, such as LOD biases from arbitrary mipmap detection, will work properly.
This produces behavior matching the behavior on hardware (see Wario's Gold Mine in Mario Kart Wii).
Note that both GLSL and HLSL provide a fwidth (fragment width) function defined as `fwidth(p) = abs(dFdx(p)) + abs(dFdy(p))`.  However, it's easy enough to implement this ourselves (and it makes the code a bit more obvious).
Copy link
Contributor

@iwubcode iwubcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to the emulation logic directly but the code LGTM.

@@ -266,6 +269,21 @@ void AdvancedWidget::AddDescriptions()
"<br><br>May improve performance in some games which rely on CPU EFB Access at the cost "
"of stability.<br><br><dolphin_emphasis>If unsure, leave this "
"unchecked.</dolphin_emphasis>");
static const char TR_MANUAL_TEXTURE_SAMPLING_DESCRIPTION[] = QT_TR_NOOP(
"Use a manual implementation of texture sampling instead of the graphics backend's built-in "
"functionality.<br><br>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users aren't that good at reading lots of things, so having such a long description might not be a good idea. I suggest trying to break it down a bit.

"This setting can fix graphical issues in some games on certain GPUs, most commonly vertical lines on FMVs. In addition to this, enabling manual texture sampling will allow for correct emulation of texture wrapping (at 1x IR or when scaled EFB is disabled) and better emulates Level of Detail calculation.

This comes at the cost of potentially worse performance, especially at higher internal resolutions, and Anisotropic Filtering is not compatible yet.

If unsure, leave this unchecked."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't have much to say about the new text but I'd make the last line:

<dolphin_emphasis>If unsure, leave this unchecked.</dolphin_emphasis>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworded this based on your suggestion. Let me know how the new wording looks.

Pokechu22 and others added 3 commits November 17, 2021 21:27
Co-authored-by: JosJuice <josjuice@gmail.com>
Specifically, when using Manual Texture Sampling, if textures sizes don't match the size the game specifies, things previously broke.  That can happen with custom textures, and also with scaled EFB copies at non-native IRs.  It breaks most obviously by not scaling the texture coordinates (so only part of the texture shows up), but the hardware wrapping functionality also assumes texture sizes are a power of 2 (or else it will behave weirdly in a way that matches how hardware behaves weirdly).  The fix is to provide alternative texture wrapping logic when custom texture sizes are possible.
This commit changes the default value of Fast Texture Sampling to true, and also moves the setting that controls it to the experimental section of the advanced tab.  This is its own commit so that it can be easily reverted when we want to default to Manual Texture Sampling.

Co-authored-by: JosJuice <josjuice@gmail.com>
@dolphin-emu-bot
Copy link
Contributor

FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:

  • aeon-charge-attack on sw-lin-mesa: diff
  • chibi-robo-fastdepth on sw-lin-mesa: diff
  • custom-brawl-char on sw-lin-mesa: diff
  • DKCR-Char on sw-lin-mesa: diff
  • DKCR-fast-depth on sw-lin-mesa: diff
  • fishing-resort-map on sw-lin-mesa: diff
  • f-zero-rain on sw-lin-mesa: diff
  • jj-awae-mirrored on sw-lin-mesa: diff
  • kirby-shadows on sw-lin-mesa: diff
  • lego-star-wars-crane-shadow on sw-lin-mesa: diff
  • mario-baseball-shadows on sw-lin-mesa: diff
  • mario-sluggers-bar on sw-lin-mesa: diff
  • mario-tennis-menu on sw-lin-mesa: diff
  • MaS-LOG-wiimote on sw-lin-mesa: diff
  • megaman-heat on sw-lin-mesa: diff
  • melee-depth on sw-lin-mesa: diff
  • metroid-visor on sw-lin-mesa: diff
  • mkdd-babypark on sw-lin-mesa: diff
  • mkdd-efb on sw-lin-mesa: diff
  • mkwii-bluebox on sw-lin-mesa: diff
  • monkeyball-fuse on sw-lin-mesa: diff
  • mp3-bloom on sw-lin-mesa: diff
  • mtennis-zfreeze on sw-lin-mesa: diff
  • nfsu-purplerect on sw-lin-mesa: diff
  • rs2-bumpmapping on sw-lin-mesa: diff
  • rs2-glass on sw-lin-mesa: diff
  • rs2-skybox on sw-lin-mesa: diff
  • rs2-zfreeze on sw-lin-mesa: diff
  • rs3-bumpmapping on sw-lin-mesa: diff
  • rs3-skybox2 on sw-lin-mesa: diff
  • sfa-shadows on sw-lin-mesa: diff
  • simpsons-game on sw-lin-mesa: diff
  • smb-mirror on sw-lin-mesa: diff
  • smg2-fog on sw-lin-mesa: diff
  • smg-marioeyes on sw-lin-mesa: diff
  • smg-mmg on sw-lin-mesa: diff
  • sms-bubbles on sw-lin-mesa: diff
  • sms-gc on sw-lin-mesa: diff
  • sms-water on sw-lin-mesa: diff
  • sonic-riders-blur on sw-lin-mesa: diff
  • spyro-bloom on sw-lin-mesa: diff
  • spyro-depth on sw-lin-mesa: diff
  • ssbb-mod-lloyd on sw-lin-mesa: diff
  • super-sluggers-white-out on sw-lin-mesa: diff
  • sw3-dt on sw-lin-mesa: diff
  • tp-skin on sw-lin-mesa: diff
  • xenoblade-menu on sw-lin-mesa: diff
  • ztp-grass on sw-lin-mesa: diff
  • zww-armos on sw-lin-mesa: diff
  • aeon-charge-attack on ogl-lin-radeon: diff
  • burnout2-vehicletextures on ogl-lin-radeon: diff
  • et-vid on ogl-lin-radeon: diff
  • jd2-fmv on ogl-lin-radeon: diff
  • metroid-visor on ogl-lin-radeon: diff
  • mp2-scanner on ogl-lin-radeon: diff
  • ss-map on ogl-lin-radeon: diff
  • aeon-charge-attack on uberogl-lin-radeon: diff
  • burnout2-vehicletextures on uberogl-lin-radeon: diff
  • et-vid on uberogl-lin-radeon: diff
  • jd2-fmv on uberogl-lin-radeon: diff
  • metroid-visor on uberogl-lin-radeon: diff
  • mp2-scanner on uberogl-lin-radeon: diff
  • ss-map on uberogl-lin-radeon: diff

automated-fifoci-reporter

@JMC47
Copy link
Contributor

JMC47 commented Nov 18, 2021

Tested and confirmed to work on Android. I guess it's just about time.

@JMC47 JMC47 merged commit 6f4bbac into dolphin-emu:master Nov 18, 2021
10 checks passed
@CrusadingNinja
Copy link

CrusadingNinja commented Nov 19, 2021

With Manual Texture Sampling enabled, Dolphin seems to have significantly longer boot times when booting any game on my GTX 1070 when using D3D11 or D3D12. OpenGL and Vulkan are not affected.

@JMC47
Copy link
Contributor

JMC47 commented Nov 19, 2021

This is probably because it needs to compile shaders.

@CrusadingNinja
Copy link

This is probably because it needs to compile shaders.

Same behavior even with shader pre compilation disabled.

@JMC47
Copy link
Contributor

JMC47 commented Nov 20, 2021

I have a GTX 1070 as well and cannot reproduce the issue.

@mbc07
Copy link
Contributor

mbc07 commented Nov 24, 2021

Although this PR didn't completely fix the vertical/horizontal lines issue on Just Dance 2 (they're less noticeable, though), it did eliminate all lines on Just Dance 1, 3 and derivates (Just Dance Wii 2, Just Dance Greatest Hits, etc.) when using a NVIDIA GPU.

My question is, given that enabling "Manual Texture Sampling" has a minor impact on performance and that it's only needed for specific GPU drivers (mainly NVIDIA), should this setting be enabled by default on the Game INI of the affected games?

(asking because I was about to push a small batch of Game INI updates and earlier Just Dance games are among that batch)

@Pokechu22
Copy link
Contributor Author

I plan on doing a larger batch of GameINI updates for all Bink Video and EA VP6 games, but it might be a while before I get around to it. Those early Just Dance games do use Bink Video, but if you already plan on updating their GameINIs, it would make sense to enable it for them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet