Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Texture lookup fix #9473

Closed
wants to merge 1 commit into from
Closed

Conversation

Miksel12
Copy link
Contributor

@Miksel12 Miksel12 commented Jan 25, 2021

Fixes Texture lookup by applying point sampling.

Old:
This adds an epsilon in the texture sampling stage to prevent rounding errors.

An epsilon of 0.5 gave the same results (when it comes to artifacts)as point sampling without actually point sampling so there is no behavioral change. 0.5 was chosen as that is the lowest value that can always be represented when converting a 24bit signed integer to a float.
The only fifolog that this doesn't fix is Just Dance 2 but the same artifacts can be found when point sampling and this depends on the hardware, so this is probably a rounding error(#9473 (comment), #9473 (comment)).
But Just Dance 2 is improved compared to master, at least on my hardware (Nvidia).

OLD:
This solution was provided by u/fortsnek47 on reddit (https://www.reddit.com/r/EmuDev/comments/l0y7sg/gc_fixing_those_vp6_videos/).
I don't have any games with texture lookup issues so I can't test these changes.

Games that can be tested: https://wiki.dolphin-emu.org/index.php?title=Template:Problems/VP6_Videos

@Miksel12
Copy link
Contributor Author

Ok, that doesn't seem right. Guess I should have copied the code 1:1.

@Miksel12 Miksel12 force-pushed the texture-lookup branch 2 times, most recently from 9de90ed to 6b2e85a Compare January 25, 2021 18:26
@JMC47
Copy link
Contributor

JMC47 commented Jan 25, 2021

@dolphin-emu-bot rebuild

@Miksel12
Copy link
Contributor Author

I have been testing and with the solution from u/fortsnek47 you get very low precision textures, which makes sense as you are throwing away precision. But some experimentation show better results.

@JMC47
Copy link
Contributor

JMC47 commented Jan 25, 2021

seems like fifoci doesn't like it so far, though.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 25, 2021

Really weird, I don't have any issues. u/fortsnek47 also says they has no issues: https://www.reddit.com/r/EmuDev/comments/l0y7sg/gc_fixing_those_vp6_videos/gkqjk5f?utm_source=share&utm_medium=web2x&context=3

To circumvent the loss of precision I tried this: Miksel12@b33207d
Which seemed to work but randomly started showing different behaviour. It all seems a bit iffy.

@Miksel12
Copy link
Contributor Author

Okay, this seems to work now. I tested 2 random vp6 fifo logs which are now perfectly rendered (not sure if that is luck) and there is no loss of precision. 1 << 6 comes from 0.5 * 128.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 25, 2021

I do have some very weird inconsistencies. It seems to come from running master before running a fifo log on this build.
Running fifo log on this build, great result -> Test on master, expected result from master -> Test on this build again, ton of artefacts.

Now that I think about it, it has probably something to do with shader caching. Though I have been able to fix multiple vp6 fifo's during one test run.

Edit: Okay, I figured out that state of bounding box and running dolphin as portable (while having the exact same gfx settings?!) vs not portable seemed to make a difference but after wiping my GFX.ini (which is probably pretty old) it seems that most inconsistencies are gone.

The end result is that (((float2)(tevcoord.xy >> 7) + 0.5f) * 128.0f) seems to work great on the vp6 fifo's I tested. float2(tevcoord.xy + (1 << 6)) still seems to be inconsistend but at least keeps the same texture precision as master. u/fortsnek47 said about this: https://www.reddit.com/r/EmuDev/comments/l0y7sg/gc_fixing_those_vp6_videos/gkqjk5f?utm_source=share&utm_medium=web2x&context=3
Another remark by u/fortsnek47: https://www.reddit.com/r/EmuDev/comments/l0y7sg/gc_fixing_those_vp6_videos/gkqva4w?utm_source=share&utm_medium=web2x&context=3

All in all it seems like this could be the right direction. I just don't have the knowledge to do much debugging.

@JMC47
Copy link
Contributor

JMC47 commented Jan 25, 2021

Okay, I know what's wrong. You're having shaders carry between builds I bet. And that's causing your inconsistent results. Changing the bounding box setting forces new shaders to be generated.

@mbc07
Copy link
Contributor

mbc07 commented Jan 25, 2021

Didn't we have a shader cache version somewhere in the code? Bumping the version should force Dolphin to regenerate the shader cache...

@Miksel12
Copy link
Contributor Author

Yeah I was looking for that but could find it.

@endrift
Copy link
Contributor

endrift commented Jan 25, 2021

Zero percent surprised it's this broken, since the reddit version seems to be just substituting the epsilon for a very large value instead. Not sure the author realized that this "recentering" was applied after normalization instead of before.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 25, 2021

Zero percent surprised it's this broken, since the reddit version seems to be just substituting the epsilon for a very large value instead. Not sure the author realized that this "recentering" was applied after normalization instead of before.

Well, the current implementation is my way to prevent precision loss but is wrong. ((float2(tevcoord.xy >> 7) + 0.5f) * 128.0f) is correct for point-sampled textures, meaning that the filtering should be different for different sample type. You can view the discussion on the reddit post. That is way too complex for me so I think I'll be closing this PR eventually. Though a FifoCI test run with the 'correct' point-sample might be interesting to see what it does to the vp6 fifologs.

@Miksel12 Miksel12 force-pushed the texture-lookup branch 2 times, most recently from d2f4f00 to 6217de8 Compare January 25, 2021 23:48
@Miksel12
Copy link
Contributor Author

FifoCI should be able to run now, I was using C-style casts by accident.

@mbc07
Copy link
Contributor

mbc07 commented Jan 26, 2021

@Miksel12 I think that's the line that controls shader cache version. Bumping it should invalidate existing caches...

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 26, 2021

Ah yes, all inconsistencies are gone now. float2(tevcoord.xy + (1 << 6)) clearly breaks vp6 video's, so it seemed to work sometime because it was using the cache which used ((float2(tevcoord.xy >> 7) + 0.5f) * 128.0f) in those cases.

@mbc07 could you test your Just Dance games? Those clearly showed an epsilon doesn't work so I'm interested to see if this works. You should ignore all the low res texture for now and maybe run portable as this will invalidate the shader cache (though I assume only for games you boot in this build).

@mbc07
Copy link
Contributor

mbc07 commented Jan 26, 2021

Could someone trigger the build bot? I can't compile this PR manually at the moment...

@leoetlino leoetlino marked this pull request as draft January 26, 2021 20:29
@JMC47
Copy link
Contributor

JMC47 commented Jan 26, 2021

The current version of the PR does seem to fix things in the cases I know about.

@Miksel12
Copy link
Contributor Author

I also tested some texture seams fifo's I found on the issue tracker and it seems to fix those.

@mbc07
Copy link
Contributor

mbc07 commented Jan 26, 2021

The affected Just Dance games seems to behave exactly the same as in PR #9407: completely fixed on Intel, improved/worsened on NVIDIA, depending of the game. If it helps, I recorded FIFO Logs of the same tests I've done in the other PR:

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 26, 2021

Hmm, that is a shame. Though that could be floating point discrepency on Nvidia's side, the fact that Intel doesn't have the issue and AMD didn't seem to have any problems at all shows that there is a difference in implementation.

u/fortsnek47 did mention that HLSL and GLSL can use integer coordinates for texture lookups: https://www.reddit.com/r/EmuDev/comments/l0y7sg/gc_fixing_those_vp6_videos/gkqva4w?utm_source=share&utm_medium=web2x&context=3
This would circumvent any rounding issues.
Either way, a point-sample and non point sample code path is needed and I have no idea when a texture needs one or the other.

u/fortsnek47 also shared a universal hack/epsilon in the form of: SampleTexture(out, "float2(tevcoord.xy | int2(1, 1))", texswap, stage.tevorders_texmap, stereo, api_type);. This gives the same result for the Just Dance fifologs as this PR but did render my only vp6 fifolog I have correct (so more testing would be needed).

@JMC47
Copy link
Contributor

JMC47 commented Jan 26, 2021

Even if there is a performance penalty, I'd love to see an integer texture lookup option to see if it fixes everything.

@Miksel12
Copy link
Contributor Author

Even if there is a performance penalty, I'd love to see an integer texture lookup option to see if it fixes everything.

Yeah, that would be awesome to see.

@Miksel12
Copy link
Contributor Author

Point sampling will now be correctly applied. No epsilon in sight :D
Be sure to wipe caches if you tested a previous build. Haven't done ubershaders.

@JMC47
Copy link
Contributor

JMC47 commented Jan 31, 2021

@dolphin-emu-bot rebuild

@dolphin-emu-bot
Copy link
Contributor

FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:

  • aeon-charge-attack on ogl-lin-radeon: diff
  • burnout2-vehicletextures on ogl-lin-radeon: diff
  • ea-vp6 on ogl-lin-radeon: diff
  • et-vid on ogl-lin-radeon: diff
  • mario-sluggers-bar on ogl-lin-radeon: diff
  • medabots-crash on ogl-lin-radeon: diff
  • megaman-heat on ogl-lin-radeon: diff
  • metroid-visor on ogl-lin-radeon: diff
  • mini-ninjas on ogl-lin-radeon: diff
  • mkw-bridge on ogl-lin-radeon: diff
  • mkwii-bluebox on ogl-lin-radeon: diff
  • mtennis-zfreeze on ogl-lin-radeon: diff
  • pw-black-bars on ogl-lin-radeon: diff
  • sf-assault-flashing on ogl-lin-radeon: diff
  • simpsons-game on ogl-lin-radeon: diff
  • soa-black on ogl-lin-radeon: diff

automated-fifoci-reporter

@JMC47
Copy link
Contributor

JMC47 commented Jan 31, 2021

https://fifo.ci/compare/5887239-5873434/ looks weird to me? Is this correct?

@Miksel12
Copy link
Contributor Author

Looks correct, clearly point sampled vs bilinearly sampled.

@JMC47
Copy link
Contributor

JMC47 commented Jan 31, 2021

So there may be an issue with the detection? Extrems posted a hardware test that has a modified pixel center and tests bilinear sampling. Dolphin Master fails as expected, and old versions of this Pull Request also fail. The new version of the Pull Request succeeds... however it's supposed to be bilinearly filtering. This would suggest our detection is wrong. Note that this uses a modified pixel center to fix a bug on hardware so that this renders correctly. I don't know if that mucks up with detection. Software renderer fails spectacularly on this test, suggesting something is seriously wrong with Dolphin.

Edit: The modifications are - 1/24 Pixel Center and 1/64 Texel Center were used.

The Hardware Test

https://files.extremscorner.org/gamecube/apps/bilineartest.zip

Current version of this PR

image

Master

image

@iwubcode
Copy link
Contributor

iwubcode commented Jan 31, 2021

I tested a number of issues I've seen and none of them were unfortunately fixed :(. A few were admittedly higher resolution issues but some were at native too (ex: 10401).

One downside to the current solution is upscaling to 6x for Spiderman Shattered Dimensions now looks really bad. You used to get this smooth outline, now it is all pixely. I think that was captured by fortsnek47's comment in the reddit post and also may be due to JMC's comment above.

I don't know if we need to be modifying this offset value based on resolution (not sure if that makes sense) but wanted to report my findings. This is on an AMD card fwiw

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 31, 2021

@JMC47 Is the correct rendering like master but with all sides covered by red? Did you test on console?

@iwubcode I tested that fifolog in a previous attempt which fixed it at native (though not at higher resolutions), can you share some of your other test cases? And yes, some games/scenes will probably be more pixelated.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 31, 2021

I think the main problem now comes from detection. I pretty sure all the info we need is contained here:

// BPMEM_TX_SETMODE0 - (Texture lookup and filtering mode) LOD/BIAS Clamp, MaxAnsio, LODBIAS,
// DiagLoad, Min Filter, Mag Filter, Wrap T, S
// BPMEM_TX_SETMODE1 - (LOD Stuff) - Max LOD, Min LOD

Which means we need to determine if a texture should be point sampled based on these values:

struct
{
u32 wrap_s : 2;
u32 wrap_t : 2;
u32 mag_filter : 1;
u32 min_filter : 3;
u32 diag_lod : 1;
s32 lod_bias : 8;
u32 pad0 : 2;
u32 max_aniso : 2;
u32 lod_clamp : 1;
};
u32 hex;
};
union TexMode1
{
struct
{
u32 min_lod : 8;
u32 max_lod : 8;
};

At the moment, I am using this function:

constexpr bool IsBpTexMode0PointFiltering(const T& tm0)
{
return tm0.min_filter < 4 && !tm0.mag_filter;
}

I think it should be doable to hardware test this. Or just by viewing what for values Extrems test case have to see what for parameters we can set.

Edit: Also, I'm currently treating indirect texcoords and tev texcoords exactly the same but I'm not sure if that is correct.

Edit2: The definitions of the BPMEM_TX_SETMODE0 can be found on yagcd: https://www.gc-forever.com/yagcd/chap5.html#sec5.11.1

@JMC47
Copy link
Contributor

JMC47 commented Jan 31, 2021

The correct rendering is like the Pull Request. However, the hardware test is bilinearly sampled according to extrems.

@Miksel12
Copy link
Contributor Author

Euh, so if you would run this on a Wii it would look like this PR or not?

@JMC47
Copy link
Contributor

JMC47 commented Jan 31, 2021

It would render like this PR, but supposedly linearly sampled?

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 31, 2021

I changed some logic and it now gets linearly sampled, it doesn't look as bad as master which I can't really explain as the non point sample path has remained te same. The funny thing is that the picture changes over time. It starts looking like it does in this PR and slowly changes to look more like master. Seems like constant rounding errors.

Some changes even made the screen completely red after some time.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Jan 31, 2021

I tested a number of issues I've seen and none of them were unfortunately fixed :(. A few were admittedly higher resolution issues but some were at native too (ex: 10401).

Something weird is going on, I tested the fifolog and got the same result. I later plugged it into RenderDoc and found that it was working correctly. Vulkan also seemed to work, after rebuilding all the backend were correct except for Vulkan. I have no idea what this could be but something seems to persist.

I found that building the first caches with OGL/D3D makes it incorrect on those and correct on Vulkan and vice versa... No idea

@JMC47
Copy link
Contributor

JMC47 commented Feb 1, 2021

Any time there's something not changed, I'm guessing it's due to shaders getting set up.

@Miksel12 Miksel12 marked this pull request as ready for review February 1, 2021 00:38
@Miksel12 Miksel12 marked this pull request as draft February 1, 2021 00:39
@Miksel12
Copy link
Contributor Author

Miksel12 commented Feb 1, 2021

I did some testing to see if the detection parameters are correct by comparing to console and it seems to be correct, documentation also seems to agree. Sadly, I'm now stuck on inconsistent behaviour (again) with backends acting different from each other.

@JMC47
Copy link
Contributor

JMC47 commented Feb 2, 2021

Did you push the changes that seemingly make the backends inconsistent so I can mess with it? Maybe we can sort out why.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Feb 2, 2021

The changes that cause inconsistency have been part of this PR for 2 days, so the tests you did were with those changes.
For example, you showed that Extrems test was wrong on this build. I get the same result as you did but when I run portable I get the right result. I thought incrementing GX_PIPELINE_UID_VERSION would fix any shader cache issues so maybe another VERSION I should increment?

I used this fifolog for testing: https://bugs.dolphin-emu.org/issues/10401
I ran the fifolog through RenderDoc and found that the missile icon should be point sampled, the texture has min_filter = 0 and mag_filter = 0. The detection code istm0.min_filter < 4 && !tm0.mag_filter; so it should catch it. But when I run it in portable mode it doesn't point sample it but it does when run not in portable mode...

Taking portable as baseline as it has no relation to previous builds/settings, it should handle both cases but does in one case but not the other?

Though the inconsistencies between API don't seem to be in this build.

@PatrickFerry
Copy link
Contributor

PatrickFerry commented Feb 2, 2021

The caches will only get invalidated and rebuilt once. Then were invalidated and then recreated with version 3. You made further changes in this PR but the caches won't get invalidated automatically because they are still version 3. So only yourself and the people testing this are going to see that issue.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Feb 2, 2021

I actually tested in multiple portable builds so this wasn't the issue for me but JMC did find a weird mismatch between shader and uidcache.

@mbc07
Copy link
Contributor

mbc07 commented Feb 2, 2021

I know that at least the NVIDIA drivers for Windows has their own shader caches. Maybe it's related to that?

(they're located at %LocalAppData%\NVIDIA\GLCache and %LocalAppData%\NVIDIA Corporation\NV_Cache)

@fortsnek9348
Copy link

Shader caching problems? It seems you can disable caching by setting ShaderCache = False in GFX.ini [Settings].

Still got shader problems? Well, that direct reference of bpmem inside shader codegen might have something to do with it. I've got my build running with shader invalidation at BPMEM_TX_SETMODE0(_4), and the point filtering info is put inside some spare bits in PixelShaderUid. Also, as the texcoord correction is only applied to point-filtered textures, I use the bitwise-or method, assuming it's faster

@JMC47
Copy link
Contributor

JMC47 commented Feb 2, 2021

The problem in the PR currently is that the saved UIDs are broken. When we wipe the UIDs, it's fine.

If someone can fix that, that'd probably help. We're in a bit over our head.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Feb 2, 2021

Shader caching problems? It seems you can disable caching by setting ShaderCache = False in GFX.ini [Settings].

Still got shader problems? Well, that direct reference of bpmem inside shader codegen might have something to do with it. I've got my build running with shader invalidation at BPMEM_TX_SETMODE0(_4), and the point filtering info is put inside some spare bits in PixelShaderUid.

I tried that but quickly realised that I just don't have enough knowledge to do something like that.

@Miksel12
Copy link
Contributor Author

Miksel12 commented Feb 4, 2021

Implementing this correctly is clearly over my head. I think someone with more knowledge and experience should be able to get the gist of it and make a correct implementation by looking at this PR.
If no one implements a correct implementation in the future, I think I'll PR an epsilon solution which was also used in a previous revision in this PR which had no known regression.

@Miksel12 Miksel12 closed this Feb 4, 2021
@Miksel12 Miksel12 mentioned this pull request Feb 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
8 participants