New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix out of bounds tex coord behavior; always apply fb_addprev and tex coord wrapping #9651
Conversation
|
The difference in Fortune Street is/should be correct; the one part of the water references a nonexistant tex coord on stage 1. I'm not entirely sure why there aren't any differences for it on OGL. Note that it only has one tex coord in the vertex spec (coord 0), with coords 1 and 2 coming from the normals.
For object 644 (commands are in 643 at the moment):
000d1049: BP register BPMEM_TREF number 0 Stage 0 texmap: 0 Stage 0 tex coord: 0 Stage 0 enable texmap: Yes Stage 0 color channel: Color chan 0 (0) Stage 1 texmap: 3 Stage 1 tex coord: 3 Stage 1 enable texmap: Yes Stage 1 color channel: Color chan 1 (1) 000d11f9: BP register BPMEM_GENMODE Num tex gens: 3 Num color channels: 2 Unused bit: 0 Flat shading (unconfirmed): No Multisampling: No Num TEV stages: 5 Cull mode: Back-facing primitives only (1) Num indirect stages: 0 ZFreeze: No I also implemented the 0 tex coords making things black case. Lastly, I hardware tested it, and |
|
I did one more hardware test and confirmed that in addition to |
| out.Write("\tint2 fixpoint_uv{} = int2(", i); | ||
| out.Write("(tex{}.z == 0.0 ? tex{}.xy : tex{}.xy / tex{}.z)", i, i, i, i); | ||
| out.Write(" * " I_TEXDIMS "[{}].zw);\n", i); | ||
| // TODO: S24 overflows here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the story here? Is this indicating a concern or is this something that needs future investigation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same comment exists in specialized shaders:
dolphin/Source/Core/VideoCommon/PixelShaderGen.cpp
Lines 786 to 792 in 4d37dad
| for (u32 i = 0; i < uid_data->genMode_numtexgens; ++i) | |
| { | |
| out.Write("\tint2 fixpoint_uv{} = int2(", i); | |
| out.Write("(tex{}.z == 0.0 ? tex{}.xy : tex{}.xy / tex{}.z)", i, i, i, i); | |
| out.Write(" * " I_TEXDIMS "[{}].zw);\n", i); | |
| // TODO: S24 overflows here? | |
| } |
Ubershaders already had (basically) the same code, just in a different place and without the comment.
The software renderer doesn't seem to worry about it at all:
dolphin/Source/Core/VideoBackends/Software/Rasterizer.cpp
Lines 228 to 238 in 4d37dad
| // tex coords | |
| for (unsigned int i = 0; i < bpmem.genMode.numtexgens; i++) | |
| { | |
| float projection = invW; | |
| float q = TexSlopes[i][2].GetValue(dx, dy) * invW; | |
| if (q != 0.0f) | |
| projection = invW / q; | |
| pixel.Uv[i][0] = TexSlopes[i][0].GetValue(dx, dy) * projection; | |
| pixel.Uv[i][1] = TexSlopes[i][1].GetValue(dx, dy) * projection; | |
| } |
dolphin/Source/Core/VideoBackends/Software/Rasterizer.cpp
Lines 117 to 123 in 4d37dad
| // tex coords | |
| for (unsigned int i = 0; i < bpmem.genMode.numtexgens; i++) | |
| { | |
| // multiply by 128 because TEV stores UVs as s17.7 | |
| tev.Uv[i].s = (s32)(pixel.Uv[i][0] * 128); | |
| tev.Uv[i].t = (s32)(pixel.Uv[i][1] * 128); | |
| } |
It's something that might need further investigation, given the inconsistent results I got with my first attempt at hardware testing where I ended up overflowing really hard. The zww-water test case is also incorrect in the software renderer so it might be wrong. But those are something to come back to in the future.
1d44854
to
7a93147
Compare
c2e2100
to
c37b5fe
Compare
|
The last two commits change the pixelshader/ubershader to always run the indirect stage logic. Was that already present in the software renderer or possibly not applicable here? |
|
Yes, the software renderer always ran it (in dolphin/Source/Core/VideoCommon/PixelShaderManager.cpp Lines 161 to 162 in ac679eb
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through and played a few games with this pull request. I made sure nothing was really broken. As was said on IRC, we have pretty good coverage on indirect textures so there's not really a high chance anything will break. If it does, we could fix it.
YAGCD uses BI0/BC0/BI1/BC1/BI2/BC2/BI3/BC3, so I'm pretty sure the BI2/BC3/BI4/BC4 names are a typo that just was propagated.
It became unused in f039149.
Previously we set the texture coordinate to zero, now we set the texture coordinate *index* to zero. This fixes the ripple effect of the Mario painting in Luigi's Mansion. Co-authored-by: Pokechu22 <Pokechu022@gmail.com>
Previously we set the texture coordinate to zero, now we set the texture coordinate *index* to zero. This fixes the ripple effect of the Mario painting in Luigi's Mansion.
This change should have no behavioral differences itself, but allows for changing the behavior of out of bounds tex coord indices more easily in the next commit. Without this change, returning tex0 for out of bounds cases and then applying the fixed-point logic would use the wrong tex dimension info (tex0 with I_TEXDIMS[1] or such), which is inaccurate.
Previously we set the texture coordinate to zero, now we set the texture coordinate *index* to zero. This fixes the ripple effect of the Mario painting in Luigi's Mansion.
Hardware testing has confirmed that fb_addprev and wrapping both run even when the indirect stage is disabled.
Hardware testing has confirmed that fb_addprev and wrapping both run even when the indirect stage is disabled.
|
FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:
automated-fifoci-reporter |
|
Finally verified how Mario looks in the final cutscene. It looks identical to console on a zoom in. |




Fixes the Mario portrait blur effect in Luigi's Mansion (issue 11462). Closes #8296 (the older implementation).
When a tex coord is greater than or equal to the number of tex gens, the behavior is undefined on console, but it generally appears to be equivalent to using tex coord 0 (instead of clamping it, or setting the value to (0, 0), or just using whatever garbage is already around). This applies to both direct and indirect textures. There does seem to be some unpredictable garbage data that also can be used, but I don't know the details of it and that doesn't seem to be an issue with the Mario portrait.
There is an exception to this: when 0 tex gens are enabled, I always got a black image on console;
I didn't implement this in Dolphin because I couldn't think of a clean way of doing it (however, ubershaders seem to already do it)this has been implemented as well. I don't think this is something that games would do by accident since it's easy to notice.For the Mario portrait specifically, the blur effect is achieved with indirect textures; they have a 128 by 128 texture (generated from an EFB (so at higher IRs, it will be a higher resolution, but the scale factor is still 128 by 128)) that shows Mario, and another 64 by 64 texture that contains random greyscale noise. They also have separate texture coordinates for both of them (the Mario texture uses texture coordinate 0, with values ranging horizontally from .1 to .9 and vertically from 0 to 1, while the noise texture uses texture coordinate 1, with both axes ranging from 0 to 1). Since the texture coordinates get multiplied by the texture scale (more or less; it's actually a separate parameter the game controls), texture coordinate 0 more strictly ranges from 12.8 to 115.2 on one axis and 0 to 128 on the other, and texture coordinate 1 ranges from 0 to 64 on both axes.
The only problem is that Nintendo forgot to set the number of tex gens to 2, so only texture coordinate 0 is valid, triggering this undefined behavior. For the portrait, it (apparently) always renders using texture coordinate 0, and since the only purpose of the noise texture is to distort the image, that it was using the wrong texture coordinate presumably wasn't noticed. The only difference this causes is that the noise is more fine-grained than it would have been, since texture coordinate 0's scale is twice that of texture coordinate 1, causing the texture to be repeated twice in each direction, but that (at least in my opinion) looks better than what it would look like with 2 tex gens. (To be clear, with 2 tex gens, it would render correctly without this PR, and this PR does not change how it renders with 2 tex gens.)
Screenshots:
Broken portrait effect
Fixed portrait effect
"Intended" portrait effect (by hex-editing the FIFO to use 2 tex gens)
Here is my hardware test, and here are some fragmentary, partially incorrect notes.