Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix out of bounds tex coord behavior; always apply fb_addprev and tex coord wrapping #9651

Merged
merged 10 commits into from May 9, 2021

Conversation

Pokechu22
Copy link
Contributor

@Pokechu22 Pokechu22 commented Apr 18, 2021

Fixes the Mario portrait blur effect in Luigi's Mansion (issue 11462). Closes #8296 (the older implementation).

When a tex coord is greater than or equal to the number of tex gens, the behavior is undefined on console, but it generally appears to be equivalent to using tex coord 0 (instead of clamping it, or setting the value to (0, 0), or just using whatever garbage is already around). This applies to both direct and indirect textures. There does seem to be some unpredictable garbage data that also can be used, but I don't know the details of it and that doesn't seem to be an issue with the Mario portrait.

There is an exception to this: when 0 tex gens are enabled, I always got a black image on console; I didn't implement this in Dolphin because I couldn't think of a clean way of doing it (however, ubershaders seem to already do it) this has been implemented as well. I don't think this is something that games would do by accident since it's easy to notice.

For the Mario portrait specifically, the blur effect is achieved with indirect textures; they have a 128 by 128 texture (generated from an EFB (so at higher IRs, it will be a higher resolution, but the scale factor is still 128 by 128)) that shows Mario, and another 64 by 64 texture that contains random greyscale noise. They also have separate texture coordinates for both of them (the Mario texture uses texture coordinate 0, with values ranging horizontally from .1 to .9 and vertically from 0 to 1, while the noise texture uses texture coordinate 1, with both axes ranging from 0 to 1). Since the texture coordinates get multiplied by the texture scale (more or less; it's actually a separate parameter the game controls), texture coordinate 0 more strictly ranges from 12.8 to 115.2 on one axis and 0 to 128 on the other, and texture coordinate 1 ranges from 0 to 64 on both axes.

The only problem is that Nintendo forgot to set the number of tex gens to 2, so only texture coordinate 0 is valid, triggering this undefined behavior. For the portrait, it (apparently) always renders using texture coordinate 0, and since the only purpose of the noise texture is to distort the image, that it was using the wrong texture coordinate presumably wasn't noticed. The only difference this causes is that the noise is more fine-grained than it would have been, since texture coordinate 0's scale is twice that of texture coordinate 1, causing the texture to be repeated twice in each direction, but that (at least in my opinion) looks better than what it would look like with 2 tex gens. (To be clear, with 2 tex gens, it would render correctly without this PR, and this PR does not change how it renders with 2 tex gens.)

Screenshots:

Broken portrait effect

MarioPortraitBroken

Fixed portrait effect

MarioPortraitFixed

"Intended" portrait effect (by hex-editing the FIFO to use 2 tex gens)

MarioPortraitIntended

Here is my hardware test, and here are some fragmentary, partially incorrect notes.

@Pokechu22
Copy link
Contributor Author

The difference in Fortune Street is/should be correct; the one part of the water references a nonexistant tex coord on stage 1. I'm not entirely sure why there aren't any differences for it on OGL. Note that it only has one tex coord in the vertex spec (coord 0), with coords 1 and 2 coming from the normals.

For object 644 (commands are in 643 at the moment):

000d1049:

BP register BPMEM_TREF number 0
Stage 0 texmap: 0
Stage 0 tex coord: 0
Stage 0 enable texmap: Yes
Stage 0 color channel: Color chan 0 (0)
Stage 1 texmap: 3
Stage 1 tex coord: 3
Stage 1 enable texmap: Yes
Stage 1 color channel: Color chan 1 (1)

000d11f9:

BP register BPMEM_GENMODE
Num tex gens: 3
Num color channels: 2
Unused bit: 0
Flat shading (unconfirmed): No
Multisampling: No
Num TEV stages: 5
Cull mode: Back-facing primitives only (1)
Num indirect stages: 0
ZFreeze: No

I also implemented the 0 tex coords making things black case.

Lastly, I hardware tested it, and fb_addprev applies even if the indirect stage's matrix is off or if the number of indirect stages is 0; currently that works correctly for the software renderer but not specialized shaders or ubershaders (specifically for the indirect stages = 0 case). (Note that that hardware test also exhibits the other undefined behavior where it uses an indirect stage when the number of indirect stages is 0; you can see a slightly different pattern in the 0YY and 0YN cases. That was only due to lazyness on my part but it's still interesting.)

@Pokechu22
Copy link
Contributor Author

I did one more hardware test and confirmed that in addition to fb_addprev, the indirect texture wrap functions also work when the indirect stage is not active (here's the test). I don't think there's a situation when they'd be useful (regular textures already wrap properly (as long as they're a power of 2, but indirect texture wrap only does powers of 2 as well)); I think it's only useful when you have an indirect texture to implement tilemaps/spritesheets/something like that where the indirect texture selects the tile/sprite. I also found that a value of 7 behaves the same as 6 (ITW_0), in that it results in the texture coord always being 0; shader generation currently crashes in that case. (Side note, the comment about the value being 1 and not 1 << 7 was wrong; the value should be 1 because 1 - 1 = 0 which would lead to x & 0 => 0, just like the first value is 0 which leads to 0 - 1 = -1 so x & -1 = x. Except all of those are pointless, because the array isn't used in those cases. The comment was added in 3a63899 for reference.)

@Pokechu22 Pokechu22 marked this pull request as ready for review April 19, 2021 03:30
@Pokechu22 Pokechu22 changed the title Fix out of bounds tex coord behavior Fix out of bounds tex coord behavior; always apply fb_addprev and tex coord wrapping Apr 19, 2021
out.Write("\tint2 fixpoint_uv{} = int2(", i);
out.Write("(tex{}.z == 0.0 ? tex{}.xy : tex{}.xy / tex{}.z)", i, i, i, i);
out.Write(" * " I_TEXDIMS "[{}].zw);\n", i);
// TODO: S24 overflows here?
Copy link
Contributor

@iwubcode iwubcode Apr 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the story here? Is this indicating a concern or is this something that needs future investigation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comment exists in specialized shaders:

for (u32 i = 0; i < uid_data->genMode_numtexgens; ++i)
{
out.Write("\tint2 fixpoint_uv{} = int2(", i);
out.Write("(tex{}.z == 0.0 ? tex{}.xy : tex{}.xy / tex{}.z)", i, i, i, i);
out.Write(" * " I_TEXDIMS "[{}].zw);\n", i);
// TODO: S24 overflows here?
}

Ubershaders already had (basically) the same code, just in a different place and without the comment.

The software renderer doesn't seem to worry about it at all:

// tex coords
for (unsigned int i = 0; i < bpmem.genMode.numtexgens; i++)
{
float projection = invW;
float q = TexSlopes[i][2].GetValue(dx, dy) * invW;
if (q != 0.0f)
projection = invW / q;
pixel.Uv[i][0] = TexSlopes[i][0].GetValue(dx, dy) * projection;
pixel.Uv[i][1] = TexSlopes[i][1].GetValue(dx, dy) * projection;
}

// tex coords
for (unsigned int i = 0; i < bpmem.genMode.numtexgens; i++)
{
// multiply by 128 because TEV stores UVs as s17.7
tev.Uv[i].s = (s32)(pixel.Uv[i][0] * 128);
tev.Uv[i].t = (s32)(pixel.Uv[i][1] * 128);
}

It's something that might need further investigation, given the inconsistent results I got with my first attempt at hardware testing where I ended up overflowing really hard. The zww-water test case is also incorrect in the software renderer so it might be wrong. But those are something to come back to in the future.

@Pokechu22 Pokechu22 force-pushed the oob-texcoord branch 2 times, most recently from 1d44854 to 7a93147 Compare April 23, 2021 18:26
@Pokechu22 Pokechu22 force-pushed the oob-texcoord branch 2 times, most recently from c2e2100 to c37b5fe Compare April 24, 2021 19:49
@Pokechu22 Pokechu22 mentioned this pull request Apr 24, 2021
@iwubcode
Copy link
Contributor

The last two commits change the pixelshader/ubershader to always run the indirect stage logic. Was that already present in the software renderer or possibly not applicable here?

@Pokechu22
Copy link
Contributor Author

Yes, the software renderer always ran it (in Tev::Indirect, which is always called even if no indirect stage is enabled). There actually was a comment about the software renderer doing that in PixelShaderManager (with the stage < bpmem.genMode.numindstages condition above):

constants.pack1[i][2] = bpmem.tevind[i].hex; // TODO: This match shadergen, but videosw
// will always wrap.

Copy link
Contributor

@JMC47 JMC47 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through and played a few games with this pull request. I made sure nothing was really broken. As was said on IRC, we have pretty good coverage on indirect textures so there's not really a high chance anything will break. If it does, we could fix it.

Pokechu22 and others added 7 commits May 7, 2021 16:14
YAGCD uses BI0/BC0/BI1/BC1/BI2/BC2/BI3/BC3, so I'm pretty sure the BI2/BC3/BI4/BC4 names are a typo that just was propagated.
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.

Co-authored-by: Pokechu22 <Pokechu022@gmail.com>
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.
This change should have no behavioral differences itself, but allows for changing the behavior of out of bounds tex coord indices more easily in the next commit.  Without this change, returning tex0 for out of bounds cases and then applying the fixed-point logic would use the wrong tex dimension info (tex0 with I_TEXDIMS[1] or such), which is inaccurate.
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.
Hardware testing has confirmed that fb_addprev and wrapping both run even when the indirect stage is disabled.
Hardware testing has confirmed that fb_addprev and wrapping both run even when the indirect stage is disabled.
@dolphin-emu-bot
Copy link
Contributor

FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:

  • lm-mario-portrait on ogl-lin-mesa: diff
  • rs2-glass on ogl-lin-mesa: diff
  • rs3-bumpmapping on ogl-lin-mesa: diff
  • fortune-street on sw-lin-mesa: diff
  • lm-mario-portrait on sw-lin-mesa: diff
  • lm-mario-portrait on ogl-lin-radeon: diff
  • rs3-bumpmapping on ogl-lin-radeon: diff

automated-fifoci-reporter

@JMC47
Copy link
Contributor

JMC47 commented May 9, 2021

Finally verified how Mario looks in the final cutscene. It looks identical to console on a zoom in.

GLME01_2021-05-09_14-43-24

https://gist.githubusercontent.com/Pokechu22/897780fcb026f106f537e0cf0803fef0/raw/6857cff633c1d944c395f7286595c321b944a5d5/Screenshot3h13m29s00f.png

@JMC47 JMC47 merged commit a66852d into dolphin-emu:master May 9, 2021
10 checks passed
@Pokechu22
Copy link
Contributor Author

4:3 images:

Broken portrait effect

MarioPortraitBroken

Fixed portrait effect

MarioPortraitFixed

"Intended" portrait effect (by hex-editing the FIFO to use 2 tex gens)

MarioPortraitIndented

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
6 participants