Need for Speed Most Wanted: Missing orange/yellow color filter #890

Closed
vbnmh00 opened this issue May 6, 2022 · 8 comments · Fixed by #999
Labels
bug Something isn't working

Comments

@vbnmh00

vbnmh00 commented May 6, 2022

Title

https://xemu.app/titles/4541007b/#Need-for-Speed-Most-Wanted

Bug Description

gdfh
hhnf

Expected Behavior

gshgb
hethjn

xemu Version

0.6.6

System Information

OS: Windows 10
CPU: AMD Ryzen 7 5800X 8-Core Processor
GPU: NVIDIA GeForce GTX 1060 6GB/PCIe/SSE2
GPU Driver: 4.0.0 NVIDIA 496.76

Additional Context

No response

@vbnmh00 vbnmh00 added the bug Something isn't working label May 6, 2022
@abaire
Contributor

abaire commented May 27, 2022

Hey @vbnmh00, could you add information about how to get to the scenes in either of those pictures in the game?

If you can find a case where I can reproduce the problem with a quick race, without having to play career mode, that would be ideal. (I'm not super familiar with the game, so sorry if these are already from easy-to-reach levels.)

@vbnmh00
Author

vbnmh00 commented May 27, 2022

For the first picture (orange filter) - any of the prologue races in career.
For the second picture (brown/yellow filter) - with any quick race.

@abaire
Contributor

abaire commented May 28, 2022

Log from an nv2a-trace:
nv2a_log.txt

Draw 945 is where the yellow filter first appears to be applied. It's drawing quads, which the game only seems to do for the fullscreen effect and some billboarded effects like tire/exhaust smoke.

Looking at a renderdoc capture from xemu, I see it applying a filter that clearly darkens and slightly yellows the screen, but it doesn't appear to match hardware faithfully, so more investigation is needed.
Oddly, the filtering appears to be done in the vertex shader rather than the pixel shader; the pixel shader is just a passthrough:

vec4 v0 = pD0;
vec4 v1 = pD1;
vec4 ab;
vec4 cd;
vec4 mux_sum;
vec4 t0 = vec4(0.0); /* PS_TEXTUREMODES_NONE */
vec4 t1 = vec4(0.0); /* PS_TEXTUREMODES_NONE */
vec4 t2 = vec4(0.0); /* PS_TEXTUREMODES_NONE */
vec4 t3 = vec4(0.0); /* PS_TEXTUREMODES_NONE */
vec4 r0;
r0.a = 1.0;
// Stage 0
ab.rgb = clamp(vec3((max(v0.rgb, 0.0) * (1.0 - clamp(vec4(0.0).rgb, 0.0, 1.0)))), -1.0, 1.0);
r0.rgb = ab.rgb;
ab.a = clamp(((max(v0.a, 0.0) * (1.0 - clamp(vec4(0.0).a, 0.0, 1.0)))), -1.0, 1.0);
r0.a = ab.a;
// Final Combiner
fragColor.rgb = max(r0.rgb, 0.0) + mix(vec3(max(vec4(0.0).rgb, 0.0)), vec3(max(vec4(0.0).rgb, 0.0)), vec3(max(vec4(0.0).rgb, 0.0)));
fragColor.a = max(r0.a, 0.0);

For the draw in question, v1 = (0.4156863, 0.3921569, 0.3921569, 0.5019608) for all vertices and c[121] = (1.00, 1.00, 1.00, 1.00).
Vertex shader:

  /* Slot 0: 0x00000000 0x00EC201B 0x08363800 0x20B04800 */
  DP4(oPos,y, v0, c[97]);

  /* Slot 1: 0x00000000 0x00EC401B 0x08365800 0x20B02800 */
  DP4(oPos,z, v0, c[98]);

  /* Slot 2: 0x00000000 0x00EC601B 0x08367800 0x20B01800 */
  DP4(oPos,w, v0, c[99]);

  /* Slot 3: 0x00000000 0x007740AA 0xC4001002 0xB1200000 */
  ADD(R2,w, R12.z, c[186].x);

  /* Slot 4: 0x00000000 0x06EC001B 0x08361BFF 0x10B88800 */
  DP4(oPos,x, v0, c[96]);
  RCC(R1,x, R12.w);

  /* Slot 5: 0x00000000 0x005740FF 0x24AB5800 0x21200000 */
  MUL(R2,w, R2.w, c[186].y);

  /* Slot 6: 0x00000000 0x02544055 0xC4002800 0xB130F854 */
  MUL(_temp_vec,w, R12.y, R1.x);
  MOV(oT1,xyzw, c[162].x);
  R3.w = _temp_vec.w;

  /* Slot 7: 0x00000000 0x013760FF 0x25FF7800 0x21200000 */
  MIN(R2,w, R2.w, c[187].w);

  /* Slot 8: 0x00000000 0x035440FF 0x37FE6800 0xB130F85C */
  MAX(_temp_vec,w, R3.w, -R3.w);
  MOV(oT2,xyzw, c[162].x);
  R3.w = _temp_vec.w;

  /* Slot 9: 0x00000000 0x015440FF 0x24005800 0x21200000 */
  MAX(R2,w, R2.w, c[162].x);

  /* Slot 10: 0x00000000 0x005740FF 0x35555800 0x21300000 */
  MUL(R3,w, R3.w, c[186].z);

  /* Slot 11: 0x00000000 0x007440AA 0x2C0017FC 0xD1300000 */
  ADD(R3,w, c[162].z, -R3.w);

  /* Slot 12: 0x00000000 0x0054001A 0xC4341800 0x20B0E800 */
  MUL(oPos,xyz, R12.xyz, c[160].xyz);

  /* Slot 13: 0x00000000 0x004F221B 0x9C363000 0x20B0F818 */
  MUL(oD0,xyzw, c[121], v1);

  /* Slot 14: 0x00000000 0x0074641B 0x2800106C 0xF0B0F848 */
  ADD(oT0,xyzw, v2, c[163]);

  /* Slot 15: 0x00000000 0x004000FF 0x35FE4800 0x20B01828 */
  MUL(oFog,x, R3.w, R2.w);

  /* Slot 16: 0x00000000 0x0094201A 0xC4002868 0x70B0E801 */
  MAD(oPos,xyz, R12.xyz, R1.x, c[161].xyz);
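
As a quick sanity check on the numbers above, here is a minimal Python sketch of what slot 13 and the passthrough pixel shader produce for this draw. It only models the `MUL(oD0, c[121], v1)` step and the clamp to [0, 1]; blending with the framebuffer is not modeled, and the variable names are just for illustration.

```python
# Vertex attribute and constant values quoted in the comment above.
v1 = (0.4156863, 0.3921569, 0.3921569, 0.5019608)
c121 = (1.00, 1.00, 1.00, 1.00)

# Slot 13: MUL(oD0,xyzw, c[121], v1) is a component-wise multiply.
oD0 = tuple(c * v for c, v in zip(c121, v1))

# The passthrough pixel shader effectively clamps to [0, 1] and writes the color unchanged.
frag = tuple(min(max(x, 0.0), 1.0) for x in oD0)

# The same color expressed as 8-bit channel values.
rgba8 = tuple(round(x * 255) for x in frag)
print(rgba8)
```

So the quad is a flat, slightly warm grey at roughly half alpha, which fits a filter that darkens the screen rather than one that applies a gradient lookup.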

@abaire
Contributor

abaire commented May 30, 2022

I looked more closely at the nv2a-trace, and I think the shader there doesn't match the one I thought was applying the filter in xemu.

On hardware, the draw that appears to apply the filter takes the existing framebuffer in as the second texture and has a much more complex pixel shader coupled with a simpler vertex shader.

VSH:

MOV R1.xyzw, v0
MOV oD0.xyzw, v3 + RCP R1.w, R1.w
MOV oFog.xyzw, v4.w
MUL R2.xyzw, R1, c[0] + MOV oD1.xyzw, v4
ADD oPos.xyzw, R2, c[1]
MOV oPts.xyzw, v1.x
MOV oB0.xyzw, v7
MOV oB1.xyzw, v8
MOV oT0.xyzw, v9
MOV oT1.xyzw, v10
MOV oT2.xyzw, v11
MOV oT3.xyzw, v12

PSH:

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_CONTROL<0x1E60> (0x11008 {Count:8, Mux:LSB, Factor0:EACH_STAGE, Factor1:EACH_STAGE})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[0]<0xAC0> (0x9010000 {[A: Tex1  Map:UNSIGNED_IDENTITY], [B: C0  Map:UNSIGNED_IDENTITY], [C: Zero  Map:UNSIGNED_IDENTITY], [D: Zero  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[0]<0x1E40> (0x20D0 {AB_Reg:R1Temp, CD_Reg:Discard, AB+CD_Reg:Discard, AB_DOT:true, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[0]<0x260> (0x30101010 {[A: Zero Alpha Map:UNSIGNED_INVERT], [B: Zero Alpha Map:UNSIGNED_IDENTITY], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_OCW[0]<0xAA0> (0xD0 {AB_Reg:R1Temp, CD_Reg:Discard, MuxSum_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[1]<0xAC4> (0xD012109 {[A: R1Temp  Map:UNSIGNED_IDENTITY], [B: C0  Map:UNSIGNED_IDENTITY], [C: C0  Map:UNSIGNED_INVERT], [D: Tex1  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[1]<0x1E44> (0xC00 {AB_Reg:Discard, CD_Reg:Discard, AB+CD_Reg:R0Temp, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[1]<0x264> (0x19301010 {[A: Tex1 Alpha Map:UNSIGNED_IDENTITY], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_OCW[1]<0xAA4> (0xC0 {AB_Reg:R0Temp, CD_Reg:Discard, MuxSum_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[2]<0xAC8> (0xA0C0000 {[A: Tex2  Map:UNSIGNED_IDENTITY], [B: R0Temp  Map:UNSIGNED_IDENTITY], [C: Zero  Map:UNSIGNED_IDENTITY], [D: Zero  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[2]<0x268> (0x30301010 {[A: Zero Alpha Map:UNSIGNED_INVERT], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[3]<0xACC> (0xD01210C {[A: R1Temp  Map:UNSIGNED_IDENTITY], [B: C0  Map:UNSIGNED_IDENTITY], [C: C0  Map:UNSIGNED_INVERT], [D: R0Temp  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[3]<0x1E4C> (0xC00 {AB_Reg:Discard, CD_Reg:Discard, AB+CD_Reg:R0Temp, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_OCW[3]<0xAAC> (0xC0 {AB_Reg:R0Temp, CD_Reg:Discard, MuxSum_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[4]<0xAD0> (0x90B200C {[A: Tex1  Map:UNSIGNED_IDENTITY], [B: Tex3  Map:UNSIGNED_IDENTITY], [C: Zero  Map:UNSIGNED_INVERT], [D: R0Temp  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[4]<0x1E50> (0xC00 {AB_Reg:Discard, CD_Reg:Discard, AB+CD_Reg:R0Temp, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[4]<0x270> (0x30301010 {[A: Zero Alpha Map:UNSIGNED_INVERT], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[5]<0xAD4> (0xC200108 {[A: R0Temp  Map:UNSIGNED_IDENTITY], [B: Zero  Map:UNSIGNED_INVERT], [C: C0  Map:UNSIGNED_IDENTITY], [D: Tex0  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[5]<0x1E54> (0xC00 {AB_Reg:Discard, CD_Reg:Discard, AB+CD_Reg:R0Temp, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[5]<0x274> (0x30301010 {[A: Zero Alpha Map:UNSIGNED_INVERT], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_OCW[5]<0xAB4> (0xC0 {AB_Reg:R0Temp, CD_Reg:Discard, MuxSum_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[6]<0xAD8> (0xC010000 {[A: R0Temp  Map:UNSIGNED_IDENTITY], [B: C0  Map:UNSIGNED_IDENTITY], [C: Zero  Map:UNSIGNED_IDENTITY], [D: Zero  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[6]<0x1E58> (0x200D0 {AB_Reg:R1Temp, CD_Reg:Discard, AB+CD_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:ShiftLeft2, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[6]<0x278> (0x30301010 {[A: Zero Alpha Map:UNSIGNED_INVERT], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_IDENTITY], [D: Zero Alpha Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_OCW[6]<0xAB8> (0xC0 {AB_Reg:R0Temp, CD_Reg:Discard, MuxSum_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[7]<0xADC> (0xC20200D {[A: R0Temp  Map:UNSIGNED_IDENTITY], [B: Zero  Map:UNSIGNED_INVERT], [C: Zero  Map:UNSIGNED_INVERT], [D: R1Temp  Map:UNSIGNED_IDENTITY]})

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_SPECULAR_FOG_CW0<0x288> (0x13010C00 {[A: Fog Alpha], [B: C0], [C: R0Temp], [D: Zero]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_SPECULAR_FOG_CW1<0x28C> (0x1C00 {[E: Zero], [F: Zero], [G: R0Temp Alpha]})
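
For anyone reading the raw words in the dump, here is a small Python sketch that decodes a combiner input control word (ICW) into its four inputs. The bit layout (one byte per input, A in the top byte; low nibble = source, bit 4 = alpha, top three bits = mapping) is inferred from the raw values and decoded text in the dump above, not taken from documentation, and the tables only cover the sources and mappings that actually appear in this dump. The function names are hypothetical.

```python
# Source and mapping codes observed in the dump above; anything else is shown numerically.
SOURCES = {0x0: "Zero", 0x1: "C0", 0x8: "Tex0", 0x9: "Tex1",
           0xA: "Tex2", 0xB: "Tex3", 0xC: "R0Temp", 0xD: "R1Temp"}
MAPS = {0: "UNSIGNED_IDENTITY", 1: "UNSIGNED_INVERT"}


def decode_input(byte):
    """Split one input byte into (source, uses_alpha, mapping)."""
    source = byte & 0x0F
    uses_alpha = bool(byte & 0x10)
    mapping = byte >> 5
    return (SOURCES.get(source, hex(source)), uses_alpha, MAPS.get(mapping, str(mapping)))


def decode_icw(word):
    """Decode a 32-bit ICW; A is the most significant byte, D the least."""
    return {name: decode_input((word >> shift) & 0xFF)
            for name, shift in zip("ABCD", (24, 16, 8, 0))}


# NV097_SET_COMBINER_COLOR_ICW[1] from the dump.
print(decode_icw(0xD012109))
```

Decoding `0xD012109` this way gives A: R1Temp, B: C0, C: C0 inverted, D: Tex1, matching the annotated line for `NV097_SET_COMBINER_COLOR_ICW[1]`.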

@abaire
Contributor

abaire commented May 30, 2022

Looking at the shader in more depth, it's using the fog value in the final combiner but the fog value coming out of the vertex shader is -1, which seems likely to be an invalid value.

UPDATE: While -1 still seems wrong, the fog param is only used to select between a constant color and the calculated color and is not the problem.
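
To illustrate why the -1 fog value is harmless here, a minimal Python sketch of the final combiner as configured in the dump (A: Fog Alpha, B: C0, C: R0Temp, D: Zero), i.e. A*B + (1-A)*C + D with each input clamped at zero from below. The color values passed in are made up for the example, and the function name is hypothetical.

```python
def final_combiner_rgb(fog_alpha, c0, r0):
    """A*B + (1 - A)*C with A = fog alpha, B = C0, C = R0; D is Zero in this setup."""
    a = max(fog_alpha, 0.0)  # inputs are clamped at zero, so -1 becomes 0
    return tuple(a * max(b, 0.0) + (1.0 - a) * max(c, 0.0) for b, c in zip(c0, r0))


# Hypothetical constant color and calculated color.
constant_color = (1.0, 0.0, 0.0)
calculated = (0.2, 0.4, 0.6)

print(final_combiner_rgb(-1.0, constant_color, calculated))  # fog of -1 selects the calculated color
print(final_combiner_rgb(1.0, constant_color, calculated))   # fog of 1 selects the constant color
```

With fog clamped to 0 the constant color drops out entirely, so the filter result depends only on what the earlier stages put in R0.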

Of more interest is the way that the two gradient textures are accessed. Both of the gradients (a grey one at t2, a yellow one at t3) are 256x1 ARGB swizzled textures.

The texture coords used to reference the textures appear to be linear, however. For the quad, both use (0, 0), (256, 0), (256,1), (0, 1). The textures are configured to use clamping for ST (repeat for R). The grey gradient is set up w/ Linear min+mag+mip, the yellow is linear min+mag, mip=none.

UPDATE: Looking at the shader itself, the texture coords are especially unusual:

vec4 t0 = textureProj(texSamp0, pT0.xyw);
pT1.xy = texScale1 * pT1.xy;
vec4 t1 = textureProj(texSamp1, pT1.xyw);
vec4 t2 = texture(texSamp2, t0.ar);
vec4 t3 = texture(texSamp3, t0.ar);

The golden gradient ends up producing black output when t0.ar is used as the texture coords, because t0 is fully transparent. Looking at the hardware trace, I see what I think is the same highlighting texture, and it clearly has a non-zero alpha channel.

By modifying the "x" value in those texture lookups I can get an output that looks much more similar to hardware (setting it to 0.5 in this case):
[Image: Screenshot_20220530_073147]

So I think the issue is the alpha output of the draw prior to the filter application that produces the highlight edge detection image.
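
A rough Python sketch of why a zero alpha in t0 gives black: the alpha is used as the horizontal coordinate into the 256x1 gradient, so alpha 0 always lands on the first texel. The gradient contents here are an assumption (black ramping up to a gold tone, which the black output suggests but the trace doesn't confirm), filtering is simplified to nearest-texel, and the names are hypothetical.

```python
def sample_clamped(texels, u):
    """Nearest-texel lookup with a normalized coordinate and clamp addressing."""
    u = min(max(u, 0.0), 1.0)
    index = min(int(u * len(texels)), len(texels) - 1)
    return texels[index]


# Hypothetical 256x1 gradient running from black to a gold tone.
gold = (1.0, 0.8, 0.2)
gradient = [tuple(ch * i / 255 for ch in gold) for i in range(256)]

t0_alpha = 0.0  # what xemu produces for the highlight texture
print(sample_clamped(gradient, t0_alpha))  # first texel: black

print(sample_clamped(gradient, 0.5))  # forcing x to 0.5 picks a mid-gradient texel
```

Under these assumptions the lookup itself is behaving correctly; it is the coordinate fed into it that is wrong.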

@abaire
Contributor

abaire commented May 30, 2022

The pixel shader for the draw that produces the highlight output:

// Stage 0
mux_sum.rgb = clamp(vec3(((max(t0.rgb, 0.0) * (1.0 - clamp(vec4(0.0).rgb, 0.0, 1.0))) + (-t1.rgb * (1.0 - clamp(vec4(0.0).rgb, 0.0, 1.0))))), -1.0, 1.0);
r0.rgb = mux_sum.rgb;
ab.a = clamp(((max(t0.a, 0.0) * (1.0 - clamp(vec4(0.0).a, 0.0, 1.0)))), -1.0, 1.0);
r0.a = ab.a;
// Stage 1
ab.rgb = clamp(vec3(dot(max(t1.rgb, 0.0), max(c0_1.rgb, 0.0))), -1.0, 1.0);
r1.rgb = ab.rgb;
ab.a = clamp((((1.0 - clamp(vec4(0.0).a, 0.0, 1.0)) * max(vec4(0.0).a, 0.0))), -1.0, 1.0);
r1.a = ab.a;
// Stage 2
ab.rgb = clamp(vec3((max(r1.rgb, 0.0) * max(vec4(0.0).rgb, 0.0))), -1.0, 1.0);
r1.rgb = ab.rgb;
ab.a = clamp(((max(r1.b, 0.0) * (1.0 - clamp(vec4(0.0).a, 0.0, 1.0)))), -1.0, 1.0);
r0.a = ab.a;
// Final Combiner
fragColor.rgb = max(vec4(0.0).rgb, 0.0) + mix(vec3(max(r0.rgb, 0.0)), vec3(max(c0_8.rgb, 0.0)), vec3(max(pFog.aaa, 0.0)));
fragColor.a = max(r0.a, 0.0);

Stage 2 blacks out r0.a and causes it to never produce alpha. This does not appear to match the behavior I see from the hardware trace. However, the relevant part of the pgraph dump from the HW matches the observed shader behavior:

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_ICW[2]<0xAC8> (0xD000000 {[A: R1Temp  Map:UNSIGNED_IDENTITY], [B: Zero  Map:UNSIGNED_IDENTITY], [C: Zero  Map:UNSIGNED_IDENTITY], [D: Zero  Map:UNSIGNED_IDENTITY]})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_COLOR_OCW[2]<0x1E48> (0xD0 {AB_Reg:R1Temp, CD_Reg:Discard, AB+CD_Reg:Discard, AB_DOT:false, CD_DOT:false, MUX:false, OP:NoShift, AB_BlueToAlpha:false, CD_BlueToAlpha:false})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COMBINER_ALPHA_ICW[2]<0x268> (0xD303030 {[A: R1Temp  Map:UNSIGNED_IDENTITY], [B: Zero Alpha Map:UNSIGNED_INVERT], [C: Zero Alpha Map:UNSIGNED_INVERT], [D: Zero Alpha Map:UNSIGNED_INVERT]})

UPDATE: Looking at the source textures for this draw, I also see an apparent pixel offset bug, not sure if that's a contributing factor or not.

@abaire
Contributor

abaire commented May 30, 2022

Looking again, I bet this is another combiner interdependence error. I'm guessing the alpha calculation needs to use the r1 value from the previous stage and not the value calculated by the color combiner in the same stage.

@abaire
Contributor

abaire commented May 30, 2022

Confirmed that this is an issue where none of the outputs of a stage should be able to affect other components of the same stage (similar to #720). PR #999 fixes this bug.
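
To make the interdependence concrete, here is a minimal Python sketch of stage 2 of the highlight shader evaluated two ways. It is only an illustration of the ordering problem, not the actual change in PR #999; the register dictionary, function names, and input values are all hypothetical.

```python
def stage2_sequential(regs):
    """Buggy order: the alpha combiner sees r1.rgb as rewritten by this stage's color combiner."""
    regs = dict(regs)
    # Color combiner: A = R1, B = Zero, so r1.rgb becomes 0.
    regs["r1_rgb"] = tuple(max(x, 0.0) * 0.0 for x in regs["r1_rgb"])
    # Alpha combiner: A = R1 blue, B = (1 - Zero), written to r0.a.
    regs["r0_a"] = max(regs["r1_rgb"][2], 0.0) * 1.0
    return regs


def stage2_snapshot(regs):
    """Both combiners read the register values from before the stage; writes land afterwards."""
    before = dict(regs)
    after = dict(regs)
    after["r1_rgb"] = tuple(max(x, 0.0) * 0.0 for x in before["r1_rgb"])
    after["r0_a"] = max(before["r1_rgb"][2], 0.0) * 1.0
    return after


# Hypothetical register state coming out of stage 1.
state = {"r1_rgb": (0.5, 0.5, 0.5), "r0_a": 0.25}

print(stage2_sequential(state)["r0_a"])  # 0.0: alpha is always blacked out
print(stage2_snapshot(state)["r0_a"])    # 0.5: alpha carries the stage 1 dot product
```

In the sequential version r0.a can never be non-zero, which is the symptom seen in xemu; reading from the pre-stage values lets the stage 1 result reach the alpha channel.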
