Add ARL-bias to work around OpenGL float behaviour #79

JayFoxRox · 2018-07-11T22:49:47Z

This adds a small bias to the ARL (address register loads) Vertex Shader instruction to work around an issue we have with OpenGL / GLSL. I've tried a short version of the explanation in the GLSL code comment.

See #78 for more details. That issue also has a screenshot of how Azurik would look on current master if it ran (Azurik needs some additional patches to run on current master (fifo-refactor), this patch is purely about the graphical issues with characters).

An issue with this workaround is, that this problem also exists in the other direction, where we now accidentally turn 16.99 into 17 when it really was supposed to be 16.
However, this is probably less likely to happen, so this workaround should do more good than bad.

Here is how it looks with the bias:

Screenshot by John GodGames

According to John GodGames, this also fixes the known issue in Digimon Rumble Arena 2.

This only fixes one instance how the problems mentioned in #78 manifest, so that issue is still valid after merge. This is not a permanent solution.

JayFoxRox · 2018-07-11T22:51:39Z

hw/xbox/nv2a/nv2a_vsh.c

-    "  return int(floor(src));\n"
+    "  /* Xbox GPU does specify rounding, OpenGL doesn't; so we need a bias.\n"
+    "   * Example: We probably want to floor 16.99.. to 17, not 16.\n"
+    "   * Source of error (why we get 16.99.. instead of 17.0) is typically\n"


nit: "typically" → "probably"

There's some uncertainty in #78 regarding why this issue doesn't also affect a real Xbox.

mborgerson · 2018-07-12T21:03:31Z

hw/xbox/nv2a/nv2a_vsh.c

+    "   * If we multiply these rounded values by 255 later, we get:\n"
+    "   *   17.00 (ARL result = 17) or 16.99 (ARL result = 16).\n"
+    "   * We assume the intend was to get 17, so we add our bias to fix it. */\n"
+    "  return int(floor(src + 0.001));\n"


Is ARL domain always [0, 255]?

It's a bit more complicated than this.

tl;dr: ARL input range is clamped to [-64; 160] in the original spec, but Xbox has larger constant space than original spec which claims this. Other limitation is relative access in [-64; 63] range, but I'm not sure if this is applies to a0. Xbox also implements a second spec which removed these limits. Click here to unfold quotes from spec.

See section 2.14.1.10.1 for ARL pseudo-code in https://www.khronos.org/registry/OpenGL/extensions/NV/NV_vertex_program.txt .

Also see this quote from the same spec:

What range of relative addressing offsets should be allowed?

RESOLUTION: -64 to 63. Negative offsets are useful for accessing a table centered at zero without extra bias instructions. Having the offsets support much larger magnitudes just seems to increase the required instruction widths. The -64 to 63 range seems like a reasonable compromise.

Furthermore:

What happens when the source scalar value for the ARL instruction
is an extremely positive or extremely negative floating-point value?
Is there a problem mapping the value to a constrained integer range?

RESOLUTION: It is not a problem. Relative addressing can by offset by a limited range of offsets (-64 to 63). Relative addressing that falls outside of the 0 to 95 range of program parameter registers is automatically mapped to (0,0,0,0). Clamping the source scalar value for ARL to the range -64 to 160 inclusive is sufficient to ensure that relative addressing is out of range.

(This clamp is missing from the ARB spec, I'm not sure what it says regarding out-of-bounds reads)

Also a little bit of additional info regarding the addressed constant registers: Xbox has 192 constant registers but the original spec said:

<progParamRegNum> ::= decimal integer from 0 to 95 inclusive

and the addressing is typically done as "0 to 95" and then "-96 to -1" for the second bank.
Which bank (or both!) is used depends on the GPU configuration.

We don't enforce these limits anywhere because we don't know for certain what actually happens and most games probably won't trigger these. It's entirely seperated from this issue / PR, too.

I believe I tested the bounds using nv2a-re in the past, but I probably failed to document my findings (also there's many GPU configuration registers - it's not exactly easy to test this without better tools; nv2a-re also only allows state-shader debugging which uses constant space differently).

mborgerson · 2018-07-12T22:11:04Z

Thank you!

shogun3d · 2018-07-30T20:22:09Z

Shogun is very pleased.

Shogun.

nv2a: Add ARL-bias to work around OpenGL float behaviour

d447e8d

JayFoxRox commented Jul 11, 2018

View reviewed changes

mborgerson reviewed Jul 12, 2018

View reviewed changes

mborgerson merged commit 4ffe83b into xqemu:master Jul 12, 2018

JayFoxRox deleted the arl-bias branch July 12, 2018 23:26

JayFoxRox mentioned this pull request Aug 25, 2018

nv2a: Fix EXP and LOG #102

Merged

JayFoxRox mentioned this pull request Sep 3, 2018

Vertex program EXP behaviour does not respect our ARL-hack #105

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ARL-bias to work around OpenGL float behaviour #79

Add ARL-bias to work around OpenGL float behaviour #79

JayFoxRox commented Jul 11, 2018

JayFoxRox Jul 11, 2018

mborgerson Jul 12, 2018

JayFoxRox Jul 12, 2018 •

edited

mborgerson commented Jul 12, 2018

shogun3d commented Jul 30, 2018

Add ARL-bias to work around OpenGL float behaviour #79

Add ARL-bias to work around OpenGL float behaviour #79

Conversation

JayFoxRox commented Jul 11, 2018

JayFoxRox Jul 11, 2018

Choose a reason for hiding this comment

mborgerson Jul 12, 2018

Choose a reason for hiding this comment

JayFoxRox Jul 12, 2018 • edited

Choose a reason for hiding this comment

mborgerson commented Jul 12, 2018

shogun3d commented Jul 30, 2018

JayFoxRox Jul 12, 2018 •

edited