New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ARL-bias to work around OpenGL float behaviour #79
Conversation
" return int(floor(src));\n" | ||
" /* Xbox GPU does specify rounding, OpenGL doesn't; so we need a bias.\n" | ||
" * Example: We probably want to floor 16.99.. to 17, not 16.\n" | ||
" * Source of error (why we get 16.99.. instead of 17.0) is typically\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "typically" → "probably"
There's some uncertainty in #78 regarding why this issue doesn't also affect a real Xbox.
" * If we multiply these rounded values by 255 later, we get:\n" | ||
" * 17.00 (ARL result = 17) or 16.99 (ARL result = 16).\n" | ||
" * We assume the intend was to get 17, so we add our bias to fix it. */\n" | ||
" return int(floor(src + 0.001));\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ARL domain always [0, 255]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit more complicated than this.
tl;dr: ARL input range is clamped to [-64; 160] in the original spec, but Xbox has larger constant space than original spec which claims this. Other limitation is relative access in [-64; 63] range, but I'm not sure if this is applies to a0. Xbox also implements a second spec which removed these limits. Click here to unfold quotes from spec.
See section 2.14.1.10.1 for ARL pseudo-code in https://www.khronos.org/registry/OpenGL/extensions/NV/NV_vertex_program.txt .
Also see this quote from the same spec:
What range of relative addressing offsets should be allowed?
RESOLUTION: -64 to 63. Negative offsets are useful for accessing a table centered at zero without extra bias instructions. Having the offsets support much larger magnitudes just seems to increase the required instruction widths. The -64 to 63 range seems like a reasonable compromise.
Furthermore:
What happens when the source scalar value for the ARL instruction
is an extremely positive or extremely negative floating-point value?
Is there a problem mapping the value to a constrained integer range?RESOLUTION: It is not a problem. Relative addressing can by offset by a limited range of offsets (-64 to 63). Relative addressing that falls outside of the 0 to 95 range of program parameter registers is automatically mapped to (0,0,0,0). Clamping the source scalar value for ARL to the range -64 to 160 inclusive is sufficient to ensure that relative addressing is out of range.
(This clamp is missing from the ARB spec, I'm not sure what it says regarding out-of-bounds reads)
Also a little bit of additional info regarding the addressed constant registers: Xbox has 192 constant registers but the original spec said:
<progParamRegNum> ::= decimal integer from 0 to 95 inclusive
and the addressing is typically done as "0 to 95" and then "-96 to -1" for the second bank.
Which bank (or both!) is used depends on the GPU configuration.
We don't enforce these limits anywhere because we don't know for certain what actually happens and most games probably won't trigger these. It's entirely seperated from this issue / PR, too.
I believe I tested the bounds using nv2a-re in the past, but I probably failed to document my findings (also there's many GPU configuration registers - it's not exactly easy to test this without better tools; nv2a-re also only allows state-shader debugging which uses constant space differently).
Thank you! |
Shogun is very pleased. Shogun. |
This adds a small bias to the ARL (address register loads) Vertex Shader instruction to work around an issue we have with OpenGL / GLSL. I've tried a short version of the explanation in the GLSL code comment.
See #78 for more details. That issue also has a screenshot of how Azurik would look on current master if it ran (Azurik needs some additional patches to run on current master (fifo-refactor), this patch is purely about the graphical issues with characters).
An issue with this workaround is, that this problem also exists in the other direction, where we now accidentally turn 16.99 into 17 when it really was supposed to be 16.
However, this is probably less likely to happen, so this workaround should do more good than bad.
Here is how it looks with the bias:
Screenshot by John GodGames
According to John GodGames, this also fixes the known issue in Digimon Rumble Arena 2.
This only fixes one instance how the problems mentioned in #78 manifest, so that issue is still valid after merge. This is not a permanent solution.