[dxvk] Optimize for the d3d9 Strict float emulation path #346

Blisto91 · 2024-01-09T18:34:01Z

The d3d9 Strict float emulation path in dxvk (see the links below for a technical description) is more correct than the default True path, but it is not enabled by default on all drivers because it carries a performance penalty.
Radv, and now also nvk, have code to optimize for this, so both use Strict out of the box without any performance penalty, and more games work without visual issues.

Amdvlk currently doesn't do this, so it either takes a performance penalty in games where dxvk sets Strict by default or risks visual issues in games where no such built-in config exists yet. A couple of examples illustrating the performance dip can be seen below.
Note that these games were chosen at random and are not meant to be worst-case scenarios. Also note that my test setup is fairly high end (RX6800 and 7950x) and so does not represent a typical one.

Risen

d3d9.floatEmulation = True

d3d9.floatEmulation = Strict

Dragons Dogma

d3d9.floatEmulation = True

d3d9.floatEmulation = Strict

See original dxvk PR doitsujin/dxvk#2294
See also radv MR https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436


ruiminzhao · 2024-02-19T08:16:16Z

@Blisto91 Thanks for your comments. I'm now investigating this issue on Amdvlk.
One question:
Do you have any SPIR-V generated with "d3d9.floatEmulation = True" or "d3d9.floatEmulation = Strict"? I want to know whether this setting is reflected in the shader, so that I can do the optimization based on the related flag in the SPIR-V.
Thanks.

Blisto91 · 2024-02-23T09:33:15Z

Hi there and thank you for the response. I am not personally skilled in this area but I have asked the dxvk devs for assistance when they have a bit of time.

DadSchoorse · 2024-03-23T09:05:11Z

I see GPUOpen-Drivers/llpc@e91a935 added an optimization for ((b==0.0 ? 0.0 : a) * (a==0.0 ? 0.0 : b)). But dxvk also emits fma((b==0.0 ? 0.0 : a), (a==0.0 ? 0.0 : b), c) . So unless llpc lowers fma to mul+add, you should also add a pattern that optimizes the fma version to v_fma_legacy_f32/v_mad_legacy_f32. And depending on if you run constant folding before the optimizations, you also want to handle the case where the comparison+select was optimized away for one mul operand, (a * (a==0.0 ? 0.0 : b))/fma(a, (a==0.0 ? 0.0 : b), c), if b is not constant zero.
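To illustrate why these rewrites are safe, here is a small Python sketch that models the legacy instructions as "a zero operand forces a zero product" and checks the select-based patterns against that model over a handful of special values. The helper names (`mul_legacy`, `fma_legacy`, `strict_mul`, `strict_fma`) are made up for this illustration, the fma is modeled as an unfused multiply-add, and signed zeros are treated as equal, so this is only a rough model and not the actual hardware or llpc behavior.

```python
import itertools
import math

def mul_legacy(a, b):
    """Model of a legacy multiply: a zero operand forces a zero result."""
    return 0.0 if a == 0.0 or b == 0.0 else a * b

def fma_legacy(a, b, c):
    """Model of a legacy multiply-add (unfused here for simplicity)."""
    return mul_legacy(a, b) + c

def strict_mul(a, b):
    """The select-based multiply pattern: (b==0 ? 0 : a) * (a==0 ? 0 : b)."""
    return (0.0 if b == 0.0 else a) * (0.0 if a == 0.0 else b)

def strict_fma(a, b, c):
    """The select-based fma pattern (unfused here for simplicity)."""
    return strict_mul(a, b) + c

def same(x, y):
    """Equality that treats NaN as equal to NaN and -0.0 as equal to 0.0."""
    return (math.isnan(x) and math.isnan(y)) or x == y

SAMPLES = [0.0, -0.0, 1.5, -2.0, math.inf, -math.inf, math.nan]

def patterns_match_legacy():
    """Check both patterns against the legacy model on all sample triples."""
    return all(
        same(strict_mul(a, b), mul_legacy(a, b))
        and same(strict_fma(a, b, c), fma_legacy(a, b, c))
        for a, b, c in itertools.product(SAMPLES, repeat=3)
    )

print(patterns_match_legacy())
```

In this model the two select operations together do exactly what the legacy multiply does on its own, which is why the whole pattern can collapse into one instruction.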

ruiminzhao · 2024-03-24T17:03:15Z

@DadSchoorse Thanks for your comment. I have now added more patterns as you suggested. The supported patterns are listed below:

((b==0.0 ? 0.0 : a) * (a==0.0 ? 0.0 : b)) ==>fmul_legacy(a,b)
a * (a==0.0?0.0:b) or (b==0.0?0.0:a) * b ==>fmul_legacy(a,b)
fma((b==0.0 ? 0.0 : a), (a==0.0 ? 0.0 : b), c) ==>fma_legacy(a,b,c)
fma(a, (a==0.0 ? 0.0 : b), c) or fma(b==0.0?0.0:a, b, c) ==>fma_legacy(a,b,c)

For 2.3, one more condition is the single operand(a or b) should not be constant zero here.

Please let me know if anything is missing. My fix is currently going through CI, and I look forward to merging and delivering it as soon as possible.
Thanks.

DadSchoorse · 2024-03-24T19:53:05Z

> For 2.3, one more condition is the single operand(a or b) should not be constant zero here.

What I've said before may have been a bit ambiguous, so just to make sure: For a * (a==0.0?0.0:b) it's important that b is not zero. So if (b.isConstant() && b.constantValue() != 0.0) { apply_opt(); }, not if (!b.isConstant() || b.constantValue() != 0.0).

Otherwise, your list matches what radv optimizes.
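A short Python sketch of why the constant check matters for the single-select form. It assumes the legacy multiply behaves as "a zero operand forces a zero product"; `mul_legacy`, `folded_mul` and `may_use_legacy` are hypothetical names used only for this illustration, not llpc or radv functions.

```python
import math

def mul_legacy(a, b):
    """Model of a legacy multiply: a zero operand forces a zero result."""
    return 0.0 if a == 0.0 or b == 0.0 else a * b

def folded_mul(a, b):
    """a * (a==0.0 ? 0.0 : b): the form left once one select was folded away."""
    return a * (0.0 if a == 0.0 else b)

def may_use_legacy(b_is_constant, b_value=None):
    """Guard from the comment above: b must be a known constant and not zero."""
    return b_is_constant and b_value != 0.0

# b is a non-zero constant: both forms agree, even when a is infinite.
print(folded_mul(math.inf, 2.0), mul_legacy(math.inf, 2.0))

# b is zero: the folded form computes inf * 0 = NaN, the legacy model gives 0.
print(folded_mul(math.inf, 0.0), mul_legacy(math.inf, 0.0))
```

If b is not a constant it could be zero at run time, so the rewrite would change results in the second case. That is why the guard requires a known non-zero constant instead of merely "not a constant zero".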

DadSchoorse · 2024-03-24T21:01:40Z

Oh, another thing I just thought of, I don't see a bit size check in GPUOpen-Drivers/llpc@e91a935 . v_mul_legacy_f32/v_fma_legacy_f32 are 32bit only.

Blisto91 · 2024-05-18T22:17:09Z

Was this work supposed to be enabled in the 2024.Q2.1 release?
I tried a quick test with my iGPU in Risen 1 and still get a big performance drop when setting d3d9.floatEmulation = Strict

AMDVLK 2024.Q2.1

d3d9.floatEmulation = True

d3d9.floatEmulation = Strict

RADV for comparison

d3d9.floatEmulation = True

d3d9.floatEmulation = Strict

ruiminzhao · 2024-05-20T06:42:31Z

@Blisto91 Thanks for your feedback. I have two suspected causes for this issue:

My fix doesn't match the IR pattern generated by the game.
Another fix, related to the fastmath flags, has broken the pattern that my optimization matches.

To confirm which one causes this issue, could you please (or with assistance from the dxvk devs) dump the pipeline so that I can check whether my optimization has taken effect?

Thanks.

K0bin · 2024-05-20T08:42:23Z

You can dump the shaders by setting the environment variable DXVK_SHADER_DUMP_PATH=/your/path and then running the game with DXVK.

That will export the generated SPIR-V among other things.

Any D3D9 game will work, you just need to also set the environment variable DXVK_CONFIG=d3d9.floatEmulation = Strict; to enable the accurate float behavior.
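As a minimal sketch, the two environment variables could be set from a small Python launcher like the one below. Only the variable names and the config string come from the comment above; the helper name `dxvk_dump_env`, the dump directory and the launch command are placeholders made up for illustration.

```python
import os

def dxvk_dump_env(dump_dir, base_env=None):
    """Return a copy of the environment with DXVK shader dumping and Strict enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["DXVK_SHADER_DUMP_PATH"] = dump_dir
    env["DXVK_CONFIG"] = "d3d9.floatEmulation = Strict;"
    return env

env = dxvk_dump_env("/your/path")
print(env["DXVK_SHADER_DUMP_PATH"], env["DXVK_CONFIG"])

# To launch a game with these settings (hypothetical command, adjust as needed):
# import subprocess
# subprocess.run(["wine", "game.exe"], env=env)
```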

Blisto91 · 2024-05-23T15:23:41Z

Linked is a dxvk shader dump from Risen 1 running on my 7950x iGPU with amdvlk 2024.Q2.1 and d3d9.floatEmulation = strict

https://drive.proton.me/urls/SF8RPVZ6CG#Rk7KIIG4d480

ruiminzhao · 2024-05-28T03:54:06Z

@Blisto91 Thanks. Unfortunately I can't access this link. Could you attach the related files to this page instead?

Blisto91 · 2024-05-28T06:47:45Z

@ruiminzhao Hi there. I hope a 7zip wrapped in a zip is fine, as most formats GitHub allows don't compress enough on their own.
Risen-amdvlk-strict-float.zip

ruiminzhao · 2024-05-28T07:48:41Z

> @ruiminzhao Hi there. I hope a 7zip wrapped in a zip is fine as most formats Github allows doesn't compress enough on their own. Risen-amdvlk-strict-float.zip

Thanks. I can get the log now and will have a look later.

ruiminzhao · 2024-06-03T02:26:30Z

The root cause has been found. The pattern that is widely used looks like this:

%2272 = select reassoc nnan nsz arcp contract afn i1 %2270, float 0.000000e+00, float %2271
%2273 = insertelement <3 x float> poison, float %2272, i64 0
…
%2276 = select reassoc nnan nsz arcp contract afn i1 %2274, float 0.000000e+00, float %2275
%2277 = insertelement <3 x float> %2273, float %2276, i64 1
…
%2280 = select reassoc nnan nsz arcp contract afn i1 %2278, float 0.000000e+00, float %2279
%2281 = insertelement <3 x float> %2277, float %2280, i64 2
…

%2293 = select reassoc nnan nsz arcp contract afn i1 %2291, float 0.000000e+00, float %2292
%2294 = insertelement <3 x float> poison, float %2293, i64 0
…
%2297 = select reassoc nnan nsz arcp contract afn i1 %2295, float 0.000000e+00, float %2296
%2298 = insertelement <3 x float> %2294, float %2297, i64 1
…
%2301 = select reassoc nnan nsz arcp contract afn i1 %2299, float 0.000000e+00, float %2300
%2302 = insertelement <3 x float> %2298, float %2301, i64 2
…
%2303 = fmul reassoc nnan nsz arcp contract afn <3 x float> %2281, %2302

This pattern isn't caught by the current matching. The passes need to be reordered like this:

Many other transforms
Scalarizer pass
fmul_legacy / fma_legacy matching

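The effect of the pass ordering can be sketched with a toy IR in Python. This is not llpc or LLVM code: the node classes, `match_legacy` and `scalarize` are invented for illustration, and it assumes the selects in the dump have the (b==0 ? 0 : a) and (a==0 ? 0 : b) shape, which the excerpt itself does not show. The sketch only demonstrates that a scalar pattern matcher cannot see through a vector fmul until the multiply has been split per lane.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sel:
    """Stands for (zero_if == 0.0 ? 0.0 : value)."""
    zero_if: str
    value: str

@dataclass(frozen=True)
class Vec:
    """Stands for a vector built by an insertelement chain."""
    lanes: tuple

@dataclass(frozen=True)
class FMul:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class FMulLegacy:
    a: str
    b: str

def match_legacy(node):
    """Rewrite scalar (b==0 ? 0 : a) * (a==0 ? 0 : b) into FMulLegacy(a, b)."""
    if isinstance(node, FMul) and isinstance(node.lhs, Sel) and isinstance(node.rhs, Sel):
        lhs, rhs = node.lhs, node.rhs
        if lhs.zero_if == rhs.value and rhs.zero_if == lhs.value:
            return FMulLegacy(lhs.value, rhs.value)
    if isinstance(node, Vec):
        return Vec(tuple(match_legacy(lane) for lane in node.lanes))
    return node

def scalarize(node):
    """Split a vector multiply into one scalar multiply per lane."""
    if isinstance(node, FMul) and isinstance(node.lhs, Vec) and isinstance(node.rhs, Vec):
        return Vec(tuple(FMul(l, r) for l, r in zip(node.lhs.lanes, node.rhs.lanes)))
    return node

# A 3-lane multiply shaped like the dump: both operands are vectors of selects.
lhs = Vec(tuple(Sel(f"b{i}", f"a{i}") for i in range(3)))
rhs = Vec(tuple(Sel(f"a{i}", f"b{i}") for i in range(3)))
vector_mul = FMul(lhs, rhs)

before = match_legacy(vector_mul)            # matcher sees a vector fmul, no rewrite
after = match_legacy(scalarize(vector_mul))  # per-lane fmuls are now matchable
print(before == vector_mul)
print(after)
```

Running the matcher first leaves the vector multiply untouched, while running it after the scalarizing step turns every lane into a legacy multiply.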
ruiminzhao · 2024-08-07T09:15:41Z

I think the further fix for this issue has been delivered. Could you validate it with the latest release? Thanks.

Blisto91 · 2024-08-07T09:49:32Z

@ruiminzhao Hi there.
I am not seeing any change in 2024.Q2.3 compared to my prior test with 2024.Q2.1. Tested on my 7950x iGPU.

ruiminzhao · 2024-08-07T10:12:37Z

Oh, it seems the fix wasn't included in 2024.Q2.3. Once it's delivered, I'll ask you to validate it again. Thanks.

Blisto91 · 2024-08-13T14:55:34Z

Thank you for the work everyone, I can confirm this looks good now with 2024.Q3.1.
doitsujin/dxvk#4203

Risen screenshots on 7950x iGPU

d3d9.floatEmulation = True

d3d9.floatEmulation = Strict

Blisto91 changed the title ~~[dxvk] Optimize for the d3d9 Strict float emulation path in dxvk~~ [dxvk] Optimize for the d3d9 Strict float emulation path Jan 9, 2024

jinjianrong added the assigned The issue is assigned to engineer label Jan 11, 2024

jinjianrong assigned amdrexu Jan 11, 2024

jinjianrong added the reproducing Reproducing the issue label May 21, 2024

jinjianrong assigned ruiminzhao and unassigned amdrexu May 22, 2024

jinjianrong added reproduced The issue is reproduced by CQE and removed reproducing Reproducing the issue labels Jun 4, 2024

Blisto91 closed this as completed Aug 13, 2024

This issue was closed.
