not sure what changes from 9.6 to 1.0 #937

Closed
brokenisuseless opened this Issue Feb 27, 2019 · 31 comments

brokenisuseless commented Feb 27, 2019

I had an issue with 0.96 while playing World of Warcraft: mini hangs, I will call them. I started having some other issues and discovered there is a problem with amdgpu on Linux kernel 4.19; I think I found it on the Arch forum. I looked and had the same kernel errors, so I built kernel 4.20 and installed it (the errors in the logs vanished). World of Warcraft then played flawlessly, totally awesome. That was about 12 hours ago. Nothing has changed on this box except that I updated to 1.0, and now the mini hangs are back and worse than before.

I am running Debian testing.
I just uninstalled 1.0 and reinstalled 0.96, and everything works great again.

Owner

doitsujin commented Feb 27, 2019

Logs? Hardware / Driver?

Just as a note, if you are experiencing stutter (it is unclear what exactly your problem is), this is to be expected since all shaders have to be recompiled.

grigi commented Feb 28, 2019

I may be experiencing a similar issue.
I'm playing The Witcher 3 on my Intel UHD 620 (Kaby Lake mobile i7) using ANV from Mesa git and Proton 3.16-7 with the stock DXVK 0.96. When I replace the DLLs with the DXVK 1.0 ones, I have the following experience:

  1. First everything is super choppy as it recompiles shaders (To be expected).
  2. This gets better after a few minutes, but the performance does not feel right.
  3. I reverted to 0.96 and performance is back to where it was.

I will try to get some more info for you. I don't know yet whether the issue is overall frame rate or jitter, just that it is quite large.

grigi commented Feb 28, 2019

With DXVK 0.96:
[screenshot: w3_dxvk_0 96]

With DXVK 1.0:
[screenshot: w3_dxvk_1 0]

With DXVK 1.0 the framerate has a significant slowdown, more jitter, and the whole scene is rendered too dark.

Owner

doitsujin commented Feb 28, 2019

Can you bisect this please? All your issues seem to be Intel-specific, the game still renders perfectly fine on everything else and performance on my Polaris card has actually improved slightly.

Not that I'd call 17 FPS playable in the first place.

grigi commented Mar 1, 2019

Yeah, it was on battery; plugged in I get closer to 20.
I'm busy setting up a cross compiler, as I only have access to Linux, so I will report once I get it to build anything.

Jrugia commented Mar 3, 2019

> Yeah, it was on battery, plugged in I get closer to 20.

You could get a few more fps by reducing the resolution to 720p; 1080p is really taxing on a mobile chipset.

X0rg commented Mar 5, 2019

I also had stuttering in The Witcher 3 after updating from v0.96 to v1.0 (RX 580 with RADV).
After deleting the witcher3.dxvk-cache file, the problem was solved for me.
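
For anyone wanting to try the same thing without losing the old cache, here is a minimal sketch that renames the cache files instead of deleting them. It assumes only what is stated above, namely that the state cache file name ends in `.dxvk-cache`; the directory you pass in and the `.bak` suffix are hypothetical choices for illustration, and where the cache actually lives depends on your setup.

```python
from pathlib import Path


def backup_state_caches(game_dir):
    """Rename every *.dxvk-cache file under game_dir to *.dxvk-cache.bak.

    Returns the list of new paths. Renaming instead of deleting keeps the
    old cache around in case it needs to be restored.
    game_dir is a hypothetical location chosen by the caller.
    """
    moved = []
    # sorted() materializes the matches before any file is renamed.
    for cache in sorted(Path(game_dir).rglob("*.dxvk-cache")):
        backup = cache.with_name(cache.name + ".bak")
        cache.rename(backup)
        moved.append(backup)
    return moved
```

Calling `backup_state_caches("/path/to/game")` on a directory containing `witcher3.dxvk-cache` would leave `witcher3.dxvk-cache.bak` in its place, so the cache is rebuilt on the next run.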

grigi commented Mar 7, 2019

Ok, I failed to get a build env that worked (changed mingw threading model, but then ld broke), but I found these near daily builds: https://haagch.frickel.club/files/dxvk/

So here is my almost-bisect-log:

r1938.af92bc9 — 19.1
r1939.35c7d68 — 19.1 → v0.96
r1948.c360a19 — 19.2
r1949.e5a06d3 — 17.8 → Initial performance regression introduced here: e5a06d3
r1955.a437899 — 17.9
r1967.5ea8648 — 17.9
r1971.746562d — 17.8, dark → Dark rendering appears in 1968 - 1971 (7ed9187 9f8c1d0 fd445f7 746562d)
r1976.cbaeca8 — 17.9, dark
r1980.be22756 — 17.8, dark
r1983.20ea74f — 17.9, dark
r1990.d12a8e0 — 17.6, dark
r1991.6d814b2 — 17.6, dark
r1995.e03b574 — 16.4, dark, jitter → Jitter starts between 1992 - 1995 (2231caa a6d1fe0 b6804a9 e03b574)
r2000.10140f4 — 16.4, dark, jitter → v1.0
r2001.7118685 — 16.7, dark, jitter
r2007.d011102 — 16.6, dark, jitter
r2011.a40d8d4 — 16.8, dark, jitter

There are three distinct regression points:
FPS drop, Dark rendering, and jitter (with another performance regression). Each was quite easy to reproduce.

@doitsujin I hope this will help you somewhat?
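
The regression points in the log above can also be picked out mechanically. The following is only a sketch over a subset of the entries: the 1.0 FPS threshold is an assumption chosen to ignore run-to-run noise, and the names `ENTRIES` and `find_regressions` are hypothetical.

```python
# A subset of the builds listed above: (revision, commit, fps, symptoms).
ENTRIES = [
    (1948, "c360a19", 19.2, ()),
    (1949, "e5a06d3", 17.8, ()),
    (1967, "5ea8648", 17.9, ()),
    (1971, "746562d", 17.8, ("dark",)),
    (1991, "6d814b2", 17.6, ("dark",)),
    (1995, "e03b574", 16.4, ("dark", "jitter")),
    (2000, "10140f4", 16.4, ("dark", "jitter")),
]


def find_regressions(entries, fps_drop=1.0):
    """Return (kind, last_good_rev, first_bad_rev, first_bad_commit) tuples.

    A regression is either an FPS drop of at least fps_drop between two
    consecutive tested builds (assumed threshold) or a newly seen symptom.
    """
    found = []
    for prev, cur in zip(entries, entries[1:]):
        if prev[2] - cur[2] >= fps_drop:
            found.append(("fps", prev[0], cur[0], cur[1]))
        for symptom in sorted(set(cur[3]) - set(prev[3])):
            found.append((symptom, prev[0], cur[0], cur[1]))
    return found


for kind, good, bad, commit in find_regressions(ENTRIES):
    print(f"{kind}: between r{good} and r{bad} (first bad build: {commit})")
```

Because only the near-daily builds were tested, each result is a range of revisions rather than a single commit, which is why an exact bisect would still need a working build environment.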

Owner

doitsujin commented Mar 10, 2019

It would be more helpful if you could pinpoint the exact commit which introduces the dark rendering, but that would require a working build environment. What trouble are you having setting it up?
Thing is, this particular issue doesn't really make sense and only seems to affect Intel drivers.

The initial performance regression was introduced by an optimization which improves performance on AMD drivers and does not seem to affect Nvidia negatively. I do not plan to revert this change or add a second code path; ideally this should be fixed in the driver.

The jitter issue is most likely caused by e03b574, which reportedly improves frame time consistency on Nvidia and does not have any negative impact on AMD. Disabling this for Intel should be easy, but might not be desirable.

grigi commented Mar 10, 2019

Thanks for the analysis Philip

The build env issue is bizarre. It's almost as if, when I changed the parameter to build a pthread MinGW64, the linker still thought it was using Win32 threads. I verified that I followed the instructions, so I'm unsure what I actually did wrong. I'm open to building those three missing versions in a Docker image, which that site seems to have instructions for.
I primarily just lack the time to get much done right now, so I haven't tried that route yet.

I have a suspicion the dark rendering may be related to the gamma/brightness slider in the game. I moved it up near the max to work around my non-ideal notebook monitor.

Will you have time to work with the ANV driver developers re the performance regression introduced by the optimisation? I'm sure you can provide much more insightful comments than me.

Re the jitter issue, would it also be useful to talk to an ANV dev?

Owner

doitsujin commented Mar 10, 2019

Looks like 1.0 does indeed break gamma (for everyone), and I already know why. Unfortunately a proper fix is rather ugly to implement.

mozo78 commented Mar 10, 2019

Hello doitsujin,
I have an interesting observation: with Nvidia 418.43 and DXVK 1.0, DMC 5 is extremely dark, but with 396.54 and DXVK 1.0 again it's just fine and there aren't any gamma issues at all.

Owner

doitsujin commented Mar 10, 2019

@mozo78 This is not a gamma issue, see #956. There's already a workaround for that in latest master.

mozo78 commented Mar 10, 2019

Yes, I know, but I just wanted to inform you about the driver's role in this case too :)

Owner

doitsujin commented Mar 10, 2019

It doesn't matter because it is completely unrelated to any of the issues discussed in this thread. Please post such comments to the appropriate threads.

grigi commented Mar 11, 2019

> Looks like 1.0 does indeed break gamma (for everyone), and I already know why. Unfortunately a proper fix is rather ugly to implement.

I hate those, when doing the simple thing is wrong, and the right thing unnecessarily complex.
∴ I'll await your instruction if you want me to do more testing for you.

werman commented Mar 11, 2019

I can confirm that at least e5a06d3 causes a performance regression.
I tried to look into which shaders are affected by the change and how, but I keep getting a crash in the driver's push constants code when replaying the trace... And I don't have the knowledge to make a guess about the performance regression without seeing the shaders. I'm trying to diff the native code of the shaders with and without that commit, but I keep running into unexpected obstacles...

I think you'd better file a bug against ANV, since I'm not able to check it quickly.

Owner

doitsujin commented Mar 11, 2019

What that commit does is change constant buffer loads as follows:

layout(binding = X)
uniform cb0_t {
  vec4 m[...];
} cb0;

vec4 r0;

void main() {
  ...
  // before
  vec4 tmp = cb0.m[0];
  r0.xy = tmp.xy;
  // after
  r0.xy = vec2(cb0.m[0].x, cb0.m[0].y);
  ...
}

This is useful on AMD drivers in cases where components of a single UBO element are used in different scopes, since the compiler can combine multiple loads into one but not split one load into multiple loads.
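
To make the trade-off concrete, here is a toy cost model, not a description of how any real driver compiles shaders. It counts load instructions and components fetched for three hypothetical strategies; the function name `load_cost`, the strategy names, and the extra read of `cb0.m[1].x` are all assumptions for illustration.

```python
from collections import defaultdict


def load_cost(reads, strategy):
    """Return (load_instructions, components_fetched) for a list of reads.

    reads: (element_index, component) pairs, e.g. (0, "x") for cb0.m[0].x.
    strategy:
      "vec4"     - one full vec4 load per element touched (the 'before' code)
      "scalar"   - one load per component read, nothing merged
      "combined" - per-component reads merged per element by the compiler
    """
    by_element = defaultdict(set)
    for element, component in reads:
        by_element[element].add(component)

    if strategy == "vec4":
        return len(by_element), 4 * len(by_element)
    if strategy == "scalar":
        return len(reads), len(reads)
    if strategy == "combined":
        return len(by_element), sum(len(c) for c in by_element.values())
    raise ValueError(f"unknown strategy: {strategy}")


# r0.xy = vec2(cb0.m[0].x, cb0.m[0].y), plus a hypothetical read of cb0.m[1].x
reads = [(0, "x"), (0, "y"), (1, "x")]
for strategy in ("vec4", "scalar", "combined"):
    print(strategy, load_cost(reads, strategy))
```

In this model the per-component form only pays off when the compiler merges the reads again ("combined"); a compiler that leaves them as they are ("scalar") ends up with more load instructions than before the change.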

werman commented Mar 13, 2019

I wrestled with some tooling and found that it is most likely due to the increased register spilling:

After e5a06d3
384:../witcher-bad/397.pipeline_test - FS SIMD8 shader: 1050 inst, 1 loops, 71681 cycles, 7:7 spills:fills, Promoted 9 --> ../witcher-bad/45.pipeline_test
406:../witcher-bad/396.pipeline_test - FS SIMD8 shader: 1050 inst, 1 loops, 71681 cycles, 7:7 spills:fills, Promoted 9 constants, compacted 16800 to 11488 bytes.
418:../witcher-bad/398.pipeline_test - FS SIMD8 shader: 1050 inst, 1 loops, 71681 cycles, 7:7 spills:fills, Promoted 9 constants, compacted 16800 to 11488 bytes.
551:../witcher-bad/388.pipeline_test - FS SIMD8 shader: 1345 inst, 1 loops, 70485 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21520 to 14320 bytes.
589:../witcher-bad/394.pipeline_test - FS SIMD8 shader: 1044 inst, 1 loops, 70950 cycles, 7:7 spills:fills, Promoted 9 constants, compacted 16704 to 11440 bytes.
892:../witcher-bad/384.pipeline_test - FS SIMD8 shader: 1338 inst, 1 loops, 70500 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21408 to 14256 bytes.
920:../witcher-bad/395.pipeline_test - FS SIMD8 shader: 1050 inst, 1 loops, 71681 cycles, 7:7 spills:fills, Promoted 9 constants, compacted 16800 to 11488 bytes.
1106:../witcher-bad/383.pipeline_test - FS SIMD8 shader: 3230 inst, 1 loops, 99458 cycles, 65:175 spills:fills, Promoted 23 constants, compacted 51680 to 34528 bytes.
1288:../witcher-bad/385.pipeline_test - FS SIMD8 shader: 1345 inst, 1 loops, 70485 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21520 to 14320 bytes.
1355:../witcher-bad/393.pipeline_test - FS SIMD8 shader: 1044 inst, 1 loops, 70950 cycles, 7:7 spills:fills, Promoted 9 constants, compacted 16704 to 11440 bytes.
1470:../witcher-bad/386.pipeline_test - FS SIMD8 shader: 1345 inst, 1 loops, 70485 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21520 to 14320 bytes.
1471:../witcher-bad/387.pipeline_test - FS SIMD8 shader: 1345 inst, 1 loops, 70485 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21520 to 14320 bytes.
1511:../witcher-bad/379.pipeline_test - FS SIMD8 shader: 2319 inst, 1 loops, 81438 cycles, 20:71 spills:fills, Promoted 14 constants, compacted 37104 to 25120 bytes.
1513:../witcher-bad/381.pipeline_test - FS SIMD8 shader: 3002 inst, 1 loops, 96749 cycles, 63:172 spills:fills, Promoted 19 constants, compacted 48032 to 32416 bytes.
1514:../witcher-bad/382.pipeline_test - FS SIMD8 shader: 1338 inst, 1 loops, 70500 cycles, 16:24 spills:fills, Promoted 8 constants, compacted 21408 to 14256 bytes.
Before e5a06d3
1516:../witcher-good/381.pipeline_test - FS SIMD8 shader: 2597 inst, 1 loops, 74649 cycles, 29:98 spills:fills, Promoted 19 constants, compacted 41552 to 28144 bytes.
1517:../witcher-good/382.pipeline_test - FS SIMD8 shader: 2807 inst, 1 loops, 75246 cycles, 29:98 spills:fills, Promoted 23 constants, compacted 44912 to 30048 bytes.
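
Output like the above is easier to compare once it is parsed into numbers. A minimal sketch, assuming every line follows the format shown here (an optional leading `N:` line number, then the path and the statistics); the regular expression and the name `parse_stats` are hypothetical and not part of any Mesa tool.

```python
import re

# Matches the statistics lines shown above; the leading "N:" is optional.
STATS = re.compile(
    r"(?:\d+:)?(?P<path>\S+) - (?P<stage>\w+) SIMD(?P<width>\d+) shader: "
    r"(?P<inst>\d+) inst, (?P<loops>\d+) loops, (?P<cycles>\d+) cycles, "
    r"(?P<spills>\d+):(?P<fills>\d+) spills:fills"
)


def parse_stats(line):
    """Parse one statistics line into a dict, or return None if it does not match."""
    m = STATS.match(line)
    if m is None:
        return None
    return {
        key: value if key in ("path", "stage") else int(value)
        for key, value in m.groupdict().items()
    }


# Two lines copied from the output above.
LINES = [
    "1106:../witcher-bad/383.pipeline_test - FS SIMD8 shader: 3230 inst, 1 loops, "
    "99458 cycles, 65:175 spills:fills, Promoted 23 constants, compacted 51680 to 34528 bytes.",
    "1516:../witcher-good/381.pipeline_test - FS SIMD8 shader: 2597 inst, 1 loops, "
    "74649 cycles, 29:98 spills:fills, Promoted 19 constants, compacted 41552 to 28144 bytes.",
]

for line in LINES:
    s = parse_stats(line)
    print(s["path"], s["inst"], s["cycles"], s["spills"], s["fills"])
```

The sketch does not try to pair up "good" and "bad" pipelines, since the file numbers in the two directories do not obviously refer to the same shader.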

Owner

doitsujin commented Mar 14, 2019

@jekstrand ^ any chance you could take a look at this?

Owner

doitsujin commented Mar 14, 2019

@grigi can you test whether this build fixes your stutter issues in Witcher 3?
dxvk-anv.tar.gz

jekstrand commented Mar 14, 2019

> This is useful on AMD drivers in cases where components of a single UBO element are used in different scopes, since the compiler can combine multiple loads into one but not split one load into multiple loads.

Ugh... I've been meaning to write a UBO/SSBO load-combining pass for some time now but it hasn't happened. I guess we need to stick it on the list. @werman, how'd you like to write your first NIR optimization pass? Let's chat on IRC.

werman commented Mar 14, 2019

@jekstrand Yes, I could try. I've already left the office, so I will only be able to chat tomorrow.

jekstrand commented Mar 14, 2019

Actually... I realized I already have the pass written and laying around in a branch. I wrote it earlier this week for something completely unrelated. I'll get it cleaned up and post it here after a bit.

jekstrand commented Mar 14, 2019

Here's the pass and I have it wired up for ANV:

https://gitlab.freedesktop.org/jekstrand/mesa/commits/wip/nir-lower-array-of-vec

It's a bit annoying that driver and app have to be fighting over this but oh, well. I already had the pass written so it's not that hard to just tweak a few things and turn it on. @werman, could you please let me know if it helps.

grigi commented Mar 15, 2019

> @grigi can you test whether this build fixes your stutter issues in Witcher 3?
> dxvk-anv.tar.gz

I'll test it this afternoon.

Owner

doitsujin commented Mar 15, 2019

fwiw 1.0.1 already includes that patch so you could try that as well. It should also fix your gamma issue.

werman commented Mar 15, 2019

@jekstrand Yes, it helps; all the spills are gone.

eproxy commented Mar 15, 2019

I have a similar issue with Final Fantasy XIV. DXVK up to 0.96 gives a rock-solid 60+ fps. When I switched to 1.0 and later 1.0.1, the fps kept fluctuating between 35 and 60, with occasional performance issues. I'm using the latest Nvidia Vulkan drivers (Arch).

grigi commented Mar 15, 2019

@doitsujin The Jitter is vastly improved with 1.0.1 (and a 7-8% speedup) and the brightness/gamma issue has been resolved too :-)
Thank you!

@doitsujin doitsujin closed this Mar 16, 2019

fdo-mirror pushed a commit to freedesktop/mesa that referenced this issue Mar 16, 2019

intel/nir: Lower array-deref-of-vector UBO and SSBO loads
This fixes a serious performance issue with DXVK:

doitsujin/dxvk#937

This was caused by a recent change to improve performance on RADV
which back-fired on ANV and killed performance for some apps:

doitsujin/dxvk@e5a06d3

Throwing in this bit of lowering lets us come along and CSE those UBO
loads (or copy-prop for SSBO load) and get one load where we previously
would have gotten several.

VkPipeline-db results on Kaby Lake:

    total instructions in shared programs: 5115361 -> 5073185 (-0.82%)
    instructions in affected programs: 1754333 -> 1712157 (-2.40%)
    helped: 5331
    HURT: 63

    total cycles in shared programs: 2544501169 -> 2481144545 (-2.49%)
    cycles in affected programs: 2531058653 -> 2467702029 (-2.50%)
    helped: 9202
    HURT: 4323

    total loops in shared programs: 3340 -> 3331 (-0.27%)
    loops in affected programs: 9 -> 0
    helped: 9
    HURT: 0

    total spills in shared programs: 3246 -> 3053 (-5.95%)
    spills in affected programs: 384 -> 191 (-50.26%)
    helped: 10
    HURT: 5

    total fills in shared programs: 4626 -> 4452 (-3.76%)
    fills in affected programs: 439 -> 265 (-39.64%)
    helped: 10
    HURT: 5

All of the shaders with hurt spilling were in Rise of the Tomb Raider
which also had shaders solidly helped in the spilling department.  Not
shown in those results (because I've not had success dumping the
shaders) is Witcher 3 where this reduces spilling and improves over-all
perf by around 20-25%.  There were no shader-db changes.  Apparently,
this just isn't a pattern that happens in OpenGL.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: "19.0" mesa-stable@lists.freedesktop.org
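
The percentages in the results above follow directly from the before and after counts. A small sketch that recomputes them, with the figures copied from the commit message; the names `pct_change` and `REPORTED` are hypothetical.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (metric, before, after, percentage reported in the commit message)
REPORTED = [
    ("total instructions", 5115361, 5073185, -0.82),
    ("affected instructions", 1754333, 1712157, -2.40),
    ("total cycles", 2544501169, 2481144545, -2.49),
    ("affected cycles", 2531058653, 2467702029, -2.50),
    ("total loops", 3340, 3331, -0.27),
    ("total spills", 3246, 3053, -5.95),
    ("affected spills", 384, 191, -50.26),
    ("total fills", 4626, 4452, -3.76),
    ("affected fills", 439, 265, -39.64),
]

for name, before, after, reported in REPORTED:
    computed = pct_change(before, after)
    # Flag any row that differs from the reported figure after rounding.
    status = "ok" if abs(computed - reported) < 0.005 else "MISMATCH"
    print(f"{name}: {computed:.2f}% (reported {reported:.2f}%) {status}")
```

Note that the "affected" rows use the smaller set of shaders whose output actually changed as the baseline, which is why the same absolute difference yields a larger percentage there.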

fdo-mirror pushed a commit to freedesktop/mesa that referenced this issue Mar 20, 2019

intel/nir: Lower array-deref-of-vector UBO and SSBO loads
(cherry picked from commit d3386e7)
Conflicts resolved by Dylan