Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash unless meshlets disabled #577

Open
hwnde opened this issue Feb 14, 2024 · 14 comments
Open

Crash unless meshlets disabled #577

hwnde opened this issue Feb 14, 2024 · 14 comments

Comments

@hwnde
Copy link
Contributor

hwnde commented Feb 14, 2024

Windows 11, fresh install, commit 26b816e
Game crashes when starting a new game or loading an existing save. Disabling meshlets resolves the issue.
crash.log

@dreimer1986
Copy link

#451 (comment)

Lemme guess... AMD graphics card? (And I answer it myself: AMD Radeon RX 6800 XT)

@Try
Copy link
Owner

Try commented Feb 14, 2024

AMD commercial driver has fully broken mesh-shading unfortunately. If you have time - please send bug-reports to them, as we need a driver fix.

@hwnde
Copy link
Contributor Author

hwnde commented Feb 15, 2024

Could you elaborate on what the issue is? I couldn't find anything in Google.
Also what is the consequence of disabling meshlets? Are there any visual differences or it's only a performance optimization?

@dreimer1986
Copy link

AMD commercial driver has fully broken mesh-shading unfortunately. If you have time - please send bug-reports to them, as we need a driver fix.

AMD seems to no care much about these reports though. We sent one already. I think they disagree with that being their fault, even though OpenGothic runs on Intel and NVidia cards just fine.

@Try
Copy link
Owner

Try commented Feb 15, 2024

I think they disagree with that being their fault, even though...

Even though OpenGothic runs on same hardware with opensource driver just fine :)

Could you elaborate on what the issue is? Also what is the consequence of disabling meshlets?

Consequence is performance.
In Vulkan or DX12 or any other graphics api, there is a base spec with basic functionality and extensions. In OpenGothic engine checks for mesh-shader extension, and uses it when possible to draw mesh-clusters. If no mesh-shader game can still run, with fallback to triangle-lists. And triangle list is part of base spec - so should work.
Reason to no use triangle-list is total lack of vertex-reuse, what forces engine to do 3-times more work in geometry.
Another problem is occupancy. On NVidia triangle-list, capt to 93% utilization rate in theory... in tests is as bad as 6%

On a bug side - pretty certain that it's AMD bug. When engine calls to vkCreateGraphicsPipelines, it doesn't compile shader, and doesn't return en error code - it hard crashes deep in amdvlk64.dll.

@hwnde
Copy link
Contributor Author

hwnde commented Feb 19, 2024

I can try to debug the issue to figure out what exactly causes it and whether the shader can be modified to compile. But I don't see any instructions for building and debugging the project in Windows. Does anyone have experience with it?

@dreimer1986
Copy link

MSVC 2022 and the project downloaded off a private fork. There were some dependencies to fulfill, but that was a while ago I had to do that. Try these easy steps first and then we will see what's missing.

Likely you will need:

Likely I forgot something, but this is a good start to minimize the problems.

@Try
Copy link
Owner

Try commented Feb 19, 2024

debugging the project in Windows. Does anyone have experience with it?

cmake + VulkanSDK + windows-sdk should be enough.
No need to install glslang or set environment variables - VulkanSDK installer will take care of it.

CMakeLists.txt can be opened with Visual-Studio (open folder) and should work just fine. If there is any issues with bring up - let me know.

Shader that you interested in is main.vert - it has multiple variants, based on geometry-type and other factors.

@hwnde
Copy link
Contributor Author

hwnde commented Feb 20, 2024

I've got some results. The culprit is this line:
shOut[laneID] = var;

I managed to get the shader to work by replacing it with this:

#if defined(MAT_COLOR)
    shOut[laneID].color = var.color;
#endif
#if defined(MAT_UV)
    shOut[laneID].uv = var.uv;
#endif
#if defined(MAT_POSITION)
    shOut[laneID].pos.x = var.pos.x;
    shOut[laneID].pos.y = var.pos.y;
    shOut[laneID].pos.z = var.pos.z;
#endif
#if defined(MAT_NORMAL)
    shOut[laneID].normal.x = var.normal.x;
    shOut[laneID].normal.y = var.normal.y;
    shOut[laneID].normal.z = var.normal.z;
#endif

Assigning pos or normal directly shOut[laneID].pos = var.pos; results in a crash. I don't know why. I guess for some reason it doesn't like vec3 since color and uv work correctly.
I can create a PR with this solution unless there are better ideas.

@Try
Copy link
Owner

Try commented Feb 20, 2024

Thanks @hwnde , for investigating it!

I can create a PR with this solution unless there are better ideas.

This is an unease split:

  • with new found evidence we can say for sure, that this is a 100% driver bug.
  • it's not realistic to carry not only API-workarounds, but also a WA for all driver bugs

My suggestion here is to report this issue again, to AMD: someone who owns their gpu can use https://www.amd.com/en/support/kb/faq/amdbrt#
In report it's important to make clear that problem scope is not a game, but rather a small part of the shader. It seems they are unable to unroll varying vec3 assignment properly. float/vec2/vec4 are safe, likely due to power of two size.

I'm not really aware, if they have other way of bug reporting specifically for game-developers, who does not own their gpu.

Secondary option: disable mesh-shader on AMD by default - call it broken. Two major downsides here: I'm not happy to have conditionals with HW-vendor names, and we will ruin performance for MESA users.

@hwnde
Copy link
Contributor Author

hwnde commented Feb 21, 2024

I'm not really aware, if they have other way of bug reporting specifically for game-developers, who does not own their gpu.

I think this is where such issues are reported: https://community.amd.com/t5/opengl-vulkan/bd-p/opengl-and-vulkan-discussions

I can post an issue and see if AMD responds. However it's still an open question what to do with this issue. There is no guarantee that AMD will do a fix any time soon or ever.
I'd argue that doing the shader workaround is the approach with least downsides right now. First of all, although it addresses an issue with specific driver, the code itself will work with any driver and any vendor, we don't have to do any vendor detection here. On the other hand disabling mesh shader on AMD will require vendor detection and also will deprive AMD users on Windows (which are the vast majority of all AMD users) from performance benefits that come with mesh shaders.

@Try
Copy link
Owner

Try commented Feb 21, 2024

Started a post on AMD for forum:
https://community.amd.com/t5/newcomers-start-here/driver-crash-when-using-vk-ext-mesh-shader/m-p/667490#M2963

Feel, free to join discussion :)

I'd argue that doing the shader workaround is the approach with least downsides right now.

I'm still worry, about maintaining it, since shader code is always something to edit. Let, see how fast they will respond to the report;
Maybe merging fix + add a comment with URL to community.amd.com + comment when it's should be safe to remove workaround is the way

@AnyOldName3
Copy link

I'm an OpenMW graphics developer who's been sent a link to this, and have a couple of recommendations:

  • Get lots of users to send reports of problems via the bug reporting tool (little insect icon) in the Radeon Software.
  • Get at least one of your team to join AMD's Vanguard program. This makes it much easier to get in contact with AMD staff, and much easier to get your complaints forwarded to engineers. If you want, I can send someone a message about your application, which should make it more likely to get accepted quickly.

@Try
Copy link
Owner

Try commented Feb 25, 2024

Hi, @AnyOldName3 and thanks for recommendation!

Get at least one of your team to join AMD's Vanguard program.

That would be nice. Unfortunately can't take opportunity myself - due to lack of AMD gpu.
Maybe @hwnde or @dreimer1986 are interested?

And one more option: sneak a test to Vulkan-CTS, that uses mesh-shader + vec3 :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants