New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an alternative backend parallel compilation queue for macOS #5291
Conversation
Download the artifacts for this pull request: Experimental GUI (Avalonia)GUI-less (SDL2)Only for Developers
|
@@ -134,7 +134,7 @@ internal class AppHost | |||
_inputManager = inputManager; | |||
_accountManager = accountManager; | |||
_userChannelPersistence = userChannelPersistence; | |||
_renderingThread = new Thread(RenderLoop, 1 * 1024 * 1024) { Name = "GUI.RenderThread" }; | |||
_renderingThread = new Thread(RenderLoop, 2 * 1024 * 1024) { Name = "GUI.RenderThread" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the same change in SDL2 headless and GTK for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be in a common place, but I'm not sure where to put it. It will be annoying to change 4 files every time we want to change the stack size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess maybe some helper in the common project to create a thread with a given name and stack size would be nice?
Then we would have other helper around for the common thread that are duplicated for all projects that have the right values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, the render thread should be created by the ThreadedRenderer
and the GPU thread should be created on the GpuContext
. But it's not that simple since the OpenGL context is tied to the thread it is created on, and there's also some coupling with the UI and a few places that needs to know if the current thread is the "render" thread. Not something that I will change here though, so I guess I will just increase the stack size of the other threads for now.
It is only used on macOS, nothing should change for other platforms. |
Maybe it could be nice to notice that somewhere in the code or at least add an OS check (even if it's useless) just in case? |
It already has both the OS check and a comment explaining why it is done: if (OperatingSystem.IsMacOS())
{
MVKInitialization.Initialize();
// Any device running on MacOS is using MoltenVK, even Intel and AMD vendors.
IsMoltenVk = true;
// The default thread stack size on MacOS is low, and can cause stack overflow
// on SPIR-V Cross during shader compilation.
// As a workaround, we use this custom queue which allows us to specify the stack
// size of the threads used for compilation.
ShaderCompilationQueue = new ShaderCompilationQueue();
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah the check is already in the existing code, sorry! LGTM then (In cases there is no performance issue in the shader compilation)!
@@ -134,7 +134,7 @@ internal class AppHost | |||
_inputManager = inputManager; | |||
_accountManager = accountManager; | |||
_userChannelPersistence = userChannelPersistence; | |||
_renderingThread = new Thread(RenderLoop, 1 * 1024 * 1024) { Name = "GUI.RenderThread" }; | |||
_renderingThread = new Thread(RenderLoop, 2 * 1024 * 1024) { Name = "GUI.RenderThread" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of readability, it might be nice to make it a named constant - either the whole stack size or the multiplier
This sounds cleaner to me at face value. Granted, I have very little experience with gpu code myself but if it's something that I'd like to take a crack at. Just not sure if priority/timing is a concern as I'd probably fumble through it. |
I really don't think that's cleaner, the level of nesting it would split at is arbitrary, and the information is not useful for most platforms. An alternate task queue changes the least amount of shared code, and keeps background compilation active for all shaders. |
It seems to me that using framework managed Tasks would be cleaner than hand rolling custom thread management. You both know the space way better than I do though 😅 |
hope it's not a stupid question... if it's just because the stack size is too small, why not just increase it? refer to dotnet/fsharp#9231 (comment)
|
I tried that on macOS some time ago and got a crash, maybe I did something wrong. Anyway, it is undesirable to tell users to set an environment variable before running the emulator, so we need a solution that doesn't require any user intervention. |
environment variable can be set in
in this way, users can start Ryujinx in the same way (simply click on the icon) |
I am wondering if this is a reasonable alternative would be better than changing stacksize for all threads. |
Nice, I didn't know you could do that. Still, maybe increasing the stack size of all threads is a bit excessive (but it would have the advantage that we can match the default stack size of other platforms, which is usually higher than macOS).
Isn't this basically what this PR does. Although the implementation doesn't look correct, it creates new threads every time a new shader needs to be compiled which I guess is not going to be very efficient. Also it waits for the threads to exit right after which makes it kinda pointless. It should only need to wait when the shaders are actually used. |
That was my bad I was not paying attention and reviewing the changes and yes I would trust your changes more as I am way out of my depth here. My mistake sorry! |
OK! I think I could look into how to add this to the CI process |
Superseded by #5349. |
Due to the way how SPIRV-Cross works, some shaders with deeply nested if/else blocks can cause a stack overflow depending on the stack size. SPIRV-Cross is apparently using recursive function calls in some function where it visits blocks in the code (which is bad in general IMO). To work around this issue, this change basically implements a custom thread pool. Since we create the threads ourselves, we can specify a custom stack size that is large enough to make SPIRV-Cross happy.
The obvious downside with that change is that we need to create our own thread pool, which is most likely going to perform worse than .NET one (also the current implementation is pretty basic, it has 8 queues and adds compilation requests to them in a rotating way, maybe it should have a better method for balancing the load).
An alternative to this would passing some value to the backend that indicates the amount of nesting the shader has. The backend would use that value to decide if it should use
Task
s to compile shaders in parallel, or if it should just compile it on the render thread (which already has an increased stack size).Suggestions are welcome. I didn't really want to upstream that since I think not using
Task
s here is a pretty significant downside. But as we come very close to upstreaming all the changes, and there's a general desire to tell everyone to switch to more up-to-date builds instead of using macos1 for specific games, I felt it was worth PRing it, even if just to start discussion about a better way to solve this problem.We will need tests to ensure shader compilation is not slower on macOS with this change.
Affected games
This fixes a stack overflow on Mortal Kombat 11. I also opened #5290 to fix a different crash that I found while testing it. For this game, I had to increase the stack size to 2MB (on macos1, I used 1MB as that was enough for Splatoon 3).
Also fixes the stack overflow crash when starting Splatoon 3.
Contributes to #4062, closes #5282.