-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vulkan: Parallelize GLSL compilation #15589
Conversation
Unfortunately, on my Galaxy S8, there seems to be a global lock in pipeline creation, so that creating multiple pipelines in parallel has ~0 benefit :( Might not be the same on all platforms. But seems like the main way forward for reducing stutters past this will indeed be to try to create less different shaders, branching on uniforms instead. |
So this doesn't help #13480? |
It might, a bit yes. I'll get back to this later, have many other things I want to fix... |
0250fb5
to
9994a25
Compare
Rebased on master. I removed the last commit that parallelizes the actual pipeline creation (will create a separate PR with it later for evaluation), but at least with this, GLSL compilation is done in parallel, which is nice and does reduce first-time shader stutters a bit. I want this in now because I'll have to make changes in this area to optimally support #15944 , and might as well avoid collisions. |
…ctual shader modules. Prevents early block.
Slightly more expensive I guess but shouldn't be much of a bottleneck.
…g with multiple waiters
9994a25
to
fb3f417
Compare
So a note on this, I never merged the next step that creates pipelines in parallel because I couldn't measure an improvement on Mali. And indeed, I've now got confirmation that on drivers < v36, there's no benefit, but there will likely be on higher ones. So might get back to that and only enable on new Mali drivers and maybe other archs later. |
Instead of simply directly compiling GLSL to SPIR-V on the main thread using glslang (which, it turns out, is just as expensive as pipeline creation, at least on my machine), we spawn tasks as the needed shaders are discovered, and then the actual pipeline creation blocks on those tasks to finish. This, almost surprisingly, often manages to spread the work out over 10+ threads in heavy scenarios, reducing the length of shader compilation stutters.
There seems to be some minor serialization over memory allocation from within glslang, but still a win.
Should reduce shader compilation stutters a bit. There's of course lots more to do in this area.
Still a draft as there seems to be a leak of shader modules (according to the validation layers) that I can't quite figure out right now.Never mind, fixed, still leaving as draft due to need for more testing.A next step might be to run multiple CreateGraphicsPipeline in parallel on the pool too instead of having a single "ShaderCompileThread", but there are recommendations to run compiles that use the same shaders on the same thread so some additional logic would be needed (the ShaderCompileThread should probably remain and try to group the pipeline creation requests to the best of its ability as they come in).
The actual win on Android seems pretty small on older devices with 4 cores unfortunately. Probably need to also look into pipeline creation parallelization.