Implement KHR_parallel_shader_compile support #16321
Comments
Some initial thoughts from my understanding after reading the thread and the spec: Currently without parallel compiling, the pipeline looks more or less like this: With the latest In order to get the best of it we should group the compilation and linking, as @jdashg proposed on KhronosGroup/WebGL#2855 (comment):
    for (const x of shaders)
        gl.compileShader(x);
    for (const x of programs)
        gl.linkProgram(x);
Using this way we could get a timeline similar to: Grouping the shaders compilation together could be easy enough to achieve as we could just compile them as they go, while grouping the
    function* linkingProgress(programs) {
        const ext = gl.getExtension('KHR_parallel_shader_compile');
        let todo = programs.slice();
        while (todo.length) {
            if (ext) {
                todo = todo.filter(x => !gl.getProgramParameter(x, ext.COMPLETION_STATUS_KHR));
            } else {
                const x = todo.pop();
                gl.getProgramParameter(x, gl.LINK_STATUS);
            }
            yield 1.0 - (todo.length / programs.length);
        }
    }
Introducing this asynchrony we need to define a We should use that |
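As a rough illustration of the polling idea above, a render loop could call a small non-blocking check once per frame instead of driving a generator. This is only a sketch: `partitionByCompletion` is a hypothetical helper name, and `gl` and `ext` are assumed to be a WebGL context and the object returned by `gl.getExtension('KHR_parallel_shader_compile')` (or `null` when the extension is missing).

```javascript
// Hypothetical helper (not part of three.js or WebGL): splits programs into
// those whose link has completed and those still in flight.
function partitionByCompletion(gl, ext, programs) {
  const ready = [];
  const pending = [];
  for (const program of programs) {
    // Without the extension there is nothing to poll, so every program is
    // treated as ready and the first status query on it may block.
    const done = ext
      ? gl.getProgramParameter(program, ext.COMPLETION_STATUS_KHR)
      : true;
    (done ? ready : pending).push(program);
  }
  return { ready, pending };
}
```

Each frame the caller would pass in the programs that were still pending last time, and could report progress as `1 - pending.length / total`, in the same spirit as the generator.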
I've prototyped splitting the compilation phase and link phase of programs to see if you can get much benefit from the fact that Chrome compiles shaders asynchronously in a separate GPU thread rather than on the main JS thread. Not very useful, but the same mechanism could be used with parallel compile. There are fewer mods than you might expect: primarily, initMaterial() needs splitting into two halves, since the first half sets up the environment for creating a new program if required, while the second half relies on interrogating the program state and so in effect serialises with the compilation and linking. WebGLProgram again needs two parts and an isReady status that would check compilation completion. This avoids the need for global lists of shaders and programs in various states and keeps the information in WebGLProgram() objects, where it belongs. One interesting question then with a parallel compilation mechanism: you don't have an atomic change of scene after adding several new materials. Some meshes, or parts of meshes with multiple materials, would display initially in different frames when intended to display at the same time, which could be ugly, especially for very slow compiles. How do you handle this? Preserving the atomic nature could be difficult. |
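To make the two-halves idea concrete, here is a minimal sketch of a program wrapper split that way. It is not the prototype described above and not the three.js `WebGLProgram` class; `TwoPhaseProgram`, `isReady()` and `finish()` are names invented for illustration, and `ext` is assumed to be the `KHR_parallel_shader_compile` extension object or `null`.

```javascript
// Hypothetical two-phase wrapper, for illustration only.
class TwoPhaseProgram {
  constructor(gl, ext, vertexSource, fragmentSource) {
    this.gl = gl;
    this.ext = ext; // extension object, or null when unsupported
    this.uniforms = null;

    // Phase 1: issue compile and link, but query no status here.
    const vs = this.compile(gl.VERTEX_SHADER, vertexSource);
    const fs = this.compile(gl.FRAGMENT_SHADER, fragmentSource);
    this.program = gl.createProgram();
    gl.attachShader(this.program, vs);
    gl.attachShader(this.program, fs);
    gl.linkProgram(this.program);
  }

  compile(type, source) {
    const shader = this.gl.createShader(type);
    this.gl.shaderSource(shader, source);
    this.gl.compileShader(shader);
    return shader;
  }

  // Non-blocking when the extension exists. Without it there is nothing
  // to poll, so report ready and let finish() block.
  isReady() {
    if (!this.ext) return true;
    return Boolean(
      this.gl.getProgramParameter(this.program, this.ext.COMPLETION_STATUS_KHR)
    );
  }

  // Phase 2: interrogate the program. These queries wait for the link.
  finish() {
    const { gl, program } = this;
    if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
      throw new Error(gl.getProgramInfoLog(program) || 'link failed');
    }
    this.uniforms = {};
    const count = gl.getProgramParameter(program, gl.ACTIVE_UNIFORMS);
    for (let i = 0; i < count; i++) {
      const { name } = gl.getActiveUniform(program, i);
      this.uniforms[name] = gl.getUniformLocation(program, name);
    }
    return this.uniforms;
  }
}
```

A renderer following this shape would construct the object when a material first appears, skip the material while `isReady()` is false, and call `finish()` once before the first draw.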
@aardgoose Cool! Do you have a PR open for that already? It could be interesting to see that approach. Regarding the atomic change, I agree that things always get weird with async in place. We will need to add some validation before using a material that is not fully ready yet, and we could probably include some helpers for the case where you load multiple materials, such as an explosion effect, and want to render it only when all of them are ready. |
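One possible shape for such a helper, sketched under assumptions: `ReadyGate` is a made-up name, it works on raw WebGL program objects rather than three.js materials, and `ext` is the `KHR_parallel_shader_compile` extension object or `null`. The idea is that a group of programs becomes visible in a single frame, once every member has finished linking.

```javascript
// Hypothetical helper: fires a callback once per group, in the first
// update() where every program in that group has completed.
class ReadyGate {
  constructor(gl, ext) {
    this.gl = gl;
    this.ext = ext;
    this.groups = [];
  }

  // Register programs that should appear together.
  add(programs, onReady) {
    this.groups.push({ programs, onReady });
  }

  // Call once per frame; never blocks when the extension is available.
  update() {
    const { gl, ext } = this;
    this.groups = this.groups.filter((group) => {
      // Without the extension, completion cannot be polled, so the group
      // is released immediately and the first use may block.
      const done = !ext || group.programs.every(
        (p) => gl.getProgramParameter(p, ext.COMPLETION_STATUS_KHR)
      );
      if (done) group.onReady(group.programs);
      return !done; // keep only groups that are still waiting
    });
  }
}
```

For the explosion example, all of the effect's programs would be added as one group, and the callback would add the effect to the scene.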
I've personally started a trial implementation to see the behavior, if anyone is interested: Branch, Diffs. I still need to clean up, optimize, and take care of a few things, but it seems to be working on Canary. |
FYI, Chrome Canary which is the only browser currently supporting https://www.khronos.org/webgl/public-mailing-list/public_webgl/1904/msg00042.php |
FYI, |
Still WIP but I want to share so far that I locally confirmed |
FWIW I have a prototype patch for Firefox adding support for KHR_parallel_shader_compile which appears to work quite well with the basic Khronos demo/test; it certainly reduces jank, although it is not quite as smooth as Chrome Canary: https://www.khronos.org/registry/webgl/sdk/tests/performance/parallel_shader_compile/ The main overhead now on the main thread with FF appears to be the shader translation/validation before submitting to the GPU driver. Performance trace alternating serial and parallel passes of the above test: |
Very excited to see progress on this! Looking forward to this feature |
An alternative take. The Khronos WebGL group recommend not checking shader compilation status, but just doing:
for all cases, not just when using KHR_parallel_shader_compile. The link operation serialises automatically on the compiles in the background, so at least with Chrome you get the benefit of any built-in parallel/asynchronous operation that already exists. Firefox currently retrieves the completion status and logs as part of the compile and link operations and caches them for later use, so there is no real benefit there. So I have removed error checking and display from WebGLShaders.js and reworked it in WebGLProgram.js; this could be submitted upstream without changing functionality. https://github.com/aardgoose/three.js/tree/parallel1 Building on this to enable KHR_parallel_shader_compile, but now only needing two states rather than three: https://github.com/aardgoose/three.js/tree/parallel2 Still some minor issues; this is probably something that should be enabled on a renderer or material basis as required rather than by default. Tested with Chrome Canary and my hacked version of Firefox. |
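The snippet referred to in this comment did not survive in the thread. As a stand-in, and not a reconstruction of it, here is a sketch of the general pattern of checking only the link status and touching the shader logs only when linking fails. `linkErrorOrNull` is a hypothetical name; the WebGL calls it uses are standard.

```javascript
// Hypothetical helper: returns null when the program linked, otherwise a
// string combining the program log and the logs of its shaders.
function linkErrorOrNull(gl, program, shaders) {
  // The only query on the success path. It waits for the link, and the
  // link in turn waits for the compiles it depends on.
  if (gl.getProgramParameter(program, gl.LINK_STATUS)) return null;

  // Failure path: only now is it worth asking each shader what went wrong.
  const lines = [`program: ${gl.getProgramInfoLog(program)}`];
  for (const shader of shaders) {
    lines.push(`shader: ${gl.getShaderInfoLog(shader)}`);
  }
  return lines.join('\n');
}
```

Because no per-shader status is queried on the success path, the driver is free to compile in the background until the program is actually needed.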
My site freezes on load for a second or two. I can delay compilation, but the freeze is unavoidable, and the shader in question isn't even that many lines of code (500?). Anyway would love to see this in THREE.js. |
Been looking into this recently. Put up a pair of PRs (Primarily #19752, which depends on #19745) that offer one way of taking advantage of parallel shader compilation, though to get the most benefit out of it apps would need to make a fairly minor change to their loading code. If you've been following this issue feel free to let me know how/if that approach works for you! |
Thank you so much for doing this work! I've been dying for this feature for ages. |
I'd like to understand a bit more about the current status of Just like 2.5 years ago, Chrome and Firefox do not support this extension. Only Safari supports it (on OS X and iOS alike, which is nice, therefore in effect for all iPhone and iPad browsers). Also, several folks elsewhere claimed that not much work is done by Is it still good advice to compile all shaders in bulk, and then link all programs in bulk, irrespective of the extension? PS. With or without the extension, Safari seems to be pathologically slow, tested with simple/short |
Chrome supports I don't see support in Firefox on any platform. As for performance gains, I have some screenshots in #19752 (comment) that show the type of difference we're potentially talking about. It's not an absolute speedup in this case, but the ability to defer blocking calls until they can spend the minimal amount of time blocking. (Meaning less jank on the main JS thread.) Also, despite the method name the extension doesn't just cover |
For three-gpu-pathtracer this would be really nice. The path tracing shader is very complex and is taking upwards of 12 seconds to compile on my desktop machine (2070 super, i7 7700k), which seems to affect other tabs and other system responsiveness. There are some other things I think that can be done to improve that time such as removing shader code not needed for the materials being rendered but either way I think an async compile would be a significant improvement. Then it would be able to be more easily compiled alongside other tasks such as file loading, processing, and BVH generation. It looks like this will work everywhere important, now, outside of FF: https://developer.mozilla.org/en-US/docs/Web/API/KHR_parallel_shader_compile |
I have an old patch for FF to implement parallel shader compilation that passes the Khronos WebGL test suite. I expect it would need updating. |
What sort of API would you find useful? |
I think something like |
I'll knock something up along those lines. |
Worth noting that I'm happy to rebase those and ensure they still work if there's a realistic chance of them being merged. |
@toji Yes I remember that PR now - I would like to see that merged. Unfortunately this touches a sensitive part of the project code that I'm unfamiliar with so I don't feel confident providing feedback. |
Sorry to poke here, but any suggestions on next step(s) to get this blessed, majestic chunk of code merged? It would be a shame to waste what seems to be some really high-value work. If this works well it may provide a much smoother user experience for many applications. Thank you. |
This new KHR_parallel_shader_compile extension will move shader compilation off the main thread, and even if the total compile time remains the same, it won't block the main thread, which is a huge benefit.
It will require some modifications to the way three.js handles the compilation and linking of shaders, moving it from sync to async. Some ideas have been proposed on the Khronos mailing list and the Khronos repo PR
Just opening the issue so we could discuss ideas on how to address this change.