Optimize shader generation / processing / compilation #6085

Open
mvaligursky opened this issue Feb 23, 2024 · 0 comments
Labels
area: graphics Graphics related issue performance Relating to load times or frame rate

mvaligursky commented Feb 23, 2024

Preparing a shader for rendering takes considerable time, and we should consider implementing some of the following to lessen the impact.

Current steps

  • [Step0] if the mesh does not have the required shader variant cached, the forward renderer requests its generation
  • [Step1] based on the material / scene and other properties, an options object is created describing all features of the shader. This gets converted to a hash value.
  • [Step2] if the global cache contains a shader with this hash, we use that shader; otherwise its source code gets generated from chunks and is added to the cache
  • [Step3] we preprocess the shader - remove comments, handle some Safari 14 compatibility and a few other bits
  • [Step4] - WebGPU only - process the shaders again: assign locations to attributes and uniforms, convert standalone uniforms to uniform buffers.
  • [Step5] - WebGPU only - transpile GLSL shaders to WGSL shaders
  • [Step6] - WebGL only - kick off the shader compilation and linking as early as possible. Note that we loop over all meshes in a layer and compile their shaders before trying to use any of them for rendering, to allow the browser to compile them in parallel
  • [Step7] - WebGPU only - shaders are not compiled separately, but as we render mesh instances, we create a render pipeline hash based on shaders / render state and compile the pipeline if it is not available.
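Step1 and Step2 above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ShaderOptions` shape, the `hashOptions` helper and the `ShaderCache` class are hypothetical names, and the FNV-1a style string hash is a stand-in rather than the hash the engine actually uses.

```typescript
// Hypothetical options object; the real one describes many more shader features.
type ShaderOptions = Record<string, boolean | number | string>;

// FNV-1a style 32-bit hash over UTF-16 code units (illustrative choice only).
function hashString(text: string): number {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
}

// Step1: serialize the options with sorted keys, so that the hash does not
// depend on the order in which properties were assigned.
function hashOptions(options: ShaderOptions): number {
    const keys = Object.keys(options).sort();
    const parts = keys.map((key) => `${key}=${String(options[key])}`);
    return hashString(parts.join(';'));
}

// Step2: global cache keyed by the options hash. Source is generated only on a miss.
class ShaderCache {
    private sources = new Map<number, string>();
    generatedCount = 0;

    getOrGenerate(options: ShaderOptions, generate: (options: ShaderOptions) => string): string {
        const key = hashOptions(options);
        let source = this.sources.get(key);
        if (source === undefined) {
            source = generate(options);
            this.sources.set(key, source);
            this.generatedCount++;
        }
        return source;
    }
}
```

A cache keyed only by a 32-bit hash can in principle collide; a production version would likely compare the serialized options as well.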

Possible optimizations

Store used shaders during testing, and pre-compile them as soon as the game starts

  • We have a system (non-public, not supported) that stores the options of all used standard material shaders and saves them to a .js file, which compiles those shaders when loaded.
  • To further cut down the cost, we could store the generated shaders as well, not only the options, skipping most of the processing steps above.
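A rough sketch of the capture-and-replay idea follows. It does not reflect the existing non-public system; `CapturedVariant`, `VariantRecorder` and `precompile` are hypothetical names, and the `generate` / `compile` callbacks stand in for whatever the engine would do at those points.

```typescript
// Hypothetical record of one shader variant captured during a test session.
interface CapturedVariant {
    options: Record<string, boolean | number | string>;
    // Optional generated source. Storing it lets startup skip generation.
    source?: string;
}

class VariantRecorder {
    private variants = new Map<string, CapturedVariant>();

    // Stores each distinct set of options once, in first-seen order.
    record(variant: CapturedVariant): void {
        const keys = Object.keys(variant.options).sort();
        const id = keys.map((key) => `${key}=${String(variant.options[key])}`).join(';');
        if (!this.variants.has(id)) {
            this.variants.set(id, variant);
        }
    }

    // Serialized form that could be saved to a file and shipped with the game.
    serialize(): string {
        return JSON.stringify(Array.from(this.variants.values()));
    }
}

// At startup: compile every captured variant, using the stored source when present
// and falling back to generating it from the options otherwise.
function precompile(
    serialized: string,
    generate: (options: CapturedVariant['options']) => string,
    compile: (source: string) => void
): number {
    const variants = JSON.parse(serialized) as CapturedVariant[];
    for (const variant of variants) {
        const source = variant.source !== undefined ? variant.source : generate(variant.options);
        compile(source);
    }
    return variants.length;
}
```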

Problems

  • Some of these options are device dependent. We use different versions of the shader if the device does not support some features. This means we capture and compile shaders that are not what the user's system needs, which either fail to compile (for example on WebGL1) or compile but are never used.

Limit the amount of generated shaders

  • Often the shaders only have small differences, for example DiffuseTint being enabled.
  • In debug mode, when TRACEID_SHADER_ALLOC is enabled, we log every shader creation, including the options used for its generation. We / users could analyse those logs and remove some of these variants.
  • We could update our shader generation and remove some of the options that often only cost us one or two shader instructions, tint being a prime example. Simply always tint.
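One way such an analysis could look, as a sketch only: given the option objects collected from the log (the `LoggedOptions` shape and both function names are hypothetical, and the log format is not assumed here), count how many pairs of variants differ in exactly one option. Options with high counts are candidates for removal.

```typescript
// Hypothetical shape of the options recovered from the shader allocation log.
type LoggedOptions = Record<string, boolean | number | string>;

// Returns the sorted names of the options in which two option sets differ.
function differingKeys(a: LoggedOptions, b: LoggedOptions): string[] {
    const keys = new Set<string>([...Object.keys(a), ...Object.keys(b)]);
    return Array.from(keys).filter((key) => a[key] !== b[key]).sort();
}

// For each option name, counts the pairs of variants that differ in that option only.
function countSingleOptionSplits(variants: LoggedOptions[]): Map<string, number> {
    const counts = new Map<string, number>();
    variants.forEach((a, index) => {
        variants.slice(index + 1).forEach((b) => {
            const [only, ...rest] = differingKeys(a, b);
            if (only !== undefined && rest.length === 0) {
                counts.set(only, (counts.get(only) || 0) + 1);
            }
        });
    });
    return counts;
}
```

The pairwise comparison is quadratic in the number of variants, which should be acceptable for an offline analysis of a log.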

Remove some steps not needed on WebGL

  • Step 3 is mostly unneeded for WebGL (apart from one part on Safari 14), but we execute it in preparation for Uniform Buffer support on WebGL2, where it will be required. We could temporarily disable it, but would need a different workaround for the Safari 14 issue.

Non-blocking async shader compilation

Possible API

  • per material flag to allow async shader compilation for the material
  • per camera flag to allow async shader compilation when rendering the camera. This would allow us to enable it for the main forward renderer camera the user supplies, but disable it for shadows / lightmapper and similar, where it might not be desirable. Typically, those shaders compile fast as they are very simple.
  • some form of querying the number of shaders still being compiled. This could update during rendering, in which case the scene would need to render for the number to update. Alternatively, we could loop over the shaders and query their status.
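The three API points above could take a shape like the following sketch. None of these names exist in the engine: `AsyncMaterial`, `AsyncCamera`, `allowAsyncCompile` and `CompileTracker` are all hypothetical, and requiring both flags to be set is an assumption, since the issue does not say how the two flags would combine.

```typescript
// Hypothetical per-material and per-camera opt-in flags.
interface AsyncMaterial { name: string; allowAsyncCompile: boolean; }
interface AsyncCamera { name: string; allowAsyncCompile: boolean; }

class CompileTracker {
    private pending = new Set<string>();

    // Assumption: async compilation is used only when both the material and
    // the camera allow it, so shadow / lightmapper cameras can opt out.
    shouldCompileAsync(material: AsyncMaterial, camera: AsyncCamera): boolean {
        return material.allowAsyncCompile && camera.allowAsyncCompile;
    }

    // Called when an async compilation is kicked off.
    begin(shaderName: string): void {
        this.pending.add(shaderName);
    }

    // Called when the shader is found to be ready, for example while rendering.
    finish(shaderName: string): void {
        this.pending.delete(shaderName);
    }

    // Number of shaders still being compiled.
    getPendingCount(): number {
        return this.pending.size;
    }
}
```

An application could poll `getPendingCount()` each frame and keep a loading screen up until it reaches zero.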

Problems

Even if Chrome issues are fixed, there are still engine side limitations:

  • Consider the scenario: we mark all World layer meshes to use async shader compilation (and skip them from rendering). During the first frame, we'd only generate their shaders and kick off their compilation. Then we come to the Skybox layer, whose shader is required immediately. We kick off its compilation and wait for the result. As it was requested after all the other shaders, it waits until those are done (more or less, even though compilation is multi-threaded). So we'd need to walk the scene rendering once, collect the async shaders, render the rest, and only then request their compilation. This is non-trivial, and costly for the frames where no async shaders are found.
  • WebGPU needs a different solution, as render pipelines are created, instead of shaders being compiled. Ideally a solution / APIs exposing it would handle both.
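The reordering described in the first problem can be sketched as a planning step that walks all layers first and only then decides the request order. This is an illustration of the ordering only, under the assumption that a shader needed by any blocking draw must itself be treated as blocking; `DrawRequest` and `planCompileOrder` are hypothetical names, and the sketch does not address the WebGPU pipeline case.

```typescript
// Hypothetical description of one draw and the shader it needs.
interface DrawRequest {
    shaderName: string;
    asyncAllowed: boolean;
}

// Walks all layers once and returns the order in which compilations would be
// requested: shaders needed immediately first, async-eligible shaders last.
function planCompileOrder(layers: DrawRequest[][]): string[] {
    // Maps each shader to whether every draw using it allows async compilation.
    const allAsync = new Map<string, boolean>();
    for (const layer of layers) {
        for (const request of layer) {
            const previous = allAsync.get(request.shaderName);
            const soFar = previous === undefined ? true : previous;
            allAsync.set(request.shaderName, soFar && request.asyncAllowed);
        }
    }
    const names = Array.from(allAsync.keys());
    const blocking = names.filter((name) => allAsync.get(name) === false);
    const deferred = names.filter((name) => allAsync.get(name) === true);
    return blocking.concat(deferred);
}
```

With the World layer listed before the Skybox layer, the skybox shader is still requested first, so it does not queue behind the async shaders.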

@mvaligursky mvaligursky added performance Relating to load times or frame rate area: graphics Graphics related issue labels Feb 23, 2024
@mvaligursky mvaligursky self-assigned this Feb 23, 2024