BatchRendererGroup shader optimization for low-end platforms
This PR aims to optimize the BatchRendererGroup shader variant code, especially for low-end platforms such as Android. I've been testing on Quest 2 on my side. The PR mainly adds two optimizations.
### Caching instanced property loads
This one attempts to fix the shader code-gen by caching some property loads. With BRG shader variants, we override material properties such as `_BaseColor` like this:
`#define _BaseColor UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_DEFAULT(float4, _BaseColor)`
This macro expands to a function call that branches and either loads the property directly from the material cbuffer, or runs some ALU operations to compute a buffer address based on the current instanceID + property metadata and then loads the property from this buffer.
This computation is not free, and it seems that using, for example, `_BaseColor` multiple times in the shader results in this function being run multiple times. It looks like the compiler is unable to optimize this case by itself.
This PR tries to fix the problem by calling this function only once per property (if it is used at all) at the beginning of the shader, and storing the result in a static variable. The macro is then defined directly as `#define _BaseColor MyStaticValue` so that there is no risk of the loading computation being regenerated.
This optimization improved GPU time by ~10% on Quest 2 with the URPLitProperties scene. The FPS counter also went from 50 to 55 with this setup.
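The caching idea can be illustrated with a small CPU-side model in plain C. This is only a sketch, not the HLSL in this PR: `load_base_color`, `setup_cached_properties`, the call counter, and both macro names are hypothetical stand-ins chosen for illustration.

```c
#include <stdint.h>

typedef struct { float r, g, b, a; } Float4;

/* Hypothetical counter so the number of (expensive) loads is observable. */
static int g_load_calls = 0;

/* Hypothetical stand-in for the per-instance property load. */
static Float4 load_base_color(uint32_t instance_id) {
    g_load_calls++;
    Float4 c = { 0.25f * (float)instance_id, 0.5f, 0.75f, 1.0f };
    return c;
}

/* Uncached variant: every use of the macro re-runs the load. */
#define BASE_COLOR_UNCACHED(id) load_base_color(id)

/* Cached variant: load once up front, then the macro is just a static value. */
static Float4 s_base_color;

static void setup_cached_properties(uint32_t instance_id) {
    s_base_color = load_base_color(instance_id);
}

#define BASE_COLOR_CACHED s_base_color

/* Uses the property three times: three loads. */
float shade_uncached(uint32_t id) {
    return BASE_COLOR_UNCACHED(id).r + BASE_COLOR_UNCACHED(id).g + BASE_COLOR_UNCACHED(id).b;
}

/* Uses the property three times: one load. */
float shade_cached(uint32_t id) {
    setup_cached_properties(id);
    return BASE_COLOR_CACHED.r + BASE_COLOR_CACHED.g + BASE_COLOR_CACHED.b;
}
```

In this model, `shade_uncached` bumps the counter three times per call while `shade_cached` bumps it once, which mirrors the difference between the two macro definitions. A C compiler behaves differently from a shader compiler, so this shows the intent only, not the measured gain.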
### Static branch for instanced property loads
The second optimization turns the dynamic branch in the instanced property loading code into a static branch when possible. A property such as `_BaseColor` can be loaded either from the material cbuffer or from the big raw buffer. The source is selected dynamically by checking the metadata high bit in the shader.
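As a rough sketch of that dynamic selection, here is a plain C model (not the actual shader code). The layout is an assumption made for illustration: the high bit of the metadata word flags a per-instance override, the remaining 31 bits are treated as a byte offset into the raw buffer, and the per-instance stride is `sizeof(float)`. `load_float_prop` and `OVERRIDE_BIT` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the high bit of the metadata marks a per-instance override. */
#define OVERRIDE_BIT 0x80000000u

/* Model of the dynamic load: branch on the high bit at run time. */
float load_float_prop(uint32_t metadata, uint32_t instance_id,
                      float material_value, const unsigned char *instance_data) {
    if (metadata & OVERRIDE_BIT) {
        /* Assumed addressing: base byte offset + instance index * stride. */
        uint32_t base_offset = metadata & ~OVERRIDE_BIT;
        uint32_t address = base_offset + instance_id * (uint32_t)sizeof(float);
        float value;
        memcpy(&value, instance_data + address, sizeof value);
        return value;
    }
    /* No override: use the value from the (modeled) material cbuffer. */
    return material_value;
}
```

Both the branch and the address arithmetic run on every call in this model, which is the cost the static-branch option is meant to remove when the answer is known ahead of time.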
The cost of this branch is negligible on high- and mid-end platforms, but on low-end platforms it has a significant impact. Usually, not every property needs to be instanced in a game. You might want to instance `_BaseColor` but not `_Smoothness`, for example. This depends entirely on the game, so users need to be able to control it and avoid paying for an expensive dynamic branch on a feature they do not use.
Ideally we would have a nice UI for this, but multiple discussions suggest that there are many technical problems with that. So the goal here is to provide some simple utilities that let users control which properties they want to instance by modifying the shader themselves. It's not ideal UX-wise, but at least it's something we can do now.
This PR introduces 4 new macros:
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_ENABLED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_REQUIRED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT`
The first three allow you to specify, in the metadata cbuffer definition, which properties can or must be overridden per instance. The last one is a config define that changes how the default `UNITY_DOTS_INSTANCED_PROP` macro behaves. For example, we could imagine a cbuffer definition like this one:
```
UNITY_DOTS_INSTANCING_START(MaterialPropertyMetadata)
UNITY_DOTS_INSTANCED_PROP_OVERRIDE_ENABLED(float4, _BaseColor)
UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED(float4, _SpecColor)
UNITY_DOTS_INSTANCED_PROP_OVERRIDE_REQUIRED(float4, _EmissionColor)
UNITY_DOTS_INSTANCED_PROP(float, _Cutoff)
UNITY_DOTS_INSTANCING_END(MaterialPropertyMetadata)
```
Here is what this declaration means:
- The `_BaseColor` property can be either instanced or not. A dynamic branch will be emitted so that the shader is able to fetch the data from the material cbuffer or the `unity_DOTSInstanceData` buffer depending on the metadata high-bit value.
- The `_SpecColor` property cannot be instanced. The property will always be loaded from the material cbuffer, and no dynamic branch is emitted in the code.
- The `_EmissionColor` property **must** be instanced. The property will always be loaded from the `unity_DOTSInstanceData` buffer, and no dynamic branch is emitted in the code.
- The `_Cutoff` property may or may not be instanceable, depending on the config define:
  - If `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT` is defined, then `_Cutoff` cannot be instanced and will behave just like `_SpecColor` in this example.
  - If `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT` is **not** defined, then `_Cutoff` can be instanced and will behave just like `_BaseColor` in this example.
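The four behaviours listed above can be modeled with the C preprocessor. This is a sketch only, not the implementation in this PR: the `MODEL_*` macros, `PropSource`, and the generated `load_*` functions are hypothetical, every property is simplified to a single `float`, and the high-bit convention is the same assumption as in the description above.

```c
#include <stdint.h>

/* Assumption: the high bit of the metadata marks a per-instance override. */
#define MODEL_OVERRIDE_BIT 0x80000000u

/* Hypothetical bundle of everything a property load can read from. */
typedef struct {
    uint32_t metadata;            /* per-property metadata word      */
    float material_value;         /* value in the material cbuffer   */
    const float *instance_values; /* one value per instance          */
} PropSource;

static float load_instanced(const PropSource *s, uint32_t id) {
    return s->instance_values[id];
}

/* ENABLED: keep the dynamic branch on the metadata high bit. */
#define MODEL_PROP_OVERRIDE_ENABLED(name) \
    float load##name(const PropSource *s, uint32_t id) { \
        return (s->metadata & MODEL_OVERRIDE_BIT) ? load_instanced(s, id) \
                                                  : s->material_value; }

/* DISABLED: always read the material value, no branch. */
#define MODEL_PROP_OVERRIDE_DISABLED(name) \
    float load##name(const PropSource *s, uint32_t id) { \
        (void)id; return s->material_value; }

/* REQUIRED: always read the per-instance value, no branch. */
#define MODEL_PROP_OVERRIDE_REQUIRED(name) \
    float load##name(const PropSource *s, uint32_t id) { \
        return load_instanced(s, id); }

/* Default macro: its behaviour depends on the config define. */
#ifdef MODEL_PROP_OVERRIDE_DISABLED_BY_DEFAULT
#define MODEL_PROP(name) MODEL_PROP_OVERRIDE_DISABLED(name)
#else
#define MODEL_PROP(name) MODEL_PROP_OVERRIDE_ENABLED(name)
#endif

/* Mirrors the example declaration; generates load_BaseColor, load_SpecColor, ... */
MODEL_PROP_OVERRIDE_ENABLED(_BaseColor)
MODEL_PROP_OVERRIDE_DISABLED(_SpecColor)
MODEL_PROP_OVERRIDE_REQUIRED(_EmissionColor)
MODEL_PROP(_Cutoff)
```

With the config define left undefined, as here, `load_Cutoff` follows the same dynamic path as `load_BaseColor`. Defining `MODEL_PROP_OVERRIDE_DISABLED_BY_DEFAULT` before the `#ifdef` would make it behave like `load_SpecColor` instead.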
Disabling instancing for every material property improved the GPU time by ~20 to 40% on Quest 2 with the URPLitProperties scene. So this is likely something you want to keep disabled for the majority of the material properties on low-end platforms considering the performance impact it has.