BatchRendererGroup shader optimization for low-end platforms
This PR aims to optimize the BatchRendererGroup (BRG) shader variant code, especially for low-end platforms such as Android. I have been testing on a Quest 2. The PR adds two main optimizations.

### Caching instanced property loads
This one attempts to fix the shader code-gen by caching some property loads. With BRG shader variants, we override material properties such as `_BaseColor` like this:
`#define _BaseColor  UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_DEFAULT(float4, _BaseColor)`

This macro expands to a function call that branches: it either loads the property directly from the material cbuffer, or runs some ALU operations to compute a buffer address from the current instanceID and the property metadata, then loads the property from that buffer.

This computation is not free, and using `_BaseColor` several times in a shader, for example, appears to result in this function running several times. The compiler seems unable to optimize this case by itself.

This PR addresses the problem by calling the function only once per property (if it is used at all) at the beginning of the shader, and storing the result in a static variable. The macro is then defined directly as `#define _BaseColor MyStaticValue`, so there is no risk of the load computation being regenerated.
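
To make the idea concrete, here is a minimal CPU-side sketch in C++ rather than the actual shader code. Every name in it (`LoadInstancedBaseColor`, `SetupMaterialPropertyCaches`, `g_cachedBaseColor`, `BASE_COLOR`) is hypothetical, and the load counter exists only to show how many times the costly path runs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side model only; the real code is HLSL. All names here are hypothetical.
struct Float4 { float x, y, z, w; };

static std::vector<Float4> g_colorData;  // stands in for the per-instance buffer
static int g_loadCount = 0;              // counts how often the costly path runs

// Stands in for the function the property macro expands to:
// address computation plus a buffer load.
Float4 LoadInstancedBaseColor(uint32_t instanceID)
{
    ++g_loadCount;
    assert(instanceID < g_colorData.size());
    return g_colorData[instanceID];
}

// Before: every use of the property repeats the load.
float SumUncached(uint32_t instanceID)
{
    return LoadInstancedBaseColor(instanceID).x
         + LoadInstancedBaseColor(instanceID).y
         + LoadInstancedBaseColor(instanceID).z;
}

// After: one load at setup time, stored in a static variable.
static Float4 g_cachedBaseColor;

void SetupMaterialPropertyCaches(uint32_t instanceID)
{
    g_cachedBaseColor = LoadInstancedBaseColor(instanceID);
}

// Mirrors `#define _BaseColor MyStaticValue`: later uses read the static directly.
#define BASE_COLOR g_cachedBaseColor

float SumCached()
{
    return BASE_COLOR.x + BASE_COLOR.y + BASE_COLOR.z;
}
```

In this model, `SumUncached` runs the load three times per call, while `SumCached` never runs it; the single load happens in `SetupMaterialPropertyCaches`.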

This optimization improved GPU time by ~10% on Quest 2 with the URPLitProperties scene. The FPS counter also went from 50 to 55 with this setup.

### Static branch for instanced property loads
The second optimization turns the dynamic branch in the instanced property loading code into a static branch when possible. A property such as `_BaseColor` can be loaded either from the material cbuffer or from the big raw buffer. The source is selected dynamically by checking the metadata high bit in the shader.
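
As a rough illustration of that selection, here is a CPU-side C++ sketch, not the shader code. The names are hypothetical, and the layout (low metadata bits used as a base index, one float per instance) is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side model only; the real code is HLSL. Names and buffer layout are assumptions.
static const uint32_t kHighBit = 0x80000000u;  // metadata high bit

static std::vector<float> g_rawBuffer;         // stands in for the big raw buffer

// Dynamic variant: the source is chosen at run time by testing the high bit.
float LoadDynamic(float materialValue, uint32_t metadata, uint32_t instanceID)
{
    if ((metadata & kHighBit) == 0)
        return materialValue;                  // material cbuffer path

    uint32_t base = metadata & ~kHighBit;      // assumed: low bits hold a base index
    uint32_t address = base + instanceID;      // assumed: one float per instance
    assert(address < g_rawBuffer.size());
    return g_rawBuffer[address];               // instance buffer path
}

// Static variant: if the property is known never to be instanced,
// the test disappears and only the material value is read.
float LoadStaticDisabled(float materialValue)
{
    return materialValue;
}
```

The static variant returns the same value as the dynamic one whenever the high bit is clear, but without the run-time test.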

The cost of this branch has been negligible on high-end and mid-range platforms, but on low-end platforms it has a significant impact. Usually, not every property needs to be instanced in a game. You might want to instance `_BaseColor` but not `_Smoothness`, for example. This depends entirely on the game, so users need to be able to control it and avoid paying for an expensive dynamic branch on a feature they do not use.

Ideally we would have a nice UI for this, but multiple discussions suggest that this raises many technical problems. So the goal here is to provide some simple utilities that let users control which properties they want to instance by modifying the shader themselves. It's not ideal UX-wise, but at least it's something we can do now.

This PR introduces 4 new macros:
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_SUPPORTED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_REQUIRED`
- `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT`

The first three let you specify, in the metadata cbuffer definition, which properties can or must be overridden per instance. The last one is a config define that changes how the default `UNITY_DOTS_INSTANCED_PROP` macro behaves. For example, we could imagine a cbuffer definition like this one:
```
UNITY_DOTS_INSTANCING_START(MaterialPropertyMetadata)
    UNITY_DOTS_INSTANCED_PROP_OVERRIDE_SUPPORTED(float4, _BaseColor)
    UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED(float4, _SpecColor)
    UNITY_DOTS_INSTANCED_PROP_OVERRIDE_REQUIRED(float4, _EmissionColor)
    UNITY_DOTS_INSTANCED_PROP(float, _Cutoff)
UNITY_DOTS_INSTANCING_END(MaterialPropertyMetadata)
```
Here is what this declaration means:
- The `_BaseColor` property may or may not be instanced. A dynamic branch is emitted so that the shader can fetch the data from either the material cbuffer or the `unity_DOTSInstanceData` buffer, depending on the metadata high bit.
- The `_SpecColor` property cannot be instanced. It is always loaded from the material cbuffer, and no dynamic branch is emitted.
- The `_EmissionColor` property **must** be instanced. It is always loaded from the `unity_DOTSInstanceData` buffer, and no dynamic branch is emitted.
- The `_Cutoff` property can be instanced or not, depending on the config define:
  - If `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT` is defined, `_Cutoff` cannot be instanced and behaves just like `_SpecColor` in this example.
  - If `UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT` is **not** defined, `_Cutoff` can be instanced and behaves just like `_BaseColor` in this example.
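
The way these declarations resolve can be modeled on the CPU with the same preprocessor pattern. The following C++ sketch is an approximation, not the shader code: macro and function names are shortened and hypothetical, properties are scalars with no leading underscore (such identifiers are reserved in C++), and the buffer layout is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side model of the macro scheme; the real code is HLSL.
static const int kOverrideDisabled  = 0;
static const int kOverrideSupported = 1;
static const int kOverrideRequired  = 2;

#define CONCAT2(a, b) a ## b
#define METADATA_NAME(name) CONCAT2(name, _Metadata)
#define MODE_NAME(name)     CONCAT2(name, _OverrideMode)

// Each declaration emits a metadata variable plus a compile-time mode constant.
#define PROP_OVERRIDE_DISABLED(name) \
    static const uint32_t METADATA_NAME(name) = 0; \
    static const int MODE_NAME(name) = kOverrideDisabled;
#define PROP_OVERRIDE_SUPPORTED(name) \
    static uint32_t METADATA_NAME(name) = 0; \
    static const int MODE_NAME(name) = kOverrideSupported;
#define PROP_OVERRIDE_REQUIRED(name) \
    static uint32_t METADATA_NAME(name) = 0; \
    static const int MODE_NAME(name) = kOverrideRequired;

// The config define flips what the plain declaration means.
#ifdef PROP_OVERRIDE_DISABLED_BY_DEFAULT
#define PROP(name) PROP_OVERRIDE_DISABLED(name)
#else
#define PROP(name) PROP_OVERRIDE_SUPPORTED(name)
#endif

static const uint32_t kOverriddenBit = 0x80000000u;  // metadata high bit
static std::vector<float> g_instanceData;            // stands in for unity_DOTSInstanceData
static uint32_t g_instanceID = 0;

// Always reads the instance buffer; the high bit is not tested.
float LoadOverridden(uint32_t metadata)
{
    uint32_t address = (metadata & ~kOverriddenBit) + g_instanceID;  // assumed layout
    assert(address < g_instanceData.size());
    return g_instanceData[address];
}

// Run-time branch on the high bit: instance buffer or material value.
float LoadWithDefault(float materialValue, uint32_t metadata)
{
    return (metadata & kOverriddenBit) ? LoadOverridden(metadata) : materialValue;
}

// The mode constants are known at compile time, so the compiler can fold these ternaries.
#define ACCESS_PROP_WITH_DEFAULT(name) ( \
    MODE_NAME(name) == kOverrideSupported ? LoadWithDefault(name, METADATA_NAME(name)) \
  : MODE_NAME(name) == kOverrideRequired  ? LoadOverridden(METADATA_NAME(name)) \
  : (name) )

// Material cbuffer values (scalars here for brevity).
static float BaseColor = 0.25f, SpecColor = 0.5f, EmissionColor = 0.0f, Cutoff = 0.75f;

PROP_OVERRIDE_SUPPORTED(BaseColor)
PROP_OVERRIDE_DISABLED(SpecColor)
PROP_OVERRIDE_REQUIRED(EmissionColor)
PROP(Cutoff)

float ReadBaseColor()     { return ACCESS_PROP_WITH_DEFAULT(BaseColor); }
float ReadSpecColor()     { return ACCESS_PROP_WITH_DEFAULT(SpecColor); }
float ReadEmissionColor() { return ACCESS_PROP_WITH_DEFAULT(EmissionColor); }
float ReadCutoff()        { return ACCESS_PROP_WITH_DEFAULT(Cutoff); }
```

In this model, `ReadSpecColor` always returns the material value, `ReadEmissionColor` always reads the instance buffer, and `ReadBaseColor` and `ReadCutoff` depend on the metadata high bit at run time. Building with `PROP_OVERRIDE_DISABLED_BY_DEFAULT` defined would make `ReadCutoff` behave like `ReadSpecColor`.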

Disabling instancing for every material property improved the GPU time by ~20 to 40% on Quest 2 with the URPLitProperties scene. So this is likely something you want to keep disabled for the majority of the material properties on low-end platforms considering the performance impact it has.
vincent-breysse authored and Evergreen committed Aug 13, 2023
1 parent 77f7a64 commit 6e919c5
Showing 11 changed files with 555 additions and 187 deletions.
@@ -7,6 +7,14 @@
 #error DOTS Instancing requires the new shader preprocessor. Please enable Caching Preprocessor in the Editor settings!
 #endif

+// Config defines
+// ==========================================================================================
+// #define UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT
+
+
+
+
+
 /*
 Here's a bit of python code to generate these repetitive typespecs without
 a lot of C macro magic
@@ -100,6 +108,10 @@ for t, c, sz in (
 #define UNITY_DOTS_INSTANCING_TYPESPEC_min16float4 H8
 #define UNITY_DOTS_INSTANCING_TYPESPEC_SH F128

+static const int kDotsInstancedPropOverrideDisabled = 0;
+static const int kDotsInstancedPropOverrideSupported = 1;
+static const int kDotsInstancedPropOverrideRequired = 2;
+
 #define UNITY_DOTS_INSTANCING_CONCAT2(a, b) a ## b
 #define UNITY_DOTS_INSTANCING_CONCAT4(a, b, c, d) a ## b ## c ## d
 #define UNITY_DOTS_INSTANCING_CONCAT_WITH_METADATA(metadata_prefix, typespec, name) UNITY_DOTS_INSTANCING_CONCAT4(metadata_prefix, typespec, _Metadata, name)
@@ -115,20 +127,53 @@ for t, c, sz in (
 // underscore in the common case where the property name starts with an underscore.
 // A prefix double underscore is illegal on some platforms like OpenGL.
 #define UNITY_DOTS_INSTANCED_METADATA_NAME(type, name) UNITY_DOTS_INSTANCING_CONCAT_WITH_METADATA(unity_DOTSInstancing, UNITY_DOTS_INSTANCING_CONCAT2(UNITY_DOTS_INSTANCING_TYPESPEC_, type), name)
+#define UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) UNITY_DOTS_INSTANCING_CONCAT2(name, _DOTSInstancingOverrideMode)

 #define UNITY_DOTS_INSTANCING_START(name) cbuffer UnityDOTSInstancing_##name {
 #define UNITY_DOTS_INSTANCING_END(name) }
-#define UNITY_DOTS_INSTANCED_PROP(type, name) uint UNITY_DOTS_INSTANCED_METADATA_NAME(type, name);

-#define UNITY_ACCESS_DOTS_INSTANCED_PROP(type, var) LoadDOTSInstancedData_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
-#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP(type, arr, var) LoadDOTSInstancedData_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
+#define UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED(type, name) static const uint UNITY_DOTS_INSTANCED_METADATA_NAME(type, name) = 0; \
+static const int UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) = kDotsInstancedPropOverrideDisabled;
+
+#define UNITY_DOTS_INSTANCED_PROP_OVERRIDE_SUPPORTED(type, name) uint UNITY_DOTS_INSTANCED_METADATA_NAME(type, name); \
+static const int UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) = kDotsInstancedPropOverrideSupported;
+
+#define UNITY_DOTS_INSTANCED_PROP_OVERRIDE_REQUIRED(type, name) uint UNITY_DOTS_INSTANCED_METADATA_NAME(type, name); \
+static const int UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) = kDotsInstancedPropOverrideRequired;
+
+#ifdef UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED_BY_DEFAULT
+#define UNITY_DOTS_INSTANCED_PROP(type, name) UNITY_DOTS_INSTANCED_PROP_OVERRIDE_DISABLED(type, name)
+#else
+#define UNITY_DOTS_INSTANCED_PROP(type, name) UNITY_DOTS_INSTANCED_PROP_OVERRIDE_SUPPORTED(type, name)
+#endif
+
+#define UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_DISABLED(name) (UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) == kDotsInstancedPropOverrideDisabled)
+#define UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_ENABLED(name) (UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) == kDotsInstancedPropOverrideSupported)
+#define UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_REQUIRED(name) (UNITY_DOTS_INSTANCED_PROP_OVERRIDE_MODE_NAME(name) == kDotsInstancedPropOverrideRequired)
+
+#define UNITY_ACCESS_DOTS_INSTANCED_PROP(type, var) ( \ // Compile-time branches
+UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_ENABLED(var) ? LoadDOTSInstancedData_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_REQUIRED(var) ? LoadDOTSInstancedDataOverridden_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: ((type)0) \
+)
+
+#define UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_DEFAULT(type, var) ( \ // Compile-time branches
+UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_ENABLED(var) ? LoadDOTSInstancedData_##type(var, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_REQUIRED(var) ? LoadDOTSInstancedDataOverridden_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: (var) \
+)

-#define UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_DEFAULT(type, var) LoadDOTSInstancedData_##type(var, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
-#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP_WITH_DEFAULT(type, arr, var) LoadDOTSInstancedData_##type(var, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
+#define UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_CUSTOM_DEFAULT(type, var, default_value) ( \ // Compile-time branches
+UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_ENABLED(var) ? LoadDOTSInstancedData_##type(default_value, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: UNITY_DOTS_INSTANCED_PROP_IS_OVERRIDE_REQUIRED(var) ? LoadDOTSInstancedDataOverridden_##type(UNITY_DOTS_INSTANCED_METADATA_NAME(type, var)) \
+: (default_value) \
+)

-#define UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_CUSTOM_DEFAULT(type, var, default_value) LoadDOTSInstancedData_##type(default_value, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
-#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP_WITH_CUSTOM_DEFAULT(type, arr, var, default_value) LoadDOTSInstancedData_##type(default_value, UNITY_DOTS_INSTANCED_METADATA_NAME(type, var))
+#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP(type, arr, var) UNITY_ACCESS_DOTS_INSTANCED_PROP(type, var)
+#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP_WITH_DEFAULT(type, arr, var) UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_DEFAULT(type, var)
+#define UNITY_ACCESS_DOTS_AND_TRADITIONAL_INSTANCED_PROP_WITH_CUSTOM_DEFAULT(type, arr, var, default_value) UNITY_ACCESS_DOTS_INSTANCED_PROP_WITH_CUSTOM_DEFAULT(type, var, default_value)

+#define UNITY_SETUP_DOTS_MATERIAL_PROPERTY_CACHES() // No-op by default

 #ifdef UNITY_DOTS_INSTANCING_UNIFORM_BUFFER
 CBUFFER_START(unity_DOTSInstanceData)
@@ -376,6 +421,11 @@ type LoadDOTSInstancedData_##type(uint metadata) \
 uint address = ComputeDOTSInstanceDataAddress(metadata, sizeof_type); \
 return conv(DOTSInstanceData_Load(address)); \
 } \
+type LoadDOTSInstancedDataOverridden_##type(uint metadata) \
+{ \
+uint address = ComputeDOTSInstanceDataAddressOverridden(metadata, sizeof_type); \
+return conv(DOTSInstanceData_Load(address)); \
+} \
 type LoadDOTSInstancedData_##type(type default_value, uint metadata) \
 { \
 uint address = ComputeDOTSInstanceDataAddressOverridden(metadata, sizeof_type); \
@@ -389,6 +439,11 @@ type##width LoadDOTSInstancedData_##type##width(uint metadata) \
 uint address = ComputeDOTSInstanceDataAddress(metadata, sizeof_type * width); \
 return conv(DOTSInstanceData_Load##width(address)); \
 } \
+type##width LoadDOTSInstancedDataOverridden_##type##width(uint metadata) \
+{ \
+uint address = ComputeDOTSInstanceDataAddressOverridden(metadata, sizeof_type * width); \
+return conv(DOTSInstanceData_Load##width(address)); \
+} \
 type##width LoadDOTSInstancedData_##type##width(type##width default_value, uint metadata) \
 { \
 uint address = ComputeDOTSInstanceDataAddressOverridden(metadata, sizeof_type * width); \
@@ -281,6 +281,7 @@
 #define UNITY_SETUP_INSTANCE_ID(input) {\
 DEFAULT_UNITY_SETUP_INSTANCE_ID(input);\
 SetupDOTSVisibleInstancingData();\
+UNITY_SETUP_DOTS_MATERIAL_PROPERTY_CACHES();\
 UNITY_SETUP_DOTS_SH_COEFFS;\
 UNITY_SETUP_DOTS_RENDER_BOUNDS; }
 #endif