Anything that can help the game to run on more lower-end machines is welcome! 🤓️
Currently, dynamic light tiles are computed on the GPU, and the shader that does so branches on expressions that are not dynamically uniform (while this distinction is only made for GLSL versions >= 4.0, it is likely to be close to what modern drivers will branch well on), so the fragment shader invocations likely execute all branches. It also uses a different FBO, which is an expensive state change. See: https://github.com/DaemonEngine/Daemon/blob/master/src/engine/renderer/glsl_source/lighttile_fp.glsl and Daemon/src/engine/renderer/tr_backend.cpp (line 2803 in c5f8539).
Performance would likely improve if lights were culled and assigned to tiles on the CPU.
Additionally, using clustered rendering instead of tiled rendering would likely improve performance further.
Overview:
Tiled and clustered rendering are techniques that divide the view frustum into tiles (on the XY plane) or frustum-shaped clusters (tiles further split into depth slices), respectively, and store lights (and potentially things like decals or probes) only in the tiles/clusters they affect. The lighting code in the fragment shader then fetches the list of lights from a buffer object or a texture using the fragment's position and only computes those lights.
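To make the CPU-side tile assignment concrete, here is a minimal sketch of binning lights into screen-space tiles. It assumes each light has already been projected to a conservative screen-space circle. The names `ScreenLight`, `TileGrid`, `MakeTileGrid` and `AssignLightsToTiles` are hypothetical and do not exist in the engine.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not the engine's actual structures.
struct ScreenLight {
    float x, y;    // projected light center, in pixels
    float radius;  // conservative screen-space radius, in pixels
};

struct TileGrid {
    int tileSize;  // tile edge length, in pixels
    int tilesX, tilesY;
    std::vector<std::vector<std::uint16_t>> lightLists;  // light indices per tile
};

TileGrid MakeTileGrid(int width, int height, int tileSize) {
    TileGrid grid;
    grid.tileSize = tileSize;
    grid.tilesX = (width + tileSize - 1) / tileSize;   // round up
    grid.tilesY = (height + tileSize - 1) / tileSize;
    grid.lightLists.resize(static_cast<std::size_t>(grid.tilesX) * grid.tilesY);
    return grid;
}

// Appends each light's index to every tile overlapped by the light's
// screen-space bounding square (conservative: corner tiles may be false positives).
void AssignLightsToTiles(TileGrid& grid, const std::vector<ScreenLight>& lights) {
    const float size = static_cast<float>(grid.tileSize);
    for (std::size_t i = 0; i < lights.size(); ++i) {
        const ScreenLight& l = lights[i];
        int x0 = static_cast<int>(std::floor((l.x - l.radius) / size));
        int x1 = static_cast<int>(std::floor((l.x + l.radius) / size));
        int y0 = static_cast<int>(std::floor((l.y - l.radius) / size));
        int y1 = static_cast<int>(std::floor((l.y + l.radius) / size));

        // Cull lights whose bounds lie entirely outside the screen.
        if (x1 < 0 || y1 < 0 || x0 >= grid.tilesX || y0 >= grid.tilesY) {
            continue;
        }

        x0 = std::max(x0, 0);
        y0 = std::max(y0, 0);
        x1 = std::min(x1, grid.tilesX - 1);
        y1 = std::min(y1, grid.tilesY - 1);

        for (int ty = y0; ty <= y1; ++ty) {
            for (int tx = x0; tx <= x1; ++tx) {
                grid.lightLists[static_cast<std::size_t>(ty) * grid.tilesX + tx]
                    .push_back(static_cast<std::uint16_t>(i));
            }
        }
    }
}
```

The per-tile lists would then be flattened and uploaded to a buffer object or texture once per frame, so the fragment shader only has to look up its tile with something like `gl_FragCoord.xy / tileSize` instead of a separate GPU pass computing the tiles.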
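Clustered rendering reduces the number of light computations compared to tiled rendering for a small CPU cost, because a light that overlaps a tile on screen but lies at a different depth than the fragment is no longer evaluated.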
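As an illustration of the extra depth dimension, the sketch below maps a view-space depth to a slice index using exponential slicing, which is one common scheme and only an assumption here, not something this issue prescribes. The function names `DepthSlice` and `ClusterIndex` and all parameters are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Maps a positive view-space depth to a slice index in [0, numSlices - 1].
// Slice boundaries are spaced exponentially: zNear * (zFar / zNear)^(k / numSlices).
// Assumes 0 < zNear < zFar and numSlices >= 1.
int DepthSlice(float viewZ, float zNear, float zFar, int numSlices) {
    // Clamp so that depths in front of the near plane land in slice 0
    // and the logarithm is always defined.
    viewZ = std::max(viewZ, zNear);
    const float t = std::log(viewZ / zNear) / std::log(zFar / zNear);
    const int slice = static_cast<int>(std::floor(t * static_cast<float>(numSlices)));
    return std::min(std::max(slice, 0), numSlices - 1);
}

// Flattens (tileX, tileY, slice) into an index into a per-cluster light list array.
int ClusterIndex(int tileX, int tileY, int slice, int tilesX, int tilesY) {
    return (slice * tilesY + tileY) * tilesX + tileX;
}
```

Under this scheme, a light centered at view depth `z` with radius `r` would only be stored in slices `DepthSlice(z - r, ...)` through `DepthSlice(z + r, ...)` of the tiles it overlaps, rather than in the whole tile column.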
For a more in-depth explanation and an implementation example, see https://www.humus.name/Articles/PracticalClusteredShading.pdf and https://advances.realtimerendering.com/s2016/Siggraph2016_idTech6.pdf (pp. 5-9).