Cull dynamic lights on CPU, switch to clustered rendering #1042

Open
VReaperV opened this issue Feb 13, 2024 · 1 comment
Labels
A-Renderer T-Improvement Improvement for an existing feature

Comments

@VReaperV
Contributor

VReaperV commented Feb 13, 2024

Currently, dynamic light tiles are computed on the GPU, and the shader branches on expressions that are not dynamically uniform. The dynamically uniform distinction is only formally made for GLSL versions >= 4.0, but it is likely close to what modern drivers can branch on efficiently, so the fragment shader invocations likely execute all branches. The light tile pass also uses a different FBO, which is an expensive state change. See: https://github.com/DaemonEngine/Daemon/blob/master/src/engine/renderer/glsl_source/lighttile_fp.glsl and

void RB_RenderPostDepthLightTile()

Performance would likely improve if lights are culled and assigned to tiles on the CPU.
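
As a rough illustration of what CPU-side culling could look like, here is a minimal sketch that tests each light's bounding sphere against the four side planes of every screen tile. Everything in it is an assumption made for the example: the names (`TileGrid`, `Light`, `AssignLightsToTiles`), the view-space convention (+Z forward), and the symmetric perspective projection are hypothetical and not taken from the Daemon source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// All names below are hypothetical and not taken from the Daemon source.
struct Vec3 { float x, y, z; };

// Point light as a bounding sphere in view space.
// Assumed convention: camera at the origin, +X right, +Y up, +Z forward.
struct Light { Vec3 center; float radius; };

struct TileGrid {
    int tilesX, tilesY;              // number of screen tiles per axis
    float tanHalfFovX, tanHalfFovY;  // tangents of the half field of view
};

// Signed distance from p to a plane through the origin with normal n
// (n does not need to be normalized, but must be non-zero).
static float PlaneDistance(const Vec3& n, const Vec3& p) {
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return (n.x * p.x + n.y * p.y + n.z * p.z) / len;
}

// Conservative sphere vs. tile test: may report a few false positives near
// tile corners, but never misses a tile the sphere actually overlaps.
static bool SphereTouchesTile(const TileGrid& g, int tx, int ty, const Light& l) {
    if (l.center.z + l.radius <= 0.0f) return false;  // entirely behind the camera

    // Tile bounds in normalized device coordinates, [-1, 1] on both axes.
    const float x0 = 2.0f * tx / g.tilesX - 1.0f;
    const float x1 = 2.0f * (tx + 1) / g.tilesX - 1.0f;
    const float y0 = 2.0f * ty / g.tilesY - 1.0f;
    const float y1 = 2.0f * (ty + 1) / g.tilesY - 1.0f;

    // Inward-facing side planes of the tile's sub-frustum; all pass through the eye.
    const Vec3 planes[4] = {
        {  1.0f,  0.0f, -x0 * g.tanHalfFovX },  // left
        { -1.0f,  0.0f,  x1 * g.tanHalfFovX },  // right
        {  0.0f,  1.0f, -y0 * g.tanHalfFovY },  // bottom
        {  0.0f, -1.0f,  y1 * g.tanHalfFovY },  // top
    };
    for (const Vec3& n : planes) {
        if (PlaneDistance(n, l.center) < -l.radius) return false;
    }
    return true;
}

// Returns one list of light indices per tile, in row-major order (ty * tilesX + tx).
std::vector<std::vector<uint32_t>> AssignLightsToTiles(const TileGrid& g,
                                                       const std::vector<Light>& lights) {
    assert(g.tilesX > 0 && g.tilesY > 0);
    std::vector<std::vector<uint32_t>> tiles(static_cast<size_t>(g.tilesX) * g.tilesY);
    for (uint32_t i = 0; i < lights.size(); ++i) {
        for (int ty = 0; ty < g.tilesY; ++ty) {
            for (int tx = 0; tx < g.tilesX; ++tx) {
                if (SphereTouchesTile(g, tx, ty, lights[i])) {
                    tiles[static_cast<size_t>(ty) * g.tilesX + tx].push_back(i);
                }
            }
        }
    }
    return tiles;
}
```

This brute-force loop costs O(lights × tiles). A real implementation would probably first narrow each light to a screen-space rectangle of candidate tiles and pack the result into a flat buffer for upload, but the per-tile test would stay the same.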

Additionally, using clustered rendering instead of tiled will likely increase performance as well.

Overview:
Tiled and clustered rendering are techniques that divide the view frustum into tiles (on the XY plane) or frustum-shaped clusters (tiles that are additionally split into depth slices), respectively, and store lights (and potentially things like decals or probes) only in the tiles/clusters they can affect. The lighting code in the fragment shader then fetches the list of lights from a buffer object or a texture using the fragment's position and only computes those lights.
Clustered rendering reduces the number of light computations compared to tiled rendering, for a small CPU cost.
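
To make the lookup step concrete, here is a minimal sketch of how a fragment's position could be mapped to a cluster and its light list, written in C++ for readability even though the real lookup would live in the fragment shader. It assumes logarithmic depth slicing, which is a common choice but only one option, and an offset-based flat light list. All names (`ClusterGrid`, `ClusterIndex`, `LightLists`) are hypothetical and not taken from the Daemon source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// All names below are hypothetical and not taken from the Daemon source.
struct ClusterGrid {
    int tilesX, tilesY, slicesZ;  // cluster counts along screen X, screen Y and depth
    int viewportW, viewportH;     // viewport size in pixels
    float zNear, zFar;            // positive view-space depth range
};

static int ClampInt(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// Assumption: logarithmic slicing, so slices get thicker with distance.
int DepthSlice(const ClusterGrid& g, float viewDepth) {
    const float t = std::log(viewDepth / g.zNear) / std::log(g.zFar / g.zNear);
    const int slice = static_cast<int>(std::floor(t * g.slicesZ));
    return ClampInt(slice, 0, g.slicesZ - 1);
}

// Maps a fragment (pixel coordinates plus positive view-space depth) to a
// flat cluster index laid out as (slice * tilesY + ty) * tilesX + tx.
int ClusterIndex(const ClusterGrid& g, float px, float py, float viewDepth) {
    const int tx = ClampInt(static_cast<int>(px * g.tilesX / g.viewportW), 0, g.tilesX - 1);
    const int ty = ClampInt(static_cast<int>(py * g.tilesY / g.viewportH), 0, g.tilesY - 1);
    return (DepthSlice(g, viewDepth) * g.tilesY + ty) * g.tilesX + tx;
}

// Flat light lists as they might be uploaded to a buffer object:
// cluster c owns indices[offsets[c]] .. indices[offsets[c + 1] - 1].
struct LightLists {
    std::vector<uint32_t> offsets;  // one entry per cluster, plus one terminator
    std::vector<uint32_t> indices;  // light indices, grouped by cluster
};

std::vector<uint32_t> LightsInCluster(const LightLists& lists, int cluster) {
    assert(cluster >= 0 && static_cast<size_t>(cluster) + 1 < lists.offsets.size());
    return std::vector<uint32_t>(lists.indices.begin() + lists.offsets[cluster],
                                 lists.indices.begin() + lists.offsets[cluster + 1]);
}
```

With `slicesZ` set to 1 this degenerates to plain tiled lookup, which is one way to see that clustering only adds the depth dimension to the same data structure.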

For a more in-depth explanation and implementation example see https://www.humus.name/Articles/PracticalClusteredShading.pdf and https://advances.realtimerendering.com/s2016/Siggraph2016_idTech6.pdf (p. 5-9).

@VReaperV VReaperV added T-Improvement Improvement for an existing feature A-Renderer labels Feb 13, 2024
@illwieckz
Member

> Currently, dynamic lights tiles are computed on the GPU, and uses branching on non-dynamically uniform expressions (while this distinction is made for GLSL versions >= 4.0, it is likely to be close to what modern drivers will branch well on), so the fragment shader invocations likely execute all branches. It also uses a different FBO, which is an expensive state change.

Anything that can help the game to run on more lower-end machines is welcome! 🤓️
