This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

Costly Full FrameBuffer Copy in GLES3 Renderer #45

Open
NeoSpark314 opened this issue Sep 1, 2019 · 5 comments

Comments

@NeoSpark314
Collaborator

Currently the GLES3 renderer seems to be set up to always perform a full frame copy after rendering, even when no post-processing is enabled (it then only performs the sRGB conversion).
Below is a screenshot of a RenderDoc trace showing the frame copy that happens for each eye via a fullscreen quad.

On tile-based GPU architectures this is very expensive: according to this talk: https://www.youtube.com/watch?v=CQxkE_56xMU&feature=youtu.be&t=2426 it takes 2.5 ms on an Oculus Quest (out of the total of 13.8 ms available at 72 Hz).

Ideally there would be an option to skip post-processing and the buffer copy completely, even in the GLES3 renderer, as in most cases it is far too expensive; solving this would probably require changes to the GLES3 render architecture, though.

A side effect of this fullscreen copy is that the 4x multisampling set here:

static const int NUM_MULTI_SAMPLES = 4;

is not affecting the final result, and probably only costs bandwidth when the fullscreen quad is rendered into an MSAA target.

Screenshot of the GLES3 RenderDoc trace (of two cubes rendered at the controller positions):
[screenshot: renderdoc_gles3]

@NeoSpark314 NeoSpark314 mentioned this issue Oct 1, 2019
@m4gr3d
Collaborator

m4gr3d commented Oct 29, 2019

cc @BastiaanOlij @akien-mga

@BastiaanOlij
Member

Indeed, this has always been an issue for VR. While some people will want to do a certain amount of post effects, requiring this copy anyway, the fact that we already take the render buffer and copy it to the final output through a lens distortion shader just seems wasteful. Depending on the post effects used, it can be more than one copy.

A more efficient way would be to skip the post effect step in Godot and either accept that as a limitation of the platform (on the Quest you're already trying to minimize unneeded overhead) or see if the platform allows some post effects in the lens distortion shader. The problem for the other VR platforms Godot has to support is that for, say, desktop VR you still want to output to screen, something we have to keep in mind if we implement support for this in the core.

The final problem that needs to be resolved, if we make it optional to skip the post effects, is that the render buffer Oculus wants us to create and the render buffers Godot needs in order to do its rendering properly don't match up. Godot adds a number of extra effect buffers and requires the main color target to be an RGBA16F buffer, which not all platforms support.

Things will change radically for Vulkan anyway so it may be smart to wait and see where that ends up and then decide how we want to move forward with the new render pipeline.

@NeoSpark314
Collaborator Author

From my perspective, waiting is probably also the best option. When I opened the issue I thought that GLES3 would be the preferred choice for Mobile/Quest (assuming that OpenGL ES 3.0/3.1 features would allow better performance). But even with an option to disable post-processing, I see too many performance pitfalls in the current GLES3 renderer caused by mobile's tiled render architecture.
The GLES2 renderer already seems very well optimized for mobile; maybe in the future some of these optimizations can be introduced via extensions, or the GLES2 renderer could be allowed to create an ES 3.1 context where available.
I think the general recommendation for users should be to use only the GLES2 renderer, or to wait for the mobile-optimized Vulkan path in Godot 4.0.

@m4gr3d
Collaborator

m4gr3d commented Oct 30, 2019

@NeoSpark314 Can you elaborate on the performance pitfalls you're seeing with the GLES3 renderer?
I'm also trying to identify the optimal Godot renderer, so knowing ahead of time what I should be looking out for would help speed up the evaluation. It would also be worth listing that information on the project pages, since other users might be interested and/or want to weigh in.

@NeoSpark314
Collaborator Author

From what I remember at the moment (I stopped using GLES3 after I opened this issue :-), the main issue I saw with GLES3 is that there are many options that require additional fullscreen passes, which are extremely expensive (mostly due to bandwidth): auto exposure, glow, SSAO, and screen-space reflections can all easily be turned on in GLES3.
On the material side, I think subsurface scattering requires a buffer readback (the same applies to refraction).
Mipmap generation of the render buffer in GLES3 is also very expensive.

All in all, just the demo project with no additional post-processing enabled is already close to maxing out the GPU with GLES3 (GPU running at clock level 4), while with GLES2 it runs comfortably at level 2:

[screenshot: gles2_gles3_comparison]

From my current experience I would list the following performance suggestions at the moment:

  • Use GLES2
  • Always use the ovr_metrics_tool during development
  • No post-processing (for fadeout use the color multiply from the godot_oculus_mobile API)
  • No alpha discard
  • Limit alpha blending
  • No Screen buffer readback
  • Maximum 1 dynamic light source (not sure yet how environment lighting is implemented; maybe it's not costly)
  • Try to avoid realtime shadows (use only baked lighting)
  • At most ~70,000 triangles visible per frame
  • Use 4xMSAA (once it's fixed; the performance impact is quite low on tiled render architectures)
