WebGLRenderer: Avoid creating per-camera render states. #20422

Oletus · 2020-09-27T08:28:47Z

Now we usually only create one render state per scene. Instead of creating camera-specific render states, the view-dependent light uniforms are updated with a separate function whenever rendering. This helps avoid initMaterial calls when rendering cube maps or array cameras, improving performance dramatically.

The only exception is handling nested render() calls done from callbacks. These are sometimes used to render the same scene from a different viewpoint, for example to generate a mirror reflection. They could also potentially modify the scene in some other ways. For these reasons they can't share the same render state.

Related issues: #12883

Now we usually only create one render state per scene. Instead of creating camera-specific render states, the view-dependent light uniforms are updated with a separate function whenever rendering. This helps avoid initMaterial calls when rendering cube maps or array cameras, improving performance dramatically. The only exception is handling nested render() calls done from callbacks. These are sometimes used to render the same scene from a different viewpoint, for example to generate a mirror reflection. They could also potentially modify the scene in some other ways. For these reasons they can't share the same render state. Related issues: mrdoob#12883

src/renderers/WebGLRenderer.js

src/renderers/webgl/WebGLRenderStates.js

Mugen87 · 2020-09-27T11:25:43Z

This helps avoid initMaterial calls when rendering cube maps or array cameras, improving performance dramatically.

Could you provide some performance numbers here? improving performance dramatically is a bit vague for my taste. Besides, a previous performance issue in context of initMaterial() was already solved via #19673.

Oletus · 2020-09-27T16:13:47Z

This helps avoid initMaterial calls when rendering cube maps or array cameras, improving performance dramatically.

Could you provide some performance numbers here? improving performance dramatically is a bit vague for my taste. Besides, a previous performance issue in context of initMaterial() was already solved via #19673.

Thanks for making me aware of #19673, I think the effect will be less dramatic after that change. In our application we saw a more than 2x performance improvement with an earlier version of this change when rendering cubemaps, but that was without #19673. I'd still expect a fairly large benefit in our case, where we have a rather compex scene with lots of materials, with added cost from customProgramCacheKey() for a lot of materials. It could be worthwhile to take another measurement, but that will take a while.

Oletus · 2020-09-27T16:42:22Z

Added a new commit with suggestions for improved naming.

One other thing: if an error is thrown from somewhere within render(), the render state will be left in the stack. Some apps might otherwise ignore the error and continue executing, in which case the stack will keep growing. Should we add a try-catch block that cleans up the stack in case of an error?

Mugen87 · 2020-09-27T18:05:35Z

It could be worthwhile to take another measurement, but that will take a while.

Using the scene and camera to organize render states always seemed reasonable to me. The mentioned performance issues are mainly introduced of the logic in initMaterial(). Since the engine has mitigated this overhead starting from r119, it would be good to know what benefit this PR adds.

Mugen87 · 2020-09-27T18:06:54Z

Should we add a try-catch block that cleans up the stack in case of an error?

Um, not sure yet. Please give me some time to evaluate your changes in a closer manner. Maybe something can be changed so the engine does not have to bother with a growing stack.

Oletus · 2020-09-28T14:00:51Z

I measured the difference this patch makes again, and the benefit is smaller now thanks to #19673 already fixing things. The measurement was done using Chrome's profiler. With #19673 applied but without this patch about 7.5% of CPU time is spent in initMaterial. With this patch all those calls disappear. I'd say that's still substantial enough to justify the patch, but what do you think?

Oletus · 2020-09-28T15:26:57Z

It's also worth mentioning that our app probably isn't the worst case when it comes to initMaterial calls - we have an optimization to use just a single camera in our cube map generation specifically to work around the issue fixed by this PR. At its worst the overhead could be ~6x more than the 7.5% of CPU time seen in our app.

Oletus · 2020-10-02T12:37:14Z

@Mugen87 any new thoughts about this? I still think the benefits are large enough especially for apps that render a lot of cube maps.

Mugen87 · 2020-10-02T13:22:03Z

Sorry, I had no time yet to continue my review.

Oletus · 2020-10-24T13:59:13Z

I still think this change or something like it has a lot of value - so here's a friendly ping @Mugen87 @mrdoob , any possibility of getting this merged? I could also work on an alternative version if there are any ideas how to implement this more cleanly.

src/renderers/webgl/WebGLRenderStates.js

Mugen87 · 2020-10-26T10:51:05Z

Apart form #20422 (comment), I was not able to figure out a better solution for now. The PR is an improvement compared to the previous render state handling.

The distinction of light uniforms into view-dependent and view-independent uniforms is something we definitely want.

Oletus · 2020-10-26T18:58:15Z

Addressed the comment, should be ready to merge. Thanks for the review!

Mugen87

I suggest to merge this at the beginning of a dev cycle. Just to be on the safe side^^.

Small TODO: We should verify if the following code block is still required in its current form:

three.js/src/renderers/webgl/WebGLCubeMaps.js

Lines 43 to 53 in 956292c

    
           const currentRenderList = renderer.getRenderList(); 
        
           const currentRenderTarget = renderer.getRenderTarget(); 
        
           const currentRenderState = renderer.getRenderState(); 
        
           const renderTarget = new WebGLCubeRenderTarget( image.height / 2 ); 
        
           renderTarget.fromEquirectangularTexture( renderer, texture ); 
        
           cubemaps.set( texture, renderTarget ); 
        
           renderer.setRenderTarget( currentRenderTarget ); 
        
           renderer.setRenderList( currentRenderList ); 
        
           renderer.setRenderState( currentRenderState );

It was necessary so far to store the current render target, list and state when converting from equirect to cube map and restore them after the process. It seems this is no longer necessary for the render state (because of the new stack usage).

@Oletus Can you confirm this? 😇

Mugen87 · 2020-10-26T19:58:24Z

If this is actually the case, we might want to consider the stack approach for render lists and targets, too (if possible). This would make nested renderings less error-prone.

mrdoob · 2020-11-22T15:04:08Z

Sorry we didn't get to merge this in this dev cycle. Lets try next cycle!

Could you resolve the conflicts?

Nested render calls now automatically restore the correct render state from the stack. getRenderState/setRenderState are not part of the public API so they can be safely removed from WebGLRenderer.

Oletus · 2020-11-24T11:59:44Z

@Mugen87 You're right, setting the render state in WebGLCubeMaps isn't required anymore, I pushed another commit that addresses that as well as fixing the merge conflict. I think the stack approach could make sense also for render lists, but I'd leave that to be addressed later.

The thing that I'm still most worried about with this change is that exceptions inside the renderer can cause the render state stack to grow, which could make dealing with programming errors trickier (including errors in callbacks).

Oletus · 2020-11-25T12:33:19Z

@mrdoob Should be ready to be merged!

Oletus · 2020-12-01T15:40:41Z

Ping @mrdoob , I think you wanted to get this merged in the beginning of this dev cycle?

mrdoob · 2020-12-08T16:23:13Z

Alright! Lets give it a try 🤞

mrdoob · 2020-12-08T16:25:13Z

I like this stack approach. I wish I would have come up with it 😁

Mugen87 · 2020-12-08T16:46:58Z

Me too^^.

@Oletus Cool PR!

Oletus · 2020-12-08T18:13:32Z

Thanks for getting this merged! 👍

Oletus force-pushed the defer-light-transform branch from 00b1a20 to 4033064 Compare September 27, 2020 08:49