Make PCF Loop Bounds Constant #22

CaffeineViking · 2018-12-28T17:49:48Z

I've found that we can save 3-4ms on the strand self-shadows of the ADSM approach if the PCF kernel radius is known at shader compilation time; it looks like glslc then unrolls the loop and can make more assumptions. I know for sure that this makes a difference on my laptop, but I still need to profile the benefit on the AMD machine (I've heard that performance can sometimes even decrease on GCN if the compiler unrolls loops too aggressively, so I'll profile to make sure this is worth it). For reference, on my laptop, in the worst case (i.e. hair occupies the entire screen), rendering the ponytail takes ~9.8ms. Without any shadows it takes 5.8ms, and without any shading at all it takes 4.8ms (an unavoidable cost of rasterizing 1.8M lines). So a performance breakdown gives us:

Rasterization: 4.8ms
Kajiya-Kay: 1ms
3x3 PCF ADSM: 4ms
Total cost: 9.8ms

With constant loop bounds the entire pass instead takes ~6.8ms, which gives these results:

Rasterization: 4.8ms
Kajiya-Kay: 1ms
3x3 PCF ADSM: 1ms
Total cost: 6.8ms

For further reference, on the beefy AMD machine, the unoptimized version (which took 9.8ms on my laptop) takes around 1.5ms for the entire pass (i.e. total cost is 1.5ms). I haven't tested the optimized version there yet, but if we assume the speedup translates to AMD hardware as well, we'd go from 1.5ms to ~1ms. This would leave us more room to do other cool things (and we still need to think about leaving time for the OIT), so I think this optimization is worth trying out.

I'll see if I can make the offending calculation go away, or just assume that we use 3x3 kernels all the time. One way would be to use specialization constants and re-compile the shaders whenever the PCF radius is modified.
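A minimal sketch of the specialization-constant idea, assuming a plain hardware-compared PCF lookup rather than the actual ADSM shader. The names `pcf_radius`, `shadow_map`, `light_space_position` and `filtered_shadow`, as well as the binding and location numbers, are hypothetical and only there for illustration:

```glsl
#version 450

// Hypothetical specialization constant. Its value is fixed when the pipeline
// is created, so the loop bounds below are known to the compiler.
// The default of 1 gives a 3x3 kernel.
layout(constant_id = 0) const int pcf_radius = 1;

// Assumed resources: a depth map sampled with hardware depth comparison.
layout(set = 0, binding = 0) uniform sampler2DShadow shadow_map;

layout(location = 0) in  vec4 light_space_position; // assumed vertex output
layout(location = 0) out vec4 color;

// Averages (2 * pcf_radius + 1)^2 depth comparisons around shadow_coord.
float filtered_shadow(vec3 shadow_coord) {
    vec2 texel_size = 1.0 / vec2(textureSize(shadow_map, 0));
    float visibility = 0.0;
    for (int y = -pcf_radius; y <= pcf_radius; ++y) {
        for (int x = -pcf_radius; x <= pcf_radius; ++x) {
            vec2 offset = vec2(x, y) * texel_size;
            // Returns the (possibly filtered) result of the depth comparison.
            visibility += texture(shadow_map,
                                  vec3(shadow_coord.xy + offset, shadow_coord.z));
        }
    }
    int kernel_width = 2 * pcf_radius + 1;
    return visibility / float(kernel_width * kernel_width);
}

void main() {
    // Assumes Vulkan clip space: depth is already in [0, 1], so only the
    // xy coordinates are remapped from [-1, 1] to [0, 1].
    vec3 shadow_coord = light_space_position.xyz / light_space_position.w;
    shadow_coord.xy = shadow_coord.xy * 0.5 + 0.5;
    color = vec4(vec3(filtered_shadow(shadow_coord)), 1.0);
}
```

On the host side the value would be supplied through a `VkSpecializationInfo` referenced from `VkPipelineShaderStageCreateInfo::pSpecializationInfo`, so changing the radius means creating a new pipeline. Whether the driver actually unrolls the loop is up to its compiler, which is why profiling on each target still matters.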

CaffeineViking · 2019-01-11T10:34:37Z

I tested this before, and the results were not as extreme on AMD hardware. Let's forget about this for now.

CaffeineViking added the Type: Refactor, Project: Renderer, Priority: Medium, and Type: Optimize labels Dec 28, 2018

CaffeineViking self-assigned this Dec 28, 2018

CaffeineViking added Priority: Low and removed Priority: Medium labels Jan 10, 2019

CaffeineViking closed this as completed Jan 11, 2019
