A bit of clip optimization #88

raphlinus · 2021-05-01T05:05:59Z

I was curious about whether the clip stack really needs to store both rgba and another alpha channel, so had another look at that logic. I think it is possible to get rid of the extra channel, with just a bit of cleverness.

The tricky bit is that the coarse rasterizer needs the path at BeginClip, so it can do the optimizations, but the fine rasterizer only needs it at EndClip. The basic idea is to make it available to both.

Encoding: the EndClip element is annotated with the number of paths inside the clip, i.e., the path count at BeginClip plus this quantity equals the path count at EndClip. This will require just a bit of accounting, but hopefully not too much. Also note that by storing the difference (as opposed to an absolute path count), the encoding is no less "relocatable" than before.
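To make the accounting concrete, here is a minimal sketch in Python. Everything in it is hypothetical (`SceneEncoder`, `Element`, and `path_delta` are illustration names, not piet-gpu types), and it assumes every element, clip elements included, consumes one path slot, so the delta also counts the BeginClip path itself.

```python
from dataclasses import dataclass


@dataclass
class Element:
    kind: str            # "fill", "begin_clip", or "end_clip"
    path_delta: int = 0  # only meaningful for "end_clip"


class SceneEncoder:
    """Hypothetical encoder; assumes one path slot per element."""

    def __init__(self):
        self.elements = []
        self.path_count = 0
        self._open_clips = []  # path count recorded at each open BeginClip

    def fill(self):
        self.elements.append(Element("fill"))
        self.path_count += 1

    def begin_clip(self):
        self._open_clips.append(self.path_count)
        self.elements.append(Element("begin_clip"))
        self.path_count += 1

    def end_clip(self):
        start = self._open_clips.pop()
        # Store the difference, not the absolute count, so the value
        # does not change when the subtree is moved within a scene.
        self.elements.append(Element("end_clip", self.path_count - start))
        self.path_count += 1


def encode_clip_subtree(enc):
    enc.begin_clip()
    enc.fill()
    enc.fill()
    enc.end_clip()


alone = SceneEncoder()
encode_clip_subtree(alone)

shifted = SceneEncoder()
for _ in range(5):
    shifted.fill()  # unrelated content before the clip
encode_clip_subtree(shifted)
```

Both encoders store the same delta on their EndClip element, which is the relocatability property: the annotation depends only on the contents of the clip, not on where the clip sits in the scene.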

Coarse rasterization, input stage: for EndClip, the path_ix is not just the element_ix (coarse.comp line 236), but the delta encoded above is subtracted. Thus, we actually read the BeginClip path, and the path associated with EndClip is ignored.
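The index adjustment can be sketched as follows. This is an illustration in Python rather than the GLSL in coarse.comp; `coarse_path_ix` and the tuple layout are hypothetical, and it assumes one path slot per element, with the delta measured from the BeginClip path to the EndClip path.

```python
def coarse_path_ix(element_ix, kind, path_delta=0):
    """Return the path index the coarse rasterizer would read (sketch)."""
    if kind == "end_clip":
        # Step back to the matching BeginClip's path; the path slot
        # belonging to EndClip itself is never read.
        return element_ix - path_delta
    return element_ix


# (kind, path_delta) pairs; the delta is only used for end_clip.
elements = [
    ("fill", 0),
    ("begin_clip", 0),
    ("fill", 0),
    ("fill", 0),
    ("end_clip", 3),
]

path_ixs = [
    coarse_path_ix(ix, kind, delta)
    for ix, (kind, delta) in enumerate(elements)
]
```

In this example the BeginClip at element 1 and the EndClip at element 4 both resolve to path 1, so both stages see the same path and the same bounding box.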

Coarse rasterization, output stage: BeginClip has the same optimization logic, but does not write the fill. EndClip writes the fill right before writing the EndClip command.

Fine rasterization: BeginClip does not push an alpha value. EndClip does not pop the alpha value, but is otherwise basically unchanged; it already composites using area[k], which at present is 1.0 because coarse rasterization always sends a Solid command before the EndClip.
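A single-pixel sketch of the proposed fine rasterizer behavior, in Python for readability. The command names and tuple layout are hypothetical stand-ins for the real command list, colors are assumed to be premultiplied RGBA, and the blend is a plain source-over scaled by coverage. The point it illustrates is that the clip stack holds only rgba, and EndClip takes its coverage from the fill that coarse rasterization writes immediately before it.

```python
TRANSPARENT = (0.0, 0.0, 0.0, 0.0)


def over(bg, fg, coverage):
    """Composite premultiplied fg over bg, with fg scaled by coverage."""
    r, g, b, a = (c * coverage for c in fg)
    k = 1.0 - a
    return (bg[0] * k + r, bg[1] * k + g, bg[2] * k + b, bg[3] * k + a)


def fine_pixel(cmds):
    rgba = TRANSPARENT
    area = 0.0
    clip_stack = []  # rgba only; no separate alpha channel is stored
    for cmd in cmds:
        op = cmd[0]
        if op == "fill":
            area = cmd[1]  # coverage of the most recent path
        elif op == "solid":
            area = 1.0
        elif op == "color":
            rgba = over(rgba, cmd[1], area)
        elif op == "begin_clip":
            clip_stack.append(rgba)  # save the background, push no alpha
            rgba = TRANSPARENT
        elif op == "end_clip":
            bg = clip_stack.pop()
            # area was set by the clip path's fill just before this command.
            rgba = over(bg, rgba, area)
    return rgba


cmds = [
    ("solid",), ("color", (1.0, 0.0, 0.0, 1.0)),  # opaque red background
    ("begin_clip",),
    ("solid",), ("color", (0.0, 0.0, 1.0, 1.0)),  # opaque blue inside clip
    ("fill", 0.25),                               # clip path coverage
    ("end_clip",),
]
result = fine_pixel(cmds)
```

Here the clip path covers a quarter of the pixel, so a quarter of the blue layer is blended over the saved red background.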

A couple other notes. The do-while in kernel4 Cmd_Fill should be a while loop, as there are cases when coarse rasterization can send a tile with no path segments (when the nesting depth exceeds 32; this is unusual but possible).
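The difference between the two loop shapes shows up only for an empty segment list. The sketch below models it in Python under some assumptions: segments form a linked list in a flat `memory` array, a reference of 0 means "no more segments", and each segment contributes a precomputed coverage value. These names and the layout are illustrative, not the kernel4 data structures.

```python
def coverage_do_while(memory, ref):
    """Emulates do { ... } while (ref != 0): the body always runs once."""
    total = 0.0
    while True:
        contribution, next_ref = memory[ref]
        total += contribution
        ref = next_ref
        if ref == 0:
            break
    return total


def coverage_while(memory, ref):
    """Emulates while (ref != 0) { ... }: safe for an empty list."""
    total = 0.0
    while ref != 0:
        contribution, next_ref = memory[ref]
        total += contribution
        ref = next_ref
    return total


# Slot 0 holds unrelated data; slots 1 and 2 form a two-segment list.
memory = [(99.0, 0), (0.25, 2), (0.5, 0)]

two_segments = (coverage_do_while(memory, 1), coverage_while(memory, 1))
no_segments = (coverage_do_while(memory, 0), coverage_while(memory, 0))
```

With two segments both versions agree. With no segments the do-while version reads slot 0 and accumulates whatever happens to be stored there, while the while version correctly returns zero.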

Lastly, I think using the same path (and thus the same bbox) for begin and end of clips might address the problem I was talking about in this comment, and allow the use of relative bounding boxes again, which would increase the compositionality of encoded scene subtrees.

I'm not 100% sure this will work, but I think so. I also reviewed the current separation between path alpha and color source, and that's looking good for gradients; thanks Elias!

This PR reworks the clip implementation. The highlight is that clip bounding box accounting is now done on the GPU rather than the CPU. The clip mask is also rasterized on EndClip rather than BeginClip, which decreases the memory traffic needed for the clip stack. This is a pretty good working state, but not all cleanup has been applied. An important next step is to remove the CPU clip accounting (it is computed and encoded, but that result is not used). Another step is to remove the Annotated structure entirely. Fixes #88. Also relevant to #119.

raphlinus mentioned this issue May 8, 2021

Add stack monoid implementation #90

Closed

raphlinus mentioned this issue Nov 1, 2021

New element processing pipeline #119

Closed

raphlinus mentioned this issue Feb 18, 2022

New clip implementation #150

Merged

raphlinus closed this as completed in #150 Feb 21, 2022
