-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First cut at clipping #44
Conversation
Actually handle transforms in RenderCtx (was implemented in renderer but not actually plumbed through). This also requires maintaining a state stack, which will also be required for clipping. This PR also starts work on encoding clipping, including tracking bounding boxes. WIP, none of this is tested yet.
I realized there's a problem with encoding clip bboxes relative to the current transform (see #36 for a more detailed explanation), so this is changing it to absolute bboxes. This more or less gets clips working. There are optimization opportunities (all-clear and all-opaque mask tiles), and it doesn't deal with overflow of the blend stack, but it seems to basically work.
@@ -27,6 +27,8 @@ layout(rgba8, set = 0, binding = 2) uniform writeonly image2D image; | |||
#include "ptcl.h" | |||
#include "tile.h" | |||
|
|||
#define BLEND_STACK_SIZE 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you lift this restriction before replacing FillMask?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, but I'm not 100% sure when. On my AMD 5700 XT card, the static limit can be increased quite a bit without seriously impacting performance (utilization goes down as register pressure goes up). Using a scratch buffer in global memory can be done in two stages - malloc-only (using an atomic bump counter, as is currently done for dynamic allocations), and a second malloc/free stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the FillMask stopgap co-exist with the new clipping? I'd hoped not to have hard limits such as this when adding piet-gpu as a Gio backend. I'll probably take a stab at #40, but I think this limit is worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add FillMask back (I mostly preserved it, but took some of it out in coarse). After I upload, let me know if it's still working for you, I don't have easy ways to test.
I do want to add unbounded stacks, and I think the malloc-only version is not that hard. It's possible I'll do that soon, and then we can have another discussion about FillMask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussion, don't remove FillMask until we get unbounded clip stacks.
// - the tile is completely covered (backdrop non-zero) | ||
// - the tile is not covered and we're filling everything outside the path (backdrop zero, inverse fills). | ||
bool inside = tile.backdrop != 0; | ||
bool fill = tag != Annotated_FillMaskInv; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check needs restoring.
Confirmed that FillMask still works with this branch. |
Good to know. I'll take that as a go-ahead to commit when I feel it's appropriate. (FYI, generally the strategy I use is that my local branch is one commit ahead. When I'm ready to PR from that, I merge the one behind it into master). I've started the scratch buffer (malloc-only) and do not think it will be very hard. I'll also comment on the dynamic allocation issue. |
This is a significant amount of progress towards #36. Bounding boxes for clips are accumulated (in world coordinates, see that issue for more details) CPU-side and encoded in the BeginClip and EndClip elements. These are carried through the pipeline in a straightforward way, then there is a blend stack in the fine rasterizer.
Here are the cleanups not yet done:
This PR also encodes transforms (they have always worked, just weren't encoded).