vello_hybrid: Conditionally split up rectangles into (up to) 5 smaller ones#1565

Merged
LaurenzV merged 1 commit into main from laurenz/split_rect on Apr 11, 2026

Conversation

@LaurenzV
Collaborator

In the fragment shader for the rectangle fast path, we check whether the rectangle has any fractional part, and if so, perform anti-aliasing calculations for the whole rectangle. The problem is that for a very large rectangle with some fractional edges, we also perform those calculations for the interior pixels, which is wasted work. We only really need to do this for the edge pixels.

So the idea is to split the rectangle into up to 5 parts such that the big inner part has no fractional edges, while the 4 outer rectangles (which have a width or height of only 1 pixel) can have fractional offsets. Doing this saves a lot of computation in the fragment shader, which is especially important on low-tier devices.
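A minimal sketch of the splitting step described above. The `Rect` type and the `split_rect` helper are hypothetical names made up for illustration and are not taken from the vello_hybrid source; the sketch also assumes `f32` edges in pixel coordinates. The inner rectangle is snapped inward to the pixel grid, and each fractional edge becomes its own strip lying within a single pixel row or column.

```rust
/// Axis-aligned rectangle in pixel coordinates (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

/// Splits `r` into one pixel-aligned inner rectangle plus up to four edge strips.
/// Returns the rectangle unchanged if no non-empty aligned interior exists.
fn split_rect(r: Rect) -> Vec<Rect> {
    // Snap the interior inward so that all of its edges land on integer coordinates.
    let (ix0, iy0) = (r.x0.ceil(), r.y0.ceil());
    let (ix1, iy1) = (r.x1.floor(), r.y1.floor());

    if ix0 >= ix1 || iy0 >= iy1 {
        return vec![r];
    }

    let mut parts = Vec::with_capacity(5);
    // Inner part: needs no anti-aliasing because it has no fractional edges.
    parts.push(Rect { x0: ix0, y0: iy0, x1: ix1, y1: iy1 });
    // Top and bottom strips span the full width, so they also cover the corners.
    if r.y0 < iy0 {
        parts.push(Rect { x0: r.x0, y0: r.y0, x1: r.x1, y1: iy0 });
    }
    if iy1 < r.y1 {
        parts.push(Rect { x0: r.x0, y0: iy1, x1: r.x1, y1: r.y1 });
    }
    // Left and right strips only span the height of the inner part.
    if r.x0 < ix0 {
        parts.push(Rect { x0: r.x0, y0: iy0, x1: ix0, y1: iy1 });
    }
    if ix1 < r.x1 {
        parts.push(Rect { x0: ix1, y0: iy0, x1: r.x1, y1: iy1 });
    }
    parts
}
```

A rectangle whose edges are already pixel-aligned yields a single part, which is why the result is "up to" 5 rectangles rather than always 5.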

With that said, there's no free lunch: some experimentation showed that for small rectangles, the overhead of having 5x as many rectangles can eclipse the savings in the fragment shader. Therefore, we only apply this optimization to rectangles above a certain size. It's impossible to come up with a threshold that is perfect everywhere, but on my low-tier Samsung Galaxy tablet the break-even point seems to be somewhere around 20-30 pixels, so I just went with 32 pixels here.
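The size gate can be sketched roughly as follows. The constant mirrors the one visible in the diff; the `should_split` helper is a hypothetical name, and both the requirement that width and height each reach the threshold and the use of `>=` are assumptions of this sketch, not something the PR description states.

```rust
/// Same name and value as the constant shown in the PR diff.
const LARGE_RECT_SPLIT_THRESHOLD: u16 = 32;

/// Hypothetical helper: decides whether a rectangle is large enough for the
/// split to be worthwhile.
/// Assumption: both dimensions must be at least the threshold.
fn should_split(width: u16, height: u16) -> bool {
    width >= LARGE_RECT_SPLIT_THRESHOLD && height >= LARGE_RECT_SPLIT_THRESHOLD
}
```

Under these assumptions, a 50x50 rectangle would be split, while a 20x200 one would still be drawn as a single rectangle.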

Here are some rough measurements I get before vs. after:

Rectangle size 50:

  • 100 rects: 60 FPS vs. 60 FPS
  • 500 rects: 41 FPS vs. 47 FPS
  • 1000 rects: 26 FPS vs. 32 FPS

Rectangle size 200:

  • 50 rects: 33 FPS vs. 41 FPS
  • 100 rects: 20 FPS vs. 26 FPS

Rectangle size 500:

  • 5 rects: 48 FPS vs. 55 FPS
  • 20 rects: 19 FPS vs. 25 FPS
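To compare these numbers on a common scale, it can help to convert FPS into frame times and relative gains. The two helpers below are hypothetical and only do the arithmetic; they are not part of the PR.

```rust
/// Frame time in milliseconds for a given frame rate.
fn frame_time_ms(fps: f64) -> f64 {
    1000.0 / fps
}

/// Relative FPS gain in percent when going from `before` to `after`.
fn fps_gain_percent(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

For example, going from 20 FPS to 26 FPS (100 rects of size 200) is a 30% gain, and the frame time drops from 50 ms to roughly 38.5 ms.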

@LaurenzV LaurenzV requested a review from taj-p April 10, 2026 09:25
Contributor

@taj-p taj-p left a comment

LGTM! This makes sense and is corroborated by what I learned from the opaque pass investigation.

const RECT_STRIP_FLAG: u32 = 1 << 31;
/// The threshold of the rectangle size after which a rectangle should be split up
/// into multiple smaller ones.
const LARGE_RECT_SPLIT_THRESHOLD: u16 = 32;
Contributor

Do you mind if I decrease this to 20 when I release the opaque pass PR? If we don't split up these rects, the opaque pass has no effect on fast path rects because they're all opaque, so the size at which splitting becomes beneficial decreases.

Collaborator Author

Sure, I don't mind, if it makes sense for you!

Collaborator Author

Though it seems to me like having such small rectangles isn't very common (except for cached glyphs, which basically always have transparency), so not sure how big the gain would be in practice.

.flatten()
{
let (payload, paint_packed) =
Scheduler::process_paint(&rect.paint, encoded_paints, (part.x, part.y), paint_idxs);
Contributor

Context sharing: I believe this is an opportunity for improvement (perhaps across hybrid). For every draw command, we push a unique paint into the paint atlas. I'm wondering whether we can dedupe, or whether paint-heavy scenes should pass the paint via the vertex buffer (if so, then your draw_image API may make more sense because it doesn't carry the weight of padding/extend modes).

Collaborator Author

Hmm I don't have concrete ideas yet, but yeah, perhaps there's something that can be done about it!

@LaurenzV LaurenzV added this pull request to the merge queue Apr 11, 2026
Merged via the queue into main with commit 03a5d8e Apr 11, 2026
17 checks passed
@LaurenzV LaurenzV deleted the laurenz/split_rect branch April 11, 2026 05:25

2 participants