vello_shaders: Guard inactive clip_leaf lanes#1637

Merged
b0nes164 merged 1 commit into
linebender:mainfrom
gugutu:fix/clip-leaf-inactive-lanes
May 14, 2026

Conversation

@gugutu
Contributor

@gugutu gugutu commented May 12, 2026

This guards inactive clip_leaf shader lanes when the dispatch size is rounded up past config.n_clip.

load_clip_path already returns a sentinel value for lanes where global_id.x >= config.n_clip, but those lanes could still participate in the prefix/search logic and reach shared-memory link reads. In particular, search_link can produce values derived from inactive lane state, and the previous select(link - 1, sh_link[link], link >= 0) form still exposes a potentially invalid sh_link[link] expression to the shader compiler.
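To make the rounding concrete, here is a minimal CPU-side sketch in Rust of how a rounded-up dispatch produces inactive lanes. The workgroup size of 256, the sentinel value `i32::MIN`, and the slice-based `load_clip_path` signature are assumptions chosen for illustration; they are not copied from clip_leaf.wgsl.

```rust
// Assumed values for illustration only; not taken from the Vello sources.
const WG_SIZE: u32 = 256;
const INACTIVE_PATH: i32 = i32::MIN;

/// Number of workgroups needed to cover `n_clip` entries, rounded up.
fn workgroup_count(n_clip: u32) -> u32 {
    n_clip.div_ceil(WG_SIZE)
}

/// CPU stand-in for the shader's `load_clip_path`: active lanes read their
/// entry, lanes at or past `n_clip` get the sentinel instead.
fn load_clip_path(clip_inp: &[i32], n_clip: u32, ix: u32) -> i32 {
    if ix < n_clip {
        clip_inp[ix as usize]
    } else {
        INACTIVE_PATH
    }
}

fn main() {
    let clip_inp = vec![3, -4, 7];
    let n_clip = clip_inp.len() as u32;
    // The dispatch covers whole workgroups, so it overshoots `n_clip`.
    let lanes = workgroup_count(n_clip) * WG_SIZE;
    let inactive = (0..lanes)
        .filter(|&ix| load_clip_path(&clip_inp, n_clip, ix) == INACTIVE_PATH)
        .count();
    println!("{lanes} lanes dispatched, {inactive} inactive");
}
```

With three clip entries, a single workgroup of 256 lanes is dispatched and 253 of them only ever see the sentinel. Those are the lanes this PR keeps away from the shared-memory link reads.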

On Android/Vulkan, this was observed to make scenes involving clip layers fail to render, producing a black frame. A reduced WGPU reproduction showed the same pattern: even when a value is logically guarded, keeping a potentially invalid indexed read inside select can still be enough for the shader/backend to misbehave.

This change makes inactive lanes contribute a neutral Bic, skips predecessor lookup for them, and replaces the select with explicit control flow before reading sh_link[link].

This avoids invalid accesses from inactive lanes while preserving the behavior for active clip entries. With this change applied, the affected Android/Vulkan clipping scene renders correctly again.
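The difference between the two forms can be sketched on the CPU in Rust. This is an analogy, not a model of WGSL semantics: an out-of-bounds slice index in Rust panics, which a shader does not do, so the eager variant uses `get` to surface the invalid read as `None`. The function names `grandparent_guarded` and `grandparent_eager` are hypothetical.

```rust
/// Eager "select" with the same argument order as WGSL `select(f, t, cond)`:
/// both candidate values are already computed when it is called.
fn select(f: i32, t: i32, cond: bool) -> i32 {
    if cond { t } else { f }
}

/// Guarded lookup: `sh_link` is indexed only after `link >= 0` is known.
fn grandparent_guarded(sh_link: &[i32], link: i32) -> i32 {
    let mut grandparent = link - 1;
    if link >= 0 {
        grandparent = sh_link[link as usize];
    }
    grandparent
}

/// Eager form: the indexed read is an operand of `select`, so it is
/// evaluated even when `link` is negative. `get` reports that as `None`.
fn grandparent_eager(sh_link: &[i32], link: i32) -> Option<i32> {
    let read = *sh_link.get(link as usize)?;
    Some(select(link - 1, read, link >= 0))
}

fn main() {
    let sh_link = [-1, 0, 1, 1];
    println!("{}", grandparent_guarded(&sh_link, 2));
    println!("{}", grandparent_guarded(&sh_link, -1));
    println!("{:?}", grandparent_eager(&sh_link, -1));
}
```

For a valid `link` both forms agree. For a negative `link` the guarded form falls back to `link - 1` without touching `sh_link`, while the eager form has to attempt the read before the condition can discard it.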

Verified locally with:

  • cargo fmt --all --check
  • cargo check -p vello_shaders
  • cargo test -p vello_shaders

Collaborator

@waywardmonkeys waywardmonkeys left a comment

I think this is correct, but would appreciate a bit more review from someone more familiar with this aspect of shaders.

  workgroupBarrier();
- let grandparent = select(link - 1, sh_link[link], link >= 0);
+ var grandparent = link - 1;
+ if link >= 0 {
Collaborator

I think it would be useful to have a comment here about this avoiding a materialization of the index operands on some backends.

Contributor Author

Makes sense, I added a comment explaining why this uses explicit control flow instead of select.

@b0nes164
Member

I'll take a look tonight

@gugutu gugutu force-pushed the fix/clip-leaf-inactive-lanes branch from cbd7ead to 7ffa5d7 Compare May 13, 2026 18:15
Member

@b0nes164 b0nes164 left a comment

LGTM, modulo some nits. Do you have a reproducer test case and device for this issue?

Comment thread vello_shaders/shader/clip_leaf.wgsl Outdated
      @builtin(workgroup_id) wg_id: vec3<u32>,
  ) {
-     var bic: Bic;
+     var bic = Bic(0u, 0u);
Member

Nit: This is unnecessary. If a variable declaration has no initializer, the variable is default-initialized.

Comment thread vello_shaders/shader/clip_leaf.wgsl Outdated
Comment on lines +134 to +138
+ if global_id.x < config.n_clip {
+     bic = Bic(1u - u32(is_push), u32(is_push));
+ } else {
+     bic = Bic(0u, 0u);
+ }
Member

@b0nes164 b0nes164 May 14, 2026

This should be removed.

This is already guarded inside load_clip_path, which results in bic = (1, 0) for inactive lanes. Since the scan is ascending, this has no effect on the scan results for valid threads. In search_link the search is exclusively backwards, so again, valid threads will always search through valid data.

Since we are already guarding the indexing into sh_link by guarding search_link, this provides no additional protection.
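The ascending-scan argument can be checked with a small CPU sketch in Rust. It uses a sequential inclusive scan as a stand-in for the workgroup scan, and the `bic_combine` rule shown (cancel `min(x.b, y.a)` matched pairs) is an assumption about the Bic monoid rather than a copy of the shader source. The point is only that a prefix at index i depends on elements 0..=i, so the value carried by trailing inactive lanes cannot change any active lane's result.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bic {
    a: u32, // unmatched pops
    b: u32, // unmatched pushes
}

/// Assumed combine rule: pushes on the left cancel against pops on the right.
fn bic_combine(x: Bic, y: Bic) -> Bic {
    let m = x.b.min(y.a);
    Bic { a: x.a + y.a - m, b: x.b + y.b - m }
}

/// Sequential inclusive scan; element i of the output covers input 0..=i.
fn inclusive_scan(input: &[Bic]) -> Vec<Bic> {
    let mut acc = Bic { a: 0, b: 0 };
    input
        .iter()
        .map(|&bic| {
            acc = bic_combine(acc, bic);
            acc
        })
        .collect()
}

fn main() {
    let push = Bic { a: 0, b: 1 };
    let pop = Bic { a: 1, b: 0 };
    let active = [push, push, pop];

    // Pad the inactive tail two ways: the sentinel-derived (1, 0) and a neutral (0, 0).
    let mut with_sentinel = active.to_vec();
    with_sentinel.extend([Bic { a: 1, b: 0 }; 5]);
    let mut with_neutral = active.to_vec();
    with_neutral.extend([Bic { a: 0, b: 0 }; 5]);

    let n = active.len();
    let s = inclusive_scan(&with_sentinel);
    let t = inclusive_scan(&with_neutral);
    println!("active prefixes equal: {}", s[..n] == t[..n]);
    println!("tails differ: {}", s[n..] != t[n..]);
}
```

Only the inactive tail of the scan differs between the two paddings, and no valid thread reads that tail when it searches backwards.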

Contributor Author

Thanks, updated! I removed the redundant bic initialization and the extra guard around the bic assignment.

@gugutu gugutu force-pushed the fix/clip-leaf-inactive-lanes branch from 7ffa5d7 to f2a6d68 Compare May 14, 2026 11:20
@gugutu
Contributor Author

gugutu commented May 14, 2026

> LGTM, modulo some nits. Do you have a reproducer test case and device for this issue?

Yes, I made a small Android repro here:

https://github.com/gugutu/vello-clip-repro-android

It reproduces on my Xiaomi 23127PN0CC / houji device with Adreno 750 on Android 16. I included the device/driver details and the observed failure modes in the README. With the amended fix branch, the same repro runs through the 120-frame burst cleanly on that device.

@waywardmonkeys
Collaborator

@gugutu Just curious how you found this. Seems like it must've been a bit of an adventure.

@b0nes164 b0nes164 added this pull request to the merge queue May 14, 2026
@gugutu
Contributor Author

gugutu commented May 14, 2026

> @gugutu Just curious how you found this. Seems like it must've been a bit of an adventure.

It was a bit of an adventure, though honestly not too bad — I had some AI help with the debugging. :)

I hit this while experimenting with a small Rust Android app using Vello directly. The app would consistently black-screen and freeze on my device. With the AI helping me narrow things down, I reduced it to a tiny clipped scene, added some validation scopes, and eventually traced it to this clip shader path.

Merged via the queue into linebender:main with commit b874583 May 14, 2026
33 of 34 checks passed