vello_shaders: Guard inactive clip_leaf lanes#1637
Conversation
waywardmonkeys
left a comment
There was a problem hiding this comment.
I think this is correct, but would appreciate a bit more review from someone more familiar with this aspect of shaders.
| workgroupBarrier(); | ||
| let grandparent = select(link - 1, sh_link[link], link >= 0); | ||
| var grandparent = link - 1; | ||
| if link >= 0 { |
There was a problem hiding this comment.
I think it would be useful to have a comment here about this avoiding a materialization of the index operands on some backends.
There was a problem hiding this comment.
Makes sense, I added a comment explaining why this uses explicit control flow instead of select.
|
I'll take a look tonight |
cbd7ead to
7ffa5d7
Compare
b0nes164
left a comment
There was a problem hiding this comment.
LGTM, modulo some nits. Do you have a reproducer test case and device for this issue?
| @builtin(workgroup_id) wg_id: vec3<u32>, | ||
| ) { | ||
| var bic: Bic; | ||
| var bic = Bic(0u, 0u); |
There was a problem hiding this comment.
Nit: This is unnecessary. If a variable declaration has no initializer, the variable is default initialized
| if global_id.x < config.n_clip { | ||
| bic = Bic(1u - u32(is_push), u32(is_push)); | ||
| } else { | ||
| bic = Bic(0u, 0u); | ||
| } |
There was a problem hiding this comment.
This should be removed.
This is already guarded inside load_clip_path which results in bic = (1, 0). Since the scan is ascending, this has no effect on the scan results for valid threads. In search_link the search is exclusively backwards, and so again, valid threads will always search through valid data.
Since we already guarding indexing into sh_link by guarding search_link, this provides no additional protection.
There was a problem hiding this comment.
Thanks, updated! I removed the redundant bic initialization and the extra guard around the bic assignment.
7ffa5d7 to
f2a6d68
Compare
Yes, I made a small Android repro here: https://github.com/gugutu/vello-clip-repro-android It reproduces on my Xiaomi 23127PN0CC / houji device with Adreno 750 on Android 16. I included the device/driver details and the observed failure modes in the README. With the amended fix branch, the same repro runs through the 120-frame burst cleanly on that device. |
|
@gugutu Just curious how you found this. Seems like it must've been a bit of an adventure. |
It was a bit of an adventure, though honestly not too bad — I had some AI help with the debugging. :) I hit this while experimenting with a small Rust Android app using Vello directly. The app would consistently black-screen and freeze on my device. With the AI helping me narrow things down, I reduced it to a tiny clipped scene, added some validation scopes, and eventually traced it to this clip shader path. |
This guards inactive
clip_leafshader lanes when the dispatch size is rounded up pastconfig.n_clip.load_clip_pathalready returns a sentinel value for lanes whereglobal_id.x >= config.n_clip, but those lanes could still participate in the prefix/search logic and reach shared-memory link reads. In particular,search_linkcan produce values derived from inactive lane state, and the previousselect(link - 1, sh_link[link], link >= 0)form still exposes a potentially invalidsh_link[link]expression to the shader compiler.On Android/Vulkan, this was observed to make scenes involving clip layers fail to render, producing a black frame. A reduced WGPU reproduction showed the same pattern: even when a value is logically guarded, keeping a potentially invalid indexed read inside
selectcan still be enough for the shader/backend to misbehave.This change makes inactive lanes contribute a neutral
Bic, skips predecessor lookup for them, and replaces theselectwith explicit control flow before readingsh_link[link].This avoids invalid accesses from inactive lanes while preserving the behavior for active clip entries. With this change applied, the affected Android/Vulkan clipping scene renders correctly again.
Verified locally with:
cargo fmt --all --checkcargo check -p vello_shaderscargo test -p vello_shaders