[ET-VK] Add Vulkan ops for skin segmentation and EdgeTAM models by SS-JIA · Pull Request #17709 · pytorch/executorch

SS-JIA · 2026-02-25T15:26:56Z

Stack from ghstack (oldest at bottom):

Implement several missing Vulkan operators needed to reduce graph
fragmentation in the skin segmentation and EdgeTAM models.

**Skin segmentation ops:**

- aten.where.self: already had C++ and GLSL implementations but was
  missing the Python partitioner registration.
- aten.bitwise_and.Tensor: added as a new binary_op shader variant
  operating on uint8 (bool) tensors.
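
As a rough illustration of what these two ops compute element-wise, here is a plain-Python reference sketch. It is not the GLSL shader or the ExecuTorch API: the function names `bitwise_and_u8` and `where_select` and the sample data are hypothetical, and it assumes bool tensors are stored as uint8 values of 0 or 1, as the description states.

```python
def bitwise_and_u8(a, b):
    """Element-wise AND over bool masks stored as uint8 values (0 or 1)."""
    return [x & y for x, y in zip(a, b)]


def where_select(cond, a, b):
    """Element-wise select: take a[i] where cond[i] is nonzero, else b[i]."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


# Hypothetical data: two uint8-encoded bool masks and a score map.
skin_mask = [1, 0, 1, 1]
valid_mask = [1, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6]

combined = bitwise_and_u8(skin_mask, valid_mask)
masked_scores = where_select(combined, scores, [0.0] * len(scores))
```

For 0/1 masks, bitwise AND coincides with logical AND, which is presumably why a uint8 shader variant is enough for bool inputs.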

**EdgeTAM partitioning fixes:**

- Comparison ops (eq, lt, le, gt, ge): were registered under the
  generic BinaryOp features, which inherited FP_INT_T as the output
  dtype set. The partitioner correctly rejected these because their
  outputs are bool tensors. Split them into a dedicated
  register_comparison_ops registration with outputs_dtypes=BOOL_T. The
  binary_op.glsl shader already handles bool output via the
  IS_COMPARISON_OP path (uint8 storage), so no shader changes are
  needed.
- aten.copy.default: not in the op registry, causing a subgraph break
  in the first-frame model. This op appears when valid_num_points.to()
  is called with matching dtype (a no-op cast). Add it to
  RemoveRedundantOpsTransform so it is eliminated before the partitioner
  runs. Also register it as an ephemeral op as a fallback. The removal
  logic requires a _src_arg1_ops set to handle the copy.default(self,
  src) argument order, where the replacement target is args[1] (src)
  rather than args[0] (self) as in all other redundant ops.
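
The two fixes above can be sketched in plain Python. This is a toy model under stated assumptions, not the actual partitioner or RemoveRedundantOpsTransform code: the `Node` class, `outputs_supported`, `remove_redundant`, the contents of the dtype sets, and the use of `aten.clone.default` as an example of another redundant op are all hypothetical. Only the names `FP_INT_T`, `BOOL_T`, and `_src_arg1_ops` come from the description.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the partitioner's dtype sets.
FP_INT_T = frozenset({"float16", "float32", "int32"})
BOOL_T = frozenset({"bool"})


def outputs_supported(allowed_output_dtypes, actual_output_dtype):
    """Return True if the op's real output dtype is in its registered set."""
    return actual_output_dtype in allowed_output_dtypes


@dataclass
class Node:
    """Minimal graph node: a name, an op target, and positional args."""
    name: str
    target: str
    args: tuple = ()


# Assumed set of removable ops; only copy.default is named in the description.
_redundant_ops = {"aten.clone.default", "aten.copy.default"}
# Ops whose surviving value is args[1] (src) rather than args[0] (self).
_src_arg1_ops = {"aten.copy.default"}


def remove_redundant(nodes):
    """Drop redundant nodes and rewire their users to the surviving input."""
    replaced = {}  # removed node name -> node that replaces it
    kept = []
    for node in nodes:  # nodes are assumed to be in topological order
        # Point args at the replacements of any already-removed producers.
        node.args = tuple(
            replaced.get(a.name, a) if isinstance(a, Node) else a
            for a in node.args
        )
        if node.target in _redundant_ops:
            idx = 1 if node.target in _src_arg1_ops else 0
            replaced[node.name] = node.args[idx]
        else:
            kept.append(node)
    return kept


# copy.default(self=buf, src=x) followed by a clone and an add.
x = Node("x", "placeholder")
buf = Node("buf", "placeholder")
cp = Node("cp", "aten.copy.default", (buf, x))
cl = Node("cl", "aten.clone.default", (cp,))
out = Node("out", "aten.add.Tensor", (cl, x))
kept = remove_redundant([x, buf, cp, cl, out])
```

In this sketch, `outputs_supported(FP_INT_T, "bool")` is false while `outputs_supported(BOOL_T, "bool")` is true, mirroring why the comparison ops needed their own registration. After removal, `out` reads directly from `x`. Had the transform used args[0] for copy.default, users would have been rewired to `buf` (the destination) instead of the copied source.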

Differential Revision: D94364641

cc @manuelcandales @digantdesai @cbilgin


pytorch-bot · 2026-02-25T15:27:02Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17709

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit 88544f6 with merge base 63f9724:

NEW FAILURES - The following jobs have failed:

- pull / unittest-arm-backend-with-no-deps (test_pytest_ops_tosa) / linux-job (gh)
  RuntimeError: Command docker exec -t bfbe4f28a4ff9cf7c34c379a3c9602b527a6630b6d8601b088ed24fb47a90901 /exec failed with exit code 1
- Test CUDA Builds / test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job (gh)
  RuntimeError: Command docker exec -t 2bd6fd713cdb5f9421dbb6c6004f064fe12acb71dddd8829ba9264294445f551 /exec failed with exit code 55
- Test CUDA Builds / test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job (gh)
  RuntimeError: Command docker exec -t bf963c49a054550e876664ad81b7848f37b5df6e7dacd05142795a238d096646 /exec failed with exit code 55

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-02-25T15:28:07Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

ghstack-source-id: 344667759
Pull Request resolved: #17709

pytorch-bot Bot added the `module: vulkan` label (Issues related to the Vulkan delegate and code under backends/vulkan/) Feb 25, 2026

This was referenced Feb 25, 2026:

- [ET-VK][ez] Fix global workgroup size underflow for sub-4D tensors in block config dispatch #17707 (Merged)
- [ET-VK][q8_ops] Add int8x4_buffer_to_nchw shader and refactor Int8x4Staging #17708 (Merged)
- [ET-VK] Add aten.index.Tensor op and tests #17710 (Merged)

meta-cla Bot added the `CLA Signed` label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Feb 25, 2026

manuelcandales approved these changes Feb 25, 2026

SS-JIA merged commit 112a344 into gh/SS-JIA/451/base Feb 25, 2026
202 of 208 checks passed

SS-JIA deleted the gh/SS-JIA/451/head branch February 25, 2026 19:12

SS-JIA temporarily deployed to cherry-pick-bot February 25, 2026 19:12 — with GitHub Actions Inactive
