Remove unnecessary CUDA sync for better perf (#17594)
Conversation
Pull Request resolved: #17315 Right now we always run a CUDA sync before exiting cudabackend.execution(). However, that sync is only needed when copying data from GPU to CPU; operations issued on the same stream are implicitly ordered and do not need an explicit sync. We also introduced a new cuda_delegate_handle to remove CUDA-specific information from aoti_delegate_handle for a cleaner hierarchy. ghstack-source-id: 343032381 @exported-using-ghexport Differential Revision: [D92193164](https://our.internmc.facebook.com/intern/diff/D92193164/)
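A minimal sketch of the idea behind this change (hypothetical names; `scale` and `execute` stand in for the delegate's stream-ordered work and are not the actual ExecuTorch backend code): operations enqueued on the same CUDA stream execute in issue order on the device, so an explicit synchronize is only required at the point where the host must read data copied back from the GPU.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the delegate's GPU work.
__global__ void scale(float* data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= factor;
}

void execute(cudaStream_t stream, float* d_buf, float* h_out, int n) {
  // Both launches go onto the same stream, so they run in issue order
  // on the device -- no explicit sync is needed between them.
  scale<<<(n + 255) / 256, 256, 0, stream>>>(d_buf, 2.0f, n);
  scale<<<(n + 255) / 256, 256, 0, stream>>>(d_buf, 0.5f, n);

  // A sync is only needed once the host must observe device results,
  // i.e. after the device-to-host copy.
  cudaMemcpyAsync(h_out, d_buf, n * sizeof(float),
                  cudaMemcpyDeviceToHost, stream);
  cudaStreamSynchronize(stream);  // required before reading h_out on CPU
}
```

With this pattern, an unconditional sync at the end of every execution is wasted work whenever no device-to-host copy follows, which is what the PR removes.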
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17594
Note: Links to docs will display an error until the docs builds have completed.
❌ 5 New Failures, 1 Unrelated Failure as of commit abc1514 with merge base 9591a67.
NEW FAILURES: the following jobs have failed.
BROKEN TRUNK: the following job failed but was also failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #17315 by @Gasoonjia
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/gasoonjia/116/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/116/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/116/orig
Differential Revision: D92193164
@diff-train-skip-merge