Use cuda version PT when build with CUDA delegate #14355

Gasoonjia · 2025-09-16T22:51:57Z

This PR installs the CUDA build of PyTorch (PT) when ExecuTorch (ET) is built from source with the CUDA delegate enabled.

More specifically:

1. ET keeps depending on the CPU build of PT as long as the CUDA delegate is not enabled.
2. When the CUDA delegate is enabled, we choose the CUDA build of PT that exactly matches the user's CUDA version. If the user doesn't have CUDA, or has a CUDA version that doesn't exactly match one PT supports, the installation script raises an error.
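
The selection rule described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `choose_torch_url`, its parameters, and the example version tuples are hypothetical, and the `cu<major><minor>` suffix follows the naming PyTorch uses for its wheel indexes.

```python
# Hypothetical list of supported CUDA versions, for illustration only.
EXAMPLE_SUPPORTED_CUDA_VERSIONS = ((12, 6), (12, 8), (12, 9))


def choose_torch_url(cuda_delegate_enabled, detected_cuda, supported_cuda_versions, url_base):
    """Return the wheel index URL for the PyTorch build to install.

    detected_cuda is a (major, minor) tuple, or None when no CUDA was found.
    """
    # Rule 1: without the CUDA delegate, stay on the CPU build.
    if not cuda_delegate_enabled:
        return f"{url_base}/cpu"

    # Rule 2: with the CUDA delegate, require an exact version match.
    if detected_cuda is None:
        raise RuntimeError("CUDA delegate is enabled but no CUDA installation was detected.")
    if detected_cuda not in supported_cuda_versions:
        supported = ", ".join(f"{ma}.{mi}" for ma, mi in supported_cuda_versions)
        found = f"{detected_cuda[0]}.{detected_cuda[1]}"
        raise RuntimeError(f"CUDA {found} is not supported; expected one of: {supported}")

    major, minor = detected_cuda
    return f"{url_base}/cu{major}{minor}"
```

Raising an error instead of silently falling back to the CPU build keeps a mismatched toolchain from producing a build that only fails later.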

pytorch-bot · 2025-09-16T22:52:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14355

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 7 Unrelated Failures

As of commit 19c2fb2 with merge base a548635:

NEW FAILURES - The following jobs have failed:

pull / test-samsung-models-linux / linux-job (gh)
RuntimeError: Command docker exec -t 9b9e9d6c06f46d5426b60877948ea53d136332a09ade1fef6122d384d00c086e /exec failed with exit code 1
Test CUDA Builds / check-all-cuda-builds (gh)
Process completed with exit code 1.
Test CUDA Builds / test-executorch-cuda-build-12.9 / linux-job (gh)
RuntimeError: Command docker exec -t 6efaecb9972e9758528b0ee2937f8d19bcd405be930b572d33c9786fc1444433 /exec failed with exit code 1
trunk / test-arm-ootb-linux / linux-job (gh)
RuntimeError: Command docker exec -t ae5f5b101bccc348a9d5ce9e11c1e0aa674d2796fd38a046dadafe024957288a /exec failed with exit code 1

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

docker-builds / docker-build (linux.2xlarge, executorch-ubuntu-22.04-gcc9) (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-binary-size-linux / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-llama-runner-qnn-linux (fp32, qnn_8a8w, qnn) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-models-linux (emformer_transcribe, portable, linux.2xlarge) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-qnn-wheel-packages-linux (3.11) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

trunk / unittest-release / linux / linux-job (gh) (trunk failure)
[ FAILED ] LoggingTest.Utf8Truncation
trunk / unittest-release / macos / macos-job (gh) (trunk failure)
[ FAILED ] LoggingTest.Utf8Truncation

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-09-16T22:52:36Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

larryliu0820 · 2025-09-17T06:25:25Z

install_requirements.py

+def _check_cuda_enabled():
+    """Check if CUDA delegate is enabled via CMAKE_ARGS environment variable."""
+    cmake_args = os.environ.get("CMAKE_ARGS", "")
+    return "-DEXECUTORCH_BUILD_CUDA=ON" in cmake_args
+
+
+def _cuda_version_to_pytorch_suffix(major, minor):


Can we pull all cuda related functions into util?
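
For illustration, a utility module grouping the CUDA helpers might look like the sketch below. It is not the code that landed: the function bodies are assumptions, reading the version from `nvcc --version` output is assumed rather than confirmed by this PR, and the flag check splits `CMAKE_ARGS` into tokens instead of using the substring test shown in the diff.

```python
import re
import shlex


def check_cuda_enabled(cmake_args):
    """Return True if the CUDA delegate flag appears as a whole CMake argument."""
    # Token-based check: avoids matching a longer argument that merely
    # contains the flag as a prefix.
    return "-DEXECUTORCH_BUILD_CUDA=ON" in shlex.split(cmake_args)


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent.

    nvcc prints a line such as:
    "Cuda compilation tools, release 12.6, V12.6.85"
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_version_to_pytorch_suffix(major, minor):
    """Map a CUDA version to a PyTorch wheel suffix, e.g. (12, 6) -> "cu126"."""
    return f"cu{major}{minor}"
```

Keeping the parsing functions free of subprocess and environment access makes them easy to unit test; the caller can pass in `os.environ.get("CMAKE_ARGS", "")` and the captured `nvcc` output.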

larryliu0820 · 2025-09-17T06:25:59Z

install_requirements.py

+_torch_url = ""
+
+
+def _determine_torch_url():


This seems like it belongs to util as well

JacobSzwejbka · 2025-09-19T18:05:54Z

install_utils.py

+_torch_url_cache = ""
+
+
+def determine_torch_url(torch_nightly_url_base, supported_cuda_versions):


This function is only called twice; is it really necessary to cache?

The main reason to cache is to avoid printing too much noisy output.

Use `@functools.lru_cache`.
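
A rough sketch of the suggestion, assuming the function signature shown in the diff. The function body, `_detect_cuda_version`, and the version values are hypothetical stand-ins. Because `functools.lru_cache` hashes its arguments, the supported versions must be passed as a tuple and not a list, which matches the later commit that makes `SUPPORTED_CUDA_VERSIONS` a tuple.

```python
import functools

# Hypothetical values for illustration; a tuple (not a list) so it is hashable.
SUPPORTED_CUDA_VERSIONS = ((12, 6), (12, 8), (12, 9))


def _detect_cuda_version():
    """Stand-in for real detection; always reports CUDA 12.8 in this sketch."""
    print("Detecting CUDA version...")  # the noisy output we want only once
    return (12, 8)


@functools.lru_cache(maxsize=None)
def determine_torch_url(torch_nightly_url_base, supported_cuda_versions):
    """Compute the wheel index URL once per distinct argument pair."""
    major, minor = _detect_cuda_version()
    if (major, minor) not in supported_cuda_versions:
        raise RuntimeError(f"CUDA {major}.{minor} is not supported")
    return f"{torch_nightly_url_base}/cu{major}{minor}"


if __name__ == "__main__":
    base = "https://download.pytorch.org/whl/nightly"
    first = determine_torch_url(base, SUPPORTED_CUDA_VERSIONS)   # runs detection
    second = determine_torch_url(base, SUPPORTED_CUDA_VERSIONS)  # served from cache
    print(first == second, determine_torch_url.cache_info())
```

Compared with a module-level cache variable, the decorator removes the global state and the manual "already computed" check; passing a list would raise `TypeError` because lists are unhashable.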


meta-cla bot added the CLA Signed label Sep 16, 2025

larryliu0820 reviewed Sep 17, 2025


Gasoonjia requested review from kirklandsign, GregoryComer, swolchok, JacobSzwejbka, lucylq, manuelcandales, jackzhxng, mergennachin, shoumikhin, cccclai and digantdesai as code owners September 19, 2025 16:02

Gasoonjia added 10 commits September 19, 2025 09:50

rebase to latest main

8f9fc9a

add github ci for gpu pt install check

dbe31b5

add github ci for gpu pt install check

1110434

recover torchao

0621550

solve lint issue

3ef491b

create install_utils.py for better structure

9792c99

set use-custom-docker-registry as false

a18cd15

rebase to latest main

5b430f4

recover torchao

95c2536

solve platform import issue

b00bc14

Gasoonjia force-pushed the install-cuda-pt branch from 5d521f3 to b00bc14 Compare September 19, 2025 17:52

Gasoonjia added 2 commits September 19, 2025 10:58

introduce missed sys

ae52b29

introduce missed platform

57ebb63

JacobSzwejbka reviewed Sep 19, 2025


JacobSzwejbka approved these changes Sep 19, 2025


Gasoonjia added 2 commits September 19, 2025 12:00

update cuda ci script

43d164f

try ci with specific docker-image

d892e3f

larryliu0820 approved these changes Sep 19, 2025


Gasoonjia added 6 commits September 19, 2025 15:16

no conda run n yml

2e87886

remove unsupported jq

5a5e829

use lru cache to replace global cache variable

6e7884f

make SUPPORTED_CUDA_VERSIONS as tuple for hashable

bd24c4b

use default conda env

d1c596c

remove conda env selection in cuda-build.sh

19c2fb2

Gasoonjia merged commit a954a75 into main Sep 20, 2025
268 of 279 checks passed

Gasoonjia deleted the install-cuda-pt branch September 20, 2025 06:22

zingo mentioned this pull request Sep 21, 2025

test-arm-ootb-linux is failing in release branch #14417

Closed
