
Conversation

@clee2000 (Contributor) commented Jun 9, 2025

sccache wasn't working for nvcc on jammy, so manually set PATH during the build to include the directory containing the sccache-wrapped nvcc.
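
A minimal sketch of the change, assuming the wrapped nvcc lives at /opt/cache/lib/nvcc (per the discussion below); exactly where the CI build scripts set this may differ:

    # Prepend the directory holding the sccache-wrapped nvcc so the build
    # resolves it ahead of the real compiler at /usr/local/cuda/bin/nvcc.
    export PATH=/opt/cache/lib:$PATH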

Always making nvcc a wrapper caused problems in some inductor tests, where I got:

    sccache: encountered fatal error
    sccache: error: PCH not supported by nvcc
    sccache: caused by: PCH not supported by nvcc

I also got an error (only on clang) when trying to set CMAKE_CUDA_COMPILER_LAUNCHER to /opt/cache/bin/sccache or sccache:

    sccache: error: failed to execute compile
    sccache: caused by: Compiler not supported: "nvcc warning : Support for offline compilation for architectures prior to \'<compute/sm/lto>_75\' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).\nnvcc fatal   : Failed to preprocess host compiler properties.\n"

Non-jammy CUDA jobs' docker images use a different Dockerfile, which sets CMAKE_CUDA_COMPILER_LAUNCHER:

    ENV CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache

Alt solution:
Given that I only get the error on clang, I could set CMAKE_CUDA_COMPILER_LAUNCHER=sccache only when not using clang; a rough sketch follows.
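
A hedged sketch of that alternative; the $CC-based clang check is an assumption about how the CI scripts would detect the host compiler:

    # Hypothetical: only route CUDA compiles through sccache via CMake when
    # the host compiler is not clang, since the failure above only
    # reproduced with clang.
    if [[ "${CC:-}" != *clang* ]]; then
      export CMAKE_CUDA_COMPILER_LAUNCHER=sccache
    fi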

Setting CUDA_NVCC_EXECUTABLE doesn't fail, but it also doesn't result in any cache hits/misses.

@pytorch-bot bot commented Jun 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155464

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 3 Pending

As of commit 9619911 with merge base 73220d5:
💚 Looks good so far! There are no failures yet. 💚

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@clee2000 marked this pull request as ready for review June 9, 2025 16:18
@pytorch-bot bot added the "topic: not user facing" label Jun 9, 2025
@clee2000 requested a review from jeffdaily as a code owner June 9, 2025 16:18
@clee2000 requested a review from a team June 9, 2025 16:18
@clee2000 marked this pull request as draft June 9, 2025 17:26
@clee2000 changed the title from "[ez][CI] Set CMAKE_CUDA_COMPILER_LAUNCHER in jammy docker images" to "[ez][CI] Set PATH during build to include location of sccache wrapped nvcc" Jun 9, 2025
@clee2000 changed the title from "[ez][CI] Set PATH during build to include location of sccache wrapped nvcc" to "[CI] Set PATH during build to include location of sccache wrapped nvcc" Jun 9, 2025
@clee2000 marked this pull request as ready for review June 9, 2025 19:48
@clee2000 force-pushed the csl/jammy_cuda_sccache branch from ddc29b6 to c5427d8 June 10, 2025 17:18
@malfet (Contributor) left a comment


This is some dark magic, please create an issue and let's try to get to the bottom of it....

I.e., the unwrapped nvcc should always be at /usr/local/cuda/bin/nvcc; if it's not there, then we are doing something really wrong

@clee2000 (Contributor, Author) commented:

> This is some dark magic, please create an issue and let's try to get to the bottom of it....
>
> I.e., the unwrapped nvcc should always be at /usr/local/cuda/bin/nvcc; if it's not there, then we are doing something really wrong

The unwrapped version is at /usr/local/cuda/bin/nvcc, and the wrapped version is at /opt/cache/lib/nvcc, which is usually not on PATH, so we end up not using it.
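
For illustration, a quick sanity check along these lines (hypothetical session inside the container; assumes /usr/local/cuda/bin is already on PATH, as is typical in the CUDA images) shows which nvcc the build resolves:

    command -v nvcc                  # without the fix: /usr/local/cuda/bin/nvcc
    export PATH=/opt/cache/lib:$PATH
    command -v nvcc                  # with the fix: /opt/cache/lib/nvcc (the wrapper)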

@huydhn (Contributor) left a comment


Good catch! It's true that nvcc is at /usr/local/cuda/bin/nvcc and not the sccache-wrapped one. The surprising thing, though, is that the cache hit rate looks decent (https://github.com/pytorch/pytorch/actions/runs/15562870942/job/43819461677#step:16:472); maybe nvcc stats are not even shown there?
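
One way to check would be to bracket a build with sccache's stat counters; --zero-stats and --show-stats are standard sccache subcommands, though how much per-compiler detail --show-stats reports depends on the sccache version:

    sccache --zero-stats     # reset the server's counters
    # ... run the CUDA build ...
    sccache --show-stats     # inspect the hits/misses accumulated during the build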

@clee2000 (Contributor, Author) commented:

@pytorchbot merge -f "im impatient"

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@github-actions github-actions bot deleted the csl/jammy_cuda_sccache branch July 13, 2025 02:24