[bazel] enable sccache+nvcc in CI #95528

vors · 2023-02-24T23:40:18Z

This change is mostly focused on enabling nvcc+sccache in the PyTorch CI.

Along the way we had to make a couple of tweaks:

1. Split rules_cc out of rules_cuda, which previously embedded it. This lets us apply a different patch to rules_cc than the one rules_cuda applies by default. We need that because nvcc does not forward -iquote xxx to the host compiler, but it does forward -isystem xxx, so we work around the problem by (ab)using -isystem instead. Without this we get errors like "xxx is not found".
2. Work around a Bazel bug, "Bazel does not handle symlinks in system include paths for CUDA" (bazelbuild/bazel#10167), which prevents us from using a straightforward and honest nvcc sccache wrapper. Instead we generate an ad hoc, Bazel-specific nvcc wrapper that has internal knowledge of the relative Bazel paths to local_cuda. This lets us work around the issue with CUDA symlinks. Without it we get "undeclared inclusion(s) in rule" errors all over the place for CUDA headers.
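
For readers who want a concrete picture of the two workarounds, here is a minimal Python sketch. It is not the code from this PR: the function names, the parent-process check in the wrapper, and the paths `/usr/local/cuda/bin/nvcc` and `external/local_cuda/cuda/bin/nvcc-real` are all assumptions made for illustration. The first function shows the `-iquote` to `-isystem` rewrite; the second renders a wrapper script that hands the compile to sccache on the first call and runs the real nvcc through a workspace-relative path on the second.

```python
import shlex


def iquote_to_isystem(args):
    """Rewrite -iquote includes as -isystem, leaving every other argument untouched."""
    rewritten = []
    for arg in args:
        if arg == "-iquote":
            # Two-token form: the directory is the next argument, so only swap the flag.
            rewritten.append("-isystem")
        elif arg.startswith("-iquote"):
            # Joined form: -iquoteDIR becomes -isystem DIR.
            rewritten.extend(["-isystem", arg[len("-iquote"):]])
        else:
            rewritten.append(arg)
    return rewritten


def render_nvcc_wrapper(public_nvcc, real_nvcc, sccache="sccache"):
    """Return the text of a POSIX sh wrapper meant to be installed at public_nvcc.

    Called by the build, the wrapper re-runs itself under sccache. Called by
    sccache, it runs the real compiler through a workspace-relative path.
    `sccache` is expected to be a bare process name, since it is compared
    against the parent's command name.
    """
    lines = [
        "#!/bin/sh",
        f'if [ "$(ps -p "$PPID" -o comm=)" != {shlex.quote(sccache)} ]; then',
        f'  exec {shlex.quote(sccache)} {shlex.quote(public_nvcc)} "$@"',
        "else",
        f'  exec {shlex.quote(real_nvcc)} "$@"',
        "fi",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(iquote_to_isystem(["-iquote", "bazel-out/include", "-O2"]))
    # Hypothetical paths, chosen only for illustration.
    print(render_nvcc_wrapper("/usr/local/cuda/bin/nvcc",
                              "external/local_cuda/cuda/bin/nvcc-real"))
```

One trade-off of the flag rewrite is worth keeping in mind: `-iquote` directories are searched only for `#include "..."`, while `-isystem` directories are searched for both include forms and are treated as system headers, so warnings originating in them are suppressed.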

Test plan

Green CI build https://github.com/pytorch/pytorch/actions/runs/4267147180/jobs/7428431740

Note that now it says "CUDA" in the sccache output

+ sccache --show-stats
Compile requests                    9784
Compile requests executed           6726
Cache hits                          6200
Cache hits (C/C++)                  6131
Cache hits (CUDA)                     69
Cache misses                         519
Cache misses (C/C++)                 201
Cache misses (CUDA)                  318
Cache timeouts                         0
Cache read errors                      0
Forced recaches                        0
Cache write errors                     0
Compilation failures                   0
Cache errors                           7
Cache errors (C/C++)                   7
Non-cacheable compilations             0
Non-cacheable calls                 2893
Non-compilation calls                165
Unsupported compiler calls             0
Average cache write                0.116 s
Average cache read miss           23.722 s
Average cache read hit             0.057 s
Failed distributed compilations        0
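
To put the counters above in perspective, the following Python sketch parses `sccache --show-stats` style text and computes hit rates per language. The helper names `parse_counters` and `hit_rate` are hypothetical, and the parser assumes the two-column layout shown above (a label, at least two spaces, then an integer); timing rows such as "Average cache write" are skipped on purpose.

```python
import re

# A subset of the stats reported in this PR.
SAMPLE = """\
Compile requests                    9784
Cache hits                          6200
Cache hits (C/C++)                  6131
Cache hits (CUDA)                     69
Cache misses                         519
Cache misses (C/C++)                 201
Cache misses (CUDA)                  318
Average cache read hit             0.057 s
"""

# Label, a gap of two or more spaces, then a bare integer at end of line.
_COUNTER = re.compile(r"^(.+?)\s{2,}(\d+)$")


def parse_counters(text):
    """Return {label: int} for every integer counter row in the stats text."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def hit_rate(counters, suffix=""):
    """Return hits / (hits + misses) for the given label suffix, e.g. ' (CUDA)'."""
    hits = counters.get("Cache hits" + suffix, 0)
    misses = counters.get("Cache misses" + suffix, 0)
    total = hits + misses
    return hits / total if total else 0.0


if __name__ == "__main__":
    stats = parse_counters(SAMPLE)
    for label, suffix in [("overall", ""), ("C/C++", " (C/C++)"), ("CUDA", " (CUDA)")]:
        print(f"{label}: {hit_rate(stats, suffix):.1%}")
```

On the numbers from this run, that works out to roughly 92.3% overall, 96.8% for C/C++, and 17.8% for CUDA.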

pytorch-bot · 2023-02-24T23:40:21Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95528

✅ No Failures

As of commit dffd3df:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

WORKSPACE

.ci/pytorch/common_utils.sh

vors

Addressed the comments, thank you for the quick turn-around!

WORKSPACE

huydhn

LGTM!

huydhn · 2023-02-27T18:47:51Z

All linter failures come from the SPACES linter. Could you try to exclude tools/rules_cc/cuda_support.patch from that linter in https://github.com/pytorch/pytorch/blob/master/.lintrunner.toml#L367? This should help pass the check.

huydhn · 2023-02-27T20:20:22Z

@pytorchbot merge

pytorchmergebot · 2023-02-27T20:22:51Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2023-02-28T02:21:28Z

The merge job was canceled. If you believe this is a mistake, then you can re-trigger it through pytorch-bot.

huydhn · 2023-02-28T02:48:39Z

@pytorchbot merge

pytorchmergebot · 2023-02-28T02:50:23Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Fixes #79348

Pull Request resolved: pytorch/pytorch#95528
Approved by: https://github.com/huydhn

vors added 3 commits February 24, 2023 23:06

Create a patch

c3d249a

fix things

09fad32

nvcc hack

9dc830c

pytorchbot added the open source label Feb 24, 2023

vors changed the title ~~[don't review] Sergei/nvcc new 3~~ [bazel] enable sccache+nvcc in CI Feb 24, 2023

vors marked this pull request as ready for review February 25, 2023 00:37

kit1980 reviewed Feb 25, 2023

WORKSPACE (outdated)

huydhn reviewed Feb 25, 2023

.ci/pytorch/common_utils.sh (outdated)

vors added 2 commits February 25, 2023 01:56

remove sha256

4e45012

Replace by a more readable echo

ebab97b

vors commented Feb 25, 2023

WORKSPACE (outdated)

huydhn added the topic: not user facing topic category label Feb 25, 2023

huydhn approved these changes Feb 25, 2023

vors added 2 commits February 25, 2023 05:32

empty for ci

6efa712

escaping for dollar signs

8d3491c

Add '**/*.patch to exclude from SPACE linter

dffd3df

huydhn added the test-config/default label Feb 27, 2023

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 27, 2023

pytorchmergebot added the Merged label Feb 28, 2023

pytorchmergebot closed this in 447f5b5 Feb 28, 2023

msaroufim mentioned this pull request Mar 3, 2023

Remove mention of dynamo.optimize() in docs #96002

Closed

pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023

Revert "[bazel] enable sccache+nvcc in CI (pytorch#95528)"

7fc0acc

This reverts commit 447f5b5.
