
[bazel] build doesn't use sccache #79348

Closed
vors opened this issue Jun 11, 2022 · 2 comments
Labels
module: bazel, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@vors
Contributor

vors commented Jun 11, 2022

🐛 Describe the bug

A build with sccache used to take 15 minutes, according to @malfet.
Right now it takes 65–70 minutes. This probably indicates that sccache is not being used.
However, it could also be because the GPU build was previously not enabled and now it is.

This needs a quick investigation.
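One quick way to check is to reset sccache's counters, run the build, and inspect the stats afterwards. A sketch of that check is below; the bazel target is a placeholder, not the actual target used in CI:

```shell
# Sketch: verify whether a build actually goes through sccache.
# //:placeholder_target is hypothetical; substitute a real target.
check_sccache_usage() {
  sccache --zero-stats     # reset the counters
  bazel build //:placeholder_target
  sccache --show-stats     # "Compile requests" staying at 0 means
                           # the build never invoked sccache
}
```

If the compile-request and cache-hit counters stay at zero after the build, the compiler invocations are bypassing the sccache wrapper entirely.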

Versions

master

@mikaylagawarecki mikaylagawarecki added the module: bazel and triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) labels Jun 15, 2022
@vors
Contributor Author

vors commented Dec 28, 2022

@jjsjann123 I think your experience with bazel tells us that sccache is in fact used, right?
It would still be good to understand why we don't get the time-savings benefit (mostly for CI).

@jjsjann123
Collaborator

> @jjsjann123 I think your experience with bazel tells us that sccache is in fact used, right? It would still be good to understand why we don't get the time-savings benefit (mostly for CI).

Yes. IIRC, when I pulled the CI docker image to build pytorch with bazel, there was something funny with sccache; I never got it to work, so I switched to our development container to work around (WAR) it.

It would be great if we actually got the CI container to be usable for community contributors. 👀

jhavukainen pushed a commit to kulinseth/pytorch that referenced this issue Mar 15, 2024
Fixes pytorch#79348

This change is mostly focused on enabling nvcc+sccache in the PyTorch CI.

Along the way we had to make a couple of tweaks:
1. Split rules_cc from the rules_cuda that embedded it before. This is needed in order to apply a different patch to rules_cc than the one rules_cuda applies by default. This in turn is needed because we have to work around an nvcc behavior where it doesn't forward `-iquote xxx` to the host compiler, but it does forward `-isystem xxx`. So we work around the problem by (ab)using `-isystem` instead. Without it we get errors like `xxx` is not found.

2. Work around a bug in bazel, bazelbuild/bazel#10167, that prevents us from using a straightforward and honest `nvcc` sccache wrapper. Instead we generate an ad-hoc, bazel-specific nvcc wrapper that has internal knowledge of the relative bazel paths to local_cuda. This allows us to work around the issue with CUDA symlinks. Without it we get `undeclared inclusion(s) in rule` errors all over the place for CUDA headers.
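In essence, such a wrapper just routes the nvcc invocation through sccache. A minimal sketch follows; the `REAL_NVCC` path is an assumption, and the actual generated wrapper additionally rewrites bazel-relative local_cuda paths, which is omitted here:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an nvcc wrapper that routes compilations
# through sccache. The real bazel-specific wrapper described above
# also rewrites bazel's relative paths to local_cuda to dodge
# bazelbuild/bazel#10167; that part is not shown.

nvcc_via_sccache() {
  # Assumed nvcc location; override with REAL_NVCC for your toolchain.
  local real_nvcc="${REAL_NVCC:-/usr/local/cuda/bin/nvcc}"
  sccache "$real_nvcc" "$@"
}

# A standalone wrapper script would instead end with:
#   exec sccache "$REAL_NVCC" "$@"
```

With a wrapper like this registered as the CUDA compiler, every nvcc call becomes a cacheable sccache request, which is what makes the "CUDA" rows appear in the stats output below.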

## Test plan

Green CI build https://github.com/pytorch/pytorch/actions/runs/4267147180/jobs/7428431740

Note that the sccache output now includes "CUDA" lines:

```
+ sccache --show-stats
Compile requests                    9784
Compile requests executed           6726
Cache hits                          6200
Cache hits (C/C++)                  6131
Cache hits (CUDA)                     69
Cache misses                         519
Cache misses (C/C++)                 201
Cache misses (CUDA)                  318
Cache timeouts                         0
Cache read errors                      0
Forced recaches                        0
Cache write errors                     0
Compilation failures                   0
Cache errors                           7
Cache errors (C/C++)                   7
Non-cacheable compilations             0
Non-cacheable calls                 2893
Non-compilation calls                165
Unsupported compiler calls             0
Average cache write                0.116 s
Average cache read miss           23.722 s
Average cache read hit             0.057 s
Failed distributed compilations        0
```
Pull Request resolved: pytorch#95528
Approved by: https://github.com/huydhn