sccache crashes when building Distribution.cu on Windows #24145

Open
peterjc123 opened this issue Aug 10, 2019 · 5 comments
Labels
module: build warnings (Related to warnings during build process); module: build (Build system issues); triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@peterjc123
Collaborator

There are many occurrences of this build error in Azure Pipelines.
https://dev.azure.com/pytorch/PyTorch/_build/results?buildId=3891
https://dev.azure.com/pytorch/PyTorch/_build/results?buildId=3901
https://dev.azure.com/pytorch/PyTorch/_build/results?buildId=3695

[737/1919] Building NVCC (Device) object caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/torch_generated_Distributions.cu.obj
FAILED: caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/torch_generated_Distributions.cu.obj 
cmd.exe /C "cd /D C:\w\1\s\windows\pytorch\build\build\caffe2\CMakeFiles\torch.dir\__\aten\src\ATen\native\cuda && C:\w\1\s\windows\conda\envs\py3\Library\bin\cmake.exe -E make_directory C:/w/1/s/windows/pytorch/build/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/. && C:\w\1\s\windows\conda\envs\py3\Library\bin\cmake.exe -D verbose:BOOL=OFF -D build_configuration:STRING=Release -D generated_file:STRING=C:/w/1/s/windows/pytorch/build/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/./torch_generated_Distributions.cu.obj -D generated_cubin_file:STRING=C:/w/1/s/windows/pytorch/build/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/./torch_generated_Distributions.cu.obj.cubin.txt -P C:/w/1/s/windows/pytorch/build/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/torch_generated_Distributions.cu.obj.Release.cmake"
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
cl : Command line warning D9025 : overriding '/EHa' with '/EHs'
Distributions.cu
error: failed to execute compile
caused by: error reading compile response from server
caused by: Failed to read response header
caused by: An existing connection was forcibly closed by the remote host. (os error 10054)
CMake Error at torch_generated_Distributions.cu.obj.Release.cmake:279 (message):
  Error generating file
  C:/w/1/s/windows/pytorch/build/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/cuda/./torch_generated_Distributions.cu.obj
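
To compare failure modes across the linked builds, one option is to tally the OS error codes that appear in a saved log. Code 10054 is the Winsock connection-reset error (WSAECONNRESET). What follows is only a minimal Python sketch, not part of the PyTorch build scripts; the helper name `count_os_errors` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "(os error N)" suffix that ends the failure messages in the log.
_OS_ERROR = re.compile(r"\(os error (\d+)\)")


def count_os_errors(log_text: str) -> Counter:
    """Tally the OS error codes found in a build log (hypothetical helper)."""
    return Counter(int(code) for code in _OS_ERROR.findall(log_text))


if __name__ == "__main__":
    # Two lines copied from the log above, used as sample input.
    sample = (
        "caused by: Failed to read response header\n"
        "caused by: An existing connection was forcibly closed "
        "by the remote host. (os error 10054)\n"
    )
    for code, count in count_os_errors(sample).items():
        print(f"os error {code}: {count} occurrence(s)")
```

Running this over each build's log would show whether the connection reset is the only failure mode or whether other codes are mixed in.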

Any ideas, @yf225?

@peterjc123 peterjc123 changed the title from "sccache crashes when building Distribution.cu" to "sccache crashes when building Distribution.cu on Windows" Aug 10, 2019
@vishwakftw vishwakftw added the module: build (Build system issues) and module: build warnings (Related to warnings during build process) labels Aug 11, 2019
@yf225
Contributor

yf225 commented Aug 12, 2019

Based on an offline discussion with @peterjc123, this is only reproducible with CUDA 10. I suggested trying the solutions in mozilla/sccache#256 (e.g. increasing the timeout value). Also, if the error can be reproduced locally, I can help debug.

@pietern
Contributor

pietern commented Aug 13, 2019

@yf225 @peterjc123 What's the default timeout?

@pietern pietern added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label Aug 13, 2019
@peterjc123
Collaborator Author

peterjc123 commented Aug 15, 2019

It is getting worse now. Here are some more errors:

[269/1930] Building C object confu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\src\init.c.obj
FAILED: confu-deps/cpuinfo/CMakeFiles/cpuinfo.dir/src/init.c.obj 
C:\w\1\s\windows\tmp_bin\sccache.exe  cl  /nologo -DCPUINFO_LOG_LEVEL=2 -DTH_BLAS_MKL -D_OPENMP_NOFORCE_MANIFEST -I..\..\third_party\cpuinfo\src -I..\..\third_party\cpuinfo\include -I..\..\third_party\cpuinfo\deps\clog\include -I..\..\third_party\protobuf\src -IC:\w\1\s\windows\mkl\include /DWIN32 /D_WINDOWS /W3 /EHa /MDd /Zi /Ob0 /Od /RTC1   /MDd /showIncludes /Foconfu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\src\init.c.obj /Fdconfu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\cpuinfo.pdb /FS -c ..\..\third_party\cpuinfo\src\init.c
sccache: encountered fatal error
sccache: error : failed to store `init.c.obj` to cache
sccache:  cause: failed to store `init.c.obj` to cache
sccache:  cause: failed to zip up compiler outputs
sccache:  cause: The process cannot access the file because it is being used by another process. (os error 32)
[270/1930] Building CXX object third_party\protobuf\cmake\CMakeFiles\libprotobuf.dir\__\src\google\protobuf\util\internal\protostream_objectsource.cc.obj
cl : Command line warning D9025 : overriding '/EHs' with '/EHa'
[271/1930] Building C object confu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\src\x86\info.c.obj
[272/1930] Building C object confu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\src\api.c.obj
FAILED: confu-deps/cpuinfo/CMakeFiles/cpuinfo.dir/src/api.c.obj 
C:\w\1\s\windows\tmp_bin\sccache.exe  cl  /nologo -DCPUINFO_LOG_LEVEL=2 -DTH_BLAS_MKL -D_OPENMP_NOFORCE_MANIFEST -I..\..\third_party\cpuinfo\src -I..\..\third_party\cpuinfo\include -I..\..\third_party\cpuinfo\deps\clog\include -I..\..\third_party\protobuf\src -IC:\w\1\s\windows\mkl\include /DWIN32 /D_WINDOWS /W3 /EHa /MDd /Zi /Ob0 /Od /RTC1   /MDd /showIncludes /Foconfu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\src\api.c.obj /Fdconfu-deps\cpuinfo\CMakeFiles\cpuinfo.dir\cpuinfo.pdb /FS -c ..\..\third_party\cpuinfo\src\api.c
sccache: encountered fatal error
sccache: error : failed to store `api.c.obj` to cache
sccache:  cause: failed to store `api.c.obj` to cache
sccache:  cause: failed to zip up compiler outputs
sccache:  cause: The process cannot access the file because it is being used by another process. (os error 32)

Looks like /Zi is appearing again for C sources.
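
For context: with /Zi, cl writes debug information to a separate .pdb file (cpuinfo.pdb via /Fd in the commands above) that the target's compiles share, whereas /Z7 embeds it in each .obj. That sharing is a plausible, though unconfirmed, explanation for the os error 32 failures. Below is a minimal Python sketch for spotting the flag in a logged command line and showing what the /Z7 form would look like. The helper names are hypothetical, and in a real build the change would belong in the CMake flags rather than in a script like this.

```python
import re

# Matches /Zi or -Zi only as a standalone token, so a path such as
# C:\Zip\tool.exe is left alone.
_ZI_FLAG = re.compile(r"(?<!\S)([/-])Zi(?!\S)")


def uses_shared_pdb(command: str) -> bool:
    """Return True if the cl command line requests /Zi (separate PDB)."""
    return _ZI_FLAG.search(command) is not None


def to_embedded_debug_info(command: str) -> str:
    """Rewrite /Zi to /Z7, keeping the original flag prefix (/ or -)."""
    return _ZI_FLAG.sub(r"\g<1>Z7", command)


if __name__ == "__main__":
    # Shortened from the failing command in the log above.
    cmd = r"sccache.exe cl /nologo /W3 /EHa /MDd /Zi /Ob0 /Od /RTC1 /FS -c init.c"
    if uses_shared_pdb(cmd):
        print(to_embedded_debug_info(cmd))
```

Applying `uses_shared_pdb` to every `sccache.exe cl` line in a log would confirm whether /Zi reaches only the C sources or the C++ ones as well.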

@peterjc123
Collaborator Author

peterjc123 commented Aug 31, 2019

@yf225 Have you rebuilt sccache? In any case, the issue is still there, and it only occurs when building the CUDA 10.0 binaries through CMD.


4 participants