Cannot compile with torch 1.6 but successfully with torch 1.4 #10

Closed
KiedaTamashi opened this issue Dec 9, 2020 · 1 comment

KiedaTamashi commented Dec 9, 2020

My env:
gcc 7.5.0
ninja 1.10.2
ubuntu 18.04
python 3.7
cudatoolkit 10.1

I successfully compiled it with torch 1.4, and the `import _ext` test passed. But it fails to compile with torch 1.6.
I also tried CUDA 10.2, which failed as well. My PyTorch was installed using 'conda install'.
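
For reports like this one it can help to paste the versions exactly as the build sees them. The following is a minimal sketch, not part of the project: the helper name `collect_build_env` is hypothetical, and it assumes `torch.utils.cpp_extension.CUDA_HOME` reflects the toolkit the extension build will use.

```python
import shutil
import sys


def collect_build_env():
    """Gather the versions and paths that matter when building a CUDA extension."""
    info = {"python": sys.version.split()[0]}

    # torch may not be importable in every shell; record that instead of crashing.
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME

        info["torch"] = torch.__version__
        # CUDA version the installed PyTorch binary was built against.
        info["torch_built_with_cuda"] = torch.version.cuda
        # CUDA toolkit root that the extension builder resolved.
        info["cuda_home_seen_by_torch"] = CUDA_HOME
    except ImportError:
        info["torch"] = None

    # First nvcc, gcc and ninja found on PATH (None if not found).
    for tool in ("nvcc", "gcc", "ninja"):
        info[tool + "_path"] = shutil.which(tool)

    return info


if __name__ == "__main__":
    for key, value in collect_build_env().items():
        print(f"{key}: {value}")
```

Running it inside both the torch 1.4 and the torch 1.6 conda environments and comparing the two outputs would show whether anything other than the torch version differs.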

I think it fails at the fourth build step.
The error report looks like this:
(This step succeeds.)
[3/7] c++ -MMD -MF /NAS/home01/tanzhenwei/DCNv2_latest/build/temp.linux-x86_64-3.7/NAS/home01/tanzhenwei/DCNv2_latest/src/cpu/dcn_v2_psroi_pooling_cpu.o.d -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/NAS/home01/tanzhenwei/DCNv2_latest/src -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/TH -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/include/python3.7m -c -c /NAS/home01/tanzhenwei/DCNv2_latest/src/cpu/dcn_v2_psroi_pooling_cpu.cpp -o /NAS/home01/tanzhenwei/DCNv2_latest/build/temp.linux-x86_64-3.7/NAS/home01/tanzhenwei/DCNv2_latest/src/cpu/dcn_v2_psroi_pooling_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14

(This is the step that fails.)
[4/7] /usr/local/cuda/bin/nvcc -DWITH_CUDA -I/NAS/home01/tanzhenwei/DCNv2_latest/src -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/TH -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/NAS/home01/tanzhenwei/anaconda3/envs/torch16/include/python3.7m -c -c /NAS/home01/tanzhenwei/DCNv2_latest/src/cuda/dcn_v2_im2col_cuda.cu -o /NAS/home01/tanzhenwei/DCNv2_latest/build/temp.linux-x86_64-3.7/NAS/home01/tanzhenwei/DCNv2_latest/src/cuda/dcn_v2_im2col_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_70,code=sm_70 -std=c++14
FAILED: /NAS/home01/tanzhenwei/DCNv2_latest/build/temp.linux-x86_64-3.7/NAS/home01/tanzhenwei/DCNv2_latest/src/cuda/dcn_v2_im2col_cuda.o

Traceback:
Error Message:
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1522, in _run_ninja_build
env=env)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/subprocess.py", line 481, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "setup.py", line 69, in <module>
cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions
build_ext.build_extensions(self)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 482, in unix_wrap_ninja_compile
with_cuda=with_cuda)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1238, in _write_ninja_file_and_compile_objects
error_prefix='Error compiling objects for extension')
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1538, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
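
The traceback above only says that a ninja subcommand failed. The actual nvcc diagnostic is printed earlier in the build output and is not included here. Below is a minimal sketch for pulling it out of a saved log. The helper name `extract_error_lines` and the log file name mentioned afterwards are hypothetical, and the pattern assumes the usual `error:` and `FAILED:` markers of gcc, nvcc and ninja.

```python
import re

# Lines that typically carry the real diagnostic in gcc/nvcc/ninja output.
ERROR_PATTERN = re.compile(r"(\berror\b|^FAILED:)", re.IGNORECASE)


def extract_error_lines(log_text, context=2):
    """Return lines that look like errors, each followed by `context` more lines."""
    lines = log_text.splitlines()
    picked = []
    seen = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            # Keep the matching line plus a few following lines, without duplicates.
            for j in range(i, min(i + context + 1, len(lines))):
                if j not in seen:
                    seen.add(j)
                    picked.append(lines[j])
    return picked
```

One way to use it would be to redirect the whole build output into a file (for example a hypothetical `build.log`), read that file into a string, and print the result of `extract_error_lines` on it. The first reported `error:` line is usually the one worth posting.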

Could you give some advice?

@KiedaTamashi
Author

Of course, I have set variables like CUDA_HOME in ~/.bashrc, as shown below.
The GPU is a V100 on a remote server. PyTorch itself works well.

CUDAVER=cuda-10.1
export PATH=/usr/local/$CUDAVER/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/$CUDAVER/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/$CUDAVER/lib64:$LD_LIBRARY_PATH
export CUDA_PATH=/usr/local/$CUDAVER
export CUDA_ROOT=/usr/local/$CUDAVER
export CUDA_HOME=/usr/local/$CUDAVER
export CUDA_HOST_COMPILER=/usr/bin/gcc
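
One thing that may be worth checking: the build log calls `/usr/local/cuda/bin/nvcc`, while the exports above point at `/usr/local/cuda-10.1`. These may well be the same directory through a symlink, but if they are not, the build is not using the toolkit configured here. The sketch below compares the two. The function name `check_cuda_env` is hypothetical, and the default nvcc path is simply copied from the failing command in this report.

```python
import os


def check_cuda_env(env=None, nvcc_in_log="/usr/local/cuda/bin/nvcc"):
    """Compare CUDA_HOME-style variables against the nvcc path seen in the build log."""
    env = os.environ if env is None else env
    report = {}
    for name in ("CUDA_HOME", "CUDA_PATH", "CUDA_ROOT"):
        value = env.get(name)
        if value is None:
            report[name] = "not set in this shell"
            continue
        expected_nvcc = os.path.join(value, "bin", "nvcc")
        # realpath follows symlinks, so /usr/local/cuda -> cuda-10.1 counts as a match.
        same = os.path.realpath(expected_nvcc) == os.path.realpath(nvcc_in_log)
        report[name] = "matches build log" if same else f"differs: {expected_nvcc}"
    return report


if __name__ == "__main__":
    for name, status in check_cuda_env().items():
        print(f"{name}: {status}")
```

Running this in the same shell that runs `setup.py` matters, because variables exported in ~/.bashrc are only visible if that file was actually sourced by the shell in question.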
