Code works on one GPU machine, doesn't work on another #36

Closed
justanhduc opened this issue Nov 15, 2019 · 4 comments

Comments

@justanhduc
Contributor

Hello. My server has 4 GPUs. The default one is an RTX 2080. The package installs and runs fine on this GPU, but when I run on the other GPUs (Titan X), I get the following error:

  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch_cluster/knn.py", line 123, in knn_graph
    row, col = knn(x, x, k if loop else k + 1, batch, batch, cosine=cosine)

  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch_cluster/knn.py", line 60, in knn
    return torch_cluster.knn_cuda.knn(x, y, k, batch_x, batch_y, cosine)

RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /opt/conda/conda-bld/pytorch_1573049306803/work/aten/src/ATen/native/cuda/Loops.cuh:102)

My guess is that setup.py automatically sets the flag for the RTX sm architecture. However, when I tried to set the architecture myself with

extra_compile_args = {'gcc': [], 'nvcc': ['-arch=sm_30']}
if (TORCH_MAJOR > 1) or (TORCH_MAJOR == 1 and TORCH_MINOR > 2):
    extra_compile_args['gcc'] += ['-DVERSION_GE_1_3']

the installation failed with this error:

Traceback (most recent call last):
  File "setup.py", line 77, in <module>
    packages=find_packages(),
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
    return distutils.core.setup(**attrs)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/xxx/anaconda3/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py", line 67, in run
    self.do_egg_install()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py", line 109, in do_egg_install
    self.run_command('bdist_egg')
  File "/home/xxx/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 172, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 158, in call_command
    self.run_command(cmdname)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/home/xxx/anaconda3/lib/python3.7/distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/home/xxx/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 353, in build_extensions
    build_ext.build_extensions(self)
  File "/home/xx/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 194, in build_extensions
    self.build_extension(ext)
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 205, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/home/xxx/anaconda3/lib/python3.7/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 271, in unix_wrap_compile
    cflags = cflags['cxx']
KeyError: 'cxx'

Please help!
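
For readers hitting the same `KeyError: 'cxx'`: the last frame of the traceback shows `torch/utils/cpp_extension.py` indexing the compile-args dictionary with the key `'cxx'`, so a dictionary keyed by `'gcc'` fails there. Below is a minimal sketch of the same fragment with only the key renamed. The `TORCH_MAJOR` and `TORCH_MINOR` values are hypothetical placeholders for illustration, and this is not necessarily how the project's setup.py is written.

```python
# Hypothetical version numbers, chosen only so the fragment runs on its own.
TORCH_MAJOR, TORCH_MINOR = 1, 3

# The host-compiler flags go under 'cxx' (the key the traceback shows being
# looked up); the CUDA compiler flags stay under 'nvcc'.
extra_compile_args = {'cxx': [], 'nvcc': ['-arch=sm_30']}
if (TORCH_MAJOR > 1) or (TORCH_MAJOR == 1 and TORCH_MINOR > 2):
    extra_compile_args['cxx'] += ['-DVERSION_GE_1_3']

print(extra_compile_args)
```

This only addresses the install-time `KeyError`. Whether `-arch=sm_30` is the right architecture choice for a given mix of GPUs is a separate question.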

@lookthatdog

  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch_cluster/knn.py", line 123, in knn_graph
    row, col = knn(x, x, k if loop else k + 1, batch, batch, cosine=cosine)

  File "/home/xxx/anaconda3/lib/python3.7/site-packages/torch_cluster/knn.py", line 60, in knn
    return torch_cluster.knn_cuda.knn(x, y, k, batch_x, batch_y, cosine)

RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /opt/conda/conda-bld/pytorch_1573049306803/work/aten/src/ATen/native/cuda/Loops.cuh:102)

I also get this error on a Tesla P100 PCIe 16GB.

@justanhduc
Contributor Author

@lookthatdog I guess you ran your code on a device other than 0, right? And is that GPU a different type from GPU 0?
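
One way to check this is to compare each GPU's compute capability with the ones the extension was built for. The sketch below is only an illustration: `mismatched_devices` and `query_devices` are hypothetical helper names, and the assumption that the build covers only capability 7.5 (the RTX 2080) is a guess about the setup described above. Requiring an exact match is also a simplification, since a build that embeds PTX can sometimes be JIT-compiled for newer GPUs.

```python
def mismatched_devices(devices, built_for):
    """Return (index, name, capability) for GPUs the build does not cover."""
    return [
        (index, name, cap)
        for index, (name, cap) in enumerate(devices)
        if cap not in built_for
    ]


def query_devices():
    """List (name, (major, minor)) per visible GPU; empty without torch/CUDA."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        (torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    # Assumption: the extension was compiled only for capability 7.5.
    for index, name, cap in mismatched_devices(query_devices(), {(7, 5)}):
        print(f"cuda:{index} ({name}) has capability {cap}, not covered by the build")
```

Any device printed here is a candidate for the "no kernel image is available" error.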

@rusty1s
Owner

rusty1s commented Feb 19, 2020

Eventually, this will help.

@justanhduc
Contributor Author

@rusty1s Yes, or, as in my case, specify the architectures explicitly via the gencode flag of nvcc.
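
A rough sketch of what specifying the architectures could look like in setup.py is shown below. The `gencode_flags` helper is hypothetical, and the capability list is an assumption: 7.5 for the RTX 2080, 6.0 for the Tesla P100, and 5.2 or 6.1 for a Titan X depending on whether it is the Maxwell or the Pascal model. The exact flags used in this case were not posted.

```python
def gencode_flags(capabilities):
    """Build one nvcc -gencode flag per compute capability string such as '7.5'."""
    flags = []
    for cap in capabilities:
        num = cap.replace(".", "")
        # Emit native machine code (sm_XX) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags


# Assumed capabilities for the GPUs mentioned in this thread.
capabilities = ["5.2", "6.0", "6.1", "7.5"]

# Keys follow what the traceback shows cpp_extension looking up: 'cxx' and 'nvcc'.
extra_compile_args = {"cxx": [], "nvcc": gencode_flags(capabilities)}

for flag in extra_compile_args["nvcc"]:
    print(flag)
```

Each extra architecture increases compile time and binary size, so it is reasonable to list only the capabilities of the GPUs actually installed.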

3 participants