Support for newer CUDA capability (e.g. sm_86) #359

MMelQin · 2022-09-29T17:53:31Z

Discussed in #358

Originally posted by Leengit September 28, 2022
I successfully built a Docker image with the monai-deploy package, and it runs on the computer on which I built it. However, when I try to run the same image on a computer with a significantly newer, more powerful GPU, it fails. It appears that the underlying base image, nvcr.io/nvidia/pytorch:21.07-py3, uses a version of CUDA (11.3) and of torch that do not support sm_86. Upgrading to torch==1.12.1 inside the image that I created (and committing the change to a new image that I then use) does not help. Despite my attempts with apt-get, I have been unable to install a newer version of CUDA in the created image.

Your help with getting support for an NVIDIA RTX A5000 would be much appreciated! The error from running the Docker image that I created with monai-deploy includes:

```
NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
...
  File "~/venv/lung/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 453, in _conv_forward
                            weight, bias, self.stride,
                            _pair(0), self.dilation, self.groups)
        return F.conv2d(input, weight, bias, self.stride,
               ~~~~~~~~ <--- HERE   
                        self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
```
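To confirm this kind of mismatch before running a model, it may help to compare the GPU's compute capability with the architectures the installed torch build was compiled for. The following is a minimal sketch, not part of monai-deploy: the helper names `parse_sm`, `is_capability_supported`, and `check_local_gpu` are hypothetical. It assumes the usual CUDA binary-compatibility rule (a kernel built for `sm_XY` runs on a device of capability X.Z with Z >= Y) and deliberately ignores PTX (`compute_XY`) entries, which can sometimes be JIT-compiled for newer GPUs.

```python
import re


def parse_sm(arch):
    """Turn 'sm_86' into (8, 6); return None for non-sm entries such as 'compute_80'."""
    match = re.fullmatch(r"sm_(\d+)[a-z]?", arch)
    if match is None:
        return None
    number = int(match.group(1))
    return number // 10, number % 10


def is_capability_supported(capability, arch_list):
    """Check whether any compiled sm_XY target can run on a (major, minor) device.

    Assumption: binaries are compatible within a major version when the
    compiled minor version is not newer than the device's minor version.
    PTX entries are ignored in this sketch.
    """
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is None:
            continue
        sm_major, sm_minor = parsed
        if sm_major == major and sm_minor <= minor:
            return True
    return False


def check_local_gpu(device_index=0):
    """Run the check against the local torch install (requires torch and a GPU)."""
    import torch  # imported lazily so the helpers above work without torch

    if not torch.cuda.is_available():
        raise RuntimeError("No CUDA device is visible to torch")
    capability = torch.cuda.get_device_capability(device_index)
    arch_list = torch.cuda.get_arch_list()
    return is_capability_supported(capability, arch_list)


# The architecture list reported in the error message above, checked against sm_86:
reported = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(is_capability_supported((8, 6), reported))  # False
```

Calling `check_local_gpu()` inside the container, in the same Python environment that the application uses, would show whether the torch build that is actually imported there targets the A5000.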

Leengit mentioned this issue Oct 4, 2022

Provide torch segmentation model via MONAI Deploy KitwareMedical/lungair-desktop-application#41

Merged

MMelQin added this to Needs Triage in Backlog via automation Oct 16, 2022

MMelQin linked a pull request Oct 26, 2022 that will close this issue

Enhance Packager to make the MAP more secure, easier to use, and based on new version of PyTorch image for newer CUDA versions. #381

Merged

MMelQin added this to To do in v1.0.0 via automation Oct 26, 2022

MMelQin removed this from Needs Triage in Backlog Oct 26, 2022

MMelQin moved this from To do to In progress in v1.0.0 Oct 26, 2022

MMelQin closed this as completed in #381 Nov 3, 2022

MMelQin moved this from In progress to Done in v1.0.0 Nov 3, 2022

MMelQin removed this from Done in v1.0.0 Oct 19, 2023
