RuntimeError: CUDA error: invalid device function ROIAlign_forward_cuda #62
Comments
It seems like you did not build detectron2 correctly. You may have wrong values in the |
I deleted the
I'm running on a 1080ti, which should be covered under "6.1". This results in the same errors as above. |
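The "covered under 6.1" reasoning above can be sketched in Python. This is only an illustration: the helper names `parse_arch_list` and `arch_list_covers` are hypothetical and not part of detectron2 or PyTorch. It assumes the architecture list uses the numeric form accepted by `TORCH_CUDA_ARCH_LIST` (for example `6.0;6.1+PTX`), that compiled binaries run on devices with the same major version and an equal or higher minor version, and that `+PTX` entries can be JIT-compiled for newer devices.

```python
import re


def parse_arch_list(arch_list):
    """Split 'A.B;C.D+PTX' (semicolon- or space-separated) into ((major, minor), has_ptx) pairs."""
    entries = []
    for token in re.split(r"[;\s]+", arch_list.strip()):
        match = re.fullmatch(r"(\d+)\.(\d+)(\+PTX)?", token)
        if match:  # named architectures such as "Pascal" are skipped in this sketch
            arch = (int(match.group(1)), int(match.group(2)))
            entries.append((arch, match.group(3) is not None))
    return entries


def arch_list_covers(arch_list, capability):
    """Return True if some entry can serve a device with the given (major, minor) capability."""
    for arch, has_ptx in parse_arch_list(arch_list):
        # Binary code: same major version, entry minor version not above the device's.
        if arch[0] == capability[0] and arch[1] <= capability[1]:
            return True
        # PTX can be JIT-compiled for any device at least as new as the entry.
        if has_ptx and arch <= capability:
            return True
    return False


# A GTX 1080 Ti reports compute capability (6, 1).
print(arch_list_covers("6.1", (6, 1)))
print(arch_list_covers("7.0;7.5", (6, 1)))
```

On a machine with PyTorch and a GPU, the capability tuple can be read with `torch.cuda.get_device_capability()` instead of being hard-coded.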
Is there a way either of you can let others reproduce this issue in docker or colab? |
I'm actually just trying to get object detection on LVIS running, and I'm able to successfully run the model when I swap the Mask R-CNN backbone out for a RetinaNet (which doesn't use ROIAlign). Unfortunately, I don't have time right now to set up docker or colab to replicate this. |
I was able to reproduce the same error when I use the wrong version of cuda. What I did: |
The updated |
As a follow-up from #78: I installed a new env with CUDA 9.2 and this solved my issue. Could the problem be that, as stated at https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md, all models are trained with CUDA 9.2? |
No it's unrelated to model zoo. |
I ran into this error as well. I reinstalled PyTorch built for a lower CUDA version (one that matches my system CUDA), and that resolved the issue. |
It seems that a mismatch between the NVCC version and the CUDA runtime version is the root cause. Closing, but feel free to reopen if this does not solve your issue. |
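A minimal Python sketch for checking this kind of mismatch, assuming `nvcc` is on the `PATH` and prints its usual `release X.Y` line, and comparing it against the CUDA version PyTorch was built with (`torch.version.cuda`). The helper names `parse_nvcc_release`, `versions_match`, and `report` are hypothetical, and the check only compares major and minor versions.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract 'X.Y' from text such as 'Cuda compilation tools, release 10.0, V10.0.130'."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(nvcc_release, torch_cuda):
    """Compare major.minor of the compiler release and the PyTorch CUDA build version."""
    if not nvcc_release or not torch_cuda:
        return False
    return nvcc_release.split(".")[:2] == torch_cuda.split(".")[:2]


def report():
    nvcc_release = None
    if shutil.which("nvcc"):
        result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
        nvcc_release = parse_nvcc_release(result.stdout)
    try:
        import torch  # optional: only needed to read the CUDA version PyTorch was built with
        torch_cuda = torch.version.cuda
    except ImportError:
        torch_cuda = None
    print("nvcc release:       ", nvcc_release)
    print("PyTorch CUDA build: ", torch_cuda)
    print("versions match:     ", versions_match(nvcc_release, torch_cuda))


if __name__ == "__main__":
    report()
```

If the two versions differ, the reports in this thread suggest reinstalling PyTorch (and rebuilding detectron2) so that both match the installed CUDA toolkit.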
@ppwwyyxx what should TORCH_CUDA_ARCH_LIST ideally be set to if one is using cuda/10.0 or cuda/10.1 with pytorch 1.3? nvcc --version shows me cuda 10.0 as well, I'm not sure what you mean by ^^ mismatch between nvcc and cuda runtimes since they're always the same for me.
Posting the error here because it seems related, can make a new issue if you recommend. |
The best option is to unset it (i.e., no such env variable). If you cannot solve the issue with existing information, please open a new one following the template. |
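One way to follow the "unset it" advice when launching a rebuild from Python is to pass a copy of the environment without `TORCH_CUDA_ARCH_LIST`. This is a sketch only: the helper name `env_without_arch_list` is hypothetical, and the commented rebuild command assumes an editable install run from the detectron2 source directory.

```python
import os


def env_without_arch_list(env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST removed, if present."""
    source = os.environ if env is None else env
    cleaned = dict(source)
    cleaned.pop("TORCH_CUDA_ARCH_LIST", None)
    return cleaned


# Usage (not executed here; assumes the current directory is the detectron2 checkout):
# import subprocess, sys
# subprocess.run(
#     [sys.executable, "-m", "pip", "install", "-e", "."],
#     env=env_without_arch_list(),
#     check=True,
# )
```

In a shell session, `unset TORCH_CUDA_ARCH_LIST` before rebuilding has the same effect.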
Hello, I want to use detectron2, but something went wrong when I prepared the conda environment. First I installed pytorch from |
Detectron2 can run with cuda 10.0. #459 is caused by incorrect installation of torchvision as explained there. |
Thanks for your reply. I deleted the build file in detectron2 and rebuilt it; it works well for me now. |
Hi, I am trying to run detectron2 for panoptic segmentation with PyTorch 1.4.0 and CUDA 10.2, and I encountered the same CUDA error for ROIAlign_forward_cuda. I tried installing detectron2 both 1) from local source code and 2) via pip install. I also double-checked the wheel index (python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/index.html), and it seems CUDA 10.2 is also compatible with detectron2. What further steps can I take? |
Most likely the solution to your problem is already in https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues. |
Great, thanks! I checked and found that the CUDA versions for detectron2 and torch were mismatched. I reinstalled detectron2 for CUDA 10.1 and matched PyTorch as well. Now it works! Thanks again. |
So, how did you solve this? |
Yes, that's the key to solving my problem.
Problem
First, to briefly introduce my problem: I'm new to Detectron2 and have only one GPU (GeForce GTX 1080Ti). I chose to build Detectron2 from source:
Everything was fine and detectron2 installed successfully,
but when I tried to train
Solution
I checked the CUDA version.
Before this I had installed cudatoolkit=10.2, but now I chose the earlier version.
After rebuilding Detectron2, the problem was solved! |
Oh, sorry for the late response, I totally missed it. As I mentioned, my previous CUDA version was 10.1, but I had installed PyTorch and Detectron2 builds for CUDA 10.2. So I reinstalled both to match my CUDA version. I think @zjZSTU's solution is close to mine; you can refer to it. |
Attempting to run forward inference with the panoptic FPN model results in a CUDA error.
To Reproduce
Attempting to run a predictor using the model panoptic_fpn_R_101_dconv_cascade_gn_3x.yaml. The following error is produced:
Environment