
GPU program failed to execute : cublas runtime error #140

Closed
AyeshaSarwar opened this issue Mar 28, 2020 · 9 comments
@AyeshaSarwar

I ran this command on my command prompt

python train.py -task ext -mode train -bert_data_path C:\Users\Ayesha\Downloads\Tasks\BERTSumm\PreSumm-dev\PreSumm-dev\bert_data\cnndm -ext_dropout 0.1 -model_path C:\Users\Ayesha\Downloads\Tasks\BERTSumm\PreSumm-dev\PreSumm-dev\models -lr 2e-3 -visible_gpus 0 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

and I am getting this error

[screenshot of the cublas runtime error traceback]

System Specifications
Windows 10
NVIDIA GPU RTX 2060

How can I solve this?
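A quick sanity check can narrow down whether the cublas error comes from a driver or device problem rather than from the training code itself. This is a minimal sketch, assuming PyTorch is installed; the helper name `cuda_status` is made up for illustration:

```python
def cuda_status():
    """Hypothetical diagnostic: report whether CUDA (and cuBLAS) is usable."""
    try:
        import torch  # the project depends on PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA-capable device is visible to PyTorch"
    # A tiny matmul exercises cuBLAS, the library named in the error message.
    a = torch.randn(4, 4, device="cuda")
    _ = a @ a
    return f"OK: {torch.cuda.device_count()} GPU(s), {torch.cuda.get_device_name(0)}"

print(cuda_status())
```

If this prints an "OK" line but train.py still fails, the problem is more likely a mismatch between the installed PyTorch build and the local CUDA/driver version than the training command.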

@AyeshaSarwar
Author

I am still unable to run this code. It is giving me the same errors.

@chandanrao007

These look like PyTorch dependency errors on the local machine. It is better to use a Google Colab TPU: download the dataset using the wget and gdown commands and you will not get this error.

@AanchalA

AanchalA commented Apr 8, 2020

These look like PyTorch dependency errors on the local machine. It is better to use a Google Colab TPU: download the dataset using the wget and gdown commands and you will not get this error.

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/torch/csrc/cuda/Module.cpp:33

This is the error I get when I try to use the COLAB TPU.

@AyeshaSarwar
Author

I was originally trying to run it on my Windows PC. After discussing it with a colleague, I ran it on a PC with Ubuntu, and it ran successfully. The difference comes down to how each OS interacts with the GPU and loads data, which is why Windows produces such errors. If you are running it on a Windows PC, I would recommend running it on an Ubuntu server or on Colab instead to get rid of these errors.

@AanchalA

AanchalA commented Apr 8, 2020

@AyeshaSarwar
I was also getting a lot of errors when I tried running it on Windows, so I switched to Colab. But now when I run it on the Colab GPU, I'm running out of space. I think others have run this code on Colab and it worked fine for them; I don't understand where I am going wrong.

@AyeshaSarwar
Author

Oh, I actually have an Ubuntu server, so I tried running it there. Colab runs online, so internet connectivity can be an issue. Try running it on a dedicated server, perhaps one assigned to you by your institution.

@chandanrao7

These look like PyTorch dependency errors on the local machine. It is better to use a Google Colab TPU: download the dataset using the wget and gdown commands and you will not get this error.

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/torch/csrc/cuda/Module.cpp:33

This is the error I get when I try to use the COLAB TPU.

When running the Python file, pass the parameter -visible_gpus -1 instead of -visible_gpus 0,1,2,3, and maybe use the Colab GPU.
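For context, a `-visible_gpus` flag like this one presumably ends up as the `CUDA_VISIBLE_DEVICES` environment variable, which is how processes restrict which GPUs CUDA can see. A minimal sketch of that mapping, with a made-up helper name `set_visible_gpus` (this is an assumption about the flag's behavior, not the project's actual code):

```python
import os

def set_visible_gpus(gpu_ids):
    """Hypothetical mirror of a -visible_gpus flag: exports
    CUDA_VISIBLE_DEVICES. A list of [-1] hides every GPU (CPU-only run)."""
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

set_visible_gpus([-1])  # equivalent of -visible_gpus -1: no GPU visible
```

Note that with `-1` exported, PyTorch will see no CUDA device at all, which is why training then falls back to the CPU.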

@NandaKishoreJoshi

@AyeshaSarwar ,
How did you solve this issue? I'm trying to run the code on Colab. With -visible_gpus 0,1,2,3 I'm getting the same error, and with -visible_gpus -1 and the Colab runtime set to GPU, the GPU is not used at all.
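A Colab runtime only ever exposes a single GPU, so `-visible_gpus 0` is the value that matches it; `0,1,2,3` asks for devices that do not exist there, and `-1` disables the GPU entirely. A sketch of deriving the flag value from the device count, with a made-up helper name `pick_visible_gpus`:

```python
def pick_visible_gpus(n_gpus):
    """Hypothetical helper: build a -visible_gpus value for n_gpus devices.
    Colab exposes at most one GPU, so the right value there is "0";
    zero devices falls back to "-1" (CPU-only)."""
    if n_gpus <= 0:
        return "-1"
    return ",".join(str(i) for i in range(n_gpus))

pick_visible_gpus(1)  # → "0", the value to use on a Colab GPU runtime
```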

@HungVS

HungVS commented Nov 9, 2022

I was originally trying to run it on my Windows PC. After discussing it with a colleague, I ran it on a PC with Ubuntu, and it ran successfully. The difference comes down to how each OS interacts with the GPU and loads data, which is why Windows produces such errors. If you are running it on a Windows PC, I would recommend running it on an Ubuntu server or on Colab instead to get rid of these errors.

Could you share the specifications and requirements of your Ubuntu server (Ubuntu version, CUDA version, Python dependencies, ...)?

I also use an Ubuntu server and still face the same error.

Thanks.
