
RuntimeError: CUDA error: an illegal memory access was encountered #24

Open
ditingdapeng opened this issue Sep 25, 2020 · 12 comments
Comments

@ditingdapeng

During training I ran into this error; where could the problem be? "RuntimeError: CUDA error: an illegal memory access was encountered"

@Sleepychord
Contributor

Hi, this seems to be caused by some other problem (likely your environment). Could you provide more information?

@ditingdapeng
Author

Thank you! Here is my conda and torch configuration:
python 3.7.0
conda 4.5.11
torch 1.0.1.post2
torchvision 0.2.2.post3

> Hi, this seems to be caused by some other problem (from the environment), could you provide more information?

@ditingdapeng
Author

I have set batch_size in train.py to 1. My machine has a 2080 Ti.

@Sleepychord
Contributor

Hi, can you tell me which line of code raises the error? The environment seems okay.

@ditingdapeng
Author

Yes: "batch = tuple(t.to(device) for t in batch)". I've now reinstalled the Ubuntu environment. May I ask whether your CUDA version must be 8?
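A step that sometimes helps localize this kind of error (a sketch, not advice from this thread): CUDA kernel launches are asynchronous, so the Python line that reports the error is often not the one that faulted. Setting `CUDA_LAUNCH_BLOCKING=1` makes launches synchronous, so the stack trace points at the real culprit. `train.py` here stands in for the project's training script:

```shell
# Run training with synchronous CUDA launches so the reported stack
# trace points at the kernel that actually faulted, not a later call
# such as batch = tuple(t.to(device) for t in batch).
CUDA_LAUNCH_BLOCKING=1 python train.py
```

This slows training down, so it is only worth enabling while hunting the bug.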

@ditingdapeng
Author

I have been stuck on this problem for 3 days and have ruled out memory overflow and batch_size. I couldn't resist reinstalling the system yesterday, and I noticed that the CUDA version didn't fit.

Which CUDA version are you using?
Thank you ~

@ditingdapeng
Author

I suspect the problem is that CUDA 10.0 doesn't match the torch version used by the code.

@Sleepychord
Contributor

No, but you need to ensure your torch build matches your CUDA version.
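One way to check this match (a minimal sketch, assuming PyTorch is installed): `torch.version.cuda` reports the CUDA version the installed torch wheel was built against, which must be compatible with the toolkit and driver on the machine.

```python
import torch

# The CUDA version this torch build was compiled against; None means a
# CPU-only build, which cannot run .to("cuda") at all.
print("torch version:", torch.__version__)
print("built for CUDA:", torch.version.cuda)

# True only if the driver/runtime on this machine can actually be used.
print("CUDA usable:", torch.cuda.is_available())
```

If `torch.version.cuda` disagrees with the toolkit installed on the system (e.g. a cu80 wheel on a CUDA 10.0 machine), reinstalling a matching torch wheel is usually the fix.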

@ditingdapeng
Author

I see. I think I know where the problem is. My CUDA version follows your requirements, but CUDA may not match torch.

@ditingdapeng
Author

Loading CUDA 8.0 crashed. I'm going to reinstall the system and install matching CUDA and torch versions.

@ditingdapeng
Author

Hello! I think I finally found the problem, but I don’t know how to solve it.
Hope to get your help.

The problem appears in the train.py file:
'hop_loss, ans_loss, pooled_output = model1(*batch)'

The reported error is: RuntimeError: CUDA error: an illegal memory access was encountered.

I suspect that the shapes of the tensors in batch don't match what model1 expects. Could you please explain the structure of model1? Thank you very much!!!
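A debugging trick that can disambiguate this (a sketch, not the maintainer's advice): run the same forward pass on CPU first. On CPU, shape mismatches and out-of-range indices raise readable Python errors, whereas on CUDA they can surface only as an illegal memory access. `model1` and `batch` below are hypothetical stand-ins, with an embedding layer because out-of-range token ids are a classic trigger:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the model and batch in train.py.
model1 = nn.Embedding(num_embeddings=100, embedding_dim=8)
batch = (torch.tensor([[1, 2, 150]]),)  # 150 is out of range on purpose

try:
    # On CPU this raises a clear IndexError; on CUDA the same bug can
    # appear later as "illegal memory access".
    out = model1(*tuple(t.cpu() for t in batch))
except (IndexError, RuntimeError) as e:
    print("caught on CPU:", type(e).__name__)
```

If the CPU run succeeds, the shapes and indices are fine and the suspicion shifts back to the CUDA/torch installation.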

@ditingdapeng
Author

I have found the problem. It was because my CUDA environment was not installed properly. Sorry for all the trouble~
