-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered #83
Comments
Got exactly the same problem on PyTorch 1.8 and Cuda 11.1 (trying to run default FastCUT train from example). Downgrading to PyTorch 1.4 and Cuda 9.2 doesn't help and leads to:
|
have the same error reported. |
I solve this problem by replacing |
this works ! |
yes, it works. Thank you! |
Thank you for the feedback and solution. I made the suggested change and pushed the code. |
The issue has reappeared, although the previously mentioned patch has been applied. I used the environment.yml to set up a conda environment. Any suggestions? |
Traceback (most recent call last):
File "train.py", line 43, in
model.data_dependent_initialize(data)
File "/home/helena/CUT/models/cut_model.py", line 108, in data_dependent_initialize
self.compute_G_loss().backward() # calculate graidents for G
File "/home/helena/anaconda3/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/helena/anaconda3/lib/python3.8/site-packages/torch/autograd/init.py", line 145, in backward
Variable._execution_engine.run_backward(
RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
hello, i'm aware that this issue was already brought up and the suggestion was to downgrade to PyTorch 1.4 which i'm trying to avoid being on CUDA 11
what i find interesting though that cycleGAN training works just fine with the same setup (CUDA 11.1, PyTorch 1.8) and on the same dataset
any suggestions how to debug are welcome
The text was updated successfully, but these errors were encountered: