
RuntimeError: Backward is not reentrant #16

Closed
leon532 opened this issue Mar 22, 2019 · 5 comments

Comments
@leon532

leon532 commented Mar 22, 2019

It raises RuntimeError: Backward is not reentrant when I run test.py.

torch.Size([2, 128, 128, 128])
torch.Size([2, 128, 128, 128])
torch.Size([20, 32, 7, 7])
torch.Size([20, 32, 7, 7])
torch.Size([20, 32, 7, 7])
checking
dconv im2col_step forward passed with 0.0
tensor(0., device='cuda:0', grad_fn=)
dconv im2col_step backward passed with 7.450580596923828e-09 = 7.450580596923828e-09+0.0+0.0+0.0
mdconv im2col_step forward passed with 0.0
tensor(0., device='cuda:0', grad_fn=)
mdconv im2col_step backward passed with 3.725290298461914e-09
0.971507, 1.943014
0.971507, 1.943014
tensor(0., device='cuda:0')
dconv zero offset passed with 1.1920928955078125e-07
dconv zero offset identify passed with 0.0
tensor(0., device='cuda:0')
mdconv zero offset passed with 1.7881393432617188e-07
mdconv zero offset identify passed with 0.0
check_gradient_conv: True
Traceback (most recent call last):
File "test.py", line 624, in <module>
check_gradient_dconv()
File "test.py", line 400, in check_gradient_dconv
eps=1e-3, atol=1e-3, rtol=1e-2, raise_exception=True))
File "/data/yli18/miniconda3/envs/pytorch-1.0/lib/python3.6/site-packages/torch/autograd/gradcheck.py", line 208, in gradcheck
return fail_test('Backward is not reentrant, i.e., running backward with same '
File "/data/yli18/miniconda3/envs/pytorch-1.0/lib/python3.6/site-packages/torch/autograd/gradcheck.py", line 185, in fail_test
raise RuntimeError(msg)
RuntimeError: Backward is not reentrant, i.e., running backward with same input and grad_output multiple times gives different values, although analytical gradient matches numerical gradient

Is this a serious problem, and how can I resolve it?
Thanks for your time and suggestions.
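For context, the failing check at test.py:400 is a standard torch.autograd.gradcheck call. A minimal sketch of the pattern (the dconv_fn callable and the tensor shapes here are assumptions for illustration, not the repo's exact code):

import torch
from torch.autograd import gradcheck

def check_gradient_dconv(dconv_fn):
    # gradcheck needs double-precision inputs with requires_grad=True;
    # shapes are illustrative (18 offset channels = 2 * 3 * 3 kernel).
    inp = torch.rand(1, 2, 4, 4, dtype=torch.float64,
                     device='cuda', requires_grad=True)
    offset = torch.rand(1, 18, 4, 4, dtype=torch.float64,
                        device='cuda', requires_grad=True)
    # raise_exception=True turns any failed check, including the
    # reentrancy check, into the RuntimeError shown above.
    return gradcheck(dconv_fn, (inp, offset),
                     eps=1e-3, atol=1e-3, rtol=1e-2,
                     raise_exception=True)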

@xvjiarui
Collaborator

xvjiarui commented Mar 26, 2019

Sorry for the late reply.

# """
# ****** Note: backward is not reentrant error may not be a serious problem,
# ****** since the max error is less than 1e-7,
# ****** Still looking for what trigger this problem
# """

As the comment says, it won't affect performance, so feel free to ignore it.
Please reopen this issue if you find a solution. Thx
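For what it's worth, the reentrancy check amounts to running backward twice with identical inputs and grad_output and requiring bitwise-equal gradients. A hedged sketch of the idea (not gradcheck's actual implementation; it assumes every input receives a gradient):

import torch

def is_backward_reentrant(fn, *inputs):
    # Run backward twice from the same inputs and grad_output;
    # a non-reentrant op gives slightly different grads each time.
    grads = []
    for _ in range(2):
        ins = [x.detach().clone().requires_grad_(True) for x in inputs]
        out = fn(*ins)
        out.backward(torch.ones_like(out))
        grads.append([x.grad.clone() for x in ins])
    return all(torch.equal(a, b) for a, b in zip(*grads))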

@GreenTeaHua

thx

@JiancongWang

JiancongWang commented Jun 10, 2019

I checked the code. Numerically it should be correct: I verified that the error is very small (~1e-16) using CUDA 9.0 on a GTX 1080. The error is raised when calculating the gradient for the input. The code uses atomicAdd for that, since one cannot know in advance how many times each pixel is used. The problem is that the order of the atomicAdd calls is nondeterministic, and different addition orders produce results that differ by up to floating-point rounding error. The gradients for the bias/weight/offset do not use atomicAdd and thus have no such issue.
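To see why, recall that floating-point addition is not associative, so an unordered reduction (which is what concurrent atomicAdd amounts to) can return slightly different values from run to run. A small Python demonstration of the effect (not the CUDA kernel itself):

import random

# Summing the same numbers in two different orders gives floats that
# differ by rounding error; concurrent atomicAdd fixes no order, so the
# input gradient can differ between otherwise identical backward passes.
vals = [random.uniform(-1, 1) * 10 ** random.randint(-8, 8)
        for _ in range(10000)]
shuffled = list(vals)
random.shuffle(shuffled)
print(sum(vals) - sum(shuffled))  # typically small but nonzero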

@JiancongWang

That's only one possible cause of the issue, though. I will double-check and see if there is any other problem; for now I don't see any.

@heartInsert

You mean I can ignore this exception?
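If the goal is just to keep the test run from aborting, gradcheck can report the result instead of raising; later PyTorch releases also added a nondet_tol argument for exactly this kind of nondeterminism. A sketch (dconv_fn and inputs stand for the hypothetical function and tensors under test):

from torch.autograd import gradcheck

# Option 1: return False on failure instead of raising.
ok = gradcheck(dconv_fn, inputs, eps=1e-3, atol=1e-3, rtol=1e-2,
               raise_exception=False)

# Option 2 (later PyTorch releases): tolerate small nondeterministic
# differences such as those caused by atomicAdd ordering.
ok = gradcheck(dconv_fn, inputs, eps=1e-3, atol=1e-3, rtol=1e-2,
               nondet_tol=1e-7)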
