
Fail to adapt to Pytorch Version 0.3 #29

Closed
MatthewD1993 opened this issue Jan 12, 2018 · 10 comments

MatthewD1993 commented Jan 12, 2018

I am trying to adapt the code to the new PyTorch version and a higher Python version. I modified the code based on the PyTorch source, in particular the customized layers, as follows.

Functions:
```python
class ChannelNormFunction(Function):

    @staticmethod
    def forward(ctx, input1, norm_deg=2):
        # self.save_for_backward(input1)
        ctx.norm_deg = norm_deg
        assert(input1.is_contiguous() == True)
        with torch.cuda.device_of(input1):
            b, _, h, w = input1.size()
            output = input1.new().resize_(b, 1, h, w).zero_()
            ChannelNorm_cuda_forward(input1, output, ctx.norm_deg)
        ctx.save_for_backward(input1, output)
        return output

    @staticmethod
    def backward(ctx, gradOutput):
        input1, output = ctx.saved_tensors
        with torch.cuda.device_of(input1):
            b, c, h, w = input1.size()
            gradInput1 = input1.new().resize_(b,c,h,w).zero_()
            ChannelNorm_cuda_backward(input1, output, gradOutput, gradInput1, ctx.norm_deg)

        return gradInput1, None

```

Modules:

```python
class ChannelNorm(Module):
    def __init__(self, norm_deg=2):
        super(ChannelNorm, self).__init__()
        self.norm_deg = norm_deg

    def forward(self, input1):
        return ChannelNormFunction.apply(input1, self.norm_deg)
```

With some other syntactic modifications, I am able to run the code in inference mode, but training fails because backpropagation does not work correctly. I get this error:

File "main.py", line 434, in
train_loss, iterations = train(args=args, epoch=epoch, start_iteration=global_iteration, data_loader=train_loader, model=model_and_loss, optimizer=optimizer, logger=train_logger, offset=offset)
File "main.py", line 301, in train
loss_val.backward()
File "/home/cdeng/anaconda3/envs/py27/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/cdeng/anaconda3/envs/py27/lib/python2.7/site-packages/torch/autograd/init.py", line 99, in backward
variables, grad_variables, retain_graph)
File "/home/cdeng/anaconda3/envs/py27/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
return self._forward_cls.backward(self, *args)
File "/home/cdeng/Thesis/flownet2-pytorch/networks/channelnorm_package/functions/channelnorm.py", line 32, in backward
ChannelNorm_cuda_backward(input1, output, gradOutput, gradInput1, ctx.norm_deg)
File "/home/cdeng/anaconda3/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/init.py", line 180, in safe_call
result = torch._C._safe_call(*args, **kwargs)
TypeError: initializer for ctype 'struct THCudaTensor *' must be a cdata pointer, not Variable

Could someone have a look and help me locate and fix the bug? Thanks a lot!

MatthewD1993 (Author) commented

Could someone also explain what the benefits of using static methods in the new PyTorch version are, and what ctx is?

adriansahlman commented

Hey! I'm also interested in running this on Python 3+ and PyTorch 0.3.

At the moment I don't have enough free time to fiddle with it myself, so I'm going to keep an eye on this issue.

I don't know what the benefit of using the static method is, as that is the only style I have used.

As for ctx, that is the context. The context you receive in the forward pass of a function is the same context object that you receive in the backward pass. For example, you can save tensors in the forward pass that you will need in the backward pass on the ctx object.
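
To make that concrete, here is a minimal toy sketch of the new-style Function API (the `Square` function and all names in it are mine, not from this repo): the same `ctx` that `forward` writes to is handed back to `backward`.

```python
import torch
from torch.autograd import Function

class Square(Function):
    @staticmethod
    def forward(ctx, x):
        # stash whatever backward will need on the context object
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        # the same ctx comes back here, carrying the saved tensors
        x, = ctx.saved_tensors
        # d(x^2)/dx = 2x, chained with the incoming gradient
        return 2 * x * grad_output
```

Because both methods are static, the Function itself holds no per-call state; everything lives on `ctx`, and you invoke it as `Square.apply(x)` rather than instantiating it.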

hellock (Contributor) commented Feb 4, 2018

#35

lfz commented Feb 8, 2018

Hi, I met a similar problem in another package. May I ask which line in #35 solves this problem? Thank you! @hellock

hellock (Contributor) commented Feb 8, 2018

Hi @lfz,
`ChannelNorm_cuda_forward` and `ChannelNorm_cuda_backward` should accept Tensors instead of Variables as arguments.

So the problem is solved by changing the call to:

```python
channelnorm.ChannelNorm_cuda_backward(input1, output, grad_output.data,
                                      grad_input1.data, ctx.norm_deg)
```
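
For reference, here is a minimal sketch of how the whole backward could look after that change. This is my reconstruction under PyTorch 0.3 conventions, not the verbatim patch; see #35 for the actual diff.

```python
# assumes: from torch.autograd import Variable, and that the compiled
# `channelnorm` FFI module is imported at the top of the file

    @staticmethod
    def backward(ctx, grad_output):
        # input1 and output were saved in forward via ctx.save_for_backward
        input1, output = ctx.saved_tensors
        with torch.cuda.device_of(input1):
            b, c, h, w = input1.size()
            # allocate the gradient as a Variable so autograd accepts the return value
            grad_input1 = Variable(input1.new(b, c, h, w).zero_())
            # the FFI kernel expects raw THCudaTensors, hence .data on the Variables
            channelnorm.ChannelNorm_cuda_backward(input1, output, grad_output.data,
                                                  grad_input1.data, ctx.norm_deg)
        return grad_input1, None
```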

lfz commented Feb 8, 2018

Does it influence the final output? I mean, the input is now a Tensor, so is the output changed to a Tensor too?

hellock (Contributor) commented Feb 8, 2018

The final output of the backward function, such as `grad_input1`, should be a Variable. You can refer to #35 for details.

lfz commented Feb 9, 2018 via email

hellock (Contributor) commented Feb 9, 2018

OK, PyTorch is fixing lots of API inconsistencies. Thanks for the notice.

fitsumreda (Contributor) commented
We've added a Python 3 branch: https://github.com/NVIDIA/flownet2-pytorch/tree/python36
