
Bug: gradient of convolution in newest theano from master #3763

Closed

matthias-k opened this issue Dec 10, 2015 · 4 comments

@matthias-k
Contributor

I found a strange problem with the newest Theano from master. I implemented a Gaussian convolution. In Theano 0.7.0, taking the gradient of the output with respect to the kernel size works perfectly; however, with the newest master (0.7.0.dev-e521b20e578c033d51e548181bd1edd24af64427) I get an exception: ValueError: ('You cannot drop a non-broadcastable dimension.', ((True, False, False, False), (2, 3)))

Here is a minimal example to reproduce the problem:

import numpy as np
import theano
import theano.tensor as T

def gaussian_filter_theano_1d(input, sigma, window_radius=10):
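    # 1-D Gaussian kernel over [-window_radius, window_radius], normalized to sum to 1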
    filter_1d = T.arange(-window_radius, window_radius+1)
    filter_1d = filter_1d.astype(theano.config.floatX)
    filter_1d = T.exp(-0.5*filter_1d**2/sigma**2)
    filter_1d = filter_1d / filter_1d.sum()

    filter_W = filter_1d.dimshuffle(['x', 'x', 0, 'x'])
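    # filter_W has shape (1, 1, 2*window_radius+1, 1); the 'x' axes are broadcastable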

    blur_op = T.nnet.conv2d(input, filter_W, border_mode='full', filter_shape=[1, 1, None, None])
    return blur_op

x1  = T.tensor4('x')
x1_data = np.random.randn(1, 1, 300, 300)
sigma = T.scalar('sigma')
sigma_data = 20

y = gaussian_filter_theano_1d(x1, sigma, window_radius=3)
print(y.eval({x1: x1_data, sigma: sigma_data}))
T.grad(y.sum(), sigma)

It fails (with both Python 2.7 and Python 3.4) in the last line as follows:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-2e2d8647467e> in <module>()
     23 y = gaussian_filter_theano_1d(x1, sigma, window_radius=3)
     24 print(y.eval({x1: x1_data, sigma: sigma_data}))
---> 25 T.grad(y.sum(), sigma)

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
    559 
    560     rval = _populate_grad_dict(var_to_app_to_idx,
--> 561                                grad_dict, wrt, cost_name)
    562 
    563     for i in xrange(len(rval)):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1322         return grad_dict[var]
   1323 
-> 1324     rval = [access_grad_cache(elem) for elem in wrt]
   1325 
   1326     return rval

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
   1322         return grad_dict[var]
   1323 
-> 1324     rval = [access_grad_cache(elem) for elem in wrt]
   1325 
   1326     return rval

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in <listcomp>(.0)
    971             inputs = node.inputs
    972 
--> 973             output_grads = [access_grad_cache(var) for var in node.outputs]
    974 
    975             # list of bools indicating if each output is connected to the cost

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

/usr/local/lib/python3.4/dist-packages/theano/gradient.py in access_term_cache(node)
   1111                                 str(g_shape))
   1112 
-> 1113                 input_grads = node.op.grad(inputs, new_output_grads)
   1114 
   1115                 if input_grads is None:

/usr/local/lib/python3.4/dist-packages/theano/tensor/elemwise.py in grad(self, inp, grads)
    410             return [inp[0].zeros_like(dtype=theano.config.floatX)]
    411         else:
--> 412             return [DimShuffle(gz.type.broadcastable, grad_order)(
    413                 Elemwise(scalar.identity)(gz))]
    414 

/usr/local/lib/python3.4/dist-packages/theano/tensor/elemwise.py in __init__(self, input_broadcastable, new_order, inplace)
    162                     raise ValueError(
    163                         "You cannot drop a non-broadcastable dimension.",
--> 164                         (input_broadcastable, new_order))
    165 
    166         # this is the list of the original dimensions that we keep

ValueError: ('You cannot drop a non-broadcastable dimension.', ((True, False, False, False), (2,)))
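
For context, the traceback shows the error coming from the gradient of a DimShuffle: a dimshuffle may only drop dimensions that are broadcastable. A minimal sketch of that rule (the variable names and axes here are just for illustration):

import theano.tensor as T

v = T.tensor4('v')           # broadcastable = (False, False, False, False)
# v.dimshuffle(0, 1)         # would raise: you cannot drop a non-broadcastable dimension
b = T.addbroadcast(v, 2, 3)  # assert that dims 2 and 3 have length 1
w = b.dimshuffle(0, 1)       # dropping broadcastable dims is allowed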
@matthias-k
Contributor Author

I ran git bisect from 0.7.0 to HEAD and found that the bug was introduced in bcc9336, i.e. with the switch to abstract_conv_2d.

@nouiz
Member

nouiz commented Dec 11, 2015

I think you can use theano.tensor.nnet.conv.conv2d until we fix this.

Thanks for the report.
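
For reference, a minimal sketch of that workaround applied to the example above (untested; it assumes the legacy interface accepts the same inputs here, and drops the optional filter_shape hint):

from theano.tensor.nnet import conv

# inside gaussian_filter_theano_1d, replace the T.nnet.conv2d call with:
blur_op = conv.conv2d(input, filter_W, border_mode='full')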


@nouiz
Member

nouiz commented Dec 15, 2015

Just to let you know that we merged the fix.

Thanks for the report.
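
With the fix merged, the original example should run end to end. A quick check, assuming the names from the repro snippet above are still in scope:

g = T.grad(y.sum(), sigma)                       # no longer raises
print(g.eval({x1: x1_data, sigma: sigma_data}))  # scalar gradient w.r.t. sigma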

@matthias-k
Contributor Author

Great, thanks for the quick bugfix!
