Unshared convolution #6007
Conversation
Thanks for starting the pull request. I know you're not finished yet, but I left some first comments.
theano/tensor/nnet/abstract_conv.py
Outdated
@@ -1507,6 +1526,13 @@ def conv(self, img, kern, mode="valid", dilation=1):
    out_shape = get_conv_output_shape(img.shape, kern.shape,
                                      mode, [1] * self.convdim, dilation)

    out_rows = out_shape[self.convdim]
    out_cols = out_shape[self.convdim + 1]
This only works for 2D, so make sure you check that somewhere if you don't want to implement it now (the rest of the function does 3D as well).
I also don't think `self.convdim` is the best thing to use here; it doesn't really make sense. It's just a coincidence that batch and output channels use exactly `self.convdim` dimensions, so I'd just use 2.
Even better: you could make this whole first part work in a convdim-agnostic way, by checking if `kern.shape[2] == np.prod(out_shape[2:])`.
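A minimal sketch of what that convdim-agnostic check might look like (my own hypothetical helper, not from the PR; names are illustrative):

import numpy as np

def check_unshared_regions(kern_shape, out_shape):
    # Hypothetical helper: the flattened region count in the kernel should
    # equal the product of the spatial output dimensions, for any convdim.
    if kern_shape[2] != np.prod(out_shape[2:]):
        raise ValueError("number of unshared regions %d does not match "
                         "output size %s" % (kern_shape[2], tuple(out_shape[2:])))

# 2D example: a 3 x 4 output grid flattened into 12 regions
check_unshared_regions((8, 3, 12, 3, 3), (1, 8, 3, 4))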
theano/tensor/nnet/abstract_conv.py
Outdated
out_rows = out_shape[self.convdim]
out_cols = out_shape[self.convdim + 1]

if unshared == True:
Do you have a specific reason for explicitly checking for `== True` here? (See also `unshared == False` below.)
theano/tensor/nnet/abstract_conv.py
Outdated
else:
    for reg in xrange(kern.shape[1]):
        row = reg // out_cols
        col = reg - row * out_cols
Perhaps `col = reg % out_cols` would be nicer?
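(An aside of mine, not from the thread: Python's built-in divmod computes both values in one step.)

out_cols = 7
for reg in range(2 * out_cols):
    # equivalent to: row = reg // out_cols; col = reg % out_cols
    row, col = divmod(reg, out_cols)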
theano/tensor/nnet/abstract_conv.py
Outdated
if self.convdim == 3:
    raise NotImplementedError("Unshared 3D convolution not implemented")
elif kern.type.ndim != 3 + self.convdim:
    raise TypeError('kern must be %dD tensor' % (3 + self.convdim))
I would swap this: explicitly check that `self.convdim == 2` and raise an error if it is not. The AbstractConv class works for any convdim, even though it's only currently used for 2D and 3D.
If true, then unshared or 'locally connected' convolution will be
performed. A different kernel will be used for each region of the
input.
I think it would be useful to change the documentation for `filters` as well, to explain how you should pass the unshared weights.
What will the shape of the filter parameter look like for unshared convolution? From your Python implementation, it seems that you combine the […]. For comparison: Lasagne uses two additional dimensions, i.e., […].
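A hedged illustration of the 6D filter layout this PR appears to use at this stage (inferred from the indexing dilated_kern[n, im0, row, col, k_row, k_col] in the perform code quoted below; the variable names are mine):

import numpy as np

# one independent (kernel_rows x kernel_cols) filter per output position
nb_filters, nb_channels = 4, 3
out_rows, out_cols = 5, 5
kernel_rows, kernel_cols = 3, 3
filters = np.zeros((nb_filters, nb_channels,
                    out_rows, out_cols,
                    kernel_rows, kernel_cols))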
The method I've used to compute the gradients wrt inputs seems very messy. In order to express it as a convolution, the weight matrix had to be rearranged quite a bit. Is there a better way?
theano/tensor/nnet/abstract_conv.py
Outdated
@@ -1565,6 +1565,7 @@ def conv(self, img, kern, mode="valid", dilation=1, unshared=False):
if unshared is True:
    for row in xrange(kern.shape[2]):
        for col in xrange(kern.shape[3]):
            val = _valfrommode('valid')
            out[b, n, row, col] += _convolve2d(img[b, im0, row:row + kern.shape[4],
                                                   col:col + kern.shape[5]],
                                               dilated_kern[n, im0, row, col, ...],
Is convolve2d necessary here? Since the image part you're extracting has the same size as the kernel, isn't this equal to an elementwise multiplication and summation? (Perhaps with a flip somewhere.)
I'd retained `convolve2d` because it eliminated two more loops, and because otherwise I would have to flip the filter manually. But I've made it element-wise now.
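A quick numpy check of the equivalence being discussed (my own sketch): for a patch the same size as the kernel, a 'valid' 2D convolution reduces to a multiply-and-sum against the flipped kernel.

import numpy as np
from scipy.signal import convolve2d

rng = np.random.RandomState(0)
patch = rng.rand(3, 3)
kern = rng.rand(3, 3)

conv_val = convolve2d(patch, kern, mode='valid')   # shape (1, 1)
elemwise = np.sum(patch * kern[::-1, ::-1])        # flip, multiply, sum
assert np.allclose(conv_val[0, 0], elemwise)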
theano/tensor/nnet/abstract_conv.py
Outdated
else:
    kern[:, :, i, j, m, m] = kern[:, :, i + m, j + n, m, n]
    new_kern[:, :, i, j, m, n] = kern[:, :, i + m, j + n, m, n]
Does this work? It's somewhat complex, so I might have made a mistake in my manual derivations, but I think this is missing the first part of the kernel and shifting everything else too far to the left.
In a 1D case, with a 5-pixel input and 3-pixel kernel, I think you want the kernel to be changed like this:
kern = [[a, b, c], [d, e, f], [g, h, i]]
new_kern = [[_, _, a], [_, b, d], [c, e, g], [f, h, _], [i, _, _]]
but this implementation would change it to this, I think:
new_kern = [[a, b, c], [d, e, _], [g, _, _], [_, _, _], [_, _, _]]
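A small numpy sketch (mine, following the mapping the 1D example above implies): output position r touches input pixel r + k through kernel tap k, and the tap lands at the flipped index K - 1 - k in the rearranged kernel.

import numpy as np

K = 3                                 # kernel width
R = 3                                 # output positions (5-pixel input)
kern = np.arange(1, R * K + 1, dtype=float).reshape(R, K)  # a..i as 1..9
new_kern = np.zeros((R + K - 1, K))   # one row per input pixel, '_' as 0
for r in range(R):
    for k in range(K):
        new_kern[r + k, K - 1 - k] = kern[r, k]
# rows of new_kern: [_,_,a], [_,b,d], [c,e,g], [f,h,_], [i,_,_]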
Maybe it's better to implement the gradient computation directly in Python, without using […]. I guess the reason to use […]. I would suggest making a new helper function for unshared grad wrt inputs in […]. What do you think?
theano/tensor/nnet/abstract_conv.py
Outdated
if (kern.shape[2], kern.shape[3]) != (out_shape[2], out_shape[3]):
    raise ValueError('Kernel shape ({},{}) does not match '
                     'output size ({},{})'.format(kern.shape[2], kern.shape[3],
                                                  out_shape[2], out_shape[3]))

out = np.zeros(out_shape, dtype=img.dtype)
dil_kern_shp = kern.shape[:-self.convdim] + tuple(
A few lines below, I think you might need to change this assignment of `dilated_kern`:
dilated_kern[(slice(None), slice(None)) +
             tuple(slice(None, None, dilation[i]) for i in range(self.convdim))
             ] = kern
because it doesn't include the row/column dimensions in an unshared convolution.
theano/tensor/nnet/abstract_conv.py
Outdated
@@ -1022,7 +1044,8 @@ def conv2d_grad_wrt_weights(input,
    border_mode=border_mode,
    subsample=subsample,
    filter_flip=filter_flip,
    filter_dilation=filter_dilation)
    filter_dilation=filter_dilation,
    unshared=False)
I think you want `unshared=unshared` here.
I'd actually added this locally and forgotten to commit it. Fixed now, thanks.
Ditto here :)
Thanks for adding the tests. The forward convolution seems to work fine, good!
There still seems to be a small problem in the grad wrt weights. The test passes, but when I tried to test the value of the result it didn't work (see the test below).
grad_weights = conv.conv2d_grad_wrt_weights(inputs, topgrad,
                                            self.kshp, unshared=True)
grad_func = theano.function([inputs, topgrad], grad_weights, mode=self.mode)
grad_val = grad_func(inputs_val, topgrad_val)
Can I suggest a small extension of this test that also checks the result?
def test_gradweight(self):
    inputs = theano.tensor.tensor4()
    topgrad = theano.tensor.tensor4()
    inputs_val = np.random.random(self.imshp).astype(theano.config.floatX)
    topgrad_val = np.random.random(self.topgrad_shape).astype(theano.config.floatX)
    # compute gradient for unshared convolution
    grad_weights_unshared = conv.conv2d_grad_wrt_weights(inputs, topgrad,
                                                         self.kshp, unshared=True)
    grad_func_unshared = theano.function([inputs, topgrad], grad_weights_unshared, mode=self.mode)
    grad_val_unshared = grad_func_unshared(inputs_val, topgrad_val)
    # compute gradient for shared convolution as a reference
    grad_weights_shared = conv.conv2d_grad_wrt_weights(inputs, topgrad,
                                                       self.kshp, unshared=False)
    grad_func_shared = theano.function([inputs, topgrad], grad_weights_shared, mode=self.mode)
    grad_val_shared = grad_func_shared(inputs_val, topgrad_val)
    # summing over all unshared gradients should give the shared gradient
    utt.assert_allclose(grad_val_shared, np.sum(grad_val_unshared, axis=(2, 3)))
I'd actually added this locally and forgotten to commit it. Fixed now, thanks.
I have a question here: why do we flip the topgrad, convolve, and then flip the output again? Isn't this equivalent to flipping (or not flipping) the kern and convolving?
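A quick 1D numpy check of this equivalence (my own sketch, not from the PR): flipping the input, convolving, then flipping the result equals convolving with the flipped kernel.

import numpy as np

rng = np.random.RandomState(0)
a = rng.rand(5)    # stand-in for topgrad
k = rng.rand(3)    # stand-in for the kernel

lhs = np.convolve(a[::-1], k)[::-1]   # flip input, convolve, flip output
rhs = np.convolve(a, k[::-1])         # convolve with the flipped kernel
assert np.allclose(lhs, rhs)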
I've just packed it all into the gradInputs perform function for now. Do you think it looks fine or should I move it into a separate function?
Good work.
In general, I think it's good to try to keep things 2D/3D agnostic, except for the few cases where that is really inconvenient (mostly the `perform` functions). I suppose that once unshared convolution works for 2D, at some point we'll want to apply it to 3D as well, and then it's easier if we don't have to change more than necessary.
A second thing: the `perform` functions are becoming fairly long. Maybe it would be better to move some of the unshared-specific code to helper functions, so that the difference between shared and unshared is a bit easier to see. For instance, the `_convolve2d` function could have a `_convolve2d_unshared` equivalent.
Exaggerated, it's now a bit like this:
def perform(...):
    if self.unshared:
        # a lot of code with nested loops
    else:
        self.conv(a, b)
but I would prefer:
def perform(...):
    if self.unshared:
        self.unshared_conv(a, b)
    else:
        self.conv(a, b)
theano/tensor/nnet/abstract_conv.py
Outdated
else:
    dilated_kern[(slice(None), slice(None)) +
                 tuple(slice(None, None, dilation[i]) for i in range(self.convdim))
                 ] = kern
You can get rid of the `if` if you do something like this:
dilated_kern[(slice(None),) * (dilated_kern.ndim - self.convdim),
...
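(The suggestion above is truncated in the thread; a plausible completion, hedged, reusing the dilation slices from the surrounding code:)

# hypothetical completion of the truncated snippet above
dilated_kern[(slice(None),) * (dilated_kern.ndim - self.convdim) +
             tuple(slice(None, None, dilation[i])
                   for i in range(self.convdim))] = kern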
theano/tensor/nnet/abstract_conv.py
Outdated
dilated_kern[(slice(None),) * 4 +
             tuple(slice(None, None, dilation[i]) for i in range(self.convdim))
             ] = kern
I'd prefer not to use these magic numbers like 4 that only work for the 2D case, since it's probably useful to have unshared 3D convolution in the future as well.
theano/tensor/nnet/abstract_conv.py
Outdated
broadcastable = [topgrad.broadcastable[1],
                 img.broadcastable[1]] + ([False] * 4)
output = theano.tensor.TensorType(theano.config.floatX,
                                  broadcastable=broadcastable)()
Is there a reason you made this a special case for unshared convolution? Wouldn't the line `output = img.type.clone(...)()` for the shared convolution work here as well?
In any case, I don't think it's safe to assume that the image is of type `theano.config.floatX`, so using `img.type.clone` would probably be better?
theano/tensor/nnet/abstract_conv.py
Outdated
             ] = kern
# Flip the kernel since the convolution is done manually
dilated_kern = dilated_kern[(slice(None),) * 4 +
                            (slice(None, None, -1),) * 2]
I wouldn't do this flip here. Instead, you can do it when you compute the result of the convolution. That makes the shared and unshared cases more similar.
theano/tensor/nnet/abstract_conv.py
Outdated
for k_col in xrange(kern.shape[5]):
    out[b, n, row, col] += img[b, im0, row + k_row,
                               col + k_col] * \
        dilated_kern[n, im0, row, col, k_row, k_col]
Instead of flipping the kernel earlier, compute the correct indices here.
I was actually thinking of doing an `np.sum(np.multiply(..., ...))` to eliminate the two inner loops and not unnecessarily call `_convolve2d`. Do you think I should? That will mean flipping the kernel earlier.
Yes, that could work.
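A hedged sketch of what that replacement might look like (my own code; shapes and loop bounds are illustrative): the two inner k_row/k_col loops collapse into one vectorized multiply-and-sum per output position, with the kernel flipped along its last two axes.

import numpy as np

rng = np.random.RandomState(0)
img = rng.rand(1, 1, 5, 5)                 # (batch, channels, rows, cols)
dilated_kern = rng.rand(1, 1, 3, 3, 3, 3)  # (filters, channels, out_r, out_c, k_r, k_c)
out = np.zeros((1, 1, 3, 3))

b = n = im0 = 0
for row in range(out.shape[2]):
    for col in range(out.shape[3]):
        patch = img[b, im0, row:row + 3, col:col + 3]
        # flip the kernel's last two axes instead of calling _convolve2d
        out[b, n, row, col] += np.sum(patch *
                                      dilated_kern[n, im0, row, col, ::-1, ::-1])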
theano/tensor/nnet/abstract_conv.py
Outdated
if any(self.filter_dilation[i] > 1 for i in range(self.convdim)):
    kern = kern[(slice(None),) * 4 +
                tuple(slice(None, None, self.filter_dilation[i])
                      for i in range(self.convdim))]
Keep this outside the `if unshared`.
theano/tensor/nnet/abstract_conv.py
Outdated
                  (slice(None, None, -1),) * self.convdim)
topgrad = topgrad.transpose(axes_order)[flip_filters]
img = img.transpose(axes_order)

if self.unshared is True:
Remember to check that self.convdim == 2.
theano/tensor/nnet/abstract_conv.py
Outdated
# Rearranging the weights into the correct shape for unshared
if self.unshared is True:
Check that self.convdim == 2.
theano/tensor/nnet/abstract_conv.py
Outdated
@@ -1690,9 +1690,9 @@ def perform(self, node, inp, out_):
        img = new_img
    if not self.filter_flip:
        if self.unshared is True:
Check self.convdim == 2 (probably earlier in the function).
I've now added a check in the `__init__` function of `BaseAbstractConv`. Will that do?
It's certainly good to have it there, and hopefully it's somewhat temporary anyway. However, I do think I'd like to have an explicit check in `perform` as well. It's good to have the check and warning close to where the limitation actually is, in my opinion. It's useful as documentation and also makes it future-proof: if at some point the check is removed from BaseAbstractConv, it's better for `perform` to crash than to silently run a shared convolution.
Compare this with the 2D/3D limitation of the convolution: that's also explicitly checked in `BaseAbstractConv.conv`.
theano/tensor/nnet/abstract_conv.py
Outdated
reg_row = row - ((kern.shape[4] - 1) - k_row)
reg_col = col - ((kern.shape[5] - 1) - k_col)
reg_row = max(0, min(reg_row, kern.shape[2] - 1))
reg_col = max(0, min(reg_col, kern.shape[3] - 1))
I'm surprised to see max/min here, are you sure this is correct? If a value falls outside the available range, wouldn't you want to ignore it, rather than add it to the nearest row/col?
Isn't this the same as the loops and indices for the forward convolution, except that the source and target variables are swapped?
I'm surprised to see max/min here, are you sure this is correct? If a value falls outside the available range, wouldn't you want to ignore it, rather than add it to the nearest row/col?
I did this since the topgrad for these entries will be zero anyway, so nothing will be added. But I've made a new version now.
Isn't this the same as the loops and indices for the forward convolution, except that the source and target variables are swapped?
The difference is that the `row` and `col` of the kern vary within the inner two loops also.
Possibly, yes. Maybe there is a combination of dilation, subsampling etc. where it doesn't? (I tried to come up with an example that wouldn't work, but I failed.)
Regarding moving code into helper functions, I'd probably have to have different functions for each of the three operations (forward, gradWeights, gradInput), since the computation performed in the loop is different in each of these cases (unlike for a normal convolution, where the same conv function can be used). Should I go ahead and create separate functions just the same?
Yes, I still think that would be useful. But maybe that's a matter of taste. Ideally, I'd like to see a […]
theano/tensor/nnet/abstract_conv.py
Outdated
    raise NotImplementedError('Unshared convolution not implemented for %dD'
                              % self.convdim)
elif kern.type.ndim != 6:
if kern.type.ndim != 6:
`!= 2 + 2 * self.convdim`
theano/tensor/nnet/abstract_conv.py
Outdated
reg_col = max(0, min(reg_col, kern.shape[3] - 1))
if reg_row not in range(kern.shape[2]) or \
        reg_col not in range(kern.shape[3]):
    continue
I don't think this is very efficient: it's creating these `range` objects for every pixel. Normal comparisons are probably faster. (And I'd invert the comparison, so you don't need the `continue`.)
Actually, I think you can do with only one helper function for unshared convolution, if you use a […]
I like the new `unshared2d` function.
theano/tensor/nnet/abstract_conv.py
Outdated
else:
    kshp = kernel_shape[2:]

kshp = kernel_shape[-2:]
I like that you got rid of the `unshared` argument. Perhaps you could use the length of `image_shape` to compute the value of `convdim`, then use that instead of 2?
theano/tensor/nnet/abstract_conv.py
Outdated
@@ -1520,7 +1517,7 @@ def flops(self, inp, outp):
        raise NotImplementedError(
            'flops not implemented for convdim={}', self.convdim)

def conv(self, img, kern, mode="valid", dilation=1, unshared=False):
def conv(self, img, kern, mode="valid", dilation=1, unshared=False, direction=0):
Could you change the `direction` to a string? (E.g., "forward", "backprop weights", "backprop inputs", as in BaseCorrMM.)
theano/tensor/nnet/abstract_conv.py
Outdated
dil_kernshp = tuple((kern.shape[2 + i] - 1) * self.filter_dilation[i] + 1
                    for i in range(self.convdim))
dil_kernshp = tuple((kern.shape[2 + i] - 1) * self.filter_dilation[i] + 1
                    for i in range(self.convdim))
Doesn't `kern.shape[2 + i]` give you the output shape for unshared convolution?
If you don't want a special case for unshared, perhaps `kern.shape[-self.convdim + i]` would work?
theano/tensor/nnet/abstract_conv.py
Outdated
if (reg_row >= 0 and reg_row < kern.shape[0]) and \
        (reg_col >= 0 and reg_col < kern.shape[1]):
    out[row, col] += kern[reg_row, reg_col, k_row, k_col] * \
        inp[reg_row, reg_col]
I think this is more complicated than it needs to be. If you loop over the correct range, I don't think you need these `if`s checking whether you're inside the range.
Reversing your forward convolution, these loops give the same result for the grad wrt inputs:
for row in xrange(kern.shape[0]):
    for col in xrange(kern.shape[1]):
        for k_row in xrange(kern.shape[2]):
            for k_col in xrange(kern.shape[3]):
                out[row + k_row, col + k_col] += inp[row, col] * \
                    kern[row, col, -(k_row + 1), -(k_col + 1)]
This certainly looks a lot simpler!
But for the common `unshared2d` function, the outer two loops are common for all three directions (they go from `0` to `out_shape[0]` and `0` to `out_shape[1]`). We need them to go from `0` to `kern.shape[0]` and `0` to `kern.shape[1]`.
I think the easiest way to fix this is to have a condition to check whether `row` and `col` are less than `kern.shape[0]` and `kern.shape[1]` respectively, rather than modifying the loops for `k_row` and `k_col`.
The conditions will look much simpler and cleaner also.
I haven't yet looked at your C changes in detail. I'll do that when there is a bit more.
theano/tensor/nnet/abstract_conv.py
Outdated
                out[row + k_row, col + k_col] += inp[row, col] * \
                    kern[row, col, -(k_row + 1), -(k_col + 1)]
else:
    raise ValueError("unshared2d: invalid value for 'direction'")
It might be helpful to include the value in the error message.
theano/tensor/nnet/abstract_conv.py
Outdated
        out[row, col, ...] += kern[row, col] * \
            inp[row:row + out_shape[2], col:col + out_shape[3]]
elif direction == "backprop inputs":
    if row < kern.shape[0] and col < kern.shape[1]:
I appreciate your effort not to duplicate code, but I still think this solution is less than ideal. It feels a bit silly to have two loops that go on for too long, and then correct this with an if that checks the indices.
There may also be a conceptual problem: your for loops suggest that you're iterating over the rows and columns of `out`, while in fact for the grad wrt inputs you're iterating over the rows and columns of `kern` (which just happens to be slightly smaller than `out`).
So, I would propose to move the `for row` and `for col` loops inside the ifs, so that they're specific to each direction. This adds three lines of code, but is easier to understand and does not have loops that are longer than necessary. (As an added bonus it tests the value of `direction` only once.)
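To make the proposal concrete, a minimal single-channel sketch of what direction-specific loops could look like (my own code, hypothetical names; cross-correlation convention, with kern holding one filter per output position):

import numpy as np

def unshared2d_sketch(inp, kern, out_shape, direction="forward"):
    # kern shape: (out_rows, out_cols, k_rows, k_cols)
    out = np.zeros(out_shape, dtype=inp.dtype)
    if direction == "forward":
        for row in range(out_shape[0]):
            for col in range(out_shape[1]):
                patch = inp[row:row + kern.shape[2], col:col + kern.shape[3]]
                out[row, col] = np.sum(patch * kern[row, col])
    elif direction == "backprop inputs":
        # iterate over kern's own grid, scattering into the input gradient
        for row in range(kern.shape[0]):
            for col in range(kern.shape[1]):
                out[row:row + kern.shape[2],
                    col:col + kern.shape[3]] += inp[row, col] * kern[row, col]
    else:
        raise ValueError("unshared2d_sketch: invalid direction %r" % direction)
    return out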
Right now, the weights are of the form […]. Also, it makes things easier in the C code (having those three dims together).
@gvtulder Do you think I should go ahead and change this?
@vikramnitin9 I'm not yet convinced that it's a good idea. Of the current order, I like that the first two dimensions always have the same meaning for shared and unshared. There's now a clear relation […]. Lasagne uses […]. Would it make the C or GPU code a lot more efficient if the […]?
im2col reshapes the image as […]. For unshared, it could be reshaped to […]. So that's why I said the C code will become easier. Lasagne uses […]. Pylearn2 uses […]. Either way, the three dimensions being together makes the reshaping easy.
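For context, a hedged sketch of the GEMM shapes involved (illustrative names only; the exact layouts quoted in the thread were lost):

# Illustrative shapes, not the PR's exact code:
#
# shared convolution with im2col + GEMM:
#   cols : (C * kh * kw, out_h * out_w)    # one column per output position
#   W    : (n_filters, C * kh * kw)
#   out  = W . cols                        # (n_filters, out_h * out_w)
#
# unshared convolution needs a different filter per output position, so the
# single GEMM becomes a batched GEMM over the out_h * out_w positions:
#   cols : (out_h * out_w, C * kh * kw)    # same data, position-major
#   W    : (out_h * out_w, n_filters, C * kh * kw)
#   out[p] = W[p] . cols[p]                # one small GEMM per position p
#
# keeping (out_h, out_w) adjacent in the weight tensor makes the reshape
# to (out_h * out_w, ...) a cheap view, which is the point being made above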
@vikramnitin9 You're right, I misread the order of the Lasagne weights. If it's more efficient, as you say, to have […]
Sure. I'm currently working on the C code under the assumption that weights are in the required format. Still need to do grad wrt inputs and add tests.
Rebase done. Should be all fine now.
jenkins test this
The Travis tests passed except for the abstractconv, which timed out as usual. For the Jenkins test, should I add unshared tests to […]
Hmm, yes, there need to be some GPU tests. I thought I had seen them, also because you had these errors, but apparently I was wrong. I think there need to be tests in […]
Sorry for the delay. Here are the GPU tests.
Thanks, the tests look good. I have some comments about explicitly skipping invalid parameter values, but otherwise I think they should be fine.
@@ -26,7 +26,7 @@ def setup_class(cls):
    # provide_shape is not used by the cuDNN implementation
    cls.provide_shape = [False]

def tcase(self, i, f, s, b, flip, provide_shape, fd=(1, 1)):
def tcase(self, i, f, s, b, flip, provide_shape, fd=(1, 1), u=False):
It would be good to raise a SkipTest error if unshared is true. (Like for `fd`.) See also the other instances below.
I've changed this, thanks
@@ -196,7 +196,7 @@ def setup_class(cls):
    cls.shared = staticmethod(gpuarray_shared_constructor)
    cls.mode = mode_with_gpu.excluding('cudnn')

def tcase(self, i, f, s, b, flip, provide_shape, fd=(1, 1, 1)):
def tcase(self, i, f, s, b, flip, provide_shape, fd=(1, 1, 1), u=False):
SkipTest for those cases as well. (You shouldn't just ignore values you don't like, since that would suggest that the test is run and passes.)
For the 3D case, I've put a check in `run_fwd` to SkipTest if unshared is True and it is not a 2D convolution.
Actually, that would require passing the parameter here, which I haven't done. Rather than add it, I'll just put a SkipTest right here.
@@ -425,8 +425,7 @@ def test_non_contiguous(self):

class TestUnsharedCorr2D(utt.InferShapeTester):
    if theano.config.mode == "FAST_COMPILE":
        # mode = theano.compile.get_mode("FAST_RUN").excluding("conv_dnn")
        mode = theano.compile.get_default_mode().including('gpuarray').excluding('gpu').excluding('cudnn')
        mode = theano.compile.get_mode("FAST_RUN").excluding("conv_dnn")
Do you need to exclude conv_dnn here? The shared convolution tests don't do this.
@@ -453,13 +452,13 @@ def test_fwd(self):

conv_unshared = corr.CorrMM(unshared=True, subsample=self.sub,
                            filter_dilation=self.dil)(self.input, self.filters)
unshared_func = theano.function([self.input, self.filters], conv_unshared)
unshared_func = theano.function([self.input, self.filters], conv_unshared, mode=self.mode)
This looks a bit weird, but that is because you're using the Python implementation as your reference, right? I'm wondering whether it wouldn't be better to have a separate implementation for the tests, like the one that's used for the normal convolution.
Jenkins test this please
Also, tests are timing out on Travis because some parts exceed 50 minutes.
A few small things I came across.
theano/gpuarray/blas.py
Outdated
PyErr_Format(PyExc_ValueError,
             "GpuCorrMM: impossible output shape\\n"
             "  bottom shape: %%ld x %%ld x %%ld x %%ld\\n"
             "  weights shape: %%ld x %%ld x %%ld x %%ld %%ld %%ld\\n"
There are some `x` missing near the end.
theano/gpuarray/blas.py
Outdated
PyErr_Format(PyExc_ValueError,
             "GpuCorrMM backprop wrt. weights: impossible output shape\\n"
             "  bottom shape: %%ld x %%ld x %%ld x %%ld\\n"
             "  weights shape: %%ld x %%ld x %%ld x %%ld %%ld %%ld\\n"
There are some `x` missing near the end.
PyErr_Format(PyExc_ValueError,
             "GpuCorrMM backprop wrt. inputs: impossible output shape\\n"
             "  bottom shape: %%ld x %%ld x %%ld x %%ld\\n"
             "  weight shape: %%ld x %%ld x %%ld x %%ld\\n"
This is missing the extra two dimensions.
theano/gpuarray/corr_gemm.c
Outdated
if (unshared) {
    PyErr_Format(PyExc_ValueError,
                 "GpuCorrMM requires weight to be C-contiguous, "
                 "but strides are: %ld %ld %ld %ld\n",
This is missing two dimensions.
theano/gpuarray/corr_gemm.c
Outdated
" bottom shape: %d %d %d %d\n" | ||
" weight shape: %d %ld %ld %d %d %d" | ||
" (expected %d %d %d %d %d %d)\n" | ||
" top shape(calculated): %d %d %d %d\n", |
Since the shared version uses `%ld` everywhere, it's probably good to do that here as well. (My compiler gave some warnings here.)
theano/gpuarray/corr_gemm.c
Outdated
"GpuCorrMM shape inconsistency:\n" | ||
" bottom shape: %d %d %d %d\n" | ||
" weight shape: %d %d %d %d %d %d\n" | ||
" top shape: %ld %ld %ld %ld (expected %d %d %d %d)\n", |
Idem.
For what it's worth: I tried to use […]. It may be interesting to change the GPU version to use the batched […].
Interestingly, I had a segfault with opencl on the Intel Integrated HD Graphics device, but it ran fine on my i5 CPU.
When you start working on the rebase, be careful that you keep things backwards compatible: put the new parameters, props etc. for unshared convolution after the parameters that have already been added for the grouped convolution.
Update: I'm working on the integration with grouped convolution. Should be done tomorrow or the day after.
I've squashed the commits, rebased and started a new PR since this one is already very messy. That is at #6286. I'm closing this PR.
Related to #5620.
Task list:
- C code for grad inputs and tests
- All tests need to pass
- Change weight tensor dimension ordering instead of dimshuffle
- Integrate unshared tests with AbstractConv
- GPU code