TypeError: An update must have the same type as the original shared variable #728
Currently supplying my own weights `W` of exactly the same shape (checked by calling …).
Can you post a minimal script to replicate the error?
Took me a bit to slim it down sufficiently. Right now, I hope it's good enough to demonstrate the bug: https://gist.github.com/rubenvereecken/d9277cf861d1c297d953d4c27edcdd02. Change …
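For readers without access to the gist, here is a minimal sketch of the kind of setup that triggers the error. It is a hypothetical reconstruction, not the gist itself, and it assumes the old CUDA backend (which `Conv2DCCLayer` requires, along with `num_filters` divisible by 16):

```python
import theano
import theano.tensor as T
import lasagne
from lasagne.layers.cuda_convnet import Conv2DCCLayer

# A network with channel depth 1 -- the case reported as failing.
input_var = T.tensor4('inputs')
l_in = lasagne.layers.InputLayer((None, 1, 32, 32), input_var)
l_conv = Conv2DCCLayer(l_in, num_filters=16, filter_size=(3, 3))

loss = lasagne.layers.get_output(l_conv).mean()
params = lasagne.layers.get_all_params(l_conv, trainable=True)
updates = lasagne.updates.adam(loss, params)

# Compiling the training function is where the TypeError surfaces.
train_fn = theano.function([input_var], loss, updates=updates)
```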
Looks like the pylearn2 cuda_convnet wrapper might need a fix like this one? Theano/Theano#3774
Very good, that seems spot on! So #715 didn't do anything wrong, but it uncovered a bug in pylearn2. @rubenvereecken: You can file a PR to pylearn2 and hope they merge it (it's not in active development any more). The method in question is https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/sandbox/cuda_convnet/img_acts.py#L157, and the same in …

A workaround in Lasagne would be to enforce the broadcast pattern on the update, but this might hide other bugs, so we shouldn't implement it. Anyway, if you want to do this in your own code, it would be as simple as:

```python
import theano.tensor as T

def fix_update_bcasts(updates):
    # Coerce each update expression to its parameter's broadcast pattern.
    for param, update in updates.items():
        if param.broadcastable != update.broadcastable:
            updates[param] = T.patternbroadcast(update, param.broadcastable)
```
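To show where this hook would go, here is a hypothetical training setup (the `network`, `loss`, `input_var`, and `target_var` names are placeholders, not from this thread):

```python
import theano
import lasagne

# Build the updates dict as usual, then patch the broadcast patterns
# in place before handing it to theano.function.
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adam(loss, params)
fix_update_bcasts(updates)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```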
Just bumping this again as I saw this problem elsewhere too.
Elsewhere in pylearn2, or elsewhere altogether? I believe the best solution would be to fix this particular instance in pylearn2. We probably don't want to ignore this kind of problem in either Theano or Lasagne.
I don't know anyone using Pylearn2 anymore. It was with Lasagne: https://github.com/MarcCote/sb_resnet/blob/master/sb/sb_resnet.py#L133. The fix in that repo: …
Yes, the OP's case was for cuda-convnet. It doesn't return the correct broadcast pattern for the gradient; that's why the problem crept up.
That was a mistake in their ADAM implementation, then; Lasagne does it correctly. If this turns out to affect many users, the easiest solution may be an extra keyword argument to …
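For context, here is a sketch of the pattern Lasagne's update rules use to get this right (paraphrased from memory, not copied from `lasagne/updates.py`): the extra shared state an optimizer keeps per parameter, such as Adam's moment estimates, is created with the parameter's own broadcast pattern, so the resulting update expression has the same type as the parameter.

```python
import numpy as np
import theano

def make_state_like(param):
    # Create optimizer state (e.g. a moment estimate) that matches the
    # parameter's shape, dtype *and* broadcast pattern.
    value = param.get_value(borrow=True)
    return theano.shared(np.zeros(value.shape, dtype=value.dtype),
                         broadcastable=param.broadcastable)
```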
```
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
```

```python
def adam(lr, tparams, grads, inp, cost):
    ...
```
If you look closely at the error message, the first type is `TensorType(float32, matrix)` and the second is `TensorType(float64, matrix)`. You can't know this, but "matrix" is the short term for a tensor of two non-broadcastable dimensions, so the broadcast pattern is the same for both. (Theano could actually check this and omit the hint about the broadcast pattern; it is misleading here.) What's different, though, is the dtype: float32 vs. float64. Somewhere you are either starting with a float64, or a float32 is upcast to a float64. A possible source of upcasting is an operation that involves a float32 and a numpy integer or float64. To narrow it down, try:

```python
assert m.dtype == m_t.dtype
assert v.dtype == v_t.dtype
assert p.dtype == p_t.dtype
```

Any reason why you cannot just use …
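To make the upcasting mechanism concrete, a small illustration (CPU tensor types for simplicity; the same promotion rule produces the mismatch reported above):

```python
import numpy as np
import theano

p = theano.shared(np.zeros((3, 3), dtype='float32'))

bad = p + np.float64(0.1)   # float32 combined with a numpy float64 scalar
print(bad.dtype)            # float64 -> would trigger the update TypeError

good = p + np.float32(0.1)  # keep constants at the parameter's dtype
print(good.dtype)           # float32
```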
First off, this is not a usage question. This is, I believe, a bug report.
I get this issue using any 2D convolutional network with a channel depth of 1. Any higher number is fine and does not give me the exception quoted in the title.
I think the exception is indeed about the broadcastable dimension at `index=1`.
Printing out the variable that gave me the error, it's apparently the weight tensor `W` of the convolutional layer. It looks like this: `<CudaNdarrayType(float32, (False, True, False, False))>`.
If I change the channel depth (1st dimension) to 2 and look at `W` again, it is simply `<CudaNdarrayType(float32, 4D)>`.
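As an aside on what those printouts mean, here is a small sketch of how a shared variable can carry a broadcast pattern at all (plain CPU tensors here; the `CudaNdarrayType` above is the GPU equivalent):

```python
import numpy as np
import theano

# By default, shared variables are non-broadcastable on every axis ("4D").
W_plain = theano.shared(np.zeros((8, 1, 3, 3), dtype='float32'))
print(W_plain.broadcastable)   # (False, False, False, False)

# A broadcast pattern can be forced at creation time, which yields the
# (False, True, False, False) type observed above.
W_bcast = theano.shared(np.zeros((8, 1, 3, 3), dtype='float32'),
                        broadcastable=(False, True, False, False))
print(W_bcast.broadcastable)   # (False, True, False, False)
```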
Now, I believe the seasoned Theano user will immediately see something along the lines of "yep, it's definitely that broadcastable dimension messing things up", but that still doesn't take away that this is unexpected and probably unwanted, though I might be missing a use case for this behavior.
During my struggles yesterday I fixed this by adding a reshape layer that did absolutely nothing: it simply reshaped to exactly the same shape. My next attempt will be to follow the exception's advice and unbroadcast that dimension.
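For completeness, a self-contained illustration of the exception's `tensor.unbroadcast` hint (the broadcastable update is forced here with `patternbroadcast` just to reproduce the mismatch):

```python
import numpy as np
import theano
import theano.tensor as T

W = theano.shared(np.zeros((8, 1, 3, 3), dtype='float32'), name='W')

# Artificially give the update a broadcastable axis 1, mimicking the bug.
update = T.patternbroadcast(W * np.float32(0.9), (False, True, False, False))
print(update.broadcastable)    # (False, True, False, False)

# Following the exception's advice restores W's non-broadcastable type.
fixed = T.unbroadcast(update, 1)
print(fixed.broadcastable)     # (False, False, False, False)
theano.function([], [], updates={W: fixed})  # compiles without the TypeError
```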