allow symbolic in reshape #287
Conversation
Maybe we should check that the symbolic variable is actually a scalar? Then we can also get rid of that …
Your test is missing …

```python
# Secondly, replace all symbolic shapes with `None`, as we cannot
# infer their size here.
for dim, o in enumerate(output_shape):
    if isinstance(o, TensorVariable):
        output_shape[dim] = None
        masked_output_shape[dim] = None
```

You can also merge it with the first for loop if you find a good way of documenting it; otherwise just copy and paste my suggestion as is. Apart from that, I don't understand your use case. Why do you need a …?
Yeah, I see the test is not correct. My use case is for recurrent nets. When you combine recurrent nets and feed-forward nets … If you want both a variable batch size and a variable sequence length …
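A minimal sketch of that pattern, assuming the symbolic-shape support this PR adds (the layer sizes are hypothetical):

```python
import theano.tensor as T
from lasagne.layers import InputLayer, LSTMLayer, ReshapeLayer, DenseLayer

x = T.tensor3()                 # (batchsize, seqlen, num_inputs)
batchsize, seqlen, _ = x.shape  # symbolic scalars, both variable at run time
num_units, num_classes = 12, 5  # hypothetical sizes

l_in = InputLayer((None, None, 10))
l_rec = LSTMLayer(l_in, num_units=num_units)
# Flatten batch and time so a feed-forward layer is applied per time step:
l_flat = ReshapeLayer(l_rec, (batchsize * seqlen, num_units))
l_dense = DenseLayer(l_flat, num_units=num_classes)
# Restore the (batchsize, seqlen, ...) layout; both dims stay symbolic:
l_out = ReshapeLayer(l_dense, (batchsize, seqlen, num_classes))
```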
```python
elif isinstance(s, T.TensorVariable):
    if s.ndim != 0:
        raise ValueError(
            "The symbolic variable specifying shape must be a "
```
Maybe "A symbolic variable in a shape specification must be a scalar, but had %i dimensions"? The variable technically doesn't specify the shape, it only specifies one dimension.
Couldn't resist a bit of nitpicking, sorry! :) Apart from that error message everything looks great. I'm okay with merging either way, as I said myself we shouldn't be nitpicking about docs / error messages too much just yet (see #279).
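To illustrate the check under discussion (a hedged snippet; it assumes the diff above is merged as shown), the error fires only for symbolic variables with one or more dimensions:

```python
import theano.tensor as T
from lasagne.layers import InputLayer, ReshapeLayer

l_in = InputLayer((None, 10))
ReshapeLayer(l_in, (T.iscalar(), 10))  # fine: a 0-dim symbolic scalar
ReshapeLayer(l_in, (T.ivector(), 10))  # raises the ValueError: 1 dimension
```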
No worries. I think it's important to have meaningful errors, and your suggestion is more precise :) Maybe I should also add an example?
Updated with @benanne's error message and added an example with symbolic dimension specification.
Sweet, looks good to me. I'll leave this to @f0k to merge in case he has any further comments.
```python
>>> l_in = InputLayer((None, None, 10))
>>> l1 = ReshapeLayer(l_in, (batch_size*seq_len, [2]))
>>> l1.output_shape
(None, 10)
```
That's not a good example, because you could just do `ReshapeLayer(l_in, (-1, [2]))` for that. Better to remove the example than to keep people wondering.
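For comparison, the non-symbolic spelling suggested above infers the same output shape (same `l_in` as in the docstring example):

```python
>>> l1 = ReshapeLayer(l_in, (-1, [2]))
>>> l1.output_shape
(None, 10)
```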
True. Maybe we should remove the example from here and add an example in the recurrent docs?
Minor comments, but the code and the tests look good! So the use case for this is going from …
I removed the blanks. I agree with your comment about the example; it is probably more confusing. As I mentioned above, I think we should add an example in the recurrent docs. Do you agree?
Yes, that seems to be a good place! You'd need some example on how to use a custom subnetwork for the input-to-hidden or hidden-to-hidden connections anyway, and this will include the reshapes (unless it's hidden in the recurrent layer class, but I'm not sure if that'd be good).
Yes. I'll create an issue in @craffel's repo so we won't forget.
I removed the new example and removed the blanks.
Nice work, merging!
I'm writing the example with recurrent layers + dynamic reshape and I think I found a bug / undesired behavior:

```python
from lasagne.layers import *
import lasagne
from lasagne.nonlinearities import softmax
import theano.tensor as T
import numpy as np

x, y = T.tensor3(), T.matrix()
batchsize, seqlen, _ = x.shape
num_inputs, num_units, num_classes = 10, 12, 5
l_inp = InputLayer((batchsize, seqlen, 10))
l_lstm_fwd = LSTMLayer(l_inp, num_units=num_units)
l_lstm_bck = LSTMLayer(l_inp, num_units=num_units, backwards=True)
l_lstm = ConcatLayer([l_lstm_fwd, l_lstm_bck], axis=2)
l_shp1 = ReshapeLayer(l_lstm, (-1, 2*num_units))  # <<<<<<<<< problem
l_softmax = DenseLayer(l_shp1, num_units=num_classes, nonlinearity=softmax)
l_out = ReshapeLayer(l_softmax, (batchsize, seqlen, num_classes))
lasagne.layers.get_output(
    l_out, x).eval({x: np.ones((10, 5, num_inputs), dtype='float32')})
```

If I have -1 in the marked line, I get an error in the sanity check in line 155. If I replace it with …, it works. I haven't fully figured out what the sanity check does, but I thought I would mention it anyway. I think that the error message in line 98 should be changed to something like: …
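As a hypothetical illustration of the replacement (the exact expression is elided above), spelling out the product of the symbolic dimensions avoids inferring -1:

```python
# Hypothetical workaround: nothing to infer, so the -1 branch is skipped
l_shp1 = ReshapeLayer(l_lstm, (batchsize * seqlen, 2 * num_units))
```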
The problem seems to be that line 130 ff. replaces TensorVariables with `None`:

```python
for dim, o in enumerate(output_shape):
    if isinstance(o, T.TensorVariable):
        output_shape[dim] = None
        masked_output_shape[dim] = None
```

which is then converted to `None` in line 137:

```python
output_size = (None if any(x is None for x in masked_output_shape)
               else np.prod(masked_output_shape))
```

I think it can be solved by checking for tensors:

```python
def has_tensor_input(lst):
    return any(map(lambda v: isinstance(v, T.TensorVariable), lst))

has_tensor = (has_tensor_input(masked_input_shape)) or (
    has_tensor_input(masked_output_shape))
del masked_input_shape, masked_output_shape

# Finally, infer value for -1 if needed
if -1 in output_shape:
    dim = output_shape.index(-1)
    if (input_size is None) or (output_size is None):
        output_shape[dim] = None
        output_size = None
    else:
        output_size *= -1
        output_shape[dim] = input_size // output_size
        output_size *= output_shape[dim]

# Sanity check
if (input_size is not None) and (output_size is not None) \
        and (input_size != output_size) and not has_tensor:
    raise ValueError("%s cannot be reshaped to specification %s. "
                     "The total size mismatches." %
                     (input_shape, self.shape))
```
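A quick hypothetical check of the proposed helper:

```python
import theano.tensor as T

has_tensor_input([T.iscalar(), 10])  # True: contains a TensorVariable
has_tensor_input([None, 10])         # False: no symbolic entries
```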
Sorry for making this so long :) But I figured that maybe I'm not supposed to give input sizes that are non-int/[int] in the input layer? The shape layer works if I use …
Indeed, symbolic scalars are not supported in the shape specification of `InputLayer`.
OK. I guess it's old habit to pass anything called batchsize into the input layer.
No, we want the shapes to be non-symbolic.
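A hedged before/after of that fix, reusing the names from the snippet above:

```python
# Not supported: symbolic sizes in the InputLayer shape specification
l_inp = InputLayer((batchsize, seqlen, 10))
# Supported: use None for dimensions that vary at run time
l_inp = InputLayer((None, None, 10))
```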
Allow symbolic variables in the shape specification when using the reshape layer.
It's discussed here: https://groups.google.com/forum/#!topic/lasagne-users/eA995V73K8I
One use case is to allow for both variable batch size and sequence length when using recurrent nets.
It passes the tests, but I'm not sure if it has some undesired consequences?
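A minimal sketch of that use case (hypothetical names; it assumes the feature as merged):

```python
import numpy as np
import theano.tensor as T
from lasagne.layers import InputLayer, ReshapeLayer, get_output

x = T.tensor3()                  # (batchsize, seqlen, features)
batchsize, seqlen, _ = x.shape   # symbolic scalars
l_in = InputLayer((None, None, 10))
l_flat = ReshapeLayer(l_in, (batchsize * seqlen, [2]))  # symbolic first dim
y = get_output(l_flat, x)
print(y.eval({x: np.ones((3, 4, 10), dtype='float32')}).shape)  # (12, 10)
```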