Advanced Reshape Layer #36

Closed
wants to merge 7 commits into from

Conversation

@patyork (Contributor) commented Apr 5, 2015

I've added a bit of an architecture change, as well as a new layer to allow Advanced Reshaping.

Basically, to allow reshapes involving the first dimension, the number of samples in the current batch (current_batch_size) must be available to the AdvancedReshape layer. The easiest way that I could see to allow that is to throw a new parameter (current_batch_size) to the Layer.output function. This recursively passes the current number of samples in the batch to each layer, so that it has it available if it is necessary. This required touching every layer.

The AdvancedReshape layer takes a lambda expression for the initialization parameter. This lambda should take 2 arguments: current_batch_size and current_shape (can be seen below). From these parameters, it is possible to reshape between 1D, 2D, 3D (and possibly to ND, although I am unsure on that). This lambda should return a tuple.
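
For reference, here is a minimal sketch of what such a layer could look like under this proposal (a hypothetical illustration only; the actual code in this PR may differ in details such as the import path and the exact base-class signature):

from keras.layers.core import Layer

class AdvancedReshape(Layer):
    def __init__(self, new_shape_fn):
        super(AdvancedReshape, self).__init__()
        # new_shape_fn: lambda (current_batch_size, current_shape) -> tuple
        self.new_shape_fn = new_shape_fn

    def get_output(self, train, current_batch_size):
        # get_input/get_output now receive current_batch_size, per the architecture change described above
        X = self.get_input(train, current_batch_size)
        new_shape = self.new_shape_fn(current_batch_size, X.shape)
        return X.reshape(new_shape)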

Below is an example in which the current_batch_size changes on the pass over the last batch. It is simplistic and not overly useful, but it shows that AdvancedReshape can correctly reshape from 2D -> 3D and then from 3D -> 2D without breaking over the toughest example:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, AdvancedReshape  # AdvancedReshape is the layer added by this PR (import path assumed)
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(2,1, activation='sigmoid'))

# Reshape to 3D: (number of samples in current batch, elements in each sample, values)
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))
# Reshape back to 2D: (number of samples in current batch * elements in each sample, values)
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size * current_shape[1], current_shape[2])))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mse', optimizer=sgd)

X = np.zeros((3,2))
Y = np.zeros((3,1))
model.fit(X, Y, batch_size=2, nb_epoch=1)
Epoch 0
2/3 [===================>..........] - ETA: 0s - loss: 0.2500
3/3 [==============================] - 0s - loss: 0.2497
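
To spell out the shape arithmetic per batch in the run above (illustration only; these are the same lambdas as in the model, and the names to_3d/to_2d are just for this demonstration):

to_3d = lambda current_batch_size, current_shape: \
    (current_batch_size, current_shape[0] // current_batch_size, current_shape[1])
to_2d = lambda current_batch_size, current_shape: \
    (current_batch_size * current_shape[1], current_shape[2])

# first batch: 2 samples, Dense output shape (2, 1)
assert to_3d(2, (2, 1)) == (2, 1, 1)
assert to_2d(2, (2, 1, 1)) == (2, 1)

# last batch: only 1 sample remains, so current_batch_size drops to 1
assert to_3d(1, (1, 1)) == (1, 1, 1)
assert to_2d(1, (1, 1, 1)) == (1, 1)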

@patyork (Contributor, Author) commented Apr 5, 2015

A less trivial example (and, in fact, one that is quite useful). The model below is a deep net: three Dense layers, a recurrent layer, and a final Dense layer:

# This model is similar to the architecture utilized in DeepSpeech [http://arxiv.org/abs/1412.5567]
#   which attained state-of-the-art performance in speech recognition in noisy environments
# The only changes would be:
#   -a Bidirectional RNN (BRNN) instead of a simple RNN layer,
#   -Clipped ReLU instead of PReLU (although PReLU may perform better)
#   -the loss would be NLL of Connectionist Temporal Classification (CTC) cost
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, AdvancedReshape  # AdvancedReshape is the layer added by this PR (import path assumed)
from keras.layers.advanced_activations import PReLU
from keras.layers.recurrent import SimpleRNN
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(1520,2048))
model.add(PReLU(2048))
model.add(Dropout(p=.15))
model.add(Dense(2048,2048))
model.add(PReLU(2048))
model.add(Dropout(p=.15))
model.add(Dense(2048,2048))
model.add(PReLU(2048))
model.add(Dropout(p=.15))
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))
model.add(SimpleRNN(2048, 2048))
model.add(PReLU(2048))
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size*current_shape[1], current_shape[2])))
model.add(Dense(2048,30, activation='softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

@fchollet (Member) commented Apr 6, 2015

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))
model.add(SimpleRNN(2048, 2048))
model.add(PReLU(2048))
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size*current_shape[1], current_shape[2])))

Could you provide more detail about what is going on here, in terms of what quantities the dimensions in the successive shapes stand for? I'm having trouble following.

@patyork (Contributor, Author) commented Apr 6, 2015

Sure.

At the point in the model shown below, the batch is stacked into a 2D matrix that is passed around.

However, an RNN needs a tensor3 of size (nb_samples, time_steps, values). The lambda below creates this shape, since time_steps == current_shape[0] / current_batch_size.

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))

Then, the data is sent through the RNN:

model.add(SimpleRNN(2048, 2048))

Finally, we want to restack the tensor3 into a 2D matrix, which is given by the lambda expression:

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size*current_shape[1], current_shape[2])))
# which is equivalent to
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_shape[0]*current_shape[1], current_shape[2])))

Full example, with numbers.

# let batch size = 5
# let there be 6 time steps in each sample

# Incoming shape: (5 * 6, 2048) == (30, 2048)

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))
# Out shape: (5, 30/5, 2048) == (5, 6, 2048)

model.add(SimpleRNN(2048, 2048))
# Output shape: (5, 6, 2048)

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size*current_shape[1], current_shape[2])))
# Output shape: (5 * 6, 2048) == (30, 2048)
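
The same arithmetic can be checked with plain NumPy (illustration only; NumPy's reshape stands in for the Theano reshape that the layer performs):

import numpy as np

batch_size, time_steps, values = 5, 6, 2048
X2d = np.zeros((batch_size * time_steps, values))   # (30, 2048): the stacked batch

X3d = X2d.reshape((batch_size, X2d.shape[0] // batch_size, X2d.shape[1]))
assert X3d.shape == (5, 6, 2048)

X2d_again = X3d.reshape((batch_size * X3d.shape[1], X3d.shape[2]))
assert X2d_again.shape == (30, 2048)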

@fchollet (Member) commented Apr 6, 2015

model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: (current_batch_size, current_shape[0]/current_batch_size, current_shape[1])))
# Out shape: (5, 30/5, 2048) == (5, 6, 2048)

This seems to assume current_shape[0] != current_batch_size, i.e. current_shape == X.shape[1:]. But looking at the code:

nshape = self.new_shape_fn(current_batch_size, X.shape) 

Am I missing something?

@patyork (Contributor, Author) commented Apr 6, 2015

This is not an assumption; it is a fact whenever the batch size passed into model.fit is not 1 (batch_size != 1).

self.new_shape_fn(...) is the function that produces the new shape; it is the lambda given to the AdvancedReshape layer's init.
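
As a quick stand-alone check (using the numbers from the example earlier in this thread), the lambda can be called directly, outside any layer:

new_shape_fn = lambda current_batch_size, current_shape: \
    (current_batch_size, current_shape[0] // current_batch_size, current_shape[1])

print(new_shape_fn(5, (30, 2048)))  # -> (5, 6, 2048)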

I will provide a full working example in a moment. But let me think on it for a while, before I give examples with actual numbers.

@patyork (Contributor, Author) commented Apr 6, 2015

EDIT:
Accidentally closed.

@patyork closed this Apr 6, 2015
@fchollet reopened this Apr 6, 2015
@fchollet (Member) commented Apr 6, 2015

So you're applying your reshape lambda to X.shape, where X is the input to the AdvancedReshape layer. It seems to me like X.shape[0] would be the number of samples in the batch. How is this not the case?

As far as I can tell, current_batch_size is not a user-facing parameter but is the first element of the shape of the input hitting the layer at the current iteration.

I'm just trying to understand how this works...

Looking at your code:

# we'll note X = layer.get_input(train, current_batch_size) at each layer. nb_samples is arbitrary.
# we're calling this model on an input of shape [nb_samples, 1520]. According to the changes in models.py,
# that means the value of current_batch_size being propagated throughout the entire model is nb_samples.
model = Sequential() 
model.add(Dense(1520,2048)) # X.shape == [nb_samples, 1520]
model.add(PReLU(2048)) # X.shape == [nb_samples, 2048]
model.add(Dropout(p=.15))
model.add(Dense(2048,2048)) 
model.add(PReLU(2048)) # X.shape == [nb_samples, 2048]
model.add(Dropout(p=.15))
model.add(Dense(2048,2048))
model.add(PReLU(2048)) # X.shape == [nb_samples, 2048]
model.add(Dropout(p=.15))
model.add(AdvancedReshape(new_shape_fn=lambda current_batch_size, current_shape: \
(current_batch_size, current_shape[0]/current_batch_size, current_shape[1]))) # X.shape == [nb_samples, 2048]
# hence the lambda is called on (nb_samples, (nb_samples, 2048))
# and returns (nb_samples, nb_samples/nb_samples, 2048)

At least that's how I understand it. Where is this incorrect?

@patyork (Contributor, Author) commented Apr 8, 2015

This assumed that Keras supports 3D input to Dense layers (which is required for networks that include at least one recurrent layer together with non-recurrent layers).

@patyork closed this Apr 8, 2015
howard0su pushed a commit to howard0su/keras that referenced this pull request Jan 28, 2017
@harish2704

Hi, is there any update on this PR? Is there any chance that this will get merged? I am also facing a similar issue.

hubingallin pushed a commit to hubingallin/keras that referenced this pull request Sep 22, 2023
pnacht pushed a commit to pnacht/keras that referenced this pull request Nov 10, 2023