
Order of weights in LSTM #3088

Closed
bennythedataguy opened this issue Jun 28, 2016 · 5 comments

Comments

@bennythedataguy

bennythedataguy commented Jun 28, 2016

I'm trying to export an LSTM layer from Keras to a portable C implementation. I accept the possibility there is a bug in my code, but assuming there isn't, I can't figure out the order of the weights / gates in the LSTM layer.

model = Sequential()
model.add(LSTM(4, input_dim=5, input_length=N, return_sequences=True))
shapes = [x.shape for x in model.get_weights()]
print(shapes)

[(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,)]

What I see is

  • Weights that handle inputs
  • Weights that handle recurrent / hidden outputs
  • bias
  • repeat

But which weight set goes to which gate? The third set of weights has its bias initialized to 1.0, so I'm assuming that one is the forget gate.

Looking in recurrent.py, I see something like this:
i = self.inner_activation(z0)  # input gate
f = self.inner_activation(z1)  # forget gate
c = f * c_tm1 + i * self.activation(z2)  # cell state update
o = self.inner_activation(z3)  # output gate

But i, f, c, o can't be the storage order, because then the forget bias would be the second set, not the third, that is initialized to 1.0. So I'm kind of confused, and would appreciate the help.
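The observation above can be checked mechanically. Here is a minimal NumPy sketch, assuming the old flat layout described in the question ((W, U, b) repeated once per gate, twelve arrays total) and a hypothetical forget-bias-of-ones initialization; it simulates the weight list and locates the forget gate by finding the all-ones bias block:

```python
import numpy as np

# Simulate the twelve flat arrays of the old per-gate LSTM layout:
# (W, U, b) per gate, with the forget gate's bias initialized to 1.0.
units, input_dim = 4, 5
rng = np.random.default_rng(0)

def gate_params(forget=False):
    W = rng.standard_normal((input_dim, units))   # input weights
    U = rng.standard_normal((units, units))       # recurrent weights
    b = np.ones(units) if forget else np.zeros(units)
    return [W, U, b]

# Assumed storage order under test: the third block carries the forget gate.
weights = gate_params() + gate_params() + gate_params(forget=True) + gate_params()

# Biases sit at every third position; the forget gate is the block of ones.
bias_blocks = weights[2::3]
forget_index = next(k for k, b in enumerate(bias_blocks) if np.all(b == 1.0))
print("forget gate is bias block", forget_index)  # -> 2, i.e. the third set
```

The same bias-of-ones probe works against a real `model.get_weights()` list, whatever the version-specific ordering turns out to be.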

@ymcui

ymcui commented Jun 29, 2016

I don't know which version you are using now.
In keras/layers/recurrent.py you can check trainable_weights:
https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L723
These weights are appended to your model in that sequence.

@ChristianThomae

model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
for e in zip(model.layers[0].trainable_weights, model.layers[0].get_weights()):
    print('Param %s:\n%s' % (e[0],e[1]))
Param lstm_3_W_i:
[[ 0.00069305, ...]]
Param lstm_3_U_i:
[[ 1.10000002, ...]]
Param lstm_3_b_i:
[ 0., ...]
Param lstm_3_W_c:
[[-1.38370085, ...]]
...

@elhenceashima

@ChristianThomae the solution above no longer prints the same details. Instead it simply shows 'lstm_1/kernel:0', 'lstm_1/recurrent_kernel:0' and so on. Is there any way to display or find out the exact order of the get_weights() output, especially the order of i, f, c, o? I need to extract some specific values from these weights for further processing.

@kratzert

If I understand https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1863 correctly, I would say the order is i, f, c, o for the kernel, recurrent_kernel and bias respectively.

@bartkowiaktomasz

If you are using Keras 2.2.0

When you print

print(model.layers[0].trainable_weights)

you should see three tensors: lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and lstm_1/bias:0.
One dimension of each tensor is

4 * number_of_units

where number_of_units is your number of neurons. Try:

units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)

That is because each tensor concatenates the weights for the four LSTM gates, in this order:

i (input), f (forget), c (cell state) and o (output)
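One way to see this layout concretely, without a trained model: assuming the i, f, c, o order and Keras's default unit_forget_bias=True initialization, the bias vector of length 4 * units should have ones exactly in the second quarter (the forget block) and zeros elsewhere. A small NumPy sketch of that check:

```python
import numpy as np

# Build the bias vector that unit_forget_bias initialization would produce
# under the assumed i, f, c, o concatenation.
units = 4
b = np.zeros(4 * units)
b[units: units * 2] = 1.0   # forget block set to ones

# Checking which quarter is all ones identifies the forget gate's position.
blocks = b.reshape(4, units)
print(np.flatnonzero(blocks.all(axis=1)))  # -> [1], i.e. the f block
```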

Therefore, to extract the per-gate weights you can simply slice:

W = model.layers[0].get_weights()[0]
U = model.layers[0].get_weights()[1]
b = model.layers[0].get_weights()[2]

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]
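For a portable C port like the one the question asks about, it can help to verify the sliced blocks by running one LSTM step in plain NumPy. The sketch below uses hypothetical random weights in place of a real kernel/recurrent_kernel/bias, splits them in the i, f, c, o order stated above, and applies the standard LSTM equations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-ins for kernel, recurrent_kernel and bias (i, f, c, o concatenated).
input_dim, units = 5, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((input_dim, 4 * units))  # kernel
U = rng.standard_normal((units, 4 * units))      # recurrent_kernel
b = np.zeros(4 * units)
b[units: units * 2] = 1.0                        # unit_forget_bias init

def lstm_step(x, h_tm1, c_tm1):
    # One pre-activation vector, then split into the four gate blocks.
    z = x @ W + h_tm1 @ U + b
    z_i, z_f, z_c, z_o = np.split(z, 4)
    i = sigmoid(z_i)                   # input gate
    f = sigmoid(z_f)                   # forget gate
    c = f * c_tm1 + i * np.tanh(z_c)   # new cell state
    o = sigmoid(z_o)                   # output gate
    h = o * np.tanh(c)                 # new hidden state
    return h, c

h, c = lstm_step(rng.standard_normal(input_dim), np.zeros(units), np.zeros(units))
print(h.shape, c.shape)  # -> (4,) (4,)
```

Feeding the same input through a real Keras layer and through this step with the layer's actual get_weights() arrays should then produce matching outputs if the gate order is right.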

Source: keras code
