
Does the LSTM in Keras add peephole connections? Which paper does the code refer to? #1717

Closed
Imorton-zd opened this issue Feb 14, 2016 · 13 comments

Comments

@Imorton-zd

In Keras, the LSTM network is effective for text classification. However, I can't understand the implementation details and don't know whether the LSTM adds peephole connections. Please point me to some reference documents. Opinions/views would be highly appreciated!

@Imorton-zd
Author

@EderSantana @fchollet @dbonadiman Could you spare some time to answer this question? Thanks a lot.

@EderSantana
Contributor

So here is how we calculate the activations of an LSTM https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L443

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i))

If I'm not wrong, the peephole should take a peek at the cell content and do something like:

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))

where P_i is a separate peephole weight matrix (reusing self.U_i for the peephole term would tie it to the recurrent kernel).

If I'm correct, the following class should do what you need:
Gist is here: https://gist.github.com/EderSantana/f07fa7a0371d0e1c4ef1

from keras import backend as K
from keras.layers.recurrent import LSTM

class LSTMpeephole(LSTM):
    def __init__(self, **kwargs):
        super(LSTMpeephole, self).__init__(**kwargs)

    def build(self):
        super(LSTMpeephole, self).build()
        # Peephole weight matrices connecting the previous cell state to each branch
        self.P_i = self.inner_init((self.output_dim, self.output_dim))
        self.P_f = self.inner_init((self.output_dim, self.output_dim))
        self.P_c = self.inner_init((self.output_dim, self.output_dim))
        self.P_o = self.inner_init((self.output_dim, self.output_dim))
        self.trainable_weights += [self.P_i, self.P_f, self.P_c, self.P_o]

    def step(self, x, states):
        assert len(states) == 2
        h_tm1 = states[0]  # previous hidden state
        c_tm1 = states[1]  # previous cell state

        x_i = K.dot(x, self.W_i) + self.b_i
        x_f = K.dot(x, self.W_f) + self.b_f
        x_c = K.dot(x, self.W_c) + self.b_c
        x_o = K.dot(x, self.W_o) + self.b_o

        # Each branch also peeks at the previous cell state via the P matrices
        i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))
        f = self.inner_activation(x_f + K.dot(h_tm1, self.U_f) + K.dot(c_tm1, self.P_f))
        c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c))
        o = self.inner_activation(x_o + K.dot(h_tm1, self.U_o) + K.dot(c_tm1, self.P_o))
        h = o * self.activation(c)
        return h, [h, c]
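For readers unfamiliar with the old Keras internals, a single peephole step can be sketched in plain NumPy. This is a minimal sketch, not a Keras API: the dictionary layout, names, and shapes are my own, and it adds peepholes only to the three gates (the Gers & Schmidhuber formulation), not to the candidate cell input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def peephole_lstm_step(x, h_tm1, c_tm1, W, U, P, b):
    """One peephole LSTM step (hypothetical helper).

    W, U, b hold the input kernels, recurrent kernels, and biases for the
    i, f, c, o branches; P holds peephole matrices for the three gates only.
    """
    i = sigmoid(x @ W["i"] + h_tm1 @ U["i"] + c_tm1 @ P["i"] + b["i"])
    f = sigmoid(x @ W["f"] + h_tm1 @ U["f"] + c_tm1 @ P["f"] + b["f"])
    c = f * c_tm1 + i * np.tanh(x @ W["c"] + h_tm1 @ U["c"] + b["c"])
    # In Gers et al. (2002) the output gate peeks at the *new* cell state c;
    # the class above uses c_tm1 instead. We follow the paper here.
    o = sigmoid(x @ W["o"] + h_tm1 @ U["o"] + c @ P["o"] + b["o"])
    h = o * np.tanh(c)
    return h, c
```

With the peephole terms dropped (P set to zero), this reduces to the standard LSTM step.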

@tboquet
Contributor

tboquet commented Feb 17, 2016

In addition you could take a look at this paper, and this Ph.D. thesis.

@Imorton-zd
Author

@EderSantana Thanks a lot. By the way, does the LSTM in Keras stem from (Hochreiter and Schmidhuber, 1997) or from the variant in (Graves, 2013)?

@EderSantana
Contributor

@Imorton-zd We are using Graves 2013 (with the forget gate bias initialized to 1).
BTW, I don't see a lot of recent work discussing peephole connections. Did you read about them in a recent paper?

@dinghaoyang

@EderSantana
In c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c)), do we really need the K.dot(c_tm1, self.P_c) term? The usual peephole formulation adds peepholes only to the gates.

@ersinyar

@dinghaoyang Have you gotten this peephole structure to work? Has anyone made it work?

@karlittoz

Here is a paper referring to peepholes:
http://www.jmlr.org/papers/volume3/gers02a/gers02a.pdf

@Rithmax

Rithmax commented Jul 19, 2017

Hi, has anyone been able to find a solution for this? As I understand it, the Keras LSTM cell is similar to TensorFlow's

BasicLSTMCell(RNNCell)

However, I am looking for a Keras equivalent of TensorFlow's

LSTMCell(RNNCell)

(LSTMCell uses optional peephole connections, optional cell clipping, and an optional projection layer). https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/rnn_cell_impl.py
https://research.google.com/pubs/archive/43905.pdf
https://research.google.com/pubs/archive/43905.pdf
Hasim Sak, Andrew Senior, and Francoise Beaufays.
"Long short-term memory recurrent neural network architectures for
large scale acoustic modeling." INTERSPEECH, 2014.

I am wondering: without writing a custom layer, is there any way to use this TensorFlow LSTMCell in Keras, or is it already available?

@huan

huan commented Jan 1, 2018

I have the same question:

I am wondering: without writing a custom layer, is there any way to use this TensorFlow LSTMCell in Keras, or is it already available?

And it seems there's a peephole implementation in tensorflow.keras, but not in keras.

@HansikaPH

@zixia I don't see a peephole implementation in tensorflow.keras either. So is there no built-in support in Keras for including peephole connections in an LSTM?

@xianhaoniyes

I guess the LSTM in Keras doesn't implement peephole connections. According to the Keras documentation, the LSTM layer is based on the paper "Long Short-Term Memory" (Hochreiter, 1997). I also checked Lasagne's documentation: its LSTM layer is based on Graves, "Generating sequences with recurrent neural networks," arXiv preprint arXiv:1308.0850, where peephole connections are added to the original model. And if you print the number of parameters of both implementations, you will find that Keras has fewer parameters than Lasagne.
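The parameter-count difference is easy to check by hand. As a rough sketch (my own arithmetic, not from either library's documentation): for input dimension n and d units, a vanilla LSTM layer has 4·d·(n+d+1) weights, and peepholes add either 3·d (diagonal peephole vectors, as in Graves 2013 and Lasagne) or at least 3·d² (full peephole matrices, as in the gist earlier in this thread):

```python
def lstm_param_count(n, d, peephole=None):
    """Parameter count for one LSTM layer.

    n: input dimension, d: number of units.
    peephole: None, "diagonal" (peephole vectors for the i, f, o gates,
              as in Graves 2013), or "full" (peephole matrices).
    """
    params = 4 * d * (n + d + 1)   # kernel, recurrent kernel, bias for i, f, c, o
    if peephole == "diagonal":
        params += 3 * d            # one peephole vector per gate
    elif peephole == "full":
        params += 3 * d * d        # one peephole matrix per gate
    return params
```

For example, with n = 100 inputs and d = 128 units, the vanilla layer has 117,248 parameters and diagonal peepholes add only 384 more, so the gap is small but visible in a model summary.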

@ageron
Contributor

ageron commented Mar 30, 2019

The Unify RNN Interface RFC confirms that tf.keras's LSTMCell is equivalent to TensorFlow's BasicLSTMCell, and the comment says "No peephole, clipping, projection. Keras allows kernel_activation to be customized (default=hard_sigmoid)".
There is no equivalent of TensorFlow 1's LSTMCell class, which has this comment: "Support peephole, clipping and projection".

So if you want peepholes with tf.keras, you have to create a custom cell that wraps the LSTMCell class. In TensorFlow 2.0, it was removed (or more precisely, it was moved to tf.compat.v1.nn.rnn_cell.LSTMCell, so technically you can still use that, although it's not ideal).
