# Simple LSTM 

LSTM (Long Short-Term Memory)
* A plain **RNN (Recurrent Neural Network)** struggles to retain information over long sequences; **LSTM** was designed to mitigate that problem
* Our **LSTM** predicts whether a movie is recommended or not from the text of people's reviews
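
To make the long-term memory point concrete, here is a minimal sketch of a single LSTM step, reduced to scalars for readability. The function name `lstm_step` and the parameter layout are assumptions made for this illustration; this is not how Keras implements its `LSTM` layer, which works on vectors and matrices. When the forget gate stays close to 1 and the input gate close to 0, the cell state `c` is carried along almost unchanged over many steps.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (illustrative only).

    params maps a gate name to a tuple (w_x, w_h, b).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))      # input gate: how much new information to write
    f = sigmoid(pre("f"))      # forget gate: how much old cell state to keep
    o = sigmoid(pre("o"))      # output gate: how much cell state to expose
    g = math.tanh(pre("g"))    # candidate value to write into the cell

    c = f * c_prev + i * g     # new cell state (the long-term memory)
    h = o * math.tanh(c)       # new hidden state (the step's output)
    return h, c

# Hypothetical parameters: forget gate almost fully open, input gate almost closed.
params = {"i": (0.0, 0.0, -10.0), "f": (0.0, 0.0, 10.0),
          "o": (0.0, 0.0, 0.0), "g": (1.0, 0.0, 0.0)}

h, c = 0.0, 1.0
for _ in range(50):
    h, c = lstm_step(1.0, h, c, params)
print(c)  # stays close to the initial value 1.0
```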

<hr>

How to use this notebook:

Explanations are kept to a minimum.

This notebook may be helpful for readers who want to see working code right away.

Please upvote if you found it helpful!

<hr>

## Content
1. [Libraries import](#one)
2. [Prepare Data](#two)
3. [Modeling](#three)
4. [Training & Evaluation](#four)

<hr>

<a id="one"></a>
# 1. Libraries import

In [1]:
from keras.preprocessing import sequence
from keras.datasets import imdb # IMDB movie-review dataset bundled with Keras
from keras import layers, models


<a id="two"></a>
# 2. Prepare Data

In [2]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=20000)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


In [3]:
x_train.shape

# 25,000 training reviews, each with a binary sentiment label
# 0 = negative review, 1 = positive review

(25000,)
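
Each review in `x_train` is a list of integer word indices rather than text. The sketch below shows how such a list could be turned back into words. It uses a tiny hand-made dictionary (`toy_word_index`, a hypothetical stand-in for the much larger mapping returned by `imdb.get_word_index()`) and assumes the default `load_data()` settings, where index 0 is padding, 1 marks the start of a review, 2 marks an out-of-vocabulary word, and every real word index is shifted by 3.

```python
# Toy stand-in for imdb.get_word_index(); the real mapping is far larger.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

INDEX_FROM = 3  # default offset applied to word indices by load_data()
SPECIAL = {0: "<pad>", 1: "<start>", 2: "<oov>"}

def decode(encoded, word_index):
    """Map a list of encoded indices back to a space-separated string."""
    reverse = {idx + INDEX_FROM: word for word, idx in word_index.items()}
    return " ".join(SPECIAL.get(i) or reverse.get(i, "<oov>") for i in encoded)

review = [1, 4, 5, 6, 7, 2]  # hypothetical encoded review
print(decode(review, toy_word_index))
```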

In [4]:
class Data:
    def __init__(self, max_features=20000, maxlen=80):
        
        (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
        # load_data() loads the IMDB reviews as lists of word indices
        # max_features = vocabulary size: only the most frequent words are kept
        
        x_train = sequence.pad_sequences(x_train, maxlen=maxlen) 
        x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
        # pad_sequences() pads or truncates every sequence to the same length (maxlen)

        self.x_train, self.y_train = x_train, y_train
        self.x_test, self.y_test = x_test, y_test
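
To show what the padding step does, here is a pure-Python sketch that mimics the default behaviour of `pad_sequences()`: zeros are added at the front of short sequences, and long sequences lose their earliest items. The helper name `pad_pre` is made up for this illustration, and it returns plain lists rather than a NumPy array.

```python
def pad_pre(seqs, maxlen, value=0):
    """Pad at the front and truncate from the front, like the Keras defaults."""
    out = []
    for s in seqs:
        s = list(s)[-maxlen:]                        # keep only the last maxlen items
        out.append([value] * (maxlen - len(s)) + s)  # left-pad with `value`
    return out

print(pad_pre([[1, 2, 3, 4, 5], [6, 7]], maxlen=4))
# [[2, 3, 4, 5], [0, 0, 6, 7]]
```

With `maxlen=80`, as in this notebook, only the last 80 word indices of each review reach the model.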


<a id="three"></a>
# 3. Modeling

In [5]:
class RNN_LSTM(models.Model):
    def __init__(self, max_features, maxlen):
        
        # input: a sequence of maxlen word indices (80 in this notebook)
        x = layers.Input((maxlen,))
        
        # embedding: map each word index to a dense vector
        # 128 is the size of each output vector
        h = layers.Embedding(max_features, 128)(x)
        
        # LSTM with 128 units; dropout and recurrent_dropout are both set to 20%
        h = layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2)(h)
        
        # output layer (activation function = sigmoid)
        y = layers.Dense(1, activation='sigmoid')(h)
        super().__init__(x, y)

        # compile the model (set the optimizer and the loss function)
        self.compile(loss='binary_crossentropy',
                     optimizer='adam', metrics=['accuracy'])
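
As a sanity check on the size of this model, the sketch below counts trainable parameters per layer using the standard formulas for an embedding, an LSTM with biases, and a dense layer. The helper names are made up for this illustration; the numbers assume `max_features=20000` and the layer sizes used above, and should agree with what `model.summary()` reports.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim                    # one vector per word index

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    return units * (input_dim + 1)             # weights plus bias

emb = embedding_params(20000, 128)
lstm = lstm_params(128, 128)
dense = dense_params(128, 1)
print(emb, lstm, dense, emb + lstm + dense)
# 2560000 131584 129 2691713
```

Almost all of the parameters sit in the embedding table, so shrinking `max_features` is the most direct way to make the model smaller.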


<a id="four"></a>
# 4. Training & Evaluation

In [6]:
class Machine:
    def __init__(self,
                 max_features=20000,
                 maxlen=80):
        self.data = Data(max_features, maxlen)
        self.model = RNN_LSTM(max_features, maxlen)

    def run(self, epochs=3, batch_size=32):
        data = self.data
        model = self.model
        print('Training stage')
        print('==============')
        
        # train the LSTM (note: the test set is also used as validation data here)
        model.fit(data.x_train, data.y_train,
                  batch_size=batch_size,
                  epochs=epochs,
                  validation_data=(data.x_test, data.y_test))

        score, acc = model.evaluate(data.x_test, data.y_test,
                                    batch_size=batch_size)
        print('Test performance: accuracy={0}, loss={1}'.format(acc, score))
        
  

In [7]:
def main():
    m = Machine()
    m.run()

if __name__ == '__main__':
    main()

Training stage
Epoch 1/3
Epoch 2/3
Epoch 3/3
Test performance: accuracy=0.8221200108528137, loss=0.45595741271972656
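
The two numbers printed above are the mean binary cross-entropy and the fraction of reviews classified correctly. Here is a rough pure-Python sketch of how both could be computed from sigmoid outputs. The labels and probabilities are made up for illustration, and the 0.5 decision threshold and the clipping constant `eps` are assumptions that match common defaults.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)          # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == bool(y))
    return hits / len(y_true)

y_true = [1, 0, 1, 0]             # hypothetical labels
y_prob = [0.9, 0.2, 0.4, 0.6]     # hypothetical sigmoid outputs

print(binary_accuracy(y_true, y_prob))      # 0.5
print(binary_crossentropy(y_true, y_prob))
```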


## Reference
* Coding Chef's 3-Minute Deep Learning - [ex5_1_lstm_imdb](https://github.com/jskDr/keraspp/blob/master/ex5_1_lstm_imdb_cl.py)