# Using gated recurrent units (GRUs)

Gated recurrent units (GRUs) are another type of unit often used in RNNs. They are simpler than LSTM units because they have only two gates: update and reset. The update gate determines how much of the previous memory (the hidden state) is carried over to the next step, and the reset gate determines how that memory is combined with the current input when a new candidate state is computed. The flow of data is shown in the following figure:

![Gated recurrent unit (GRU)][logo]

[logo]:https://github.com/sara-kassani/Python-Deep-Learning-Cookbook/blob/master/data/GRU.png?raw=true "Using gated recurrent units"

In this recipe, we will show how to incorporate a GRU into an RNN architecture to classify text with Keras.
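
To make the two gates described above more concrete, here is a minimal NumPy sketch of a single GRU time step. It is an illustration only, not the Keras implementation: the helper names (`gru_step`, `init_params`), the weight names, and the random initialization are assumptions made for this example, and some references swap the roles of `z` and `1 - z` in the final blend.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU time step for a single example (illustrative sketch)."""
    # Update gate: how much of the previous state to keep
    z = sigmoid(x @ p["W_z"] + h_prev @ p["U_z"] + p["b_z"])
    # Reset gate: how much of the previous state feeds the candidate
    r = sigmoid(x @ p["W_r"] + h_prev @ p["U_r"] + p["b_r"])
    # Candidate state built from the input and the reset-scaled memory
    h_cand = np.tanh(x @ p["W_h"] + (r * h_prev) @ p["U_h"] + p["b_h"])
    # Blend old state and candidate according to the update gate
    return z * h_prev + (1.0 - z) * h_cand


def init_params(n_in, n_units, seed=0):
    """Hypothetical small random weights, one set per gate/candidate."""
    rng = np.random.default_rng(seed)
    p = {}
    for name in ("z", "r", "h"):
        p["W_" + name] = rng.normal(scale=0.1, size=(n_in, n_units))
        p["U_" + name] = rng.normal(scale=0.1, size=(n_units, n_units))
        p["b_" + name] = np.zeros(n_units)
    return p


# Run five made-up 50-dimensional inputs through a 100-unit GRU
params = init_params(n_in=50, n_units=100)
h = np.zeros(100)
for x_t in np.random.default_rng(1).normal(size=(5, 50)):
    h = gru_step(x_t, h, params)
print(h.shape)
```

The sizes 50 and 100 are chosen to match the embedding dimension and the number of GRU units used later in this recipe.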

In [1]:
import numpy as np
import pandas as pd

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import GRU
from keras.callbacks import EarlyStopping

from keras.datasets import imdb


Using TensorFlow backend.


### We will use the IMDb dataset of movie reviews labeled by sentiment; load the data with the following code:

In [2]:
n_words = 1000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=n_words)
print('Train seq: {}'.format(len(X_train)))
print('Test seq: {}'.format(len(X_test)))

Train seq: 25000
Test seq: 25000
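
The `num_words` argument caps the vocabulary: each review is a list of word indices ranked by frequency, and indices at or above the cap are replaced by an out-of-vocabulary marker. The sketch below imitates that behavior in plain Python. It assumes the Keras default marker value of 2; the function name `cap_vocabulary` and the sample review are made up for illustration.

```python
def cap_vocabulary(seq, num_words, oov_char=2):
    """Replace every index >= num_words with the out-of-vocabulary marker."""
    return [w if w < num_words else oov_char for w in seq]


# Made-up review: two indices fall outside a 1000-word vocabulary
review = [1, 14, 22, 1622, 973, 4468, 66]
capped = cap_vocabulary(review, num_words=1000)
print(capped)  # [1, 14, 22, 2, 973, 2, 66]
```

A small `n_words` keeps the embedding layer compact, at the cost of mapping rarer words to the same marker.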


### By padding the sequences, we prepare our input for our network:

In [3]:
# Pad (or truncate) every sequence to exactly max_len tokens
max_len = 200
X_train = sequence.pad_sequences(X_train, maxlen=max_len)
X_test = sequence.pad_sequences(X_test, maxlen=max_len)
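
As a rough illustration of what the padding step does to a single review, the following sketch mimics the default behavior of `pad_sequences` in plain Python: zeros are added at the front of short sequences, and long sequences keep only their last `maxlen` items. The helper `pad_pre` is hypothetical, and it assumes the Keras defaults (`padding='pre'`, `truncating='pre'`, `value=0`) and a positive `maxlen`.

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items."""
    seq = list(seq)[-maxlen:]
    return [value] * (maxlen - len(seq)) + seq


short = pad_pre([5, 8, 3], maxlen=5)
long_ = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short)  # [0, 0, 5, 8, 3]
print(long_)  # [3, 4, 5, 6, 7]
```

After this step every review has the same length, so the whole dataset can be stored as one 2D array of shape `(n_samples, max_len)`.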

### Define network architecture and compile

In [4]:
# Define network architecture and compile
model = Sequential()
model.add(Embedding(n_words, 50, input_length=max_len))
model.add(Dropout(0.2))
model.add(GRU(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(250, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 200, 50)           50000     
_________________________________________________________________
dropout_1 (Dropout)          (None, 200, 50)           0         
_________________________________________________________________
gru_1 (GRU)                  (None, 100)               45300     
_________________________________________________________________
dense_1 (Dense)              (None, 250)               25250     
_________________________________________________________________
dropout_2 (Dropout)          (None, 250)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 251       
=================================================================
Total params: 120,801
Trainable params: 120,801
Non-trainable params: 0
_________________________________________________________________
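
The parameter counts in the summary can be checked by hand. The short calculation below reproduces them, assuming the GRU has three weight blocks (update gate, reset gate, candidate state), each with an input kernel, a recurrent kernel, and one bias vector. Newer Keras versions may report a slightly different GRU count because they can use an extra bias vector.

```python
n_words, embed_dim, gru_units, dense_units = 1000, 50, 100, 250

# Embedding: one 50-dimensional vector per word index
embedding = n_words * embed_dim
# GRU: 3 blocks x (input kernel + recurrent kernel + bias)
gru = 3 * (embed_dim * gru_units + gru_units * gru_units + gru_units)
# Dense layers: weights plus biases
dense_1 = gru_units * dense_units + dense_units
dense_2 = dense_units * 1 + 1

total = embedding + gru + dense_1 + dense_2
print(embedding, gru, dense_1, dense_2, total)  # 50000 45300 25250 251 120801
```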


### Use early stopping to prevent overfitting:

In [8]:
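# 'val_acc' is the metric name in the Keras version used here; newer versions report it as 'val_accuracy'
callbacks = [EarlyStopping(monitor='val_acc', patience=3)]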
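
The patience rule behind early stopping can be sketched in a few lines of plain Python. This is a simplified imitation, not the Keras callback itself: the function `stopping_epoch` is hypothetical, it assumes a metric where higher is better and no minimum improvement threshold, and the validation scores are made up.

```python
def stopping_epoch(val_scores, patience):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score > best:
            # Improvement: remember it and reset the counter
            best = score
            wait = 0
        else:
            # No improvement: stop once patience is used up
            wait += 1
            if wait >= patience:
                return epoch
    return None


val_acc = [0.70, 0.75, 0.80, 0.79, 0.80, 0.78, 0.81]  # made-up values
stopped = stopping_epoch(val_acc, patience=3)
print(stopped)  # 5
```

In this made-up run the best score is reached at epoch 2, the next three epochs fail to beat it, and training stops before the later improvement at epoch 6 is ever seen. A larger `patience` trades longer training for a lower risk of stopping too early.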

### Define the hyperparameters and start training our network:

In [6]:
batch_size = 512
n_epochs = 100

model.fit(X_train, y_train, batch_size=batch_size, epochs=n_epochs, validation_split=0.2, callbacks=callbacks)

Train on 20000 samples, validate on 5000 samples
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100


<keras.callbacks.History at 0x7f256d731630>
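
The "Train on 20000 samples, validate on 5000 samples" line in the log follows from `validation_split=0.2`. The arithmetic below is a sketch of how the split and the number of batches per epoch work out, assuming the validation samples are taken from the end of the training data before any shuffling.

```python
import math

n_samples = 25000
validation_split = 0.2
batch_size = 512

# The first 80% is used for training, the last 20% for validation
n_train = int(n_samples * (1 - validation_split))
n_val = n_samples - n_train

# The last batch of an epoch is smaller than batch_size
batches_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, batches_per_epoch)  # 20000 5000 40
```

With a batch size of 512, each epoch performs only 40 weight updates, which is one reason a run like this can need many epochs before early stopping triggers.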

### Check the performance of our trained network on the test set:

In [7]:
print('Accuracy on test set: {}'.format(model.evaluate(X_test, y_test)[1]))

# Accuracy on test set: 0.83004

Accuracy on test set: 0.84812
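
`model.evaluate` returns the loss followed by the compiled metrics, which is why index `[1]` selects the accuracy. For a sigmoid output, accuracy amounts to thresholding the predicted probability and comparing it with the label. The NumPy sketch below shows that calculation on made-up values; the function name `binary_accuracy` and the 0.5 threshold are assumptions for illustration.

```python
import numpy as np


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    return float(np.mean(y_pred == np.asarray(y_true)))


y_true = [1, 0, 1, 1, 0]                 # made-up labels
y_prob = [0.91, 0.20, 0.45, 0.77, 0.63]  # made-up sigmoid outputs
acc = binary_accuracy(y_true, y_prob)
print(acc)  # 0.6
```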
