### Lesson 10 - Assignment

Using the Keras dataset, create a new notebook and perform each of the following data preparation tasks and answer the related questions:

- Read Reuters dataset into training and testing 
- Prepare dataset
- Build and compile 3 different models using Keras LTSM ideally improving model at each iteration.
- Describe and explain your findings.

In [None]:
# References
# https://medium.com/@minhao_chen/rnn-with-reuters-dataset-228ddc9d1f42
# code https://github.com/cmhjerry/DeepLearningMC/blob/master/RNN%20Mini%20Project.ipynb

# https://towardsdatascience.com/text-classification-in-keras-part-1-a-simple-reuters-news-classifier-9558d34d01d3

# https://towardsdatascience.com/machine-learning-recurrent-neural-networks-and-long-short-term-memory-lstm-python-keras-example-86001ceaaebc

In [2]:
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
from keras.utils import np_utils, to_categorical
from keras import models, regularizers, layers, optimizers, losses, metrics
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, LSTM, Embedding, Flatten
from keras.preprocessing import sequence

#import numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


### Import dataset from Keras library

In [3]:
#import data
from keras.datasets import reuters

In [4]:
#This dataset also makes available the word index used for encoding the sequences:
word_index = reuters.get_word_index(path="reuters_word_index.json")

In [5]:
len(word_index)

30979

### Split the data into training and testing sets

In [6]:
#Here's a trick to force reuters.load_data to allow pickle by

import numpy as np
# save np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

(x_train, y_train), (x_test, y_test) = reuters.load_data(path="reuters.npz",
                                                         num_words=None,
                                                         skip_top=0,
                                                         maxlen=None,
                                                         test_split=0.2,
                                                         seed=113,
                                                         start_char=1,
                                                         oov_char=2,
                                                         index_from=3)

# restore np.load for future normal usage
np.load = np_load_old

In [None]:
# FROM - https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/

In [9]:
max_review_length = 500
x_train = sequence.pad_sequences(x_train, maxlen=max_review_length)
x_test = sequence.pad_sequences(x_test, maxlen=max_review_length)

In [11]:
top_words = 5000

In [15]:
y_train = keras.utils.to_categorical(y_train, 46)
y_test = keras.utils.to_categorical(y_test, 46)

In [16]:
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='relu'))

In [17]:
model.compile(optimizer='adam', 
              loss='categorical_crossentropy',
              metrics=['accuracy'])

In [18]:
test_loss, test_acc = model.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)

ValueError: Error when checking target: expected dense_2 to have shape (1,) but got array with shape (46,)

In [None]:
# NEXT TO TEST - https://github.com/pranav-vempati/Reuters-Newswire-Topics-Classification-using-a-Bidirectional-RNN/blob/master/ReutersClassification.py

In [None]:
# NEXT TRY
# https://slundberg.github.io/shap/notebooks/deep_explainer/Keras%20LSTM%20for%20IMDB%20Sentiment%20Classification.html

In [None]:
# Review - https://medium.com/@shivambansal36/language-modelling-text-generation-using-lstms-deep-learning-for-nlp-ed36b224b275

### Explore data

In [7]:
# word_index = reuters.get_word_index()
# reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
# decoded_newswire = ' '.join([reverse_word_index.get(i - 3, '?') for i in
# x_train[0]])

# print(decoded_newswire)
# print(x_train[0])

### Process data

In [8]:
from keras.preprocessing.text import Tokenizer

max_words = 10000

tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode='binary')
x_test = tokenizer.sequences_to_matrix(x_test, mode='binary')

In [9]:
y_train = keras.utils.to_categorical(y_train, 46)
y_test = keras.utils.to_categorical(y_test, 46)

In [10]:
print(x_train[0])
print(len(x_train[0]))

print(y_train[0])
print(len(y_train[0]))

[0. 1. 0. ... 0. 0. 0.]
10000
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
46


In [11]:
x_train.shape

(8982, 10000)

### Build a TensorFlow model using a single dense hidden layer

#### Configure the layers

In [12]:
# model = keras.Sequential([
#     # input - width, height, color values - RGB
#     keras.layers.Flatten(input_shape=(32, 32, 3)),
#     # layer - 10 nodes for 10 image classes with softmax activation function
#     keras.layers.Dense(128, activation=tf.nn.relu),
#     # layer - # of nodes/neurons with relu activation function
#     keras.layers.Dense(10, activation=tf.nn.softmax)
# ])

In [13]:
# model = models.Sequential()
# model.add(layers.Dense(256, kernel_regularizer=regularizers.l1(0.001), activation='relu', input_shape=(10000,)))
# model.add(layers.Dropout(0.5))
# model.add(layers.Dense(256, kernel_regularizer=regularizers.l1(0.001), activation='relu'))
# model.add(layers.Dropout(0.5))


# model.add(layers.Dense(46, activation='softmax'))
# model.summary()

In [14]:
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(46))
model.add(Activation('softmax'))

#### Compilation - configure the learning process including defining the optimizer, loss function, and metric.

In [15]:
model.compile(optimizer='adam', 
              loss='categorical_crossentropy',
              metrics=['accuracy'])

#### Train model  with our training data and by defining the number of epochs.

In [16]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1371967b8>

### Apply model to test set and evaluate accuracy

In [17]:
test_loss, test_acc = model.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)

Test accuracy: 0.8036509349955476


### Perform 3 adjusts to the number of layers and activation functions to improve accuracy

#### First adjust - Add additonal layer with 64 nodes and activation function = relu

In [None]:
# model1 = keras.Sequential([
#     # input - width, height, color values - RGB
#     keras.layers.Flatten(input_shape=(32, 32, 3)),
#     # layer - 10 nodes for 10 image classes with softmax activation function
#     keras.layers.Dense(128, activation=tf.nn.relu),
#     # layer - # of nodes/neurons with relu activation function
#     keras.layers.Dense(64, activation=tf.nn.sigmoid),
#     # layer - # of nodes/neurons with relu activation function
#     keras.layers.Dense(10, activation=tf.nn.softmax)
])

In [None]:
# model1 = Sequential()
# model.add(Embedding(max_words, 64))
# #model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
# model.add(Flatten())
# model1.add(LSTM(64,input_shape=(max_words, ,), return_sequences=True))
# model1.add(Dropout(0.5))
# model1.add(Dense(46))
# model1.add(Activation('softmax'))

In [40]:
model1 = Sequential()
model1.add(Embedding(max_words, 256, input_length=None))
model1.add(LSTM(256, return_sequences=False))
model1.add(Dense(46))
model1.add(Activation('softmax'))

In [46]:
model = Sequential() # Initialize model(a currently empty linear stack of sequential layers)
model.add(Embedding(max_words, 256, input_length = 200)) # Embedding layer to convert input to dense vector representations
model.add(LSTM(128)) # Add Bidirectional() layer wrapper to a standard LSTM with an output space dimensionality of 128. Bidirectional LSTMs present numerous advantages over their unidirectional variants 
model.add(Dropout(0.5))  # Indiscriminately cull half our hidden units during training(dropout hyperparameter chosen according to this paper: http://papers.nips.cc/paper/4878-understanding-dropout.pdf)
model.add(Dense(46, activation = 'softmax', use_bias = True, kernel_initializer = 'he_normal')) # Dense feedforward neural network layer employing the He_normal weight initialization scheme with output dimensionality of 46(corresponding to a one-hot vector generated by the softmax layer)

In [None]:
# Model_lstm.add(Embedding(input_dim = max_words, output_dim = 256, input_length = max_phrase_len))
# model_lstm.add(LSTM(256, dropout = 0.3, recurrent_dropout = 0.3))




In [None]:
# model = Sequential()
# model.add(LSTM(64, input_shape=(1,4410), return_sequences=True, activation='sigmoid'))

In [47]:
model1.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

In [48]:
model1.fit(x_train, y_train, epochs=5)

ValueError: Error when checking target: expected activation_5 to have shape (1,) but got array with shape (46,)

In [None]:
test_loss, test_acc = model1.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)

#### Second adjust - Add 2D convolution layer (e.g. spatial convolution over images) layer with activation function relu

In [None]:
model2 = keras.Sequential([
    # input - width, height, color values - RGB
    keras.layers.Conv2D(32, (3, 3), padding='same',input_shape=x_train.shape[1:], activation=tf.nn.relu),
    keras.layers.Flatten(),
    # layer - 10 nodes for 10 image classes with softmax activation function
    keras.layers.Dense(128, activation=tf.nn.relu),
    # layer - # of nodes/neurons with relu activation function
    keras.layers.Dense(10, activation=tf.nn.softmax)

])

In [None]:
model2.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

In [None]:
model2.fit(x_train, y_train, epochs=5)

In [None]:
test_loss, test_acc = model2.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)

#### Third adjust - Add 2D convolution layer (e.g. spatial convolution over images) layer and pooling operation

In [None]:
model3 = keras.Sequential([
    # input - width, height, color values - RGB
    keras.layers.Conv2D(32, (3, 3), padding='same',input_shape=x_train.shape[1:], activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    # layer - 10 nodes for 10 image classes with softmax activation function
    keras.layers.Dense(128, activation=tf.nn.relu),
    # layer - # of nodes/neurons with relu activation function
    keras.layers.Dense(10, activation=tf.nn.softmax)

])

In [None]:
model3.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

In [None]:
model3.fit(x_train, y_train, epochs=5)

In [None]:
test_loss, test_acc = model2.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)

### Summary

The most significant increase in accuracy was found when I applied a 2D convolutional layer. This makes sense given it's optimized for images. 

The adjustments of adding an additional layer and adding pooling seemed to have minor impacts to accuracy. 

Perhaps further exploration and additional combinations would build upon the Conv2D and provide further improvements in accuracy.