# 4.2 Densely Connected NN

In the last notebook, we jumped straight into training a CNN. However, we quickly realized that the history information seemed redundant and that the present information may be the most valuable. Given that, we decided to try a feed-forward network, and the results were interesting enough to warrant further investigation.

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np

In [2]:
X_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/X_train.npy')
y_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_train.npy')

In [3]:
X_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/X_test.npy')
y_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_test.npy')

In [6]:
np.unique(y_train, return_counts=True)

(array([0., 1., 2.]), array([34245, 16007,  9748]))

In [7]:
# Undersample to balance classes for training set
h = np.where(y_train == 0)[0]
b = np.where(y_train == 1)[0]
s = np.where(y_train == 2)[0]

hi = np.random.choice(h, size=9000, replace=False)
bi = np.random.choice(b, size=9000, replace=False)
si = np.random.choice(s, size=9000, replace=False)

In [8]:
ind = np.concatenate([hi,bi,si])
ind.shape

(27000,)

In [9]:
np.unique(y_train[ind], return_counts=True)

(array([0., 1., 2.]), array([9000, 9000, 9000]))

In [10]:
X_train[ind].shape

(27000, 1, 116, 60)

In [11]:
y_train = y_train[ind]
X_train = X_train[ind]

In [13]:
X_train.shape, y_train.shape

((27000, 1, 116, 60), (27000, 1))
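
The undersampling steps above draw a different random subset on every run, because no seed is set. A minimal sketch of a reusable, seeded version follows. The helper name `balanced_undersample`, its `seed` parameter, and the toy arrays are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np

def balanced_undersample(X, y, n_per_class, seed=0):
    """Return a random subset of (X, y) with n_per_class samples of each label."""
    rng = np.random.default_rng(seed)      # seeded so the subset is reproducible
    labels = np.asarray(y).ravel()         # accepts y of shape (N,) or (N, 1)
    picked = []
    for cls in np.unique(labels):
        candidates = np.where(labels == cls)[0]
        # Sample without replacement, as the notebook cells do
        picked.append(rng.choice(candidates, size=n_per_class, replace=False))
    idx = np.concatenate(picked)
    return X[idx], y[idx]

# Toy data standing in for the real arrays (hypothetical values)
X_toy = np.arange(60).reshape(30, 2)
y_toy = np.array([0.0] * 15 + [1.0] * 10 + [2.0] * 5).reshape(-1, 1)
X_bal, y_bal = balanced_undersample(X_toy, y_toy, n_per_class=5)
print(np.unique(y_bal, return_counts=True))
```

With the real data, the equivalent call would be `balanced_undersample(X_train, y_train, n_per_class=9000)`.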

In [14]:
X_train = X_train.reshape(27000, 116, 60)
X_train.shape

(27000, 116, 60)

In [17]:
X_train = X_train[:, :, :1]
X_train.shape

(27000, 116, 1)

In [18]:
X_train = X_train.reshape(27000, 116)

In [19]:
# Subsample the test set to limit runtime and memory use (this also balances the classes)
h = np.where(y_test == 0)[0]
b = np.where(y_test == 1)[0]
s = np.where(y_test == 2)[0]

hi = np.random.choice(h, size=4000, replace=False)
bi = np.random.choice(b, size=4000, replace=False)
si = np.random.choice(s, size=4000, replace=False)

In [21]:
indt = np.concatenate([hi,bi,si])
indt.shape

y_test = y_test[indt]
X_test = X_test[indt]
y_test.shape, X_test.shape

((12000, 1), (12000, 1, 116, 60))

In [22]:
X_test = X_test.reshape(12000, 116, 60)

In [24]:
X_test = X_test[:, :, :1]

In [26]:
X_test = X_test.reshape(12000, 116)

In [28]:
np.unique(y_test, return_counts=True)[1]/y_test.shape[0]

array([0.33333333, 0.33333333, 0.33333333])

Options:  
    1) Decrease the training set size and run more epochs  
    2) Change the image size to make the images shorter  
    3) Change the filters, kernel size, and layers   

#### Attempt: one column only

So far, the very simple feed-forward network seems to perform just as well as the CNN.

In [29]:
model = keras.Sequential([
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(3)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=50,
         validation_data=(X_test, y_test))

Train on 27000 samples, validate on 12000 samples
Epoch 1/50 through Epoch 50/50 (per-epoch metrics not preserved)


<tensorflow.python.keras.callbacks.History at 0x7f85fc156e90>
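
The model above ends in `Dense(3)` with no activation and is compiled with `SparseCategoricalCrossentropy(from_logits=True)`, so the network outputs raw scores (logits) and the loss applies the softmax internally. As a rough illustration of what that loss computes, here is a NumPy sketch; the function name `sparse_ce_from_logits` and the toy inputs are assumptions, and this is not the Keras implementation itself.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy between integer labels and softmax(logits)."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels).astype(int).ravel()   # accepts (N,) or (N, 1) float labels
    # Subtract the row maximum for numerical stability; softmax is unchanged by this shift
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class in each row and average
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy logits for two samples and three classes (hypothetical values)
logits_toy = np.array([[2.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0]])
labels_toy = np.array([[0.0], [2.0]])
loss_toy = sparse_ce_from_logits(logits_toy, labels_toy)
print(loss_toy)
```

The second toy row has equal logits, so its contribution is `log(3)`, the loss of a uniform guess over three classes. That value is a useful baseline when reading the training loss.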

In [56]:
probability_model = tf.keras.Sequential([model, 
                                         tf.keras.layers.Softmax()])

predictions = probability_model.predict(X_train)

In [58]:
predictions[0]

array([0.7640396 , 0.08323085, 0.15272966], dtype=float32)

In [59]:
np.argmax(predictions[0])

0

In [60]:
predictions = probability_model.predict(X_test)

In [61]:
np.argmax(predictions[0])

0

In [62]:
predictions[0]

array([9.8680556e-01, 1.2953308e-02, 2.4112873e-04], dtype=float32)

In [63]:
y_test[0]

0.0
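
Checking a single prediction says little about overall quality. A small sketch for scoring every row at once follows: it takes the class with the highest probability in each row and tallies an accuracy and a confusion matrix. The function name `accuracy_and_confusion` and the toy arrays are assumptions for illustration.

```python
import numpy as np

def accuracy_and_confusion(probs, y_true, n_classes=3):
    """Return (accuracy, confusion) where confusion[i, j] counts true class i predicted as j."""
    pred = np.argmax(probs, axis=1)
    true = np.asarray(y_true).astype(int).ravel()     # accepts (N,) or (N, 1) float labels
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(confusion, (true, pred), 1)             # unbuffered add handles repeated index pairs
    accuracy = (pred == true).mean()
    return accuracy, confusion

# Toy probabilities and labels (hypothetical values, not notebook results)
probs_toy = np.array([[0.8, 0.1, 0.1],
                      [0.2, 0.7, 0.1],
                      [0.6, 0.3, 0.1],
                      [0.1, 0.2, 0.7]])
truth_toy = np.array([[0.0], [1.0], [1.0], [2.0]])
acc_toy, cm_toy = accuracy_and_confusion(probs_toy, truth_toy)
print(acc_toy)
print(cm_toy)
```

With the notebook's variables, the call would be `accuracy_and_confusion(predictions, y_test)`. Because the test set is balanced at 4000 samples per class, each row of the confusion matrix would sum to 4000.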

The model looks interesting now. It trains very quickly and does well on the training set but not on the test set, which points to overfitting. We can try a couple of new approaches.

1) Let's increase the training set size.   
2) Let's increase and balance the test set   
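
To quantify the gap between training and test performance, the per-epoch metrics can be inspected. Keras `model.fit` returns a `History` object whose `.history` attribute is a dict of per-epoch metric lists. The sketch below works on such a dict; the helper name `summarize_history` is hypothetical, the key names `accuracy` and `val_accuracy` are an assumption that can differ between Keras versions, and the numbers are made up for illustration.

```python
def summarize_history(history_dict, train_key="accuracy", val_key="val_accuracy"):
    """Report the best validation epoch and the final train/validation gap."""
    train = history_dict[train_key]
    val = history_dict[val_key]
    best_index = max(range(len(val)), key=lambda i: val[i])
    return {
        "best_epoch": best_index + 1,          # 1-based, to match the "Epoch k/50" log lines
        "best_val": val[best_index],
        "final_gap": train[-1] - val[-1],      # a large positive gap suggests overfitting
    }

# Made-up numbers for illustration only, not results from this notebook
history_toy = {"accuracy": [0.50, 0.70, 0.90],
               "val_accuracy": [0.45, 0.55, 0.50]}
summary_toy = summarize_history(history_toy)
print(summary_toy)
```

In the notebook this would require keeping the return value, for example `history = model.fit(...)`, and then calling `summarize_history(history.history)`.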

In [30]:
model = keras.Sequential([
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(3)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=50,
         validation_data=(X_test, y_test))

Train on 27000 samples, validate on 12000 samples
Epoch 1/50 through Epoch 50/50 (per-epoch metrics not preserved)


<tensorflow.python.keras.callbacks.History at 0x7f85dc274ed0>

Although the model fits the training data, it predicts no better on the test set.

What can I do? 

1) Go back to the decision function and re-label the y data more precisely  
2) Use 3 columns of X and flatten it  
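
The second option could look roughly like the sketch below, which keeps the first three columns of each sample and flattens them into one feature vector of length 116 × 3 = 348. It assumes that "columns" refers to the last axis, as in the `X_train[:, :, :1]` slice used earlier; the helper name `first_columns_flat` and the toy array are hypothetical.

```python
import numpy as np

def first_columns_flat(X, n_cols=3):
    """Keep the first n_cols columns of each sample and flatten to (N, rows * n_cols)."""
    X = np.asarray(X)
    n_samples, rows, cols = X.shape[0], X.shape[-2], X.shape[-1]
    X3 = X.reshape(n_samples, rows, cols)          # drops a singleton channel axis if present
    return X3[:, :, :n_cols].reshape(n_samples, rows * n_cols)

# Toy array with the same layout as the raw data: (samples, 1, 116, 60)
X_raw_toy = np.zeros((5, 1, 116, 60))
X_flat_toy = first_columns_flat(X_raw_toy)
print(X_flat_toy.shape)
```

The flattened array can be passed to the same `Dense` stack, since Keras infers the input width from the data when no input shape is declared.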