<a href="https://colab.research.google.com/github/tpgall1996/machine_learning_course/blob/master/neural_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Chapter 3: Neural Networks**

(Credit to William Hartemink for suggesting this much nicer notebook environment)

---
We now introduce Artificial Neural Networks, one of the most widely used and powerful machine learning techniques available. The field of Neural Networks (usually called Deep Learning when the networks have many layers) is rich and deep, and we will only see the tip of the iceberg in this course. You know the score by now, so let's jump into coding. You can either run the cells in the notebook, or you can run the code in your own Python session if you have Keras and TensorFlow installed.





In [0]:
import keras
from keras.models import Sequential

We have imported the standard feedforward (sequential) Neural Network model. Now let's load in the Fashion-MNIST data from last time.

In [0]:
from keras.datasets import fashion_mnist

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

Let's have a look at the data. 

In [0]:
print(x_train.shape)
print(x_test.shape)

import matplotlib.pyplot as plt
import numpy as np

plt.imshow(x_train[239,:,:])

We need to change our data shape so that it's a normal (2D) array for inputting into the neural network: each 28x28 image becomes one row of 784 pixel values.

In [0]:
x_train = x_train.reshape(60000,28**2)
x_test = x_test.reshape(10000,28**2)
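
As a quick sanity check on what the reshape does, here is a minimal sketch on a small stand-in array (the `images` array below is made up for illustration, it is not the real dataset). Passing `-1` lets NumPy work out the 784 for us.

```python
import numpy as np

# Stand-in for the image data: 6 fake greyscale images of 28x28 pixels.
images = np.arange(6 * 28 * 28).reshape(6, 28, 28)

# -1 asks NumPy to infer the second dimension (28 * 28 = 784).
flat = images.reshape(len(images), -1)
print(flat.shape)  # (6, 784)

# Each row is one image read out row by row, so reshaping back recovers it.
restored = flat[0].reshape(28, 28)
print(np.array_equal(restored, images[0]))  # True
```

Nothing is lost in the flattening, which is why we can reshape back to 28x28 later when we want to plot an image.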

Now we're ready to build our Neural Network. 

In [0]:
from keras.models import Sequential
model = Sequential()

from keras.layers import Dense
model.add(Dense(units=100, activation = 'relu', input_dim=28**2))
model.add(Dense(units=100,activation = 'relu'))
model.add(Dense(units=100,activation = 'relu'))
model.add(Dense(units=100,activation = 'relu'))
model.add(Dense(units=10, activation='softmax'))
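
To get a feel for what these stacked `Dense` layers compute, here is a rough NumPy sketch of a forward pass through a network of the same shape. The weights are random stand-ins (an assumption for illustration, Keras uses its own initialisation and learns the weights during training), and the helper names `relu`, `softmax` and `forward` are ours, not part of Keras.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values are set to zero.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalise.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Same layer sizes as the model above: 784 inputs, four hidden layers, 10 outputs.
sizes = [28**2, 100, 100, 100, 100, 10]
weights = [rng.normal(0.0, 0.05, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    a = x
    # Hidden layers: affine map followed by ReLU.
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    # Output layer: affine map followed by softmax.
    return softmax(a @ weights[-1] + biases[-1])

x = rng.random((3, 28**2))   # three fake flattened images
probs = forward(x)
print(probs.shape)           # (3, 10)
print(probs.sum(axis=1))     # each row sums to 1 (up to rounding)
```

The softmax in the last layer is what turns the 10 outputs into something we can read as class probabilities.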


Since this is a classification problem with 10 classes, our output vector will be 10 dimensional (one probability per class). Our target data arrives as integer labels instead, so we need to convert it to the same 10 dimensional (one-hot) format.

In [0]:
print(y_train[0])

y_train = keras.utils.to_categorical(y_train,num_classes=10)
y_test = keras.utils.to_categorical(y_test,num_classes=10)

print(y_train[0])
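
If you want to see what this conversion does without Keras, here is a small NumPy sketch of the same idea on three made-up labels. Indexing the identity matrix is just one convenient way to build one-hot rows, not necessarily how Keras does it internally.

```python
import numpy as np

labels = np.array([9, 0, 3])       # made-up integer class labels

# Row k of the 10x10 identity matrix is the one-hot vector for class k.
one_hot = np.eye(10)[labels]
print(one_hot.shape)               # (3, 10)
print(one_hot[0])                  # a single 1 in position 9

# argmax undoes the encoding and recovers the integer labels.
recovered = one_hot.argmax(axis=1)
print(recovered)                   # [9 0 3]
```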

We can now define a loss function for our Neural Network to optimise: the 'categorical cross-entropy'.

In [0]:
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
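
For intuition, the categorical cross-entropy for one sample is minus the log of the probability the model assigns to the true class, averaged over the samples. Here is a hedged NumPy sketch of that formula. The function below is our own illustration (including the small `eps` clipping constant, which is an assumption to avoid `log(0)`), not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip so that log(0) can never occur, then average -sum(y * log(p)) over samples.
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# Two samples, three classes (toy numbers).
y_true = np.array([[0, 1, 0],
                   [1, 0, 0]])
confident = np.array([[0.05, 0.90, 0.05],
                      [0.80, 0.10, 0.10]])
unsure = np.full((2, 3), 1.0 / 3.0)

print(categorical_crossentropy(y_true, confident))  # small: mostly right and confident
print(categorical_crossentropy(y_true, unsure))     # log(3): no better than guessing
```

The loss is small when the model puts high probability on the correct class, and grows as that probability shrinks.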

Now we're ready to train:

In [0]:
model.fit(x_train,y_train, epochs=50, batch_size = 1000)
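
It can help to know how much work this call asks for. Each batch gives one weight update, so with the numbers above we can count the updates with a little arithmetic:

```python
import math

n_samples = 60000   # training images
batch_size = 1000
epochs = 50

# One weight update per batch; an epoch is one full pass over the data.
steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 60 3000
```

A smaller `batch_size` would mean more (noisier) updates per epoch, and a larger one fewer.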

Our neural network is trained! Let's have a look at what we've done:

In [0]:
model.summary()
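
The parameter counts in the summary can be reproduced by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A quick check for the layer sizes used above:

```python
# Layer sizes of the network above: input, four hidden layers, output.
sizes = [28**2, 100, 100, 100, 100, 10]

# weights (n_in * n_out) plus biases (n_out) for each Dense layer
per_layer = [n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:])]
print(per_layer)       # [78500, 10100, 10100, 10100, 1010]
print(sum(per_layer))  # 109810
```

Note that most of the parameters sit in the first layer, because the input is so much wider than the hidden layers.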

We can evaluate the model's performance:

In [0]:
print(model.evaluate(x_test,y_test))
y_pred = model.predict(x_test)


We can also design our own metric of performance, which gives us a lot of flexibility. Here we count the fraction of test images that the model misclassifies:


In [0]:
counter=0
for i in range(10000):
  if np.argmax(y_pred[i,:]) != np.argmax(y_test[i,:]):
    counter += 1
    
print(counter/10000)
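
The same error rate can be computed without the explicit loop. Here is a minimal sketch on made-up predictions (the random `y_prob` and `y_true` arrays are stand-ins, not the model's real output) showing that the loop and the vectorised version agree:

```python
import numpy as np

rng = np.random.default_rng(1)
y_prob = rng.random((8, 10))                 # fake predicted scores for 8 samples
true_labels = rng.integers(0, 10, size=8)    # fake integer labels
y_true = np.eye(10)[true_labels]             # one-hot, as in the notebook

# Loop version, as in the cell above.
counter = 0
for i in range(len(y_prob)):
    if np.argmax(y_prob[i, :]) != np.argmax(y_true[i, :]):
        counter += 1
loop_rate = counter / len(y_prob)

# Vectorised version: compare all predicted and true classes at once.
vec_rate = float(np.mean(y_prob.argmax(axis=1) != y_true.argmax(axis=1)))

print(loop_rate, vec_rate)
```

On 10,000 test images the difference in speed is small, but the vectorised form is shorter and harder to get wrong.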


The model only gets the answer wrong around 4% of the time. Let's have a quick visual look at our Neural Network's results:

In [0]:
N=5
start_val = 403  # pick an element; the code plots the 25 images starting from it
fig, axes = plt.subplots(N,N)
items = ['Top','Trousers','Jumper','Dress','Coat','Sandal','Shirt','Trainer',
         'Bag','Ankle Boot']
for row in range(N):
  for col in range(N):
    pred = np.argmax(y_pred[start_val+row+N*col,:])
    axes[row,col].imshow(x_test[start_val+row+N*col,:].reshape((28,28)))
    axes[row,col].set_title(str(items[pred]))
    axes[row,col].set_xticks([])
    axes[row,col].set_yticks([])
    
plt.subplots_adjust(hspace=1.5)

We can also perform cross validation to tune our Neural Network. This is optional (takes a while to run). 

*(Inspiration from https://www.kaggle.com/stefanie04736/simple-keras-model-with-k-fold-cross-validation)*

In [0]:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import fashion_mnist

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train.reshape(60000,28**2)
x_test = x_test.reshape(10000,28**2)

def get_model(layers, phis):
  # Build and compile a feedforward network.
  # layers: list of integers, the number of units in each hidden layer
  # phis: list of strings, the activation function for each hidden layer
  if len(layers) != len(phis):
    # fail loudly: otherwise there would be no model to return
    raise ValueError('layers and phis must be lists of equal length')
  model = Sequential()
  model.add(Dense(units=layers[0], activation=phis[0], input_dim=28**2))
  for i in range(1, len(layers)):
    model.add(Dense(units=layers[i], activation=phis[i]))

  # output layer: one softmax probability per class
  model.add(Dense(units=10, activation='softmax'))
  model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  return model

def accuracy(model,x_test,y_test):
  # despite the name, this returns the fraction of misclassified samples (lower is better)
  y_pred = model.predict(x_test)
  counter=0
  for i in range(len(x_test)):
    if np.argmax(y_pred[i,:]) != np.argmax(y_test[i,:]):
      counter += 1
    
  return counter/len(y_test)
  

from sklearn.model_selection import StratifiedKFold
n_folds = 10
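# random_state only has an effect when shuffle=True (recent scikit-learn versions
# raise an error if it is set while shuffle=False)
skf = StratifiedKFold(n_splits=n_folds, random_state=42, shuffle=True)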

layer_options = [[100,100,100],[50,100,50],[100,200,100,100,50],[50,50,50,50,50,50]] # only the inner layers
phi_options = [['relu','relu','relu'],['relu','relu','relu'],['relu','relu','relu','relu','relu'],
               ['relu','relu','relu','relu','relu','relu',]]

k_scores = []
#loop through possible models (use len: the ragged layer_options list has no array shape)
for k in range(len(layer_options)):
  cv_scores = []
  
  #cross validate this model
  for train_ind, test_ind in skf.split(x_train,y_train):
    x_train_cv, x_test_cv = x_train[train_ind], x_train[test_ind]
    y_train_cv, y_test_cv = y_train[train_ind], y_train[test_ind]
    
    #convert format of label data for model fitting
    y_train_cv_cat = keras.utils.to_categorical(y_train_cv,num_classes=10)
    y_test_cv_cat = keras.utils.to_categorical(y_test_cv,num_classes=10)
    
    model = get_model(layer_options[k],phi_options[k])
    model.fit(x_train_cv,y_train_cv_cat,epochs=100, batch_size = 1000)
    cv_scores.append(accuracy(model,x_test_cv,y_test_cv_cat))
  k_scores.append(np.mean(cv_scores))                
print('k_scores are (lower is better):\n', k_scores)
print('Best model is model ',str(np.argmin(k_scores)+1),'with hidden layers:\n',
      layer_options[np.argmin(k_scores)], '\n and activations:\n', phi_options[np.argmin(k_scores)])
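
If you would like to see what the stratified splitting does without waiting for any training, here is a small sketch on made-up labels (200 fake labels, 20 per class; the `features` array is just a placeholder). It assumes scikit-learn is installed, as in the cell above.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

labels = np.repeat(np.arange(10), 20)    # 200 fake labels, 20 per class
features = np.zeros((len(labels), 1))    # placeholder features; only the labels drive the stratification

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_counts = []
for train_ind, test_ind in skf.split(features, labels):
    # count how many samples of each class land in this held-out fold
    counts = np.bincount(labels[test_ind], minlength=10)
    fold_counts.append(counts)
    print(len(train_ind), len(test_ind), counts)
```

Every held-out fold contains the same number of samples from each class, so no fold is scored on an unrepresentative mix of clothing items.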
    
