# MNIST Classification via Neural Network

To build the neural network, we are going to use the TensorFlow library. Along the way we will talk about how to use a neural network to solve a classification problem, and the choices made when building a neural network.

TensorFlow is a numerical computation library that focuses on machine learning models. It is used through a Python API, while its core is implemented in C++ for speed with large-scale data.

In [None]:
pip install tensorflow

In [None]:
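# optional: this pin clears some errors seen with older TensorFlow releases, but isn't necessary for the rest
# (it may conflict with recent TensorFlow versions, so skip it if the install above worked cleanly)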
pip install "numpy<1.17"

## Get the data 
* We can do this by calling the `load_data` function of the `mnist` dataset module in Keras
* By default, the data is already split into training and test sets

In [None]:
from tensorflow.keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

## Explore the data
* Check the shape (the dimensions of the data)
* View an image

In [None]:
print('Shape of training data: ' ,X_train.shape)
print('Amount of testing data: ' , len(X_test))
print('Shape of training classifications: ' , y_train.shape)

In [None]:
import matplotlib.pyplot as plt 
import numpy as np

num = 147
print('Classification is ', y_train[num])
img = np.reshape(X_train[num], (28, 28))
plt.imshow(img,cmap='gray')
plt.show()

## Reshape the data
* Since we are using fully connected layers, which expect input of shape (samples, features), we need to flatten each 28 x 28 image in the training and test sets into a vector of 784 features.
* We will also convert the values to float32 and scale them to the [0, 1] interval.
* Before this step, the values are of type uint8 and lie in the range [0, 255].
* Finally, one-hot encode the labels so that each label becomes a vector with one entry per category.
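
As a minimal sketch of these steps on a tiny made-up batch, the following NumPy code flattens images, rescales them, and builds one-hot labels by indexing an identity matrix. The names `fake_images`, `fake_labels`, and `one_hot` are illustrative and not part of the notebook; the helper is only assumed to mirror what `to_categorical` produces for integer labels.

```python
import numpy as np

# Two fake 28x28 "images" with uint8 pixel values, standing in for MNIST samples.
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255          # one fully white pixel
fake_labels = np.array([3, 7])      # made-up digit labels

# Flatten each image to a 784-long feature vector, then rescale to [0, 1].
flat = fake_images.reshape((fake_images.shape[0], 28 * 28)).astype('float32') / 255

def one_hot(labels, num_classes=10):
    """Return a (samples, num_classes) array with a 1 at each label's index."""
    return np.eye(num_classes, dtype='float32')[labels]

encoded = one_hot(fake_labels)
print(flat.shape, flat.dtype, flat.max())
print(encoded[0])
```

Using `X_train.shape[0]` instead of a hard-coded sample count, as in the sketch, keeps the reshape valid if the dataset size changes.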

In [None]:
X_train = X_train.reshape((60000, 28 * 28))
X_train = X_train.astype('float32')/255

X_test = X_test.reshape((10000, 28 * 28))
X_test = X_test.astype('float32')/255


# one-hot encode the labels

from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

## Examine the data

In [None]:
print(X_train[:10])
plt.imshow(np.reshape(X_train[0],(28,28)),cmap='gray')
print("Our labels have {} classes".format(y_train.shape[1]))

## Build your architecture
We are using the Sequential model in Keras, which stacks layers one after another.
* Here, we are using 3 fully connected (Dense) layers: two hidden layers and one output layer
* Specify the activation function (e.g. ReLU)
* Specify the number of nodes in each hidden layer
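
To get a feel for how the number of nodes drives model size, here is a small sketch that counts the trainable parameters of a stack of fully connected layers. It assumes the layer sizes used in this notebook (784 inputs, two hidden layers of 512 nodes, 10 outputs); `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(inputs, units):
    """A dense layer has one weight per (input, unit) pair plus one bias per unit."""
    return inputs * units + units

# input features, two hidden layers, output classes
layer_sizes = [28 * 28, 512, 512, 10]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)      # parameters in each layer
print(total_params)   # parameters in the whole network
```

Once the model is built, calling `model.summary()` should report per-layer counts that can be compared against this calculation.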

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# instantiate the model
model = Sequential() 

# the first layer is where we specify the input shape
model.add(Dense(512, activation = 'relu', input_shape = (28 * 28, )))  
model.add(Dense(512, activation = 'relu'))

# in the last layer, we specify number of labels to expect
model.add(Dense(10, activation = 'softmax'))   
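
For intuition about what these three layers compute, the sketch below runs the same kind of forward pass in plain NumPy. It is an illustration only: the random weights stand in for trained ones, and all names (`relu`, `softmax`, `W1`, `hidden1`, and so on) are made up for this example rather than taken from Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    """ReLU keeps positive values and replaces negatives with zero."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# 4 fake flattened images with values in [0, 1)
x = rng.random((4, 28 * 28))

# Random weights and zero biases with the same shapes as the notebook's layers.
W1, b1 = rng.normal(0, 0.05, (784, 512)), np.zeros(512)
W2, b2 = rng.normal(0, 0.05, (512, 512)), np.zeros(512)
W3, b3 = rng.normal(0, 0.05, (512, 10)), np.zeros(10)

hidden1 = relu(x @ W1 + b1)           # first hidden layer
hidden2 = relu(hidden1 @ W2 + b2)     # second hidden layer
probs = softmax(hidden2 @ W3 + b3)    # output layer: one probability per digit

print(probs.shape)
print(probs.sum(axis=1))
```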



## Compile the network
* Here, we specify the optimizer to use for our weight update (e.g. RMSprop)
* Specify the loss function (e.g. categorical_crossentropy)
* Metrics to monitor (e.g. accuracy)
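
To make the loss and the metric concrete, here is a hedged NumPy sketch of categorical cross-entropy and accuracy on two made-up predictions. The function name and the clipping constant `eps` are assumptions for this example and are not claimed to match Keras internals exactly.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Two samples, three classes; the true class is index 2 in both cases.
y_true = np.array([[0, 0, 1],
                   [0, 0, 1]])
y_pred = np.array([[0.1, 0.1, 0.8],    # confident and correct
                   [0.6, 0.3, 0.1]])   # confident and wrong

losses = categorical_crossentropy(y_true, y_pred)
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

print(losses)     # the wrong prediction gets the larger loss
print(accuracy)   # fraction of samples whose top class matches the label
```

The loss penalizes a confident wrong answer much more heavily than a confident right one, which is the signal the optimizer uses to update the weights.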

In [None]:
model.compile(optimizer = 'rmsprop',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

## Fit the model
* Specify the number of epochs; one epoch is one full pass of the model over all of the training data.
* Specify the batch size: the number of samples used for each weight update.
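
As a quick worked example of how these two settings interact, the sketch below counts the weight updates for the values used in this notebook (60000 training samples, batch size 128, 5 epochs). It assumes the final, smaller batch of each epoch is still used for an update.

```python
import math

num_samples = 60000   # size of the MNIST training set
batch_size = 128
epochs = 5

# Each epoch is split into batches; the last batch may be smaller than the rest.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

A smaller batch size means more updates per epoch but noisier gradient estimates; a larger one means fewer, smoother updates.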
  
After the model is trained, we will plot some metrics to see how they changed during training.

In [None]:
history = model.fit(X_train, y_train, epochs = 5, batch_size = 128)

# the accuracy key is 'accuracy' in TensorFlow 2 and 'acc' in older Keras versions
acc_key = 'accuracy' if 'accuracy' in history.history else 'acc'

# plot some metrics (training curves only, since no validation data was passed to fit)
fig = plt.figure()
plt.subplot(2,1,1)
plt.plot(history.history[acc_key])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train'], loc='lower right')

plt.subplot(2,1,2)
plt.plot(history.history['loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train'], loc='upper right')

plt.tight_layout()
plt.show()


## Evaluate the model's performance
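* Examine the model's performance on the test data, which it has not seen during training
* Compare the test accuracy with the training accuracy: a test accuracy that is much lower suggests the model overfits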

In [None]:
test_loss, test_acc = model.evaluate(X_test, y_test)
print("Test accuracy is {:.3f}".format(test_acc))

## Save your model
* It is good practice to save your model
* To reuse your saved model, type: from tensorflow.keras.models import load_model
* Then, load_model('my_model.h5')
* The saved file stores the network architecture together with the trained weights

In [None]:
model.save('my_model.h5')

## Generate predictions
* Use the testing data to see the model's predictions
* Display a test image and its prediction

In [None]:
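# predict_classes was removed in newer TensorFlow releases;
# take the index of the largest class probability instead
predictions = np.argmax(model.predict(X_test), axis=1)

img = np.reshape(X_test[4],(28,28))
plt.imshow(img, cmap='gray')
plt.show()
print('Predicted: ',predictions[4])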
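
Because the labels were one-hot encoded earlier, comparing predictions with the truth needs the same argmax step on both sides. The sketch below shows the idea on made-up arrays with three classes; `probabilities` and `one_hot_truth` are hypothetical stand-ins for `model.predict(X_test)` and `y_test`.

```python
import numpy as np

# Made-up class probabilities for three samples.
probabilities = np.array([[0.05, 0.90, 0.05],
                          [0.70, 0.20, 0.10],
                          [0.20, 0.30, 0.50]])

# Made-up one-hot labels for the same three samples.
one_hot_truth = np.array([[0, 1, 0],
                          [0, 0, 1],
                          [0, 0, 1]])

predicted = np.argmax(probabilities, axis=1)   # most probable class per sample
actual = np.argmax(one_hot_truth, axis=1)      # undo the one-hot encoding

wrong = np.flatnonzero(predicted != actual)    # indices of misclassified samples
accuracy = np.mean(predicted == actual)

print(predicted, actual, wrong, accuracy)
```

The indices in `wrong` are a useful starting point for viewing the images the model gets wrong.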

# Workshop Tasks
1. Modify the network architecture by changing the number of layers, the number of nodes, and/or the activation function. Do your changes help or hurt the accuracy?
2. Modify the model fit by changing the number of epochs and/or batch_size. Do your changes help or hurt the accuracy?
3. Try building a model for the Fashion-MNIST dataset, which has clothing classifications instead of digits. You will need to load the Fashion-MNIST data, build an architecture, then compile and fit a model. Then you can inspect accuracy and visualize your predictions.
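
As a starting hint for the third task: the data can be loaded with `fashion_mnist.load_data()` from `tensorflow.keras.datasets`, which returns the same tuple structure as `mnist.load_data()`. The labels are again integers from 0 to 9, so a lookup list makes predictions easier to read. The helper below is a hypothetical convenience, not part of Keras.

```python
# Class names for the ten Fashion-MNIST labels, indexed by integer label.
FASHION_CLASS_NAMES = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]

def label_to_name(label):
    """Map an integer label (0-9) to its clothing category name."""
    return FASHION_CLASS_NAMES[int(label)]

print(label_to_name(9))
```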