# Machine Learning Lab 10
Multilayer Perceptron

**Name:** Fatima Mujahid

**Class:** BESE-10B

**CMS ID:** 289558

**Date:** May 13, 2022

# Introduction

The goal in this competition is to take an image of a single handwritten digit and determine what that digit is.

The data is taken from the MNIST dataset. The MNIST ("Modified National Institute of Standards and Technology") dataset is a classic within the Machine Learning community that has been extensively studied.  More detail about the dataset, including Machine Learning algorithms that have been tried on it and their levels of success, can be found [here][1].


  [1]: http://yann.lecun.com/exdb/mnist/index.html

# Loading the data

In [38]:
import numpy as np # Array manipulation
import pandas as pd # Dataframe manipulation

# Multilayer perceptron Neural Network
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils

In [39]:
# Load data
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

Extract the feature matrix X and convert it to an array of floating-point numbers, then extract the labels.

In [40]:
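# Extract image pixels (every column after the first)
images = train.iloc[:,1:].values
images = images.astype(np.float64)  # np.float was deprecated in NumPy 1.20 and later removed

# Extract digit labels (first column)
labels = train.iloc[:,0].values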



# Multilayer Perceptron

## Preprocessing

The pixel values are grayscale intensities between 0 and 255. It is almost always a good idea to scale the input values when using neural network models. Because the scale is well known and well behaved, we can very quickly **normalize** the pixel values to the range 0 to 1 by dividing each value by the maximum of 255.

Also, the output variable is an integer from 0 to 9. This is a multi-class classification problem. As such, it is good practice to use a **one hot encoding** of the class values, transforming the vector of class integers into a binary matrix. We can easily do this using the built-in np_utils.to_categorical() helper function in Keras.

In [41]:
# Normalize input from 0-255 to 0-1
images = images / 255.0
num_pixels =  images.shape[1]

# one hot encode outputs
labels = np_utils.to_categorical(labels)
num_classes = labels.shape[1]
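To make the two preprocessing steps concrete, here is a minimal NumPy-only sketch of what scaling and one hot encoding do. The `one_hot` helper and the toy arrays are hypothetical, written for illustration only; they are not the Keras implementation and the values are not taken from train.csv.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Toy pixel values and labels (illustrative only)
pixels = np.array([[0, 128, 255], [255, 0, 64]], dtype=np.float64)
scaled = pixels / 255.0                  # every value now lies in [0, 1]

encoded = one_hot([3, 0], num_classes=10)
decoded = np.argmax(encoded, axis=1)     # argmax recovers the integer labels
```

Taking `np.argmax` along each row reverses the encoding, which is the same trick used later to turn predicted probabilities back into digits.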

We are now ready to create our simple neural network model. We will define the model in a function, which is handy if you want to extend the example later and try to get a better score.

The model is a **simple neural network** with **one hidden layer** with the same **number of neurons as there are inputs (784)**. A **rectifier activation function** is used for the neurons in the hidden layer.
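As a rough illustration of what a hidden layer with a rectifier activation computes, the sketch below runs one forward pass in plain NumPy. The random weights, the weight scale of 0.01, and the two fake input rows are assumptions made for the example; a trained Keras layer would hold learned weights instead.

```python
import numpy as np

def relu(x):
    """Rectifier: keep positive values, replace negatives with zero."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(7)

# Two fake "images" with 784 pixel values each, already scaled to [0, 1)
x = rng.random((2, 784))

# Hypothetical weights and biases for a hidden layer with 784 neurons
W = rng.normal(scale=0.01, size=(784, 784))
b = np.zeros(784)

# Affine transform followed by the rectifier activation
hidden = relu(x @ W + b)
```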

A **softmax activation function** is used on the output layer to turn the outputs into probability-like values and allow one class of the 10 to be selected as the model’s output prediction. **Logarithmic loss** is used as the loss function (called **categorical_crossentropy** in Keras) and the efficient **ADAM gradient descent algorithm** is used to **learn the weights**.
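The softmax and logarithmic loss described above can be sketched in a few lines of NumPy. This is a simplified illustration under our own assumptions (the function names, the `eps` clipping value, and the toy logits are hypothetical), not the code Keras runs internally.

```python
import numpy as np

def softmax(logits):
    """Turn each row of raw scores into values that sum to 1."""
    # Subtracting the row maximum avoids overflow without changing the result
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)   # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# Toy example with 3 classes instead of 10
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)

y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])
loss = categorical_crossentropy(y_true, probs)
```

The second row shows the uninformed case: equal scores give equal probabilities, so the loss for that sample is the log of the number of classes.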

## Model

In [42]:
# define baseline model
def mlp_model():
    # create model
    # layer1 has no activation (linear), layer2 applies ReLU,
    # layer3 outputs one probability per digit class
    model = Sequential([
        Dense(784, input_shape=(784,), name="layer1"),
        Dense(784, activation="relu", name="layer2"),
        Dense(10, activation="softmax", name="layer3"),
    ])

    # compile model
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
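To get a feel for the size of this network, the trainable parameters of the three Dense layers above can be counted by hand. The sketch assumes the standard fully connected layout (one weight per input-unit pair plus one bias per unit); the totals should agree with what `model.summary()` reports, although we have not reproduced that output here.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the sizes of layer1, layer2 and layer3
layer_sizes = [784, 784, 784, 10]

per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # parameters in layer1, layer2, layer3
print(total)      # all trainable parameters
```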

We can now fit and evaluate the model. The model is fit **over 10 epochs with updates every 200 images**. A verbose value of 2 is used to reduce the output to one line for each training epoch.

In [43]:
# build the model
model = mlp_model()
# Fit the model (print the array shapes first as a sanity check)
print(images.shape, labels.shape)
model.fit(images, labels, epochs=10, batch_size=200, verbose=2)

(42000, 784) (42000, 10)
Epoch 1/10
210/210 - 7s - loss: 0.2501 - accuracy: 0.9239 - 7s/epoch - 32ms/step
Epoch 2/10
210/210 - 6s - loss: 0.1065 - accuracy: 0.9676 - 6s/epoch - 31ms/step
Epoch 3/10
210/210 - 7s - loss: 0.0780 - accuracy: 0.9746 - 7s/epoch - 32ms/step
Epoch 4/10
210/210 - 7s - loss: 0.0580 - accuracy: 0.9816 - 7s/epoch - 32ms/step
Epoch 5/10
210/210 - 6s - loss: 0.0460 - accuracy: 0.9850 - 6s/epoch - 30ms/step
Epoch 6/10
210/210 - 6s - loss: 0.0381 - accuracy: 0.9879 - 6s/epoch - 29ms/step
Epoch 7/10
210/210 - 6s - loss: 0.0319 - accuracy: 0.9889 - 6s/epoch - 30ms/step
Epoch 8/10
210/210 - 6s - loss: 0.0295 - accuracy: 0.9906 - 6s/epoch - 30ms/step
Epoch 9/10
210/210 - 6s - loss: 0.0256 - accuracy: 0.9909 - 6s/epoch - 30ms/step
Epoch 10/10
210/210 - 6s - loss: 0.0219 - accuracy: 0.9933 - 6s/epoch - 30ms/step


<keras.callbacks.History at 0x7fea2b2f7550>
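The `210/210` counter in the training log follows directly from the dataset size and the batch size. A small sketch of that arithmetic, using the shapes printed above:

```python
import math

num_images = 42000   # rows in the training array
batch_size = 200     # images per weight update
epochs = 10

# One step per batch; a final partial batch would still count as a step
steps_per_epoch = math.ceil(num_images / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```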

Finally, we use the model to make predictions, convert the one hot encoded (binary matrix) results to a vector of labels from 0 to 9, and save our results in a submission file.

## Evaluation

In [44]:
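# Predict class probabilities for the training images and keep the most likely class
train_pred = model(images)
train_pred = np.argmax(train_pred, axis=1)
# Convert the one hot encoded labels back to integer digits
labels = np.argmax(labels, axis=1)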

from sklearn.metrics import confusion_matrix
#print the confusion matrix 
print("CONFUSION MATRIX:\n")
print(confusion_matrix(labels, train_pred))

from sklearn.metrics import classification_report
#print classification_report
print("\nCLASSIFICATION REPORT:\n")
print(classification_report(labels, train_pred))

CONFUSION MATRIX:

[[4119    0    1    0    0    0    3    0    4    5]
 [   0 4674    5    0    1    0    1    2    1    0]
 [   1    1 4146    0   13    1    2    5    7    1]
 [   2    2   15 4272    1   25    1    2   27    4]
 [   0    0    0    0 4067    0    2    2    0    1]
 [   0    0    0    7    1 3758    4    1   20    4]
 [   2    0    0    0    1    3 4130    0    1    0]
 [   0    4    8    0    1    0    0 4386    0    2]
 [   0    2    0    0    3    4    2    3 4033   16]
 [   0    0    0    0   15    0    0    8    2 4163]]

CLASSIFICATION REPORT:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00      4132
           1       1.00      1.00      1.00      4684
           2       0.99      0.99      0.99      4177
           3       1.00      0.98      0.99      4351
           4       0.99      1.00      0.99      4072
           5       0.99      0.99      0.99      3795
           6       1.00      1.00      1.00   
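Each row of the classification report can be recomputed from the confusion matrix printed above, since scikit-learn places true classes on the rows and predicted classes on the columns. The sketch below copies that matrix and derives the per-class metrics with NumPy; the variable names are our own.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([
    [4119,    0,    1,    0,    0,    0,    3,    0,    4,    5],
    [   0, 4674,    5,    0,    1,    0,    1,    2,    1,    0],
    [   1,    1, 4146,    0,   13,    1,    2,    5,    7,    1],
    [   2,    2,   15, 4272,    1,   25,    1,    2,   27,    4],
    [   0,    0,    0,    0, 4067,    0,    2,    2,    0,    1],
    [   0,    0,    0,    7,    1, 3758,    4,    1,   20,    4],
    [   2,    0,    0,    0,    1,    3, 4130,    0,    1,    0],
    [   0,    4,    8,    0,    1,    0,    0, 4386,    0,    2],
    [   0,    2,    0,    0,    3,    4,    2,    3, 4033,   16],
    [   0,    0,    0,    0,   15,    0,    0,    8,    2, 4163],
])

true_positives = np.diag(cm)
support = cm.sum(axis=1)                     # number of true samples per class
recall = true_positives / support            # correct / all samples of that class
precision = true_positives / cm.sum(axis=0)  # correct / all predictions of that class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()

for digit in range(10):
    print(f"{digit}  {precision[digit]:.2f}  {recall[digit]:.2f}  "
          f"{f1[digit]:.2f}  {support[digit]}")
```

Note that these figures are computed on the same images the model was trained on, so they describe how well the network fits the training set rather than how it performs on unseen digits.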

In [49]:
# use the NN model to predict and classify test data
test_images = test.values
test_images = test_images.astype(np.float64)  # np.float was deprecated in NumPy 1.20 and later removed
test_images = test_images / 255.0

test_pred = model(test_images)
test_pred = np.argmax(test_pred, axis=1)
print(test_pred)

# save the trained model (architecture and weights) to an HDF5 file
# submit the submission file on lms along with the notebook file
model.save('model.h5')



[2 0 9 ... 3 9 2]
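The cell above saves the model but does not write the submission file itself. A minimal sketch of that last step follows, assuming the two-column layout commonly used for this kind of competition: a 1-based `ImageId` and the predicted `Label`. The column names, the `write_submission` helper, and the file name mentioned below are assumptions, so check them against the actual submission instructions.

```python
import csv
import io

def write_submission(predictions, handle):
    """Write one 'ImageId,Label' row per prediction to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["ImageId", "Label"])  # assumed header
    for image_id, label in enumerate(predictions, start=1):
        writer.writerow([image_id, int(label)])

# First three predictions from the printed output, used here as a small demo
sample_predictions = [2, 0, 9]

buffer = io.StringIO()
write_submission(sample_predictions, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

For the real file, the same helper could be called with `test_pred` and a handle opened with `open("submission.csv", "w", newline="")`.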
