# Build an Artificial Neural Network for Multi-Class Classification Using the Back-propagation Algorithm and Test It on an Appropriate Dataset

### Database
* The data used is the **MNIST database** (Modified National Institute of Standards and Technology database), which contains 60,000 training images and 10,000 test images.
* The dataset consists of small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9.
* The MNIST dataset is conveniently bundled within Keras, and we can easily analyze some of its features in Python.

In [None]:
%pip install matplotlib

In [None]:
from tensorflow.keras.datasets import mnist     # MNIST dataset is included in Keras
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)

X_train shape (60000, 28, 28)
y_train shape (60000,)
X_test shape (10000, 28, 28)
y_test shape (10000,)


In [None]:
# Plot first few images
import matplotlib.pyplot as plt
for i in range(9):
	# define subplot
	plt.subplot(3,3,i+1) # 3 rows, 3 col, pos
	# plot raw pixel data
	plt.imshow(X_train[i], cmap='gray')
# show the figure
plt.show()

In [None]:
X_train[i].shape

In [None]:
# Each pixel is an 8-bit integer from 0-255 (0 is full black, 255 is full white)
# single-channel pixel or monochrome image
X_train[i][10:20,10:20]

### Formatting the input data

* Reshape (or flatten) the 28x28 image into a 784-length vector.


<img src='https://github.com/AviatorMoser/keras-mnist-tutorial/blob/master/flatten.png?raw=1' width=50%>

* Normalize the input values from the range [0, 255] to the range [0, 1].

Min-max scaling is typically done with the following equation:

$$X_{norm} = \frac{X_{i} - X_{min}}{X_{max} - X_{min}}$$

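where $X_i$ is the $i^{th}$ value in the dataset, and $X_{min}$ and $X_{max}$ are the smallest and largest values. For 8-bit pixels, $X_{min} = 0$ and $X_{max} = 255$, so the equation reduces to $X_{norm} = X_i / 255$.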
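The flattening and scaling steps can be checked on a small synthetic batch before touching the real data. This is only a sketch: `fake_images` holds random values standing in for MNIST digits, and `min_max_scale` is an illustrative helper that is not part of the notebook. It assumes the fixed 8-bit bounds 0 and 255 rather than measuring the minimum and maximum from the data.

```python
import numpy as np

def min_max_scale(x, x_min=0.0, x_max=255.0):
    """Min-max scaling with fixed bounds (assumed: the 8-bit pixel range)."""
    return (x.astype("float32") - x_min) / (x_max - x_min)

# Synthetic stand-in for a batch of five 28x28 grayscale images
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-length vector; -1 infers the batch size
flat = fake_images.reshape(-1, 28 * 28)

scaled = min_max_scale(flat)
print(flat.shape)                                          # (5, 784)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)            # values stay inside [0, 1]
print(np.allclose(scaled, flat.astype("float32") / 255))   # same as dividing by 255
```

Using `reshape(-1, 784)` instead of a hard-coded sample count keeps the same code usable for both the training and the test arrays.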

In [None]:
# reshape 28 x 28 matrices into 784-length vectors
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)

# scale every pixel of every input vector
# first convert the integers to 32-bit floats so they can hold fractional values
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalize by dividing by largest pixel value
X_train /= 255
X_test /= 255

print("Training matrix shape", X_train.shape)
print("Testing matrix shape", X_test.shape)

Training matrix shape (60000, 784)
Testing matrix shape (10000, 784)


### DNN for Multi-class classification using Keras library

#### Build the model

Helpful reference: https://www.analyticsvidhya.com/blog/2021/06/mnist-dataset-prediction-using-keras/

In [None]:
# Sequential Keras model with Dense layers (DIY)

from keras.models import Sequential  # Model type to be used
from keras.layers import Dense       # Type of layer to be used in our model

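mdl = Sequential()
# First Dense layer with 64 units and relu activation, fed by the 784-length input vectors
mdl.add(Dense(64, activation='relu', input_shape=(784,)))
# Hidden layer with 32 units and relu activation
mdl.add(Dense(32, activation='relu'))
# Output layer with 10 units and softmax activation
mdl.add(Dense(10, activation='softmax'))
# Compile model: categorical cross-entropy matches one-hot labels, and the
# 'accuracy' metric is what the learning curves read from the training history.
# The Adam optimizer is an assumption; the exercise leaves this choice open.
mdl.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])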


In [None]:
# Visualize the model
from tensorflow.keras.utils import plot_model
plot_model(mdl, show_shapes=True, show_layer_names=False)

In [None]:
# Display model summary
mdl.summary()

In [None]:
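# Understand the model summary: parameters per Dense layer = inputs * units + biases
784*64 + 64  # first layer: 784 inputs, 64 units

In [None]:
64*32 + 32  # hidden layer: 64 inputs, 32 units

In [None]:
32*10 + 10  # output layer: 32 inputs, 10 units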
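The three calculations above follow one rule, which a short helper makes explicit. This is a sketch under the assumption that the model has exactly the layer sizes named in the build cell (784 inputs, then 64, 32 and 10 units); `dense_params` and `layer_sizes` are illustrative names.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a Dense layer: one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the number of units in each Dense layer
layer_sizes = [784, 64, 32, 10]

# Pair each layer's input size with its unit count
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(counts)       # [50240, 2080, 330]
print(sum(counts))  # 52650
```

The total should match the "Total params" line printed by `mdl.summary()` for a model with these layer sizes.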

#### Convert labels to "one-hot" vectors using the to_categorical function
```
0 -> [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1 -> [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
2 -> [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
etc.
```
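To see what this encoding does, here is a minimal NumPy sketch of the same idea. The `one_hot` helper is illustrative and is not the Keras implementation; it assumes integer labels in the range 0 to `num_classes - 1`.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample = one_hot([0, 2, 9])
print(sample.shape)            # (3, 10)
print(sample[1])               # the 1 sits at index 2
print(sample.argmax(axis=1))   # argmax recovers the original labels
```

Each digit needs a vector of length 10 because there are ten classes, and taking the `argmax` of a row reverses the encoding.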


In [None]:
from tensorflow.keras.utils import to_categorical
y_train1 = to_categorical(y_train)
y_test1 = to_categorical(y_test)
print(y_test[6])
print(y_test1[6,:])

#### Train the model

* If unspecified, `batch_size` defaults to 32; here it is set to 64.
* 60,000 / 64 = 937.5, so each epoch runs 938 minibatches (the last one holds only 32 samples).
* Reference: https://keras.io/api/models/model_training_apis/
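The minibatch arithmetic can be verified in a few lines. `steps_per_epoch` is an illustrative helper name, and the sketch assumes the final, partial batch is still processed rather than dropped.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of minibatches per epoch; the last one may be partial."""
    return math.ceil(n_samples / batch_size)

n_train = 60_000
print(steps_per_epoch(n_train, 64))  # 938
print(n_train % 64)                  # 32 samples left for the final batch
print(steps_per_epoch(n_train, 32))  # 1875 with the default batch size
```

The first number is the step count that the training progress bar shows for each epoch when `batch_size=64`.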

In [None]:
# Train the model
epochs = 10
batch = 64
history = mdl.fit(X_train, y_train1, epochs=epochs, batch_size=batch, verbose=1, validation_data=(X_test, y_test1))

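# Evaluate the model on the test set (returns the loss followed by the compiled metrics)
test_loss, test_acc = mdl.evaluate(X_test, y_test1, verbose=0)
print("Test loss:", test_loss)
print("Test accuracy:", test_acc)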
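Keras's `Model.evaluate` reports the loss and any metrics the model was compiled with. The NumPy sketch below shows what categorical cross-entropy and accuracy measure, assuming one-hot labels and softmax-style probabilities. The three-class arrays are made up for illustration, and the helper functions are not the Keras implementations.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of rows whose most probable class matches the one-hot target."""
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

# Made-up example: three samples, three classes, the last sample is misclassified
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32")
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.5, 0.3, 0.2]], dtype="float32")

print(categorical_crossentropy(y_true, y_prob))
print(accuracy(y_true, y_prob))
```

The loss penalizes low confidence in the correct class, so it can keep changing between epochs even when the accuracy stays the same.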

### Plot Learning graphs

In [None]:
epochRange = range(1, epochs + 1)
plt.plot(epochRange,history.history['loss'])
plt.plot(epochRange,history.history['val_loss'])
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.grid()
plt.xlim((1,epochs))
plt.legend(['Train','Test'])
plt.show()

In [None]:
plt.plot(epochRange,history.history['accuracy'])
plt.plot(epochRange,history.history['val_accuracy'])
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.grid()
plt.xlim((1,epochs))
plt.legend(['Train','Test'])
plt.show()

#### Performance metrics

In [None]:
import numpy as np
yhat_test_mdl_prob = mdl.predict(X_test)
yhat_test_mdl = np.argmax(yhat_test_mdl_prob,axis=-1)
print(yhat_test_mdl_prob[0])
print(yhat_test_mdl[0:10])
print(y_test[0:10])

In [None]:
from sklearn.metrics import accuracy_score
print('Accuracy:')
print(float(accuracy_score(y_test, yhat_test_mdl))*100,'%')

In [None]:
from sklearn.metrics import confusion_matrix
print('Confusion Matrix:')
print(confusion_matrix(y_test, yhat_test_mdl))
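A confusion matrix also yields per-class precision and recall, which show which digits the model confuses. The sketch below assumes the scikit-learn layout (rows are true labels, columns are predictions). `toy_cm` is a made-up three-class matrix, not a result from this model, and `per_class_metrics` is an illustrative helper.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision and recall per class from a confusion matrix
    whose rows are true labels and whose columns are predictions."""
    cm = np.asarray(cm, dtype="float64")
    true_pos = np.diag(cm)
    recall = true_pos / cm.sum(axis=1)     # row sums = samples per true class
    precision = true_pos / cm.sum(axis=0)  # column sums = samples per predicted class
    return precision, recall

# Made-up 3-class confusion matrix for illustration only
toy_cm = [[5, 1, 0],
          [0, 4, 2],
          [1, 0, 7]]

precision, recall = per_class_metrics(toy_cm)
print(np.round(precision, 3))
print(np.round(recall, 3))
print(np.trace(toy_cm) / np.sum(toy_cm))  # overall accuracy: correct / total
```

The same function can be applied to the 10×10 matrix returned by `confusion_matrix(y_test, yhat_test_mdl)`, provided every class is predicted at least once so that no column sum is zero.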