- Google Colab Notebook: Jupyter Notebook
- GitHub Repository: Repository
- Paper: Classification of MNIST 70,000 Handwritten Digits 0-9 Image Data Set
- Categorical Cross Entropy Algorithm (a code sketch of the full pipeline follows the parameter list below):
- Load the Modified National Institute of Standards and Technology (MNIST) Handwritten digits 0-9 data set
- Train/Test split the data at a ratio of 6:1, respectively
- Flatten each image from a 28x28 pixel grid into a 784x1 vector
- Normalise the image pixels by dividing by the maximum grayscale intensity level, L = 255
- Create 10 categories (one-hot encoded) for the 10 digits 0-9 to be classified
- Categorical Cross Entropy (CE) Model Parameters:
- Categorical Cross Entropy (CE) Loss Function:
- Model Accuracy:
- Neural Network Architecture:
- Input Layer = 16 hyperbolic tangent activation (tanh) neurons with an input shape of 784x1
- Hidden Layer = 16 hyperbolic tangent activation (tanh) neurons with an input shape of 16x1
- Output Layer = 10 softmax neurons
- Stochastic Gradient Descent optimizer
- Learning Rate = 0.4
- Exponential Decay Factor = 0
- Momentum = 0.5
- Train Duration: 10 Epochs
- Batch Size = 128
- Training Samples = 60,000
- Testing Samples = 10,000
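A minimal sketch of the pipeline above using the Keras API (variable and layer names are illustrative and may differ from the notebook):

```python
import numpy as np
from tensorflow import keras

# Load MNIST: 60,000 training and 10,000 testing samples (a 6:1 split)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Flatten each 28x28 image into a 784-vector and normalise by L = 255
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# One-hot encode the 10 digit classes 0-9
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Architecture: two tanh layers of 16 neurons, then a 10-way softmax
model = keras.Sequential([
    keras.layers.Dense(16, activation="tanh", input_shape=(784,)),
    keras.layers.Dense(16, activation="tanh"),
    keras.layers.Dense(10, activation="softmax"),
])

# SGD with learning rate 0.4, momentum 0.5, no decay; categorical CE loss
sgd = keras.optimizers.SGD(learning_rate=0.4, momentum=0.5)
model.compile(optimizer=sgd, loss="categorical_crossentropy",
              metrics=["accuracy"])

# Train for 10 epochs with batch size 128
history = model.fit(x_train, y_train, epochs=10, batch_size=128,
                    validation_data=(x_test, y_test))
```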
Creating 10 classes for the 10 handwritten digits 0-9
CE = -\sum_{i=1}^{C} t_i \log(s_i)

Where t_i refers to the i-th element of the target vector, s_i refers to the i-th element of the model's output vector, and C is the number of classes.
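As a quick illustration (a minimal NumPy sketch; the target and output values below are made up, not taken from the model):

```python
import numpy as np

# One-hot target for digit "3" and a hypothetical softmax output (C = 10)
t = np.zeros(10); t[3] = 1.0
s = np.array([0.01, 0.02, 0.05, 0.80, 0.02,
              0.02, 0.03, 0.02, 0.02, 0.01])

# CE = -sum_i t_i * log(s_i); only the true-class term survives
ce = -np.sum(t * np.log(s))
print(ce)  # -log(0.80) ≈ 0.223
```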
Visualization of Log Loss (Cross Entropy)
Cross Entropy between probability distributions for each Class
L = \frac{1}{M} \sum_{k=1}^{M} CE(t_k, s_k)

Where M is the number of samples in the dataset, t_k is the target vector for the k-th sample, and s_k is the model's output vector for the k-th sample.
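A corresponding sketch of the dataset-level average, assuming NumPy arrays T and S of shape (M, C) holding the targets and model outputs:

```python
import numpy as np

def mean_cross_entropy(T, S, eps=1e-12):
    """Average categorical CE over M samples.

    T, S: arrays of shape (M, C) holding one-hot targets and
    softmax outputs; eps guards log(0) for near-zero probabilities.
    """
    return -np.mean(np.sum(T * np.log(S + eps), axis=1))

# Example with two samples (digits 3 and 7) and hypothetical outputs
T = np.eye(10)[[3, 7]]                          # one-hot targets, shape (2, 10)
S = np.full((2, 10), 0.02); S[0, 3] = 0.82; S[1, 7] = 0.82
print(mean_cross_entropy(T, S))                 # -log(0.82) ≈ 0.198
```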
7. Show Results:
Visualization of Model Loss and Accuracy (0.1532 and 95.49%, respectively)
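The loss and accuracy curves can be drawn from the `history` object returned by `model.fit` in the pipeline sketch above (matplotlib assumed; the figure styling here is illustrative):

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(history.history["loss"], label="train")
ax1.plot(history.history["val_loss"], label="test")
ax1.set_title("Model Loss"); ax1.set_xlabel("Epoch"); ax1.legend()
ax2.plot(history.history["accuracy"], label="train")
ax2.plot(history.history["val_accuracy"], label="test")
ax2.set_title("Model Accuracy"); ax2.set_xlabel("Epoch"); ax2.legend()
plt.show()
```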
Visualization of First Layer Weights W1 from Neural Network Architecture
Visualization of Second Layer Weights W2 from Neural Network Architecture
Visualization of Third Layer Weights W3 from Neural Network Architecture
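One way to render the layer weights, continuing from the `model` in the pipeline sketch above: each of the 16 columns of the 784x16 first-layer matrix W1 reshapes back to a 28x28 image (a sketch, not the notebook's exact plotting code; W2 and W3 follow the same pattern via `model.layers[1]` and `model.layers[2]`):

```python
import matplotlib.pyplot as plt

# W1 has shape (784, 16): one 784-vector of weights per hidden neuron
W1 = model.layers[0].get_weights()[0]

fig, axes = plt.subplots(2, 8, figsize=(12, 3))
for i, ax in enumerate(axes.flat):
    ax.imshow(W1[:, i].reshape(28, 28), cmap="gray")
    ax.set_title(f"neuron {i}")
    ax.axis("off")
plt.show()
```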