3. Classification of MNIST 70,000 Handwritten Digits 0-9 Image Data Set

  • Google Colab Notebook: Jupyter Notebook
  • GitHub Repository: Repository
  • Paper: Classification of MNIST 70,000 Handwritten Digits 0-9 Image Data Set
  • Categorical Cross Entropy Algorithm:
    1. Load the Modified National Institute of Standards and Technology (MNIST) Handwritten digits 0-9 data set
    2. Split the data into training and test sets at a ratio of 6:1 (60,000 training and 10,000 test images)
    3. Reshape each image from 28x28 pixels to a 784x1 vector
    4. Normalise the image pixels to the range [0, 1] by dividing by the maximum gray-scale intensity level L (MNIST images are 8-bit, so intensities run from 0 to 255)
    5. Create 10 Categories for the 10 digits 0-9 to be classified

    6. Create the 10 classes for the handwritten digits 0-9
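
The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration and not the notebook's code: the helper names (`flatten`, `normalise`, `one_hot`), the toy image, and the divisor of 255 are assumptions made for the example. In practice a library such as NumPy or Keras would do the same work on whole arrays.

```python
def flatten(image):
    """Flatten a 2-D image (a list of rows) into a 1-D list of pixels."""
    return [pixel for row in image for pixel in row]


def normalise(pixels, max_level=255):
    """Scale integer pixel intensities into [0, 1] (255 assumed for 8-bit images)."""
    return [p / max_level for p in pixels]


def one_hot(label, num_classes=10):
    """Encode a digit label 0-9 as a one-hot target vector."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# Hypothetical 28x28 image: all black except one white pixel.
image = [[0] * 28 for _ in range(28)]
image[0][1] = 255

x = normalise(flatten(image))  # 784 values in [0, 1]
t = one_hot(3)                 # target vector for the digit 3

# A 6:1 split of the 70,000 images.
total = 70_000
n_train = total * 6 // 7
n_test = total - n_train
```

The split arithmetic reproduces the 60,000 training and 10,000 testing samples listed in the model parameters.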

    7. Categorical Cross Entropy (CE) Model Parameters:
      • Categorical Cross Entropy (CE) Loss Function:

        $$CE = -\sum_{i=1}^{C} t_i \log(s_i)$$

      • Where: t_i is the i-th element of the target vector, s_i is the i-th element of the model's output vector, and C is the number of classes.

        Visualization of Log Loss (Cross Entropy)

        Cross Entropy between probability distributions for each Class
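
As a small worked example of the loss, the sketch below evaluates the standard categorical cross entropy, CE = -sum_i t_i log(s_i), for one sample with a one-hot target. The function name, the `eps` guard against log(0), and the three-class toy vectors are assumptions for illustration only.

```python
import math


def categorical_cross_entropy(t, s, eps=1e-12):
    """Cross entropy of one sample: -sum_i t_i * log(s_i).

    eps is an assumed guard that keeps log() away from zero.
    """
    return -sum(t_i * math.log(max(s_i, eps)) for t_i, s_i in zip(t, s))


t = [0.0, 0.0, 1.0]             # one-hot target: the true class is index 2
confident = [0.05, 0.05, 0.90]  # output that favours the true class
unsure = [0.40, 0.30, 0.30]     # output that favours a wrong class

loss_confident = categorical_cross_entropy(t, confident)
loss_unsure = categorical_cross_entropy(t, unsure)
```

With a one-hot target only the true class contributes, so the loss reduces to -log of the probability assigned to that class, and the confident prediction gets the smaller loss.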

      • Model Accuracy:

      • Where: M is the number of samples in the dataset, t_k is the target vector for the k-th sample, and s_k is the model's output vector for the k-th sample.
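
The accuracy formula itself is not reproduced in this text. A common definition that matches the symbols above is the fraction of the M samples for which the largest entry of the output vector sits at the same index as the largest entry of the target vector. The sketch below assumes that definition; the helper names and the toy data are hypothetical.

```python
def argmax(vec):
    """Index of the largest element in a vector."""
    return max(range(len(vec)), key=lambda i: vec[i])


def accuracy(targets, outputs):
    """Fraction of samples whose predicted class equals the target class."""
    correct = sum(argmax(t) == argmax(s) for t, s in zip(targets, outputs))
    return correct / len(targets)


# Four hypothetical samples with three classes each.
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
outputs = [
    [0.7, 0.2, 0.1],  # predicts class 0, correct
    [0.1, 0.8, 0.1],  # predicts class 1, correct
    [0.3, 0.4, 0.3],  # predicts class 1, wrong (target is class 2)
    [0.2, 0.5, 0.3],  # predicts class 1, correct
]

acc = accuracy(targets, outputs)  # 3 of 4 correct
```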

      • Neural Network Architecture:
        • Input Layer = 16 neurons with hyperbolic tangent (tanh) activation and an input shape of 784x1
        • Hidden Layer = 16 neurons with hyperbolic tangent (tanh) activation and an input shape of 16x1
        • Output Layer = 10 softmax neurons

        • Classification Neural Network Architecture
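
To make the layer shapes concrete, here is a minimal forward pass through a 784-16-16-10 network in plain Python. It is a sketch under assumptions: the weight initialisation, the presence of bias terms, and all helper names are chosen for illustration and are not taken from the notebook.

```python
import math
import random


def tanh(z):
    """Element-wise hyperbolic tangent."""
    return [math.tanh(v) for v in z]


def softmax(z):
    """Numerically stable softmax: shift by the maximum before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases, activation):
    """Fully connected layer: activation(W x + b)."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, biases)]
    return activation(z)


def init_layer(n_in, n_out, rng):
    """Assumed initialisation: small uniform weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
sizes = [784, 16, 16, 10]  # input vector, two tanh layers, softmax output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
activations = [tanh, tanh, softmax]


def forward(x):
    """Propagate one flattened image through the three layers."""
    for (w, b), act in zip(layers, activations):
        x = dense(x, w, b, act)
    return x


probs = forward([0.5] * 784)  # 10 class probabilities
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

If each layer has a bias vector, as assumed here, the network has 13,002 trainable parameters: 12,560 in the first layer, 272 in the second, and 170 in the output layer.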

      • Stochastic Gradient Descent optimizer
        • Learning Rate = 0.4
        • Exponential Decay Factor = 0
        • Momentum = 0.5
      • Training Duration = 10 Epochs
      • Batch Size = 128
      • Training Samples = 60,000
      • Testing Samples = 10,000
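
The optimizer settings above can be illustrated with the classical momentum update, where a velocity term accumulates past gradients. The sketch assumes that form of the rule (the text does not say which framework or variant was used) and applies it to a toy one-parameter objective, f(w) = w^2. With a decay factor of 0 the learning rate stays constant. The function name and the toy problem are hypothetical.

```python
import math


def sgd_momentum_step(w, grad, velocity, lr=0.4, momentum=0.5):
    """One update: v <- momentum * v - lr * grad, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_velocity)]
    return new_w, new_velocity


# Toy problem: minimise f(w) = w^2, whose gradient is 2w.
w, v = [1.0], [0.0]
for _ in range(20):
    grad = [2.0 * w[0]]
    w, v = sgd_momentum_step(w, grad, v)

# Number of weight updates implied by the training settings.
steps_per_epoch = math.ceil(60_000 / 128)  # the last batch is smaller
total_steps = 10 * steps_per_epoch
```

With 60,000 training samples and a batch size of 128, each epoch takes 469 updates, so 10 epochs amount to 4,690 updates.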

8. Show Results:

Visualization of Model Loss and Accuracy (0.1532 and 95.49%, respectively)

Visualization of First Layer Weights W1 from Neural Network Architecture

Visualization of Second Layer Weights W2 from Neural Network Architecture

Visualization of Third Layer Weights W3 from Neural Network Architecture


