This project trains a deep learning model to recognize and classify handwritten digits from 0 to 9 using the MNIST dataset. To extend this fairly simple task, we'll also import images of digits that I drew myself and check whether the model correctly identifies the digit in each of them.

# Importing and loading the dataset

In [1]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow import keras
import numpy as np
from PIL import Image

In [2]:
# Loading MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
[1m11490434/11490434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 0us/step


In [3]:
# Understanding how the dataset was imported
for i in [x_train, y_train, x_test, y_test]:
    print(i.shape, type(i))

(60000, 28, 28) <class 'numpy.ndarray'>
(60000,) <class 'numpy.ndarray'>
(10000, 28, 28) <class 'numpy.ndarray'>
(10000,) <class 'numpy.ndarray'>


From the output above, we can see that there are 60,000 images in the training set and 10,000 in the test set, each 28x28 pixels. All the data is stored in NumPy arrays.

In [None]:
# Normalizing data (0 to 1)
x_train = x_train / 255.0
x_test = x_test / 255.0
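
As a quick check of what this division does, here is a minimal sketch on a small stand-in array. The name `fake_images` and its random contents are illustrative only and are not MNIST data. Dividing a `uint8` array by `255.0` promotes it to floating point and maps the range 0 to 255 onto 0 to 1.

```python
import numpy as np

# Stand-in for a few 28x28 greyscale images (hypothetical data, not MNIST)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255  # make sure both extremes are present
fake_images[0, 0, 1] = 0

# Same operation as in the cell above
scaled = fake_images / 255.0

print(fake_images.dtype, '->', scaled.dtype)
print('min:', scaled.min(), 'max:', scaled.max())
```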

# Neural Network architecture, training and evaluation

We'll first train a simple neural network with a few hidden layers and see how it performs both on the test set and on the images of my own handwritten digits. After that, we'll train a convolutional neural network (CNN), a type of model known for performing well at image recognition and classification.

In [None]:
# Flattening data (for the simpler neural network)
x_train_flat = x_train.reshape(60000, 784)
x_test_flat = x_test.reshape(10000, 784)
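
To illustrate what flattening does, here is a small sketch on a made-up array (`batch` is a hypothetical stand-in for the image data). It uses `reshape(-1, 784)`, which lets NumPy infer the number of images instead of hard-coding it, and shows that pixel `(row, col)` ends up at position `row * 28 + col` of the flat vector.

```python
import numpy as np

# Two fake 28x28 "images" whose pixel values are just running indices
batch = np.arange(2 * 28 * 28).reshape(2, 28, 28)

# -1 asks NumPy to work out the first dimension from the total size
flat = batch.reshape(-1, 784)
print(flat.shape)

# Rows are laid out one after another, so (row, col) maps to row * 28 + col
row, col = 3, 5
print(flat[1, row * 28 + col] == batch[1, row, col])
```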

In [None]:
# Building model architecture
model = keras.models.Sequential()
model.add(keras.layers.Dense(500, activation='relu', input_shape=(784,)))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(200, activation='relu'))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(100, activation='relu'))
model.add(keras.layers.Dropout(0.1))
model.add(keras.layers.Dense(10, activation='softmax'))
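
To get a feel for the size of this network, the following sketch counts its trainable parameters by hand. It assumes the standard dense-layer formula (one weight per input-output pair plus one bias per output); the helper `dense_params` is a name introduced here for illustration. Dropout layers have no parameters, so they are left out.

```python
# Layer widths of the dense network above: input, three hidden layers, output
layer_sizes = [784, 500, 200, 100, 10]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one dense layer."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

for (a, b), n in zip(zip(layer_sizes, layer_sizes[1:]), per_layer):
    print(f'Dense {a} -> {b}: {n} parameters')
print('Total:', total)
```

The count should agree with what `model.summary()` reports for the same architecture.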

In [None]:
# Compiling model
model.compile(optimizer='Adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
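
The `sparse_categorical_crossentropy` loss takes integer labels (0 to 9 here) rather than one-hot vectors. As a rough sketch of the idea, not Keras's actual implementation, the loss for each sample is the negative log of the probability the model assigned to the true class. The function name and the example probabilities below are made up for illustration.

```python
import numpy as np

def sparse_cce_sketch(y_true, y_prob, eps=1e-7):
    """Per-sample negative log-probability of the true class (illustrative)."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    picked = y_prob[np.arange(len(y_true)), y_true]  # probability of the true class
    return -np.log(picked)

# Hypothetical predictions for two samples over three classes
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1]])
labels = np.array([2, 1])  # integer class labels, no one-hot encoding needed

losses = sparse_cce_sketch(labels, probs)
print(losses)         # small loss for the confident correct guess, large for the wrong one
print(losses.mean())  # averaged over the batch
```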

In [None]:
# Training the model
model.fit(x_train_flat, y_train, epochs=3, validation_split=0.2, shuffle=True)

In [None]:
# Evaluating model performance on test set
test_loss, test_accuracy = model.evaluate(x_test_flat, y_test)
print(test_loss, test_accuracy)

# Neural Network performance on my own handwritten digits

Now we'll see whether the model can correctly classify my own handwritten digits!

In [None]:
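# Loading image of a 1
img = Image.open('/content/drive/MyDrive/Model Testing - 1.png')
img_grey = img.convert('L')  # convert to 8-bit greyscale
length, height = img_grey.size  # PIL's size is (width, height)
pixels = img_grey.load()
# Invert the pixel values so the image matches MNIST,
# where digits are light on a dark background
for i in range(length):
    for j in range(height):
        pixels[i, j] = 255 - pixels[i, j]
img_grey = img_grey.resize((28, 28))  # same resolution as the MNIST images
img_array = np.array(img_grey)
img_array = img_array / 255.0  # same 0-to-1 scaling as the training data
img_flat = img_array.reshape(1, 784)  # batch of one flattened image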

In [None]:
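# Loading image of a 4
img_4 = Image.open('/content/drive/MyDrive/Model Testing - 4.png')
img_4_grey = img_4.convert('L')  # convert to 8-bit greyscale
length_4, height_4 = img_4_grey.size  # PIL's size is (width, height)
pixels = img_4_grey.load()
# Invert the pixel values so the image matches MNIST,
# where digits are light on a dark background
for i in range(length_4):
    for j in range(height_4):
        pixels[i, j] = 255 - pixels[i, j]
img_4_grey = img_4_grey.resize((28, 28))  # same resolution as the MNIST images
img_4_array = np.array(img_4_grey)
img_4_array = img_4_array / 255.0  # same 0-to-1 scaling as the training data
img_4_flat = img_4_array.reshape(1, 784)  # batch of one flattened image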
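
Since both cells above repeat the same steps, they could be folded into one helper. The sketch below is one possible version, not the code used in this notebook: `preprocess_digit` is a hypothetical name, and it swaps the explicit pixel loop for Pillow's `ImageOps.invert`, which performs the same `255 - value` inversion on a greyscale image. A blank in-memory image stands in for the real files so the example runs on its own.

```python
import numpy as np
from PIL import Image, ImageOps

def preprocess_digit(img):
    """Greyscale, invert, resize to 28x28 and scale to [0, 1] (illustrative helper)."""
    grey = img.convert('L')           # 8-bit greyscale
    inverted = ImageOps.invert(grey)  # dark ink on light paper -> light on dark
    small = inverted.resize((28, 28))
    return np.asarray(small) / 255.0

# Stand-in for a photo of a digit: a plain white page (hypothetical input)
demo = Image.new('RGB', (100, 80), 'white')
demo_array = preprocess_digit(demo)
demo_flat = demo_array.reshape(1, 784)  # shape expected by the dense network

print(demo_array.shape, demo_flat.shape)
print(demo_array.min(), demo_array.max())
```

With the real files, the call would look like `preprocess_digit(Image.open(path))`, where `path` points to one of the images above.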

In [None]:
# Testing
prediction = model.predict(img_flat)
predicted_class = np.argmax(prediction)
print('The image is probably the digit: ', predicted_class)
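
Because the last layer is a softmax over 10 units, `model.predict` on a single image returns an array of shape `(1, 10)` whose entries sum to 1, and `np.argmax` picks the most probable digit. The sketch below uses a made-up probability vector (`example_prediction` is hypothetical, not an output of the model above) to show how the top guess and the runner-up can be read off, which helps judge how confident a prediction is.

```python
import numpy as np

# Hypothetical softmax output for one image: one probability per digit 0-9
example_prediction = np.array([[0.02, 0.05, 0.01, 0.40, 0.02,
                                0.03, 0.02, 0.35, 0.05, 0.05]])

predicted_class = np.argmax(example_prediction)
confidence = example_prediction[0, predicted_class]

# Indices of the two most probable digits, best first
top2 = np.argsort(example_prediction[0])[::-1][:2]

print('Predicted digit:', predicted_class, 'with probability', confidence)
print('Top two candidates:', top2)
```

A top probability that is only slightly above the runner-up, as in this invented example, suggests the model is unsure.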

We quickly see that our simple neural network is not predicting my handwritten digits very well :(

To improve on this, we'll create a convolutional neural network, which is well known for its strong performance on image-related tasks.
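
To give an intuition for what a convolutional layer computes, here is a minimal NumPy sketch of a single 2x2 filter sliding over an image without padding. This is an illustration only: `conv2d_valid`, the toy image and the hand-picked kernel are assumptions made here, whereas a real `Conv2D` layer learns many such kernels and adds a bias and an activation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 28x28 image: dark left half, bright right half
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

response = conv2d_valid(image, kernel)
print(response.shape)                # one row and one column smaller than the input
print(np.nonzero(response[0])[0])    # columns where the filter fires
```

The filter responds only where the edge is, regardless of its vertical position, which is the kind of local, position-independent feature that a dense network on flattened pixels has to learn separately for every location.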

# Convolutional Neural Network (CNN) architecture, training and evaluation

In [None]:
# CNN design and architecture
cnn_model = keras.Sequential()
cnn_model.add(keras.layers.Conv2D(28, kernel_size=(2,2), activation='relu', input_shape=(28, 28, 1)))
cnn_model.add(keras.layers.MaxPool2D())
cnn_model.add(keras.layers.Dropout(0.3))
cnn_model.add(keras.layers.Conv2D(56, kernel_size=(2,2), activation='relu'))
cnn_model.add(keras.layers.MaxPool2D())
cnn_model.add(keras.layers.Flatten())
cnn_model.add(keras.layers.Dense(56, activation='relu'))
cnn_model.add(keras.layers.Dropout(0.2))
cnn_model.add(keras.layers.Dense(10, activation='softmax'))
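
The following sketch traces how the image size changes through this CNN and counts its parameters by hand. It assumes the Keras defaults that the cell above relies on: `Conv2D` with `padding='valid'` and stride 1, and `MaxPool2D` with a 2x2 window. The helper names are introduced here for illustration.

```python
def conv_out(size, kernel):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling with window and stride `pool`."""
    return size // pool

size = 28
size = conv_out(size, 2)   # after first Conv2D (28 filters)
size = pool_out(size)      # after first MaxPool2D
size = conv_out(size, 2)   # after second Conv2D (56 filters)
size = pool_out(size)      # after second MaxPool2D
flat_features = size * size * 56  # input width of the first Dense layer

params = {
    'conv1': 2 * 2 * 1 * 28 + 28,      # kernel weights + biases
    'conv2': 2 * 2 * 28 * 56 + 56,
    'dense1': flat_features * 56 + 56,
    'dense2': 56 * 10 + 10,
}
total = sum(params.values())

print('Final feature map:', size, 'x', size, '->', flat_features, 'features')
print(params)
print('Total parameters:', total)
```

Most of the parameters sit in the first dense layer, and the whole CNN is still several times smaller than the dense network trained earlier.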


In [None]:
# Compiling CNN
cnn_model.compile(optimizer='adam',
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])

In [None]:
# Training CNN
cnn_model.fit(x_train, y_train, epochs=3, validation_split=0.2, shuffle=True)

Epoch 1/3
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 38s 24ms/step - accuracy: 0.8027 - loss: 0.6160 - val_accuracy: 0.9693 - val_loss: 0.1026
Epoch 2/3
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 42s 24ms/step - accuracy: 0.9564 - loss: 0.1377 - val_accuracy: 0.9770 - val_loss: 0.0766
Epoch 3/3
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 42s 25ms/step - accuracy: 0.9679 - loss: 0.1005 - val_accuracy: 0.9827 - val_loss: 0.0564


<keras.src.callbacks.history.History at 0x7c5ffaf63130>
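
The `1500/1500` in the training log follows from the arguments passed to `fit`. The arithmetic below is a sketch that assumes the Keras default batch size of 32 (no `batch_size` was passed) and that `validation_split=0.2` holds out 20% of the 60,000 training images.

```python
import math

n_samples = 60000
validation_split = 0.2
batch_size = 32  # Keras default when batch_size is not given

n_val = round(n_samples * validation_split)  # images held out for validation
n_train = n_samples - n_val                  # images actually used for training
steps_per_epoch = math.ceil(n_train / batch_size)

# The same reasoning applies to evaluate() on the 10,000 test images
test_steps = math.ceil(10000 / batch_size)

print(n_train, n_val, steps_per_epoch, test_steps)
```

Keras takes the validation samples from the end of the arrays before any shuffling, so `shuffle=True` only affects the order of the remaining training samples.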

In [None]:
# Evaluating CNN performance on test set
cnn_test_loss, cnn_test_accuracy = cnn_model.evaluate(x_test, y_test)
print(cnn_test_loss, cnn_test_accuracy)

313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.9781 - loss: 0.0629
0.05135170742869377 0.9829000234603882


In [None]:
# The greatest task: correctly identifying my own handwritten digits
cnn_prediction = cnn_model.predict(tf.expand_dims(img_array, axis=0))
cnn_predicted_class = np.argmax(cnn_prediction)
print('The image is probably the digit: ', cnn_predicted_class)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step
The image is probably the digit:  0


The CNN model **correctly** identified my own handwritten digits!!!

Note that my choices for the number of hidden layers, the dropout rates and other parameters came from trial and error and from identifying where the issues might lie (overfitting, lack of model complexity...).