<a href="https://colab.research.google.com/github/tc-wandering/mnist-digit-classifier/blob/main/MNIST_digit_classifier.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The MNIST dataset contains 60,000 grayscale 28×28 images of handwritten digits (0–9) for training, plus 10,000 test images. We’ll build a simple convolutional neural network (CNN) to classify these digits. Start by opening a Colab notebook and importing TensorFlow/Keras.

In [20]:
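# Colab ships with TensorFlow preinstalled; the install line matters mainly elsewhere.
!pip install tensorflow keras
import numpy as np
import tensorflow as tf
from tensorflow import keras
# Import layers from the same namespace as `keras` above to avoid mixing packages.
from tensorflow.keras import layers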



Load data: Keras ships MNIST as a built-in dataset. `load_data()` returns it already split into training and test sets.

In [21]:
num_classes = 10            # digits 0-9
input_shape = (28, 28, 1)   # height, width, channels
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
# Add a trailing channel axis for the convnet: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
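
To make the two preprocessing steps concrete, here is a minimal sketch on a tiny 2×2 stand-in image using plain Python lists. The names `raw_image`, `scaled`, and `with_channel` are illustrative only and are not part of the notebook; the real cell does the same thing on NumPy arrays.

```python
# A tiny 2x2 stand-in for one 28x28 MNIST image (raw pixel values 0-255).
raw_image = [[0, 51], [102, 255]]

# Scale each pixel to [0, 1], mirroring `x.astype("float32") / 255.0`.
scaled = [[pixel / 255.0 for pixel in row] for row in raw_image]

# Add a trailing channel axis, mirroring `np.expand_dims(x, -1)`:
# every scalar pixel becomes a one-element list, so (2, 2) -> (2, 2, 1).
with_channel = [[[pixel] for pixel in row] for row in scaled]

print(scaled)
print(with_channel)
```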


After loading, the pixel values are scaled to [0, 1] and a trailing channel dimension is added, giving arrays of shape (N, 28, 28, 1). Next, one-hot encode the labels:

In [22]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
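
As a rough illustration of what one-hot encoding produces, the sketch below builds the vector for a single label in plain Python. The helper `one_hot` is hypothetical and written for this example; the notebook itself relies on `keras.utils.to_categorical`.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

# The digit 3 becomes a length-10 vector whose only non-zero entry is index 3.
encoded = one_hot(3)
print(encoded)
```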

Build the model: Create a Sequential CNN. This architecture (two convolution layers, each followed by max pooling, then Flatten, Dropout, and a Dense softmax layer) matches the "Simple MNIST convnet" example from the Keras documentation.

In [23]:
model = keras.Sequential([
    keras.Input(shape=input_shape),
    layers.Conv2D(32, kernel_size=(3,3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Conv2D(64, kernel_size=(3,3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation="softmax"),
])
model.summary()
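
The shapes and parameter counts that `model.summary()` prints can be checked by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults used above (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helpers `conv2d` and `max_pool` are hypothetical names for this example, not Keras functions.

```python
def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    height, width, channels = shape
    out_shape = (height - kernel + 1, width - kernel + 1, filters)
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return out_shape, params

def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)

shape = (28, 28, 1)
shape, conv1_params = conv2d(shape, 32)   # -> (26, 26, 32)
shape = max_pool(shape)                   # -> (13, 13, 32)
shape, conv2_params = conv2d(shape, 64)   # -> (11, 11, 64)
shape = max_pool(shape)                   # -> (5, 5, 64)

flat_size = shape[0] * shape[1] * shape[2]    # Flatten: 5 * 5 * 64
dense_params = flat_size * 10 + 10            # Dense(10): weights + biases
total_params = conv1_params + conv2_params + dense_params

print(flat_size, conv1_params, conv2_params, dense_params, total_params)
```

Flatten and Dropout add no trainable parameters, so the total comes only from the two convolutions and the final Dense layer.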

Compile and train: Use categorical crossentropy loss and the Adam optimizer. `validation_split=0.1` holds out 10% of the training data for validation, and `fit` reports training and validation accuracy after each epoch. Even a few epochs often yield >98% accuracy.

In [24]:
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=10, validation_split=0.1)

Epoch 1/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.7592 - loss: 0.7720 - val_accuracy: 0.9782 - val_loss: 0.0821
Epoch 2/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - accuracy: 0.9620 - loss: 0.1228 - val_accuracy: 0.9837 - val_loss: 0.0598
Epoch 3/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9723 - loss: 0.0881 - val_accuracy: 0.9878 - val_loss: 0.0469
Epoch 4/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.9787 - loss: 0.0715 - val_accuracy: 0.9892 - val_loss: 0.0423
Epoch 5/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9811 - loss: 0.0600 - val_accuracy: 0.9890 - val_loss: 0.0399
Epoch 6/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.9822 - loss: 0.0569 - val_accuracy: 0.9905 - val_loss: 0.0367
Epoch 7/10
422/422

<keras.src.callbacks.history.History at 0x7ee18ea2f210>
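
The `422/422` counter in the training log is the number of batches per epoch. A quick sketch of where that number comes from, assuming the `fit` arguments used above (`batch_size=128`, `validation_split=0.1`) and the 60,000 training images; the variable names are illustrative only.

```python
import math

total_train_images = 60_000
validation_split = 0.1
batch_size = 128

# 10% of the training images are held out for validation.
num_validation = round(total_train_images * validation_split)
num_train = total_train_images - num_validation

# The last batch of an epoch may be smaller, so round up.
steps_per_epoch = math.ceil(num_train / batch_size)

print(num_train, num_validation, steps_per_epoch)
```

With these settings, each epoch trains on 54,000 images and validates on the remaining 6,000.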

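Evaluate: Finally, evaluate on the held-out test set, which the model never saw during training. `evaluate` returns the loss followed by each compiled metric, so `score[1]` is the accuracy. In the run shown here, test accuracy is about 99.2%.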

In [25]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test accuracy:", score[1])

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9901 - loss: 0.0306
Test accuracy: 0.9916999936103821
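
To show what the two numbers returned by `evaluate` measure, here is a minimal plain-Python sketch of categorical crossentropy and accuracy on three made-up predictions over three classes. The functions and the toy data are assumptions for illustration; Keras computes the same quantities internally on tensors.

```python
import math

def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(t * log(p)); eps guards against log(0)."""
    losses = [
        -sum(t * math.log(max(p, eps)) for t, p in zip(true_row, pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    ]
    return sum(losses) / len(losses)

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Toy one-hot labels and softmax-style outputs (each row sums to 1).
toy_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
toy_probs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]

toy_loss = categorical_crossentropy(toy_labels, toy_probs)
toy_accuracy = accuracy(toy_labels, toy_probs)
print(toy_loss, toy_accuracy)

# The 313 in the evaluation log is the batch count: 10,000 test images
# at the default evaluate batch size of 32, rounded up.
eval_batches = math.ceil(10_000 / 32)
print(eval_batches)
```

The third toy sample is misclassified, so it both lowers the accuracy and contributes the largest term to the loss.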
