# MNIST Digit Classification

## About MNIST

The MNIST dataset consists of 70,000 grayscale images (28x28 pixels) of handwritten digits (0-9), split into 60,000 training images and 10,000 test images. It is a classic benchmark for training and evaluating machine learning models on image classification tasks, such as recognizing digits on checks. Often called the "hello world" of machine learning, it is used to test an algorithm's ability to learn patterns from simple visual data.

## About this Notebook
This notebook implements MNIST classification in three popular deep learning frameworks:
- TensorFlow
- PyTorch
- Keras

It establishes a baseline template that can serve as a reference for vision-based classification projects.

In [2]:
# Check GPU availability
import torch
import tensorflow as tf
import matplotlib.pyplot as plt

print("Torch CUDA:", torch.cuda.is_available())
print("TensorFlow GPU:", tf.config.list_physical_devices('GPU'))

Torch CUDA: False
TensorFlow GPU: []


## TensorFlow

TensorFlow is an open-source, end-to-end platform for machine learning (ML) and artificial intelligence (AI), originally developed by the Google Brain team. It provides a flexible and comprehensive ecosystem of tools, libraries, and community resources that allow developers and researchers to build, train, and deploy ML-powered applications across a variety of platforms.

In [3]:
import tensorflow as tf
from tensorflow.keras import layers, models

In [5]:
# Load the MNIST Dataset from Tensorflow directly
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize and Reshape the Data (standard vision practice)
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Flatten each 28x28 image into a 784-element vector so it matches the input of the model's first Dense layer
x_train = x_train.reshape(-1, 784)
x_test  = x_test.reshape(-1, 784)
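
To make the two preprocessing steps concrete, here is a minimal pure-Python sketch on a made-up 2x3 "image". The pixel values and the helper name `flatten_and_scale` are illustrative assumptions and not part of the notebook. It mirrors, for a single image, what dividing by 255 and reshaping to 784 columns does: scale each pixel to the range [0, 1], then lay the rows end to end.

```python
def flatten_and_scale(image, max_value=255.0):
    """Flatten a 2-D list of pixel rows (row-major) and scale values to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]


# Hypothetical 2x3 grayscale image; a real MNIST image is 28x28.
tiny_image = [
    [0, 51, 102],
    [153, 204, 255],
]

vector = flatten_and_scale(tiny_image)
print(len(vector))  # 2 * 3 = 6 values; for MNIST this would be 28 * 28 = 784
print(vector)       # values now lie between 0.0 and 1.0
```

The flattening discards the 2-D layout, which is acceptable for a fully connected network but would not be for a convolutional one.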

In [6]:
# Model definition using the Keras Sequential API bundled with TensorFlow

model_tf = models.Sequential([
    layers.Input(shape=(784,)),  # explicit Input layer instead of input_shape on the first Dense layer
    layers.Dense(512, activation='relu'),
    layers.Dense(256, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Labels stay as integers (0-9), so the sparse variant of categorical cross-entropy is used
model_tf.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model_tf.summary()


In [7]:
model_tf.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=32,
    validation_split=0.1
)


loss, acc = model_tf.evaluate(x_test, y_test)
print(f"Test Accuracy: {acc * 100:.2f}%")

Epoch 1/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 16s 9ms/step - accuracy: 0.8979 - loss: 0.3392 - val_accuracy: 0.9687 - val_loss: 0.1037
Epoch 2/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 22s 9ms/step - accuracy: 0.9742 - loss: 0.0831 - val_accuracy: 0.9770 - val_loss: 0.0768
Epoch 3/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 17s 10ms/step - accuracy: 0.9818 - loss: 0.0557 - val_accuracy: 0.9763 - val_loss: 0.0873
Epoch 4/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 18s 10ms/step - accuracy: 0.9874 - loss: 0.0372 - val_accuracy: 0.9765 - val_loss: 0.0919
Epoch 5/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 19s 11ms/step - accuracy: 0.9888 - loss: 0.0331 - val_accuracy: 0.9807 - val_loss: 0.0807
Epoch 6/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 17s 10ms/step - accuracy: 0.9918 - loss: 0.0242 - val_accuracy: 0.9793 - val_loss: 0.0902
Epoch 
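
The `1688/1688` counter in the log follows from the arguments passed to `fit`. As a rough check, the sketch below assumes that the validation fraction is removed from the 60,000 training samples first and that a final partial batch still counts as one step. The helper name `steps_per_epoch` is hypothetical and not a Keras function.

```python
import math


def steps_per_epoch(num_samples, batch_size, validation_split=0.0):
    """Batches per epoch after holding out a validation fraction (assumed behavior)."""
    num_val = round(num_samples * validation_split)
    num_train = num_samples - num_val
    # A trailing partial batch still counts as a step, hence the ceiling.
    return math.ceil(num_train / batch_size)


# 60,000 samples, 10% held out -> 54,000 training samples in batches of 32.
print(steps_per_epoch(60_000, 32, validation_split=0.1))  # 1688
```

The same arithmetic with `batch_size=128` gives 422, which matches the step counter in the Keras section's training log.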

## PyTorch

PyTorch is an open-source machine learning framework that is widely used for building and training deep learning models, particularly neural networks. Originally developed by Meta (formerly Facebook) AI Research, it is now part of the Linux Foundation's PyTorch Foundation.

In [8]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

In [14]:
# Define transformations and load data

transform = transforms.Compose([
    transforms.ToTensor(),                   # PIL image -> float tensor scaled to [0, 1]
    transforms.Lambda(lambda x: x.view(-1))  # flatten to a 784-element vector
])

train_dataset = datasets.MNIST(
    root="./data", train=True, download=True, transform=transform
)
test_dataset = datasets.MNIST(
    root="./data", train=False, download=True, transform=transform
)


train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32)

In [15]:
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(784, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.net(x)
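
For a sense of scale, the number of trainable parameters in this network can be worked out by hand: each fully connected layer contributes a weight matrix of size inputs x outputs plus one bias per output. The sketch below does that arithmetic for the layer sizes 784, 512, 256, 10. The helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(layer_sizes):
    """Total weights and biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus one bias per output
    return total


# 784*512 + 512 = 401,920; 512*256 + 256 = 131,328; 256*10 + 10 = 2,570
print(dense_param_count([784, 512, 256, 10]))  # 535818
```

Because the TensorFlow and Keras models in this notebook use the same layer sizes, the same count should apply to them.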

In [16]:
model_torch = MLP()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model_torch.parameters(), lr=1e-3)
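
Unlike the TensorFlow model, `MLP` ends in a plain `nn.Linear` with no softmax. That works because `nn.CrossEntropyLoss` expects raw logits and applies log-softmax internally. The following pure-Python sketch shows the per-sample computation; the function name and the example logits are invented for illustration, and it is a simplified stand-in rather than PyTorch's actual implementation.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Negative log-probability of class `target` under softmax(logits)."""
    # Subtract the largest logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical logits for a 3-class problem.
logits = [2.0, 0.5, -1.0]
loss_likely = cross_entropy_from_logits(logits, target=0)    # class with the largest logit
loss_unlikely = cross_entropy_from_logits(logits, target=2)  # class with the smallest logit
print(loss_likely, loss_unlikely)
```

The loss is small when the target class already has the largest logit and grows as the target's logit falls behind the others.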

In [17]:

# Train the model

for epoch in range(10):
    model_torch.train()
    correct = total = 0

    for x, y in train_loader:
        optimizer.zero_grad()
        out = model_torch(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()

        # Track running training accuracy for this epoch
        pred = out.argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)

    print(f"Epoch {epoch+1}: Train Acc = {100*correct/total:.2f}%")

Epoch 1: Train Acc = 93.67%
Epoch 2: Train Acc = 97.53%
Epoch 3: Train Acc = 98.16%
Epoch 4: Train Acc = 98.69%
Epoch 5: Train Acc = 98.93%
Epoch 6: Train Acc = 99.11%
Epoch 7: Train Acc = 99.19%
Epoch 8: Train Acc = 99.39%
Epoch 9: Train Acc = 99.35%
Epoch 10: Train Acc = 99.41%


In [19]:
# Test
model_torch.eval()
correct = total = 0

with torch.no_grad():
    for x, y in test_loader:
        out = model_torch(x)
        pred = out.argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)

print(f"Test Accuracy: {100 * correct / total:.2f}%")

Test Accuracy: 97.77%


## Keras

Keras is a high-level Python API for building and experimenting with deep learning models. It acts as an interface to backend engines such as TensorFlow, JAX, and PyTorch, letting developers define neural networks with less code and iterate faster. Its pre-built components (layers, optimizers, and loss functions) make deep learning more accessible to beginners and experts alike.

In [20]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

In [21]:
# Load and Preprocess Data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test  = x_test.reshape(-1, 784).astype("float32") / 255.0

y_train = to_categorical(y_train, 10)
y_test  = to_categorical(y_test, 10)
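
`to_categorical` turns each integer label into a one-hot vector, which is the label format `categorical_crossentropy` expects. The TensorFlow section kept integer labels and used `sparse_categorical_crossentropy` instead. As a minimal sketch of the encoding for a single label (the helper `one_hot` is hypothetical and only stands in for the idea behind `to_categorical`):

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Both label formats describe the same target; the sparse form simply avoids materializing ten values per sample.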

In [22]:
# Model Definition
from keras.layers import Input

mlp_keras = Sequential([
    Input(shape=(784,)),  # explicit Input layer instead of input_shape on the first Dense layer
    Dense(512, activation="relu"),
    Dense(256, activation="relu"),
    Dense(10, activation="softmax")
])

# Labels are one-hot encoded, so the non-sparse categorical cross-entropy is used
mlp_keras.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)

mlp_keras.summary()


In [23]:
# Train
mlp_keras.fit(
    x_train, y_train,
    epochs=10,
    batch_size=128,
    validation_split=0.1
)

Epoch 1/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 7s 14ms/step - accuracy: 0.8749 - loss: 0.4415 - val_accuracy: 0.9710 - val_loss: 0.1046
Epoch 2/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 9s 12ms/step - accuracy: 0.9724 - loss: 0.0930 - val_accuracy: 0.9748 - val_loss: 0.0831
Epoch 3/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 7s 16ms/step - accuracy: 0.9835 - loss: 0.0544 - val_accuracy: 0.9730 - val_loss: 0.0894
Epoch 4/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 6s 13ms/step - accuracy: 0.9882 - loss: 0.0376 - val_accuracy: 0.9817 - val_loss: 0.0629
Epoch 5/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 10s 14ms/step - accuracy: 0.9927 - loss: 0.0237 - val_accuracy: 0.9793 - val_loss: 0.0758
Epoch 6/10
422/422 ━━━━━━━━━━━━━━━━━━━━ 5s 12ms/step - accuracy: 0.9936 - loss: 0.0209 - val_accuracy: 0.9798 - val_loss: 0.0778
Epoch 7/10
422/42

<keras.src.callbacks.history.History at 0x7d8e4c0fc230>

In [24]:
# Evaluate
loss, acc = mlp_keras.evaluate(x_test, y_test)
print(f"Test Accuracy: {acc:.4f}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9725 - loss: 0.1251
Test Accuracy: 0.9771
