Simple NN for handwritten digit recognition on the MNIST dataset

In [9]:
import torch
import os
import torch.nn as nn 
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

In [6]:
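# Define transforms: convert to a tensor in [0, 1], then rescale to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))   # one-element tuples: MNIST has a single channel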
])

In [10]:
dataset_root = os.path.expanduser('~/MNIST_data')
#load Mnist dataset
train_dataset = datasets.MNIST(root=dataset_root, train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root=dataset_root, train=False, download=True, transform=transform)


Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:42<00:00, 232993.89it/s]


Extracting /Users/aravindryali/MNIST_data/MNIST/raw/train-images-idx3-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 53466.22it/s]


Extracting /Users/aravindryali/MNIST_data/MNIST/raw/train-labels-idx1-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1648877/1648877 [00:05<00:00, 294329.80it/s]


Extracting /Users/aravindryali/MNIST_data/MNIST/raw/t10k-images-idx3-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 370986.52it/s]

Extracting /Users/aravindryali/MNIST_data/MNIST/raw/t10k-labels-idx1-ubyte.gz to /Users/aravindryali/MNIST_data/MNIST/raw






In [13]:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)   # order does not affect accuracy

In [23]:
#define the NN model
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28*28,128)
        self.fc2 = nn.Linear(128,64)
        self.fc3 = nn.Linear(64,10)
    
    def forward(self,x):
        x = x.view(-1,28*28)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
    
Model = SimpleNN()

In [24]:
#Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(Model.parameters(), lr=0.001)

In [25]:
#train the model
num_epochs = 5
for epoch in range(num_epochs):
    Model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()       #reset gradients to 0
        output = Model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
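        # sum over batches; '=' here keeps only the last batch, which gives the near-zero losses logged below
        running_loss += loss.item() * images.size(0)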
    epoch_loss = running_loss / len(train_loader.dataset)
    print(f'epoch [{epoch+1}/{num_epochs}], loss:{epoch_loss:.4f}')

epoch [1/5], loss:0.0000
epoch [2/5], loss:0.0000
epoch [3/5], loss:0.0001
epoch [4/5], loss:0.0000
epoch [5/5], loss:0.0000


In [26]:
#evaluate the model
Model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = Model(images)
        _, predicted = torch.max(outputs, 1)   # index of the highest logit per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the model on the test images: {100 * correct / total:.2f}%')

Accuracy of the model on the test images: 96.26%


What did we do exactly?

This is a small fully connected neural network that classifies handwritten digits from the MNIST dataset.

MNIST dataset: Consists of 60,000 training images and 10,000 test images of handwritten digits (0-9), each a 28x28 pixel grayscale image.
Transformations: The images are converted to tensors with pixel values in [0, 1], then normalized with a mean of 0.5 and a standard deviation of 0.5. This computes (x - 0.5) / 0.5, so the pixel values end up in [-1, 1]. Centering the inputs around zero tends to make training more stable.
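
The effect of the normalization step can be checked on a few values. This is a minimal sketch that applies the same arithmetic by hand; the pixel values are made up for illustration and torchvision is not needed.

```python
import torch

# Hypothetical pixel values after ToTensor(), which scales uint8 [0, 255] to float [0.0, 1.0].
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

mean, std = 0.5, 0.5
# Normalize with mean 0.5 and std 0.5 computes (x - mean) / std for each pixel.
normalized = (pixels - mean) / std

print(normalized.tolist())  # [-1.0, -0.5, 0.0, 1.0]
```

Black (0.0) maps to -1, mid-gray (0.5) to 0, and white (1.0) to 1.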

Input Layer: Takes the flattened 28x28 pixel image (a vector of 784 elements).
Hidden Layers: Two fully connected layers with 128 and 64 neurons, each followed by the ReLU (Rectified Linear Unit) activation function to introduce non-linearity.
Output Layer: Contains 10 neurons, one for each digit class (0-9), and outputs a raw score (logit) for each class.
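
To make the layer sizes concrete, the sketch below rebuilds the same 784-128-64-10 stack with nn.Sequential (a shorthand chosen here, not the class used in the notebook) and passes a random stand-in batch through it. It also counts the trainable parameters.

```python
import torch
import torch.nn as nn

# Same layer sizes as SimpleNN, written with nn.Sequential for brevity.
model = nn.Sequential(
    nn.Flatten(),             # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(28 * 28, 128),  # first hidden layer
    nn.ReLU(),
    nn.Linear(128, 64),       # second hidden layer
    nn.ReLU(),
    nn.Linear(64, 10),        # output layer: raw logits, no softmax
)

batch = torch.randn(32, 1, 28, 28)   # random stand-in for a batch of MNIST images
logits = model(batch)
num_params = sum(p.numel() for p in model.parameters())

print(tuple(logits.shape))  # (32, 10)
print(num_params)           # 109386
```

The parameter count is (784*128 + 128) + (128*64 + 64) + (64*10 + 10) = 109,386 weights and biases.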

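Loss Function: Uses Cross-Entropy Loss, which is suitable for multi-class classification tasks. It measures how far the predicted probability distribution is from the true class. nn.CrossEntropyLoss applies log-softmax to the logits internally and takes the labels as integer class indices, so the model needs no softmax layer and the labels need no one-hot encoding.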
Optimizer: Uses the Adam optimizer with a learning rate of 0.001. Adam is an adaptive optimization algorithm: it scales the step size for each parameter individually, using running estimates of the first and second moments of that parameter's gradient.
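
What the loss computes can be reproduced by hand. The sketch below compares nn.CrossEntropyLoss with a manual log-softmax followed by picking the log-probability of the true class; the logits and labels are made-up values, not outputs of the trained model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, with integer class labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

# Built-in loss: takes raw logits and class indices directly.
builtin = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax, then the negative log-probability of the true class, averaged.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(torch.allclose(builtin, manual))  # True
```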

Training Loop: Iterates over the training set for 5 epochs, in batches of 32 images.
For each batch, it resets the gradients left over from the previous step and then performs:
    Forward pass: Computes the model's predictions and the loss.
    Backward pass: Computes the gradients of the loss with respect to the model's parameters.
    Optimization step: Updates the model's parameters using the gradients.
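
The steps above can be sketched as a single update on one batch. This is an illustration only: it uses a one-layer stand-in model and random data rather than SimpleNN and MNIST, so the loss value itself is meaningless.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(784, 10)                 # tiny stand-in model, not SimpleNN
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

images = torch.randn(32, 784)              # random stand-in batch of flattened images
labels = torch.randint(0, 10, (32,))       # random stand-in labels

weights_before = model.weight.detach().clone()

optimizer.zero_grad()                      # clear gradients from any previous step
loss = criterion(model(images), labels)    # forward pass
loss.backward()                            # backward pass: fills .grad on each parameter
optimizer.step()                           # optimization step: updates the parameters

weights_changed = not torch.equal(weights_before, model.weight)
print(weights_changed)  # True
```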
    
Accuracy Measurement: After training, the model is evaluated on the test dataset. For each image, the predicted class is the one with the highest logit. Accuracy is the number of correct predictions divided by the total number of predictions. Here the model reaches 96.26% on the 10,000 test images.
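
The accuracy calculation can be traced on a toy example. The logits and labels below are made up so the result is easy to check by eye; argmax is used here and is equivalent to taking the index returned by torch.max.

```python
import torch

# Made-up logits for 4 samples and 3 classes.
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.3, 0.2, 0.9],
                        [1.2, 0.4, 0.1]])
labels = torch.tensor([0, 1, 1, 0])

predicted = outputs.argmax(dim=1)             # index of the largest logit per row
correct = (predicted == labels).sum().item()  # number of matching predictions
accuracy = 100 * correct / labels.size(0)

print(predicted.tolist())  # [0, 1, 2, 0]
print(accuracy)            # 75.0
```

Three of the four predictions match their labels, so the accuracy is 75%.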



