<a href="https://colab.research.google.com/github/mahausmani/deep_learning/blob/main/digit-recognition/mlp_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Imports

In [1]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

import numpy as np
import matplotlib.pyplot as plt
import os

print('Using PyTorch version:', torch.__version__)
if torch.cuda.is_available():
    print('Using GPU, device name:', torch.cuda.get_device_name(0))
    device = torch.device('cuda')
else:
    print('No GPU found, using CPU instead.')
    device = torch.device('cpu')

Using PyTorch version: 2.2.1+cu121
No GPU found, using CPU instead.


## Constants

In [2]:
batch_size = 32

data_dir = "/content/train"

## Loading data


PyTorch provides two classes in torch.utils.data for working with data:

Dataset, which represents the actual data items, such as images or pieces of text, together with their labels.

DataLoader, which wraps a dataset and serves it in batches efficiently.

With shuffle = True, the training DataLoader reshuffles the data every epoch, so each pass over it yields the batches in a new random order. In the for loop below, each iteration loads a data tensor of size [32, 1, 28, 28] (batch, channels, height, width) and a matching target tensor of size [32].

In [38]:
train_dataset = datasets.MNIST(data_dir, train = True, download = True, transform = ToTensor())
test_dataset = datasets.MNIST(data_dir, train = False, download = True, transform = ToTensor())

train_dataloader = DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)
test_dataloader = DataLoader(dataset = test_dataset, batch_size = batch_size, shuffle = False)

In [10]:
for data, target in train_dataloader:
    print(target.shape, data.shape)
    break

torch.Size([32]) torch.Size([32, 1, 28, 28])
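To make the Dataset/DataLoader split concrete without downloading MNIST, here is a minimal sketch. `RandomDigits` is a hypothetical dataset of random tensors shaped like MNIST images; it is not part of torchvision, and the item count of 100 is an arbitrary assumption.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomDigits(Dataset):
    """Hypothetical stand-in for MNIST: random 1x28x28 images with random labels."""
    def __init__(self, n_items=100):
        gen = torch.Generator().manual_seed(0)
        self.images = torch.rand(n_items, 1, 28, 28, generator=gen)
        self.labels = torch.randint(0, 10, (n_items,), generator=gen)

    def __len__(self):
        # The DataLoader uses this to know how many items exist
        return len(self.labels)

    def __getitem__(self, idx):
        # One (image, label) pair; the DataLoader stacks these into batches
        return self.images[idx], self.labels[idx]

loader = DataLoader(RandomDigits(), batch_size=32, shuffle=True)
batch_shapes = [tuple(data.shape) for data, target in loader]
print(len(loader), batch_shapes[0], batch_shapes[-1])
```

With 100 items and a batch size of 32, the loader yields three full batches and a final batch of 4, since drop_last defaults to False.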


## Example

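The cell below shows how to use nn.Flatten().
Its optional parameters, start_dim and end_dim, give the first and last dimensions to flatten. The defaults (start_dim=1, end_dim=-1) keep the batch dimension and merge all the others.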

In [11]:
# With default parameters: keep dim 0, merge dims 1..3
x = torch.randn(32, 1, 5, 5)  # named x to avoid shadowing the built-in input()
m = nn.Flatten()
output = m(x)
print(output.size())  # torch.Size([32, 25])

# With non-default parameters: merge dims 0..2, keep the last dim
m = nn.Flatten(0, 2)
output = m(x)
print(output.size())  # torch.Size([160, 5])

torch.Size([32, 25])
torch.Size([160, 5])
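For comparison, the default nn.Flatten() behavior can also be written with the functional torch.flatten or with Tensor.view. A small sketch, assuming the same input shape as above:

```python
import torch
import torch.nn as nn

x = torch.randn(32, 1, 5, 5)

flat_module = nn.Flatten()(x)                   # module form, usable inside nn.Sequential
flat_function = torch.flatten(x, start_dim=1)   # functional form
flat_view = x.view(x.size(0), -1)               # manual reshape: keep batch, merge the rest

same = torch.equal(flat_module, flat_function) and torch.equal(flat_module, flat_view)
print(flat_module.shape, same)
```

The module form is convenient here because it can sit as the first layer of the model.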


## Model

In [33]:
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )
    def forward(self, x):
        output = self.layers(x)
        return output

In [34]:
model = MLP().to(device)
model

MLP(
  (layers): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=784, out_features=512, bias=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=10, bias=True)
  )
)
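As a quick sanity check on the architecture printed above, the number of trainable parameters can be counted and compared with the value worked out by hand. This sketch redefines the same MLP so that it runs on its own:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.layers(x)

model = MLP()
n_params = sum(p.numel() for p in model.parameters())
# weights + biases of each Linear layer
expected = (784 * 512 + 512) + (512 * 10 + 10)
logits = model(torch.zeros(32, 1, 28, 28))  # dummy batch shaped like MNIST
print(n_params, expected, logits.shape)
```

Both counts come to 407050, and a batch of 32 images produces a [32, 10] tensor of class scores.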

## Train

In [19]:
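criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
# Build the optimizer after the model: it stores references to this model's parameters,
# so re-creating the model means re-running this cell as well.
optimizer = torch.optim.Adam(model.parameters())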

In [35]:
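def train(data_loader, model, criterion, optimizer):
    model.train()  # training-mode behavior (a no-op for this MLP, but idiomatic)
    total_loss = 0.0
    for data, target in data_loader:
        data = data.to(device)
        target = target.to(device)
        optimizer.zero_grad()             # clear gradients from the previous step
        output = model(data)              # call the module, not .forward(), so hooks run
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()         # accumulate a float, not a tensor tied to the graph
    return total_loss / len(data_loader)  # mean loss per batch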

In [37]:
for i in range(20):
    loss = train(train_dataloader, model, criterion, optimizer)
    print(f"Epoch--> {i}    Loss--> {loss}")

Epoch--> 0    Loss--> 2.3071506023406982
Epoch--> 1    Loss--> 2.307149887084961
Epoch--> 2    Loss--> 2.3071494102478027
Epoch--> 3    Loss--> 2.3071482181549072
Epoch--> 4    Loss--> 2.3071484565734863
Epoch--> 5    Loss--> 2.307147741317749
Epoch--> 6    Loss--> 2.3071482181549072
Epoch--> 7    Loss--> 2.3071484565734863
Epoch--> 8    Loss--> 2.3071467876434326
Epoch--> 9    Loss--> 2.307149887084961
Epoch--> 10    Loss--> 2.3071510791778564
Epoch--> 11    Loss--> 2.307149887084961
Epoch--> 12    Loss--> 2.3071513175964355
Epoch--> 13    Loss--> 2.307149648666382
Epoch--> 14    Loss--> 2.307147741317749
Epoch--> 15    Loss--> 2.3071467876434326
Epoch--> 16    Loss--> 2.3071513175964355
Epoch--> 17    Loss--> 2.3071491718292236
Epoch--> 18    Loss--> 2.3071506023406982
Epoch--> 19    Loss--> 2.307147741317749
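A loss that stays near 2.307 for 20 epochs is close to ln(10) ≈ 2.3026, the cross-entropy of a uniform guess over 10 classes, which suggests the weights are not being updated. One possible cause, consistent with the cell numbers above (the optimizer cell is In [19], the model cell is In [34]), is that the optimizer was built from an earlier model instance. The sketch below reproduces that situation with the hypothetical names `old_model` and `new_model` and a single linear layer; it illustrates the pitfall and is not a confirmed diagnosis of this run.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
criterion = nn.CrossEntropyLoss()

# Equal logits over 10 classes give a loss of ln(10)
uniform_loss = criterion(torch.zeros(4, 10), torch.tensor([0, 1, 2, 3])).item()
print(round(uniform_loss, 4), round(math.log(10), 4))

old_model = nn.Linear(784, 10)   # hypothetical earlier instance
new_model = nn.Linear(784, 10)   # the instance that is actually being trained
stale_optimizer = torch.optim.Adam(old_model.parameters())

before = new_model.weight.detach().clone()
data, target = torch.rand(32, 784), torch.randint(0, 10, (32,))
loss = criterion(new_model(data), target)
loss.backward()

# The stale optimizer only knows old_model, whose gradients are None, so nothing moves
stale_optimizer.step()
unchanged = torch.equal(before, new_model.weight)
print(unchanged)

# An optimizer built from the current model applies the gradients computed above
fresh_optimizer = torch.optim.Adam(new_model.parameters())
fresh_optimizer.step()
changed = not torch.equal(before, new_model.weight)
print(changed)
```

If this is the cause, re-running the optimizer cell after creating the model should let the loss decrease.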


## Test

In [40]:
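def test(test_dataloader, model, criterion, optimizer):
    # optimizer is unused here; the parameter is kept so existing calls still work
    model.eval()  # inference-mode behavior (a no-op for this MLP, but idiomatic)
    total_loss = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for data, target in test_dataloader:
            data = data.to(device)
            target = target.to(device)
            output = model(data)
            loss = criterion(output, target)
            total_loss += loss
    avg_loss = total_loss / len(test_dataloader)  # mean loss per batch, as a 0-dim tensor
    print(f"Average loss: {avg_loss:>7f}")
    return avg_loss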


In [41]:
test(test_dataloader, model, criterion, optimizer)

Average loss: 2.306066


tensor(2.3061)
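Average loss on its own is hard to interpret, so accuracy is often reported alongside it. A minimal sketch, using a hypothetical helper `accuracy` and hand-written logits rather than the notebook's model:

```python
import torch

def accuracy(logits, targets):
    """Fraction of rows whose highest-scoring class equals the target."""
    predictions = logits.argmax(dim=1)
    return (predictions == targets).float().mean().item()

# Four made-up examples over three classes
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 0.1, 3.0],
                       [0.5, 1.5, 0.1],
                       [1.0, 0.9, 0.8]])
targets = torch.tensor([0, 2, 0, 0])
acc = accuracy(logits, targets)
print(acc)  # 3 of 4 rows are correct
```

Inside an evaluation loop, the same comparison could be accumulated per batch and divided by the dataset size.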

## Predict

In [64]:
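def predict(image, label, model):
    model.eval()
    with torch.no_grad():
        logits = model(image.to(device))
    # argmax over the 10 class scores; .item() also works when the tensor is on the GPU
    print(logits.argmax(dim=1).item(), label)

# First training example; the label is a plain Python int
image, label = train_dataloader.dataset[0]
predict(image.reshape(1, 784), label, model)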

4 5
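The model outputs raw scores (logits), not probabilities. If a confidence value is wanted next to the predicted class, softmax can be applied first. A small sketch with a made-up logits tensor standing in for the model output:

```python
import torch

logits = torch.tensor([[1.0, 3.0, 0.5]])   # hypothetical scores for one image, three classes
probs = torch.softmax(logits, dim=1)        # each row now sums to 1
predicted = probs.argmax(dim=1).item()
confidence = probs[0, predicted].item()
print(predicted, round(probs.sum().item(), 4))
```

Softmax preserves the ordering of the scores, so the predicted class is the same with or without it.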
