---
title: "Neural Networks in PyTorch | Log #004"
description: "Let's build a simple NN in PyTorch"
date: 2025-01-04
author: "Hemanth KR"
categories: [Vision, PyTorch]
image: "thumbnails/004.jpg"
---

## Introduction

So far, we have looked at what an image is and at the operations used to extract information from it. Now it's time to dive into neural networks. This blog introduces a lot of new concepts, and I hope to write separate blogs explaining some of them in depth.

PyTorch is one of the most popular deep learning frameworks and provides the core functionality required for building a neural network. I find it highly intuitive, and there's always scope to dig deeper if you want to better understand something by writing it from scratch.

### Data

PyTorch has domain-specific libraries such as `torchtext`, `torchvision`, and `torchaudio`, which provide very useful building blocks along with some of the popular datasets.

For this example, we will do a simple image classification task with the FashionMNIST dataset. The dataset consists of 28x28 grayscale images, each associated with a single label from 10 classes. There are 60k training examples and 10k test examples.

In [None]:
import torch
from torch import nn
from torchvision import datasets
from torchvision.transforms import ToTensor
from torch.utils.data import DataLoader

training_data = datasets.FashionMNIST(root="data", download=True, train=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", download=True, train=False, transform=ToTensor())

The above code downloads the data into the root folder and applies the transformation (`ToTensor` converts each image into a tensor). This gives us a `Dataset` object that stores the samples and their corresponding labels. To feed the data into the model batch by batch, we use the `DataLoader` class. It wraps an iterable over the dataset and supports multiprocess data loading and automatic batching.

In [None]:
batch_size = 64

train_dataloader = DataLoader(dataset=training_data, batch_size=batch_size)
test_dataloader = DataLoader(dataset=test_data, batch_size=batch_size)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"------------- Using {device} ------------")

We have set the batch size to 64, so each iteration over a dataloader returns a batch of 64 images (the features) and their 64 labels.
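To see what a batch looks like without downloading anything, here is a minimal sketch that uses random tensors as a stand-in for FashionMNIST. The names `fake_dataset` and `loader` and the dataset size of 256 are assumptions made for illustration, not part of the example above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (assumption): 256 random "images" shaped like FashionMNIST samples
images = torch.rand(256, 1, 28, 28)      # [N, C, H, W]
labels = torch.randint(0, 10, (256,))    # one integer class label per image
fake_dataset = TensorDataset(images, labels)

loader = DataLoader(fake_dataset, batch_size=64)

# Pull the first batch from the dataloader
X, y = next(iter(loader))
print(f"Shape of X [N, C, H, W]: {tuple(X.shape)}")
print(f"Shape of y: {tuple(y.shape)}, dtype: {y.dtype}")
print(f"Number of batches: {len(loader)}")
```

With 256 samples and a batch size of 64, the loader yields four batches. The real FashionMNIST dataloaders behave the same way, except that the last batch can be smaller when the dataset size is not a multiple of the batch size.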

### Model

Now let's create a simple neural network. Every model in PyTorch is a subclass of `nn.Module`. We define the layers in the `__init__` method and the forward pass in the `forward` method, which PyTorch runs for us whenever we call the model on an input, as in `model(x)`.

In [10]:
class NeuralNetwork(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )
    
    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
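To make the printed architecture more concrete, the sketch below traces the tensor shapes through the same layers and counts the learnable parameters. It rebuilds the layers as standalone objects (`flatten`, `stack`) and feeds them a random batch, both of which are assumptions for illustration.

```python
import torch
from torch import nn

# Same layers as in the model above, rebuilt standalone for illustration
flatten = nn.Flatten()
stack = nn.Sequential(
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

batch = torch.rand(64, 1, 28, 28)   # random stand-in for a batch of images
flat = flatten(batch)               # each 1x28x28 image becomes a 784-vector
logits = stack(flat)                # one raw score per class

# Weights plus biases of the three linear layers
num_params = sum(p.numel() for p in stack.parameters())

print(f"flattened: {tuple(flat.shape)}")
print(f"logits:    {tuple(logits.shape)}")
print(f"learnable parameters: {num_params}")
```

The parameter count follows directly from the layer sizes: (784 × 512 + 512) + (512 × 512 + 512) + (512 × 10 + 10) = 669,706.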


Now that we have a model, let's define a loss function and an optimizer which will help us train the model. 

A loss function measures how far the model's predicted output is from the actual target value. This loss is what we minimize during training.
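As a rough illustration of what such a loss computes, here is a sketch of cross-entropy (the loss used for this classification task) worked out by hand and compared with `nn.CrossEntropyLoss`. The logits and targets are made-up values for a hypothetical 3-class problem.

```python
import torch
from torch import nn

# Made-up raw scores (logits) for 2 samples and 3 hypothetical classes
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])   # correct class index for each sample

# Cross-entropy by hand: negative log-probability of the correct class,
# averaged over the batch
log_probs = torch.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(len(targets)), targets].mean()

# The built-in loss takes raw logits and applies log-softmax internally
builtin_loss = nn.CrossEntropyLoss()(logits, targets)

print(f"manual:   {manual_loss.item():.6f}")
print(f"built-in: {builtin_loss.item():.6f}")
```

The two values agree, which also shows why the model can return raw logits: the softmax is folded into the loss.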

An optimizer is the algorithm that reduces the model's error step by step during training by adjusting the model parameters, and there are many such algorithms. This is how the model learns to perform better. In the training loop, backpropagation computes the gradient of the loss with respect to each model parameter, and the optimizer then uses those gradients to update the parameters. We will look into these algorithms in depth separately.
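To show what a single optimizer step does, the sketch below applies one step of plain stochastic gradient descent (no momentum) by hand and compares it with `torch.optim.SGD`. The toy loss and the names `w`, `w_manual`, and `toy_loss` are hypothetical and chosen only for illustration.

```python
import torch

torch.manual_seed(0)
lr = 1e-3

# Two identical copies of a small parameter vector
w = torch.randn(3, requires_grad=True)
w_manual = w.detach().clone().requires_grad_(True)

def toy_loss(weights):
    # Hypothetical loss: squared distance of the weights from 1.0
    return ((weights - 1.0) ** 2).sum()

# One step with the built-in optimizer
optimizer = torch.optim.SGD([w], lr=lr)
optimizer.zero_grad()        # clear any old gradients
toy_loss(w).backward()       # backpropagation fills w.grad
optimizer.step()             # update w using w.grad

# The same step written out by hand: w <- w - lr * grad
toy_loss(w_manual).backward()
with torch.no_grad():
    w_manual -= lr * w_manual.grad

print(torch.allclose(w, w_manual))
```

Other optimizers keep the same `zero_grad` / `backward` / `step` pattern and differ only in how they turn the gradients into an update.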

In [11]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

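def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()  # put the model in training mode
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Forward pass: compute predictions and the loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backward pass: clear old gradients, backpropagate, update parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Report progress every 100 batches (this check belongs inside the loop)
        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"Training: \n loss: {loss:>7f} [{current:>5d}/{size:>5d}]")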


def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()  # accumulate; averaged over batches below
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

We train the model on the training data, then pass the test data through it to calculate the loss and accuracy. Each full pass through the training data is called an epoch.

In [None]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Now that we have a trained model, we can save it and load it again later for deployment. We can also run inference to see how the model performs on new data.

In [None]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch model state to model.pth")

In [None]:
model = NeuralNetwork().to(device)
# weights_only is an argument of torch.load, not of load_state_dict
model.load_state_dict(torch.load("model.pth", weights_only=True))

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
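The model returns raw logits, so if we also want a confidence value for the prediction, one option is to pass the logits through a softmax. The sketch below uses random logits as a stand-in for `model(x)`, which is an assumption for illustration, so the printed class is arbitrary.

```python
import torch

classes = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

torch.manual_seed(0)
logits = torch.randn(1, 10)   # stand-in for model(x): one sample, 10 class scores

# Softmax turns the logits into probabilities that sum to 1
probs = torch.softmax(logits, dim=1)

# Highest probability and its class index for the single sample
top_prob, top_idx = probs[0].max(dim=0)
print(f'Predicted: "{classes[top_idx.item()]}" with probability {top_prob.item():.1%}')
```

Softmax does not change which class has the highest score, so the predicted label is the same as taking `argmax` of the logits directly.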

We introduced a lot of concepts in this blog. The aim was to see how a model is created and trained, and how it can then be put to use. We will discuss some of the main aspects of the architecture in future blogs and hopefully do a PyTorch series to learn how everything works in the background. That's it for now.