# Building Neural Networks with PyTorch

## Introduction to PyTorch and Its Core Components

### What is PyTorch?
PyTorch is an open-source deep learning framework originally developed by Facebook's AI Research lab (now Meta AI). It builds computation graphs dynamically as the code runs, so models can be written and debugged with ordinary Python control flow. PyTorch is widely used in both research and production because of its intuitive interface and strong GPU acceleration.

### Core Components of PyTorch

- **Tensors**:  
    Multi-dimensional arrays similar to NumPy arrays, but with additional capabilities such as GPU acceleration for faster computation. Tensors are the fundamental building blocks for all computations in PyTorch.

- **Autograd**:  
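    PyTorch's automatic differentiation engine. It records the operations performed on tensors created with `requires_grad=True` and, when `backward()` is called, computes the gradients of a scalar result with respect to those tensors. This is essential for training neural networks using backpropagation.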

- **torch.nn Module**:  
    Provides a suite of tools to define and train neural networks, including layers (e.g., `Linear`, `Conv2d`), activation functions (e.g., `ReLU`, `Sigmoid`), and loss functions (e.g., `CrossEntropyLoss`, `MSELoss`).

- **Optimizers**:  
    PyTorch offers various optimization algorithms (e.g., SGD, Adam) in the `torch.optim` module to update model parameters based on computed gradients.
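
The components above can be seen working together in a few lines. This is a minimal sketch, not taken from the notebook below: the tensor values, the function `y = sum(w^2)`, and the learning rate are arbitrary choices made for illustration.

```python
import torch

# Tensors: a 2x3 array of floats, similar to a NumPy array.
x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(x.shape)  # torch.Size([2, 3])

# Move the tensor to a GPU only if one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = x.to(device)

# Autograd: track operations on w, then differentiate y = sum(w^2).
w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
y = (w ** 2).sum()
y.backward()
print(w.grad)  # dy/dw = 2w, i.e. the values 2, -4, 6

# Optimizer: one plain SGD step, w <- w - lr * grad.
optimizer = torch.optim.SGD([w], lr=0.1)
optimizer.step()
print(w)  # the values 0.8, -1.6, 2.4
```

The same pattern (compute a scalar, call `backward()`, call `optimizer.step()`) is what a training loop repeats for every batch.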

---

## Building a Neural Network in PyTorch

### Steps:

1. **Define the Model**  
     Subclass `torch.nn.Module`, declaring the layers in `__init__` and the forward propagation logic in `forward`.

2. **Define the Loss Function**  
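     Choose a loss function suited to the task (e.g., cross-entropy loss for classification, mean squared error for regression) to measure the difference between predictions and true values. `CrossEntropyLoss` expects raw, unnormalized scores (logits), so the model should not apply softmax to its output.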

3. **Define the Optimizer**  
     Select an optimizer (e.g., Adam, SGD) from `torch.optim` to update the model's weights during training.
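
The three steps above might look like the following on a toy problem. This is a sketch under stated assumptions: `TinyClassifier`, its layer sizes, and the random inputs are hypothetical and chosen only to keep the example small.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, in_features=4, hidden=8, num_classes=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # Return raw logits; CrossEntropyLoss applies log-softmax internally.
        return self.fc2(torch.relu(self.fc1(x)))


torch.manual_seed(0)  # make the random weights and inputs reproducible

# Step 1: the model
model = TinyClassifier()
# Step 2: the loss function
criterion = nn.CrossEntropyLoss()
# Step 3: the optimizer, which is given the parameters it should update
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# A batch of 5 random samples with 4 features each, plus integer class labels.
inputs = torch.randn(5, 4)
targets = torch.tensor([0, 2, 1, 1, 0])

logits = model(inputs)
loss = criterion(logits, targets)
print(logits.shape)  # torch.Size([5, 3]): one score per class for each sample
print(loss.item())   # a single positive number
```

Passing `model.parameters()` to the optimizer is what ties steps 1 and 3 together: only the tensors handed over here are updated by `optimizer.step()`.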

---

## Training, Evaluating, and Saving a Model in PyTorch

- **Training**:  
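    - Zero the gradients left over from the previous step with `optimizer.zero_grad()`, since PyTorch accumulates gradients by default.
    - Perform a forward pass to compute predictions.
    - Calculate the loss using the chosen loss function.
    - Perform a backward pass with `loss.backward()` to compute gradients using autograd.
    - Update model weights with `optimizer.step()`.
    - Repeat for every batch, over multiple epochs of the dataset.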

- **Evaluation**:  
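    - Switch the model to evaluation mode with `model.eval()` and disable gradient tracking with `torch.no_grad()`.
    - Test the trained model on unseen data (validation or test set).
    - Calculate evaluation metrics such as accuracy, precision, recall, or F1-score to assess model performance.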

- **Saving and Loading Models**:  
    - Save the model's parameters using `torch.save(model.state_dict(), PATH)`.
    - Create a model with the same architecture, then load the saved parameters into it using `model.load_state_dict(torch.load(PATH))`.
    - This allows you to reuse trained models without retraining.
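
The evaluation metrics named above can be derived from a confusion matrix. The following is a minimal sketch in plain PyTorch; the helper `per_class_metrics` and the six example labels are hypothetical, and libraries such as scikit-learn provide tested equivalents.

```python
import torch


def per_class_metrics(predicted, labels, num_classes):
    """Return per-class precision, recall and F1 as three 1-D tensors."""
    # confusion[i, j] counts samples whose true class is i and predicted class is j.
    confusion = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for true_class, predicted_class in zip(labels.tolist(), predicted.tolist()):
        confusion[true_class, predicted_class] += 1

    true_positives = confusion.diag().float()
    predicted_totals = confusion.sum(dim=0).float()  # column sums: TP + FP
    actual_totals = confusion.sum(dim=1).float()     # row sums: TP + FN

    # clamp() guards against division by zero for classes that never occur.
    precision = true_positives / predicted_totals.clamp(min=1)
    recall = true_positives / actual_totals.clamp(min=1)
    f1 = 2 * precision * recall / (precision + recall).clamp(min=1e-12)
    return precision, recall, f1


# Made-up labels for a 3-class problem.
labels = torch.tensor([0, 0, 1, 1, 2, 2])
predicted = torch.tensor([0, 1, 1, 1, 2, 0])

precision, recall, f1 = per_class_metrics(predicted, labels, num_classes=3)
accuracy = (predicted == labels).float().mean().item()

print(accuracy)   # 4 of 6 correct
print(precision)  # per class: 1/2, 2/3, 1
print(recall)     # per class: 1/2, 1, 1/2
print(f1)         # per class: 0.5, 0.8, 2/3
```

Accuracy alone can hide weak classes: here class 2 has perfect precision but only half of its samples are recovered.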

---

PyTorch's flexibility, ease of use, and strong community support make it a popular choice for both beginners and advanced practitioners in deep learning.


In [13]:
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F

Define the transformation, load the datasets, and create the data loaders

In [14]:
transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
)

# load datasets
train_dataset = datasets.MNIST(
    root="./data", train=True, transform=transform, download=True
)
test_dataset = datasets.MNIST(
    root="./data", train=False, transform=transform, download=True
)

# create data loaders
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print(f"Training Data Size: {len(train_dataset)}")
print(f"Test Data Size: {len(test_dataset)}")   

Training Data Size: 60000
Test Data Size: 10000
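
`ToTensor()` scales pixel values to the range [0, 1], and `Normalize((0.5,), (0.5,))` then applies `(x - mean) / std` per channel, which maps that range to [-1, 1]. The sketch below reproduces the arithmetic in plain PyTorch on a made-up 2x2 "image" rather than calling torchvision.

```python
import torch

# Stand-in for the output of ToTensor(): one channel, 2x2 pixels, values in [0, 1].
pixels = torch.tensor([[[0.0, 0.25], [0.5, 1.0]]])

mean, std = 0.5, 0.5
normalized = (pixels - mean) / std

print(normalized)  # the values -1.0, -0.5, 0.0, 1.0
```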


Define the model, then train, evaluate, save, and reload it

In [None]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


model = NeuralNetwork()
print(model)

# define loss function and optimiser
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)


# training loop
def train_model(model, train_loader, criterion, optimizer, epochs=5):
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for images, labels in train_loader:
            # zero gradients
            optimizer.zero_grad()

            # forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)

            # backward pass and optimise
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

        print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")


train_model(model, train_loader, criterion, optimizer, epochs=5)


# evaluate loop
def evaluate_model(model, test_loader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            outputs = model(images)
            # index of the highest logit per sample; .data is unnecessary under no_grad()
            predicted = outputs.argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print(f"Accuracy: {correct/total}")


evaluate_model(model, test_loader)

# save the model
torch.save(model.state_dict(), "mnist_model.pth")

# reload the model
loaded_model = NeuralNetwork()
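# weights_only=True (supported in recent PyTorch releases) limits unpickling
# to tensors and plain containers, which is safer for untrusted files
loaded_model.load_state_dict(torch.load("mnist_model.pth", weights_only=True))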

# verify loaded model performance
evaluate_model(loaded_model, test_loader)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (fc1): Linear(in_features=784, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=64, bias=True)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
)
Epoch 1, Loss: 0.35669283079504965
Epoch 2, Loss: 0.1668147985600556
Epoch 3, Loss: 0.12700036020614205
Epoch 4, Loss: 0.10266157542467118
Epoch 5, Loss: 0.08850854687945296
Accuracy: 0.9657
Accuracy: 0.9657
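
The network above returns raw logits, because `CrossEntropyLoss` applies log-softmax internally during training. To obtain class probabilities at inference time, softmax can be applied to the output. The sketch below uses an untrained stand-in with the same layer sizes and a random input, both of which are assumptions made to keep it self-contained, so the predicted digit is meaningless.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Untrained stand-in with the same layer sizes as NeuralNetwork above.
stand_in = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
stand_in.eval()

# Random tensor shaped like a batch containing one MNIST image.
image = torch.randn(1, 1, 28, 28)

with torch.no_grad():
    logits = stand_in(image)
    probabilities = F.softmax(logits, dim=1)  # each row sums to 1

predicted_digit = probabilities.argmax(dim=1).item()
print(probabilities.sum().item())  # approximately 1.0
print(predicted_digit)             # an integer from 0 to 9
```

Softmax preserves the ordering of the logits, so `argmax` gives the same prediction with or without it; the probabilities are useful when a confidence score is needed.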
