# Basic PyTorch Concepts in Practice: Building an MNIST CNN Model

## Introduction to the Lecture

**Purpose of this Lecture:**
This lecture aims to introduce you to the fundamental concepts of PyTorch, a powerful open-source machine learning library. We'll do this by working through a practical example: building an image classification model. By the end of this session, you'll have a foundational understanding of how to define, train, and test a simple neural network using PyTorch.

**The MNIST Example: What Are We Doing?**
We will be working with the **MNIST dataset** (pronounced "em-nist"). This is a famous dataset in the machine learning world, consisting of 70,000 grayscale images of handwritten digits, from 0 to 9 (60,000 for training and 10,000 for testing). Each image is small, 28x28 pixels.
Our goal is to build a **Convolutional Neural Network (CNN)** that can look at one of these handwritten digit images and correctly predict which digit it is (0, 1, 2, ..., 9).

**Why This Example?**
*   **MNIST - A Classic Starting Point:** MNIST is often called the "hello world" of image classification. Its simplicity and small size make it an ideal dataset for beginners to learn the basics without getting bogged down by large, complex data.
*   **CNNs - Fundamental for Images:** Convolutional Neural Networks are a cornerstone of modern computer vision. Understanding their basic structure is crucial for anyone interested in working with image data. This example provides a gentle introduction.
*   **PyTorch - Flexible and Pythonic:** PyTorch is widely used in both research and industry. It's known for its Python-friendly interface, dynamic computation graphs (which offer flexibility), and strong GPU acceleration. Learning PyTorch is a valuable skill.

**Learning Outcomes:**
By completing this exercise, you will learn to:
1.  Load and prepare image datasets (like MNIST) specifically for use in PyTorch.
2.  Understand and use **Tensors**, the core data structure in PyTorch.
3.  Define a simple Convolutional Neural Network (CNN) architecture using PyTorch's `nn.Module`.
4.  Understand the roles of an **optimizer** (like Adam) and a **loss function** (like Cross-Entropy Loss) in training a neural network.
5.  Implement the basic training loop:
    *   **Forward Pass:** Getting predictions from the model.
    *   **Loss Calculation:** Measuring how good/bad the predictions are.
    *   **Backward Pass (Backpropagation):** Calculating gradients to understand how to improve the model.
    *   **Optimizer Step:** Updating the model's parameters (weights) to make it better.
6.  Evaluate your model's performance on unseen test data to see how well it has learned.
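
The four training-loop steps in outcome 5 can be sketched in a few lines. This is a minimal illustration only, not the model built later in this lecture: the tiny `nn.Linear` classifier, the random `images` and `labels`, and the learning rate are all hypothetical stand-ins chosen so the snippet runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random stand-in data reproducible

# Hypothetical stand-ins: a tiny linear classifier and one random batch
model = nn.Linear(28 * 28, 10)          # 784 pixel inputs -> 10 digit scores
images = torch.rand(32, 28 * 28)        # 32 fake flattened "images"
labels = torch.randint(0, 10, (32,))    # 32 fake digit labels

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

weights_before = model.weight.detach().clone()

optimizer.zero_grad()                   # clear gradients from any previous step
outputs = model(images)                 # forward pass: shape (32, 10)
loss = criterion(outputs, labels)       # loss calculation: a single scalar
loss.backward()                         # backward pass: fills .grad on each parameter
optimizer.step()                        # optimizer step: updates the weights

weights_changed = not torch.equal(weights_before, model.weight.detach())
print(f"loss: {loss.item():.4f}, weights changed: {weights_changed}")
```

A real training run repeats these five lines for every batch in every epoch.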

**Brief Explanation of Key Concepts (for Absolute Beginners):**
Don't worry if these terms are new; we'll see them in action!
*   **Neural Network (NN):** Imagine a computer system that learns from examples, loosely inspired by how the human brain works. It's made of interconnected units called "neurons" that work together to process information and make decisions or predictions.
*   **Convolutional Neural Network (CNN):** A special type of neural network that's really good at understanding images. It uses "convolutional" layers that act like sets of learnable filters, sliding across images to detect patterns like edges, textures, and shapes.
*   **Tensor:** In PyTorch (and other deep learning frameworks), a tensor is the main way we store and manipulate data. Think of it as a multi-dimensional array or grid of numbers. A 1D tensor is a vector, a 2D tensor is a matrix, and you can have 3D, 4D, or even higher-dimensional tensors (e.g., a batch of color images in PyTorch is typically a 4D tensor: batch_size x color_channels x height x width).
*   **Training:** This is the process of "teaching" the neural network. We show it lots of example images and their correct labels (e.g., this image is a "7"). The network makes a prediction, we see how wrong it is, and then we slightly adjust its internal settings (called "weights") to make it more accurate next time.
*   **Epoch:** One complete round of showing the *entire* training dataset to the neural network. We usually train for multiple epochs.
*   **Batch:** Because training datasets can be very large, we often break them into smaller chunks called batches. The model processes one batch at a time within an epoch.
*   **Loss Function (or Criterion):** A mathematical function that measures how "wrong" the model's predictions are compared to the actual correct answers (labels). A high loss means the model is doing poorly; a low loss means it's doing well. The main goal of training is to minimize this loss.
*   **Optimizer:** An algorithm that helps the neural network adjust its internal weights to reduce the loss. It uses the information from the loss function to decide how to change the weights to make better predictions. Adam, SGD (Stochastic Gradient Descent), and RMSprop are common optimizers.
*   **Activation Function (e.g., ReLU):** A function applied to the output of neurons within the network. They introduce non-linearity, which is crucial for the network to learn complex relationships in the data. ReLU (Rectified Linear Unit) is a very common one; it basically outputs the input if it's positive, and zero otherwise.
*   **Softmax:** Often used as the last activation function in a classification model. It takes a vector of raw scores (logits) from the network and converts them into a vector of probabilities, where each probability represents how likely the input image is to belong to a particular class (e.g., 70% chance it's a '2', 10% it's a '7', etc.). All probabilities sum up to 1.
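
To make three of these terms concrete, here is a small sketch of a tensor, ReLU, and softmax. The values are made up purely for illustration. Note that PyTorch's convolutional layers expect image batches with the channel dimension before height and width.

```python
import torch
import torch.nn.functional as F

# A batch of 4 grayscale 28x28 images as a 4D tensor:
# (batch, channels, height, width)
batch = torch.zeros(4, 1, 28, 28)
print(batch.shape)        # torch.Size([4, 1, 28, 28])

# ReLU keeps positive values and replaces negative ones with zero
x = torch.tensor([-2.0, 0.0, 3.0])
relu_out = F.relu(x)
print(relu_out)           # tensor([0., 0., 3.])

# Softmax turns raw scores (logits) into probabilities that sum to 1
logits = torch.tensor([2.0, 1.0, 0.1])
probs = F.softmax(logits, dim=0)
print(probs.sum())        # 1, up to floating-point rounding
print(probs.argmax())     # tensor(0): the largest logit gets the highest probability
```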
---


## 1. Load Libraries

This first code block is dedicated to importing all the necessary Python libraries and modules that we'll use throughout this notebook. Libraries are collections of pre-written code that provide useful functions and tools, so we don't have to write everything from scratch.

Here's a brief overview of the key libraries we're importing:

*   **`torch`**: This is the main PyTorch library. It provides the core functionalities like tensor operations (the fundamental data structures for PyTorch) and automatic differentiation (which is key for training neural networks).
*   **`torch.nn`**: This submodule of PyTorch contains the building blocks for constructing neural networks, such as layers (convolutional, linear, etc.), activation functions, and loss functions. `nn` stands for Neural Network.
*   **`torch.nn.functional` (often imported as `F`)**: This module contains functions that are used in building neural networks, like activation functions (e.g., ReLU) and pooling operations. It's often used for functions that don't have learnable parameters.
*   **`torchvision`**: This library is part of the PyTorch ecosystem and provides access to popular datasets, pre-trained model architectures, and common image transformations for computer vision tasks.
*   **`torchvision.transforms`**: A submodule of `torchvision` that provides tools for pre-processing image data, such as converting images to tensors, normalizing them, or applying data augmentation techniques.
*   **`torchvision.datasets`**: This submodule makes it easy to download and use standard datasets like MNIST, CIFAR10, ImageNet, etc.
*   **`pandas` (as `pd`)**: A powerful library for data manipulation and analysis, particularly useful for working with tabular data (though not heavily used in this specific image-focused notebook).
*   **`numpy` (as `np`)**: A fundamental package for numerical computation in Python. PyTorch tensors can be easily converted to and from NumPy arrays.
*   **`torch.utils.data.Dataset` and `torch.utils.data.DataLoader`**: These are PyTorch utilities that help in creating custom datasets and loading data efficiently in batches during model training and evaluation.
*   **`sklearn.metrics.recall_score`**: A function from scikit-learn, a comprehensive machine learning library, to calculate the recall score. While not directly used in the main training loop here, it's imported and could be used for more detailed evaluation.
*   **`matplotlib.pyplot` (as `plt`)**: A widely used plotting library in Python. We'll use it to display images from our dataset.
*   **`joblib`**: A library for lightweight pipelining in Python. It can be useful for saving and loading Python objects, including trained models or data.
*   **`tqdm`**: A library that provides a simple and effective way to add progress bars to loops, which is very helpful for monitoring the progress of time-consuming tasks like training neural networks.
*   **`os`**: A standard Python library for interacting with the operating system, for example, to manage files and directories.
*   **`random`**: A standard Python library for generating random numbers, which can be useful for various tasks like shuffling data or initializing parameters.
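
As a quick illustration of the NumPy interoperability mentioned above, the sketch below converts an array to a tensor and back. The array values are arbitrary; the point to notice is that `torch.from_numpy` shares memory with the original array rather than copying it.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

t = torch.from_numpy(arr)   # NumPy array -> tensor (shares memory with arr)
back = t.numpy()            # tensor -> NumPy array (works for CPU tensors)

arr[0, 0] = 10.0            # memory is shared, so the tensor sees the change
print(t[0, 0].item())       # 10.0
print(type(back), back.dtype)
```

If you need an independent copy instead, `torch.tensor(arr)` copies the data.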

The lines `%reload_ext autoreload`, `%autoreload 2`, and `%matplotlib inline` are "magic commands" often used in Jupyter Notebooks:
*   `%reload_ext autoreload` and `%autoreload 2`: These commands automatically reload modules before executing code. This is useful if you're editing external Python scripts and want the changes to be reflected in the notebook without restarting the kernel.
*   `%matplotlib inline`: This command ensures that plots generated by `matplotlib` are displayed directly within the notebook interface.

---

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
from torchvision import datasets

import pandas as pd
import numpy as np
from torch.utils.data import Dataset, DataLoader
from sklearn.metrics import recall_score
import matplotlib.pyplot as plt
import joblib
from tqdm import tqdm
import os
import random

%reload_ext autoreload
%autoreload 2
%matplotlib inline

## 2. Read / Import Data & Initial Setup

In this section, we'll set up some basic parameters and then download and prepare our dataset.

*   **`BATCH_SIZE = 32` (or 64, 128):**
    *   When training neural networks, we usually don't feed the entire dataset to the model at once. Instead, we divide it into smaller, manageable chunks called **batches**.
    *   `BATCH_SIZE` defines how many samples (images, in our case) are included in each batch.
    *   **Why use batches?**
        *   **Memory Efficiency:** Processing the entire dataset at once might require more memory (RAM or GPU VRAM) than available. Batches make it feasible to train on large datasets.
        *   **Faster Training (per update):** The model's weights are updated after processing each batch. Smaller batches mean more frequent updates, which can sometimes lead to faster convergence.
        *   **Stable Gradient Estimation:** Each batch's gradient is an average over its samples. Very small batches give noisy estimates, while moderately sized batches give a more stable estimate of the gradient than updating on one sample at a time (pure stochastic gradient descent).
    *   The choice of batch size (e.g., 32, 64, 128) is a hyperparameter that can affect training speed and model performance. There's no single "best" size; it often depends on the dataset, model, and available hardware.

*   **`transform = transforms.Compose([...])`**:
    *   PyTorch's `torchvision.transforms` module provides tools for common image transformations. `transforms.Compose` allows us to chain multiple transformations together.
    *   **`transforms.ToTensor()`**: This is a crucial transformation. It converts images (which might be in formats like PIL Image or NumPy arrays) into PyTorch **Tensors**.
        *   For 8-bit images such as MNIST's (PIL Images or `uint8` NumPy arrays), it also rescales pixel values from the integer range `[0, 255]` to the floating-point range `[0.0, 1.0]` by dividing each value by 255.
        *   It rearranges the dimensions of the image tensor from HWC (Height, Width, Channel) to CHW (Channel, Height, Width), which is the format PyTorch's convolutional layers expect. Since MNIST images are grayscale, the channel will be 1.

*   **`trainset = torchvision.datasets.MNIST(...)`** and **`testset = torchvision.datasets.MNIST(...)`**:
    *   These lines download the MNIST dataset using `torchvision.datasets.MNIST`. PyTorch makes it very convenient to access many standard datasets.
    *   **`root='./data'`**: Specifies the directory where the MNIST data will be downloaded or, if already downloaded, where it's stored.
    *   **`train=True`**: This flag indicates that we want to load the **training** portion of the MNIST dataset. This is the data the model will learn from.
    *   **`train=False`**: This flag indicates that we want to load the **testing** (or evaluation) portion of the MNIST dataset. This data is kept separate and is used to evaluate how well our trained model generalizes to new, unseen images. It's crucial not to train the model on the test data.
    *   **`download=True`**: If the MNIST dataset is not found in the `root` directory, this option allows PyTorch to download it automatically.
    *   **`transform=transform`**: This applies the transformations we defined earlier (in our case, `transforms.ToTensor()`) to each image as it's loaded from the dataset. This means each image will be converted into a PyTorch tensor with pixel values between 0.0 and 1.0.

The output of this cell (the download progress bars) shows that PyTorch is fetching the dataset files. These typically include files for training images, training labels, test images, and test labels.

---