# Handwritten Digit Recognition Project 🚀

Welcome to our exciting journey into the world of Deep Learning! In this project, you'll dive into the fascinating realm of handwritten digit recognition using PyTorch, one of the most popular machine learning libraries. 🧠💻

## Project Overview 📝

Your mission, should you choose to accept it, involves building and optimizing a PyTorch model to recognize handwritten digits from the MNIST dataset. This dataset is like the 'Hello World' of machine learning, perfect for beginners and yet intriguing for experienced coders. 🌟

## Learning Objectives 🎯

- **Understanding PyTorch**: Get hands-on experience with PyTorch, understanding its basic operations and how to build models with it.
- **Model Optimization**: Explore various training optimization techniques such as adding dropout layers, implementing regularizers, and utilizing early stopping to enhance model performance.
- **Experimentation**: Test different hyperparameters and observe how they impact your model's learning process and accuracy.

## Project Structure 🗂️

- **Data Preprocessing**: Learn how to prepare your data for optimal model training.
- **Model Building**: Design a neural network architecture suitable for digit recognition.
- **Training and Testing**: Implement the training loop, and test your model's performance.
- **Optimization Techniques**: Apply different optimization strategies to improve your model.

## TODOs 📌

Throughout this notebook, you'll find `TODO` sections. These are areas where you'll need to apply what you've learned and write your own code. Don't worry, though; guidance and hints are provided to help you on your journey!

So, are you ready to embark on this adventure in machine learning? Let's get started! 🚀👩‍💻👨‍💻

---

Remember, the goal of this project is not just to build a model but to experiment and learn. Every challenge you encounter is an opportunity to grow. Let's do this! 💪


## Loading the MNIST Dataset 📚

Before diving into the model building, the first crucial step is to load our dataset. In this section, you'll learn how to load and visualize the MNIST dataset, which is a collection of 70,000 grayscale images of handwritten digits (0 through 9). This dataset is widely used for training and testing in the field of machine learning. 🤖📈



In [1]:
# Import necessary libraries
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# TODO: Define a transform to normalize the data
transform = transforms.Compose([
    # TODO: Add necessary transformations
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# TODO: Load the MNIST dataset
train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_set = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# TODO: Create data loaders
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:00<00:00, 14772673.93it/s]


Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 443235.04it/s]


Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1648877/1648877 [00:00<00:00, 3931268.45it/s]


Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 2414821.75it/s]

Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw






In [None]:
# TODO: Figure out how many images are in the train_set and test_set.


In [2]:
len(train_set)

60000

In [3]:
len(test_set)

10000

## Building the Neural Network Model 🛠️

Now that our dataset is ready, it's time to build the neural network model that will learn to recognize handwritten digits. In this section, you will define the architecture of your neural network.

### Key Concepts:
- **Layers**: Neural networks are composed of layers. Each layer has a specific role, like convolutional layers for feature extraction or fully connected (dense) layers for decision making.
- **Activation Functions**: These functions introduce non-linear properties to the network, allowing it to learn more complex patterns.

In [None]:
image, label = train_set[0]
print(image.shape)

torch.Size([1, 28, 28])


In [4]:
# Import necessary PyTorch libraries
import torch.nn as nn
import torch.nn.functional as F

# TODO: Define the neural network class
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # TODO: Define layers of the neural network
        self.fc1 = nn.Linear(784,10 ) # First fully connected layer


    def forward(self, x):
        # Flatten the input tensor
        x = x.view(-1, 28 * 28)
        x = self.fc1(x)  # TODO: add an activation function
        x =F.log_softmax(x, dim=1)
        return x

# Create an instance of the network
model = Net()
print(model)

Net(
  (fc1): Linear(in_features=784, out_features=10, bias=True)
)


## Training the Neural Network Model 🏋️‍♀️🏋️‍♂️

With our neural network model defined, the next exciting step is to train it. This process involves feeding the training data to the model and adjusting the model parameters (weights and biases) based on the computed loss and the chosen optimization algorithm.

### Key Concepts:
- **Loss Function**: Measures how well the model performs. A common choice for classification tasks is Cross-Entropy Loss.
- **Optimizer**: Helps in updating the model parameters based on the computed gradients. We'll be using Stochastic Gradient Descent (SGD) in this example.
- **Epochs**: One epoch means the model has seen the entire dataset once. Training for multiple epochs means going through the dataset multiple times.



In [None]:
# TODO: Complete this code
# Import optimizer
from torch.optim import SGD


# TODO: Define the loss function and optimizer
criterion =nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1)

# TODO: Set the number of epochs
num_epochs =10

# Training loop
for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        # TODO: Complete Training pass
        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        loss.backward()

        optimizer.step()

        running_loss += loss.item()
    else:
        print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")

    # TODO: evaluate on the test_loader
    test_loss = 0.0
    for images, labels in test_loader:
        # TODO: Complete evaluation pass
        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        test_loss += loss.item()
    else:
        print(f" Loss: {test_loss/len(test_loader)}")

print("Training is finished!")

Epoch 1, Loss: 0.317570639492225
 Loss: 0.40004401257748057
Epoch 2, Loss: 0.3118408035669627
 Loss: 0.32226775886763814
Epoch 3, Loss: 0.3083542331949925
 Loss: 0.36989566122602885
Epoch 4, Loss: 0.3056124110203753
 Loss: 0.2857994048731627
Epoch 5, Loss: 0.30441292845554696
 Loss: 0.3061939826365679
Epoch 6, Loss: 0.3034819480913407
 Loss: 0.32723684109462675
Epoch 7, Loss: 0.3045578286973144
 Loss: 0.3753653633271813
Epoch 8, Loss: 0.30522877856421826
 Loss: 0.37915449941851154
Epoch 9, Loss: 0.2999077463216746
 Loss: 0.2937800096522922
Epoch 10, Loss: 0.2976090125580713
 Loss: 0.29498700181520576
Training is finished!


In [None]:
# TODO: plot the model complexity graph

### TODO1: Comment the model complexity graph
### TODO2: Change the model and add more layer (use a complex model)

In [5]:
# Import necessary PyTorch libraries
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # TODO: Define layers of the neural network
        self.fc1 = nn.Linear(784,256 ) # First fully connected layer
        self.fc2 = nn.Linear(256,128 )
        self.fc3 = nn.Linear(128,10 )


    def forward(self, x):
        # Flatten the input tensor
        x = x.view(-1, 28 * 28)
        x = self.fc1(x)  # TODO: add an activation function
        x = F.relu(x)
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        x =F.log_softmax(x, dim=1)
        return x

# Create an instance of the network
model = Net()
print(model)

Net(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=128, bias=True)
  (fc3): Linear(in_features=128, out_features=10, bias=True)
)


In [8]:
# TODO: Complete this code
# Import optimizer
from torch.optim import SGD


# TODO: Define the loss function and optimizer
criterion =nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1)

# TODO: Set the number of epochs
num_epochs =20

# Training loop
for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        # TODO: Complete Training pass
        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        loss.backward()

        optimizer.step()

        running_loss += loss.item()
    else:
        print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")

    # TODO: evaluate on the test_loader
    test_loss = 0.0
    for images, labels in test_loader:
        # TODO: Complete evaluation pass
        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        test_loss += loss.item()
    else:
        print(f" Loss: {test_loss/len(test_loader)}")

print("Training is finished!")

Epoch 1, Loss: 0.005752749406507483
 Loss: 0.07250975104170286
Epoch 2, Loss: 0.006449911077194703
 Loss: 0.07291533469129242
Epoch 3, Loss: 0.0045316212949079015
 Loss: 0.07112634171798232
Epoch 4, Loss: 0.0032975018814122565
 Loss: 0.06735397260079
Epoch 5, Loss: 0.0019133148490276157
 Loss: 0.06662699377769721
Epoch 6, Loss: 0.00213255827696197
 Loss: 0.06725587778842027
Epoch 7, Loss: 0.0013493871558068683
 Loss: 0.07197256504951413
Epoch 8, Loss: 0.0017550431907102102
 Loss: 0.07160960813009958
Epoch 9, Loss: 0.0008700578639874138
 Loss: 0.07224944116239586
Epoch 10, Loss: 0.0010081380153980314
 Loss: 0.07277647795535958
Epoch 11, Loss: 0.0006327422604649878
 Loss: 0.07093196414770131
Epoch 12, Loss: 0.0007780643419330526
 Loss: 0.07228402411775896
Epoch 13, Loss: 0.0006750607391503335
 Loss: 0.0740413475639307
Epoch 14, Loss: 0.0004586162098744324
 Loss: 0.07317676467861142
Epoch 15, Loss: 0.0003894092628463292
 Loss: 0.0743005623933007
Epoch 16, Loss: 0.0004550565212892481
 Loss

## Implementing Early Stopping 🛑

One of the key techniques in training neural networks effectively is 'Early Stopping'. This technique halts the training process if the model performance stops improving on a held-out validation set. Early stopping is a form of regularization used to avoid overfitting.

### Key Concepts:
- **Validation Loss**: Monitor the loss on a validation set to detect when it begins to increase, indicating overfitting.

In [10]:
# TODO: Complete this code to implement Early stopping
patience = 3
min_delta = 0.01
best_loss = None
patience_counter = 0

# Training loop with early stopping
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        # Training pass

        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        loss.backward()

        optimizer.step()

        running_loss += loss.item()

    # evaluation phase
    model.eval()
    validation_loss = 0.0
    with torch.no_grad():
        for images, labels in test_loader:
            output = model(images)
            validation_loss += loss.item()

    # Calculate average losses
    training_loss = running_loss / len(train_loader)
    validation_loss /= len(test_loader)

    print(f"Epoch {epoch+1}, Training Loss: {training_loss}, Validation Loss: {validation_loss}")

    # Early stopping logic
    if best_loss is None or validation_loss < best_loss - min_delta:
        best_loss = validation_loss
        patience_counter = 0
    else:
        patience_counter += 1
        if patience_counter >= patience:
            print("Early stopping triggered!")
            break

print("Training is finished!")

Epoch 1, Training Loss: 0.0001847588073771257, Validation Loss: 0.00047103059478104115
Epoch 2, Training Loss: 0.0001806932147625028, Validation Loss: 0.00021229659614618868
Epoch 3, Training Loss: 0.000175263833978103, Validation Loss: 9.418222907697782e-05
Epoch 4, Training Loss: 0.00017001310016225815, Validation Loss: 3.23802960338071e-05
Early stopping triggered!
Training is finished!


In [None]:
# TODO: Answer this questions
# What does min_delta and patience refer to ?
# What is different from the first training ?

**Patience** is a paramers used to specify the number of consecutive epochs with no improvement in the validation loss before training is stopped

**Min_delta** is a parameter used to define the minimum change in the validation loss

## Experimenting with Dropout 🌧️

Dropout is a regularization technique that temporarily drops units (along with their connections) from the neural network during training. This prevents units from co-adapting too much and helps the model to generalize better to unseen data.

### Key Concepts:
- **Dropout Rate**: The probability of a neuron being dropped during training. Common rates are 0.2, 0.5, etc.
- **Generalization**: Dropout improves the generalization of the model on the test data.


In [12]:
class NetWithDropout(nn.Module):
    def __init__(self):
        super(NetWithDropout, self).__init__()
        # Define layers of the neural network
        self.fc1 = nn.Linear(28 * 28, 128)
        self.dropout1 = nn.Dropout(p=0.2)  # Dropout layer with 20% probability
        self.fc2 = nn.Linear(128, 64)
        self.dropout2 = nn.Dropout(p=0.5)  # Dropout layer with 50% probability
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # Flatten the input tensor
        x = x.view(-1, 28 * 28)
        # Forward pass with dropout
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)
        return x

# Create an instance of the network with dropout
model_with_dropout = NetWithDropout()
print(model_with_dropout)

NetWithDropout(
  (fc1): Linear(in_features=784, out_features=128, bias=True)
  (dropout1): Dropout(p=0.2, inplace=False)
  (fc2): Linear(in_features=128, out_features=64, bias=True)
  (dropout2): Dropout(p=0.5, inplace=False)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
)


In [None]:
# TODO: Train the dropout model
# What do you notice ?

In [13]:

num_epochs =10

# Training loop
for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        # TODO: Complete Training pass
        optimizer.zero_grad()

        outputs = model(images)

        loss =criterion(outputs,labels)

        loss.backward()

        optimizer.step()

        running_loss += loss.item()
    else:
        print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")


print("Training is finished!")

Epoch 1, Loss: 0.0001661985941034258
Epoch 2, Loss: 0.00016045922787848572
Epoch 3, Loss: 0.00015796959084335332
Epoch 4, Loss: 0.00015358168136264352
Epoch 5, Loss: 0.00015048369456637184
Epoch 6, Loss: 0.00014528210868408693
Epoch 7, Loss: 0.00014349422547327095
Epoch 8, Loss: 0.00013974140595420206
Epoch 9, Loss: 0.00013667499299716457
Epoch 10, Loss: 0.00013404696047111743
Training is finished!


## Submitting Your Project on GitHub 🚀

Submitting your project on GitHub not only allows you to showcase your work but also helps in version control and collaboration. Here's how you can do it:

### Step 1: Create a New Repository on GitHub
1. **Sign in to GitHub**: Go to [GitHub](https://github.com) and sign in with your account.
2. **Create a New Repository**: Click on the '+' icon in the top right corner and select 'New repository'.
3. **Name Your Repository**: Give your repository a meaningful name, like 'handwritten-digit-recognition'.
4. **Initialize with a README**: Check the box 'Initialize this repository with a README'.
5. **Create Repository**: Click the 'Create repository' button.

### Step 2: Clone the Repository to Your Local Machine
1. **Copy the Repository URL**: On your repository page on GitHub, click the 'Code' button and copy the URL.
2. **Clone in Terminal**: Open your terminal, navigate to where you want the repository, and run `git clone [URL]`, replacing `[URL]` with the URL you copied.

### Step 3: Add Your Project to the Repository
1. **Copy Your Notebook**: Place your Jupyter notebook file into the cloned repository folder on your local machine.
2. **Add the File**: Run `git add [filename]` in your terminal, replacing `[filename]` with the name of your notebook file.

### Step 4: Commit and Push Your Changes
1. **Commit Your Changes**: Run `git commit -m "Add project notebook"`.
2. **Push to GitHub**: Run `git push` to push your changes to the GitHub repository.

### Step 5: Create and Edit the README File
1. **Edit README.md**: On GitHub, open the README.md file and click the pencil icon to edit.
2. **Write Your README**: Include a project title, a brief description, installation instructions, and usage instructions. Optionally, add screenshots or additional sections as needed.
3. **Save Changes**: After editing, commit your changes by clicking 'Commit changes' at the bottom.

### 📌 TODOs for Submission:
- Ensure your Jupyter notebook is well-commented and formatted.
- Write a clear, concise README that effectively describes your project.
- Double-check that all files have been committed and pushed to your GitHub repository.

---

Remember, a well-documented GitHub repository not only reflects your technical skills but also your ability to communicate and present your work effectively. Happy coding and best of luck with your project submission! 🌟👩‍💻👨‍💻
