# <font color='black'> **Handwritten Digits Detection**

<img src='download.png'>


### <font color='black'> **Prepared by** : Basant Saad El_din mohamed  

# <font color='red'>**Important Libraries**

##### <font color='black'> 1- **torch** : deep learning framework

##### 2- **torchvision** : computer-vision utilities, including:
###### <font color='black'>   *transforms* : image preprocessing, such as data augmentation and normalization
###### <font color='black'>   *datasets* : ready-made datasets such as MNIST

##### <font color='black'>3- **torch.utils.data** : provides tools for managing datasets and preparing them for training. It handles loading, batching, and splitting efficiently, which matters most for large datasets.
###### <font color='black'> *DataLoader* : batches the data, optionally shuffles it, and feeds it to the model.
###### <font color='black'> *random_split* : a function that randomly splits a dataset into two or more non-overlapping subsets.

##### 4- **torch.nn** : simplifies building neural networks
##### 5- **tqdm** : displays progress bars for loops (over epochs or batches), showing how far a long-running process has progressed.
##### 6- **sklearn.metrics** : evaluation metrics

In [1]:
import torch  
from torchvision import datasets, transforms 
from torch.utils.data import DataLoader, random_split 
from torch import nn 
from sklearn.metrics import precision_recall_fscore_support, accuracy_score #evaluation metrics.
from tqdm import tqdm


#### <font color='black'>Define data <font color='red'>**Augmentation** <font color='black'>and <font color='red'>**Transformations** <font color='black'>for images and <font color='red'>**Load** <font color='black'>the dataset

In [2]:
train_transform = transforms.Compose([
    transforms.RandomRotation(10),        # Random rotation between -10 and 10 degrees
    transforms.RandomHorizontalFlip(p=0.5),  # 50% chance to flip horizontally (caution: a mirrored digit is usually not a valid digit, so this may hurt accuracy)
    transforms.ToTensor(),                # Convert image to tensor
    transforms.Normalize((0.1307,), (0.3081,))  # Normalize with mean and std specific to MNIST
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
# Load MNIST dataset
train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=train_transform)
test_dataset = datasets.MNIST(root='data', train=False, download=True, transform=test_transform)
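
To make the `Normalize((0.1307,), (0.3081,))` step concrete, here is a minimal sketch of the arithmetic it applies to every pixel, `(x - mean) / std`, once `ToTensor()` has scaled the pixels to the range [0, 1]. The helper name `normalize_pixel` is only for illustration and is not part of torchvision.

```python
MNIST_MEAN = 0.1307  # mean passed to Normalize in the transforms above
MNIST_STD = 0.3081   # standard deviation passed to Normalize in the transforms above

def normalize_pixel(x, mean=MNIST_MEAN, std=MNIST_STD):
    """Apply (x - mean) / std to one pixel value in [0, 1]."""
    return (x - mean) / std

black = normalize_pixel(0.0)  # background pixel
white = normalize_pixel(1.0)  # brightest stroke pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

After normalization the background sits slightly below zero and the strokes well above it, so the inputs are roughly centered instead of all being non-negative.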


#### <font color='red'>**Managing** <font color='black'>datasets and <font color='red'>**Preparing**<font color='black'> data for training ( Split training dataset )

In [3]:
# Split training dataset for training and validation 
train_size_samples = int(0.8 * len(train_dataset))
val_size_samples = len(train_dataset) - train_size_samples

train_dataset, val_dataset = random_split( train_dataset , [train_size_samples, val_size_samples])

#---------------------------------------------------------------------------------------------------------

# Define DataLoaders
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=2)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=2)

print(f"Number of training samples: {len(train_dataset)}")
print(f"Number of validation samples: {len(val_dataset)}")
print(f"Number of test samples: {len(test_dataset)}")

Number of training samples: 48000
Number of validation samples: 12000
Number of test samples: 10000
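
The split above is different on every run. If a repeatable split is wanted, `random_split` accepts a seeded `generator`. The sketch below shows this on a small stand-in dataset; `toy_dataset`, the `split` helper, and the seed value are assumptions made for illustration, not part of the notebook.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset: 100 fake "images" with labels (illustrative only).
toy_dataset = TensorDataset(torch.randn(100, 1, 28, 28), torch.randint(0, 10, (100,)))

train_size = int(0.8 * len(toy_dataset))
val_size = len(toy_dataset) - train_size

def split(seed):
    # A seeded generator makes the random split repeatable across runs.
    generator = torch.Generator().manual_seed(seed)
    return random_split(toy_dataset, [train_size, val_size], generator=generator)

train_a, val_a = split(seed=42)
train_b, val_b = split(seed=42)

print(len(train_a), len(val_a))
print(train_a.indices == train_b.indices)               # same seed, same split
print(set(train_a.indices).isdisjoint(val_a.indices))   # subsets never overlap
```

Note that both subsets returned by `random_split` wrap the same underlying dataset, so in the cell above the validation subset also passes through `train_transform`, including its random augmentations.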


### <font color='red'> **The Model Architecture**
#####  - <font color='black'>This model consists of two convolutional blocks followed by a fully connected classifier
### <font color='red'> **The main details** :
##### <font color='black'>- The **\_\_init\_\_** method : the constructor, which initializes the network's layers.

##### - **super(CNN_Neural_network, self).\_\_init\_\_()** : Calls the constructor of the parent class (nn.Module).
##### - **the convolutional block**, which consists of:
###### - **Convolutional layers (nn.Conv2d)** : Extract spatial features from the input images by applying learnable filters.
###### - **Activation Function (nn.ReLU)** : ReLU(x) = max(0, x).
###### - **Pooling Layers (nn.MaxPool2d)** : Reduce the spatial dimensions by taking the maximum value in each window (kernel size: 2x2, stride: 2).
###### - **Dropout (nn.Dropout)** : Regularization technique that helps prevent overfitting by randomly zeroing a fraction of the activations during training.

##### - **fully connected block**, which consists of:

###### - **nn.Flatten()** : Converts each sample's multi-dimensional feature map into a 1D vector for the fully connected layers
###### - **nn.Linear(64 * 7 * 7, 128)** : Maps the 3136 flattened features to 128 hidden units
###### - **ReLU Activation (nn.ReLU)**
###### - **Dropout (nn.Dropout(0.5))**
###### - **nn.Linear(128, num_classes)** : Produces one score (logit) per class
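
The `64 * 7 * 7` input size of the first linear layer follows from the standard output-size formula for convolution and pooling layers, `floor((n + 2p - k) / s) + 1`. The sketch below walks a 28x28 image through the layers, assuming both convolutions use a 3x3 kernel with `padding=1` and the last convolution has 64 output channels, which is what `nn.Linear(64 * 7 * 7, 128)` implies. The helper name `conv_output_size` is illustrative.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1

after_conv1 = conv_output_size(28, kernel_size=3, stride=1, padding=1)           # 3x3 conv, padding 1
after_pool1 = conv_output_size(after_conv1, kernel_size=2, stride=2)             # 2x2 pooling
after_conv2 = conv_output_size(after_pool1, kernel_size=3, stride=1, padding=1)  # 3x3 conv, padding 1
after_pool2 = conv_output_size(after_conv2, kernel_size=2, stride=2)             # 2x2 pooling

flattened = 64 * after_pool2 * after_pool2  # channels * height * width

# For comparison: the same 3x3 convolution without padding shrinks the map.
no_padding = conv_output_size(14, kernel_size=3, stride=1, padding=0)

print(after_conv1, after_pool1, after_conv2, after_pool2, flattened, no_padding)
# → 28 14 14 7 3136 12
```

With `padding=0` the second convolution would produce a 12x12 map, and pooling would then give 6x6 rather than 7x7, so the padding choice and the linear layer's input size have to agree.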

In [8]:
class CNN_Neural_network(nn.Module):

    def __init__(self, num_classes):

        super(CNN_Neural_network, self).__init__()
        
        self.convolutional_block = nn.Sequential(

            # first convolutional block
            nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),   # (1, 28, 28) -> (16, 28, 28)
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (16, 28, 28) -> (16, 14, 14)
            nn.Dropout(0.25),

            # second convolutional block
            # 64 output channels and padding=1 give (64, 7, 7) after pooling, as nn.Linear(64 * 7 * 7, 128) expects
            nn.Conv2d(16, 64, kernel_size=3, stride=1, padding=1),  # (16, 14, 14) -> (64, 14, 14)
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # 2D pooling for image tensors: (64, 14, 14) -> (64, 7, 7)
            nn.Dropout(0.25)
        )
        
        self.fully_connected_block = nn.Sequential(
            # before flattening: (batch_size, 64, 7, 7); after flattening: (batch_size, 64 * 7 * 7)
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128),  
            nn.ReLU(),
            nn.Dropout(0.5),
            
            nn.Linear(128, num_classes) 
        )

    def forward(self, x):
        x = self.convolutional_block(x)
        x = self.fully_connected_block(x)
        print(x.shape)  # debug output: logits shape (batch_size, num_classes), printed on every forward pass
        return x
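
A quick way to check that the layer sizes line up is to push a dummy batch through the network and record the shape after each layer. The sketch below rebuilds the layer stack as a plain `nn.Sequential`, assuming the second convolution has 64 output channels with `padding=1` and both pooling layers are `nn.MaxPool2d`, which is what `nn.Linear(64 * 7 * 7, 128)` requires. Dropout is left out because it does not change shapes; the names `layers` and `shapes` are illustrative.

```python
import torch
from torch import nn

layers = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 64, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

x = torch.zeros(4, 1, 28, 28)  # dummy batch: 4 grayscale 28x28 images
shapes = []
with torch.no_grad():
    for layer in layers:
        x = layer(x)  # apply one layer at a time and record the result's shape
        shapes.append((type(layer).__name__, tuple(x.shape)))

for name, shape in shapes:
    print(f"{name:10s} -> {shape}")
```

If a layer's input size is wrong, the loop fails at that layer, which makes the mismatch easy to locate.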

### <font color='red'> **Train_and_Evaluate the model**

#### <font color='black'> **Training Phase** (model.train())
###### - **model.train()** : Puts the model in training mode, which enables training-only behavior such as dropout.
###### - **train_loss = 0** : Initializes a variable that accumulates the total training loss over all batches in the current epoch.

#### **Loop Over Training Data** (for images, labels in train_loader):

###### - **images, labels = images.to(device), labels.to(device)** : Moves the batch of images and labels to the specified device, so that they are on the same device as the model.
###### - **optimizer.zero_grad()** : Clears the gradients from the previous step to prevent accumulation.
###### - **outputs = model(images)** : Feeds the images through the model to get predictions (logits).
###### - **loss = loss_fn(outputs, labels)** : Computes the loss by comparing the predictions (logits) with the true labels (integer class indices). Cross-entropy loss is the standard choice for classification tasks; nn.CrossEntropyLoss applies the softmax internally.
###### - **loss.backward()** : Uses backpropagation to compute the gradient of the loss with respect to each model parameter (weights and biases) and stores it in that parameter's .grad attribute.
###### - **optimizer.step()** : Updates the model's parameters using the computed gradients to minimize the loss.
###### - **train_loss += loss.item()** : Accumulates the loss value for the current batch.
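
To show what `loss_fn(outputs, labels)` computes, the sketch below compares `nn.CrossEntropyLoss` with the same value worked out by hand: take the log-softmax of the logits, pick the log-probability of the true class, negate it, and average over the batch. The tiny `logits` and `labels` tensors are made-up values for illustration.

```python
import torch
from torch import nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])  # raw scores: 2 samples, 3 classes
labels = torch.tensor([0, 2])             # integer class indices, not one-hot vectors

# Built-in loss (averages over the batch by default).
builtin = nn.CrossEntropyLoss()(logits, labels)

# Same value by hand: log-softmax, then the log-probability of each true class.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(builtin.item(), manual.item())
```

Because the softmax is applied inside the loss, the model's last layer should output raw logits, as the network here does.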


#### **Validation Phase (model.eval())**
###### - **model.eval()**: Puts the model in evaluation mode: dropout is disabled, and batch normalization layers (if present) use their running statistics instead of updating them.
###### - **val_loss = 0, all_preds = [], all_labels = []**: Initializes validation loss and empty lists to store predictions and true labels for later evaluation.

#### **Validation Loop (for images, labels in val_loader)**:
###### - **with torch.no_grad()**: Disables gradient tracking to save memory and speed up inference. No computation graph is built for the operations inside the with block, so the .grad attributes of the model parameters are left untouched.
###### - **outputs = model(images)**: Feeds the validation images through the model to get predictions.
###### - **loss = loss_fn(outputs, labels)**: Computes the loss for the current batch of validation data.
###### - **val_loss += loss.item()**: Accumulates validation loss over all batches.
###### - **preds = outputs.argmax(dim=1)**: Selects the predicted class (index of the highest score) for each sample in the batch.
###### - **all_preds.extend(preds.cpu().numpy())**: Stores the predictions (moved to the CPU and converted to NumPy for scikit-learn). The labels are stored in all_labels in the same way.

###### - **val_loss /= len(val_loader)**: Computes the average validation loss over all batches.
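
The effect of `model.eval()` and `torch.no_grad()` can be seen on single layers. The sketch below is an illustration with made-up inputs: a standalone `nn.Dropout` behaves differently in training and evaluation mode, and a standalone `nn.Linear` stops recording gradients inside the `no_grad` block.

```python
import torch
from torch import nn

torch.manual_seed(0)
dropout = nn.Dropout(0.5)
x = torch.ones(1, 8)

dropout.train()
train_out = dropout(x)  # some entries zeroed, survivors scaled by 1 / (1 - 0.5) = 2

dropout.eval()
eval_out = dropout(x)   # identity in evaluation mode

linear = nn.Linear(8, 2)
with torch.no_grad():
    no_grad_out = linear(x)  # no computation graph is recorded
grad_out = linear(x)         # graph recorded as usual

print(train_out)
print(eval_out)
print(no_grad_out.requires_grad, grad_out.requires_grad)
```

The two switches are independent: `eval()` changes layer behavior, while `no_grad()` only turns off gradient bookkeeping, which is why the validation loop uses both.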

In [5]:
def train_and_evaluate(model, train_loader, val_loader, loss_fn, optimizer, epochs, device="cpu"):

    model.to(device)

    for epoch in range(epochs):

        # Training Phase
        model.train()

        train_loss = 0  
        
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            
            optimizer.zero_grad() 

            outputs = model(images) 

            loss = loss_fn(outputs, labels) #The loss_fn compares the logits (outputs) with the true labels (labels).

            loss.backward() 

            optimizer.step() 
            
            train_loss += loss.item() 
        
        train_loss /= len(train_loader) #avg for the epoch

        #---------------------------------------------------------------------------------------------------------------------
        # Validation Phase

        model.eval()

        val_loss = 0
        all_preds, all_labels = [], []

        with torch.no_grad(): 
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = loss_fn(outputs, labels)
                val_loss += loss.item()
                
                preds = outputs.argmax(dim=1)  # predicted class for each sample in the batch
                all_preds.extend(preds.cpu().numpy())
                all_labels.extend(labels.cpu().numpy())
        
        val_loss /= len(val_loader)
        accuracy = accuracy_score(all_labels, all_preds)
        precision, recall, f1, _ = precision_recall_fscore_support(all_labels, all_preds, average="weighted")
        
        print(f"Epoch {epoch+1}/{epochs}:")
        print(f"Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}, Accuracy: {accuracy:.4f}")
        print(f"Precision: {precision:.4f}, Recall: {recall:.4f}, F1 Score: {f1:.4f}")

### <font color='red'> **The Test function** 

##### <font color='black'> - **Follows the same principles as the validation loop** 

In [6]:
# Test function
def test_model(model, test_loader, device="cpu"):
    model.to(device)  # make sure the model is on the same device as the test batches
    model.eval()
    all_preds, all_labels = [], []
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            preds = outputs.argmax(dim=1)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.cpu().numpy())
    
    accuracy = accuracy_score(all_labels, all_preds)
    precision, recall, f1, _ = precision_recall_fscore_support(all_labels, all_preds, average="weighted")
    print(f"Test Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}, Recall: {recall:.4f}, F1 Score: {f1:.4f}")

### <font color='red'> **Instantiation of CNN_Neural_network**

In [7]:
device = "cuda" if torch.cuda.is_available() else "cpu"

num_classes = 10  #[0,1,2,3,4,5,6,7,8,9]

model = CNN_Neural_network(num_classes)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

loss_fn = nn.CrossEntropyLoss()

# Train the model
epochs = 10
train_and_evaluate(model, train_loader, val_loader, loss_fn, optimizer, epochs, device)

# Test the model
test_model(model, test_loader, device)


torch.Size([32, 10])
torch.Size([3