
# Convolutional Neural Network

## Background
A Convolutional Neural Network (CNN or ConvNet) is a type of deep learning model most commonly used for analyzing visual imagery. Inspired by the mammalian visual cortex, it is the technology behind tasks like image recognition, object detection, and facial recognition.

Computer vision is the primary and most common use of ConvNets.

- Social media platforms use it to automatically suggest tags for people (facial recognition) or objects in your photos.
- Cashier-less stores (like Amazon Go) use it to track what items you pick up.
- Self-driving cars use it to detect and react to people, other cars, traffic lights, and road signs.

While most famous for images, CNNs can also be applied to sequential data:

- Sentiment Analysis: Determining if a product review or tweet is positive, negative, or neutral.
- Text Classification: Automatically sorting documents or articles by topic.
- Speech Recognition: Used in virtual assistants (like Siri or Google Assistant) to help process and understand the sounds in your voice.
- Finance: Analyzing time-series data (like stock charts) to detect patterns or fraudulent activity.

## Objectives

1. Review data loading and pre-processing using PyTorch. 
2. Practice constructing a Convolutional Neural Network.
3. Practice model training with PyTorch.

<font color=#582c83>

## Exercises

1. (30%) Exercise 1: ConvNet construction
2. (20%) Exercise 2: Training function
3. (10%) Exercise 3: Validation function
4. (20%) Exercise 4: Model optimization
5. (20%) Exercise 5: New images classification

</font>

## 1. Load Data
We will train a deep learning model using the Imagenette dataset.
Imagenette is a subset of 10 easily classified classes from ImageNet (tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute).

- The raw images in the Imagenette dataset have various resolutions, so we will resize them to 160x120x3.
- Two dataloaders will be created for easy batching during training and validation.

In [None]:
import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms.v2 import Compose, ToImage, RGB, Resize, ToDtype
import matplotlib.pyplot as plt

# Construct transform pipeline for input features
transform_pipeline = Compose([
    ToImage(),
    RGB(),
    Resize((160, 120)),
    ToDtype(torch.float32, scale=True),
])

# Download datasets
dataset_train = datasets.Imagenette(
    root="data",
    split="train",
    size="160px",
    download=True,
    transform=transform_pipeline,
)
labels = [c[0] for c in dataset_train.classes]

dataset_val = datasets.Imagenette(
    root="data",
    split="val",
    size="160px",
    download=True,
    transform=transform_pipeline,
)


print(f"Number of training samples: {len(dataset_train)}")
print(f"Number of validation samples: {len(dataset_val)}")
print(f"Features shape: {dataset_train[0][0].shape}")
print(f"Categories: {labels}")

# Create dataloaders
batch_size = 64
dataloader_train = DataLoader(dataset_train, batch_size=batch_size, shuffle=True)
dataloader_val = DataLoader(dataset_val, batch_size=batch_size, shuffle=False)


# Visualize data samples
sample_batch_train = next(iter(dataloader_train))
fig, axs = plt.subplots(5, 5, figsize=(10, 10))
for i in range(25):
    sample_img = sample_batch_train[0][i].permute(1, 2, 0).numpy()  # reconstruct image to (H, W, C)
    sample_cls = sample_batch_train[1][i].item()
    sample_lbl = labels[sample_cls]
    axs[i//5, i%5].imshow(sample_img)
    axs[i//5, i%5].set_title(sample_lbl)
    axs[i//5, i%5].axis("off")
plt.tight_layout()


## 2. Convolutional Neural Network (ConvNet) Construction

Design your ConvNet's architecture to classify images (shape: `(3, 160, 120)`) into 10 categories.
The size of each convolution layer's output feature map can be computed with the following equation.

$$
W_{out} = \left\lfloor \frac{W_{in} - K + 2P}{S} \right\rfloor + 1
$$
For a given dimension (horizontal or vertical), $W_{in}$ is the input length along that dimension, $K$ is the kernel/filter length, $P$ is the padding length, $S$ is the stride of the convolution, and $W_{out}$ is the output length along the same dimension. The division is rounded down when the stride does not divide evenly.

For example, given an input matrix of shape `(5, 8)`, applying a `(3, 3)` kernel with a stride of 2 to the input padded by 1 pixel outputs a `(3, 4)` matrix.

> Check this [post](https://www.geeksforgeeks.org/machine-learning/cnn-introduction-to-padding/) for more details.
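
As a quick check of the formula, the sketch below evaluates it for the example above. The helper name `conv_output_length` is hypothetical (it is not part of PyTorch), and it assumes the result is rounded down when the stride does not divide evenly.

```python
def conv_output_length(w_in: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Output length along one dimension (hypothetical helper, rounds down)."""
    return (w_in - kernel + 2 * padding) // stride + 1


# The (5, 8) input with a (3, 3) kernel, 1 pixel of padding, and a stride of 2
h_out = conv_output_length(5, kernel=3, padding=1, stride=2)
w_out = conv_output_length(8, kernel=3, padding=1, stride=2)
print((h_out, w_out))  # (3, 4)

# A 3x3 kernel with padding 1 and stride 1 keeps the length unchanged
same_length = conv_output_length(160, kernel=3, padding=1, stride=1)
print(same_length)  # 160
```

Applying the helper once per layer, for height and width separately, is one way to work out the input size of the first fully connected layer in your design.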

### <font color=#582c83> (30%) Exercise 1: ConvNet construction </font>



In [None]:
import torch.nn as nn


class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        ### START CODE HERE ###

    def forward(self, x):

        ### END CODE HERE ###
        return y


device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device}.")
model = ConvNet().to(device)  # use GPU if available
# print(model)
from torchinfo import summary
summary(model, input_size=(batch_size, 3, 160, 120))

## 3. ConvNet Training

### <font color=#582c83> (20%) Exercise 2: Training function </font>
Optimize the model for one epoch, using every sample in the dataset, organized in batches.

Repeat until all batches are used:
1. Get a batch of features and labels.
2. Make predictions.
3. Calculate the loss.
4. Compute gradients of the loss with back-propagation.
5. Update the model parameters.
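
The five steps can be illustrated on a toy problem. The sketch below is not the assignment's solution: the linear model, the random data, and every `toy_*` name are assumptions made up for illustration, and it runs on the CPU only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: 8 samples, 4 features, 3 classes
toy_model = nn.Linear(4, 3)
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)

# 1. Get a batch of features and labels (random stand-ins here)
toy_X = torch.randn(8, 4)
toy_y = torch.randint(0, 3, (8,))

# 2. Make predictions
toy_preds = toy_model(toy_X)
# 3. Calculate the loss
toy_loss = toy_loss_fn(toy_preds, toy_y)
# 4. Compute gradients with back-propagation
toy_optimizer.zero_grad()  # clear gradients left over from any previous step
toy_loss.backward()
# 5. Update the model parameters
toy_optimizer.step()

# The loss on the same batch should drop after one small update
toy_loss_after = toy_loss_fn(toy_model(toy_X), toy_y)
print(f"loss before: {toy_loss.item():.4f}, after: {toy_loss_after.item():.4f}")
```

In the real training function, the batch also has to be moved to the same device as the model before the forward pass.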

In [None]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    sum_losses, sum_correct_counts = 0, 0
    for batch_index, (X, y) in enumerate(dataloader):
        ### START CODE HERE ### (~ 6 lines)
        X, y = None, None
        # Compute prediction error
        batch_preds = None
        batch_loss = None
        # Backpropagation
        None
        None
        None
        ### END CODE HERE ###
        # Stats
        sum_losses += batch_loss.item()
        sum_correct_counts += (batch_preds.argmax(1) == y).type(torch.float).sum().item()
        # Log
        if batch_index % 50 == 0:  # print every 50 batches
            print(f"Training batch: [{(batch_index+1)*len(y):>5d}/{size:>5d}] loss: {batch_loss.item():>7f}")
    # Summarize epoch metrics
    epoch_loss = sum_losses / len(dataloader)
    epoch_accuracy = sum_correct_counts / size
    print(f"Training: \n Accuracy: {(100*epoch_accuracy):>0.1f}%, Avg loss: {epoch_loss:>8f} \n")
    return epoch_loss, epoch_accuracy



### <font color=#582c83> (10%) Exercise 3: Validation function </font>
Classification accuracy is an important metric.
Calculate it using the validation dataset.
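
Accuracy is the fraction of samples whose highest-scoring class matches the true label. The sketch below shows the counting on hand-written values; the `logits` and `targets` tensors are made-up examples, not data from Imagenette.

```python
import torch

# Hypothetical model outputs for 4 samples and 3 classes, with their true labels
logits = torch.tensor([
    [2.0, 0.1, 0.3],  # highest score: class 0
    [0.2, 0.1, 1.5],  # highest score: class 2
    [0.9, 1.1, 0.4],  # highest score: class 1
    [0.3, 0.2, 0.1],  # highest score: class 0
])
targets = torch.tensor([0, 2, 0, 0])

predicted = logits.argmax(dim=1)               # index of the largest score per row
correct = (predicted == targets).sum().item()  # number of matching predictions
accuracy = correct / len(targets)
print(f"{correct} of {len(targets)} correct, accuracy = {accuracy:.2f}")
# 3 of 4 correct, accuracy = 0.75
```

Over a whole dataset, the per-batch counts of correct predictions are summed and divided by the total number of samples.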

In [None]:
def validate(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    model.eval()
    sum_losses, sum_correct_counts = 0, 0
    with torch.no_grad():   
        for X, y in dataloader:
            ### START CODE HERE ### (~ 3 lines)
            X, y = None, None
            pred = None
            sum_losses += loss_fn(pred, y).item()
            sum_correct_counts += None
            ### END CODE HERE ###
    # Summarize epoch metrics
    epoch_loss = sum_losses / len(dataloader)
    epoch_accuracy = sum_correct_counts / size
    print(f"Validation: \n Accuracy: {(100*epoch_accuracy):>0.1f}%, Avg loss: {epoch_loss:>8f} \n")
    return epoch_loss, epoch_accuracy


### <font color=#582c83> (20%) Exercise 4: Model optimization </font>
Use appropriate hyper-parameters and functions to optimize the ConvNet model.

> - `SGD` is not the only optimization [algorithm](https://docs.pytorch.org/docs/stable/optim.html#algorithms)
> - Pick a reasonable [loss function](https://docs.pytorch.org/docs/stable/nn.html#loss-functions).

<font color=red> Validation accuracy is expected to exceed **70%** </font>
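
For a multi-class problem, `nn.CrossEntropyLoss` is one common choice, though not the only reasonable one. The sketch below shows that it takes raw logits and integer class indices, and that it matches a manual log-softmax followed by negative log-likelihood. The `demo_*` names, the random data, and the learning rate are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
demo_logits = torch.randn(5, 10)           # raw, unnormalized scores for 10 classes
demo_targets = torch.randint(0, 10, (5,))  # integer class indices, not one-hot vectors

# CrossEntropyLoss applies log-softmax internally, so the model should output raw logits
ce = nn.CrossEntropyLoss()(demo_logits, demo_targets)
manual = F.nll_loss(F.log_softmax(demo_logits, dim=1), demo_targets)
print(torch.isclose(ce, manual))  # tensor(True)

# An optimizer other than SGD, here Adam; lr=1e-3 is only a common starting point
demo_layer = nn.Linear(4, 10)
demo_optimizer = torch.optim.Adam(demo_layer.parameters(), lr=1e-3)
```

Because the loss already includes the softmax, a final softmax layer in the network is not needed when this loss is used.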

In [None]:
import torch.optim as optim
            
### START CODE HERE ### (~ 5 lines)
# Initialize model
model = ConvNet().to(device)
# Hyperparameters
loss_fn = None
optimizer = None
num_epochs = None
# Metrics storage
losses_train, losses_val = [], []
accuracies_train, accuracies_val = [], []
# Training loop
for t in range(num_epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    ep_loss_train, ep_acc_train = None, None
    ep_loss_val, ep_acc_val = None, None
    losses_train.append(ep_loss_train)
    accuracies_train.append(ep_acc_train)
    losses_val.append(ep_loss_val)
    accuracies_val.append(ep_acc_val)
print("Done!")
### END CODE HERE ###

# Visualize training metrics
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(range(num_epochs), losses_train, range(num_epochs), losses_val)
plt.legend(['Train', 'Validation'])
plt.xlabel("Epoch #")
plt.ylabel("Cross Entropy Loss")
plt.subplot(1, 2, 2)
plt.plot(range(num_epochs), accuracies_train, range(num_epochs), accuracies_val)
plt.legend(['Train', 'Validation'])
plt.xlabel("Epoch #")
plt.ylabel("Accuracy")  # stored as a fraction in [0, 1], matching the y-limits below
plt.ylim(0, 1)

# Save model
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

## 4. Test Model
There are 10 pictures prepared in the [test_images](./test_images/) directory (one test image per category).
Please classify these images using the trained model.

### <font color=#582c83> (20%) Exercise 5: New images classification </font>

In [None]:
from pathlib import Path
from torchvision.io import decode_image, ImageReadMode

# Glob test files
test_dir = Path.cwd() / "test_images"  # test images are expected under the current working directory
test_files = list(test_dir.glob("**/*.jpg"))
# print(test_files)

# Predict test images with trained model
model.eval()
with torch.no_grad():  # ensure model will not be updated
    for file in test_files:
        img_raw = decode_image(str(file), mode=ImageReadMode.RGB)  # decode image file to a uint8 tensor of shape (C, H, W)
        ### START CODE HERE ### (~ 3 lines)
        img_resize = None  # resize image to comply with model input size
        image_test = None  # rescale and add batch dimension
        pred_test = None  # predict with trained model
        ### END CODE HERE ###
        print(f"Predicted {file.name} class: {labels[pred_test.argmax().item()]}")

# Congrats on finishing this assignment!