# Intro

 - We can define computer vision as the art of teaching a computer to see.
 - Types of Computer Vision tasks:
  - Binary Classification
  - Multi-class Classification
  - Object Detection
  - Panoptic segmentation

# CV Libraries in PyTorch

- ***torchvision*** contains:
  - Datasets
  - Model architectures
  - Image transformations
- ***torchvision.datasets*** contains:
  - CV datasets
  - series of base classes for making custom datasets
- ***torchvision.models*** contains:
  - CV model architectures
- ***torchvision.transforms*** contains:
  - common image transformations (turning images into numbers, or processing or augmenting images)
---
- ***torch.utils.data.Dataset***: base dataset class for PyTorch
- ***torch.utils.data.DataLoader***: creates a Python iterable over a dataset

These last two classes aren't limited to CV tasks; they can deal with many different data types.
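As a minimal sketch of using these two classes on non-image data, the example below wraps a few rows of made-up tabular values in a custom dataset. The class name `ToyTabularDataset` and the sample values are hypothetical and exist only for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyTabularDataset(Dataset):
    """Hypothetical map-style dataset over in-memory (features, label) rows."""

    def __init__(self, features, labels):
        self.features = torch.as_tensor(features, dtype=torch.float32)
        self.labels = torch.as_tensor(labels, dtype=torch.long)

    def __len__(self):
        # Number of samples in the dataset
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (features, label) pair
        return self.features[idx], self.labels[idx]


# Made-up data: 5 samples with 2 features each
toy_data = ToyTabularDataset(
    features=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]],
    labels=[0, 1, 0, 1, 0],
)

# The same DataLoader used for images batches this data too
toy_loader = DataLoader(toy_data, batch_size=2, shuffle=False)
batch_sizes = [len(y) for _, y in toy_loader]
print(batch_sizes)
```

A map-style dataset only needs `__len__` and `__getitem__`; the `DataLoader` takes care of batching.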

In [None]:
# import dependencies

import torch
from torch import nn
import torchvision
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt

print(f"torch version: {torch.__version__}\ntorchvision version: {torchvision.__version__}")

# Getting a dataset

- CV dataset: FashionMNIST contains grayscale images of 10 different kinds of clothing.
- MNIST: Modified National Institute of Standards and Technology
- Multiclass problem
- Our task is to identify the type of clothing in an image.


In [None]:
# Training data
train_data = datasets.FashionMNIST(
    root = "data", # root directory
    download= True, # download data to the root directory
    train= True, # get the training set
    transform= ToTensor(), # transform a PIL image to tensor
    target_transform= None # if you want to transform the labels too
)

test_data = datasets.FashionMNIST(
    root="data",
    download=True,
    train= False,
    transform= ToTensor()
)

In [None]:
image, label = train_data[0]
image, label

### Shapes of input and output

- We have a tensor leading to one label.


In [None]:
image.shape

- This corresponds to: [color_channels=1, height=28, width=28]; referred to as CHW
- Sometimes images are represented as HWC instead.
- N stands for number of images in NCHW or NHWC.
- NCHW is the default that PyTorch generally expects.
- However, PyTorch's documentation describes NHWC (the *channels_last* memory format) as a way to get better performance in some settings, such as convolutional models on recent GPUs.
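To make the difference between the two layouts concrete, here is a small sketch in plain Python that computes where a single value would sit in a flat, contiguous buffer under each ordering. The helper names `offset_nchw` and `offset_nhwc` are hypothetical, and this is a simplified picture of memory layout rather than PyTorch's internal implementation.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat index of element (n, c, h, w) when memory is ordered N, C, H, W."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat index of element (n, c, h, w) when memory is ordered N, H, W, C."""
    return ((n * H + h) * W + w) * C + c


# Assumed example: 3-channel images of size 28x28
C, H, W = 3, 28, 28

# Distance in memory between channel 0 and channel 1 of the same pixel
nchw_gap = offset_nchw(0, 1, 5, 7, C, H, W) - offset_nchw(0, 0, 5, 7, C, H, W)
nhwc_gap = offset_nhwc(0, 1, 5, 7, C, H, W) - offset_nhwc(0, 0, 5, 7, C, H, W)

print(f"NCHW: channels of one pixel are {nchw_gap} elements apart")
print(f"NHWC: channels of one pixel are {nhwc_gap} element apart")
```

In NCHW, each channel is stored as a full plane, so the channels of one pixel are `H * W` elements apart. In NHWC, all channels of a pixel sit next to each other.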

In [None]:
train_data, test_data

In [None]:
len(train_data.data), len(train_data.targets)

In [None]:
len(test_data.data), len(test_data.targets)

In [None]:
train_data.classes

### Data Visualization

In [None]:
import matplotlib.pyplot as plt

In [None]:
image, label = train_data[1]
print(f"Image shape: {image.shape}")
plt.imshow(image.squeeze())
plt.title(label)

In [None]:
# using the grayscale

plt.imshow(image.squeeze(), cmap="gray")
plt.title(train_data.classes[label])

In [None]:
# plotting more images

torch.manual_seed(42)

fig = plt.figure(figsize=(9,9))
rows, cols = 4, 4

for i in range(1, rows*cols+1):
    random_idx = torch.randint(0, len(train_data), size=[1]).item()
    img, label = train_data[random_idx]
    fig.add_subplot(rows, cols, i)
    plt.imshow(img.squeeze(), cmap="gray")
    plt.title(train_data.classes[label])
    plt.axis(False);

- Find patterns based on the pixel values.
- 60,000 images is considered a small dataset in deep learning.
- Goal: classify each image.

# Create DataLoader

- DataLoader helps load data into a model for training and inference.
- Large datasets are split into smaller chunks called mini-batches.
- This is computationally more efficient when dealing with large datasets.
- *batch_size* hyperparameter: sets the mini-batch size. 32 is a good starting point, and powers of 2 are commonly used.
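As a quick sanity check on what a given *batch_size* implies, the sketch below computes how many mini-batches to expect for FashionMNIST's 60,000 training and 10,000 test images. The helper `num_batches` is a hypothetical name; it mirrors the usual rule that the last batch may be smaller unless incomplete batches are dropped.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Number of mini-batches needed to cover num_samples."""
    if drop_last:
        # Incomplete final batch is discarded
        return num_samples // batch_size
    # Incomplete final batch is kept
    return math.ceil(num_samples / batch_size)


train_batches = num_batches(60_000, 32)
test_batches = num_batches(10_000, 32)

# Size of the final (possibly smaller) test batch
last_test_batch = 10_000 - 32 * (test_batches - 1)

print(train_batches, test_batches, last_test_batch)
```

These counts should match `len(train_dataloader)` and `len(test_dataloader)` for a batch size of 32.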

In [None]:
from torch.utils.data import DataLoader

In [None]:
BATCH_SIZE = 32

train_dataloader = DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True
)

test_dataloader = DataLoader(
    dataset=test_data,
    batch_size=BATCH_SIZE,
    shuffle=False
)

In [None]:
print(train_dataloader, test_dataloader)

In [None]:
print(len(train_dataloader), BATCH_SIZE)
print(len(test_dataloader), BATCH_SIZE)

In [None]:
train_features_batch, train_labels_batch = next(iter(train_dataloader))
train_features_batch.shape, train_labels_batch.shape

In [None]:
# Check one sample

torch.manual_seed(42)

rand_idx = torch.randint(0, len(train_features_batch), size=[1]).item()
img, label = train_features_batch[rand_idx], train_labels_batch[rand_idx]
plt.imshow(img.squeeze(), cmap="gray")
plt.title(train_data.classes[label])
plt.axis("off");
print(f"Image size: {img.shape}")
print(f"Label: {label}")

# Baseline Model

- We use *nn.Module* to build a baseline model (the simplest model we can think of).
- Start with the baseline model, then add complexity as needed.
- The *nn.Flatten()* layer compresses the dimensions of image data (a tensor) into a single long vector.
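The shape arithmetic behind flattening can be sketched in plain Python. The helper `flatten_shape` is hypothetical; it assumes the default behavior of *nn.Flatten()*, which keeps the first dimension and merges all the remaining ones.

```python
import math


def flatten_shape(shape, start_dim=1):
    """Shape after merging every dimension from start_dim onward."""
    kept = tuple(shape[:start_dim])
    merged = math.prod(shape[start_dim:])
    return kept + (merged,)


# A single image: [color_channels, height, width]
single_image = flatten_shape((1, 28, 28))

# A batch of images: [batch_size, color_channels, height, width]
image_batch = flatten_shape((32, 1, 28, 28))

print(single_image, image_batch)
```

The value 784 (1 × 28 × 28) is the number of input features the first linear layer has to accept.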

In [None]:
# Try nn.Flatten()

x = train_features_batch[0]

output = nn.Flatten()(x)

print(f"Shape before flattening: {x.shape} -> [color_channels, height, width]")
print(f"Shape after flattening: {output.shape} -> [color_channels, height*width]")

In [None]:
from torch import nn

In [None]:
class FashionMNISTModelV0(nn.Module):
    def __init__(self, input_shape:int, hidden_units:int, output_shape:int):
        super().__init__()
        self.layers_stack= nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features= input_shape, out_features= hidden_units),
            nn.Linear(in_features= hidden_units, out_features= output_shape)
        )

    def forward(self, x):
        return self.layers_stack(x)

In [None]:
class_names= train_data.classes
class_names

In [None]:
# Instantiate a model
torch.manual_seed(42)

model_0= FashionMNISTModelV0(
    input_shape= 784,
    hidden_units= 10,
    output_shape= len(class_names)
)

model_0.to("cpu")

### Setup loss, optimizer, and evaluation metrics

- Download the helper functions file if it isn't already present.

In [None]:
import requests
from pathlib import Path

In [None]:
if Path("helper_functions.py").is_file():
    print("helper_functions.py already exists.")
else:
    print("Downloading helper_functions.py")
    request= requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py") # raw github url
    with open("helper_functions.py", "wb") as f:
        f.write(request.content)

In [None]:
!pip install -q torchmetrics

In [None]:
import torchmetrics

In [None]:
loss_fn= nn.CrossEntropyLoss()
optimizer= torch.optim.SGD(params= model_0.parameters(), lr= 0.1)
accuracy_fn= torchmetrics.Accuracy(task= "multiclass", num_classes= len(class_names)).to("cpu")

### Measure time of our experiments

- We can compare time of training on CPU vs GPU.

In [None]:
from timeit import default_timer as timer

In [None]:
def print_train_time(start: float, end:float, device: torch.device= None):
    """Prints difference between start and end time.

    Args:
        start (float): start time of computation.
        end (float): end time of computation.
        device (torch.device, optional): device the computation is running on. Defaults to None.
    """

    total_time= end - start
    print(f"Train time on {device}= {total_time:.3f} seconds")
    return total_time

### Training

- Since we are dealing with batches, there will be nested loops.
- Loss and evaluation metrics are calculated per batch rather than over the whole dataset. So, at the end, we divide the accumulated loss and metric by the number of batches to get per-batch averages.
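One subtlety of averaging per batch is worth a small worked example: when the last batch is smaller than the others, the mean of the per-batch values is not exactly the dataset-wide value. The sketch below uses made-up correctness flags and a hypothetical `chunk` helper to show the difference.

```python
def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up results for 10 samples: 1 = correct prediction, 0 = wrong
correct = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
batches = chunk(correct, 4)  # batch sizes: 4, 4, 2

# Accuracy of each batch on its own
batch_acc = [sum(b) / len(b) for b in batches]

# What the training loop computes: sum of batch accuracies / number of batches
mean_of_batch_means = sum(batch_acc) / len(batch_acc)

# Weighting each batch by its size recovers the dataset-wide accuracy
weighted_acc = sum(a * len(b) for a, b in zip(batch_acc, batches)) / len(correct)
true_acc = sum(correct) / len(correct)

print(mean_of_batch_means, weighted_acc, true_acc)
```

With many batches of equal size and only one smaller final batch, the gap is usually small, so the per-batch average is a reasonable approximation for comparing models.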

In [None]:
from tqdm.auto import tqdm # Gives progress bar

In [None]:
torch.manual_seed(42)
train_start_time_cpu = timer()

EPOCHS= 3 # small number for faster training

for epoch in range(EPOCHS):

    print(f"Epoch: {epoch}\n-------")

    ### Training ###

    train_loss= 0

    for batch, (X,y) in enumerate(train_dataloader):

        model_0.train()

        train_pred = model_0(X)

        curr_loss = loss_fn(train_pred, y)
        train_loss += curr_loss.item()  # .item() avoids keeping autograd history

        optimizer.zero_grad()

        curr_loss.backward()

        optimizer.step()

        if batch % 400 == 0:
            print(f"Looked at {batch*len(X)}/{len(train_dataloader.dataset)} samples")

    # Calculate average train loss per batch
    train_loss /= len(train_dataloader)

    ### Testing ###

    test_loss, test_acc = 0, 0  # to accumulate testing loss and accuracy

    model_0.eval()

    with torch.inference_mode():

        for X,y in test_dataloader:

            test_pred = model_0(X)

            test_loss += loss_fn(test_pred, y)

            test_acc += accuracy_fn(
                preds = test_pred,
                target = y
            )

        # Calculate average test loss per batch
        test_loss /= len(test_dataloader)

        # Calculate average accuracy per batch
        test_acc /= len(test_dataloader)

    print(f"\nTrain loss: {train_loss:.5f} | Test loss: {test_loss:0.5f} | Accuracy: {test_acc:0.5f}")

train_end_time_cpu = timer()

total_train_time_model_0 = print_train_time(
    start = train_start_time_cpu,
    end = train_end_time_cpu,
    device = str(next(model_0.parameters()).device)
)

### Inference and evaluation

- Create a function that takes:
  1. a trained model
  2. dataloader
  3. loss function
  4. accuracy function
- The function should use the data in the dataloader and the model to make predictions and then evaluate those predictions with the loss and accuracy functions.
- Use the results of this function to compare different models.


In [None]:
torch.manual_seed(42)

def eval_model(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               accuracy_fn: torchmetrics.Metric):

    """Returns a dictionary containing the results of the model's predictions on data_loader.

    Args:
        model (torch.nn.Module): A PyTorch model capable of making predictions on the data_loader
        data_loader (torch.utils.data.DataLoader): The target dataset to predict on.
        loss_fn (torch.nn.Module): A function to calculate the loss for the model's predictions.
        accuracy_fn (torchmetrics.Metric): A metric to calculate accuracy between the model's predictions and the true labels.

    Returns:
        (dict): The model's class name, average loss per batch, and average accuracy per batch.
    """

    loss, acc = 0, 0

    model.eval()

    with torch.inference_mode():
        for X, y in data_loader:
            preds = model(X)
            loss += loss_fn(preds, y)
            acc += accuracy_fn(
                preds = preds,
                target = y
            )

        # Average loss and accuracy per batch
        loss /= len(data_loader)
        acc /= len(data_loader)

    return {
        "model_name": model.__class__.__name__,
        "model_loss": loss.item(),
        "model_accuracy": acc.item()
    }

In [None]:
# Evaluate model_0

res = eval_model(
    model = model_0,
    data_loader = test_dataloader,
    loss_fn = loss_fn,
    accuracy_fn = accuracy_fn
)

res

# Non-linear Model

In [None]:
# Set up device-agnostic code

device= "cuda" if torch.cuda.is_available() else "cpu"
device

- Does our data need non-linearity?
  - Linear = straight
  - Non-linear = non-straight
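One way to see why non-linearity matters is that two stacked linear layers collapse into a single linear layer, so stacking them adds no expressive power on its own. The sketch below checks this with small made-up integer weights in plain Python. All helper names and values are hypothetical and chosen only for illustration.

```python
def matvec(W, x):
    """Matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]


def relu(v):
    return [max(0, vi) for vi in v]


# Made-up weights and biases for two tiny linear layers
W1, b1 = [[1, -2], [3, 0]], [1, -1]
W2, b2 = [[2, 1]], [0]


def two_linear(x):
    # Linear -> Linear, no activation in between
    return add(matvec(W2, add(matvec(W1, x), b1)), b2)


# The equivalent single layer: W = W2 @ W1, b = W2 @ b1 + b2
W, b = matmul(W2, W1), add(matvec(W2, b1), b2)


def one_linear(x):
    return add(matvec(W, x), b)


def with_relu(x):
    # Linear -> ReLU -> Linear
    return add(matvec(W2, relu(add(matvec(W1, x), b1))), b2)


x = [1, 2]
print(two_linear(x), one_linear(x), with_relu(x))
```

Without an activation the two-layer stack gives the same output as the single collapsed layer, while inserting a ReLU between the layers changes the result for inputs that produce negative intermediate values.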

In [None]:
class FashionMNISTModelV1(nn.Module):

    def __init__(self, input_shape:int, hidden_units:int, output_shape:int):
        super().__init__()
        self.layers_stack = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.ReLU(),
            nn.Linear(in_features=hidden_units, out_features=output_shape),
            nn.ReLU()
        )

    def forward(self, x:torch.Tensor):
        return self.layers_stack(x)

- In ML, it is good practice to start with a baseline model and then experiment by changing one thing at a time.
- This time, we added non-linear functions.

In [None]:
# Instantiate model_1

torch.manual_seed(42)

model_1 = FashionMNISTModelV1(
    input_shape = 784,
    hidden_units = 10,
    output_shape = len(class_names)
).to(device)

next(model_1.parameters()).device

### Setup loss, optimizer, and evaluation metrics

In [None]:
loss_fn = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(
    params = model_1.parameters(),
    lr = 0.1
)

accuracy_fn = torchmetrics.Accuracy(
    task = "multiclass",
    num_classes = len(class_names)
).to(device)

### Wrapping up training and testing loops

- Create functions that can be repeatedly called:
  - train() will take:
    - model
    - DataLoader
    - loss function
    - optimizer
  - test() will take:
    - model
    - DataLoader
    - loss function
    - evaluation function

In [None]:
def train(model: torch.nn.Module,
          data_loader: torch.utils.data.DataLoader,
          loss_fn: torch.nn.Module,
          optimizer: torch.optim.Optimizer,
          accuracy_fn: torchmetrics.Metric,
          device: torch.device = device):
    """Loop through your data to train your model"""

    train_loss, train_acc = 0, 0
    model.to(device)
    model.train()  # put the model in training mode

    for batch, (X, y) in enumerate(data_loader):

        # .to() returns a new tensor, so the result must be reassigned
        X, y = X.to(device), y.to(device)

        preds = model(X)

        loss = loss_fn(preds, y)
        train_loss += loss.item()
        train_acc += accuracy_fn(preds=preds, target=y).item()

        optimizer.zero_grad()

        loss.backward()

        optimizer.step()

    # Average loss and accuracy per batch

    train_loss /= len(data_loader)
    train_acc /= len(data_loader)

    print(f"Train Loss: {train_loss:.5f} | Train Accuracy: {train_acc:.5f}")


If you find something wrong, please contact me at: muhammadhelmymmo@gmail.com