<a href="https://colab.research.google.com/github/christophergaughan/PyTorch/blob/main/ComputerVision_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Computer Vision Using PyTorch

**Basis**

Pixels are read as RGB color values and turned into numbers (tensors), i.e. a `numerical encoding` --> model (algorithm) --> output probability that the image is X or Y or Z

**Details**
 For an image, the shape of the tensor encodes the following information:
 1. Width of image
 2. Height of image
 3. Color channels == 3 (RGB)

 Depending on the library or algorithm you're working with, image tensors are laid out in one of two orders:

 [batch_size, height, width, color_channels] OR [batch_size, color_channels, height, width]
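
As a minimal sketch of moving between the two layouts, assume a random batch standing in for real images (the tensor names here are illustrative, not from the notebook). `permute` reorders the dimensions without changing the underlying values:

```python
import torch

# Hypothetical batch of 32 RGB images, 64x64, in channels-last layout
batch_nhwc = torch.rand(32, 64, 64, 3)   # [batch_size, height, width, color_channels]

# PyTorch layers such as nn.Conv2d expect channels-first layout
batch_nchw = batch_nhwc.permute(0, 3, 1, 2)  # [batch_size, color_channels, height, width]

print(batch_nhwc.shape)  # torch.Size([32, 64, 64, 3])
print(batch_nchw.shape)  # torch.Size([32, 3, 64, 64])
```

PyTorch defaults to the channels-first order, which is why the FashionMNIST tensors later in this notebook have the color channel before height and width.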

 The models that consume these tensors will mainly be CNNs (convolutional neural networks)

 We will be working with `torch.nn.Conv2d`

## Computer vision libraries in PyTorch

* `torchvision` - base domain library for PyTorch computer vision:
  https://pytorch.org/vision/stable/index.html
* `torchvision.datasets` - datasets and data-loading functions:
  https://pytorch.org/vision/stable/datasets.html#built-in-datasets
* `torchvision.models` - pre-trained computer vision models, i.e. models with pretrained weights that you can leverage for your own problems.
* `torchvision.transforms` - functions for manipulating your vision data (images) to be suitable for use with an ML model.
* `torch.utils.data.Dataset` - base dataset class for PyTorch.
* `torch.utils.data.DataLoader` - creates a Python iterable over a dataset.

Torchvision supports common computer vision transformations in the torchvision.transforms and torchvision.transforms.v2 modules. Transforms can be used to transform or augment data for training or inference of different tasks (image classification, detection, segmentation, video classification).

* PIL is the Python Imaging Library by Fredrik Lundh and contributors.

### torchvision.datasets

All datasets are subclasses of `torch.utils.data.Dataset`, i.e. they have `__getitem__` and `__len__` methods implemented. Hence, they can all be passed to a `torch.utils.data.DataLoader`, which can load multiple samples in parallel using `torch.multiprocessing` workers. For example:
```
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)
```
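
To make the `__getitem__`/`__len__` contract concrete, here is a minimal sketch of a custom dataset. `ToyImageDataset` and its random contents are hypothetical stand-ins, not part of torchvision:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyImageDataset(Dataset):
    """Hypothetical dataset of random grayscale images with integer labels."""

    def __init__(self, num_samples: int = 10):
        self.images = torch.rand(num_samples, 1, 28, 28)     # [N, C, H, W]
        self.labels = torch.randint(0, 10, (num_samples,))   # class indices 0..9

    def __len__(self):
        # Number of samples in the dataset
        return len(self.images)

    def __getitem__(self, idx):
        # Return one (image, label) pair
        return self.images[idx], self.labels[idx]

toy_data = ToyImageDataset(num_samples=10)
toy_loader = DataLoader(toy_data, batch_size=4, shuffle=False)

# The DataLoader stacks individual samples into batches
for X, y in toy_loader:
    print(X.shape, y.shape)
```

With 10 samples and `batch_size=4`, the loader yields two full batches and one final batch of 2, because `drop_last` defaults to `False`.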

In [None]:
import torch
import torchvision
from torchvision import datasets
from torchvision import transforms
from torchvision.transforms import ToTensor
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)
print(torchvision.__version__)

## Getting a dataset

We will be using the `FashionMNIST` dataset: greyscale images of clothing. It is a basic dataset, well suited to a first implementation.

Be aware that ImageNet is the gold standard for computer vision evaluations.

`torchvision.datasets.FashionMNIST(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None[source]`

### Fashion-MNIST Dataset.

Parameters:
* **root (string)** – Root directory of dataset where FashionMNIST/processed/training.pt and FashionMNIST/processed/test.pt exist.
* **train (bool, optional)** – If True, creates dataset from training.pt, otherwise from test.pt.
* **download (bool, optional)** – If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
* **transform (callable, optional)** – A function/transform that takes in a PIL image and returns a transformed version. E.g., `transforms.RandomCrop`
* **target_transform (callable, optional)** – A function/transform that takes in the target and transforms it.

In [None]:
# Setup Training data
train_data = datasets.FashionMNIST(
    root="data", # where to download data to
    train=True, # do we want the training dataset?
    download=True, # do we want to download?
    transform=torchvision.transforms.ToTensor(), # how to transform the data
    target_transform=None # how do we want to transform the labels/target
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=torchvision.transforms.ToTensor(),
    target_transform=None
)



In [None]:
len(train_data), len(test_data)

In [None]:
# See the first training data- this will output the data as tensors (C x H x W) NOTE: grey scale images only have 1 color channel
image, label = train_data[0]
image, label

In [None]:
class_names = train_data.classes
class_names

In [None]:
class_to_idx = train_data.class_to_idx
class_to_idx

In [None]:
train_data.targets

In [None]:
# Check shape of our image
print(f"Image Shape: {image.shape} --> [color_channels, height, width], Image Label: {class_names[label]}")

## Visualizing our data

In [None]:
image, label = train_data[0]
print(f"Image Shape: {image.shape}")
plt.imshow(image.squeeze(), cmap="gray") # squeeze() removes the channel dimension so it will plot: [1, 28, 28] -> [28, 28]
plt.title(class_names[label])
plt.axis("off")

In [None]:
# Plot more images
torch.manual_seed(42)
fig = plt.figure(figsize=(9, 9))
row, cols = 4, 4
for i in range(1, row * cols + 1):
    random_idx = torch.randint(0, len(train_data), size=[1]).item()
    img, label = train_data[random_idx]
    fig.add_subplot(row, cols, i)
    plt.imshow(img.squeeze(), cmap="gray")
    plt.title(class_names[label])
    plt.axis(False)

## Check Input/Output shapes of Data

In [None]:
print(f"Image Shape: {image.shape}")
print(f"Image Label: {class_names[label]}")

Visualizing data

In [None]:
image, label = train_data[0]
print(f"Image Shape: {image.shape}")
plt.imshow(image.squeeze(), cmap="plasma") # matplotlib expects [H, W] or channels-last [H, W, C]; squeeze() drops the 1 in [1, 28, 28]
plt.title(class_names[label])
plt.axis("off")

In [None]:
from matplotlib import colormaps
list(colormaps)

In [None]:
# Plot more images
torch.manual_seed(42)
fig = plt.figure(figsize=(9, 9))
row, cols = 4, 4
for i in range(1, row * cols + 1):
    random_idx = torch.randint(0, len(train_data), size=[1]).item()
    img, label = train_data[random_idx]
    fig.add_subplot(row, cols, i)
    plt.imshow(img.squeeze(), cmap="gray")
    plt.title(class_names[label])
    plt.axis(False);

Could these items of clothing (images) be modelled with straight (linear) lines only? Or will we have to introduce some non-linearity? Just a thought.

In [None]:
train_data, test_data

## Prepare DataLoader

Right now, our data is in the form of PyTorch Datasets.

A `DataLoader` turns our dataset into a Python iterable.

More specifically, we want to turn our data into batches (or mini-batches).

Q) Why do we do this?

A) The data takes up memory, and we have 60,000 training images and 10,000 testing images. To alleviate this memory load, we break the data up into batches. More specifically:

1. It is more computationally efficient: your computing hardware may not be able to look at (store in memory) 60,000 images at once, so we break them up into batches of 32 (`batch_size=32`), a very common batch size.
2. It gives our neural network more chances to update its parameters per epoch: one gradient step per batch instead of one per pass over the data. See this video by Andrew Ng for more info: https://www.youtube.com/watch?v=4qJaSmvhxi8
3. One parameter of the `DataLoader` is `shuffle`. We shuffle the training data in case it has some pre-determined order. This randomizes the order in which the training loop sees the images, so the model doesn't learn that order instead of the image features.
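
As a quick arithmetic check on point 1, the number of batches per epoch can be computed directly. This is a small sketch; `num_batches` is a hypothetical helper, and it assumes the `DataLoader` default of keeping the final partial batch (`drop_last=False`):

```python
import math

def num_batches(num_samples: int, batch_size: int) -> int:
    # With drop_last=False, a final smaller batch is kept, hence the ceiling
    return math.ceil(num_samples / batch_size)

print(num_batches(60_000, 32))  # 1875 training batches
print(num_batches(10_000, 32))  # 313 test batches (the last one holds 16 images)
```

So one epoch over the training set gives the optimizer 1875 update steps rather than a single one.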


In [None]:
# Batchify our dataset
from torch.utils.data import DataLoader
BATCH_SIZE = 32
# Turn our datasets into iterables (batches)
train_dataloader = DataLoader(dataset=train_data,
                              batch_size=BATCH_SIZE,
                              shuffle=True)

test_dataloader = DataLoader(dataset=test_data,
                             batch_size=BATCH_SIZE,
                             shuffle=False) # we don't shuffle the test dataset

train_dataloader, test_dataloader

In [None]:
# Let's check out what we've created
print(f"Length of train dataloader: {len(train_dataloader)} batches of {BATCH_SIZE}")
print(f"Length of test dataloader: {len(test_dataloader)} batches of {BATCH_SIZE}")

## Check out what is inside the training dataloader

In [None]:
train_features_batch, train_labels_batch = next(iter(train_dataloader))
train_features_batch.shape, train_labels_batch.shape

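Note above that the color channels come first: the feature batch has shape `[32, 1, 28, 28]`, i.e. `[batch_size, color_channels, height, width]`.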

In [None]:
torch.manual_seed(42)
random_idx = torch.randint(0, len(train_features_batch), size=[1]).item()
image, label = train_features_batch[random_idx], train_labels_batch[random_idx]
plt.imshow(image.squeeze(), cmap="gray") # plot the sample drawn from this batch, not the stale `img` from the earlier loop
plt.title(class_names[label])
plt.axis(False)
print(f"Image Shape: {image.shape}")
print(f"Label: {label}, label_size: {label.shape}")

## Model 0: Build a baseline model

When starting to build a series of machine learning modelling experiments, it's best practice to start with a *baseline model*.

A baseline model is a model you will try to improve upon with subsequent models/experiments.

In other words: start simply and add/experiment with complexity when necessary 🧪

In [None]:
# Create a flattened layer
flatten_model = torch.nn.Flatten()

# Get a single sample
x = train_features_batch[0]
x.shape
# Flatten the sample
output = flatten_model(x)

# Print out what happened
print(f"Shape before flattening: {x.shape} -> [color_channels, height, width]")
print(f"Shape after flattening: {output.shape} -> [color_channels, height*width]")

We can see that the leading dimension (here the single color channel) is kept, while height and width are collapsed into 28 x 28 = 784 features.
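
The same layer applied to a whole batch, as a minimal sketch with a random tensor standing in for `train_features_batch`:

```python
import torch
from torch import nn

flatten = nn.Flatten()  # default start_dim=1: everything after the first dimension is flattened

# Hypothetical batch shaped like the FashionMNIST batches: [batch_size, color_channels, height, width]
batch = torch.rand(32, 1, 28, 28)
flat = flatten(batch)

print(flat.shape)  # torch.Size([32, 784])
```

This is why the baseline model below uses `input_shape=784`: each image reaches the first `nn.Linear` layer as a vector of 1 * 28 * 28 values.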

In [None]:
import torch
from torch import nn

torch.manual_seed(42)

class FashionMNISTModelV0(nn.Module):  # Inherit from nn.Module
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.layer_stack = nn.Sequential(  # layers are applied in order
            nn.Flatten(),  # [batch, 1, 28, 28] -> [batch, 784]
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.Linear(in_features=hidden_units, out_features=output_shape)
        )

    def forward(self, x):
        return self.layer_stack(x)  # raw logits, one per class


In [None]:
model_0 = FashionMNISTModelV0(
    input_shape=784,
    hidden_units=10,
    output_shape=len(class_names)
).to("cpu")  # Move model to CPU
print(model_0)

In [None]:
dummy_x = torch.rand([1, 1, 28, 28])
model_0(dummy_x)

In [None]:
model_0.state_dict()

## Setup loss, optimizer and evaluation metrics

* Loss function - since we're working with multi-class data, our loss function will be `nn.CrossEntropyLoss()`
* Optimizer - our optimizer will be `torch.optim.SGD()` (stochastic gradient descent)
* Evaluation metric - since this is a classification problem, we'll use accuracy
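
To show what these two measurements compute, here is a small sketch on hand-picked logits (the values are made up for illustration). `nn.CrossEntropyLoss` takes raw logits and integer class labels and is equivalent to the mean negative log-softmax of the correct class; accuracy is the percentage of predictions that match the labels:

```python
import torch
from torch import nn

# Two samples, three classes (illustrative values)
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 3.0]])
labels = torch.tensor([0, 1])

# Built-in loss on raw logits
loss_builtin = nn.CrossEntropyLoss()(logits, labels)

# The same quantity computed by hand: mean of -log(softmax) at the true class
log_probs = torch.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(len(labels)), labels].mean()

# Accuracy: predicted label is the index of the largest logit
preds = logits.argmax(dim=1)                               # tensor([0, 2])
accuracy = (preds == labels).float().mean().item() * 100   # 50.0

print(torch.allclose(loss_builtin, loss_manual), accuracy)
```

The first sample is classified correctly and the second is not, so the accuracy is 50%.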


In [None]:
import requests
from pathlib import Path

# Download helper functions (including accuracy_fn) from the Learn PyTorch repo
if Path("helper_functions.py").is_file():
  print("helper_functions.py already exists, skipping download....")
else:
  print("Downloading helper_functions.py")
  request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
  with open("helper_functions.py", "wb") as f:
    f.write(request.content)

In [None]:
# Import accuracy metric
from helper_functions import accuracy_fn

In [None]:
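# Sanity check: the highest score is at index 1 but the true label is 2, so accuracy should be 0%
accuracy_fn(y_true=torch.tensor([2]), y_pred=torch.tensor([[0.2, 0.5, 0.3]]).argmax(dim=1))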

In [None]:
# Setup loss and optimizer functions
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)

## Creating a function to time our experiments

We need to be cognizant of the fact that Machine/Deep Learning is very experimental. These experiments can be very costly with respect to the resources that they require in terms of memory and GPU usage. When scaled up to very large jobs, you might find that the added complexity also comes at the cost of <u>*time*</u>.

Thus there are two main things we'll keep track of (we'll find there is a trade-off between them):
1. Model's performance (loss and accuracy values, etc.)
2. How fast the model runs.

We are already tracking our model with respect to the loss function and accuracy; let's explore the time dimension below. Since we'll be using `timeit`, here's where to find the documentation: https://docs.python.org/3/library/timeit.html

The default timer, which is always `time.perf_counter()`, returns float seconds. An alternative, `time.perf_counter_ns`, returns integer nanoseconds.

```python
class timeit.Timer(stmt='pass', setup='pass', timer=<timer function>, globals=None)
```

In [None]:
from timeit import default_timer as timer

def print_train_time(start: float, end: float, device: torch.device = None):

    '''
    prints difference between start and end time
    '''
    total_time = end - start
    print(f"Train time on {device}: {total_time:.3f} seconds")
    return total_time

In [None]:
start_time = timer()
end_time = timer()
print_train_time(start=start_time, end=end_time, device=None)
print_train_time(start=start_time, end=end_time, device="cpu")

## Creating a training loop and training a model on batches of the data
Remember: the optimizer will update a model's parameters once per batch rather than once per epoch.

Key steps:
1. Loop through the epochs
2. Loop through training batches, perform training steps, calculate the loss *per batch*
3. Loop through testing batches, perform testing steps, calculate the loss *per batch*
4. Print out what's happening
5. Time it all

### NOTE: Below we iterate over the batches and keep a running total of `train_loss`. Here are some specific details about the `enumerate()` function and how it is being used:

1. `train_dataloader`: This is an iterable object, such as a PyTorch `DataLoader`, which provides batches of data (`X`) and corresponding labels (`y`) for training our machine learning model.

2. `enumerate(train_dataloader)`: The `enumerate` function iterates over `train_dataloader` and, in addition to yielding each batch of data `(X, y)`, it also provides an index (`batch`) for the current iteration. The `batch` variable represents the batch number, starting from 0 by default.

### Purpose of enumerate in this loop:
1. Tracking batch indices: The `batch` variable allows you to keep track of which batch is being processed. This can be useful for:
   * Logging or debugging (e.g., printing the batch number during training).
   * Performing specific actions at certain batch intervals (e.g., saving a model every 100 batches).
   * Analyzing batch-specific metrics.
2. Improved readability: By using `enumerate`, you don't have to manually maintain a counter variable and increment it in each iteration. It keeps the code concise and clean.

**Here’s how it might be used in practice in the generic sense:**
```
for batch, (X, y) in enumerate(train_dataloader):
    print(f"Processing batch {batch}")
    # Perform training step
    output = model(X)
    loss = loss_function(output, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

```
Here:
* `batch` keeps track of the current batch number.
* `(X, y)` contains the features (independent vars) and labels (target) for that batch.
#### Using `enumerate` is a common practice in Python loops whenever you need both the index and the elements of an iterable.

In [None]:
# import tqdm for progress bar- .auto recognizes programming environment
from tqdm.auto import tqdm

# Set seed and start timer
torch.manual_seed(42)
train_time_start_on_cpu = timer()

#Set the number of epochs (keep small for faster training time)
epochs = 3

# Create training and test loop
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n-------")
    # Training
    train_loss = 0
    # Add loop through training batches
    for batch, (X, y) in enumerate(train_dataloader):
        model_0.train()
        # Forward pass
        y_pred = model_0(X)

        # Calculate loss (per batch)
        loss = loss_fn(y_pred, y)
        # Accumulate the training loss per batch (.item() stores a plain float, so each batch's graph can be freed)
        train_loss += loss.item()
        # Optimizer zero grad
        optimizer.zero_grad()
        # Loss backward
        loss.backward()
        # Optimizer step
        optimizer.step() # updating our models parameters per batch

        # Print out how many samples have been seen
        if batch % 400 == 0:
            print(f"Looked at {batch * len(X)}/{len(train_dataloader.dataset)} samples")

    # Come back to the epoch loop and divide total train loss by length of train dataloader
    train_loss /= len(train_dataloader)

    # Testing
    test_loss, test_acc = 0, 0
    model_0.eval()
    with torch.inference_mode():
        for X_test, y_test in test_dataloader:
            # Forward pass
            test_pred = model_0(X_test)

            # calculate the loss (accumulated)
            test_loss += loss_fn(test_pred, y_test)

            # Calculate the accuracy (accumulated)
            test_acc += accuracy_fn(y_true=y_test, y_pred=test_pred.argmax(dim=1)) # argmax returns the index of the highest logit, which is the predicted label

        # Scale loss and acc
        test_loss /= len(test_dataloader)
        #Calculate the test accuracy
        test_acc /= len(test_dataloader)

    # print out what's happening
    print(f"\nTrain loss: {train_loss:.4f} | Test loss: {test_loss:.4f}, Test acc: {test_acc:.4f}")

    # Calculate training time
    train_time_end_on_cpu = timer()
    total_train_time_model_0 = print_train_time(start=train_time_start_on_cpu, end=train_time_end_on_cpu, device=str(next(model_0.parameters()).device))


## Evaluate model_0 and make predictions: functionalizing this step for use on any model

Also note that `argmax` finds the index of the highest logit value. The raw outputs of our model are logits. If we wanted prediction probabilities we could apply the softmax function first, but softmax preserves the ordering of the logits, so taking the `argmax` of the logits directly gives the same label.
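
A short sketch of that equivalence, using random logits in place of real model outputs (the seed and shapes are arbitrary choices for illustration):

```python
import torch

torch.manual_seed(0)

# Hypothetical raw model outputs: 4 samples, 10 classes
logits = torch.randn(4, 10)

# Softmax turns each row into probabilities that sum to 1
probs = torch.softmax(logits, dim=1)

# Softmax is monotonic, so the largest logit is also the largest probability
labels_from_logits = logits.argmax(dim=1)
labels_from_probs = probs.argmax(dim=1)

print(torch.equal(labels_from_logits, labels_from_probs))
```

Skipping softmax saves a little computation when only the predicted label is needed; the probabilities are still useful when you want to know how confident the model is.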

In [None]:
torch.manual_seed(42)
def eval_model(model: torch.nn.Module, data_loader: torch.utils.data.DataLoader, loss_fn: torch.nn.Module, accuracy_fn):
    '''
    Returns a dictionary containing the results of model predicting on data_loader
    '''
    loss, acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in tqdm(data_loader):
            # Make predictions
            y_pred = model(X) # uses the model passed in as an argument rather than a hard-coded model_0

            # Accumulate the loss and acc values per batch
            loss += loss_fn(y_pred, y)
            acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))

        # Scale loss and acc to find the average loss/acc per batch
        loss /= len(data_loader)
        acc /= len(data_loader)
    return {"model_name": model.__class__.__name__, # only works when model was created with a class
            "model_loss": loss.item(),
            "model_acc": acc}

# Calculate model 0 results on test dataset
model_0_results = eval_model(model=model_0,
                             data_loader=test_dataloader,
                             loss_fn=loss_fn,
                             accuracy_fn=accuracy_fn)
model_0_results

Set up device-agnostic code so we can run on a GPU if one is available

In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

In [None]:
!nvidia-smi

## Our model did OK using no non-linearity; now we will employ some non-linear functions

In past notebooks we've learned about the power of non-linearity in modelling data. Let's put that to the test.
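
One way to see why the baseline is limited: two stacked `nn.Linear` layers with nothing in between are mathematically the same as a single linear layer. A minimal sketch, with small made-up layer sizes:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two linear layers stacked directly, as in the baseline model (sizes are illustrative)
lin1 = nn.Linear(4, 3)
lin2 = nn.Linear(3, 2)
x = torch.rand(5, 4)

stacked = lin2(lin1(x))

# Collapse them into one weight matrix and one bias vector:
# W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W = lin2.weight @ lin1.weight              # shape [2, 4]
b = lin2.weight @ lin1.bias + lin2.bias    # shape [2]
collapsed = x @ W.T + b

print(torch.allclose(stacked, collapsed, atol=1e-6))
```

Placing a non-linear activation such as `nn.ReLU()` between the layers breaks this equivalence, which is what lets the network represent functions a single linear layer cannot.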

In [None]:
import torch
from torch import nn

class FashionMNISTModelV1(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        """
        Initializes the FashionMNISTModelV1 model.

        Args:
            input_shape (int): The number of input features (e.g., 28*28 for flattened images).
            hidden_units (int): The number of hidden units in the first linear layer.
            output_shape (int): The number of output features (e.g., 10 for FashionMNIST classes).
        """
        super(FashionMNISTModelV1, self).__init__()
        self.layer_stack = nn.Sequential(
            nn.Flatten(),  # Flatten the inputs into a single vector
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.ReLU(),
            nn.Linear(in_features=hidden_units, out_features=output_shape),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor):
        """
        Defines the forward pass of the model.
        """
        return self.layer_stack(x)





In [None]:
# Define parameters
input_shape = 28 * 28  # For flattened 28x28 images
hidden_units = 10  # Number of hidden units
output_shape = 10  # Number of classes in FashionMNIST (e.g., 10 classes)

# Define device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Initialize the model
torch.manual_seed(42)
model_1 = FashionMNISTModelV1(input_shape=input_shape, hidden_units=hidden_units, output_shape=output_shape).to(device)

# Verify model device
print(next(model_1.parameters()).device)



In [None]:
# Setup loss and optimizer functions
loss_fn = nn.CrossEntropyLoss() # measures how far the model's predictions are from the true labels
optimizer = torch.optim.SGD(params=model_1.parameters(), lr=0.1) # updates the model's parameters to reduce the loss

## Functionalize training/evaluation loop

Let's create a function for:
* training loop - `train_step()`
* testing loop - `test_step()`

In [None]:
def train_step(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device):

    """
    Performs a training step where the model learns on data_loader.

    Args:
        model (torch.nn.Module): The model to train.
        data_loader (torch.utils.data.DataLoader): DataLoader for training data.
        loss_fn (torch.nn.Module): Loss function to optimize.
        optimizer (torch.optim.Optimizer): Optimizer for model parameters.
        accuracy_fn (callable): Function to calculate accuracy.
        device (torch.device): Device to run training on (e.g., 'cuda' or 'cpu').
    """
    # Put the model into training mode
    model.train()

    # Initialize tracking metrics
    train_loss = 0
    train_acc = 0

    # Loop through the training batches
    for batch, (X, y) in enumerate(data_loader):

        # Move data to target device
        X, y = X.to(device), y.to(device)

        # Forward pass - outputs raw logits from the model
        y_pred = model(X)

        # Calculate the loss per batch
        loss = loss_fn(y_pred, y)
        train_loss += loss.item()  # Add scalar value to train_loss

        # Calculate the accuracy per batch
        train_acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))  # Converts logits to labels

        # Zero gradients for the optimizer
        optimizer.zero_grad()

        # Backpropagation
        loss.backward()

        # Optimizer step - update model parameters
        optimizer.step()

    # Calculate average loss and accuracy across all batches
    train_loss /= len(data_loader)
    train_acc /= len(data_loader)

    # Print metrics
    print(f"Train loss: {train_loss:.5f} | Train accuracy: {train_acc:.2f}%")


In [None]:
def test_step(model: torch.nn.Module,
              data_loader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module,
              accuracy_fn,
              device: torch.device):
    """
    Performs a testing loop on the given model over the data_loader.

    Args:
        model (torch.nn.Module): The model to test.
        data_loader (torch.utils.data.DataLoader): DataLoader for test data.
        loss_fn (torch.nn.Module): Loss function to calculate the loss.
        accuracy_fn (function): Function to calculate accuracy.
        device (torch.device): Device to run the testing on.
    """
    test_loss, test_acc = 0, 0
    model.eval()

    # Turn on inference mode context manager
    with torch.inference_mode():
        for X, y in data_loader:
            # Send the data to target device
            X, y = X.to(device), y.to(device)

            # Forward pass (raw logits)
            test_pred = model(X)

            # Calculate the loss (accumulated)
            test_loss += float(loss_fn(test_pred, y))

            # Calculate the accuracy (accumulated)
            test_acc += accuracy_fn(y_true=y, y_pred=test_pred.argmax(dim=1))

        # Adjust metrics
        test_loss /= len(data_loader)
        test_acc /= len(data_loader)

        # Print results (adjust based on accuracy_fn behavior)
        print(f"Test loss: {test_loss:.5f} | Test accuracy: {test_acc:.2f}%")


### Let's put our functions to use

In [None]:
import requests
from pathlib import Path

# Download helper functions (including accuracy_fn) from the Learn PyTorch repo
if Path("helper_functions.py").is_file():
  print("helper_functions.py already exists, skipping download....")
else:
  print("Downloading helper_functions.py")
  request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
  with open("helper_functions.py", "wb") as f:
    f.write(request.content)

In [None]:
# Import accuracy metric
from helper_functions import accuracy_fn

In [None]:
torch.manual_seed(42)

# Measure time
from timeit import default_timer as timer
train_start_time_on_gpu = timer()

# Set epochs
epochs = 3

# Create optimization loop using train_step() and test_step()
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n-------")
    train_step(model=model_1,
               data_loader=train_dataloader,
               loss_fn=loss_fn,
               optimizer=optimizer,
               accuracy_fn=accuracy_fn,
               device=device)
    test_step(model=model_1,
              data_loader=test_dataloader,
              loss_fn=loss_fn,
              accuracy_fn=accuracy_fn,
              device=device)

train_end_time_on_gpu = timer()
total_train_time_model_1 = print_train_time(start=train_start_time_on_gpu, end=train_end_time_on_gpu, device=device)

In [None]:

model_0_results

In [None]:
total_train_time_model_0, total_train_time_model_1

### It's interesting to see that the model run on the GPU took about the same time as the one run on the CPU! This is likely down to the fact that this model isn't that large, and its layers (two linear layers with ReLU activations) are not that complex.

> **Note**: Sometimes, depending on your data/hardware you might find that your model trains faster on a CPU than a GPU

> Why is this?
> 1. It could be that the overhead of copying the data/model to and from the GPU outweighs the compute benefits offered by the GPU.
> 2. The hardware you're using has a better CPU in terms of its capability than the GPU.

See this article about gpu's:
https://horace.io/brrr_intro.html
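
The copy overhead mentioned above can be measured roughly as follows. This is only a sketch: the batch is random data, the numbers depend entirely on your hardware, and on a CPU-only machine the `.to()` call is effectively a no-op:

```python
import torch
from timeit import default_timer as timer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical batch shaped like one FashionMNIST batch
batch = torch.rand(32, 1, 28, 28)

start = timer()
batch_on_device = batch.to(device)
if device == "cuda":
    # CUDA calls are asynchronous; wait for the copy to finish before stopping the timer
    torch.cuda.synchronize()
elapsed = timer() - start

print(f"Copy to {device} took {elapsed:.6f} seconds")
```

Note that the very first CUDA call in a session also pays a one-off initialization cost, so it is worth timing a second copy before drawing conclusions. For a small model, this per-batch transfer can be comparable to the time spent on the actual computation.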

#### Now, to evaluate our model, we need to remember to put the data on the same device as the model (the GPU, if available)!

In [None]:
def eval_model(model: torch.nn.Module, data_loader: torch.utils.data.DataLoader, loss_fn: torch.nn.Module, accuracy_fn, device=device):
    '''
    Returns a dictionary containing the results of model predicting on data_loader
    '''
    loss, acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in tqdm(data_loader):
            # Make our data device agnostic
            X, y = X.to(device), y.to(device)
            # Make predictions
            y_pred = model(X) # uses the model passed in as an argument rather than a hard-coded model

            # Accumulate the loss and acc values per batch
            loss += loss_fn(y_pred, y)
            acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))

        # Scale loss and acc to find the average loss/acc per batch
        loss /= len(data_loader)
        acc /= len(data_loader)
    return {"model_name": model.__class__.__name__, # only works when model was created with a class
            "model_loss": loss.item(),
            "model_acc": acc}


In [None]:
# Get model_1 results dictionary
model_1_results = eval_model(model=model_1,
                             data_loader=test_dataloader,
                             loss_fn=loss_fn,
                             accuracy_fn=accuracy_fn,
                             device=device)
model_1_results

compared with

In [None]:
model_0_results

## Model 2: Building a Convolutional Neural Network (CNN)

CNNs are also known as ConvNets.

CNNs are known for their capabilities to find patterns in visual data

#### Below is a table outlining critical pieces of a CNN

| **Hyperparameter/Layer Type**         | **What does it do?**                                                                 | **Typical Values**                                                                                      |
|----------------------------------------|-------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|
| **Input image(s)**                     | Target images you'd like to discover patterns in                                     | Whatever you can take a photo (or video) of                                                            |
| **Input layer**                        | Takes in target images and preprocesses them for further layers                     | `input_shape = [batch_size, image_height, image_width, color_channels]` (channels last) or `input_shape = [batch_size, color_channels, image_height, image_width]` (channels first) |
| **Convolution layer**                  | Extracts/learns the most important features from target images                      | Multiple, can create with `torch.nn.ConvXd()` (X can be multiple values)                               |
| **Hidden activation/non-linear activation** | Adds non-linearity to learned features (non-straight lines)                         | Usually ReLU (`torch.nn.ReLU()`), though can be many more                                              |
| **Pooling layer**                      | Reduces the dimensionality of learned image features                                | Max (`torch.nn.MaxPool2d()`) or Average (`torch.nn.AvgPool2d()`)                                       |
| **Output layer/linear layer**          | Takes learned features and outputs them in shape of target labels                   | `torch.nn.Linear(out_features=[number_of_classes])` (e.g., 3 for pizza, steak, or sushi)               |
| **Output activation**                  | Converts output logits to prediction probabilities                                  | `torch.sigmoid()` (binary classification) or `torch.softmax()` (multi-class classification)            |
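
The effect of convolution and pooling layers on image size follows a simple formula, out = floor((in + 2 * padding - kernel_size) / stride) + 1 (with dilation left at 1). The sketch below applies it to a 28x28 input passing through two blocks of two 3x3 convolutions (padding 1) followed by a 2x2 max pool; `conv_out_size` is a hypothetical helper, not a PyTorch function:

```python
def conv_out_size(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    # Output height/width of a conv or pooling layer, assuming dilation=1
    return (size + 2 * padding - kernel_size) // stride + 1

size = 28                                                      # FashionMNIST images are 28x28
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # conv:     28
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # conv:     28
size = conv_out_size(size, kernel_size=2, stride=2)             # max pool: 14
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # conv:     14
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # conv:     14
size = conv_out_size(size, kernel_size=2, stride=2)             # max pool: 7

print(size)  # 7
```

A 3x3 kernel with padding 1 preserves height and width, while each 2x2 max pool halves them. With `hidden_units` output channels, the flattened feature vector after both blocks therefore has `hidden_units * 7 * 7` values.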


To find out what's happening inside a CNN, see this website:

https://poloclub.github.io/cnn-explainer/

In [None]:
# Create a CNN
class FashionMNISTModelV2(nn.Module):
    '''
    Model architecture that replicates the TinyVGG
    model from the CNN Explainer website.
    '''
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1), # values we can set ourselves- hyperparameters
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden_units,
            out_channels=hidden_units,
            kernel_size=3,
            padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
            out_channels=hidden_units,
            kernel_size=3,
            padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,)
        )
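        self.classifier = nn.Sequential(
            nn.Flatten(),
            # Each 2x2 max pool halves height/width (28 -> 14 -> 7), so the
            # flattened feature vector has hidden_units * 7 * 7 values
            nn.Linear(in_features=hidden_units*7*7,
                      out_features=output_shape)
        )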

    def forward(self, x):
        x = self.conv_block_1(x)
        print(x.shape)
        x = self.conv_block_2(x)
        print(x.shape)
        x = self.classifier(x)
        return x

In [None]:
torch.manual_seed(42)
model_2 = FashionMNISTModelV2(input_shape=1, # since we're working with greyscale images, the number of color channels is 1
                              hidden_units=10,
                              output_shape=len(class_names)).to(device)

## Stepping through `nn.Conv2d()`
```
class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
```

docs:

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html


In [None]:
model_2.state_dict()

In [None]:
torch.manual_seed(42)

# Create a batch of images
images = torch.randn(size=(32, 3, 64, 64))
test_image = images[0]

print(f"Image batch shape: {images.shape}")
print(f"Single image shape: {test_image.shape}")
print(f"Test image:\n{test_image}")

In [None]:
# Create a single conv2d layer
conv_layer = nn.Conv2d(in_channels=3,
                       out_channels=10,
                       kernel_size=3, # kernel also known as a filter (3x3)
                       stride=1,
                       padding=0)

# pass the data through the convolutional layer
conv_output = conv_layer(test_image.unsqueeze(0))
conv_output
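
To check what the layer did to the image size, the sketch below repeats the same setup in a self-contained form and prints the output shapes. The second layer, with `padding=1`, is an extra illustration that is not in the cell above:

```python
import torch
from torch import nn

torch.manual_seed(42)

# Single random image: [color_channels, height, width]
test_image = torch.randn(3, 64, 64)

# Same settings as above: 3x3 kernel, stride 1, no padding
conv_layer = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3, stride=1, padding=0)

# unsqueeze(0) adds the batch dimension: [3, 64, 64] -> [1, 3, 64, 64]
conv_output = conv_layer(test_image.unsqueeze(0))
print(conv_output.shape)  # torch.Size([1, 10, 62, 62])

# With padding=1, a 3x3 kernel keeps height and width unchanged
padded_layer = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3, stride=1, padding=1)
padded_output = padded_layer(test_image.unsqueeze(0))
print(padded_output.shape)  # torch.Size([1, 10, 64, 64])
```

The channel dimension becomes `out_channels` (10), and without padding each spatial dimension shrinks by `kernel_size - 1`, from 64 to 62.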