# Convolutional Neural Networks: Application

**Please type your name and A number here:**

In [None]:
Name = "Kevin Roberts"
assert Name != "", 'Please enter your name in the above quotation marks, thanks!'

A_number = "A02256264"
assert A_number != "", 'Please enter your A-number in the above quotation marks, thanks!'



In this notebook, you will:

- Create a mood classifier using the Torch Sequential API
- Build a ConvNet to identify sign language digits using the Torch Module API

**After this assignment you will be able to:**

- Build and train a ConvNet in PyTorch for a __binary__ classification problem
- Build and train a ConvNet in PyTorch for a __multiclass__ classification problem
- Explain different use cases for the Sequential and Module APIs



## Table of Contents

- [1 - Packages](#1)
    - [1.1 - Load the Data and Split the Data into Train/Test Sets](#1-1)
- [2 - Layers in PyTorch](#2)
- [3 - The Sequential API](#3)
    - [3.1 - Create the Sequential Model](#3-1)
        - [Exercise 1 - happyModel](#ex-1)
    - [3.2 - Train and Evaluate the Model](#3-2)
- [4 - The Module API](#4)
    - [4.1 - Load the SIGNS Dataset](#4-1)
    - [4.2 - Split the Data into Train/Test Sets](#4-2)
    - [4.3 - Forward Propagation](#4-3)
        - [Exercise 2 - convolutional_model](#ex-2)
    - [4.4 - Train the Model](#4-4)
- [5 - Training History](#5)
- [6 - Bibliography](#6)

<a name='1'></a>
## 1 - Packages

As usual, begin by loading in the packages.

In [None]:
### Modified 2/11/2025 by Nathan Nelson for Torch support in place of TensorFlow.

In [None]:
### If you use Google Colab, you can install the torchinfo package by running the following command:
!pip install torchinfo

In [None]:
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
from matplotlib.pyplot import imread
import scipy
from PIL import Image
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

from torchinfo import summary

# Set seed for reproducibility
torch.manual_seed(1)
np.random.seed(1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

<a name='1-1'></a>
### 1.1 - Load the Data and Split the Data into Train/Test Sets

You'll be using the Happy House dataset for this part of the assignment, which contains images of people's faces. Your task will be to build a ConvNet that determines whether the people in the images are smiling or not -- because they only get to enter the house if they're smiling!

In [None]:
def load_happy_dataset():     # No need to modify unless using a different directory.
    train_dataset = h5py.File('datasets/train_happy.h5', "r")
    # your train set features
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])
    train_set_y_orig = np.array(
        train_dataset["train_set_y"][:])  # your train set labels

    test_dataset = h5py.File('datasets/test_happy.h5', "r")
    # your test set features
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])
    test_set_y_orig = np.array(
        test_dataset["test_set_y"][:])  # your test set labels

    classes = np.array(test_dataset["list_classes"][:])  # the list of classes

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

def load_signs_dataset():   # No need to modify unless using a different directory.
    train_dataset = h5py.File('datasets/train_signs.h5', "r")
    # your train set features
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])
    train_set_y_orig = np.array(
        train_dataset["train_set_y"][:])  # your train set labels

    test_dataset = h5py.File('datasets/test_signs.h5', "r")
    # your test set features
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])
    test_set_y_orig = np.array(
        test_dataset["test_set_y"][:])  # your test set labels

    classes = np.array(test_dataset["list_classes"][:])  # the list of classes

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

In [None]:
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_happy_dataset()

# Normalize image vectors
X_train = X_train_orig/255.
X_test = X_test_orig/255.

# Convert the data to PyTorch tensors. nn.Conv2d expects inputs shaped
# (N, C, H, W) = (samples, channels, height, width), while the data are currently
# shaped (N, H, W, C). permute() reorders the axes into the expected layout.
X_train = torch.tensor(X_train, dtype=torch.float32).permute(0, 3, 1, 2)  # Convert to (N, C, H, W)
X_test = torch.tensor(X_test, dtype=torch.float32).permute(0, 3, 1, 2)
Y_train = torch.tensor(Y_train_orig, dtype=torch.float32).T
Y_test = torch.tensor(Y_test_orig, dtype=torch.float32).T

print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))
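
To make the axis reordering in the preprocessing cell concrete, here is a small sketch on a dummy array (the array contents and the names `images_nhwc` and `images_nchw` are made up for illustration and are not part of the assignment data). It marks one pixel in channels-last layout and then reads it back after `permute()`:

```python
import numpy as np
import torch

# Dummy batch: 2 images, 4 pixels high, 5 pixels wide, 3 channels -> (N, H, W, C).
images_nhwc = np.zeros((2, 4, 5, 3), dtype=np.float32)
images_nhwc[0, 1, 2, :] = [0.1, 0.2, 0.3]  # mark the RGB values of one pixel

# Reorder the axes to the (N, C, H, W) layout that nn.Conv2d expects.
images_nchw = torch.tensor(images_nhwc).permute(0, 3, 1, 2)

print(images_nchw.shape)        # torch.Size([2, 3, 4, 5])
print(images_nchw[0, :, 1, 2])  # the same pixel, now indexed channel-first
```

Note that `permute()` only changes how the axes are ordered; no pixel values are modified.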

You can display the images contained in the dataset. Images are **64x64** pixels in RGB format (3 channels).

In [None]:
index = 124
plt.imshow(X_train_orig[index]) #display sample training image
plt.show()

<a name='2'></a>
## 2 - Layers in PyTorch

In Torch, you don't have to write code directly to create layers. Rather, Torch has pre-defined layers you can use.

When you create a layer in Torch, you are creating a function that takes some input and transforms it into an output you can reuse later. Nice and easy!
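
As a minimal sketch of this idea (the layer sizes and the dummy input are arbitrary choices for illustration), a layer is constructed once and then called like a function on a tensor:

```python
import torch
import torch.nn as nn

# A layer is an object you construct once and then call like a function.
conv = nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3, padding=1)

x = torch.randn(2, 3, 8, 8)  # dummy batch: 2 images, 3 channels, 8x8 pixels
y = conv(x)                  # calling the layer applies the convolution

print(y.shape)               # torch.Size([2, 4, 8, 8])
```

The output `y` is an ordinary tensor, so it can be passed straight into another layer.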

<a name='3'></a>
## 3 - The Sequential API

Most practical applications of deep learning today are built using programming frameworks, which have many built-in functions you can simply call.

For the first part of this assignment, you'll create a model using Torch's Sequential API, which allows you to build layer by layer, and is ideal for building models where each layer has **exactly one** input tensor and **one** output tensor.

As you'll see, using the Sequential API is simple and straightforward, but is only appropriate for simpler, more straightforward tasks. Later in this notebook you'll spend some time building with a more flexible, powerful alternative: the Module API.


<a name='3-1'></a>
### 3.1 - Create the Sequential Model

As mentioned earlier, the PyTorch Sequential API can be used to build simple models with layer operations that proceed in a sequential order.

You can think of a Sequential model as behaving like a list of layers. Like Python lists, Sequential layers are ordered, and the order in which they are specified matters. If your model has a non-linear topology (for example, branches or skip connections) or contains layers with multiple inputs or outputs, a Sequential model wouldn't be the right choice!

For most layers in PyTorch, you'll need to specify the size of the input in advance (for example, `in_channels` for `nn.Conv2d` or `in_features` for `nn.Linear`). This is because the shape of the weights is based on the shape of the inputs, and the weights are created as soon as the layer is constructed. Sequential models can be created by passing the layers, in order, to the `nn.Sequential` constructor, like you will do in the next exercise.
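
The sketch below shows the general pattern on a deliberately tiny, unrelated model (the name `toy_model` and all layer sizes are illustrative assumptions, not the architecture asked for in the exercise):

```python
import torch
import torch.nn as nn

# Layers are passed to nn.Sequential in the order they should be applied.
toy_model = nn.Sequential(
    nn.Flatten(),      # (N, 1, 4, 4) -> (N, 16)
    nn.Linear(16, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

print(toy_model[1])          # layers can be indexed like list items
x = torch.randn(5, 1, 4, 4)  # dummy batch of 5 single-channel 4x4 inputs
print(toy_model(x).shape)    # torch.Size([5, 1])
```

Swapping the order of the layers here would either change the result or fail with a shape error, which is why the order matters.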

<a name='ex-1'></a>
### Exercise 1 - happyModel

Implement the `happyModel` function below to build the following model: `ZEROPAD2D -> CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> FLATTEN -> LINEAR -> SIGMOID`. Refer to the [torch.nn](https://pytorch.org/docs/stable/nn.html) documentation for help.

Also, plug in the following parameters for all the steps:

 - [nn.ZeroPad2d](https://pytorch.org/docs/stable/generated/torch.nn.ZeroPad2d.html): padding 3 x 3, input shape 3 x 64 x 64
 - [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html): Use 32 out channels with 7x7 filters, stride 1 x 1, from 3 in channels
 - [nn.BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html): on 32 features
 - [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)
 - [nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html): Using kernel size of 2 x 2
 - [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html) the previous output.
 - Fully-connected ([nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)) layer: Apply a fully connected layer with 1 output neuron and then a sigmoid activation:
 - [nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html)


<font color='red'> **rubric={30 points}** </font> 

In [None]:
def happyModel():
    """
    Implements the forward propagation for the binary classification model:
    ZEROPAD2D -> CONV2D -> BATCHNORM -> RELU -> MAXPOOL2D -> FLATTEN -> LINEAR -> SIGMOID

    Note that for simplicity and grading purposes, you'll hard-code all the values
    such as the stride and kernel (filter) sizes.
    Normally, functions should take these values as function parameters.

    Arguments:
    None

    Returns:
    model -- PyTorch Sequential model
    """
    #         # YOUR CODE STARTS HERE


    #         # YOUR CODE ENDS HERE

    return model

In [None]:
happy_model = happyModel().to(device)

In [None]:
batch_size = 16
summary(happy_model, input_size=(batch_size, 3, 64, 64))

Now that your model is created, you can prepare it for training by choosing an optimizer and a loss function.

<font color='red'> **rubric={5 points}** </font> 

In [None]:
criterion =   # YOUR CODE HERE; Choose appropriate loss function
optimizer = optim.Adam(happy_model.parameters())

<a name='3-2'></a>
### 3.2 - Train and Evaluate the Model

After creating the model, we will use DataLoaders to make batch training easier. Use the TensorDataset function followed by the DataLoader function to turn the training data into prepared batches (maybe try a batch size of 16 to start).

In [None]:
train_dataset = TensorDataset(X_train, Y_train)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
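
To see what the DataLoader produces, here is a small sketch on made-up data (the tensors `features` and `targets` and the batch size of 4 are assumptions chosen only to keep the output short). With 10 samples and a batch size of 4, the last batch is smaller than the others:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data: 10 samples with one feature each, and a binary target.
features = torch.arange(10, dtype=torch.float32).reshape(10, 1)
targets = (features > 4).float()

toy_dataset = TensorDataset(features, targets)
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=False)

# Each iteration yields one (inputs, labels) batch.
batch_sizes = [xb.shape[0] for xb, yb in toy_loader]
print(batch_sizes)      # [4, 4, 2]
print(len(toy_loader))  # 3 batches per epoch
```

This is also why the training loop later divides the accumulated loss by `len(train_loader)`: it is the number of batches, not the number of samples.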

We can now train the model! Make sure to call zero_grad() in every iteration.
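
The reason `zero_grad()` is needed is that PyTorch adds each new gradient to whatever is already stored in `.grad`. The sketch below illustrates this on a single made-up parameter `w` (a toy setup, not part of the assignment model):

```python
import torch
import torch.optim as optim

w = torch.tensor(2.0, requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)

# First backward pass: d(3w)/dw = 3.
(3 * w).backward()
grad_first = w.grad.item()

# Second backward pass WITHOUT resetting: the new gradient is added on top.
(3 * w).backward()
grad_accumulated = w.grad.item()

# Resetting before the next backward pass gives the expected value again.
optimizer.zero_grad()
(3 * w).backward()
grad_after_reset = w.grad.item()

print(grad_first, grad_accumulated, grad_after_reset)  # 3.0 6.0 3.0
```

Without the reset, each batch would be updated using the sum of gradients from all previous batches as well.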

<font color='red'> **rubric={15 points}** </font> 

In [None]:
# Training loop
num_epochs = 10

for epoch in range(num_epochs):
    happy_model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        # YOUR CODE STARTS HERE (Hint: Forward pass, loss computation, backward pass, optimizer step) You may need to update the variable names in the following starter code here based on your implementation.


        # YOUR CODE ENDS HERE

        # Accumulate loss
        running_loss += loss.item()

        # Convert outputs to binary predictions
        predicted = (outputs > 0.5).float()  # Threshold at 0.5
        correct += (predicted == labels).sum().item()
        total += labels.size(0)

    accuracy = 100 * correct / total  # Calculate accuracy

    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss / len(train_loader):.4f}, Accuracy: {accuracy:.2f}%")



Now that we have trained on part of the dataset, let's test the model on the remaining part to know how it performs on unseen data. Be sure to set the model into eval mode. You can use the same loss function for testing as you did for training.
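
Eval mode matters because some layers behave differently during training and evaluation. As a rough illustration (the dummy batch and the standalone `bn` layer are assumptions made for this sketch), a batch normalization layer normalizes with the current batch's statistics in training mode, but with its stored running statistics in eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(num_features=2)
x = torch.randn(8, 2, 4, 4) * 5 + 10  # dummy batch with mean near 10

bn.train()
out_train = bn(x)  # normalized with this batch's own mean and variance

bn.eval()
out_eval = bn(x)   # normalized with the stored running statistics

print(out_train.mean().item())  # approximately 0
print(out_eval.mean().item())   # clearly different from 0 after one training batch
```

Forgetting to call `eval()` would make test predictions depend on which samples happen to share a batch.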

In [None]:
happy_model.eval()

test_dataset = TensorDataset(X_test, Y_test)
test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False)

total_loss = 0.0
correct = 0
total = 0

with torch.no_grad():  # Disable gradient computation for evaluation
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        outputs = happy_model(inputs)
        loss = criterion(outputs, labels)

        total_loss += loss.item()

        # Convert outputs to binary predictions (0 or 1)
        predicted = (outputs > 0.5).float()
        correct += (predicted == labels).sum().item()
        total += labels.size(0)

avg_loss = total_loss / len(test_loader)
accuracy = 100 * correct / total

print(f"Test Loss: {avg_loss:.4f}, Test Accuracy: {accuracy:.2f}%")


The test accuracy should be above 80%.

Easy, right? But what if you need to build a model with shared layers, branches, or multiple inputs and outputs? This is where Sequential, with its beautifully simple yet limited functionality, won't be able to help you.

Next up: Enter the Module API, your slightly more complex, highly flexible friend.  

<a name='4'></a>
## 4 - The Module API

Welcome to the second half of the assignment, where you'll use Torch's flexible [Module API](https://pytorch.org/docs/stable/generated/torch.nn.Module.html) to build a ConvNet that can differentiate between 6 sign language digits.

The Module API can handle models with non-linear topology, shared layers, as well as layers with multiple inputs or outputs. Imagine that, where the Sequential API requires the model to move in a linear fashion through its layers, the Module API allows much more flexibility. Where Sequential is a straight line, a Module model is a graph, where the nodes of the layers can connect in many more ways than one.

In the visual example below, the single possible direction of movement through a Sequential model is shown in contrast to a skip connection, which is just one of the many ways a Module model can be constructed. A skip connection, as you might have guessed, skips some layer in the network and feeds the output to a later layer in the network. Don't worry, you'll be spending more time with skip connections very soon!

<img src="https://raw.githubusercontent.com/amanchadha/coursera-deep-learning-specialization/35547c07c53ba9c06a6fa5866ef4620471717820/C4%20-%20Convolutional%20Neural%20Networks/Week%201/images/seq_vs_func.png" style="width:350px;height:200px;">
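
For a feel of what this looks like in code, here is a minimal sketch of a skip connection written with the Module API. The class name `TinySkipNet` and its layer sizes are hypothetical and chosen only for illustration; this is not one of the models you will build in this notebook:

```python
import torch
import torch.nn as nn

class TinySkipNet(nn.Module):
    """Hypothetical example computing relu(conv(x)) + x."""

    def __init__(self, channels):
        super().__init__()
        # padding=1 with a 3x3 kernel keeps height and width unchanged,
        # so the output can be added to the input.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv(x))
        return out + x  # the skip connection: the input bypasses the conv layer

skip_net = TinySkipNet(channels=3)
x = torch.randn(2, 3, 8, 8)
print(skip_net(x).shape)  # torch.Size([2, 3, 8, 8])
```

The line `return out + x` uses the input twice, which is something a plain Sequential model cannot express.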

<a name='4-1'></a>
### 4.1 - Load the SIGNS Dataset

As a reminder, the SIGNS dataset is a collection of 6 signs representing numbers from 0 to 5.

In [None]:
# Loading the data (signs)
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_signs_dataset()

<img src="https://raw.githubusercontent.com/amanchadha/coursera-deep-learning-specialization/35547c07c53ba9c06a6fa5866ef4620471717820/C4%20-%20Convolutional%20Neural%20Networks/Week%201/images/SIGNS.png" style="width:800px;height:300px;">

The next cell will show you an example of a labelled image in the dataset. Feel free to change the value of `index` below and re-run to see different examples.

In [None]:
# Example of an image from the dataset
index = 9
plt.imshow(X_train_orig[index])
print ("y = " + str(np.squeeze(Y_train_orig[:, index])))

<a name='4-2'></a>
### 4.2 - Split the Data into Train/Test Sets

In Course 2, you built a fully-connected network for this dataset. But since this is an image dataset, it is more natural to apply a ConvNet to it.

To get started, let's examine the shapes of your data.

In [None]:
def convert_to_one_hot(Y, C):
    Y = np.eye(C)[Y.reshape(-1)].T
    return Y

In [None]:
X_train = X_train_orig/255.
X_test = X_test_orig/255.
X_train = torch.tensor(X_train, dtype=torch.float32).permute(0, 3, 1, 2)  # Convert to (N, C, H, W)
X_test = torch.tensor(X_test, dtype=torch.float32).permute(0, 3, 1, 2)
Y_train = convert_to_one_hot(Y_train_orig, 6).T
Y_test = convert_to_one_hot(Y_test_orig, 6).T
Y_train = torch.tensor(np.float32(Y_train))
Y_test = torch.tensor(np.float32(Y_test))

print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))
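
The one-hot conversion used above works by picking rows out of an identity matrix. A small sketch on made-up labels (the values in `labels` and the choice of 4 classes are illustrative assumptions) shows the idea:

```python
import numpy as np

labels = np.array([[2, 0, 3]])  # shape (1, 3), laid out like Y_train_orig
num_classes = 4

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[labels.reshape(-1)]

print(one_hot.shape)  # (3, 4): one row per sample, one column per class
print(one_hot)
```

Taking the index of the largest entry in each row recovers the original integer label, which is what the training loop later does with `torch.max`.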

<a name='4-3'></a>
### 4.3 - Forward Propagation

In PyTorch, there are built-in functions that implement the convolution steps for you. In the [nn.Module API](https://pytorch.org/docs/stable/generated/torch.nn.Module.html), you create a graph of layers.

The following model could also be defined using the Sequential API, as in the previous part, since the information flow is on a single line. But don't deviate. What we want you to learn is to use the Module API. Module can be seen as a generalization of Sequential; Sequential does more for you but operates on the assumption that the network is strictly feed-forward. Module does less for you, but also assumes less about the network you are building, leaving more for you to define in forward propagation.

Inside the model class definition you will need to define your various layers; unlike in Sequential, the order does not matter since we will be defining our own forward propagation function later. Note that these are example function and layer names and you should probably not choose these:
```
class arbitrary_model(nn.Module):
    def __init__(self, input_shape):
        super(arbitrary_model, self).__init__()
        
        # Define the layers
        self.first_layer_function_name = nn.SomeKindaLayer(someParameter = value)
        self.second_layer_name = nn.AnotherLayer()
        self.third_layer_name = nn.DifferentLayer(parameter = value)
```
        

After our layer functions have been defined in the `__init__` method, begin building your graph of layers by defining the forward propagation function for your model. Simply call your first layer function on your input:
```
def forward(self, x):
   x = self.first_layer_function_name(x)
```

Then, create new nodes in the graph of layers by calling the other layers on that output:
```
   x = self.second_layer_name(x)
   x = self.third_layer_name(x)
   #etc.
   return x
```
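
Putting the pieces together, here is a small runnable sketch of the same pattern. The class name `OrderDemo` and the layer sizes are hypothetical. The layers are deliberately defined "out of order" in `__init__` to show that only the order of the calls in `forward` determines how data flows:

```python
import torch
import torch.nn as nn

class OrderDemo(nn.Module):
    def __init__(self):
        super().__init__()
        # Deliberately defined out of order: the output layer comes first.
        self.output_layer = nn.Linear(8, 2)
        self.activation = nn.ReLU()
        self.hidden_layer = nn.Linear(4, 8)

    def forward(self, x):
        # The order of these calls is what defines the network.
        x = self.hidden_layer(x)
        x = self.activation(x)
        x = self.output_layer(x)
        return x

demo = OrderDemo()
print(demo(torch.randn(3, 4)).shape)  # torch.Size([3, 2])
```

That said, defining layers in the same order you use them makes the model summary easier to read.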

- **nn.Conv2d(in_channels=input_shape[0], out_channels=8, kernel_size=(4, 4), stride=(1, 1), padding=(p, p)):** Read the full documentation on [Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html).

- **nn.MaxPool2d(kernel_size=(f, f), stride=(s, s), padding=(p, p)):** `MaxPool2d()` downsamples your input using a window of size (f, f) and strides of size (s, s) to carry out max pooling over each window. For max pooling, you usually operate on a single example at a time and a single channel at a time. Read the full documentation on [MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html).

- **nn.ReLU():** computes the elementwise ReLU of Z (which can be any shape). You can read the full documentation on [ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html).

- **nn.Flatten()**: given a tensor "P", this function takes each training (or test) example in the batch and flattens it into a 1D vector.  

    * If a tensor P has the shape (batch_size, c, h, w), it returns a flattened tensor with shape (batch_size, k), where $k=c \times h \times w$. "k" equals the product of all the dimension sizes other than the first dimension.
    
    * For example, given a tensor with dimensions [100, 2, 3, 4], it flattens the tensor to be of shape [100, 24], where 24 = 2 * 3 * 4.  You can read the full documentation on [Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html).

- **nn.Linear(in_features= , out_features=):** given the flattened input F, it returns the output computed using a fully connected layer. You can read the full documentation on [Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html).

In the last function above (`nn.Linear`), the fully connected layer automatically initializes weights in the graph and keeps on training them as you train the model. Hence, you did not need to initialize those weights when initializing the parameters.
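
When chaining these layers, it helps to track how the tensor shape changes, since the `in_features` of the Linear layer must match the flattened size. The sketch below uses the standard output-size formula for convolution and pooling layers (with dilation 1) on a small made-up stack; the helper `out_size` and all layer parameters are illustrative assumptions and are not the values from the exercise:

```python
import torch
import torch.nn as nn

def out_size(n, kernel, stride, padding):
    """Output height/width of a conv or max-pool layer (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

# Hypothetical small stack, not the exercise architecture.
conv = nn.Conv2d(3, 4, kernel_size=5, stride=1, padding=2)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
flatten = nn.Flatten()

x = torch.randn(10, 3, 32, 32)
after_conv = conv(x)
after_pool = pool(after_conv)
flat = flatten(after_pool)

h_conv = out_size(32, kernel=5, stride=1, padding=2)      # 32
h_pool = out_size(h_conv, kernel=2, stride=2, padding=0)  # 16

print(after_conv.shape, after_pool.shape, flat.shape)
print(4 * h_pool * h_pool)  # 1024: the in_features a following nn.Linear would need
```

Running the same kind of calculation by hand for your own layers is a quick way to check the `in_features` value before training.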

Lastly, before creating the model, you'll need to define the output using the last of the function's compositions (in this example, a Linear layer):

- **outputs = nn.Linear(in_features=64, out_features=6)**




<a name='ex-2'></a>
### Exercise 2 - convolutional_model

Implement the `convolutional_model` class below to build the following model: `CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> LINEAR`. Use the functions above!

Also, plug in the following parameters for all the steps:

 - [Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html): Use 8 output channels, 4 by 4 kernel, stride 1, padding 1 by 1 (for in_channels, use input_shape[0] to get the original number of channels)
 - [ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)
 - [MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html): Use an 8 by 8 kernel size and an 8 by 8 stride, padding 1 by 1
 - **Conv2d**: From 8 in channels, use 16 out channels of 2 by 2 kernels, stride 1, padding is 1 by 1
 - **ReLU**
 - **MaxPool2d**: Use a 4 by 4 kernel size and a 4 by 4 stride, padding 1 by 1
 - [Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html) the previous output.
 - Fully-connected ([Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)) layer: Apply a fully connected layer with 6 neurons.

 <font color='red'> **rubric={30 points}** </font> 

In [None]:
# GRADED FUNCTION: convolutional_model
"""
    Implements __init__ for the model's layers.

    Implements the forward propagation for the model:
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> LINEAR

    Note that for simplicity and grading purposes, you'll hard-code some values
    such as the stride and kernel sizes.
    Normally, functions should take these values as function parameters.

    Arguments:
    input_shape -- a tuple of the input shape of the model (in this case, 3x64x64)

    Returns:
    model -- nn.Module model (object containing the information for the entire training process)
"""
class convolutional_model(nn.Module):
    def __init__(self, input_shape):
        super(convolutional_model, self).__init__()

        # Define the layers
        # YOUR CODE STARTS HERE


        # YOUR CODE ENDS HERE

    # Define the forward pass - just pass x through each layer in sequence. You could do more advanced things here, which is the advantage of Module over Sequential.
    def forward(self, x):
        # YOUR CODE STARTS HERE


        # YOUR CODE ENDS HERE

        return x




In [None]:
batch_size = 16
# Example usage:
input_shape = (3, 64, 64)  # Assuming input shape is (channels, height, width)
sign_model = convolutional_model(input_shape).to(device)
summary(sign_model, input_size = (batch_size, 3, 64, 64))

<font color='red'> **rubric={5 points}** </font> 

In [None]:
criterion =   # YOUR CODE HERE Use appropriate loss function 
optimizer = optim.Adam(sign_model.parameters())

Your Module model should now be ready to use, just like the Sequential model. It took a little more prep, but now that it is complete, training the models is mostly the same.

<a name='4-4'></a>
### 4.4 - Train the Model

Repeat the same steps as earlier to get the data ready for training: Convert to a Torch tensor, reshape the data into the [Samples, Channels, Height, Width] format, send it to a TensorDataset, and finally get it ready for batching as a DataLoader.

In [None]:
train_dataset = TensorDataset(X_train, Y_train)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)

test_dataset = TensorDataset(X_test, Y_test)
test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False)

Before we train, let's also set up a way to store the history of the model's loss and accuracy. We are going to record and graph the following for each epoch:

- Training Loss
- Validation Loss (using "test" set)
- Training Accuracy
- Validation Accuracy (using "test" set)

Yes, this means that after every training epoch, we will be "validating" as well. We shouldn't call this "testing" since it is being used to gauge how the model is training. This lets us see after how many epochs the model begins to overfit to the training data.

In [None]:
history = {'train_loss': [],
            'val_loss': [],
            'train_acc': [],
            'val_acc': []}

<font color='red'> **rubric={15 points}** </font> 

In [None]:
num_epochs = 100
for epoch in range(num_epochs):  # Adjust num_epochs as needed
    sign_model.train()                                          # Be sure to set the model to training mode to start!
    counter = 0                                             # Counts the number of batches we have gone through - this is needed to calculate the training accuracy.
    temp_acc = 0.0                                          # Stores the sum of all training accuracy readings for a given epoch.
    temp_loss = 0.0                                         # Stores the sum of the loss found in each batch by the built-in loss function, whatever that was defined as.
    for i, (inputs, labels) in enumerate(train_loader):         # Loop through the training DataLoader - this acts as our batcher.
        inputs, labels = inputs.to(device), labels.to(device)

        
        # YOUR CODE STARTS HERE (Hint: Forward pass, loss computation, backward pass, optimizer step) You may need to update the variable names in the following starter code here based on your implementation.


        # YOUR CODE ENDS HERE


        _, predicted = torch.max(outputs, 1)    # Get the class (index) with the highest logit for each sample.
                                                # (print out the object "outputs" and read docs on torch.max if you're curious why this is written like this)
        _, labels_num = torch.max(labels, 1)

        correct = (predicted == labels_num).sum().item() # Get the number of correctly predicted items in the batch.

        total = len(predicted)
        train_accuracy = float(correct) / total     # Self-explanatory

        temp_acc += train_accuracy
        temp_loss += (loss.item())


        counter += 1.0

    print(f"Epoch {epoch+1}, Training Loss: {temp_loss / counter}")
    history['train_loss'].append(temp_loss / counter)
    history['train_acc'].append(temp_acc / counter)

    sign_model.eval()

    accumulated_test_loss = 0.0
    total = 0.0
    correct = 0.0
    # Iterate over the test set without tracking gradients
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = sign_model(inputs)
            test_loss = criterion(outputs, labels).item()  # .item() to get scalar value
            accumulated_test_loss += test_loss
            _, predicted = torch.max(outputs[0], 0)    # These come out with an extra dimension due to being in batches of 1. The [0] eliminates that problem.
            _, num_label = torch.max(labels[0], 0)
            total += 1
            if predicted == num_label:
                correct += 1.0

    # Average the validation loss over all batches in the test loader
    avg_test_loss = accumulated_test_loss / len(test_loader)
    history['val_loss'].append(avg_test_loss)

    test_accuracy = float(correct) / total
    history['val_acc'].append(test_accuracy)
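
The loops above rely on `torch.max` with a dimension argument, which returns both the maximum values and their indices. A short sketch on made-up logits (the numbers in `outputs` are arbitrary) shows why the second return value is the predicted class:

```python
import torch

# Dummy logits: 2 samples, 3 classes.
outputs = torch.tensor([[0.1, 2.5, -1.0],
                        [3.0, 0.2, 0.4]])

# Reduce over dimension 1 (the class dimension).
values, indices = torch.max(outputs, 1)

print(values)   # tensor([2.5000, 3.0000])
print(indices)  # tensor([1, 0])
```

The same call applied to the one-hot labels returns the position of the 1 in each row, so comparing the two index tensors counts the correct predictions.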



<a name='5'></a>
## 5 - Training History

PyTorch does not have a built-in "history" object like TensorFlow does, which is why we made one in the previous part. Now we can observe these values and start charting them.

Now visualize the loss over time by turning the measurements into charts:

In [None]:
# This code was written by the original author of the TensorFlow version of this worksheet. It works but I take no responsibility for the readability.
# If you stored your data points differently than I did, this will probably need to be modified.
df_loss_acc = pd.DataFrame(history)
# rename() returns a new DataFrame, which avoids modifying a slice of df_loss_acc in place.
df_loss = df_loss_acc[['train_loss', 'val_loss']].rename(columns={'train_loss': 'train', 'val_loss': 'validation'})
df_acc = df_loss_acc[['train_acc', 'val_acc']].rename(columns={'train_acc': 'train', 'val_acc': 'validation'})
df_loss.plot(title='Model loss',figsize=(12,8)).set(xlabel='Epoch',ylabel='Loss')
df_acc.plot(title='Model Accuracy',figsize=(12,8)).set(xlabel='Epoch',ylabel='Accuracy')

**Congratulations**! You've finished the assignment and built two models: One that recognizes smiles, and another that recognizes sign language digits with almost 80% accuracy on the test set. In addition to that, you now also understand the applications of two Torch APIs: Sequential and Module. Nicely done!

By now, you know a bit about how the Module API works and may have glimpsed the possibilities. In your next assignment, you'll really get a feel for its power when you get the opportunity to build a very deep ConvNet, using ResNets!

<a name='6'></a>
## 6 - Bibliography

You're always encouraged to read the official documentation. To that end, you can find the docs for the Sequential and Module APIs here:

https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html

https://pytorch.org/docs/stable/generated/torch.nn.Module.html