<a href="https://colab.research.google.com/github/Pattiecodes/DataCamp_As.AIEng/blob/main/Module_3_IntroToDeepLearningWithPyTorch_Learn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is for Module 3 of the "AI for Data Scientists" track from DataCamp.

# Getting started with PyTorch tensors
Tensors are the primary data structure in PyTorch and will be the building blocks for our deep learning models. They share many similarities with NumPy arrays but have some unique attributes too.

In this exercise, you'll practice creating a tensor from a Python list of temperature data from two weather stations. The Python list is named temperatures and has two sublists whose elements represent a different day each, with columns for readings from two stations.

Instructions
100 XP
Begin by importing PyTorch.
Create a tensor from the Python list temperatures.

In [None]:
# Import PyTorch
import torch

temperatures = [[72, 75, 78], [70, 73, 76]]

# Create a tensor from temperatures
temp_tensor = torch.tensor(temperatures)

# Checking and adding tensors
While continuing your temperature data collection, you realize that the recorded temperatures are off by 2 degrees, so you need to add 2 degrees to the tensor of temperatures. Before adjusting the data, you want to verify the shape and type of the tensor, to make sure they are compatible to be added together. The torch library has been pre-imported.

Instructions
100 XP
Check the shape of the temperatures tensor.
Check the type of the temperatures tensor.
Add the temperatures and adjustment tensors.

In [None]:
temperatures = torch.tensor([[72, 75, 78], [70, 73, 76]])
adjustment = torch.tensor([[2, 2, 2], [2, 2, 2]])

# Check the shape of the temperatures tensor
temp_shape = temperatures.shape
print("Shape of temperatures:", temp_shape)

# Check the type of the temperatures tensor
temp_type = temperatures.dtype
print("Data type of temperatures:", temp_type)

# Adjust the temperatures by adding the adjustment tensor
# The sum of two tensors is already a tensor, so no torch.tensor() wrapper is needed
corrected_temperatures = temperatures + adjustment

print("Corrected temperatures:", corrected_temperatures)
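
The shape check matters because element-wise addition needs compatible shapes. As a small sketch of what "compatible" can mean, PyTorch also broadcasts a scalar across every element, so adding `2` gives the same result as adding the full `adjustment` tensor. The name `corrected_scalar` is introduced here for illustration only.

```python
import torch

temperatures = torch.tensor([[72, 75, 78], [70, 73, 76]])
adjustment = torch.tensor([[2, 2, 2], [2, 2, 2]])

# Element-wise addition of two tensors with the same shape
corrected = temperatures + adjustment

# Broadcasting: the scalar 2 is added to every element
corrected_scalar = temperatures + 2

print(corrected.shape, corrected.dtype)
print(torch.equal(corrected, corrected_scalar))
```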

# Creating my first Neural Network

# Your first neural network
In this exercise, you will implement a small neural network containing two linear layers. The first layer takes an eight-dimensional input, and the last layer outputs a one-dimensional tensor.

The torch package and the torch.nn package have already been imported for you.

Instructions
100 XP
Create a neural network of two linear layers that takes a tensor of dimensions 1x8 as input, representing 8 features, and outputs a tensor of dimensions 1x1.
Use any output dimension for the first layer you want.

In [None]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[2, 3, 6, 7, 9, 3, 2, 1]])

# Implement a small neural network with two linear layers
model = nn.Sequential(nn.Linear(8, 4),
                      nn.Linear(4, 1)
                     )

output = model(input_tensor)
print(output)

# The sigmoid and softmax functions
The sigmoid and softmax functions are two of the most popular activation functions in deep learning. They are both usually used as the last step of a neural network. Sigmoid functions are used for binary classification problems, whereas softmax functions are often used for multiclass classification problems.

Let's say that you have a neural network that returns the values contained in input_tensor as a pre-activation output. Apply the activation function that matches each use case described to get the output.

torch.nn is already imported as nn.

Instructions 1/2
50 XP
Create a sigmoid function and apply it on input_tensor to generate a probability for a binary classification task.

In [None]:
input_tensor = torch.tensor([[0.8]])

# Create a sigmoid function and apply it on input_tensor
sigmoid = nn.Sigmoid()
probability = sigmoid(input_tensor)
print(probability)

Create a softmax function and apply it on input_tensor to generate a probability for a multiclass classification task.

In [None]:
input_tensor = torch.tensor([[1.0, -6.0, 2.5, -0.3, 1.2, 0.8]])

# Create a softmax function and apply it on input_tensor
softmax = nn.Softmax(dim=-1)
probabilities = softmax(input_tensor)
print(probabilities)
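
To make the two activations less of a black box, here is a minimal sketch that computes them from their formulas and compares the results with `nn.Sigmoid` and `nn.Softmax`. The names `manual_sigmoid`, `manual_softmax`, `x`, and `scores` are chosen here for illustration; the input values are the ones used in the cells above.

```python
import torch
import torch.nn as nn

x = torch.tensor([[0.8]])
scores = torch.tensor([[1.0, -6.0, 2.5, -0.3, 1.2, 0.8]])

# Sigmoid: 1 / (1 + exp(-x))
manual_sigmoid = 1 / (1 + torch.exp(-x))

# Softmax: exp(score) divided by the sum of exp(scores) along the last dimension
exp_scores = torch.exp(scores)
manual_softmax = exp_scores / exp_scores.sum(dim=-1, keepdim=True)

print(torch.allclose(manual_sigmoid, nn.Sigmoid()(x)))
print(torch.allclose(manual_softmax, nn.Softmax(dim=-1)(scores)))

# Softmax outputs form a probability distribution, so they sum to 1 (up to rounding)
print(manual_softmax.sum())
```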

# Building a binary classifier in PyTorch
Recall that a small neural network with a single linear layer followed by a sigmoid function is a binary classifier. It acts just like a logistic regression.

In this exercise, you'll practice building this small network and interpreting the output of the classifier.

The torch package and the torch.nn package have already been imported for you.

Instructions 1/2
50 XP
Create a neural network that takes a tensor of dimensions 1x8 as input, and returns an output of the correct shape for binary classification.
Pass the output of the linear layer to a sigmoid, which both takes in and returns a single float.

In [None]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 2, 3, 6, 8, 9]])

# Implement a small neural network for binary classification
model = nn.Sequential(
  nn.Linear(8, 1),
  nn.Sigmoid()
)

output = model(input_tensor)
print(output)
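
The comparison with logistic regression can be checked directly. The sketch below recomputes the model output by hand as sigmoid(x · Wᵀ + b) and then thresholds the probability at 0.5 to get a class label. The seed, the 0.5 threshold, and the names `linear`, `manual`, and `predicted_class` are assumptions made for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random weights are reproducible
input_tensor = torch.tensor([[3., 4., 6., 2., 3., 6., 8., 9.]])

linear = nn.Linear(8, 1)
model = nn.Sequential(linear, nn.Sigmoid())

# Logistic regression by hand: sigmoid(x @ W^T + b)
manual = torch.sigmoid(input_tensor @ linear.weight.T + linear.bias)

output = model(input_tensor)
print(output, manual)

# Threshold the probability at 0.5 to turn it into a class label
predicted_class = (output >= 0.5).int()
print(predicted_class)
```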

# From regression to multi-class classification
Recall that the models we have seen for binary classification, multi-class classification and regression have all been similar, barring a few tweaks to the model.

In this exercise, you'll start by building a model for regression, and then tweak the model to perform a multi-class classification.

Instructions 1/2
50 XP
Create a 4-layer linear neural network compatible with input_tensor as the input, and a regression value as output.

In [None]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 7, 10, 12, 2, 3, 6, 8, 9]])

# Implement a neural network with exactly four linear layers
model = nn.Sequential(
  nn.Linear(11, 20),
  nn.Linear(20, 12),
  nn.Linear(12, 6),
  nn.Linear(6, 1)
)

output = model(input_tensor)
print(output)

Update the network provided to perform a multi-class classification with four outputs.

In [None]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3,4,6,7,10,12,2,3,6,8,9]])

#Update network below to perform a multi-class classification with four labels
model = nn.Sequential(
    nn.Linear(11, 20),
    nn.Linear(20, 12),
    nn.Linear(12, 6),
    nn.Linear(6, 4),
    nn.Softmax(dim=-1)
)

output = model(input_tensor)
print(output)

# Creating one-hot encoded labels
One-hot encoding is a technique that turns a single integer label into a vector of N elements, where N is the number of classes in your dataset. This vector only contains zeros and ones. In this exercise, you'll create the one-hot encoded vector of the label y provided.

You'll practice doing this manually, and then make your life easier by leveraging the help of PyTorch! Your dataset contains three classes, and the class labels range from 0 to 2 (e.g., 0, 1, 2).

NumPy is already imported as np, and torch.nn.functional as F. The torch package is also imported.

Instructions
100 XP
Manually create a one-hot encoded vector of the ground truth label y by filling in the NumPy array provided.
Create a one-hot encoded vector of the ground truth label y using PyTorch.

In [None]:
y = 1
num_classes = 3

# Create the one-hot encoded vector using NumPy
one_hot_numpy = np.array([0, 1, 0])

# Create the one-hot encoded vector using PyTorch
one_hot_pytorch = F.one_hot(torch.tensor(y), num_classes)
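
The NumPy array above is typed in by hand. A slightly more general sketch, assuming the same `y` and `num_classes`, builds the vector from the label itself and also shows that `F.one_hot` encodes a whole batch of labels at once. The `batch` variable and its labels are made up for this example.

```python
import numpy as np
import torch
import torch.nn.functional as F

y = 1
num_classes = 3

# Manual version: start from zeros and set the position of the label to 1
one_hot_numpy = np.zeros(num_classes, dtype=int)
one_hot_numpy[y] = 1

# PyTorch version
one_hot_pytorch = F.one_hot(torch.tensor(y), num_classes=num_classes)

print(one_hot_numpy)    # [0 1 0]
print(one_hot_pytorch)  # tensor([0, 1, 0])

# A batch of labels is encoded row by row
batch = F.one_hot(torch.tensor([0, 2, 1]), num_classes=num_classes)
print(batch.shape)  # torch.Size([3, 3])
```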

# Calculating cross entropy loss
Cross entropy loss is one of the most common ways to measure loss for classification problems. In this exercise, you will calculate cross entropy loss in PyTorch for a vector of predicted scores and a ground truth label. You are provided with the ground truth label y and the scores vector, a vector of model predictions before the final softmax function.

You'll start by creating a one-hot encoded vector of the ground truth label y. Next, you'll instantiate a cross entropy loss function. Last, you'll call the loss function, which takes scores, and the one-hot encoded ground truth label, as inputs. Its output will be a single float, the loss of that sample.

torch, CrossEntropyLoss, and torch.nn.functional as F have already been imported for you.

Instructions 1/3
35 XP
Create the one-hot encoded vector of the ground truth label y, with 4 features (one for each class), and assign it to one_hot_label.

In [None]:
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

# Create a one-hot encoded vector of the label y
one_hot_label = F.one_hot(torch.tensor(y), num_classes=4)

Instructions 2/3
35 XP
Create the cross entropy loss function and store it as criterion.

In [None]:
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

# Create a one-hot encoded vector of the label y
one_hot_label = F.one_hot(torch.tensor(y), num_classes = scores.shape[1])

# --- Code added ---
# Create the cross entropy loss function
criterion = CrossEntropyLoss()

Instructions 3/3
30 XP
Calculate the cross entropy loss using the one_hot_label vector and the scores vector, by calling the criterion loss function you created.

In [None]:
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

# Create a one-hot encoded vector of the label y
one_hot_label = F.one_hot(torch.tensor(y), scores.shape[1])

# Create the cross entropy loss function
criterion = CrossEntropyLoss()

# --- Code added ---
# Calculate the cross entropy loss
loss = criterion(scores.double(), one_hot_label.double())
print(loss)
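
For a single sample with a one-hot target, cross entropy reduces to the negative log of the softmax probability assigned to the true class. The sketch below checks this by hand and also shows that `CrossEntropyLoss` accepts the integer class index directly, which avoids the one-hot step. It assumes a recent PyTorch version in which class-probability targets are supported; the names `loss_one_hot`, `loss_index`, and `manual` are illustrative.

```python
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

criterion = CrossEntropyLoss()

# Loss from the one-hot (class probability) target, as in the exercise
one_hot_label = F.one_hot(torch.tensor(y), num_classes=scores.shape[1])
loss_one_hot = criterion(scores, one_hot_label.float())

# Loss from the integer class index, which CrossEntropyLoss also accepts
loss_index = criterion(scores, torch.tensor(y))

# Manual computation: -log(softmax(scores)[true class])
manual = -torch.log_softmax(scores, dim=-1)[0, y[0]]

print(loss_one_hot, loss_index, manual)
```

The loss is large here because the highest score (6.0) belongs to class 1 while the true class is 2.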

# Accessing the model parameters
A PyTorch model created with nn.Sequential() is a module that contains the different layers of your network. Recall that each layer's parameters can be accessed by indexing the created model directly. In this exercise, you will practice accessing the parameters of different linear layers of a neural network.

Instructions
100 XP
Access the weight parameter of the first linear layer.
Access the bias parameter of the second linear layer.

In [None]:
model = nn.Sequential(nn.Linear(16, 8),
                      nn.Linear(8, 2)
                     )

# Access the weight of the first linear layer
weight_0 = model[0].weight

# Access the bias of the second linear layer
bias_1 = model[1].bias

# Updating the weights manually
Now that you know how to access weights and biases, you will manually perform the job of the PyTorch optimizer. PyTorch functions can do what you're about to do, but it's helpful to do the work manually at least once, to understand what's going on under the hood.

A neural network of three layers has been created and stored as the model variable. This network has been used for a forward pass and the loss and its derivatives have been calculated. A default learning rate, lr, has been chosen to scale the gradients when performing the update.

Instructions 1/2
50 XP
Create the gradient variables by accessing the local gradients of each weight tensor.

In [None]:
weight0 = model[0].weight
weight1 = model[1].weight
weight2 = model[2].weight

# Access the gradients of the weight of each linear layer
grads0 = model[0].weight.grad
grads1 = model[1].weight.grad
grads2 = model[2].weight.grad

Instructions 2/2
50 XP
Update the weights using the gradients scaled by the learning rate.

In [None]:
weight0 = model[0].weight
weight1 = model[1].weight
weight2 = model[2].weight

# --- Code edited ---
# Access the gradients of the weight of each linear layer
grads0 = weight0.grad
grads1 = weight1.grad
grads2 = weight2.grad

# --- Code added ---
# Update the weights using the learning rate and the gradients
weight0 = weight0 - lr * grads0
weight1 = weight1 - lr * grads1
weight2 = weight2 - lr * grads2

# Note: lr is supplied by the exercise environment rather than defined in this cell; 0.001 is a typical value.
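
One caveat about the cell above: `weight0 = weight0 - lr * grads0` computes the updated values as a new tensor and rebinds the Python name, so the parameters stored inside `model` stay unchanged. A minimal sketch of an update that does modify the model is shown below. The three-layer model, the random input, the squared-output loss, and `lr = 0.001` are stand-ins assumed for illustration, since the exercise provides its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical stand-ins for the objects the exercise provides
model = nn.Sequential(nn.Linear(16, 8), nn.Linear(8, 4), nn.Linear(4, 2))
lr = 0.001

# Forward and backward pass on random data to populate .grad
loss = model(torch.randn(5, 16)).pow(2).mean()
loss.backward()

before = model[0].weight.detach().clone()
expected = before - lr * model[0].weight.grad

# In-place update; no_grad() keeps the update itself out of the autograd graph
with torch.no_grad():
    for layer in model:
        layer.weight -= lr * layer.weight.grad
        layer.bias -= lr * layer.bias.grad

print(torch.allclose(model[0].weight, expected))
```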

# Using the PyTorch optimizer
In the previous exercise, you manually updated the weight of a network. You now know what's going on under the hood, but this approach is not scalable to a network of many layers.

Thankfully, the PyTorch SGD optimizer does a similar job in a handful of lines of code. In this exercise, you will practice the last step to complete the training loop: updating the weights using a PyTorch optimizer.

A neural network has been created and provided as the model variable. This model was used to run a forward pass and create the tensor of predictions pred. The one-hot encoded tensor is named target and the cross entropy loss function is stored as criterion.

torch.optim as optim, and torch.nn as nn have already been loaded for you.

Instructions 1/2
50 XP
Use optim to create an SGD optimizer with a learning rate of your choice (must be less than one) for the model provided.

In [None]:
# Create the optimizer
optimizer = optim.SGD(model.parameters(), lr = 0.001)

Instructions 2/2
50 XP
Update the model's parameters using the optimizer.

In [None]:
# Create the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

# --- Added code ---
loss = criterion(pred, target)
loss.backward()

# Update the model's parameters using the optimizer
optimizer.step()

# Using the MSELoss
For regression problems, we often use Mean Squared Error (MSE) as a loss function instead of cross-entropy. MSE is the mean of the squared differences between predicted values (y_pred) and actual values (y). In this exercise, you'll compute MSE loss using both NumPy and PyTorch.

The torch package has been imported, along with numpy as np and torch.nn as nn.

Instructions
100 XP
Calculate the MSE loss using NumPy.
Create an MSE loss function using PyTorch.
Convert y_pred and y to tensors and then float data types, and then use them to calculate MSELoss using PyTorch as mse_pytorch.

In [None]:
y_pred = np.array(10)
y = np.array(1)

# Calculate the MSELoss using NumPy
mse_numpy = np.mean((y_pred - y)**2)

# Create the MSELoss function
criterion = nn.MSELoss()

# Calculate the MSELoss using the created loss function
mse_pytorch = criterion(torch.tensor(y_pred).float(), torch.tensor(y).float())
print(mse_pytorch)
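
With a single value, the "mean" in MSE is easy to overlook. The sketch below repeats the calculation on three made-up predictions so the averaging step is visible; the values are hypothetical and chosen so the arithmetic can be checked by hand.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical predictions and targets
y_pred = np.array([10.0, 2.0, 5.0])
y = np.array([1.0, 2.0, 3.0])

# Squared errors are 81, 0 and 4, so the mean is 85 / 3
mse_numpy = np.mean((y_pred - y) ** 2)

criterion = nn.MSELoss()
mse_pytorch = criterion(torch.tensor(y_pred).float(), torch.tensor(y).float())

print(mse_numpy, mse_pytorch.item())
```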

# Writing a training loop
In scikit-learn, the training loop is wrapped in the .fit() method, while in PyTorch, it's set up manually. While this adds flexibility, it requires a custom implementation.

In this exercise, you'll create a loop to train a model for salary prediction.

The show_results() function is provided to help you visualize some sample predictions.

The package imports provided are: pandas as pd, torch, torch.nn as nn, torch.optim as optim, as well as DataLoader and TensorDataset from torch.utils.data.

The following variables have been created: num_epochs, containing the number of epochs (set to 5); dataloader, containing the dataloader; model, containing the neural network; criterion, containing the loss function, nn.MSELoss(); optimizer, containing the SGD optimizer.

Instructions 1/3
35 XP
Write a for loop that iterates over the dataloader; this should be nested within a for loop that iterates over a range equal to the number of epochs.
Set the gradients of the optimizer to zero.

In [None]:
# Loop over the number of epochs and then the dataloader
for i in range(num_epochs):
  for data in dataloader:
    # Set the gradients to zero
    optimizer.zero_grad()

Instructions 2/3
35 XP
Write the forward pass.
Compute the MSE loss value using the criterion() function provided.
Compute the gradients.

In [None]:
# Loop over the number of epochs and the dataloader
for i in range(num_epochs):
  for data in dataloader:
    # Set the gradients to zero
    optimizer.zero_grad()

    # --- Added code ---
    # Run a forward pass
    feature, target = data
    prediction = model(feature)
    # Calculate the loss
    loss = criterion(prediction, target)
    # Compute the gradients
    loss.backward()

Instructions 3/3
30 XP
Update the model's parameters.

In [None]:
# Loop over the number of epochs and the dataloader
for i in range(num_epochs):
  for data in dataloader:
    # Set the gradients to zero
    optimizer.zero_grad()
    # Run a forward pass
    feature, target = data
    prediction = model(feature)
    # Calculate the loss
    loss = criterion(prediction, target)
    # Compute the gradients
    loss.backward()

    # --- Added code ---
    # Update the model's parameters
    optimizer.step()
show_results(model, dataloader)
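
The cell above depends on objects created by the exercise environment. For running the same loop outside DataCamp, here is a self-contained sketch. The synthetic data, the model sizes, the batch size, and the learning rate are all assumptions, and `show_results()` is left out because it is not defined in this notebook.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data (an assumption; the exercise uses salary data)
features = torch.randn(64, 4)
true_weights = torch.tensor([[1.0], [-2.0], [0.5], [3.0]])
target = features @ true_weights + 0.1 * torch.randn(64, 1)
dataloader = DataLoader(TensorDataset(features, target), batch_size=8, shuffle=True)

model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
num_epochs = 5

epoch_losses = []
for epoch in range(num_epochs):
    running_loss = 0.0
    for feature, batch_target in dataloader:
        optimizer.zero_grad()                       # reset gradients
        prediction = model(feature)                 # forward pass
        loss = criterion(prediction, batch_target)  # compute the loss
        loss.backward()                             # compute gradients
        optimizer.step()                            # update parameters
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(dataloader))
    print(f"epoch {epoch}: mean loss {epoch_losses[-1]:.4f}")
```

Tracking the mean loss per epoch is a simple way to confirm that training is making progress.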

# Implementing ReLU
The rectified linear unit (or ReLU) function is one of the most common activation functions in deep learning.

It overcomes the training problems linked with the sigmoid function you learned, such as the vanishing gradients problem.

In this exercise, you'll begin with a ReLU implementation in PyTorch. Next, you'll calculate the gradients of the function.

The nn module has already been imported for you.

Instructions 1/2
50 XP
Create a ReLU function in PyTorch.

In [None]:
# Create a ReLU function with PyTorch
relu_pytorch = nn.ReLU()

Instructions 2/2
50 XP
Calculate the gradient of the ReLU function for x using the relu_pytorch() function you defined, then running a backward pass.
Find the gradient at x.

In [None]:
# Create a ReLU function with PyTorch
relu_pytorch = nn.ReLU()

# Apply your ReLU function on x, and calculate gradients
x = torch.tensor(-1.0, requires_grad=True)
y = relu_pytorch(x)
y.backward()

# Print the gradient of the ReLU function for x
gradient = x.grad
print(gradient)
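
The gradient printed above is zero because the input is negative. A short sketch comparing a negative and a positive input makes the two regimes of ReLU explicit; the input value 2.0 and the `gradients` dictionary are added here for illustration.

```python
import torch
import torch.nn as nn

relu_pytorch = nn.ReLU()

gradients = {}
for value in (-1.0, 2.0):
    x = torch.tensor(value, requires_grad=True)
    y = relu_pytorch(x)
    y.backward()
    gradients[value] = x.grad.item()

# Gradient is 0 for the negative input and 1 for the positive input
print(gradients)  # {-1.0: 0.0, 2.0: 1.0}
```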

# Implementing leaky ReLU
You've learned that ReLU is one of the most used activation functions in deep learning. You will find it in many modern architectures. However, it outputs zero for negative inputs and therefore has a zero gradient there. A unit whose input stays negative receives no gradient, so the weights feeding it stop updating and the unit can remain at zero for the rest of training. Leaky ReLU overcomes this challenge by multiplying negative inputs by a small factor instead of setting them to zero.

In this exercise, you will implement the leaky ReLU function in NumPy and PyTorch and practice using it. The numpy as np package, the torch package as well as the torch.nn as nn have already been imported.

Instructions 1/2
50 XP
Create a leaky ReLU function in PyTorch with a negative slope of 0.05.
Call the function on the tensor x, which has already been defined for you.

In [None]:
# Create a leaky relu function in PyTorch
leaky_relu_pytorch = nn.LeakyReLU(negative_slope=0.05)

x = torch.tensor(-2.0)
# Call the above function on the tensor x
output = leaky_relu_pytorch(x)
print(output)
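
The exercise text mentions a NumPy implementation, which does not appear in the cell above. A possible sketch follows, together with a check that the gradient for a negative input equals the negative slope rather than zero. The function name `leaky_relu_numpy` is chosen here and is not part of any library.

```python
import numpy as np
import torch
import torch.nn as nn

def leaky_relu_numpy(x, negative_slope=0.05):
    # Keep positive inputs unchanged, scale negative inputs by the slope
    return np.where(x > 0, x, negative_slope * x)

leaky_relu_pytorch = nn.LeakyReLU(negative_slope=0.05)

x = torch.tensor(-2.0, requires_grad=True)
output = leaky_relu_pytorch(x)
output.backward()

print(leaky_relu_numpy(np.array(-2.0)))  # -0.1
print(output.item())                     # approximately -0.1
print(x.grad.item())                     # approximately 0.05, the negative slope
```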

# Counting the number of parameters
Deep learning models are famous for having a lot of parameters. Recent language models have billions of parameters. With more parameters comes more computational complexity and longer training times, and a deep learning practitioner must know how many parameters their model has.

In this exercise, you will calculate the number of parameters in your model, first manually, and then using PyTorch.

The torch.nn package has been imported as nn.

Instructions 1/2
50 XP
Question
Calculate manually the number of parameters of the model below. How many does it have?

```
model = nn.Sequential(nn.Linear(16, 4),
                      nn.Linear(4, 2),
                      nn.Linear(2, 1))
```



In [None]:
model = nn.Sequential(nn.Linear(16, 4),
                      nn.Linear(4, 2),
                      nn.Linear(2, 1))

total = 0
for parameter in model.parameters():
    total += parameter.numel()
print(total)

Instructions 2/2
50 XP
Now, confirm your manual calculation by iterating through the model's parameters to update the total variable with the total number of parameters in the model.

In [None]:
model = nn.Sequential(nn.Linear(16, 4),
                      nn.Linear(4, 2),
                      nn.Linear(2, 1))

total = 0

# Calculate the number of parameters in the model
for parameter in model.parameters():
  total += parameter.numel()

print(f"The number of parameters in the model is {total}")
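
For the manual part of the question, each `nn.Linear(in, out)` layer with a bias holds `in * out` weights plus `out` biases. A worked version of that arithmetic, checked against PyTorch, might look like this (`manual_total` is a name introduced here):

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 4),
                      nn.Linear(4, 2),
                      nn.Linear(2, 1))

# Weights (in * out) plus biases (out) for each layer: 68 + 10 + 3
manual_total = (16 * 4 + 4) + (4 * 2 + 2) + (2 * 1 + 1)

total = sum(p.numel() for p in model.parameters())
print(manual_total, total)  # 81 81
```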

# Manipulating the capacity of a network
In this exercise, you will practice creating neural networks with different capacities. The capacity of a network reflects the number of parameters in said network. To help you, a calculate_capacity() function has been implemented, as follows:


```
def calculate_capacity(model):
  total = 0
  for p in model.parameters():
    total += p.numel()
  return total
```
This function returns the number of parameters in your model.

The dataset you are training this network on has n_features features and n_classes classes. The torch.nn package has been imported as nn.

Instructions 1/2
50 XP
Create a 3-layer linear neural network with <120 parameters, using n_features as input and n_classes as output sizes.


In [None]:
n_features = 8
n_classes = 2

input_tensor = torch.Tensor([[3, 4, 6, 2, 3, 6, 8, 9]])

# Create a neural network with less than 120 parameters
model = nn.Sequential(nn.Linear(n_features, 8),
                      nn.Linear(8, 4),
                      nn.Linear(4, n_classes))
output = model(input_tensor)

print(calculate_capacity(model))

Instructions 2/2
50 XP
Create a 4-layer linear neural network with more than 120 parameters, using n_features as input and n_classes as output sizes.

In [None]:
n_features = 8
n_classes = 2

input_tensor = torch.Tensor([[3, 4, 6, 2, 3, 6, 8, 9]])

# Create a neural network with more than 120 parameters
model = nn.Sequential(nn.Linear(n_features, 16),
                      nn.Linear(16, 8),
                      nn.Linear(8, 3),
                      nn.Linear(3, n_classes))

output = model(input_tensor)

print(calculate_capacity(model))

# Experimenting with learning rate
In this exercise, your goal is to find the optimal learning rate such that the optimizer can find the minimum of a non-convex function in ten steps.

You will experiment with three different learning rate values. For this problem, try learning rate values between 0.001 and 0.1.

You are provided with the optimize_and_plot() function that takes the learning rate for the first argument. This function will run 10 steps of the SGD optimizer and display the results.

Instructions 1/3
35 XP
Try a small learning rate value such that the optimizer isn't able to get past the first minimum on the right.

In [None]:
# Try a first learning rate value
lr0 = 0.01
optimize_and_plot(lr=lr0)

Try a large learning rate value such that the optimizer skips past the global minimum at -2.

In [None]:
# Try a second learning rate value
lr1 = 0.1
optimize_and_plot(lr=lr1)

Based on the previous results, try a better learning rate value.



In [None]:
# Try a third learning rate value
lr2 = 0.09
optimize_and_plot(lr=lr2)

# Experimenting with momentum
In this exercise, your goal is to find the optimal momentum such that the optimizer can find the minimum of a non-convex function in 20 steps. You will experiment with two different momentum values. For this problem, the learning rate is fixed at 0.01.

You are provided with the optimize_and_plot() function that accepts as input the momentum parameter. This function will run 20 steps of the SGD optimizer and display the results.

Instructions 1/2
50 XP
Try a first value for the momentum such that the optimizer gets stuck in the first minimum.

In [None]:
# Try a first value for momentum
mom0 = 0
optimize_and_plot(momentum=mom0)

Try a second value for the momentum such that the optimizer finds the global optimum.

In [None]:
# Try a second value for momentum
mom1 = 0.92
optimize_and_plot(momentum=mom1)
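
Since `optimize_and_plot()` hides the update rule, here is a sketch of what SGD with momentum does at each step: it keeps a running velocity (`velocity = momentum * velocity + gradient`) and moves the parameter by `lr * velocity`. The function `f` below is a hypothetical non-convex function, not the one from the exercise, and the starting point and momentum value of 0.9 are arbitrary; the learning rate of 0.01 and the 20 steps follow the exercise text.

```python
import torch
import torch.optim as optim

def f(x):
    # Hypothetical non-convex function, not the one used in the exercise
    return x**4 - 3 * x**2 + x

lr, momentum, steps = 0.01, 0.9, 20

# Reference: PyTorch SGD with momentum
x = torch.tensor(1.0, requires_grad=True)
optimizer = optim.SGD([x], lr=lr, momentum=momentum)
for _ in range(steps):
    optimizer.zero_grad()
    f(x).backward()
    optimizer.step()

# Manual version of the same rule
x_manual, velocity = 1.0, 0.0
for _ in range(steps):
    grad = 4 * x_manual**3 - 6 * x_manual + 1  # derivative of f
    velocity = momentum * velocity + grad
    x_manual -= lr * velocity

print(x.item(), x_manual)
```

With `momentum = 0`, the velocity equals the current gradient and the rule reduces to plain SGD, which is why the optimizer is more likely to stay in the first minimum it reaches.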

# Transfer Learning AND Fine-Tuning

## Freeze layers of a model
You are about to fine-tune a model on a new task after loading pre-trained weights. The model contains three linear layers. However, because your dataset is small, you only want to train the last linear layer of this model and freeze the first two linear layers.

The model has already been created and exists under the variable model. You will be using the named_parameters method of the model to list the parameters of the model. Each parameter is described by a name. This name is a string with the following naming convention: x.name where x is the index of the layer.

Remember that a linear layer has two parameters: the weight and the bias.

Instructions
100 XP
Use an if statement to determine if the parameter should be frozen or not based on its name.
Freeze the parameters of the first two layers of this model.

In [None]:
for name, param in model.named_parameters():

    # Check if the parameters belong to the first layer
    if name == '0.weight' or name == '0.bias':

        # Freeze the parameters
        param.requires_grad = False

    # Check if the parameters belong to the second layer
    if name == '1.weight' or name == '1.bias':

        # Freeze the parameters
        param.requires_grad = False
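
The two `if` blocks can also be written as a single check on the layer index parsed from the parameter name. The sketch below uses a hypothetical three-layer model in place of the pre-trained one, lists which parameters remain trainable, and passes only those to the optimizer; the layer sizes and learning rate are assumptions.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical three-layer model standing in for the pre-trained one
model = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 4), nn.Linear(4, 2))

for name, param in model.named_parameters():
    # Names look like "0.weight"; the part before the dot is the layer index
    layer_index = int(name.split(".")[0])
    if layer_index < 2:
        param.requires_grad = False

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['2.weight', '2.bias']

# Optionally hand only the trainable parameters to the optimizer
optimizer = optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.001)
```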

# Layer initialization
The initialization of the weights of a neural network has been the focus of researchers for many years. When training a network, the method used to initialize the weights has a direct impact on the final performance of the network.

As a machine learning practitioner, you should be able to experiment with different initialization strategies. In this exercise, you are creating a small neural network made of two layers and you are deciding to initialize each layer's weights with the uniform method.

Instructions
100 XP
For each layer (layer0 and layer1), use the uniform initialization method to initialize the weights.

In [None]:
layer0 = nn.Linear(16, 32)
layer1 = nn.Linear(32, 64)

# Use uniform initialization for layer0 and layer1 weights
nn.init.uniform_(layer0.weight)
nn.init.uniform_(layer1.weight)

model = nn.Sequential(layer0, layer1)
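
By default, `nn.init.uniform_` draws values between 0 and 1; its `a` and `b` arguments set different bounds. A small sketch that inspects the resulting ranges follows. The bounds -0.1 and 0.1 are an arbitrary choice made for this example, not a recommendation from the course.

```python
import torch.nn as nn

layer0 = nn.Linear(16, 32)
layer1 = nn.Linear(32, 64)

# Default bounds: values drawn between 0 and 1
nn.init.uniform_(layer0.weight)

# Custom, zero-centred bounds (arbitrary values for illustration)
nn.init.uniform_(layer1.weight, a=-0.1, b=0.1)

print(layer0.weight.min().item(), layer0.weight.max().item())
print(layer1.weight.min().item(), layer1.weight.max().item())
```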

# Using the TensorDataset class
In practice, loading your data into a PyTorch dataset will be one of the first steps you take in order to create and train a neural network with PyTorch.

The TensorDataset class is very helpful when your dataset can be loaded directly as a NumPy array. Recall that TensorDataset() takes one or more tensors as input, so NumPy arrays must be converted to tensors first.

In this exercise, you'll practice creating a PyTorch dataset using the TensorDataset class.

torch and numpy have already been imported for you, along with the TensorDataset class.

Instructions
100 XP
Create a TensorDataset using the torch_features and the torch_target tensors provided (in this order).
Return the last element of the dataset.

In [None]:
import numpy as np
import torch
from torch.utils.data import TensorDataset

np_features = np.array(np.random.rand(12, 8))
np_target = np.array(np.random.rand(12, 1))

torch_features = torch.tensor(np_features)
torch_target = torch.tensor(np_target)

# Create a TensorDataset from two tensors
dataset = TensorDataset(torch_features.float(), torch_target.float())

# Return the last element of this dataset
print(dataset[-1])
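
Indexing a `TensorDataset` returns a tuple with one entry per tensor passed in, which is what the last step relies on. The sketch below unpacks the last sample and then wraps the dataset in a `DataLoader` to show the batch shapes. The batch size of 4 and the variable names `last_sample`, `last_label`, `batch_x`, and `batch_y` are assumptions for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

np_features = np.random.rand(12, 8)
np_target = np.random.rand(12, 1)

# from_numpy wraps the arrays as tensors; .float() casts float64 to float32
dataset = TensorDataset(torch.from_numpy(np_features).float(),
                        torch.from_numpy(np_target).float())

# Each item is a (features, target) tuple
last_sample, last_label = dataset[-1]
print(len(dataset), last_sample.shape, last_label.shape)

# Batches of 4 samples each
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)
batch_x, batch_y = next(iter(dataloader))
print(batch_x.shape, batch_y.shape)
```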

# From data loading to running a forward pass
In this exercise, you'll create a PyTorch DataLoader from a pandas DataFrame and call a model on this dataset. Specifically, you'll run a forward pass on a neural network. You'll continue working with fully connected neural networks, as you have done so far.

You'll begin by subsetting a loaded DataFrame called dataframe, converting features and targets to NumPy arrays, and converting those to PyTorch tensors in order to create a PyTorch dataset.

This dataset can be loaded into a PyTorch DataLoader, batched, shuffled, and used to run a forward pass on a custom fully connected neural network.

NumPy as np, pandas as pd, torch, TensorDataset(), and DataLoader() have been imported for you.

Instructions 1/3
Extract the features (ph, Sulfate, Conductivity, Organic_carbon) and target (Potability) values and load them into tensors to represent features and targets.
Use both tensors to generate a PyTorch dataset using the tensor dataset class.

In [None]:
# Load the different columns into two PyTorch tensors
features = torch.tensor(dataframe[['ph', 'Sulfate', 'Conductivity', 'Organic_carbon']].to_numpy()).float()
target = torch.tensor(dataframe['Potability'].to_numpy()).float()

# Create a dataset from the two generated tensors
dataset = TensorDataset(features, target)

Instructions 2/3
35 XP
Create a PyTorch DataLoader from the created TensorDataset; this DataLoader should use a batch_size of two and shuffle the dataset.

In [None]:
# Load the different columns into two PyTorch tensors
features = torch.tensor(dataframe[['ph', 'Sulfate', 'Conductivity', 'Organic_carbon']].to_numpy()).float()
target = torch.tensor(dataframe['Potability'].to_numpy()).float()

# Create a dataset from the two generated tensors
dataset = TensorDataset(features, target)

# Create a dataloader using the above dataset
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)
x, y = next(iter(dataloader))

Instructions 3/3
30 XP
Implement a small, fully connected neural network using exactly two linear layers and the nn.Sequential() API, where the final output size is 1.

In [None]:
# Load the different columns into two PyTorch tensors
features = torch.tensor(dataframe[['ph', 'Sulfate', 'Conductivity', 'Organic_carbon']].to_numpy()).float()
target = torch.tensor(dataframe['Potability'].to_numpy()).float()

# Create a dataset from the two generated tensors
dataset = TensorDataset(features, target)

# Create a dataloader using the above dataset
dataloader = DataLoader(dataset, shuffle=True, batch_size=2)
x, y = next(iter(dataloader))

# Create a model using the nn.Sequential API
model = nn.Sequential(
  nn.Linear(4, 16),
  nn.Linear(16, 1)
)
output = model(features)
print(output)

# Writing the evaluation loop
In this exercise, you will practice writing the evaluation loop. Recall that the evaluation loop is similar to the training loop, except that you will not perform the gradient calculation and the optimizer step.

The model has already been defined for you, along with the validationloader object, which is iterated batch by batch over the validation data.

Instructions 1/2
50 XP
Set the model to evaluation mode.
Sum the current batch loss to the validation_loss variable.

In [None]:
# Set the model to evaluation mode
model.eval()
validation_loss = 0.0

with torch.no_grad():

  for data in validationloader:

      outputs = model(data[0])
      loss = criterion(outputs, data[1])

      # Sum the current loss to the validation_loss variable
      validation_loss += loss.item()

Instructions 2/2
50 XP
Calculate the mean loss value for the epoch.
Set the model back to training mode.

In [None]:
# Set the model to evaluation mode
model.eval()
validation_loss = 0.0

with torch.no_grad():

  for data in validationloader:

      outputs = model(data[0])
      loss = criterion(outputs, data[1])

      # Sum the current loss to the validation_loss variable
      validation_loss += loss.item()

# Calculate the mean loss value
validation_loss_epoch = validation_loss / len(validationloader)
print(validation_loss_epoch)

# Set the model back to training mode
model.train()

# Calculating accuracy using torchmetrics
In addition to the losses, you should also be keeping track of the accuracy during training. By doing so, you will be able to select the epoch when the model performed the best.

In this exercise, you will practice using the torchmetrics package to calculate the accuracy. You will be using a sample of the facemask dataset. This dataset contains three different classes. The plot_errors function will display samples where the model predictions do not match the ground truth. Performing such error analysis will help you understand your model failure modes.

The torchmetrics package is already imported. The model outputs are the probabilities returned by a softmax as the last step of the model. The labels tensor contains the labels as one-hot encoded vectors.

Instructions 1/2
50 XP

Create an accuracy metric for a "multiclass" problem with three classes.
Calculate the accuracy for each batch of the dataloader.

In [None]:
# Create accuracy metric using torch metrics
metric = torchmetrics.Accuracy(task="multiclass", num_classes=3)
for data in dataloader:
    features, labels = data
    outputs = model(features)

    # Calculate accuracy over the batch
    acc = metric(outputs, labels.argmax(dim=-1))

Instructions 2/2
50 XP
Calculate accuracy for the epoch.
Reset the metric for the next epoch.

In [None]:
# Create accuracy metric using torch metrics
metric = torchmetrics.Accuracy(task="multiclass", num_classes=3)
for data in dataloader:
    features, labels = data
    outputs = model(features)

    # Calculate accuracy over the batch
    acc = metric(outputs, labels.argmax(dim=-1))

# Calculate accuracy over the whole epoch
acc = metric.compute()

# Reset the metric for the next epoch
metric.reset()
plot_errors(model, dataloader)
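
As a cross-check that does not need torchmetrics, multiclass accuracy can be computed by comparing the argmax of the model outputs with the argmax of the one-hot labels. The outputs and labels below are made-up values for four samples and three classes.

```python
import torch

# Hypothetical softmax outputs for four samples and three classes
outputs = torch.tensor([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4],
                        [0.6, 0.3, 0.1]])
# One-hot encoded ground truth labels
labels = torch.tensor([[1, 0, 0],
                       [0, 1, 0],
                       [0, 1, 0],
                       [0, 0, 1]])

predicted = outputs.argmax(dim=-1)  # tensor([0, 1, 2, 0])
true = labels.argmax(dim=-1)        # tensor([0, 1, 1, 2])

# Fraction of samples where the predicted class matches the true class
accuracy = (predicted == true).float().mean()
print(accuracy)  # tensor(0.5000)
```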