In [1]:
# Run this cell first: base path that shortens later dataset imports
path_data = '/home/nero/Documents/Estudos/DataCamp/Python/courses/Introduction_to_Deep_Learning_with_PyTorch/datasets/'

In [14]:
# exercise 01

"""
Building a binary classifier in PyTorch

Recall that a small neural network with a single linear layer followed by a sigmoid function is a binary classifier. It acts just like a logistic regression.

In this exercise, you'll practice building this small network and interpreting the output of the classifier.

The torch package and the torch.nn package have already been imported for you.
"""

# Instructions

"""

    Create a neural network that takes a tensor of dimensions 1x8 as input, and returns an output of the correct shape for binary classification.
    Pass the output of the linear layer to a sigmoid, which both takes in and returns a single float.
---
Question

Which of the following is false about the output returned by your binary classifier?
Possible answers:
(It can take any float values)
"""

# solution

import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 2, 3, 6, 8, 9]])

# Implement a small neural network for binary classification
model = nn.Sequential(
  nn.Linear(8,1),
  nn.Sigmoid()
)

output = model(input_tensor)
print(output)

#----------------------------------#

# Conclusion

"""
Correct! The sigmoid output cannot take just any float value: the output returned by your binary classifier is bounded between zero and one.
"""

tensor([[0.3069]], grad_fn=<SigmoidBackward0>)
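
The boundedness claim can be checked with a minimal sketch that pushes extreme inputs through the same architecture and then thresholds the result. The scaled inputs, the fixed seed, and the 0.5 decision threshold are illustrative assumptions, not part of the exercise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed, only for repeatability

model = nn.Sequential(
    nn.Linear(8, 1),
    nn.Sigmoid()
)

# Assumption: scaled copies of the exercise input, to push the linear output far from zero
base = torch.tensor([[3., 4., 6., 2., 3., 6., 8., 9.]])
inputs = torch.cat([base * -100, base, base * 100])

with torch.no_grad():
    probs = model(inputs)

# Assumption: 0.5 is used as the decision threshold between the two classes
predicted_class = (probs >= 0.5).int()

print(probs.squeeze(1))
print(predicted_class.squeeze(1))
```

However large the inputs become, the values in probs stay between zero and one, which is why the output can be read as a class probability.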



In [34]:
# exercise 02

"""
From regression to multi-class classification

Recall that the models we have seen for binary classification, multi-class classification and regression have all been similar, barring a few tweaks to the model.

In this exercise, you'll start by building a model for regression, and then tweak the model to perform a multi-class classification.
"""

# Instructions

"""

    Create a neural network with exactly four linear layers, which takes the input tensor as input, and outputs a regression value, using any shapes you like for the hidden layers.

    A similar neural network to the one you just built is provided, containing four linear layers; update this network to perform a multi-class classification with four outputs.

"""

# solution

import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 7, 10, 12, 2, 3, 6, 8, 9]])

# Implement a neural network with exactly four linear layers
model = nn.Sequential(
    nn.Linear(11,1),
    nn.Linear(1,8),
    nn.Linear(8,4),
    nn.Linear(4,1)
)

output = model(input_tensor)
print(output)

#----------------------------------#

import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 7, 10, 12, 2, 3, 6, 8, 9]])

# Update network below to perform a multi-class classification with four labels
model = nn.Sequential(
  nn.Linear(11, 20),
  nn.Linear(20, 12),
  nn.Linear(12, 6),
  nn.Linear(6, 4), 
  nn.Softmax(dim=-1)
)

output = model(input_tensor)
print(output)

#----------------------------------#

# Conclusion

"""
Nice work! You turned your continuous regression values into probabilities bounded between zero and one by changing the output dimensions of the last linear layer, as well as by applying the softmax function.
"""

tensor([[-1.1175]], grad_fn=<AddmmBackward0>)
tensor([[0.1662, 0.2746, 0.3039, 0.2553]], grad_fn=<SoftmaxBackward0>)
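
A short sketch of how the softmax output can be read, assuming the same four-label architecture as above: the four values sum to one, and the index of the largest value is the predicted label. The variable names classifier, total and predicted_label are illustrative.

```python
import torch
import torch.nn as nn

input_tensor = torch.tensor([[3., 4., 6., 7., 10., 12., 2., 3., 6., 8., 9.]])

# Same layer sizes as the multi-class network in the exercise
classifier = nn.Sequential(
    nn.Linear(11, 20),
    nn.Linear(20, 12),
    nn.Linear(12, 6),
    nn.Linear(6, 4),
    nn.Softmax(dim=-1)
)

with torch.no_grad():
    probs = classifier(input_tensor)

total = probs.sum(dim=-1)               # softmax probabilities sum to one
predicted_label = probs.argmax(dim=-1)  # index of the most probable class

print(total)
print(predicted_label)
```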



In [36]:
# exercise 03

"""
Creating one-hot encoded labels

One-hot encoding is a technique that turns a single integer label into a vector of N elements, where N is the number of classes in your dataset. This vector only contains zeros and ones. In this exercise, you'll create the one-hot encoded vector of the label y provided.

You'll practice doing this manually, and then make your life easier by leveraging the help of PyTorch! Your dataset contains three classes.

NumPy is already imported as np, and torch.nn.functional as F. The torch package is also imported.
"""

# Instructions

"""

    Manually create a one-hot encoded vector of the ground truth label y by filling in the NumPy array provided.
    Create a one-hot encoded vector of the ground truth label y using PyTorch.

"""
import torch.nn.functional as F
import numpy as np
# solution

y = 1
num_classes = 3

# Create the one-hot encoded vector using NumPy
one_hot_numpy = np.array([0, 1, 0])

# Create the one-hot encoded vector using PyTorch
one_hot_pytorch = F.one_hot(torch.tensor(y), num_classes=num_classes)
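
The NumPy vector above is typed in by hand for y = 1. A minimal sketch, assuming the same y and num_classes, that derives the vector from the label instead and compares it with the PyTorch result:

```python
import numpy as np
import torch
import torch.nn.functional as F

y = 1
num_classes = 3

# Start from all zeros and set the position of the true class to one
one_hot_numpy = np.zeros(num_classes, dtype=np.int64)
one_hot_numpy[y] = 1

# F.one_hot expects an integer (long) tensor of class indices
one_hot_pytorch = F.one_hot(torch.tensor(y), num_classes=num_classes)

print(one_hot_numpy)
print(one_hot_pytorch)
```

Both approaches should agree for any valid label, so changing y or num_classes does not require editing the vector manually.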

#----------------------------------#

# Conclusion

"""
If you implement a custom dataset, you can make it output the one-hot encoded label directly. Indeed, you can add the one-hot encoding step to the __getitem__ method such that the returned label is already one-hot encoded!
"""


In [37]:
# exercise 04

"""
Calculating cross entropy loss

Cross entropy loss is one of the most commonly used losses for classification problems. In this exercise, you will create inputs and calculate cross entropy loss in PyTorch. You are provided with the ground truth label y and a vector of scores predicted by your model.

You'll start by creating a one-hot encoded vector of the ground truth label y, so that y can be compared with the scores predicted by your model. (nn.CrossEntropyLoss also accepts integer class indices as targets; this exercise uses the one-hot form.) Next, you'll create a cross entropy loss function. Last, you'll call the loss function, which takes the scores (model predictions before the final softmax function) and the one-hot encoded ground truth label as inputs. It outputs a single float, the loss of that sample.

torch, torch.nn as nn, and torch.nn.functional as F have already been imported for you.
"""

# Instructions

"""

    Create the one-hot encoded vector of the ground truth label y and assign it to one_hot_label.
---

    Create the cross entropy loss function and store it as criterion.
---

    Calculate the cross entropy loss using the one_hot_label vector and the scores vector, by calling the criterion you created.

"""

# solution

import torch
import torch.nn as nn
import torch.nn.functional as F

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

# Create a one-hot encoded vector of the label y
one_hot_label = F.one_hot(torch.tensor(y), scores.shape[1])

# Create the cross entropy loss function
criterion = nn.CrossEntropyLoss()

# Calculate the cross entropy loss
loss = criterion(scores.double(), one_hot_label.double())
print(loss)

#----------------------------------#

# Conclusion

"""
Nicely done calculating cross entropy loss in PyTorch! This is one of the most commonly used loss functions for classification tasks, where the goal is to predict the probability distribution of a set of target categories or classes.
"""

tensor(8.0619, dtype=torch.float64)
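
As a cross-check on the value above, here is a sketch comparing three ways of computing the same loss: with integer class indices, with a one-hot target, and by hand from the log-softmax. It assumes a PyTorch version in which nn.CrossEntropyLoss accepts probability targets, as the solution above already does; the names loss_from_index, loss_from_one_hot and loss_manual are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])
y = torch.tensor([2])

criterion = nn.CrossEntropyLoss()

# Target given as a class index
loss_from_index = criterion(scores, y)

# Target given as a one-hot (probability) vector, cast to float
one_hot_label = F.one_hot(y, num_classes=scores.shape[1]).float()
loss_from_one_hot = criterion(scores, one_hot_label)

# Manual version: negative log-softmax score of the true class
loss_manual = -F.log_softmax(scores, dim=1)[0, 2]

print(loss_from_index, loss_from_one_hot, loss_manual)
```

The loss is large here because the model gives class 2 the lowest score (-2.0) even though it is the true class.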



In [4]:
# exercise 05

"""
Estimating a sample

In previous exercises, you used linear layers to build networks.

Recall that the operation performed by nn.Linear() is to take an input X and apply the transformation W * X + b, where W and b are two tensors (called the weight and bias). In PyTorch this is computed as X @ W.T + b.

A critical part of training PyTorch models is to calculate gradients of the weight and bias tensors with respect to a loss function.

In this exercise, you will calculate weight and bias tensor gradients using cross entropy loss and a sample of data.

The following tensors are provided:

weight: a 2 x 9 tensor
bias: a 2-element tensor
preds: a 1 x 2 tensor containing the model predictions
target: a 1 x 2 one-hot encoded tensor containing the ground-truth label
"""

# Instructions

"""

    Use the criterion you have defined to calculate the loss value with respect to the predictions and target values.
    Compute the gradients of the cross entropy loss.
    Display the gradients of the weight and bias tensors, in that order.

"""

# solution

criterion = nn.CrossEntropyLoss()

# Calculate the loss
loss = criterion(preds, target)

# Compute the gradients of the loss
loss.backward()

# Display gradients of the weight and bias tensors in order
print(weight.grad)
print(bias.grad)

#----------------------------------#

# Conclusion

"""
Great job calculating and displaying gradients for larger tensors than you have worked with previously! In later videos, you will learn how to access the weights and biases of the nn.Linear() modules directly!
"""

'\n\n'
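
The solution above relies on weight, bias, preds and target being supplied by the course environment. A self-contained sketch under stated assumptions: the tensors are filled with random values of the shapes listed in the exercise, and a hypothetical input sample x is used to build preds so that the loss depends on weight and bias.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed, only for repeatability

# Assumption: random stand-ins with the shapes described in the exercise
weight = torch.randn(2, 9, requires_grad=True)
bias = torch.randn(2, requires_grad=True)
x = torch.randn(1, 9)                  # hypothetical input sample
preds = x @ weight.T + bias            # 1 x 2 predictions, same form as nn.Linear
target = torch.tensor([[1., 0.]])      # 1 x 2 one-hot ground truth

criterion = nn.CrossEntropyLoss()
loss = criterion(preds, target)

# Backpropagate to populate weight.grad and bias.grad
loss.backward()

print(weight.grad)
print(bias.grad)
```

Each gradient has the same shape as the tensor it belongs to, which is what allows an element-wise update later on.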

In [38]:
# exercise 06

"""
Accessing the model parameters

A PyTorch model created with nn.Sequential() is a module that contains the different layers of your network. Recall that each layer's parameters can be accessed by indexing the created model directly. In this exercise, you will practice accessing the parameters of different linear layers of a neural network. You won't be accessing the sigmoid.
"""

# Instructions

"""

    Access the weight parameter of the first linear layer.
    Access the bias parameter of the second linear layer.

"""

# solution

import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8),
                      nn.Sigmoid(),
                      nn.Linear(8, 2))

# Access the weight of the first linear layer
weight_0 = model[0].weight

# Access the bias of the second linear layer
bias_1 = model[2].bias
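
To see every parameter at once, a small sketch using named_parameters() on the same architecture may help. The shapes dictionary is an illustrative name; note that the sigmoid at index 1 contributes no parameters, so only indices 0 and 2 appear.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8),
                      nn.Sigmoid(),
                      nn.Linear(8, 2))

# Collect the shape of each parameter, keyed by "<layer index>.<parameter name>"
shapes = {name: tuple(param.shape) for name, param in model.named_parameters()}

for name, shape in shapes.items():
    print(name, shape)
```

The weight of nn.Linear(16, 8) has shape (8, 16): output features first, input features second.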


#----------------------------------#

# Conclusion

"""
Parameters and gradients are usually automatically calculated and updated, as we've seen in the video. However, on your deep learning journey, you'll find that accessing the model parameters or gradients is a great way to debug training.
"""

In [39]:
# exercise 07

"""
Updating the weights manually

Now that you know how to access weights and biases, you will manually perform the job of the PyTorch optimizer. PyTorch functions can do what you're about to do, but it's helpful to do the work manually at least once, to understand what's going on under the hood.

A neural network of three layers has been created and stored as the model variable. This network has been used for a forward pass and the loss and its derivatives have been calculated. A default learning rate, lr, has been chosen to scale the gradients when performing the update.
"""

# Instructions

"""

    Create the gradient variables by accessing the local gradients of each weight tensor.
---

    Update the weights using the gradients scaled by the learning rate.

"""

# solution
lr = 0.001

weight0 = model[0].weight
weight1 = model[1].weight
weight2 = model[2].weight

# Access the gradients of the weight of each linear layer
grads0 = model[0].weight.grad
grads1 = model[1].weight.grad
grads2 = model[2].weight.grad

# Update the weights using the learning rate and the gradients
weight0 = weight0 - lr * grads0
weight1 = weight1 - lr * grads1
weight2 = weight2 - lr * grads2

#----------------------------------#

# Conclusion

"""
Good job! This would be very tedious with a network of a hundred layers and we should be thankful for the PyTorch optimizer that will do a similar job in a single line of code!
"""

AttributeError: 'Sigmoid' object has no attribute 'weight'
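
The AttributeError above is consistent with model still referring to the network from exercise 06, whose second module is nn.Sigmoid() and therefore has no weight. A minimal runnable sketch, assuming a hypothetical network of three linear layers and a dummy forward and backward pass to populate the gradients. Unlike the solution above, which rebinds weight0, weight1 and weight2 to new tensors, this sketch updates the parameters in place under torch.no_grad() so that the model itself changes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed, only for repeatability

# Assumption: a three-linear-layer network standing in for the provided model
model = nn.Sequential(nn.Linear(16, 8),
                      nn.Linear(8, 4),
                      nn.Linear(4, 2))

# Assumption: dummy sample and one-hot target to produce gradients
features = torch.randn(1, 16)
target = torch.tensor([[1., 0.]])
loss = nn.CrossEntropyLoss()(model(features), target)
loss.backward()

lr = 0.001
before = model[0].weight.detach().clone()
grad = model[0].weight.grad.clone()

# In-place gradient descent step on each weight tensor
with torch.no_grad():
    for layer in model:
        layer.weight -= lr * layer.weight.grad

print(model[0].weight - before)
```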

In [41]:
# exercise 08

"""
Using the PyTorch optimizer

In the previous exercise, you manually updated the weight of a network. You now know what's going on under the hood, but this approach is not scalable to a network of many layers.

Thankfully, the PyTorch SGD optimizer does a similar job in a handful of lines of code. In this exercise, you will practice the last step to complete the training loop: updating the weights using a PyTorch optimizer.

A neural network has been created and provided as the model variable. This model was used to run a forward pass and create the tensor of predictions pred. The one-hot encoded tensor is named target and the cross entropy loss function is stored as criterion.

torch.optim as optim, and torch.nn as nn have already been loaded for you.
"""

# Instructions

"""

    Use optim to create an SGD optimizer with a learning rate of your choice (must be less than one).
---

    Update the model's parameters using the optimizer.

"""
import torch.optim as optim
# solution

# Create the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

loss = criterion(pred, target)
loss.backward()

# Update the model's parameters using the optimizer
optimizer.step()

#----------------------------------#

# Conclusion

"""
SGD is only one of the many optimizers implemented in PyTorch. Researchers keep on improving the optimization process for training.
"""

NameError: name 'pred' is not defined
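
The NameError above arises because pred, target and criterion come from the course environment and are not defined in this notebook. A self-contained sketch, assuming a hypothetical model and a random sample, that runs the forward pass, the backward pass and one optimizer step:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # assumption: fixed seed, only for repeatability

# Assumption: stand-ins for the model, sample and target provided by the course
model = nn.Sequential(nn.Linear(16, 8),
                      nn.Sigmoid(),
                      nn.Linear(8, 2))
features = torch.randn(1, 16)
target = torch.tensor([[1., 0.]])
criterion = nn.CrossEntropyLoss()

# Create the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

# Forward pass, loss and gradients
pred = model(features)
loss = criterion(pred, target)
optimizer.zero_grad()
loss.backward()

before = model[0].weight.detach().clone()
grad = model[0].weight.grad.clone()

# Update every parameter of the model in one call
optimizer.step()

print(model[0].weight - before)
```

With plain SGD (no momentum or weight decay), the step is the same weight minus learning rate times gradient update as in the manual exercise, applied to all parameters at once.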

In [42]:
# exercise 09

"""
Using the MSELoss

Recall that we can't use cross-entropy loss for regression problems. The mean squared error loss (MSELoss) is a common loss function for regression problems. In this exercise, you will practice calculating and observing the loss using NumPy as well as its PyTorch implementation.

The torch package has been imported as well as numpy as np and torch.nn as nn.
"""

# Instructions

"""

    Calculate the MSELoss using NumPy.
    Create a MSELoss function using PyTorch.
    Convert y_hat and y to tensors and then float data types, and then use them to calculate MSELoss using PyTorch as mse_pytorch.

"""

# solution

y_hat = np.array(10)
y = np.array(1)

# Calculate the MSELoss using NumPy
mse_numpy = np.mean((y_hat - y)**2)

# Create the MSELoss function
criterion = nn.MSELoss()

# Calculate the MSELoss using the created loss function
mse_pytorch = criterion(torch.tensor(y_hat).float(), torch.tensor(y).float())
print(mse_pytorch)


#----------------------------------#

# Conclusion

"""
Great work, the loss outputs 81, the square of 9, as expected! The MSE loss is also called L2 loss. Another common loss function for regression problems is the mean absolute error loss, also called L1 loss.
"""

tensor(81.)
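
For comparison with the L1 loss mentioned in the conclusion, a short sketch computing the mean absolute error on the same two values with NumPy and with nn.L1Loss. The names mae_numpy and mae_pytorch are illustrative.

```python
import numpy as np
import torch
import torch.nn as nn

y_hat = np.array(10)
y = np.array(1)

# Mean absolute error with NumPy
mae_numpy = np.mean(np.abs(y_hat - y))

# Mean absolute error (L1 loss) with PyTorch
criterion = nn.L1Loss()
mae_pytorch = criterion(torch.tensor(y_hat).float(), torch.tensor(y).float())

print(mae_numpy)
print(mae_pytorch)
```

Here the absolute error is 9 while the squared error is 81, which illustrates how MSE penalizes large errors more heavily than L1.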



In [9]:
# exercise 10

"""
Writing a training loop

In scikit-learn, the whole training loop is contained in the .fit() method. In PyTorch, however, you implement the loop manually. While this provides control over the loop's contents, it requires a custom implementation.

You will write a training loop every time you train a deep learning model with PyTorch, which you'll practice in this exercise. The show_results() function provided will display some sample ground truth and the model predictions.

The package imports provided are: pandas as pd, torch, torch.nn as nn, torch.optim as optim, as well as DataLoader and TensorDataset from torch.utils.data.

The following variables have been created: dataloader, containing the dataloader; model, containing the neural network; criterion, containing the loss function, nn.MSELoss(); optimizer, containing the SGD optimizer; and num_epochs, containing the number of epochs.
"""

# Instructions

"""

    Write a for loop that iterates over the dataloader; this should be nested within a for loop that iterates over a range equal to the number of epochs.
    Set the gradients of the optimizer to zero.
---

    Write the forward pass.
    Compute the loss value.
    Compute the gradients.
---

    Update the model's parameters.
"""

# solution

# Loop over the number of epochs and the dataloader
for i in range(num_epochs):
  for data in dataloader:
    # Set the gradients to zero
    optimizer.zero_grad()
    # Run a forward pass
    feature, target = data
    prediction = model(feature)    
    # Calculate the loss
    loss = criterion(prediction, target)    
    # Compute the gradients
    loss.backward()
    # Update the model's parameters
    optimizer.step()

#----------------------------------#

# Conclusion

"""
Congratulations on writing your first training loop in PyTorch!
"""

'\n\n'
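
The loop above depends on dataloader, model, criterion, optimizer and num_epochs from the course environment. A self-contained sketch under stated assumptions: a synthetic regression dataset (the target is the sum of three random features), a single linear layer as the model, and an illustrative learning rate, batch size and epoch count. It also records the average loss per epoch, which the exercise does not ask for.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # assumption: fixed seed, only for repeatability

# Assumption: synthetic data where the target is the sum of the features
features = torch.rand(64, 3)
targets = features.sum(dim=1, keepdim=True)
dataloader = DataLoader(TensorDataset(features, targets), batch_size=8, shuffle=True)

model = nn.Linear(3, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
num_epochs = 20

epoch_losses = []
for epoch in range(num_epochs):
    running_loss = 0.0
    for feature, target in dataloader:
        optimizer.zero_grad()                 # reset gradients from the previous step
        prediction = model(feature)           # forward pass
        loss = criterion(prediction, target)  # compute the loss
        loss.backward()                       # compute the gradients
        optimizer.step()                      # update the parameters
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(dataloader))

print(epoch_losses[0], epoch_losses[-1])
```

Tracking the average loss per epoch is a simple way to confirm that training is making progress before looking at individual predictions.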