# **Building a binary classifier in PyTorch**

Recall that a small neural network with a single linear layer followed by a sigmoid function is a binary classifier. It acts just like a logistic regression.
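To make the link to logistic regression concrete, here is a minimal sketch on a hypothetical 6-feature sample (deliberately not the exercise's `1x8` tensor). The names `features`, `linear`, `classifier`, and `manual_probability` are illustrative assumptions, and the weights are random because nothing has been trained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random initial weights are reproducible

# Hypothetical sample with 6 features (illustrative values only)
features = torch.tensor([[0.5, -1.2, 3.0, 0.0, 2.2, -0.7]])

# One linear layer followed by a sigmoid
linear = nn.Linear(6, 1)
classifier = nn.Sequential(linear, nn.Sigmoid())
probability = classifier(features)

# The logistic regression formula written out by hand: sigmoid(x @ W^T + b)
with torch.no_grad():
    z = features @ linear.weight.T + linear.bias
    manual_probability = 1 / (1 + torch.exp(-z))

print(probability)
print(manual_probability)
```

Both printed values should agree, and the result always lies strictly between 0 and 1.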

In this exercise, you'll practice building this small network and interpreting the output of the classifier.

The `torch` package and the `torch.nn` package have already been imported for you.

**Instructions:**
* Create a neural network that takes a tensor of dimensions `1x8` as input, and returns an output of the correct shape for binary classification.
* Pass the output of the linear layer to a sigmoid, which both takes in and returns a single float.
* Which of the following is false about the output returned by your binary classifier?
  * A. We can use a threshold of `0.5` to determine if the output belongs to one class or the other.
  * B. It can return any float value.
  * C. It is produced from an untrained model so it is not yet meaningful.
  * D. The sigmoid function transforms the values of the input without changing its shape.


In [1]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 2, 3, 6, 8, 9]])

# Implement a small neural network for binary classification
model = nn.Sequential(
  nn.____(____),
  nn.____()
)

output = model(input_tensor)
print(output)

#Print either A, B, C or D for the last question.
print(____)




# **From regression to multi-class classification**

Recall that the models we have seen for binary classification, multi-class classification and regression have all been similar, barring a few tweaks to the model.

In this exercise, you'll start by building a model for regression, and then tweak the model to perform a multi-class classification.
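As a rough illustration of how small the tweak is, the sketch below builds two hypothetical two-layer networks on a made-up 5-feature input. The layer sizes, the three-class setup, and the names `regressor` and `classifier` are assumptions chosen for the example, not the exercise's solution.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch of 2 samples with 5 features each
batch = torch.rand(2, 5)

# Regression: the last linear layer has one output and no activation follows it
regressor = nn.Sequential(
    nn.Linear(5, 8),
    nn.Linear(8, 1),
)

# Multi-class classification: one output per class, followed by a softmax
classifier = nn.Sequential(
    nn.Linear(5, 8),
    nn.Linear(8, 3),
    nn.Softmax(dim=-1),
)

regression_output = regressor(batch)      # shape (2, 1), unbounded values
class_probabilities = classifier(batch)   # shape (2, 3), each row sums to 1

print(regression_output)
print(class_probabilities)
```

Only the size of the final layer and the trailing activation differ between the two models.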

**Instructions:**
1. Create a neural network with exactly four linear layers, which takes the input tensor as input, and outputs a regression value, using any shapes you like for the hidden layers.

2. A similar neural network to the one you just built is provided, containing four linear layers; update this network to perform a multi-class classification with four outputs.


In [None]:
import torch
import torch.nn as nn

input_tensor = torch.Tensor([[3, 4, 6, 7, 10, 12, 2, 3, 6, 8, 9]])

# Implement a neural network with exactly four linear layers
model = ____

output = model(input_tensor)
print(output)

# Update network below to perform a multi-class classification with four labels
model = nn.Sequential(
  nn.Linear(11, 20),
  nn.Linear(20, 12),
  nn.Linear(12, 6),
  nn.Linear(6, 1),
  ____
)

output = model(input_tensor)
print(output)

# **Creating one-hot encoded labels**

One-hot encoding is a technique that turns a single integer label into a vector of `N` elements, where `N` is the number of classes in your dataset. This vector only contains zeros and ones. In this exercise, you'll create the one-hot encoded vector of the label `y` provided.

You'll practice doing this manually, and then make your life easier by leveraging the help of PyTorch! Your dataset contains three classes.
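For a feel of what the two approaches produce, here is a small sketch using a hypothetical label `3` in a made-up five-class problem (different from the exercise's values). The variable names are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

# Hypothetical label and class count, chosen only for this illustration
label = 3
n_classes = 5

# Manual version: start from zeros and set the position of the label to one
manual_one_hot = np.zeros(n_classes, dtype=np.int64)
manual_one_hot[label] = 1

# PyTorch version: F.one_hot expects an integer (long) tensor
torch_one_hot = F.one_hot(torch.tensor(label), num_classes=n_classes)

print(manual_one_hot)
print(torch_one_hot)
```

Both vectors have a single one at index 3 and zeros elsewhere.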

NumPy is already imported as `np`, and `torch.nn.functional` is imported as `F`. The `torch` package is also imported.

**Instructions:**
* Manually create a one-hot encoded vector of the ground truth label `y` by filling in the NumPy array provided.
* Create a one-hot encoded vector of the ground truth label `y` using PyTorch.


In [None]:
#Importing necessary libraries and modules
import torch
import torch.nn.functional as F
import numpy as np

y = 1
num_classes = 3

# Create the one-hot encoded vector using NumPy
one_hot_numpy = np.array([____, ____, ____])

# Create the one-hot encoded vector using PyTorch
one_hot_pytorch = ____

# **Calculating cross entropy loss**

Cross entropy loss is the most commonly used loss for classification problems. In this exercise, you will create inputs and calculate cross entropy loss in PyTorch. You are provided with the ground truth label `y` and a vector of `scores` predicted by your model.

You'll start by creating a one-hot encoded vector of the ground truth label `y`, which is a required step to compare `y` with the scores predicted by your model. Next, you'll create a cross entropy loss function. Last, you'll call the loss function, which takes scores (model predictions before the final softmax function), and the one-hot encoded ground truth label, as inputs. It outputs a single float, the `loss` of that sample.
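The steps above can be sketched on hypothetical three-class scores (not the exercise's `scores`). The sketch assumes a PyTorch version in which `CrossEntropyLoss` accepts class-probability targets (1.10 or later, as far as I know), and it converts the one-hot vector to float because `F.one_hot` returns integers. All variable names are illustrative.

```python
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

# Hypothetical raw scores (logits) for one sample and three classes
logits = torch.tensor([[2.0, 0.5, -1.0]])
label = torch.tensor([0])

# One-hot targets must be floating point to be read as class probabilities
one_hot = F.one_hot(label, num_classes=3).float()

criterion = CrossEntropyLoss()
loss_from_one_hot = criterion(logits, one_hot)

# The same loss from the class index, and written out by hand
loss_from_index = criterion(logits, label)
manual_loss = -torch.log_softmax(logits, dim=1)[0, label.item()]

print(loss_from_one_hot, loss_from_index, manual_loss)
```

All three values should match: the loss is the negative log of the softmax probability assigned to the true class.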

`torch`, `CrossEntropyLoss`, and `torch.nn.functional as F` have already been imported for you.

**Instructions:**
* Create the one-hot encoded vector of the ground truth label `y` and assign it to `one_hot_label`.
* Create the cross entropy loss function and store it as `criterion`.
* Calculate the cross entropy loss using the `one_hot_label` vector and the `scores` vector, by calling the `criterion` you created. Convert the one-hot vector to a floating-point type first, since `F.one_hot` returns integers.


In [None]:
import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = [2]
scores = torch.tensor([[0.1, 6.0, -2.0, 3.2]])

# Create a one-hot encoded vector of the label y
one_hot_label = ____(____, num_classes=____)

# Create the cross entropy loss function
criterion = ____

# Calculate the cross entropy loss
loss = criterion(____)
print(loss)



# **Estimating a sample**

In previous exercises, you used linear layers to build networks.

Recall that the operation performed by `nn.Linear()` is to take an input and apply the transformation `W * X + b`, where `W` and `b` are two tensors (called the weight and bias).

A critical part of training PyTorch models is to calculate gradients of the weight and bias tensors with respect to a loss function.

In this exercise, you will calculate weight and bias tensor gradients using cross entropy loss and a sample of data.
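As a sanity check on what `.backward()` computes, the sketch below uses small hypothetical tensors (a `2 x 3` weight rather than the exercise's `2 x 9`) and compares autograd's result with the closed-form gradient of cross entropy with respect to the logits, `softmax(logits) - target`. All names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical parameters and data, smaller than the exercise's tensors
w = torch.randn(2, 3, requires_grad=True)
b = torch.zeros(2, requires_grad=True)
x = torch.rand(1, 3)
one_hot_target = torch.tensor([[0., 1.]])

# Forward pass of a linear layer, then the loss
logits = x @ w.t() + b
loss = nn.CrossEntropyLoss()(logits, one_hot_target)

# Backward pass: fills w.grad and b.grad
loss.backward()

# Closed-form gradients for comparison
logit_grad = torch.softmax(logits.detach(), dim=1) - one_hot_target
expected_w_grad = logit_grad.t() @ x   # shape (2, 3), same as w
expected_b_grad = logit_grad[0]        # shape (2,), same as b

print(w.grad)
print(b.grad)
```

Each gradient has the same shape as the tensor it belongs to, which is what makes the later update step a simple element-wise subtraction.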

The following tensors are provided:

* `weight`: a `2 x 9`-element tensor
* `bias`: a `2`-element tensor
* `preds`: a `1 x 2`-element tensor containing the model predictions
* `target`: a `1 x 2`-element one-hot encoded tensor containing the ground-truth label


**Instructions:**
* Use the criterion you have defined to calculate the loss value with respect to the predictions and target values.
* Compute the gradients of the cross entropy loss.
* Display the gradients of the weight and bias tensors, in that order.


In [None]:
#Run this cell first to do the exercise on the next cell
import torch
import torch.nn as nn

weight = torch.tensor([[-1.1975,  2.0069, -0.8426, -0.8938,  0.1091,  0.1335,  0.6463, -0.8330, 0.0271],
        [ 1.2193,  0.9024, -0.7527, -0.5498,  1.6105,  1.4751,  0.0624, -1.4398,-1.4400]], requires_grad=True)

bias = torch.tensor([ 0.9410, -2.5964], requires_grad=True)
target = torch.tensor([[1., 0.]])
inputs = torch.rand(1, 9)
preds = torch.matmul(inputs, weight.t()) + bias


In [None]:
criterion = nn.CrossEntropyLoss()

# Calculate the loss
loss = criterion(____, ____)

# Compute the gradients of the loss
____

# Display gradients of the weight and bias tensors in order
print(____)
print(____)

# **Accessing the model parameters**

A PyTorch model created with `nn.Sequential()` is a module that contains the different layers of your network. Recall that each layer's parameters can be accessed by indexing the created model directly. In this exercise, you will practice accessing the parameters of different linear layers of a neural network. You won't be accessing the sigmoid.
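Indexing can be sketched on a hypothetical network with different layer sizes from the exercise's model. Note that activation layers occupy an index too, so the second linear layer below sits at position 2. The name `small_model` and its shapes are assumptions for illustration.

```python
import torch.nn as nn

# Hypothetical model: linear, activation, linear
small_model = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
)

# Index 0 is the first linear layer; its weight has shape (out_features, in_features)
first_weight = small_model[0].weight

# Index 1 is the ReLU, so the second linear layer is at index 2
last_bias = small_model[2].bias

print(first_weight.shape)  # torch.Size([3, 4])
print(last_bias.shape)     # torch.Size([1])
```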

**Instructions:**
* Access the weight parameter of the first linear layer.
* Access the bias parameter of the second linear layer.


In [None]:
model = nn.Sequential(nn.Linear(16, 8),
                      nn.Sigmoid(),
                      nn.Linear(8, 2))

# Access the weight of the first linear layer
weight_0 = ____

# Access the bias of the second linear layer
bias_1 = ____

#Display the values
print("The weight:", weight_0)
print("The bias:", bias_1)

# **Updating the weights manually**

Now that you know how to access weights and biases, you will manually perform the job of the PyTorch optimizer. PyTorch functions can do what you're about to do, but it's helpful to do the work manually at least once, to understand what's going on under the hood.

A neural network of three layers has been created and stored as the `model` variable. This network has been used for a forward pass and the loss and its derivatives have been calculated. A default learning rate, `lr`, has been chosen to scale the gradients when performing the update.
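The update rule itself can be sketched on a single hypothetical tensor before applying it to a network: new weight = weight minus learning rate times gradient. The toy loss (sum of squares) and the names below are assumptions picked so the arithmetic can be checked by hand.

```python
import torch

# Hypothetical weight tensor and a toy loss whose gradient is easy to verify
w = torch.tensor([1.0, -2.0], requires_grad=True)
toy_loss = (w ** 2).sum()

# Gradient of sum(w^2) is 2 * w, so w.grad becomes [2.0, -4.0]
toy_loss.backward()

step_size = 0.1

# One gradient descent step; no_grad keeps the update out of the autograd graph
with torch.no_grad():
    updated_w = w - step_size * w.grad

print(w.grad)      # tensor([ 2., -4.])
print(updated_w)   # tensor([ 0.8000, -1.6000])
```

Each weight moves a small step in the direction that lowers the loss, here toward zero.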

**Instructions:**
* Create the gradient variables by accessing the local gradients of each weight tensor.
* Update the weights using the gradients scaled by the learning rate.


In [None]:
#Run this cell before doing the exercise in the next cell

#Import libraries and modules
import torch
import torch.nn as nn

#Creating the model
model = nn.Sequential(nn.Linear(16, 10),
                      nn.Linear(10,8),
                      nn.Linear(8, 2))
#Doing the forward pass
inputs = torch.rand(1, 16)
preds = model(inputs)
target = torch.tensor([[1., 0.]])
criterion = nn.CrossEntropyLoss()
loss_value = criterion(preds, target)
loss_value.backward()
lr = 0.01

In [None]:
weight0 = model[0].weight
weight1 = model[1].weight
weight2 = model[2].weight

# Access the gradients of the weight of each linear layer
grads0 = ____
grads1 = ____
grads2 = ____

# Update the weights using the learning rate and the gradients
weight0 = ____
weight1 = ____
weight2 = ____


# **Using the PyTorch optimizer**

In the previous exercise, you manually updated the weights of a network. You now know what's going on under the hood, but this approach is not scalable to a network of many layers.

Thankfully, the PyTorch SGD optimizer does a similar job in a handful of lines of code. In this exercise, you will practice the last step to complete the training loop: updating the weights using a PyTorch optimizer.

A neural network has been created and provided as the `model` variable. This model was used to run a forward pass and create the tensor of predictions `preds`. The one-hot encoded tensor is named `target` and the cross entropy loss function is stored as `criterion`.
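To connect the optimizer to the manual update, the sketch below runs one plain SGD step on a hypothetical single linear layer and checks that the result equals `weight - lr * grad`. The layer, data, and MSE loss are assumptions for this illustration and are not the exercise's `model` or `criterion`.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical layer and data
layer = nn.Linear(3, 1)
inputs = torch.rand(4, 3)
targets = torch.rand(4, 1)

learning_rate = 0.1
optimizer = optim.SGD(layer.parameters(), lr=learning_rate)

# Forward and backward pass
optimizer.zero_grad()
loss = nn.MSELoss()(layer(inputs), targets)
loss.backward()

# Keep copies so the step can be verified afterwards
weight_before = layer.weight.detach().clone()
weight_grad = layer.weight.grad.clone()

# A single call updates every parameter passed to the optimizer
optimizer.step()

weight_after = layer.weight.detach()
print(weight_after)
print(weight_before - learning_rate * weight_grad)
```

With default settings (no momentum, no weight decay), the two printed tensors should be identical.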

`torch.optim as optim`, and `torch.nn as nn` have already been loaded for you.

**Instructions:**
* Use `optim` to create an SGD optimizer with a learning rate of your choice (must be less than one) for the `model` provided.
* Update the model's parameters using the optimizer.


In [None]:
#Importing libraries
import torch
import torch.nn as nn
import torch.optim as optim

# Create the optimizer
optimizer = ____

#Doing the forward pass
inputs = torch.rand(1, 16)
preds = model(inputs)
target = torch.tensor([[1., 0.]])
criterion = nn.CrossEntropyLoss()
loss_value = criterion(preds, target)
loss_value.backward()

# Update the model's parameters using the optimizer
____


# **Using the MSELoss**

Recall that we can't use cross-entropy loss for regression problems. The mean squared error loss (MSELoss) is a common loss function for regression problems. In this exercise, you will practice calculating and observing the loss using NumPy as well as its PyTorch implementation.

The `torch` package has been imported as well as `numpy` as `np` and `torch.nn` as `nn`.
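As a worked illustration on hypothetical values (three predictions rather than the exercise's single pair), the mean squared error is the average of the squared differences. The arrays and names below are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical predictions and ground truth
predicted = np.array([2.0, 4.0, 6.0])
actual = np.array([1.0, 4.0, 8.0])

# Squared differences are 1, 0 and 4, so the mean is 5 / 3
mse_by_hand = np.mean((predicted - actual) ** 2)

# PyTorch version: convert to float tensors before calling the loss
mse_loss = nn.MSELoss()
mse_torch = mse_loss(torch.tensor(predicted).float(), torch.tensor(actual).float())

print(mse_by_hand)
print(mse_torch)
```

Both results should be approximately 1.6667.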

**Instructions:**
* Calculate the MSELoss using NumPy.
* Create an MSELoss function using PyTorch.
* Convert `y_hat` and `y` to float tensors, then use them to calculate the MSELoss with PyTorch as `mse_pytorch`.


In [None]:
y_hat = np.array(10)
y = np.array(1)

# Calculate the MSELoss using NumPy
mse_numpy = ____

# Create the MSELoss function
criterion = ____

# Calculate the MSELoss using the created loss function
mse_pytorch = criterion(____, ____)
print(mse_pytorch)

# **Writing a training loop**

In scikit-learn, the whole training loop is contained in the `.fit()` method. In PyTorch, however, you implement the loop manually. While this provides control over the loop's content, it requires a custom implementation.

You will write a training loop every time you train a deep learning model with PyTorch, which you'll practice in this exercise. The `show_results()` function provided will display some sample ground truth and the model predictions.
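The overall shape of such a loop can be sketched on synthetic data. Everything below (the generated dataset, the single-layer `net`, the learning rate, and the helper `dataset_loss`) is a hypothetical setup for illustration, not the exercise's `dataloader` or `model`.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data: y = 2 * x1 - x2 + 0.5
X = torch.rand(40, 2)
y = X @ torch.tensor([[2.0], [-1.0]]) + 0.5
loader = DataLoader(TensorDataset(X, y), batch_size=8, shuffle=True)

net = nn.Linear(2, 1)
loss_fn = nn.MSELoss()
opt = optim.SGD(net.parameters(), lr=0.1)


def dataset_loss():
    """Loss over the whole dataset, computed without tracking gradients."""
    with torch.no_grad():
        return loss_fn(net(X), y).item()


initial_loss = dataset_loss()

for epoch in range(20):                  # outer loop: epochs
    for features, targets in loader:     # inner loop: mini-batches
        opt.zero_grad()                  # clear gradients from the previous step
        predictions = net(features)      # forward pass
        batch_loss = loss_fn(predictions, targets)
        batch_loss.backward()            # compute gradients
        opt.step()                       # update parameters

final_loss = dataset_loss()
print(initial_loss, final_loss)
```

The order inside the inner loop matters: gradients accumulate across calls to `.backward()`, so they are reset before each new batch.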

The package imports provided are: `pandas as pd`, `torch`, `torch.nn as nn`, `torch.optim as optim`, as well as `DataLoader` and `TensorDataset` from `torch.utils.data`.

The following variables have been created: `dataloader`, containing the dataloader; `model`, containing the neural network; `criterion`, containing the loss function, `nn.MSELoss()`; `optimizer`, containing the SGD optimizer; and `num_epochs`, containing the number of epochs.

**Instructions:**
* Write a for loop that iterates over the `dataloader`; this should be nested within a for loop that iterates over a range equal to the number of epochs.
* Set the gradients of the optimizer to zero.


In [None]:
#Run this cell before doing the exercise in the next cell

#Import all necessary libraries and modules
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

#Create the dataloader, criterion, optimizer, num_epochs
X_df = pd.read_csv('https://github.com/DataAnalyst21/DatasetsForDataAnalytics/blob/main/model_input_data.csv?raw=True')
y_df = pd.read_csv('https://github.com/DataAnalyst21/DatasetsForDataAnalytics/blob/main/model_true_output.csv?raw=True')
X = torch.tensor(X_df.values, dtype=torch.float)
y = torch.tensor(y_df.values, dtype=torch.float)
dataset = TensorDataset(X, y)

dataloader = DataLoader(dataset, batch_size=10, shuffle=True)
model = nn.Sequential(nn.Linear(4, 10),
                      nn.Linear(10,8),
                      nn.Linear(8, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)
num_epochs = 5

#Write a function which takes the dataloader and iterates in it and displays the predictions and ground truth values
def show_results(model, dataloader):
  model.eval()
  for data in dataloader:
    # Run a forward pass
    feature, target = data
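    with torch.no_grad():
      prediction = model(feature)
    # Print ground truth and predicted salary for every sample in the batch
    # (.item() only works on single-element tensors, and batch_size is 10)
    for truth, pred in zip(target.flatten(), prediction.flatten()):
      print("Ground truth: {:.2f}, Predicted: {:.2f}".format(truth.item(), pred.item()))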


In [None]:
# Loop over the number of epochs and then the dataloader
for ____:
  for ____:
    # Set the gradients to zero
    ____

    # Run a forward pass
    feature, target = data
    prediction = ____
    # Calculate the loss
    loss = ____
    # Compute the gradients
    ____


    # Update the model's parameters
    ____
show_results(model, dataloader)
