Exercise: PyTorch and HuggingFace scavenger hunt!
PyTorch and HuggingFace have emerged as powerful tools for developing and deploying neural networks.

In this scavenger hunt, we will explore the capabilities of PyTorch and HuggingFace, uncovering hidden treasures along the way.

We have two parts:

Familiarize yourself with PyTorch
Get to know HuggingFace

PyTorch tensors
Scan through the PyTorch tensors documentation here. Be sure to look at the examples.

In the following cell, create a tensor named my_tensor of size 3x3 with values of your choice. The tensor should be created on the GPU if available. Print the tensor.

In [None]:
# Fill in the missing parts labelled <MASK> with the appropriate code to complete the exercise.

# Hint: Use torch.cuda.is_available() to check if GPU is available

import torch
import numpy as np

# Set the device to be used for the tensor
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create a tensor on the appropriate device
my_tensor = torch.tensor([[1., 2., 3.],
                          [4., 5., 6.],
                          [7., 8., 9.]], device=device)

# Print the tensor
print(my_tensor)


In [None]:
# Check the previous cell

assert my_tensor.device.type in {"cuda", "cpu"}
assert my_tensor.shape == (3, 3)

print("Success!")
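
As a small illustration of the device handling above, the following sketch shows two common ways to place a tensor on a device and how to inspect its basic attributes. The variable names a, b, and c are illustrative only and are not part of the exercise.

```python
import torch

# Pick the GPU when one is present, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Option 1: allocate the tensor directly on the target device
a = torch.ones(3, 3, device=device)

# Option 2: create the tensor on the CPU first, then copy it with .to()
b = torch.arange(9, dtype=torch.float32).reshape(3, 3).to(device)

# Both operands must live on the same device for arithmetic to work
c = a + b

# Shape, dtype, and device are the three attributes worth checking first
print(c.shape, c.dtype, c.device)
```

Passing device= at creation time avoids an extra copy, which is why it is usually preferred when the target device is already known.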

Neural Net Constructor Kit torch.nn
You can think of the torch.nn (documentation) module as a constructor kit for neural networks. It provides the building blocks for creating neural networks, including layers, activation functions, loss functions, and more.

Instructions:

Create a three layer Multi-Layer Perceptron (MLP) neural network with the following specifications:

Input layer: 784 neurons
Hidden layer: 128 neurons
Output layer: 10 neurons
Use the ReLU activation function for the hidden layer and the softmax activation function for the output layer. Print the neural network.

Hint: MLPs use "fully-connected" or "dense" layers. In PyTorch's nn module, this type of layer has a different name. See the examples in this tutorial to find out more.

In [None]:
# Replace <MASK> with the appropriate code to complete the exercise.

import torch.nn as nn


class MyMLP(nn.Module):
    """My Multilayer Perceptron (MLP)

    Specifications:

        - Input layer: 784 neurons
        - Hidden layer: 128 neurons with ReLU activation
        - Output layer: 10 neurons with softmax activation

    """

    def __init__(self):
        super(MyMLP, self).__init__()
        self.fc1 =  nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        # Pass the input through the first fully connected layer (input -> hidden)
        x = self.fc1(x)

        # Apply ReLU activation
        x =  self.relu(x)

        # Pass the result to the final layer
        x = self.fc2(x)

        # Apply softmax activation
        x = self.softmax(x)
        
        return x


my_mlp = MyMLP()
print(my_mlp)

In [None]:
# Check your work here:


# Check the number of inputs
assert my_mlp.fc1.in_features == 784

# Check the number of outputs
assert my_mlp.fc2.out_features == 10

# Check the number of nodes in the hidden layer
assert my_mlp.fc1.out_features == 128

# Check that my_mlp.fc1 is a fully connected layer
assert isinstance(my_mlp.fc1, nn.Linear)

# Check that my_mlp.fc2 is a fully connected layer
assert isinstance(my_mlp.fc2, nn.Linear)
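
To illustrate the "constructor kit" idea, here is a minimal sketch of the same architecture assembled with nn.Sequential instead of a custom nn.Module subclass. The name sequential_mlp and the batch size of 5 are assumptions made for this illustration; the checks above still expect the class-based version with fc1 and fc2 attributes.

```python
import torch
import torch.nn as nn

# Hypothetical alternative: stack the same building blocks in order
sequential_mlp = nn.Sequential(
    nn.Linear(784, 128),  # input -> hidden
    nn.ReLU(),
    nn.Linear(128, 10),   # hidden -> output
    nn.Softmax(dim=1),    # normalize each row into probabilities
)

# A fake batch of 5 flattened 28x28 inputs
batch = torch.randn(5, 784)
probs = sequential_mlp(batch)

print(probs.shape)       # one row of 10 class probabilities per input
print(probs.sum(dim=1))  # each row sums to (approximately) 1

# Trainable parameters: 784*128 + 128 + 128*10 + 10 = 101,770
num_params = sum(p.numel() for p in sequential_mlp.parameters())
print(num_params)
```

nn.Sequential is convenient for plain layer stacks, while subclassing nn.Module gives named layers and full control over forward.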

PyTorch Loss Functions and Optimizers
PyTorch comes with a number of built-in loss functions and optimizers that can be used to train neural networks. The loss functions are implemented in the torch.nn (documentation) module, while the optimizers are implemented in the torch.optim (documentation) module.

Instructions:

Create a loss function using the torch.nn.CrossEntropyLoss (documentation) class.
Create an optimizer using the torch.optim.SGD (documentation) class with a learning rate of 0.01.

In [None]:
# Replace <MASK> with the appropriate code to complete the exercise.

# Loss function
loss_fn = torch.nn.CrossEntropyLoss()

# Optimizer (by convention we name the variable optimizer); momentum=0.9 is optional
optimizer = torch.optim.SGD(my_mlp.parameters(), lr=0.01, momentum=0.9)

In [None]:
# Check

assert isinstance(
    loss_fn, nn.CrossEntropyLoss
), "loss_fn should be an instance of CrossEntropyLoss"
assert isinstance(optimizer, torch.optim.SGD), "optimizer should be an instance of SGD"
assert optimizer.defaults["lr"] == 0.01, "learning rate should be 0.01"
assert optimizer.param_groups[0]["params"] == list(
    my_mlp.parameters()
), "optimizer should be passed the MLP parameters"
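
The sketch below illustrates what these two objects do, under a few assumptions: the logits and targets are random stand-ins, and w and toy_optimizer are hypothetical names for a toy parameter and its optimizer. It checks that nn.CrossEntropyLoss matches log-softmax followed by negative log-likelihood, which is why this loss is usually given raw logits. A model that already ends in softmax, such as MyMLP above, still trains, but the normalization is then applied twice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(4, 10)           # raw scores: 4 samples, 10 classes
targets = torch.tensor([1, 0, 9, 3])  # class indices, not one-hot vectors

# CrossEntropyLoss combines log-softmax and negative log-likelihood
ce = nn.CrossEntropyLoss()(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(ce.item(), manual.item())

# One plain SGD step on a toy parameter: w <- w - lr * grad
w = torch.tensor([1.0, -2.0], requires_grad=True)
toy_optimizer = torch.optim.SGD([w], lr=0.01)
(w ** 2).sum().backward()  # the gradient of sum(w^2) is 2 * w
toy_optimizer.step()
print(w)
```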

PyTorch Training Loops
PyTorch makes writing a training loop easy!

Instructions:

Fill in the blanks!

In [None]:
# Replace <MASK> with the appropriate code to complete the exercise.

def fake_training_loaders():
    for _ in range(30):
        yield torch.randn(64, 784), torch.randint(0, 10, (64,))


for epoch in range(3):
    # Create a training loop
    for i, data in enumerate(fake_training_loaders()):
        # Every data instance is an input + label pair
        x, y = data

        # Zero your gradients for every batch!
        optimizer.zero_grad()

        # Forward pass (predictions)
        y_pred = my_mlp(x)

        # Compute the loss and its gradients
        loss = loss_fn(y_pred, y)
        loss.backward()

        # Adjust learning weights
        optimizer.step()

        if i % 10 == 0:
            print(f"Epoch {epoch}, batch {i}: {loss.item():.5f}")

In [None]:
# Check

assert abs(loss.item() - 2.3) < 0.1, "the loss should be around 2.3 with random data"
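
The value 2.3 in the check above comes from ln(10) ≈ 2.3026, the cross-entropy of a uniform guess over 10 classes. With random inputs and random labels there is nothing to learn, so the loss stays near that baseline. A minimal sketch, where uniform_logits and baseline are names chosen for this illustration:

```python
import math

import torch
import torch.nn as nn

num_classes = 10

# Identical logits for every class correspond to a uniform prediction
uniform_logits = torch.zeros(64, num_classes)
labels = torch.randint(0, num_classes, (64,))

# With a uniform prediction the loss is ln(num_classes) for any labels
baseline = nn.CrossEntropyLoss()(uniform_logits, labels)

print(f"{baseline.item():.4f} vs ln(10) = {math.log(num_classes):.4f}")
```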

Get to know HuggingFace
HuggingFace is a popular destination for pre-trained models and datasets that can be applied to a variety of tasks quickly and easily. In this section, we will explore the capabilities of HuggingFace and learn how to use it to build and train neural networks.

Download a model from HuggingFace and use it for sentiment analysis
HuggingFace provides a number of pre-trained models that can be used for a variety of tasks. In this exercise, we will use the distilbert-base-uncased-finetuned-sst-2-english model to perform sentiment analysis on a movie review.

Instructions:

Review the AutoModel tutorial on the HuggingFace website.
Instantiate an AutoModelForSequenceClassification model using the distilbert-base-uncased-finetuned-sst-2-english model.
Instantiate an AutoTokenizer using the distilbert-base-uncased-finetuned-sst-2-english model.
Define a function that returns the predicted sentiment for a review.

In [None]:
# Replace <MASK> with the appropriate code to complete the exercise.

# Get the model and tokenizer

from transformers import AutoModelForSequenceClassification, AutoTokenizer

pt_model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(pt_model_name)
pt_model  = AutoModelForSequenceClassification.from_pretrained(pt_model_name)


def get_prediction(review):
    """Given a review, return the predicted sentiment"""

    # Tokenize the review
    # (Get the response as tensors and not as a list)
    inputs = tokenizer(review, return_tensors="pt")

    # Perform the prediction (get the logits); gradients are not needed for inference
    with torch.no_grad():
        outputs = pt_model(**inputs)

    # Get the predicted class (corresponding to the highest logit)
    predictions = torch.argmax(outputs.logits, dim=-1)

    return "positive" if predictions.item() == 1 else "negative"
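
The post-processing step in get_prediction can be tried without downloading a model. The sketch below uses hand-written logits (fake_logits is a hypothetical stand-in for outputs.logits) and assumes the same label order as the function above, with index 1 meaning positive. It also shows how softmax turns the logits into a confidence score.

```python
import torch

# Hypothetical logits for one review: [negative score, positive score]
fake_logits = torch.tensor([[-1.5, 2.0]])

# Softmax converts logits into probabilities that sum to 1
probs = torch.softmax(fake_logits, dim=-1)

# argmax picks the most likely class, as in get_prediction
predicted_class = torch.argmax(probs, dim=-1).item()

# Assumed label order: 0 = negative, 1 = positive
labels = {0: "negative", 1: "positive"}
sentiment = labels[predicted_class]
confidence = probs[0, predicted_class].item()

print(f"{sentiment} ({confidence:.3f})")
```

Because softmax preserves the ordering of the logits, taking argmax of the raw logits and of the probabilities gives the same class.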