# Handling multiple sequences (PyTorch)

The explanation of this notebook is in the Hugging Face course, chapter 2, section 5: [Handling multiple sequences](https://huggingface.co/course/chapter2/5?fw=pt)

The original code for this notebook is in Hugging Face's notebooks repository, linked here through SageMaker Studio Lab: [section5_pt.ipynb](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/en/chapter2/section5_pt.ipynb)

## Run conditions

This notebook has been tested in the following environment:
- Environment: Project created in [Paperspace Gradient](https://gradient.paperspace.com) with Python 3.9.13.
- Machine: P5000 (30 GiB RAM, 8 CPUs, 16 GiB GPU memory) (more details on [Paperspace Machines](https://docs.paperspace.com/gradient/machines/)).
- IDE: Visual Studio Code using remote Jupyter server.

## Install dependencies

Install the Transformers, Datasets, and Evaluate libraries to run this notebook.

In [1]:
# Install the libraries datasets v2.7.1, evaluate v0.3.0, and transformers v4.25.1 with quiet and upgrade flags.
%pip install -q datasets==2.7.1 evaluate==0.3.0 transformers==4.25.1 --upgrade

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import PyTorch, and AutoTokenizer and AutoModelForSequenceClassification from Transformers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Create a checkpoint from the model name.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
# Create a tokenizer from the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Create a model from the checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
# Create a sentence.
sentence = "I've been waiting for a HuggingFace course my whole life."
# Create tokens from the sentence.
tokens = tokenizer(sentence, return_tensors="pt")
# Print input IDs.
print(tokens.input_ids)


tensor([[  101,  1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,
          2607,  2026,  2878,  2166,  1012,   102]])
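The first and last IDs in the output above, 101 and 102, are not part of the sentence. Assuming the BERT uncased vocabulary used by this checkpoint, they are the special tokens `[CLS]` and `[SEP]`, which the tokenizer adds when it is called directly. The sketch below uses a hypothetical helper, `strip_special_tokens`, to show what remains once they are removed; it is an illustration, not part of the Transformers API.

```python
# IDs copied from the output of the cell above.
ids_with_special = [101, 1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172,
                    2607, 2026, 2878, 2166, 1012, 102]

# Assumption: [CLS] = 101 and [SEP] = 102 in the BERT uncased vocabulary.
CLS_ID, SEP_ID = 101, 102


def strip_special_tokens(ids, cls_id=CLS_ID, sep_id=SEP_ID):
    """Drop a leading [CLS] and a trailing [SEP] if they are present (hypothetical helper)."""
    start = 1 if ids and ids[0] == cls_id else 0
    end = len(ids) - 1 if len(ids) > start and ids[-1] == sep_id else len(ids)
    return ids[start:end]


content_ids = strip_special_tokens(ids_with_special)
print(len(ids_with_special), len(content_ids))  # 16 14
print(content_ids)
```

The 14 remaining IDs are the ones produced by calling `tokenizer.tokenize` followed by `tokenizer.convert_tokens_to_ids`, which do not add special tokens.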


In [3]:
# Import PyTorch, and AutoTokenizer and AutoModelForSequenceClassification from Transformers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Create a checkpoint from the model name.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
# Create a tokenizer from the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Create a model from the checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
# Create a sentence.
sentence = "I've been waiting for a HuggingFace course my whole life."
# Create tokens from the sentence using the tokenize method, which does not add special tokens.
tokens = tokenizer.tokenize(sentence)
# Create input IDs from the tokens.
input_ids = tokenizer.convert_tokens_to_ids(tokens)
# Create a tensor from the input IDs, wrapping them in a list to add a batch dimension: the model expects a batch of sequences, and this batch contains a single sequence.
input_ids = torch.tensor([input_ids])
# Print input IDs with the text: "Input IDs: ".
print("Input IDs:", input_ids)
# Create the logits from the model and the input IDs.
logits = model(input_ids).logits
# Print logits with the text: "Logits: ".
print("Logits:", logits)

Input IDs: tensor([[ 1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,  2607,
          2026,  2878,  2166,  1012]])
Logits: tensor([[-2.7276,  2.8789]], grad_fn=<AddmmBackward0>)
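Logits are unnormalized scores, not probabilities. A minimal sketch of turning the logits printed above into probabilities with a softmax, using only the standard library, might look like this. The label order (`NEGATIVE`, then `POSITIVE`) is an assumption about this checkpoint; in the notebook it can be checked with `model.config.id2label`.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the output above.
example_logits = [-2.7276, 2.8789]
# Assumed label order for this checkpoint.
labels = ["NEGATIVE", "POSITIVE"]

probs = softmax(example_logits)
for label, prob in zip(labels, probs):
    print(f"{label}: {prob:.4f}")
```

With PyTorch tensors, the same conversion is `torch.softmax(logits, dim=-1)`.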


## Padding the inputs

In [4]:
# Create the batched_ids list with 2 lists. The first list contains the value 200 repeated 3 times and the second list contains the value 200 repeated 2 times. The lists have different lengths, so they cannot be converted to a rectangular tensor.
batched_ids = [[200] * 3, [200] * 2]
# Print batched_ids with the text: "Batched IDs: ".
print("Batched IDs:", batched_ids)

Batched IDs: [[200, 200, 200], [200, 200]]


In [5]:
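# Set padding_id to 100, a placeholder value chosen for this demonstration (the tokenizer's actual padding ID is available as tokenizer.pad_token_id).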
padding_id = 100
# Create the batched_ids list. The first list contains the value 200 repeated 3 times and the second list contains the value 200 repeated 2 times and the padding_id repeated 1 time.
batched_ids = [[200] * 3, [200] * 2 + [padding_id]]
# Print batched_ids with the text: "Batched IDs: ".
print("Batched IDs:", batched_ids)

Batched IDs: [[200, 200, 200], [200, 200, 100]]


In [6]:
# Create model from the checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
# Create sequence1_ids list with a list of 3 values: 200, 200, 200.
sequence1_ids = [200, 200, 200]
# Create sequence2_ids list with a list of 2 values: 200, 200.
sequence2_ids = [200, 200]
# Create batched_ids list with sequence1_ids, and sequence2_ids. sequence2_ids is padded with the padding_id.
batched_ids = [sequence1_ids, sequence2_ids + [padding_id]]
# Print logits of sentence 1 with the text: "Logits of sentence 1: ".
print("Logits of sentence 1:", model(torch.tensor([batched_ids[0]])).logits)
# Print logits of sentence 2 with the text: "Logits of sentence 2: ". Note that batched_ids[1] already includes the padding ID.
print("Logits of sentence 2:", model(torch.tensor([batched_ids[1]])).logits)
# Print logits of both sentences with the text: "Logits of both sentences: ".
print("Logits of both sentences:", model(torch.tensor(batched_ids)).logits)

Logits of sentence 1: tensor([[ 1.5694, -1.3895]], grad_fn=<AddmmBackward0>)
Logits of sentence 2: tensor([[ 0.9907, -0.9139]], grad_fn=<AddmmBackward0>)
Logits of both sentences: tensor([[ 1.5694, -1.3895],
        [ 0.9907, -0.9139]], grad_fn=<AddmmBackward0>)
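The padding in the cells above is written by hand for two short sequences. As a rough sketch, the same idea can be generalized to any number of sequences by padding each one on the right up to the length of the longest. The helper name `pad_batch` is hypothetical and exists only for this illustration.

```python
def pad_batch(sequences, pad_id):
    """Right-pad every sequence with pad_id to the length of the longest one (hypothetical helper)."""
    max_len = max((len(seq) for seq in sequences), default=0)
    return [list(seq) + [pad_id] * (max_len - len(seq)) for seq in sequences]


padded = pad_batch([[200, 200, 200], [200, 200]], pad_id=100)
print(padded)  # [[200, 200, 200], [200, 200, 100]]
```

The result is rectangular, so it can be passed to `torch.tensor`. In practice, calling the tokenizer on a list of sentences with `padding=True` performs this step.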


## Attention masks

In [7]:
# Create the batched_ids list with 2 lists. The first list contains the value 200 repeated 3 times and the second list contains the value 200 repeated 2 times followed by the padding_id repeated 1 time.
batched_ids = [[200] * 3, [200] * 2 + [padding_id]]
# Create the attention_mask list with 2 lists. The first list contains the value 1 repeated 3 times and the second list contains the value 1 repeated 2 times followed by the value 0 repeated 1 time, so the padding position is ignored.
attention_mask = [[1] * 3, [1] * 2 + [0]]
# Create outputs from the model with batched_ids and attention_mask.
outputs = model(torch.tensor(batched_ids), attention_mask=torch.tensor(attention_mask))
# Print logits with the text: "Logits: ".
print("Logits:", outputs.logits)


Logits: tensor([[ 1.5694, -1.3895],
        [ 0.5803, -0.4125]], grad_fn=<AddmmBackward0>)
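The second row of logits changes once the attention mask is supplied, because the padding position no longer receives any attention weight. The sketch below is a simplified, single-query illustration of that effect and not DistilBERT's actual implementation: it assumes that masked positions are excluded from the softmax by setting their scores to negative infinity, which is similar in spirit to how attention layers commonly apply a mask. The function name `masked_softmax` and the example scores are made up for the illustration.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over scores, giving zero weight to positions where mask is 0 (illustrative only)."""
    if not any(mask):
        raise ValueError("at least one position must be unmasked")
    # Masked positions get -inf so that exp() maps them to exactly 0.
    masked = [s if m == 1 else float("-inf") for s, m in zip(scores, mask)]
    top = max(masked)
    exps = [math.exp(s - top) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up attention scores for three positions; the last one is padding.
scores = [2.0, 1.0, 3.0]

without_mask = masked_softmax(scores, [1, 1, 1])
with_mask = masked_softmax(scores, [1, 1, 0])

print("Without mask:", [round(w, 4) for w in without_mask])
print("With mask:   ", [round(w, 4) for w in with_mask])
```

Without the mask, the padding position takes the largest share of the attention; with the mask, its weight is zero and the remaining weights are renormalized over the real tokens.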
