## Handling multiple sequences

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

sequence = "I've been waiting for a HuggingFace course my whole life."

tokens = tokenizer.tokenize(sequence)
ids = tokenizer.convert_tokens_to_ids(tokens)
input_ids = torch.tensor(ids)
# This line will fail: the model expects a batch (a 2D tensor), but input_ids is 1D.
model(input_ids)

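>In the previous exercise you saw how sequences get translated into lists of numbers. Let’s convert this list of numbers to a tensor and send it to the model. The call above fails because the model expects a batch of sequences, so the cells below wrap the IDs in an extra list to add a batch dimension: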

In [4]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

In [5]:
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

In [6]:
sequence = "I've been waiting for a HuggingFace course my whole life."

In [7]:
tokens = tokenizer.tokenize(sequence)

In [8]:
ids = tokenizer.convert_tokens_to_ids(tokens)
# Wrap ids in a list to add the batch dimension: shape (1, sequence_length).
input_ids = torch.tensor([ids])

In [9]:
print("Input IDs:", input_ids)

Input IDs: tensor([[ 1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,  2607,
          2026,  2878,  2166,  1012]])
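
>The difference between the failing call and the working one is only the shape of the input. As a rough illustration that avoids loading the model, the sketch below uses a hypothetical pure-Python helper, `nested_shape` (not part of PyTorch or Transformers), to mimic what `torch.tensor(...).shape` would report for the same nested lists.

```python
def nested_shape(data):
    """Return the shape of a nested list, e.g. (2, 3) for two rows of three items.

    Raises ValueError if the rows have different shapes (a "ragged" list).
    """
    if not isinstance(data, list):
        return ()  # a scalar has an empty shape
    if not data:
        return (0,)
    inner_shapes = {nested_shape(item) for item in data}
    if len(inner_shapes) != 1:
        raise ValueError("ragged nested list: rows have different shapes")
    return (len(data),) + inner_shapes.pop()


# Token IDs printed above for the example sentence.
ids = [1045, 1005, 2310, 2042, 3403, 2005, 1037,
       17662, 12172, 2607, 2026, 2878, 2166, 1012]

single_shape = nested_shape(ids)     # one sequence, no batch dimension
batched_shape = nested_shape([ids])  # wrapped in a list: a batch of one sequence

print(single_shape)   # (14,)
print(batched_shape)  # (1, 14)
```

>The model expects the second form: the first axis indexes the sequences in the batch and the second axis indexes the tokens.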


In [10]:
output = model(input_ids)

In [11]:
print("logits:", output.logits)

logits: tensor([[-2.7276,  2.8789]], grad_fn=<AddmmBackward0>)


>Batching is the act of sending multiple sentences through the model all at once. If you only have one sentence, you can build a batch containing a single sequence. Below, the same sequence is sent twice in one batch:

In [15]:
batched_ids = [ids, ids]
batched_input_ids = torch.tensor(batched_ids)

In [17]:
output_1 = model(batched_input_ids)

print("logits:", output_1.logits)

logits: tensor([[-2.7276,  2.8789],
        [-2.7276,  2.8789]], grad_fn=<AddmmBackward0>)


## Padding the inputs
>The two lists below have different lengths, so they cannot be converted to a tensor directly. We’ll use padding to give our tensors a rectangular shape.

In [18]:
batched_ids = [
    [200, 200, 200],
    [200, 200]
]

In [19]:
# Simple example: fill the shorter sequence with a placeholder padding ID.
padding_id = 100

batched_ids = [
    [200, 200, 200],
    [200, 200, padding_id],
]
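
>The manual padding above can be generalized. The following is a minimal sketch, assuming right-padding to the length of the longest sequence; `pad_batch` is a hypothetical helper name, not a Transformers function, and the value `100` is the placeholder `padding_id` from the cell above.

```python
def pad_batch(sequences, pad_id):
    """Right-pad every sequence with pad_id up to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


ragged = [[200, 200, 200], [200, 200]]
padded = pad_batch(ragged, pad_id=100)

print(padded)  # [[200, 200, 200], [200, 200, 100]]
```

>With a real tokenizer, the padding ID should come from `tokenizer.pad_token_id` rather than a hard-coded value.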

In [20]:
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

sequence1_ids = [[200, 200, 200]]
sequence2_ids = [[200, 200]]
batched_ids = [
    [200, 200, 200],
    [200, 200, tokenizer.pad_token_id],
]

print(model(torch.tensor(sequence1_ids)).logits)
print(model(torch.tensor(sequence2_ids)).logits)
print(model(torch.tensor(batched_ids)).logits)

We strongly recommend passing in an `attention_mask` since your input_ids may be padded. See https://huggingface.co/docs/transformers/troubleshooting#incorrect-output-when-padding-tokens-arent-masked.


tensor([[ 1.5694, -1.3895]], grad_fn=<AddmmBackward0>)
tensor([[ 0.5803, -0.4125]], grad_fn=<AddmmBackward0>)
tensor([[ 1.5694, -1.3895],
        [ 1.3374, -1.2163]], grad_fn=<AddmmBackward0>)


>There is a problem with our batched predictions: the second row should be the same as the logits for the second sentence, but we’ve got completely different values!
>
>This is because the key feature of Transformer models is attention layers that contextualize each token. These layers take the padding tokens into account, since they attend to all of the tokens of a sequence.
>
>To get the same result with and without padding, we need to tell those attention layers to ignore the padding tokens. This is done by using an attention mask.
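
>To see why a mask removes the influence of padding, here is a toy sketch of a masked softmax over attention scores. It is an illustration under simplifying assumptions (one query, made-up scores, a hypothetical helper named `masked_softmax`), not DistilBERT's actual implementation.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over scores, giving zero weight to positions where mask is 0.

    Assumes at least one position is unmasked.
    """
    # Replace masked scores with -inf so that exp(-inf) == 0.0.
    filled = [s if m == 1 else float("-inf") for s, m in zip(scores, mask)]
    highest = max(filled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in filled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # made-up attention scores; the last position is padding

unmasked = masked_softmax(scores, [1, 1, 1])     # padding receives some weight
masked = masked_softmax(scores, [1, 1, 0])       # padding receives zero weight
without_pad = masked_softmax(scores[:2], [1, 1]) # the unpadded sequence

print(unmasked)
print(masked)
print(without_pad)
```

>With the mask applied, the weights on the real tokens match those of the unpadded sequence, which is why the masked batch can reproduce the single-sequence logits.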

In [21]:
attention_mask = [
    [1, 1, 1],
    [1, 1, 0],
]

In [23]:
outputs = model(torch.tensor(batched_ids), attention_mask=torch.tensor(attention_mask))
print(outputs.logits)


tensor([[ 1.5694, -1.3895],
        [ 0.5803, -0.4125]], grad_fn=<AddmmBackward0>)


>Now check the second row: it matches the logits we got for the second sequence on its own.
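
>The attention mask was written by hand above. As a small sketch, it can also be derived from the original sequence lengths, assuming the batch is right-padded; `attention_mask_from_lengths` is a hypothetical helper name used only for this illustration.

```python
def attention_mask_from_lengths(lengths):
    """Build a 0/1 mask per sequence: 1 for real tokens, 0 for right-padding."""
    max_len = max(lengths)
    return [[1] * n + [0] * (max_len - n) for n in lengths]


sequences = [[200, 200, 200], [200, 200]]
mask = attention_mask_from_lengths([len(seq) for seq in sequences])

print(mask)  # [[1, 1, 1], [1, 1, 0]]
```

>Building the mask from lengths, rather than by comparing IDs against the padding ID, avoids masking a real token that happens to share that ID.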

## Truncation
>With Transformer models, there is a limit to the lengths of the sequences we can pass the models. Most models handle sequences of up to 512 or 1024 tokens, and will crash when asked to process longer sequences.

>Solutions:
>1. Use a model with a longer supported sequence length.
>2. Truncate your sequences.

In [25]:
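# Otherwise, we recommend you truncate your sequences by specifying a maximum sequence length.
# Truncate the list of token IDs, not the raw string, so that the limit counts tokens.
max_sequence_length = 512  # the DistilBERT checkpoint used here accepts at most 512 tokens
ids = ids[:max_sequence_length]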
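
>Truncation is meant to cap the number of tokens, so it applies to lists of token IDs. Below is a minimal sketch for a whole batch, assuming we simply keep the first `max_length` IDs of each sequence; `truncate_batch` is a hypothetical helper and the IDs are made-up placeholders.

```python
def truncate_batch(batch_ids, max_length):
    """Keep at most max_length token IDs from the start of each sequence."""
    return [ids[:max_length] for ids in batch_ids]


batch = [[200, 201, 202, 203], [200, 201]]
truncated = truncate_batch(batch, max_length=3)

print(truncated)  # [[200, 201, 202], [200, 201]]
```

>Sequences that are already short enough are left unchanged, so truncation is usually followed by padding to bring the batch back to a rectangular shape.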