<a href="https://colab.research.google.com/github/kurtsenol/Transformers/blob/main/02_04_Handling_multiple_sequences.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Handling multiple sequences (TensorFlow)

Install the Transformers and Datasets libraries to run this notebook.

In [None]:
! pip install datasets transformers[sentencepiece]

In [None]:
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

sequence = "I've been waiting for a HuggingFace course my whole life."

tokens = tokenizer.tokenize(sequence)
ids = tokenizer.convert_tokens_to_ids(tokens)
input_ids = tf.constant(ids)
# This call fails: input_ids is a 1-D tensor, but the model expects a batch (2-D) of sequences.
model(input_ids)

All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


InvalidArgumentError: ignored

In [None]:
tokenized_inputs = tokenizer(sequence, return_tensors="tf")
print(tokenized_inputs["input_ids"])

tf.Tensor(
[[  101  1045  1005  2310  2042  3403  2005  1037 17662 12172  2607  2026
   2878  2166  1012   102]], shape=(1, 16), dtype=int32)


In [None]:
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

sequence = "I've been waiting for a HuggingFace course my whole life."

tokens = tokenizer.tokenize(sequence)
ids = tokenizer.convert_tokens_to_ids(tokens)

Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertForSequenceClassification: ['dropout_19']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['dropout_79']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
print(ids)

[1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012]
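
The IDs printed here differ from the earlier `tokenizer(sequence, return_tensors="tf")` output only at the two ends: the tokenizer call wraps the sequence in 101 and 102. A minimal sketch with plain lists makes the difference explicit. Treating 101 and 102 as the start and end special-token IDs is an assumption read off the printed output; in the notebook, `tokenizer.cls_token_id` and `tokenizer.sep_token_id` would be the values to use.

```python
# IDs printed above by convert_tokens_to_ids (no special tokens).
ids = [1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012]

# IDs printed earlier by tokenizer(sequence, return_tensors="tf").
tokenizer_ids = [101, 1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026,
                 2878, 2166, 1012, 102]

# Assumed special-token IDs, read off the tokenizer output above.
cls_id, sep_id = 101, 102

# Wrapping the raw IDs in the two special tokens reproduces the tokenizer output.
with_special_tokens = [cls_id] + ids + [sep_id]
print(with_special_tokens == tokenizer_ids)
print(len(ids), len(with_special_tokens))
```

This also accounts for the shapes seen in the outputs: `(1, 16)` from the tokenizer call versus `(1, 14)` for the manually converted IDs.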


In [None]:
input_ids = tf.constant([ids])
print("Input IDs:", input_ids)

output = model(input_ids)
print("Logits:", output.logits)

Input IDs: tf.Tensor(
[[ 1045  1005  2310  2042  3403  2005  1037 17662 12172  2607  2026  2878
   2166  1012]], shape=(1, 14), dtype=int32)
Logits: tf.Tensor([[-2.7276192  2.8789363]], shape=(1, 2), dtype=float32)


In [None]:
batched_ids = [ids, ids]
print(batched_ids)

[[1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012], [1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012]]


In [None]:
input_ids = tf.constant(batched_ids)
print(input_ids)

tf.Tensor(
[[ 1045  1005  2310  2042  3403  2005  1037 17662 12172  2607  2026  2878
   2166  1012]
 [ 1045  1005  2310  2042  3403  2005  1037 17662 12172  2607  2026  2878
   2166  1012]], shape=(2, 14), dtype=int32)


In [None]:
output = model(input_ids)
print("Logits:", output.logits)

Logits: tf.Tensor(
[[-2.7276204  2.878937 ]
 [-2.72762    2.8789363]], shape=(2, 2), dtype=float32)


In [None]:
# Not rectangular: the rows have different lengths, so this cannot be converted to a tensor.
batched_ids = [
  [200, 200, 200],
  [200, 200]
]

In [None]:
padding_id = 100

batched_ids = [
  [200, 200, 200],
  [200, 200, padding_id]
]

In [None]:
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

sequence1_ids = [[200, 200, 200]]
sequence2_ids = [[200, 200]]
batched_ids = [
    [200, 200, 200],
    [200, 200, tokenizer.pad_token_id]
]

print(model(tf.constant(sequence1_ids)).logits)
print(model(tf.constant(sequence2_ids)).logits)
print(model(tf.constant(batched_ids)).logits)

tf.Tensor([[ 1.5693673 -1.3894573]], shape=(1, 2), dtype=float32)
tf.Tensor([[ 0.5803025 -0.4125263]], shape=(1, 2), dtype=float32)
tf.Tensor(
[[ 1.5693673 -1.3894577]
 [ 1.3373486 -1.2163193]], shape=(2, 2), dtype=float32)


In [None]:
batched_ids = [
    [200, 200, 200],
    [200, 200, tokenizer.pad_token_id]
]

attention_mask = [
  [1, 1, 1],
  [1, 1, 0]
]

outputs = model(tf.constant(batched_ids), attention_mask=tf.constant(attention_mask))
print(outputs.logits)

tf.Tensor(
[[ 1.5693673  -1.3894577 ]
 [ 0.5803034  -0.41252705]], shape=(2, 2), dtype=float32)
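
Writing the padding and the attention mask by hand is error-prone once sequences get longer. The sketch below builds both from a ragged list of ID lists. `pad_batch` is a hypothetical helper, not part of Transformers, and `pad_id=0` is an assumption; in the notebook, `tokenizer.pad_token_id` would be passed instead.

```python
from typing import List, Tuple


def pad_batch(sequences: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest one and build the matching attention mask."""
    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Real tokens first, then padding on the right.
        padded.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks a padding token the model should ignore.
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask


# pad_id=0 is an assumed value for illustration.
padded_ids, padded_mask = pad_batch([[200, 200, 200], [200, 200]], pad_id=0)
print(padded_ids)
print(padded_mask)
```

Both returned lists are rectangular, so they can be handed to `tf.constant` and passed to the model as `input_ids` and `attention_mask`, as in the cell above.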


In [None]:
# Sequences longer than the model's maximum length can be truncated, for example:
# sequence = sequence[:max_sequence_length]
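
As a minimal sketch of that truncation applied to a list of token IDs: `truncate_ids` is a hypothetical helper, and the limit of 8 is chosen only for illustration. The real limit depends on the checkpoint.

```python
from typing import List


def truncate_ids(ids: List[int], max_sequence_length: int) -> List[int]:
    """Keep at most max_sequence_length IDs, dropping anything past the limit."""
    return ids[:max_sequence_length]


max_sequence_length = 8  # hypothetical limit, for illustration only
long_ids = list(range(1000, 1014))  # 14 placeholder IDs
truncated = truncate_ids(long_ids, max_sequence_length)
print(len(long_ids), len(truncated))
```

Slicing leaves sequences that are already short enough unchanged, so the helper can be applied to every sequence in a batch before padding.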

In [None]:
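# Note the typographic apostrophe (’) in sentence1: it maps to a different token ID (1521)
# than the ASCII apostrophe (1005) used in the earlier example.
sentence1 = "I’ve been waiting for a HuggingFace course my whole life."
sentence2 = "I hate this so much!"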

In [None]:
sentence1_tokens = tokenizer.tokenize(sentence1)
sentence1_ids = tokenizer.convert_tokens_to_ids(sentence1_tokens)

In [None]:
print(sentence1_ids)

[1045, 1521, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012]


In [None]:
print(type(sentence1_ids))

<class 'list'>


In [None]:
sentence1_input_ids = tf.constant([sentence1_ids])
print("Input IDs:", sentence1_input_ids)

sentence1_output = model(sentence1_input_ids)
print("Logits:", sentence1_output.logits)

Input IDs: tf.Tensor(
[[ 1045  1521  2310  2042  3403  2005  1037 17662 12172  2607  2026  2878
   2166  1012]], shape=(1, 14), dtype=int32)
Logits: tf.Tensor([[-2.5719736  2.685238 ]], shape=(1, 2), dtype=float32)


In [None]:
sentence2_tokens = tokenizer.tokenize(sentence2)
sentence2_ids = tokenizer.convert_tokens_to_ids(sentence2_tokens)

In [None]:
print(sentence2_ids)

[1045, 5223, 2023, 2061, 2172, 999]


In [None]:
sentence2_input_ids = tf.constant([sentence2_ids])
print("Input IDs:", sentence2_input_ids)

sentence2_output = model(sentence2_input_ids)
print("Logits:", sentence2_output.logits)

Input IDs: tf.Tensor([[1045 5223 2023 2061 2172  999]], shape=(1, 6), dtype=int32)
Logits: tf.Tensor([[ 3.1930926 -2.6685236]], shape=(1, 2), dtype=float32)


In [None]:
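# list.extend() mutates the list in place and returns None, so build the padded row with + instead.
pad_length = len(sentence1_ids) - len(sentence2_ids)  # 14 - 6 = 8
batched_ids = [
    sentence1_ids,
    sentence2_ids + [tokenizer.pad_token_id] * pad_length,
]

In [None]:
# 1 marks a real token, 0 marks padding: sentence2 has 6 real tokens followed by 8 padding tokens.
attention_mask = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]

In [None]:
# With the mask applied, each row should closely match the logits of the
# corresponding single-sequence call above.
outputs = model(tf.constant(batched_ids), attention_mask=tf.constant(attention_mask))
print(outputs.logits)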