# Behind the pipeline

In [None]:
!pip install transformers

In [4]:
from transformers import pipeline
from transformers import AutoTokenizer

In [5]:
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

In [14]:
# raw_inputs = [
#     "I've been waiting for a HuggingFace course my whole life.",
#     "I hate this so much!",
# ]

raw_inputs = [
    "A few years ago, I decided to study the field of data science and never regretted the decision.",
    "I was pleasantly surprized with the way they introduced and explained the final decision.",
]

inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="tf")
print(inputs)

{'input_ids': <tf.Tensor: shape=(2, 22), dtype=int32, numpy=
array([[  101,  1037,  2261,  2086,  3283,  1010,  1045,  2787,  2000,
         2817,  1996,  2492,  1997,  2951,  2671,  1998,  2196, 18991,
         1996,  3247,  1012,   102],
       [  101,  1045,  2001, 27726,  7505, 18098,  3550,  2007,  1996,
         2126,  2027,  3107,  1998,  4541,  1996,  2345,  3247,  1012,
          102,     0,     0,     0]], dtype=int32)>, 'attention_mask': <tf.Tensor: shape=(2, 22), dtype=int32, numpy=
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]],
      dtype=int32)>}
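The second row above ends in three `0` ids, and its attention mask ends in three `0`s. That is the effect of `padding=True`. As a rough illustration of what happens conceptually, here is a minimal pure-Python sketch. The `pad_batch` helper is hypothetical (it is not part of 🤗 Transformers), and the pad id of `0` is read off the printed output rather than queried from the tokenizer.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad lists of token ids to a common length and build attention masks.

    Hypothetical helper for illustration only; the real tokenizer does this internally.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding that the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Token ids copied from the tokenizer output above, with the trailing padding removed.
first = [101, 1037, 2261, 2086, 3283, 1010, 1045, 2787, 2000, 2817, 1996,
         2492, 1997, 2951, 2671, 1998, 2196, 18991, 1996, 3247, 1012, 102]
second = [101, 1045, 2001, 27726, 7505, 18098, 3550, 2007, 1996, 2126, 2027,
          3107, 1998, 4541, 1996, 2345, 3247, 1012, 102]

input_ids, attention_mask = pad_batch([first, second])
print(len(input_ids[0]), len(input_ids[1]))
print(attention_mask[1])
```

Both rows end up with the same length, which is what lets them be stacked into a single tensor of shape `(2, 22)`.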


In [16]:
from transformers import TFAutoModel
model = TFAutoModel.from_pretrained(checkpoint)

Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertModel: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
- This IS expected if you are initializing TFDistilBertModel from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertModel from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFDistilBertModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertModel for predictions without further training.


In [17]:
outputs = model(inputs)
print(outputs.last_hidden_state.shape)

(2, 22, 768)


In [18]:
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(inputs)

All PyTorch model weights were used when initializing TFDistilBertForSequenceClassification.

All the weights of TFDistilBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


In [19]:
print(outputs.logits.shape)

(2, 2)


Since we passed two sentences and the model has two labels, the logits have shape 2 x 2: one row per sentence and one column per label.

In [20]:
print(outputs.logits)

tf.Tensor(
[[-3.0995371  3.2634885]
 [-3.1437578  3.3439093]], shape=(2, 2), dtype=float32)


Our model predicted `[-3.10, 3.26]` for the first sentence and `[-3.14, 3.34]` for the second one. Those are not probabilities but *logits*, the raw, unnormalized scores output by the last layer of the model. All 🤗 Transformers models output logits, because the loss function used for training generally fuses the last activation function, such as SoftMax, with the actual loss function, such as cross entropy. To convert the logits to probabilities, pass them through a SoftMax layer:

In [21]:
import tensorflow as tf

predictions = tf.math.softmax(outputs.logits, axis=-1)
print(predictions)

tf.Tensor(
[[0.00172117 0.99827886]
 [0.00151978 0.99848026]], shape=(2, 2), dtype=float32)
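To make the SoftMax step less of a black box, the same numbers can be reproduced by hand. The following is a minimal sketch in plain Python, not how TensorFlow implements it. The `softmax` function is written here for illustration, and the logits are copied from the output printed earlier.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp().
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the model output printed above.
logits = [
    [-3.0995371, 3.2634885],
    [-3.1437578, 3.3439093],
]

probabilities = [softmax(row) for row in logits]
for row in probabilities:
    print([round(p, 5) for p in row])
```

Each row is normalized independently, which corresponds to `axis=-1` in the `tf.math.softmax` call.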


To get the labels corresponding to each position, we can inspect the `id2label` attribute of the model config (more on this in the next section):

In [22]:
model.config.id2label

{0: 'NEGATIVE', 1: 'POSITIVE'}
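
Putting the last two outputs together gives the final answer for each sentence. The sketch below assumes the probabilities and the label mapping printed above, copied in as plain Python values so that it runs without the model. The `results` structure is just one convenient way to present them.

```python
# Values copied from the outputs printed above.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
predictions = [
    [0.00172117, 0.99827886],
    [0.00151978, 0.99848026],
]

results = []
for probs in predictions:
    # The predicted class is the position with the highest probability.
    best = max(range(len(probs)), key=lambda i: probs[i])
    results.append({"label": id2label[best], "score": probs[best]})

for result in results:
    print(result)
```

Both sentences are classified as `POSITIVE` with a score above 0.99.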