#### Behind the pipeline, there are three major steps:

1. Tokenizer
2. Model
3. Postprocessing

#### 1. Tokenizer
The first step of the pipeline is to convert the text inputs into numbers that the model can make sense of.

The tokenizer does the following:
a. Splits the input into words, subwords, or symbols, called tokens
b. Maps each token to an integer
c. Adds additional inputs that may be useful to the model, such as the attention mask
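The three steps above can be sketched in plain Python. This is only a toy illustration: `TOY_VOCAB` and `toy_encode` are hypothetical names, the vocabulary is made up, and splitting on whitespace stands in for the learned subword splitting that a real tokenizer performs.

```python
# Toy illustration only: a made-up vocabulary and a whitespace split.
# A real tokenizer uses a learned subword vocabulary instead.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "i": 4, "like": 5, "this": 6, "movie": 7}

def toy_encode(sentences):
    # a. Split each sentence into tokens (here: lowercase, then split on spaces).
    tokenized = [s.lower().split() for s in sentences]
    # c. Add special tokens that mark the start and end of each sequence.
    tokenized = [["[CLS]"] + toks + ["[SEP]"] for toks in tokenized]
    # b. Map each token to an integer, falling back to [UNK] for unknown tokens.
    ids = [[TOY_VOCAB.get(t, TOY_VOCAB["[UNK]"]) for t in toks] for toks in tokenized]
    # c. Pad to the longest sequence and record which positions hold real tokens.
    max_len = max(len(seq) for seq in ids)
    input_ids = [seq + [TOY_VOCAB["[PAD]"]] * (max_len - len(seq)) for seq in ids]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in ids]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

encoded = toy_encode(["I like this movie", "I like pizza"])
print(encoded["input_ids"])       # [[2, 4, 5, 6, 7, 3], [2, 4, 5, 1, 3, 0]]
print(encoded["attention_mask"])  # [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 0]]
```

The shorter sentence is padded with the `[PAD]` id, and the attention mask marks that padded position with 0 so it can be ignored.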

In [2]:
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Once we have the tokenizer, we can pass our sentences directly to it and get back a dictionary.
# We can pass a single sentence or a list of sentences.
raw_inputs = [
    "I've been studing engineering and i dont dont know why",
    "I dont understand these subject. Its so complicated",
]

# padding=True pads the shorter sentence to the length of the longest one in the batch,
# truncation=True cuts off inputs longer than the model's maximum length,
# and return_tensors="pt" returns PyTorch tensors.
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
print(inputs)

{'input_ids': tensor([[  101,  1045,  1005,  2310,  2042, 16054,  2075,  3330,  1998,  1045,
          2123,  2102,  2123,  2102,  2113,  2339,   102],
        [  101,  1045,  2123,  2102,  3305,  2122,  3395,  1012,  2049,  2061,
          8552,   102,     0,     0,     0,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]])}


#### 2. Model

We can download the pretrained model the same way we downloaded the tokenizer.
For each model input, we’ll retrieve a high-dimensional vector representing the Transformer model's contextual understanding of that input.

In [3]:
from transformers import AutoModel

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModel.from_pretrained(checkpoint)

Some weights of the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing DistilBertModel: ['classifier.weight', 'pre_classifier.weight', 'pre_classifier.bias', 'classifier.bias']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [4]:
outputs = model(**inputs)
print(outputs)

BaseModelOutput(last_hidden_state=tensor([[[-0.0869,  0.8823, -0.4551,  ...,  0.1345, -0.2991,  0.0832],
         [-0.1462,  0.8814, -0.2718,  ..., -0.0106, -0.2913,  0.1056],
         [-0.0470,  0.5186, -0.2083,  ..., -0.0832, -0.5778, -0.5181],
         ...,
         [-0.5489,  0.9351, -0.1455,  ..., -0.3314, -0.5637, -0.0822],
         [-0.7415,  0.9158, -0.6319,  ..., -0.0727, -0.7742,  0.2345],
         [-0.2190,  0.7717, -0.1616,  ..., -0.0045, -0.5507, -0.4335]],

        [[-0.3710,  0.5880, -0.1502,  ..., -0.2938, -0.1979,  0.3746],
         [-0.4135,  0.9116,  0.1039,  ..., -0.2464, -0.2319,  0.2671],
         [-0.5402,  0.8410,  0.3072,  ..., -0.3392, -0.3055,  0.1072],
         ...,
         [-0.4916,  0.4242, -0.1777,  ..., -0.3625, -0.1853,  0.4384],
         [-0.3994,  0.4076, -0.2245,  ..., -0.2317, -0.1596,  0.3773],
         [-0.3991,  0.4054, -0.2113,  ..., -0.2379, -0.1343,  0.3737]]],
       grad_fn=<NativeLayerNormBackward0>), hidden_states=None, attentions=None)


In [7]:
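# The shape is (batch size, sequence length, hidden size):
# 2 sentences, 17 tokens each after padding, and a 768-dimensional vector per token.
print(outputs.last_hidden_state.shape)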

torch.Size([2, 17, 768])


In [8]:
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(**inputs)

# Since we have just two sentences and two labels, the logits have shape 2 x 2.
print(outputs.logits.shape)

torch.Size([2, 2])
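To see how a tensor of hidden states can shrink to one row of scores per sentence, here is a minimal sketch of a classification head in plain Python. It is a simplification under stated assumptions: the sizes and weights are made up, `linear` and `toy_classification_head` are hypothetical helpers, and the head is reduced to a single linear layer applied to the first token's vector. The real head has more parts, as the `pre_classifier` and `classifier` weights named in the loading message above suggest.

```python
# Toy sketch: made-up sizes and weights, not the real DistilBERT parameters.
def linear(x, weight, bias):
    # weight holds one row per output feature (here: one row per label).
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def toy_classification_head(hidden_states, weight, bias):
    # hidden_states is indexed as [batch][sequence][hidden].
    # Keep only the vector of the first token of each sequence.
    first_token = [sequence[0] for sequence in hidden_states]
    # Project each of those vectors down to one score per label.
    return [linear(vec, weight, bias) for vec in first_token]

# A batch of 2 sequences, 3 tokens each, hidden size 4.
hidden_states = [
    [[0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
    [[0.5, -0.5, 0.5, -0.5], [0.2, 0.2, 0.2, 0.2], [0.0, 1.0, 0.0, 1.0]],
]
weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # 2 labels x hidden size 4
bias = [0.0, 0.0]

logits = toy_classification_head(hidden_states, weight, bias)
print(len(logits), len(logits[0]))  # 2 2
```

Whatever the sequence length, the result has one row per sentence and one column per label.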


#### 3. Postprocessing the output

As the output of the model, we get logits. Logits are the raw, unnormalized scores produced by the last layer of the model.

We need to convert the logits into probabilities. To do that, we pass them through a softmax layer.
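As a rough illustration of what softmax computes, here is a minimal plain-Python sketch. The logit values are made up for the example and are not taken from the model.

```python
import math

def softmax(logits):
    # Subtracting the maximum keeps exp() from overflowing and does not change the result.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    # Each score becomes a positive number, and together they sum to 1.
    return [e / total for e in exps]

example_logits = [2.0, -1.0]  # made-up scores for two labels
probs = softmax(example_logits)
print(probs)
print(sum(probs))
```

The larger logit gets the larger probability, and the outputs can be read as the model's confidence in each label.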

In [9]:
import torch

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)

tensor([[9.9905e-01, 9.5343e-04],
        [9.9656e-01, 3.4362e-03]], grad_fn=<SoftmaxBackward0>)


In [10]:
model.config.id2label

{0: 'NEGATIVE', 1: 'POSITIVE'}
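Putting the last two outputs together, each row of probabilities can be turned into a label by picking the index with the highest value. The sketch below copies the probabilities and the `id2label` mapping printed above into plain Python lists; `best_index` and `labels` are names introduced here for illustration.

```python
# Values copied from the printed outputs above.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
probabilities = [
    [9.9905e-01, 9.5343e-04],  # first sentence
    [9.9656e-01, 3.4362e-03],  # second sentence
]

labels = []
for row in probabilities:
    # Index of the highest probability in this row.
    best_index = max(range(len(row)), key=row.__getitem__)
    labels.append(id2label[best_index])

print(labels)  # ['NEGATIVE', 'NEGATIVE']
```

Both sentences are classified as NEGATIVE, with probabilities of about 0.999 and 0.997.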