In [1]:
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

raw_inputs = [
    "I've been waiting for a HuggingFace Course for a very long time.",
    "I love and hate this so much."
]
inputs = tokenizer(
    raw_inputs,
    padding=True,
    truncation=True,
    return_tensors="pt"
)

In [2]:
from transformers import AutoModel

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModel.from_pretrained(checkpoint)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)

torch.Size([2, 18, 768])


The output is a high-dimensional tensor: a contextual representation of the sentences we passed in, which is not directly useful for our sentiment classification problem.
Here the tensor has shape `[2, 18, 768]`: `2` sentences, `18` tokens per sentence (after padding), and a last dimension equal to the model's hidden size, `768`.

To get an output that is linked to our classification problem, we'll use the `AutoModelForSequenceClassification` class, which adds a sequence classification head on top of the base model.

In [3]:
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(**inputs)
print(outputs.logits)

tensor([[ 2.0498, -1.7484],
        [-1.8256,  1.8829]], grad_fn=<AddmmBackward0>)


The output is now a tensor of size 2x2: **one row for each sentence and one column for each possible label.** <br>
These values are not probabilities yet: the entries in each row do not sum to 1. This is because every model in the Transformers library returns `logits`, the raw, unnormalized scores from the last layer. To turn them into probabilities, they need to go through a softmax.
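
As a minimal sketch of that last step, the snippet below applies a softmax to each row of the logits printed above. The values are copied by hand from that output so the example runs without downloading the model, and the `softmax` helper is a hypothetical pure-Python illustration rather than part of the Transformers API.

```python
import math

# Logits copied from the printed output above; in the notebook they
# would come from outputs.logits instead of being hard-coded.
logits = [
    [2.0498, -1.7484],
    [-1.8256, 1.8829],
]


def softmax(row):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtract the row maximum before exponentiating for numerical stability.
    row_max = max(row)
    exps = [math.exp(x - row_max) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# One probability distribution per sentence (one row per sentence).
probabilities = [softmax(row) for row in logits]

for row in probabilities:
    print([round(p, 4) for p in row], "sum =", round(sum(row), 4))
```

Inside the notebook, the same result can be obtained directly on the tensor with `torch.nn.functional.softmax(outputs.logits, dim=-1)`, which normalizes along the label dimension.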