In [1]:
# Pipeline example: run sentiment analysis end to end
from transformers import pipeline
sent = pipeline('sentiment-analysis')
sent(['i hate this subject'])

[{'label': 'NEGATIVE', 'score': 0.9996384978294373}]

In [2]:
# Splitting the pipeline into steps
# The default checkpoint of the sentiment-analysis pipeline is distilbert-base-uncased-finetuned-sst-2-english
from transformers import AutoTokenizer
checkpoint = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

In [3]:
# Transformer models only accept tensors as input.
# Preprocessing must be done in exactly the same way as when the model was pretrained.
# The return_tensors argument selects the tensor type to get back: PyTorch ('pt'), TensorFlow ('tf'), or plain NumPy ('np').
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors='tf')
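
To make the effect of `padding=True` more concrete, here is a minimal pure-Python sketch of padding a batch to its longest sequence and building the matching attention mask. The token IDs, the `PAD_ID` value, and the `pad_batch` helper are all invented for illustration; a real tokenizer produces different IDs and may pad differently.

```python
# Hypothetical token IDs, invented for illustration only.
sequences = [
    [101, 7592, 2088, 999, 102],  # a longer sentence
    [101, 7592, 102],             # a shorter sentence
]
PAD_ID = 0  # assumption: the padding token has ID 0


def pad_batch(seqs, pad_id):
    """Pad every sequence on the right to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    input_ids = [s + [pad_id] * (max_len - len(s)) for s in seqs]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch(sequences, PAD_ID)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Once every row has the same length, the batch can be stored as a rectangular tensor, which is what `return_tensors` asks for.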

There are many different architectures available in 🤗 Transformers, each designed to tackle a specific task. Here is a non-exhaustive list:

* Model (retrieve the hidden states)
* ForCausalLM
* ForMaskedLM
* ForMultipleChoice
* ForQuestionAnswering
* ForSequenceClassification
* ForTokenClassification
* and others 🤗

For our example, we need a model with a sequence classification head, so that it can classify the sentences as positive or negative. So we won't use the TFAutoModel class, but TFAutoModelForSequenceClassification:

In [4]:
from transformers import TFAutoModelForSequenceClassification
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


In [5]:
outputs=model(inputs)

In [6]:
print(outputs.logits.shape)

(2, 2)


In [7]:
# Postprocessing: inspect the raw logits
outputs.logits

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-1.5606962,  1.6122814],
       [ 4.169231 , -3.3464472]], dtype=float32)>

In [8]:
import tensorflow as tf

predictions = tf.math.softmax(outputs.logits, axis=-1)
print(predictions)

tf.Tensor(
[[4.0195383e-02 9.5980465e-01]
 [9.9945587e-01 5.4418424e-04]], shape=(2, 2), dtype=float32)
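
As a sanity check on what `tf.math.softmax` computes, the same conversion can be sketched in plain Python. The logits are copied from the cell output above; the `softmax` helper is written here for illustration and is not part of any library.

```python
import math


def softmax(row):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the model output shown earlier (one row per sentence).
logits = [
    [-1.5606962, 1.6122814],
    [4.169231, -3.3464472],
]
probs = [softmax(row) for row in logits]
for row in probs:
    print(row)
```

Each row should agree with the TensorFlow result up to floating-point rounding.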


In [9]:
model.config.id2label

{0: 'NEGATIVE', 1: 'POSITIVE'}
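
Putting the last two steps together, the probabilities can be mapped back to label names. This is a minimal sketch that hard-codes the softmax values and the `id2label` mapping from the outputs above rather than reusing the live `predictions` tensor; the `results` structure only mimics the shape of the pipeline output and is an assumption made for illustration.

```python
# Probabilities copied from the softmax output above (one row per input sentence).
probabilities = [
    [4.0195383e-02, 9.5980465e-01],
    [9.9945587e-01, 5.4418424e-04],
]
# Mapping copied from model.config.id2label.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

results = []
for row in probabilities:
    # Index of the highest probability in this row.
    best = max(range(len(row)), key=row.__getitem__)
    results.append({"label": id2label[best], "score": row[best]})

for result in results:
    print(result)
```

The first sentence comes out as POSITIVE and the second as NEGATIVE, consistent with the logits shown earlier.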