# Getting started with pipeline

The easiest way to use a pretrained model for a given task is the pipeline() function.

In [1]:
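# Transformers installation (uncomment if the library is not installed yet)
# pip install transformers

from transformers import pipeline

# No model name is passed, so the pipeline falls back to its default checkpoint
classifier = pipeline('sentiment-analysis')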

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)


In [2]:
txt_1 = 'Football is the most exciting and popular game in the world'
classifier(txt_1)

[{'label': 'POSITIVE', 'score': 0.9998573064804077}]

In [3]:
txt_2 = 'The weather is unpredictable around here'
classifier(txt_2)

[{'label': 'POSITIVE', 'score': 0.6950832605361938}]
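The second score is much closer to 0.5 than the first, so it may be worth treating such predictions with caution. The following is a minimal sketch, not part of the Transformers API: the helper `flag_uncertain` and the cut-off `MIN_SCORE` are hypothetical names chosen for illustration, and the input is simply the two pipeline outputs copied from the cells above.

```python
# Pipeline outputs copied from the two cells above.
pipeline_outputs = [
    {"label": "POSITIVE", "score": 0.9998573064804077},
    {"label": "POSITIVE", "score": 0.6950832605361938},
]

MIN_SCORE = 0.9  # hypothetical cut-off; pick a value that suits your use case


def flag_uncertain(predictions, min_score=MIN_SCORE):
    """Return a copy of each prediction with an extra 'uncertain' flag."""
    return [{**p, "uncertain": p["score"] < min_score} for p in predictions]


flagged = flag_uncertain(pipeline_outputs)
for item in flagged:
    print(item["label"], round(item["score"], 4), item["uncertain"])
```

With this cut-off, only the weather sentence is flagged as uncertain.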

# AutoTokenizer, TFAutoModelForSequenceClassification

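What happens under the hood when we call a pipeline? The cells below reproduce its steps by hand: a tokenizer converts the text to token IDs, the model turns those IDs into logits, and a softmax converts the logits into the scores shown above.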

In [4]:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


In [5]:
inputs = tokenizer([txt_1, txt_2])

In [6]:
inputs

{'input_ids': [[101, 2374, 2003, 1996, 2087, 10990, 1998, 2759, 2208, 1999, 1996, 2088, 102], [101, 1996, 4633, 2003, 21446, 2105, 2182, 102]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1]]}

In [7]:
inputs_with_padding = tokenizer([txt_1, txt_2], padding=True, truncation=True, max_length=256, return_tensors="tf")

In [8]:
inputs_with_padding

{'input_ids': <tf.Tensor: shape=(2, 13), dtype=int32, numpy=
array([[  101,  2374,  2003,  1996,  2087, 10990,  1998,  2759,  2208,
         1999,  1996,  2088,   102],
       [  101,  1996,  4633,  2003, 21446,  2105,  2182,   102,     0,
            0,     0,     0,     0]])>, 'attention_mask': <tf.Tensor: shape=(2, 13), dtype=int32, numpy=
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]])>}
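To make the effect of padding concrete, here is a rough pure-Python sketch of what the padded output above contains. It is not how the tokenizer is implemented internally; `pad_batch` is a hypothetical helper, and the pad ID of 0 is an assumption inferred from the zeros in the output above.

```python
# Token IDs copied from the unpadded tokenizer output (one list per sentence).
batch = [
    [101, 2374, 2003, 1996, 2087, 10990, 1998, 2759, 2208, 1999, 1996, 2088, 102],
    [101, 1996, 4633, 2003, 21446, 2105, 2182, 102],
]

PAD_ID = 0  # assumption: inferred from the zeros in the padded output above


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the length of the longest one."""
    longest = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = longest - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding that the model should ignore
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


padded = pad_batch(batch)
print(padded["input_ids"][1])
print(padded["attention_mask"][1])
```

Both sequences end up with length 13, which matches the `shape=(2, 13)` reported for the tensors above.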

In [9]:
outputs = model(inputs_with_padding)

In [10]:
outputs

TFSequenceClassifierOutput(loss=None, logits=<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-4.2494802 ,  4.605343  ],
       [-0.33054897,  0.49343637]], dtype=float32)>, hidden_states=None, attentions=None)

In [11]:
import tensorflow as tf
# Softmax over the last axis turns each row of logits into class probabilities
predictions = tf.nn.softmax(outputs.logits, axis=-1)

In [12]:
predictions

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1.4267150e-04, 9.9985731e-01],
       [3.0491832e-01, 6.9508165e-01]], dtype=float32)>
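The larger probability in each row matches the score the pipeline reported earlier. As a sanity check, the softmax step and the mapping from class index to label can be sketched in plain Python. The logits are copied from the model output above; the `ID2LABEL` mapping is an assumption (index 0 as NEGATIVE, index 1 as POSITIVE), chosen to be consistent with the labels the pipeline printed.

```python
import math

# Logits copied from the model output above (one row per sentence).
logits = [
    [-4.2494802, 4.605343],
    [-0.33054897, 0.49343637],
]

# Assumption: class index 0 is NEGATIVE and index 1 is POSITIVE.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(row):
    """Numerically stable softmax for a single row of logits."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


results = []
for row in logits:
    probs = softmax(row)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    results.append({"label": ID2LABEL[best], "score": probs[best]})

for result in results:
    print(result)
```

Subtracting the row maximum before exponentiating does not change the result; it only guards against overflow for large logits.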