# Transformers, what can they do?

https://huggingface.co/learn/nlp-course/chapter1/3?fw=pt#transformers-what-can-they-do

The most basic object in the Hugging Face Transformers library is the pipeline() function. It connects a model with its necessary preprocessing and postprocessing steps, so we can pass in raw text directly and get back a readable answer.
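To make the postprocessing step a bit more concrete, here is a minimal sketch in plain Python of how raw model scores (logits) could be turned into the label/score dictionaries that a classification pipeline returns. The logits, the `ID2LABEL` mapping, and the `postprocess` helper are made up for illustration; this is not the library's actual implementation.

```python
import math

# Hypothetical label mapping for a two-class sentiment model (assumption).
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits):
    """Pick the most probable class and report it with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Made-up logits, not produced by a real model.
result = postprocess([-1.5, 1.7])
print(result)
```

In a real pipeline the logits come from the model's forward pass on the tokenized text, and the label mapping is read from the checkpoint's configuration.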

Some of the currently available pipelines are:

- feature-extraction (get the vector representation of a text)
- fill-mask
- ner (named entity recognition)
- question-answering
- sentiment-analysis
- summarization
- text-generation
- translation
- zero-shot-classification

Some examples of using a pipeline:

#### Sentiment analysis

In [5]:
# simple usage with one input sequence
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9598049521446228}]

In [32]:
# simple usage with one input sequence given by the user
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
text = input("Type the desired sentence for the sentiment analysis:")
classifier(text)


Type the desired sentence for the sentiment analysis: The fish in my tank are swimming from left to right all the time.


[{'label': 'NEGATIVE', 'score': 0.9445313811302185}]

In [49]:
# A bit more elaborate: specify the model/checkpoint, pass a list of sequences as input, and collect the output in a pandas DataFrame.
# Note: padding=True pads sequences that are batched together to the same length, because the model can only work with tensors of the same size.
# Note: truncation=True is not strictly necessary as long as the sequences are shorter than the model's maximum sequence length.
from transformers import pipeline
import pandas as pd

classifier = pipeline(model="distilbert-base-uncased-finetuned-sst-2-english", task="sentiment-analysis", padding=True, truncation=True)
text = ["I love pizza!", "My mom is the best!", "Getting old is nothing for beginners!"]
outputs = classifier(text)
result = pd.DataFrame(outputs)  # create a DataFrame from the model's output
result.insert(0, 'sequence', text)  # add the input texts as the first column of the DataFrame
result

|   | sequence | label | score |
|---|----------|-------|-------|
| 0 | I love pizza! | POSITIVE | 0.999813 |
| 1 | My mom is the best! | POSITIVE | 0.999877 |
| 2 | Getting old is nothing for beginners! | NEGATIVE | 0.999535 |
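
The padding note in the cell above can be illustrated with a small sketch in plain Python. It pads lists of token ids to a common length and builds an attention mask that marks the real tokens. The token ids are invented, `pad_batch` is a hypothetical helper, and `pad_id=0` is an assumption; a real tokenizer supplies its own padding id and does this work for you.

```python
def pad_batch(sequences, pad_id=0):
    """Pad token-id lists to the length of the longest one.

    Returns the padded ids plus an attention mask (1 = real token, 0 = padding).
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Invented token ids of different lengths (not from a real tokenizer).
batch = [[11, 25, 37, 12], [11, 40, 52, 63, 74, 85, 12]]
input_ids, attention_mask = pad_batch(batch)

for ids, mask in zip(input_ids, attention_mask):
    print(ids, mask)
```

After padding, every row has the same length, so the batch can be stacked into a single rectangular tensor; the attention mask lets the model ignore the padded positions.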
