# A Tour of Transformer Applications

### Text Classification

In [5]:
from transformers import pipeline

classifier = pipeline('sentiment-analysis')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
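
The warning above recommends naming the model and revision explicitly. The following is a minimal sketch of one way to do that, using the checkpoint names and short revision hashes exactly as they are printed in this notebook's log messages. The `PINNED` table and both helper functions are hypothetical names introduced for illustration, and it is an assumption that the short hashes from the logs are accepted as `revision` values.

```python
# Checkpoints and short revision hashes copied from the pipeline log messages
# in this notebook. Assumption: these short hashes are valid `revision` values.
PINNED = {
    "sentiment-analysis": ("distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
    "question-answering": ("distilbert-base-cased-distilled-squad", "626af31"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
    "text-generation": ("gpt2", "6c0e608"),
}


def pinned_kwargs(task):
    """Return explicit `model` and `revision` keyword arguments for a task."""
    model, revision = PINNED[task]
    return {"model": model, "revision": revision}


def build_pinned_pipeline(task, **extra):
    """Create a pipeline with the model and revision spelled out (hypothetical helper)."""
    # Imported lazily so the lookup table can be inspected without transformers installed.
    from transformers import pipeline

    return pipeline(task, **pinned_kwargs(task), **extra)
```

Under these assumptions, `build_pinned_pipeline('sentiment-analysis')` would stand in for `pipeline('sentiment-analysis')`, and extra options such as `aggregation_strategy='simple'` can be passed through as keyword arguments.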


In [6]:
text = "I have gone through plenty of headsets over the years spending over a hundred dollars or more on headphones., and my no Bulls**t take on the Bengoo G9000 after almost a year of use is that I'll waste no more money on any (off the shelf) budget brands sold at Walmart ever again as the G9000's offer a greater above average price and quality ratio compared to other overpriced budget brands on the market. I'm not going to tell you that the G9000's have amazing sound quality compared to the hyper cloud X' phones I once had, nor is the microphone anything to rave about, but I will say that price and quality offered will bring me back every time before I'll go back into a local Walmart and pay the prices of other budget brands that have often not even lasted beyond the return policy."

In [8]:
import pandas as pd

outputs = classifier(text)
df = pd.DataFrame(outputs)
df

| | label | score |
|---|---|---|
| 0 | NEGATIVE | 0.957303 |


### Named Entity Recognition

In [10]:
ner_tagger = pipeline('ner', aggregation_strategy='simple')
outputs = ner_tagger(text)
pd.DataFrame(outputs)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


| | entity_group | score | word | start | end |
|---|---|---|---|---|---|
| 0 | MISC | 0.933426 | Bengoo G9000 | 141 | 153 |
| 1 | ORG | 0.783949 | Walmart | 259 | 266 |
| 2 | MISC | 0.916458 | G9000 | 285 | 290 |
| 3 | MISC | 0.904751 | G9000 | 440 | 445 |
| 4 | ORG | 0.764736 | Walmart | 680 | 687 |
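
The output lists repeated mentions separately (`G9000` and `Walmart` each appear twice). As a rough sketch of how such rows could be condensed, the snippet below groups the distinct entity strings by `entity_group` using only the standard library. The rows are copied from the output above; `group_entities` and its `min_score` parameter are hypothetical and not part of `transformers`.

```python
from collections import defaultdict

# Rows copied from the NER output above.
entities = [
    {"entity_group": "MISC", "score": 0.933426, "word": "Bengoo G9000", "start": 141, "end": 153},
    {"entity_group": "ORG", "score": 0.783949, "word": "Walmart", "start": 259, "end": 266},
    {"entity_group": "MISC", "score": 0.916458, "word": "G9000", "start": 285, "end": 290},
    {"entity_group": "MISC", "score": 0.904751, "word": "G9000", "start": 440, "end": 445},
    {"entity_group": "ORG", "score": 0.764736, "word": "Walmart", "start": 680, "end": 687},
]


def group_entities(rows, min_score=0.0):
    """Collect the distinct entity strings per group, dropping low-confidence rows."""
    grouped = defaultdict(set)
    for row in rows:
        if row["score"] >= min_score:
            grouped[row["entity_group"]].add(row["word"])
    return {group: sorted(words) for group, words in grouped.items()}


by_group = group_entities(entities)
confident_only = group_entities(entities, min_score=0.8)
```

With the rows shown, raising `min_score` to 0.8 drops both `Walmart` mentions, since their scores are below that value.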


### Question Answering

In [11]:
reader = pipeline('question-answering')
question = "What is the best headset?"
outputs = reader(question=question, context=text)
pd.DataFrame([outputs])

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


| | score | start | end | answer |
|---|---|---|---|---|
| 0 | 0.671055 | 141 | 153 | Bengoo G9000 |
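
The `start` and `end` values in this output appear to be character offsets into the context string: slicing the review at `141:153` yields the answer text. The sketch below checks this on the opening of the review, which is long enough to contain the span. `span_matches` and `answer_with_window` are hypothetical helpers, and the `window` size is an arbitrary choice.

```python
# Opening of the review from this notebook, up to just past the answer span.
context = (
    "I have gone through plenty of headsets over the years spending over a "
    "hundred dollars or more on headphones., and my no Bulls**t take on the "
    "Bengoo G9000 after almost a year of use"
)

# Result row copied from the question-answering output above.
result = {"score": 0.671055, "start": 141, "end": 153, "answer": "Bengoo G9000"}


def span_matches(context, result):
    """Check that the reported offsets really delimit the answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


def answer_with_window(context, result, window=20):
    """Return the answer plus up to `window` characters on each side."""
    left = max(0, result["start"] - window)
    right = min(len(context), result["end"] + window)
    return context[left:right]


print(span_matches(context, result))
print(answer_with_window(context, result))
```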


### Summarization

In [13]:
summarizer = pipeline('summarization')
outputs = summarizer(text, max_length=56, clean_up_tokenization_spaces=True)
outputs[0]['summary_text']

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


" Bengoo G9000's offer a greater above average price and quality ratio compared to other overpriced budget brands on the market. I have gone through plenty of headsets over the years spending over a hundred dollars or more on headphones. I'll waste no more money on"

### Translation

In [16]:
translator = pipeline('translation_en_to_de', model='Helsinki-NLP/opus-mt-en-de')
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
outputs[0]['translation_text']

ValueError: This tokenizer cannot be instantiated. Please make sure you have `sentencepiece` installed in order to use this tokenizer.
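
The `ValueError` above reports that the tokenizer for this checkpoint needs the `sentencepiece` package, which was not installed when the cell ran. A small sketch of a pre-flight check that could run before building the translator, using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Packages the translation cell is known to need, according to the error message.
required = ("sentencepiece",)
missing = [name for name in required if not is_installed(name)]

if missing:
    print("Install before creating the translator: pip install " + " ".join(missing))
```

After installing the package, the notebook kernel may need a restart before the translation cell is run again.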

### Text Generation

In [17]:
generator = pipeline('text-generation')
response = "Dear buyer,"
prompt = text + "\n\nCustomer service response:\n " + response
outputs = generator(prompt, max_length=200)
outputs[0]['generated_text']

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


"I have gone through plenty of headsets over the years spending over a hundred dollars or more on headphones., and my no Bulls**t take on the Bengoo G9000 after almost a year of use is that I'll waste no more money on any (off the shelf) budget brands sold at Walmart ever again as the G9000's offer a greater above average price and quality ratio compared to other overpriced budget brands on the market. I'm not going to tell you that the G9000's have amazing sound quality compared to the hyper cloud X' phones I once had, nor is the microphone anything to rave about, but I will say that price and quality offered will bring me back every time before I'll go back into a local Walmart and pay the prices of other budget brands that have often not even lasted beyond the return policy.\n\nCustomer service response:\n Dear buyer, Thank you for your inquiry. We have received no complaints from the customer service that you requested due to the"