# A Tour of Transformer Applications

In [3]:
%pip install "transformers==4.45.0" sentencepiece torch tqdm ipywidgets pandas "numpy<2" -q --extra-index-url https://download.pytorch.org/whl/cpu


Note: you may need to restart the kernel to use updated packages.


In [4]:
from tqdm.notebook import tqdm

## 1. Text Classification

In [5]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure
from your online store in Germany. Unfortunately, when I opened the package,
I discovered to my horror that I had been sent an action figure of Megatron
instead! As a lifelong enemy of the Decepticons, I hope you can understand my
dilemma. To resolve the issue, I demand an exchange of Megatron for the
Optimus Prime figure I ordered. Enclosed are copies of my records concerning
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

In [6]:
from transformers import pipeline

classifier = pipeline("text-classification")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [7]:
import pandas as pd
output = classifier(text)
df = pd.DataFrame(output)
display(df)

| | label | score |
|---|---|---|
| 0 | NEGATIVE | 0.901547 |
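Each default pipeline in this notebook logs a warning that no model name or revision was supplied. The following is a minimal sketch of pinning both explicitly, using the model names and revision hashes printed in those warnings. The `PINNED` table and the `load_pinned` helper are hypothetical names introduced here for illustration; the import is done lazily only so the table can be inspected without loading the library.

```python
# Model names and revisions copied from the "No model was supplied" warnings
# printed by the cells in this notebook.
PINNED = {
    "text-classification": (
        "distilbert/distilbert-base-uncased-finetuned-sst-2-english", "714eb0f"),
    "ner": (
        "dbmdz/bert-large-cased-finetuned-conll03-english", "4c53496"),
    "question-answering": (
        "distilbert/distilbert-base-cased-distilled-squad", "564e9b5"),
    "summarization": (
        "sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
    "text-generation": (
        "openai-community/gpt2", "607a30d"),
}


def load_pinned(task, **kwargs):
    """Build a pipeline for `task` with an explicit model and revision.

    Hypothetical helper: extra keyword arguments are passed to `pipeline`.
    """
    from transformers import pipeline  # lazy import; needs transformers installed

    model, revision = PINNED[task]
    return pipeline(task, model=model, revision=revision, **kwargs)
```

For example, `classifier = load_pinned("text-classification")` or `ner_tagger = load_pinned("ner", aggregation_strategy="simple")` should load the same checkpoints as the default calls while avoiding the warning.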


## 2. Named Entity Recognition

In [8]:
ner_tagger = pipeline("ner", aggregation_strategy="simple")
outputs = ner_tagger(text)
df = pd.DataFrame(outputs)
display(df)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


| | entity_group | score | word | start | end |
|---|---|---|---|---|---|
| 0 | ORG | 0.879096 | Amazon | 5 | 11 |
| 1 | MISC | 0.99086 | Optimus Prime | 36 | 49 |
| 2 | LOC | 0.999755 | Germany | 90 | 97 |
| 3 | MISC | 0.556944 | Mega | 208 | 212 |
| 4 | PER | 0.590036 | ##tron | 212 | 216 |
| 5 | ORG | 0.669727 | Decept | 253 | 259 |
| 6 | MISC | 0.498346 | ##icons | 259 | 264 |
| 7 | MISC | 0.775494 | Megatron | 350 | 358 |
| 8 | MISC | 0.987853 | Optimus Prime | 367 | 380 |
| 9 | PER | 0.812165 | Bumblebee | 502 | 511 |
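In the output above, "Megatron" and "Decepticons" are each split into two fragments with different labels, and the fragments share a boundary (the `end` of one equals the `start` of the next). A minimal sketch of one way to stitch such fragments back together follows. The `merge_adjacent` helper is hypothetical, and keeping the label of the higher-scoring fragment is an illustrative heuristic, not something the pipeline does itself.

```python
text = """Dear Amazon, last week I ordered an Optimus Prime action figure
from your online store in Germany. Unfortunately, when I opened the package,
I discovered to my horror that I had been sent an action figure of Megatron
instead! As a lifelong enemy of the Decepticons, I hope you can understand my
dilemma. To resolve the issue, I demand an exchange of Megatron for the
Optimus Prime figure I ordered. Enclosed are copies of my records concerning
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

# (entity_group, score, start, end) copied from the NER output above.
rows = [
    ("ORG", 0.879096, 5, 11),
    ("MISC", 0.99086, 36, 49),
    ("LOC", 0.999755, 90, 97),
    ("MISC", 0.556944, 208, 212),
    ("PER", 0.590036, 212, 216),
    ("ORG", 0.669727, 253, 259),
    ("MISC", 0.498346, 259, 264),
    ("MISC", 0.775494, 350, 358),
    ("MISC", 0.987853, 367, 380),
    ("PER", 0.812165, 502, 511),
]


def merge_adjacent(rows, context):
    """Merge spans that touch each other and re-read the word from `context`."""
    merged = []
    for label, score, start, end in rows:
        if merged and merged[-1]["end"] == start:
            prev = merged[-1]
            # Assumption: the higher-scoring fragment decides the label.
            if score > prev["score"]:
                prev["entity_group"], prev["score"] = label, score
            prev["end"] = end
        else:
            merged.append(
                {"entity_group": label, "score": score, "start": start, "end": end}
            )
    for ent in merged:
        # Slice the original text so "##" word-piece markers disappear.
        ent["word"] = context[ent["start"]:ent["end"]]
    return merged


for ent in merge_adjacent(rows, text):
    print(ent["entity_group"], ent["word"])
```

Slicing the original text by character offsets sidesteps the `##` word-piece prefixes, so the merged entries read "Megatron" and "Decepticons" rather than concatenated tokens.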


## 3. Question Answering

In [9]:
reader = pipeline("question-answering")
question = "What does the customer want?"
output = reader(question=question, context=text)
df = pd.DataFrame([output])
display(df)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


| | score | start | end | answer |
|---|---|---|---|---|
| 0 | 0.631292 | 335 | 358 | an exchange of Megatron |
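The `start` and `end` values in this output are character offsets into the context, so the answer is an extracted span of the original text rather than newly generated wording. A small sketch that checks this and shows the span in its surroundings follows; the `show_in_context` helper and its `window` parameter are hypothetical names used only for illustration.

```python
text = """Dear Amazon, last week I ordered an Optimus Prime action figure
from your online store in Germany. Unfortunately, when I opened the package,
I discovered to my horror that I had been sent an action figure of Megatron
instead! As a lifelong enemy of the Decepticons, I hope you can understand my
dilemma. To resolve the issue, I demand an exchange of Megatron for the
Optimus Prime figure I ordered. Enclosed are copies of my records concerning
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""


def show_in_context(context, start, end, window=20):
    """Return the span in brackets with up to `window` characters on each side."""
    left = max(0, start - window)
    right = min(len(context), end + window)
    return context[left:start] + "[" + context[start:end] + "]" + context[end:right]


# Offsets taken from the question-answering output above.
answer = text[335:358]
print(answer)  # an exchange of Megatron
print(show_in_context(text, 335, 358))
```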


## 4. Summarization

In [10]:
summarizer = pipeline("summarization")
output = summarizer(text, max_length=45, clean_up_tokenization_spaces=True)
df = pd.DataFrame(output)
display(df)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Your min_length=56 must be inferior than your max_length=45.


| | summary_text |
|---|---|
| 0 | Bumblebee ordered an Optimus Prime action fig... |
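The warning `Your min_length=56 must be inferior than your max_length=45` appears because the call sets `max_length` below the `min_length` of 56 reported in the log. One way to avoid it is to pass both bounds explicitly. The sketch below assumes a simple rule (cap `min_length` at half of `max_length`); both the `summary_lengths` helper and that rule are illustrative choices, not part of the library.

```python
def summary_lengths(max_length, default_min=56):
    """Return length bounds where min_length stays below max_length.

    `default_min` mirrors the min_length value reported in the warning above.
    """
    # Assumption: never ask for a minimum longer than half the maximum.
    min_length = min(default_min, max(1, max_length // 2))
    return {"min_length": min_length, "max_length": max_length}


print(summary_lengths(45))
```

The result can be unpacked into the call, for example `summarizer(text, clean_up_tokenization_spaces=True, **summary_lengths(45))`.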


## 5. Translation

Translate text from English to German.

In [11]:
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
output = translator(text)
df = pd.DataFrame(output)
display(df)




| | translation_text |
|---|---|
| 0 | Sehr geehrter Amazon, letzte Woche habe ich ei... |
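The DataFrame display shortens long strings, which is why the summary and the translation end in `...`. The pipelines return a list of dictionaries, so the full text can be read directly from the first record. A minimal sketch using only the standard library follows; `full_text` is a hypothetical helper and the example record is a placeholder, not real model output.

```python
import textwrap


def full_text(records, key, width=80):
    """Return the complete `key` field of the first record, wrapped to `width`."""
    return textwrap.fill(records[0][key], width=width)


# Placeholder record in the same shape the pipelines return (a list of dicts).
example = [{"translation_text": "word " * 30}]
print(full_text(example, "translation_text", width=40))
```

With the real results this would be called as `print(full_text(output, "translation_text"))`, or with `"summary_text"` for the summarization output.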


## 6. Text Generation

Use a text-generation model to autocomplete a customer service response.

In [12]:
generator = pipeline("text-generation")
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up."
prompt = text + "\n\nCustomer Service response:\n" + response
output = generator(prompt, max_length=200)
print(output[0]['generated_text'])

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.



Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


Dear Amazon, last week I ordered an Optimus Prime action figure
from your online store in Germany. Unfortunately, when I opened the package,
I discovered to my horror that I had been sent an action figure of Megatron
instead! As a lifelong enemy of the Decepticons, I hope you can understand my
dilemma. To resolve the issue, I demand an exchange of Megatron for the
Optimus Prime figure I ordered. Enclosed are copies of my records concerning
this purchase. I expect to hear from you soon. Sincerely, Bumblebee.

Customer Service response:
Dear Bumblebee, I am sorry to hear that your order was mixed up. Your order will include one part which is not the included retail packaging. I cannot

exclude if not exactly what you are asking for for your figure. Unfortunately, the two items mentioned in your order are not directly connected or in contact with each other as

this is what our
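The generated text repeats the whole prompt before the continuation, and `max_length=200` counts the prompt tokens as well, which leaves little room for new text. A minimal sketch for separating the continuation from the prompt follows; `continuation` is a hypothetical helper. The text-generation pipeline also accepts `return_full_text=False` and `max_new_tokens`, which may be a simpler route to the same result.

```python
def continuation(generated_text, prompt):
    """Return only the text the model appended after `prompt`."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # Fall back to the full string if the prompt was altered (e.g. truncated).
    return generated_text


# Tiny stand-in strings; with the notebook's variables this would be
# continuation(output[0]["generated_text"], prompt).
print(continuation("Dear Bumblebee, thank you.", "Dear Bumblebee,"))
```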
