In [1]:
#%%
from transformers import pipeline
import pandas as pd 
import torch

# Set the device to GPU if available
device = "cuda:0" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

Using device: cuda:0


In [2]:
# Original text from the book
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

text_interstellar = '''
Sometimes I just need to see the start. Or end. Or a trailer.
Or the music and theme from Hans Zimmer. Or the whole movie.
Just to feel that thing, I only get from this movie. 
That the earth, space and time are something special, mystical.
I never forget the first time I saw this movie, in an IMAX theatre in 2014. 
I was struck by it. Totally got me. And it stil does, 7 years later. This is the best movie 
ever made for me. Because of the feeling it gives me, no other movie can. So hard to get all 
of this emotion in only one movie. Brilliant.
'''
# The default models are trained on English text; nothing here checks the
# input language or switches to a multilingual model.
text_interstellar_cat = '''
Que ens arribin des de Hollywood cintes tan espectaculars, originals 
i volgudament èpiques com INTERSTELLAR és sens dubte un miracle del que ens hem d’alegrar 
en els temps que corren. Actualment, es poden comptar amb la mà els directors que com Christopher Nolan, 
aposten per històries pròpies, grans formats cinematogràfics i un respecte reverencial a la 
misteriosa atracció cap al cinema que tots sentim. Trobar directors que ho facin dins del sistema 
de Hollywood amb èxit i grans pressupostos és més difícil si cap. Per això Nolan és un dels 
reis del moment; un director únic, privilegiat i com tot autor, admirat i criticat a parts iguals.
'''
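The comment above notes that nothing checks the input language before a model is loaded. A minimal sketch of such a check follows, assuming a toy stop-word count is good enough to tell English from Catalan. The word lists, `guess_language`, `pick_model`, and the multilingual model name are all hypothetical placeholders, not part of the original notebook; a real setup would likely use a dedicated language-identification library.

```python
import re

# Tiny, hand-picked stop-word lists (an assumption; far from exhaustive).
EN_WORDS = {"the", "and", "of", "to", "is", "in", "that", "it", "this", "for"}
CA_WORDS = {"el", "la", "els", "les", "que", "de", "un", "una", "amb",
            "per", "és", "del", "dels"}

MODEL_BY_LANG = {
    # English default, as reported in the classification log of this notebook.
    "en": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    # Placeholder name: substitute a multilingual checkpoint you trust.
    "ca": "your-org/multilingual-sentiment-model",
}

def guess_language(text):
    """Return 'en' or 'ca' by counting stop-word hits; ties fall back to 'en'."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    en_hits = sum(token in EN_WORDS for token in tokens)
    ca_hits = sum(token in CA_WORDS for token in tokens)
    return "en" if en_hits >= ca_hits else "ca"

def pick_model(text):
    """Map the guessed language to a model name."""
    return MODEL_BY_LANG[guess_language(text)]
```

The chosen name could then be passed as `model=pick_model(text_interstellar_cat)` when building a pipeline.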

In [3]:
### Text Classification ###
classifier = pipeline('text-classification', device=device)
outputs = classifier(text)
pd.DataFrame(outputs)


No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


      label     score
0  NEGATIVE  0.901546
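The log warns against relying on the default model in production. One way to address that, sketched here under assumptions, is to keep the model names and revisions in one place and pass them explicitly. The names and short hashes are copied from the warnings printed in this notebook; `PINNED` and `pinned_kwargs` are hypothetical helpers, and it is assumed that the short hashes resolve as `revision` values the same way the library's own defaults do.

```python
# Model name and revision per task, copied from the notebook's log output.
PINNED = {
    "text-classification": (
        "distilbert/distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
    "question-answering": (
        "distilbert/distilbert-base-cased-distilled-squad", "626af31"),
    "summarization": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"),
    "text-generation": ("openai-community/gpt2", "6c0e608"),
}

def pinned_kwargs(task):
    """Return explicit model/revision keyword arguments for a task."""
    model, revision = PINNED[task]
    return {"model": model, "revision": revision}

# Intended use (not executed here, since it downloads the model):
# classifier = pipeline("text-classification",
#                       device=device, **pinned_kwargs("text-classification"))
```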


In [4]:
### Named Entity Recognition ###
ner_tagger = pipeline('ner', aggregation_strategy='simple', device=device)
outputs = ner_tagger(text)
pd.DataFrame(outputs)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


  entity_group     score           word  start  end
0          ORG   0.87901         Amazon      5   11
1         MISC  0.990859  Optimus Prime     36   49
2          LOC  0.999755        Germany     90   97
3         MISC  0.556569           Mega    208  212
4          PER  0.590257         ##tron    212  216
5          ORG  0.669692         Decept    253  259
6         MISC   0.49835        ##icons    259  264
7         MISC  0.775361       Megatron    350  358
8         MISC  0.987854  Optimus Prime    367  380
9          PER  0.812096      Bumblebee    502  511
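Rows 3 to 6 show "Megatron" and "Decepticons" split into WordPiece fragments with different labels. A possible post-processing step, sketched below, folds a `##` fragment into the preceding row when their character offsets touch. `merge_wordpieces` is a hypothetical helper, and keeping the label of the higher-scoring fragment is an arbitrary choice made for illustration, not something the pipeline does.

```python
def merge_wordpieces(entities):
    """Fold '##'-prefixed fragments into the entity that precedes them."""
    merged = []
    for ent in entities:
        prev = merged[-1] if merged else None
        if prev and ent["word"].startswith("##") and ent["start"] == prev["end"]:
            # Keep the label and score of the more confident fragment
            # (an assumption; other tie-breaking rules are equally plausible).
            if ent["score"] > prev["score"]:
                prev["entity_group"] = ent["entity_group"]
                prev["score"] = ent["score"]
            prev["word"] += ent["word"][2:]  # drop the '##' marker
            prev["end"] = ent["end"]
        else:
            merged.append(dict(ent))  # copy so the input is not mutated
    return merged

# Rows 3-6 of the table above.
rows = [
    {"entity_group": "MISC", "score": 0.556569, "word": "Mega", "start": 208, "end": 212},
    {"entity_group": "PER", "score": 0.590257, "word": "##tron", "start": 212, "end": 216},
    {"entity_group": "ORG", "score": 0.669692, "word": "Decept", "start": 253, "end": 259},
    {"entity_group": "MISC", "score": 0.49835, "word": "##icons", "start": 259, "end": 264},
]
merged = merge_wordpieces(rows)
print([(e["word"], e["entity_group"]) for e in merged])
```

In the notebook this could be applied as `pd.DataFrame(merge_wordpieces(outputs))`.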


In [5]:
### Question answering ###
reader = pipeline('question-answering', device=device)
# Alternative question: 'What did Bumblebee order from Amazon?'
question = 'What does the customer want?'
outputs = reader(question=question, context=text)
pd.DataFrame([outputs])

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


      score  start  end                   answer
0  0.631292    335  358  an exchange of Megatron
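The `start` and `end` values are character offsets into the context string, so slicing the context with them recovers the answer. The short check below repeats the complaint text so that it runs on its own; the offsets are the ones reported in the output above.

```python
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

# Offsets taken from the question-answering output.
start, end = 335, 358
answer_span = text[start:end]
print(answer_span)
```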


In [6]:
### Summarization ###
summarizer = pipeline('summarization', device=device)
outputs = summarizer(text, max_length=45, clean_up_tokenization_spaces=True)
print(outputs[0]['summary_text'])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Your min_length=56 must be inferior than your max_length=45.


 Bumblebee ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead.
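The warning in the log indicates that the checkpoint's default `min_length` of 56 exceeds the requested `max_length` of 45. A minimal sketch of one way to keep the two consistent is to derive both bounds together and pass them explicitly. `length_bounds` and its `min_ratio` parameter are hypothetical, and the ratio of 0.5 is an arbitrary choice.

```python
def length_bounds(max_length, min_ratio=0.5):
    """Return min_length/max_length kwargs with min_length < max_length."""
    if max_length < 2:
        raise ValueError("max_length must be at least 2")
    # Clamp so that 1 <= min_length <= max_length - 1.
    min_length = max(1, min(int(max_length * min_ratio), max_length - 1))
    return {"min_length": min_length, "max_length": max_length}

kwargs = length_bounds(45)
print(kwargs)

# Intended use (not executed here, since it needs the loaded pipeline):
# outputs = summarizer(text, clean_up_tokenization_spaces=True, **kwargs)
```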


In [7]:
### Translation ###
translator = pipeline('translation_en_to_de',
                      model='Helsinki-NLP/opus-mt-en-de',
                      device=device)
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
print(outputs[0]['translation_text'])


Sehr geehrter Amazon, letzte Woche habe ich eine Optimus Prime Action Figur aus Ihrem Online-Shop in Deutschland bestellt. Leider, als ich das Paket öffnete, entdeckte ich zu meinem Entsetzen, dass ich stattdessen eine Action Figur von Megatron geschickt worden war! Als lebenslanger Feind der Decepticons, Ich hoffe, Sie können mein Dilemma verstehen. Um das Problem zu lösen, Ich fordere einen Austausch von Megatron für die Optimus Prime Figur habe ich bestellt. Eingeschlossen sind Kopien meiner Aufzeichnungen über diesen Kauf. Ich erwarte, von Ihnen bald zu hören. Aufrichtig, Bumblebee.


In [8]:
### Text Generation ###
generator = pipeline('text-generation', device=device)
response = 'Dear Bumblebee, we are sorry for the inconvenience. We will proceed to '
prompt = text + "\n\nCustomer service response:\n" + response
outputs = generator(prompt, max_length=200)
print(outputs[0]['generated_text'])


No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.

Customer service response:
Dear Bumblebee, we are sorry for the inconvenience. We will proceed to irm the merchandise that we have ordered and collect that is eligible for exchange of our merchandise for your order prior to November 1, 2015. We will keep you informed of any such actions if and when appropriate.
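Two details of the generation call are worth a second look. In `generate`, `max_length` counts the prompt tokens together with the new ones, so a long prompt leaves little room for the reply. The prompt also ends with a space, which may be related to the odd continuation "irm", although that is only a guess. The sketch below strips the trailing whitespace and budgets new tokens only. `build_prompt` and `GENERATION_KWARGS` are hypothetical names, and the budget of 60 tokens is arbitrary.

```python
def build_prompt(complaint, opening):
    """Join the complaint and the reply opening, without trailing whitespace."""
    return f"{complaint}\n\nCustomer service response:\n{opening.rstrip()}"

# Limit only the newly generated tokens and return just the continuation.
GENERATION_KWARGS = {"max_new_tokens": 60, "return_full_text": False}

prompt = build_prompt(
    "Dear Amazon, I demand an exchange. Sincerely, Bumblebee.",
    "Dear Bumblebee, we are sorry for the inconvenience. We will proceed to ",
)
print(repr(prompt[-20:]))

# Intended use (not executed here, since it needs the loaded pipeline):
# outputs = generator(prompt, **GENERATION_KWARGS)
```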
