<a href="https://colab.research.google.com/github/cbittel/nlp/blob/main/Transformers_HuggingFace_pie_and_ai_meetup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

Install the `transformers` library with the `sentencepiece` extra, which the translation models' tokenizers need, by running the cell below:

In [None]:
%%capture
!pip install transformers[sentencepiece]

Next, we import the libraries used in this notebook:

In [None]:
import pandas as pd
from pprint import pprint
from transformers import pipeline, set_seed

We'll use the following text as a running example:

In [None]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

## Text classification

In [None]:
classifier = pipeline("sentiment-analysis")
outputs = classifier(text)
pd.DataFrame.from_records(outputs)   

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)


|   | label | score |
|---|---|---|
| 0 | NEGATIVE | 0.901546 |
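
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. Here is a minimal sketch of reading that structure without pandas, using the output recorded above as a hard-coded stand-in so no model is loaded. The `confident_label` helper and its `0.8` cutoff are hypothetical illustrations, not part of `transformers`:

```python
# Stand-in for `classifier(text)`: one dict per input text.
outputs = [{"label": "NEGATIVE", "score": 0.901546}]


def confident_label(result, threshold=0.8):
    """Return the label if its score clears the threshold, else None.

    Hypothetical helper; the 0.8 cutoff is an arbitrary choice.
    """
    return result["label"] if result["score"] >= threshold else None


labels = [confident_label(r) for r in outputs]
print(labels)
```

A cutoff like this may be useful when low-confidence predictions should be routed to a human instead of being acted on automatically.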


## Question answering

In [None]:
reader = pipeline("question-answering")
question = "What does the customer want?"
# A single question/context pair returns a single dict, so wrap it in a list
outputs = reader(question=question, context=text)
pd.DataFrame.from_records([outputs])

No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)


|   | score | start | end | answer |
|---|---|---|---|---|
| 0 | 0.631292 | 335 | 358 | an exchange of Megatron |
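
The `start` and `end` values are character offsets into the `context` string, so slicing the context with them recovers the answer span. The sketch below checks this against the output recorded above; the `outputs` dictionary is a hard-coded stand-in for the pipeline result, so no model is loaded:

```python
# The running example text from the Setup section (a single line: each
# trailing backslash suppresses the newline inside the triple-quoted string).
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

# Stand-in for `reader(question=question, context=text)`.
outputs = {"score": 0.631292, "start": 335, "end": 358,
           "answer": "an exchange of Megatron"}

# Slice the context with the character offsets.
span = text[outputs["start"]:outputs["end"]]
print(span)  # an exchange of Megatron
```

Having the offsets makes it possible to highlight the answer in the original text rather than only displaying the extracted string.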




## Translation to German

In [None]:
translator_en_to_de = pipeline("translation_en_to_de", 
                      model="Helsinki-NLP/opus-mt-en-de")
outputs = translator_en_to_de(text, clean_up_tokenization_spaces=True, min_length=100)
pprint(outputs[0]['translation_text'])
pd.DataFrame.from_records([outputs])

('Sehr geehrter Amazon, letzte Woche habe ich eine Optimus Prime Action Figur '
 'aus Ihrem Online-Shop in Deutschland bestellt. Leider, als ich das Paket '
 'öffnete, entdeckte ich zu meinem Entsetzen, dass ich stattdessen eine Action '
 'Figur von Megatron geschickt worden war! Als lebenslanger Feind der '
 'Decepticons, Ich hoffe, Sie können mein Dilemma verstehen. Um das Problem zu '
 'lösen, Ich fordere einen Austausch von Megatron für die Optimus Prime Figur '
 'habe ich bestellt. Anbei sind Kopien meiner Aufzeichnungen über diesen Kauf. '
 'Ich erwarte, bald von Ihnen zu hören. Aufrichtig, Bumblebee.')


|   | 0 |
|---|---|
| 0 | {'translation_text': 'Sehr geehrter Amazon, le... |



## Translation to Spanish

In [None]:
translator_en_to_es = pipeline("translation_en_to_es", 
                      model="Helsinki-NLP/opus-mt-en-es")
outputs = translator_en_to_es(text, clean_up_tokenization_spaces=True, min_length=100)
pprint(outputs[0]['translation_text'])
pd.DataFrame.from_records([outputs])

('Querida Amazona, la semana pasada ordené una figura de acción Optimus Prime '
 'de su tienda en línea en Alemania. Desafortunadamente, cuando abrí el '
 'paquete, descubrí para mi horror que me habían enviado una figura de acción '
 'de Megatron en su lugar! Como un enemigo de toda la vida de los Decepticons, '
 'espero que pueda entender mi dilema. Para resolver el problema, exijo un '
 'intercambio de Megatron por la figura Optimus Prime que ordené. Adjunto son '
 'copias de mis registros relativos a esta compra. Espero escuchar de usted '
 'pronto. Sinceramente, Bumblebee.')


|   | 0 |
|---|---|
| 0 | {'translation_text': 'Querida Amazona, la sema... |


## Text generation

In [None]:
# Fix the random seed for reproducibility
set_seed(42)

generator = pipeline("text-generation")
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up."
prompt = text + "\n\nCustomer service response:\n" + response
outputs = generator(prompt, max_length=220)
pprint(outputs[0]['generated_text'])

No model was supplied, defaulted to gpt2 (https://huggingface.co/gpt2)


Using pad_token, but it is not set yet.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


('Dear Amazon, last week I ordered an Optimus Prime action figure from your '
 'online store in Germany. Unfortunately, when I opened the package, I '
 'discovered to my horror that I had been sent an action figure of Megatron '
 'instead! As a lifelong enemy of the Decepticons, I hope you can understand '
 'my dilemma. To resolve the issue, I demand an exchange of Megatron for the '
 'Optimus Prime figure I ordered. Enclosed are copies of my records concerning '
 'this purchase. I expect to hear from you soon. Sincerely, Bumblebee.\n'
 '\n'
 'Customer service response:\n'
 'Dear Bumblebee, I am sorry to hear that your order was mixed up. The order '
 'was completely mislabeled, which is very common in our online store, but I '
 'can appreciate it because it was my understanding from this site and our '
 'customer service of the previous day that your order was not made correct in '
 'our mind and that we are in a process of resolving this matter. We can '
 'assure you that your order 