![Practicum AI Logo image](https://github.com/PracticumAI/practicumai.github.io/blob/main/images/logo/PracticumAI_logo_250x50.png?raw=true)
***
# *Practicum AI:* NLP - Quick Tour

This exercise is adapted from Tunstall et al. (2022), <i>Natural Language Processing with Transformers</i>, from <a href="https://www.oreilly.com/library/view/natural-language-processing/9781098103231/">O'Reilly Media</a> (page 10).

(10 Minutes)

<div style="padding: 10px;margin-bottom: 20px;border: thin solid #30335D;border-left-width: 10px;background-color: #fff"><strong>Note:</strong> This exercise requires the PyTorch-1.8.1 kernel.</div>


In [3]:
# If you are running on Google Colab or outside of HiPerGator
# uncomment the following line(s) to install the needed packages
# HiPerGator users should not need to do this!

# !pip install transformers
# !pip install datasets

In [2]:
# Import everything from the utils library and then execute setup_chapter() - defined in the utils.py file.
from utils import *
setup_chapter()

Using transformers v4.16.2
Using datasets v1.18.3


## A Tour of Transformer Applications

In [14]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

### Text Classification

```python
from transformers import pipeline

# With no model specified, pipeline() downloads a default model for the task
classifier = pipeline("text-classification")
```

In [15]:
# Code it!

```python
import pandas as pd

# The pipeline returns a list of dicts, which pandas renders as a table
outputs = classifier(text)
pd.DataFrame(outputs)
```

In [16]:
# Code it!

| | label | score |
|---|---|---|
| 0 | NEGATIVE | 0.901546 |
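
The `pd.DataFrame(outputs)` call works because the classifier returns a list with one dictionary per input text. As a minimal sketch, assuming the output shown above (the values are copied from the table rather than produced by running the model, and `top_prediction` is a hypothetical helper, not part of `transformers`), the result can also be inspected with plain Python:

```python
# Output copied from the table above; a real run returns the same structure:
# a list holding one {"label", "score"} dictionary per input text.
outputs = [{"label": "NEGATIVE", "score": 0.901546}]


def top_prediction(results, threshold=0.5):
    """Return (label, score) for the best result, or None if it is below threshold.

    Hypothetical helper for illustration only.
    """
    best = max(results, key=lambda r: r["score"])
    if best["score"] < threshold:
        return None
    return best["label"], best["score"]


print(top_prediction(outputs))                  # confident enough at the default threshold
print(top_prediction(outputs, threshold=0.95))  # rejected by a stricter threshold
```

A threshold like this is one simple way to route low-confidence predictions to a human reviewer; the value 0.5 is an arbitrary choice for the sketch.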


### Named Entity Recognition

```python
# aggregation_strategy="simple" groups tokens that share an entity label
ner_tagger = pipeline("ner", aggregation_strategy="simple")
outputs = ner_tagger(text)
pd.DataFrame(outputs)
```

In [17]:
# Code it!

| | entity_group | score | word | start | end |
|---|---|---|---|---|---|
| 0 | ORG | 0.87901 | Amazon | 5 | 11 |
| 1 | MISC | 0.990859 | Optimus Prime | 36 | 49 |
| 2 | LOC | 0.999755 | Germany | 90 | 97 |
| 3 | MISC | 0.556568 | Mega | 208 | 212 |
| 4 | PER | 0.590257 | ##tron | 212 | 216 |
| 5 | ORG | 0.669692 | Decept | 253 | 259 |
| 6 | MISC | 0.49835 | ##icons | 259 | 264 |
| 7 | MISC | 0.775361 | Megatron | 350 | 358 |
| 8 | MISC | 0.987854 | Optimus Prime | 367 | 380 |
| 9 | PER | 0.812096 | Bumblebee | 502 | 511 |
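
In the output above, `Megatron` and `Decepticons` are each split across two rows; the `##` prefix marks a word piece that continues the previous token. A minimal sketch of stitching such pieces back together, assuming the four rows shown above (the `merge_word_pieces` helper is hypothetical and not part of `transformers`; keeping the label of the higher-scoring piece is only one possible choice):

```python
# Four rows copied from the NER output above.
entities = [
    {"entity_group": "MISC", "score": 0.556568, "word": "Mega", "start": 208, "end": 212},
    {"entity_group": "PER", "score": 0.590257, "word": "##tron", "start": 212, "end": 216},
    {"entity_group": "ORG", "score": 0.669692, "word": "Decept", "start": 253, "end": 259},
    {"entity_group": "MISC", "score": 0.49835, "word": "##icons", "start": 259, "end": 264},
]


def merge_word_pieces(rows):
    """Join '##' continuation pieces onto the entity they directly follow.

    Hypothetical helper for illustration only.
    """
    merged = []
    for row in rows:
        prev = merged[-1] if merged else None
        continues = (
            prev is not None
            and row["word"].startswith("##")
            and row["start"] == prev["end"]  # pieces must be adjacent in the text
        )
        if continues:
            # Assumption: keep the label and score of the more confident piece.
            if row["score"] > prev["score"]:
                prev["entity_group"] = row["entity_group"]
                prev["score"] = row["score"]
            prev["word"] += row["word"][2:]  # drop the '##' marker
            prev["end"] = row["end"]
        else:
            merged.append(dict(row))  # copy so the input rows stay unchanged
    return merged


for entity in merge_word_pieces(entities):
    print(entity["word"], entity["entity_group"], entity["start"], entity["end"])
```

The conflicting labels on the two halves of each word (for example `MISC` and `PER` for `Megatron`) show that the model was unsure about these entities, so any merged label should be treated with caution.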


### Question Answering

```python
reader = pipeline("question-answering")
question = "What does the customer want?"
outputs = reader(question=question, context=text)
# A single question returns one dict, so wrap it in a list for the DataFrame
pd.DataFrame([outputs])
```

In [None]:
# Code it!  

| | score | start | end | answer |
|---|---|---|---|---|
| 0 | 0.631291 | 335 | 358 | an exchange of Megatron |
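
The `start` and `end` values are character offsets into the context string, so slicing `text` with them recovers the answer. A minimal sketch, assuming the output row shown above (the values are copied from that row rather than produced by running the model):

```python
# The same customer text as above, written with implicit string concatenation.
text = (
    "Dear Amazon, last week I ordered an Optimus Prime action figure "
    "from your online store in Germany. Unfortunately, when I opened the package, "
    "I discovered to my horror that I had been sent an action figure of Megatron "
    "instead! As a lifelong enemy of the Decepticons, I hope you can understand my "
    "dilemma. To resolve the issue, I demand an exchange of Megatron for the "
    "Optimus Prime figure I ordered. Enclosed are copies of my records concerning "
    "this purchase. I expect to hear from you soon. Sincerely, Bumblebee."
)

# Row copied from the question-answering output above.
outputs = {"score": 0.631291, "start": 335, "end": 358, "answer": "an exchange of Megatron"}

# Slice the context with the reported offsets (end is exclusive).
span = text[outputs["start"]:outputs["end"]]
print(span)
print(span == outputs["answer"])
```

This kind of model is extractive: it selects a span of the context instead of writing new text, which is why the answer can always be located by its offsets.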


### Summarization

```python
summarizer = pipeline("summarization")
# max_length caps the length of the generated summary, in tokens
outputs = summarizer(text, max_length=45, clean_up_tokenization_spaces=True)
print(outputs[0]['summary_text'])
```

In [1]:
# Code it!

### Translation

```python
translator = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
# min_length sets a lower bound on the number of generated tokens
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
print(outputs[0]['translation_text'])
```

In [None]:
# Code it!

Sehr geehrter Amazon, letzte Woche habe ich eine Optimus Prime Action Figur aus
Ihrem Online-Shop in Deutschland bestellt. Leider, als ich das Paket öffnete,
entdeckte ich zu meinem Entsetzen, dass ich stattdessen eine Action Figur von
Megatron geschickt worden war! Als lebenslanger Feind der Decepticons, Ich
hoffe, Sie können mein Dilemma verstehen. Um das Problem zu lösen, Ich fordere
einen Austausch von Megatron für die Optimus Prime Figur habe ich bestellt.
Anbei sind Kopien meiner Aufzeichnungen über diesen Kauf. Ich erwarte, bald von
Ihnen zu hören. Aufrichtig, Bumblebee.


### Text Generation

In [None]:
from transformers import set_seed
set_seed(42) # Set the seed to get reproducible results

```python
generator = pipeline("text-generation")
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up."
prompt = text + "\n\nCustomer service response:\n" + response
# max_length counts the prompt tokens as well as the newly generated ones
outputs = generator(prompt, max_length=200)
print(outputs[0]['generated_text'])
```

In [None]:
# Code it!

Dear Amazon, last week I ordered an Optimus Prime action figure from your online
store in Germany. Unfortunately, when I opened the package, I discovered to my
horror that I had been sent an action figure of Megatron instead! As a lifelong
enemy of the Decepticons, I hope you can understand my dilemma. To resolve the
issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered.
Enclosed are copies of my records concerning this purchase. I expect to hear
from you soon. Sincerely, Bumblebee.

Customer service response:
Dear Bumblebee, I am sorry to hear that your order was mixed up. The order was
completely mislabeled, which is very common in our online store, but I can
appreciate it because it was my understanding from this site and our customer
service of the previous day that your order was not made correct in our mind and
that we are in a process of resolving this matter. We can assure you that your
order
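
The generated text repeats the prompt before continuing it, and it stops mid-sentence once the `max_length` limit is reached. A minimal sketch of keeping only the continuation, assuming the prompt is echoed verbatim at the start of `generated_text` (the strings below are shortened stand-ins, not real model output, and `strip_prompt` is a hypothetical helper):

```python
def strip_prompt(generated_text, prompt):
    """Return only the text the model added after the prompt.

    Hypothetical helper for illustration only. If the prompt is not found
    at the start, the text is returned unchanged.
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text


# Shortened stand-in for the customer letter used in this exercise.
customer_text = "Dear Amazon, I demand an exchange. Sincerely, Bumblebee."
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up."
prompt = customer_text + "\n\nCustomer service response:\n" + response

# Stand-in for outputs[0]['generated_text']: the prompt plus a continuation.
generated_text = prompt + " The order was completely mislabeled"

print(strip_prompt(generated_text, prompt))
```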
