# The Hugging Face Ecosystem

This introduction to the `transformers` library is adapted from the book *Natural Language Processing with Transformers*:  
https://github.com/nlp-with-transformers/notebooks

In [1]:
# Run this cell if you're on Colab or Kaggle
# Otherwise make sure you have the linked github repository set up locally
# !git clone https://github.com/nlp-with-transformers/notebooks.git
# %cd notebooks
# from install import *
# install_requirements()

In [None]:
%%bash
. ~/.bashrc
python3 -m pip install transformers
python3 -m pip install datasets

# Hello Transformers

The transformer architecture is still young by academic standards. The original paper introducing the "Transformer" architecture was published in 2017, less than six years ago. Since then, a seemingly endless number of variations have appeared in the literature. Notable examples include GPT (a decoder-only, autoregressive variant geared towards text generation) and BERT (an encoder-only variant, suited to tasks such as classification, where the output is tied to a given input rather than generated freely).

<img alt="transformer-timeline" caption="The transformers timeline" src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_timeline.png?raw=1" id="transformer-timeline"/>

## Transfer Learning in NLP

One of the key insights from recent work on Transformer-based architectures is that previously trained models can be reused in a "transfer learning" setting. Instead of being trained from scratch, a model is first pretrained on a large, general dataset and then partially adapted (fine-tuned) on a domain-specific dataset with far fewer samples.  
This generally leads to a drastic reduction in the number of labeled instances required for decent performance, and also improves generalization in many areas.


However, this should not be confused with "zero-shot transfer", in which the model is trained *only* on the general-purpose data and no additional fine-tuning is performed. This works better when the underlying model is already very large (e.g., GPT-3) and is therefore likely to give decent results even without having seen domain-specific examples.

<img alt="transfer-learning" caption="Comparison of traditional supervised learning (left) and transfer learning (right)." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_transfer-learning.png?raw=1" id="transfer-learning"/>  

## Hugging Face Transformers: Bridging the Gap

## A Tour of Transformer Applications

In [3]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

### Text Classification
By default, `transformers.pipeline` only uses the CPU. To speed up processing, we can select a GPU via the `device` parameter. This generally requires CUDA to be set up on your machine.

In [16]:
#hide_output
from transformers import pipeline

classifier = pipeline("text-classification",
                      # device=0  # Enable this line to use a GPU, if available
                     )
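
The value for `device` can also be chosen at runtime instead of being hard-coded. The following is a minimal sketch, assuming PyTorch is the installed backend; `pick_device` is a hypothetical helper and not part of `transformers`. It returns `0` (the first CUDA GPU) if one is visible and `-1` (CPU) otherwise, which are integer values that the `device` argument of `pipeline` accepts.

```python
def pick_device() -> int:
    """Return 0 for the first CUDA GPU if one is available, else -1 for the CPU."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return -1  # PyTorch is not installed, so fall back to the CPU
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print("device =", device)

# The result could then be passed on to the pipeline, e.g.:
# classifier = pipeline("text-classification", device=device)
```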

In [17]:
import pandas as pd

outputs = classifier(text)
pd.DataFrame(outputs)    

|   | label | score |
|---|---|---|
| 0 | NEGATIVE | 0.901546 |


### Named Entity Recognition

In [18]:
ner_tagger = pipeline("ner", aggregation_strategy="simple")
outputs = ner_tagger(text)
pd.DataFrame(outputs)    

|   | entity_group | score | word | start | end |
|---|---|---|---|---|---|
| 0 | ORG | 0.87901 | Amazon | 5 | 11 |
| 1 | MISC | 0.990859 | Optimus Prime | 36 | 49 |
| 2 | LOC | 0.999755 | Germany | 90 | 97 |
| 3 | MISC | 0.556571 | Mega | 208 | 212 |
| 4 | PER | 0.590256 | ##tron | 212 | 216 |
| 5 | ORG | 0.669692 | Decept | 253 | 259 |
| 6 | MISC | 0.498349 | ##icons | 259 | 264 |
| 7 | MISC | 0.775363 | Megatron | 350 | 358 |
| 8 | MISC | 0.987854 | Optimus Prime | 367 | 380 |
| 9 | PER | 0.812096 | Bumblebee | 502 | 511 |
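
Rows 3 to 6 of the output above show a typical artifact: "Megatron" and "Decepticons" were split into word pieces (the `##` prefix marks a continuation), and the pieces were not grouped, presumably because they received different entity labels. The pipeline's other `aggregation_strategy` values (`"first"`, `"average"`, `"max"`) are one way to address this. Alternatively, the output can be post-processed. The sketch below is only an illustration: `merge_word_pieces` is a hypothetical helper, not part of `transformers`. It attaches a `##` piece to the preceding entity when their character offsets are adjacent and keeps the label of the higher-scoring piece. The input rows are copied from the output above.

```python
def merge_word_pieces(entities):
    """Merge '##' continuation pieces into the entity that precedes them."""
    merged = []
    for ent in entities:
        prev = merged[-1] if merged else None
        is_continuation = ent["word"].startswith("##")
        if prev is not None and is_continuation and prev["end"] == ent["start"]:
            # Keep the label of whichever piece has the higher score
            best = prev if prev["score"] >= ent["score"] else ent
            merged[-1] = {
                "entity_group": best["entity_group"],
                "score": best["score"],
                "word": prev["word"] + ent["word"][2:],  # drop the '##' prefix
                "start": prev["start"],
                "end": ent["end"],
            }
        else:
            merged.append(dict(ent))
    return merged


rows = [
    {"entity_group": "MISC", "score": 0.556571, "word": "Mega", "start": 208, "end": 212},
    {"entity_group": "PER", "score": 0.590256, "word": "##tron", "start": 212, "end": 216},
    {"entity_group": "ORG", "score": 0.669692, "word": "Decept", "start": 253, "end": 259},
    {"entity_group": "MISC", "score": 0.498349, "word": "##icons", "start": 259, "end": 264},
]

for ent in merge_word_pieces(rows):
    print(ent["entity_group"], ent["word"], ent["start"], ent["end"])
```

Note that keeping the higher-scoring label is a design choice made for this sketch; here it yields `PER` for "Megatron" and `ORG` for "Decepticons", neither of which is necessarily the label one would want.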


### Question Answering 

In [19]:
reader = pipeline("question-answering")
question = "What does the customer want?"
outputs = reader(question=question, context=text)
pd.DataFrame([outputs])    

|   | score | start | end | answer |
|---|---|---|---|---|
| 0 | 0.631292 | 335 | 358 | an exchange of Megatron |
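
The `start` and `end` values are character offsets into the context string, so the answer span can be recovered by slicing. A small check, repeating the `text` defined at the beginning of this section and using the offsets from the output above:

```python
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

# Offsets as reported by the question-answering pipeline
start, end = 335, 358

# Slicing the context with these offsets reproduces the 'answer' field
answer = text[start:end]
print(answer)
```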


### Summarization

In [20]:
summarizer = pipeline("summarization")
outputs = summarizer(text, max_length=45, clean_up_tokenization_spaces=True)
print(outputs[0]['summary_text'])

 Bumblebee ordered an Optimus Prime action figure from your online store in
Germany. Unfortunately, when I opened the package, I discovered to my horror
that I had been sent an action figure of Megatron instead.


### Translation

In [21]:
translator = pipeline("translation_en_to_de", 
                      model="Helsinki-NLP/opus-mt-en-de")
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
print(outputs[0]['translation_text'])

Sehr geehrter Amazon, letzte Woche habe ich eine Optimus Prime Action Figur aus
Ihrem Online-Shop in Deutschland bestellt. Leider, als ich das Paket öffnete,
entdeckte ich zu meinem Entsetzen, dass ich stattdessen eine Action Figur von
Megatron geschickt worden war! Als lebenslanger Feind der Decepticons, Ich
hoffe, Sie können mein Dilemma verstehen. Um das Problem zu lösen, Ich fordere
einen Austausch von Megatron für die Optimus Prime Figur habe ich bestellt.
Anbei sind Kopien meiner Aufzeichnungen über diesen Kauf. Ich erwarte, bald von
Ihnen zu hören. Aufrichtig, Bumblebee.


### Text Generation

In [22]:
#hide
from transformers import set_seed
set_seed(42) # Set the seed to get reproducible results

In [23]:
generator = pipeline("text-generation")
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up."
prompt = text + "\n\nCustomer service response:\n" + response
outputs = generator(prompt, max_length=200)
print(outputs[0]['generated_text'])

Dear Amazon, last week I ordered an Optimus Prime action figure from your online
store in Germany. Unfortunately, when I opened the package, I discovered to my
horror that I had been sent an action figure of Megatron instead! As a lifelong
enemy of the Decepticons, I hope you can understand my dilemma. To resolve the
issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered.
Enclosed are copies of my records concerning this purchase. I expect to hear
from you soon. Sincerely, Bumblebee.

Customer service response:
Dear Bumblebee, I am sorry to hear that your order was mixed up. The order was
completely mislabeled, which is very common in our online store, but I can
appreciate it because it was my understanding from this site and our customer
service of the previous day that your order was not made correct in our mind and
that we are in a process of resolving this matter. We can assure you that your
order


## The Hugging Face Ecosystem

Beyond the model weights themselves, the Hugging Face ecosystem is steadily expanding: it now covers other modalities (such as vision, audio, or video) and also provides dedicated libraries for sub-tasks such as tokenization, model evaluation, and dataset persistence and loading.

<img alt="ecosystem" width="500" caption="An overview of the Hugging Face ecosystem of libraries and the Hub." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hf-ecosystem.png?raw=1" id="ecosystem"/>

### The Hugging Face Hub

The Hugging Face "Model Hub" offers a variety of pretrained models for many different tasks (primarily NLP as of now), from basic text classification all the way to domain-specific text generation models.  
Models can be filtered by task, name, dataset, language, or even license restrictions. Note that models may be trained on monolingual data, and that they differ in whether they treat text as cased or uncased (the latter usually meaning that the text is entirely lowercased). Some model pages let you try the model out directly through a web inference widget.

Importantly, the model pages also display so-called "model cards" (Mitchell et al., 2019), which describe the training setup and some of the important limitations of a model in more detail. Since model cards are generally provided by the model authors themselves, they vary wildly in their level of detail. However, you can use the length of a model card as a naive proxy for the "goodness" of a model: authors who care about details like a model card are in all likelihood providing more robust work than those who leave it blank.
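
As an illustration of this heuristic, the sketch below counts the words in the body of a model card. It assumes the card is available as a Markdown string with an optional YAML metadata block delimited by `---` lines at the top, which is how cards are stored as `README.md` files on the Hub. Both `card_word_count` and the sample card are hypothetical and only meant to make the idea concrete.

```python
def card_word_count(card_markdown: str) -> int:
    """Count the words in a model card, ignoring a leading YAML metadata block."""
    lines = card_markdown.strip().splitlines()
    if lines and lines[0].strip() == "---":
        # Skip everything up to and including the closing '---' line
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                lines = lines[i + 1:]
                break
    return len(" ".join(lines).split())


# Hypothetical card text, used only to exercise the function
sample_card = """---
license: apache-2.0
language: en
---
# Toy sentiment model

Fine-tuned on a hypothetical review dataset. Not suitable for long documents.
"""

print(card_word_count(sample_card))
```

A word count is only a rough signal; it says nothing about whether the content of the card is accurate.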

Model Hub: https://huggingface.co/models

<img alt="hub-overview" width="1000" caption="The models page of the Hugging Face Hub, showing filters on the left and a list of models on the right." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hub-overview.png?raw=1" id="hub-overview"/> 


<img alt="hub-model-card" width="1000" caption="An example model card from the Hugging Face Hub. The inference widget is shown on the right, where you can interact with the model." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hub-model-card.png?raw=1" id="hub-model-card"/> 