<a href="https://colab.research.google.com/github/tomcat118/NLP-with-Transformers/blob/main/Intro_to_transformers.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
# Run this cell only if you're on Colab or Kaggle
!git clone https://github.com/nlp-with-transformers/notebooks.git
%cd notebooks
from install import *
install_requirements()

Cloning into 'notebooks'...
remote: Enumerating objects: 515, done.
remote: Counting objects: 100% (161/161), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 515 (delta 139), reused 126 (delta 122), pack-reused 354
Receiving objects: 100% (515/515), 28.61 MiB | 10.38 MiB/s, done.
Resolving deltas: 100% (246/246), done.
/content/notebooks
⏳ Installing base requirements ...
✅ Base requirements installed!
⏳ Installing Git LFS ...
✅ Git LFS installed!


In [None]:
#hide
from utils import *
setup_chapter()

# Hello Transformers

<img alt="transformer-timeline" caption="The transformers timeline" src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_timeline.png?raw=1" id="transformer-timeline"/>

## The Encoder-Decoder Framework

<img alt="rnn" caption="Unrolling an RNN in time." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_rnn.png?raw=1" id="rnn"/>

<img alt="enc-dec" caption="Encoder-decoder architecture with a pair of RNNs. In general, there are many more recurrent layers than those shown." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_enc-dec.png?raw=1" id="enc-dec"/>

## Attention Mechanisms

<img alt="enc-dec-attn" caption="Encoder-decoder architecture with an attention mechanism for a pair of RNNs." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_enc-dec-attn.png?raw=1" id="enc-dec-attn"/> 

<img alt="attention-alignment" width="500" caption="RNN encoder-decoder alignment of words in English and the generated translation in French (courtesy of Dzmitry Bahdanau)." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter02_attention-alignment.png?raw=1" id="attention-alignment"/> 

<img alt="transformer-self-attn" caption="Encoder-decoder architecture of the original Transformer." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_self-attention.png?raw=1" id="transformer-self-attn"/> 
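
The self-attention idea behind the figures above can be illustrated with a minimal sketch of scaled dot-product attention in plain Python. It is a simplification under stated assumptions: the embeddings are hypothetical, and the learned query, key, and value projections used in a real Transformer are omitted, so the same vectors serve as all three inputs.

```python
import math

def softmax(xs):
    """Convert raw scores into weights that sum to 1."""
    m = max(xs)  # Subtract the maximum for numerical stability.
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for a three-token sequence.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: queries, keys, and values all come from the same sequence.
outputs, weights = attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how strongly one token attends to every token in the sequence, which is the quantity visualized in alignment plots like the one above.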

## Transfer Learning in NLP

<img alt="transfer-learning" caption="Comparison of traditional supervised learning (left) and transfer learning (right)." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_transfer-learning.png?raw=1" id="transfer-learning"/>  

<img alt="ulmfit" width="500" caption="The ULMFiT process (courtesy of Jeremy Howard)." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_ulmfit.png?raw=1" id="ulmfit"/>

## Hugging Face Transformers: Bridging the Gap

## A Tour of Transformer Applications

In [14]:
text = """We aim to develop methods for understanding how multimedia news exposure can affect people’s emotional responses, \
and we especially focus on news content related to gun violence, a very important yet polarizing issue in the US \
We created the dataset NEmo+ by significantly extending the US gun violence news-to-emotions dataset \
BU-NEmo, from 320 to 1,297 news headline and lead image pairings and collecting 38,910 annotations in a large crowdsourcing experiment. \
In curating the NEmo+ dataset, we developed methods to identify news items that will trigger similar versus divergent emotional responses. \
For news items that trigger similar emotional responses, we compiled them into the NEmo+- Consensus dataset \
We benchmark models on this dataset that predict a person’s dominant emotional response toward the target news item (single-label prediction)."""


### Text Classification

In [15]:
#hide_output
from transformers import pipeline

classifier = pipeline("text-classification")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)


In [16]:
import pandas as pd

outputs = classifier(text)
pd.DataFrame(outputs)    

|   | label | score |
|---|-------|-------|
| 0 | NEGATIVE | 0.906854 |
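
The `score` column is the probability the model assigns to the predicted label. As a rough sketch of where such a number comes from, a classification head typically produces one logit per label and a softmax turns the logits into probabilities. The logit values below are hypothetical and chosen only for illustration; they are not taken from the model run above.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the two sentiment labels (illustrative values only).
labels = ["NEGATIVE", "POSITIVE"]
logits = [1.5, -0.8]

probs = softmax(logits)
best = max(range(len(labels)), key=lambda i: probs[i])

# Same shape as one row of the pipeline output: a label and its probability.
prediction = {"label": labels[best], "score": probs[best]}
print(prediction)
```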


### Named Entity Recognition

In [19]:
ner_tagger = pipeline("ner", aggregation_strategy="first")
outputs = ner_tagger(text)
pd.DataFrame(outputs)    

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english)


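|   | entity_group | score | word | start | end |
|---|--------------|-------|------|-------|-----|
| 0 | LOC | 0.998844 | US | 224 | 226 |
| 1 | LOC | 0.992998 | US | 287 | 289 |
| 2 | MISC | 0.841746 | NEmo + | 686 | 691 |
| 3 | MISC | 0.888 | Consensus | 693 | 702 |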
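
The `start` and `end` columns are character offsets into the input string, so slicing the text with them recovers each entity span. The snippet below is a small sketch using a shortened, hypothetical sample sentence and hand-built entity records in the same shape as the pipeline output; it does not call the model.

```python
# Shortened sample sentence (an assumption for illustration, not the full `text`).
sample = "We created the dataset NEmo+ in the US."

def make_entity(source, word, group):
    """Build a record shaped like the NER pipeline output for `word`."""
    start = source.find(word)
    if start == -1:
        raise ValueError(f"{word!r} not found in source text")
    return {"entity_group": group, "word": word,
            "start": start, "end": start + len(word)}

# Hand-built records; the labels are hypothetical.
entities = [
    make_entity(sample, "NEmo+", "MISC"),
    make_entity(sample, "US", "LOC"),
]

# `end` is exclusive, so a plain slice returns the original surface form.
spans = [sample[e["start"]:e["end"]] for e in entities]
print(spans)
```

Slicing the source text this way is also useful when the `word` column has been re-spaced by the tokenizer, as with `NEmo +` above, whose offsets cover only five characters.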


### Question Answering 

In [20]:
reader = pipeline("question-answering")
question = "What does the author did in the paper?"
outputs = reader(question=question, context=text)
pd.DataFrame([outputs])    

No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)


|   | score | start | end | answer |
|---|-------|-------|-----|--------|
| 0 | 0.027584 | 660 | 710 | we compiled them into the NEmo+- Consensus dat... |
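
Extractive question answering returns a span of the context rather than free text, which is why the output carries `start` and `end` offsets. One common way to pick the span, sketched below, is to score every valid start/end pair and keep the best one. The tokens and scores are hypothetical, and the real pipeline may differ in details such as how scores are normalized.

```python
def best_span(start_scores, end_scores, max_len=10):
    """Return (start, end, score) for the highest-scoring span with start <= end."""
    best = None
    for i, s in enumerate(start_scores):
        # Only consider ends at or after the start, within the length limit.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if best is None or score > best[2]:
                best = (i, j, score)
    return best

# Hypothetical tokens and per-token scores (illustrative values only).
tokens = ["we", "compiled", "them", "into", "a", "dataset"]
start_scores = [2.0, 0.5, 0.1, 0.0, 0.3, 0.2]
end_scores = [0.1, 0.2, 0.4, 0.0, 0.1, 3.0]

start, end, score = best_span(start_scores, end_scores)

# Both indices are inclusive here, so add 1 to the end when slicing.
answer = " ".join(tokens[start:end + 1])
print(answer, score)
```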


### Summarization

In [22]:
summarizer = pipeline("summarization")
outputs = summarizer(text, max_length=45, clean_up_tokenization_spaces=True)
print(outputs[0]['summary_text'])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 (https://huggingface.co/sshleifer/distilbart-cnn-12-6)


 We aim to develop methods for understanding how multimedia news exposure can affect people’s emotional responses. We especially focus on news content related to gun violence, a very important yet polarizing issue in the US.


### Translation

In [23]:
# The task name must match the model's language pair (English to Chinese)
translator = pipeline("translation_en_to_zh", 
                      model="liam168/trans-opus-mt-en-zh")

outputs = translator(text, clean_up_tokenization_spaces=True, min_length=100)
print(outputs[0]['translation_text'])

我们的目标是开发各种方法,了解多媒体新闻曝光如何影响人们的情感反应,我们特别侧重于与枪支暴力有关的新闻内容,枪支暴力是一个非常重要而又两极分化的问题。 在美国,我们创建了数据集NEMO+,将美国枪支暴力新闻到情感数据集BU-NEMO从320条新闻头条大幅扩展至1 297条新闻头条新闻和领先图像配对,并在大型众包实验中收集38 910条插图。 在打造NEMO+数据集的过程中,我们开发了各种方法来识别触发类似和不同情感反应的新闻项目。 对于触发类似情感反应的新闻项目,我们将其编入了NEMO+共识数据集,我们在这个数据集上建立了基准模型,预测一个人对目标新闻项目(单标签预测)的主导情感反应。


### Text Generation

In [None]:
#hide
from transformers import set_seed
set_seed(42) # Set the seed to get reproducible results

In [26]:
generator = pipeline("text-generation")
response = "There are many things we learned from this paper."
prompt = text + "\n\n Audience response:\n" + response
outputs = generator(prompt, max_length=200)
print(outputs[0]['generated_text'])

No model was supplied, defaulted to gpt2 (https://huggingface.co/gpt2)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


We aim to develop methods for understanding how multimedia news exposure can affect people’s emotional responses, and we especially focus on news content related to gun violence, a very important yet polarizing issue in the US We created the dataset NEmo+ by significantly extending the US gun violence news-to-emotions dataset BU-NEmo, from 320 to 1,297 news headline and lead image pairings and collecting 38,910 annotations in a large crowdsourcing experiment. In curating the NEmo+ dataset, we developed methods to identify news items that will trigger similar versus divergent emotional responses. For news items that trigger similar emotional responses, we compiled them into the NEmo+- Consensus dataset We benchmark models on this dataset that predict a person’s dominant emotional response toward the target news item (single-label prediction).

 Audience response:
There are many things we learned from this paper. One is that some people respond with a mixture of sadness, dismay and even 
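
Text generation is autoregressive: the model repeatedly predicts a next token given everything so far and appends it until a length limit is reached. The toy sketch below mimics that loop with a hypothetical hand-written next-word table in place of a language model. Seeding the random generator plays the same role as `set_seed` above.

```python
import random

# Hypothetical next-word table standing in for a language model.
NEXT_WORDS = {
    "<start>": ["people", "news"],
    "people": ["respond", "react"],
    "news": ["can", "may"],
    "respond": ["strongly", "differently"],
    "react": ["strongly", "differently"],
    "can": ["affect"],
    "may": ["affect"],
    "affect": ["people"],
}

def generate(prompt_words, max_length, seed):
    """Append sampled words until `max_length` words or no continuation exists."""
    rng = random.Random(seed)  # Local RNG so the same seed gives the same text.
    words = list(prompt_words)
    while len(words) < max_length:
        candidates = NEXT_WORDS.get(words[-1])
        if not candidates:
            break  # No known continuation for the last word.
        words.append(rng.choice(candidates))
    return words

first = generate(["<start>"], max_length=6, seed=42)
second = generate(["<start>"], max_length=6, seed=42)
print(" ".join(first[1:]))
```

As in the call above, `max_length` in this sketch counts the prompt as well as the newly generated words, so a long prompt leaves little room for the continuation.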

## The Hugging Face Ecosystem

<img alt="ecosystem" width="500" caption="An overview of the Hugging Face ecosystem of libraries and the Hub." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hf-ecosystem.png?raw=1" id="ecosystem"/>

### The Hugging Face Hub

<img alt="hub-overview" width="1000" caption="The models page of the Hugging Face Hub, showing filters on the left and a list of models on the right." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hub-overview.png?raw=1" id="hub-overview"/> 

<img alt="hub-model-card" width="1000" caption="An example model card from the Hugging Face Hub. The inference widget is shown on the right, where you can interact with the model." src="https://github.com/nlp-with-transformers/notebooks/blob/main/images/chapter01_hub-model-card.png?raw=1" id="hub-model-card"/> 

### Hugging Face Tokenizers

### Hugging Face Datasets

### Hugging Face Accelerate

## Main Challenges with Transformers

## Conclusion