# Introduction to Transformers

In this example, we explore transformer networks through some of the functionality in the [HuggingFace Transformers](https://github.com/huggingface/transformers) library.

This library has great documentation, so please refer to that for more detailed usage after browsing through this simple notebook.

To install the library, use `pip` inside your `conda` environment, since the `conda-forge` version was quite out of date at the time of writing:

```bash
pip install transformers
```
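To confirm that the packages landed in the active environment before importing them, a small check like the one below can help. This is only a sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is our own and is not part of either library.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report what the current environment actually provides
for name in ("torch", "transformers"):
    version = installed_version(name)
    print("{}: {}".format(name, version if version else "not installed"))
```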

In [1]:
import torch
import transformers

## Basic Usage Example

HuggingFace Transformers contains several popular transformer network architectures and pretrained weight checkpoints. The idea is to either use these networks as-is (for example, as pretrained encoders for your own downstream neural network), or as starting points for training on your own datasets.

This library supports both the PyTorch and TensorFlow 2.0 frameworks, and exposes every transformer network as a regular trainable model (a `torch.nn.Module` or a `tf.keras.Model`) that you can use however you want.

Below we will get a [Bidirectional Encoder Representations from Transformers (BERT)](https://github.com/google-research/bert) model, which is popular as an encoder for machine learning tasks (e.g., question answering) because it encodes each word in the context in which it appears rather than as a standalone word.
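To make "encoding words in context" more concrete, here is a toy, pure-Python sketch of single-head self-attention. It is not BERT: the 2-d embeddings are invented for illustration, and the learned query/key/value projections, multiple heads, stacked layers, and position embeddings of the real model are all left out. It only shows how the same word ends up with a different vector depending on its neighbours.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head self-attention with identity query/key/value projections."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Scaled dot-product score of this token against every token
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Each output is a weighted mix of all token vectors in the sentence
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
    return outputs


# Hypothetical static embeddings, invented for illustration only
embeddings = {
    "bank": [1.0, 1.0],
    "river": [0.0, 2.0],
    "money": [2.0, 0.0],
}

river_sentence = [embeddings[w] for w in ("river", "bank")]
money_sentence = [embeddings[w] for w in ("money", "bank")]

# The vector for "bank" (index 1) now depends on the other word
bank_near_river = self_attention(river_sentence)[1]
bank_near_money = self_attention(money_sentence)[1]
print("bank | river:", bank_near_river)
print("bank | money:", bank_near_money)
```

A static embedding table would return the same vector for "bank" in both sentences; here the two results differ because each output mixes in the surrounding tokens.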

**Historical note:** Before transformers, [Embeddings from Language Models (ELMo)](https://allennlp.org/elmo) did the same thing using a bidirectional RNN with LSTM units, so it was natural and clever for Google to choose BERT as their catchy acronym. This is also why you will often hear the facetious comment about "muppets and transformers" dominating recent machine learning. For more information, check out this blog post by [Jay Alammar](http://jalammar.github.io/illustrated-bert/).

In [2]:
# Get a BERT based tokenizer and model
# NOTE: The models will take some time to download the first time
from transformers import BertModel, BertTokenizer
model_name = "bert-base-uncased"
model = BertModel.from_pretrained(model_name)
# NOTE: output_hidden_states / output_attentions are model options,
# not tokenizer options, so they are not passed here
tokenizer = BertTokenizer.from_pretrained(model_name)

# Extract outputs from a sample sentence
sample_text = "Machine learning is the study of computer algorithms that improve automatically through experience."
input_ids = torch.tensor([tokenizer.encode(sample_text, add_special_tokens=True)])
with torch.no_grad():
    outputs = model(input_ids)
    hidden_states = outputs[0]  # last-layer hidden state for every token
    pooled_output = outputs[1]  # pooled [CLS] vector, one per sequence (not attentions)

# Print some key values
print("Sample text: {}".format(sample_text))
print("Hidden states: {}".format(hidden_states))
print("Hidden states shape: {}".format(hidden_states.shape))
print("Pooled output shape: {}".format(pooled_output.shape))
del model
del tokenizer

Sample text: Machine learning is the study of computer algorithms that improve automatically through experience.
Hidden states: tensor([[[ 0.0918, -0.6651, -0.5920,  ...,  0.2772, -0.4451,  0.5621],
         [ 0.3746,  0.4496, -0.4746,  ...,  0.2892,  0.6671,  0.3278],
         [-0.1143,  0.4511, -0.6710,  ..., -0.5707, -0.0513,  0.6873],
         ...,
         [-0.1761, -0.0033,  0.0998,  ..., -0.9533, -0.9809,  0.2631],
         [ 0.8583,  0.0425, -0.4340,  ...,  0.0977, -0.7423, -0.0583],
         [ 0.6726,  0.0225, -0.1304,  ...,  0.2195, -0.8982, -0.1860]]])
Hidden states shape: torch.Size([1, 16, 768])
Pooled output shape: torch.Size([1, 768])
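
The hidden states shape `[1, 16, 768]` reads as (batch size, number of tokens, hidden size): one sentence, 16 tokens, and the 768-dimensional hidden size of `bert-base`. The sketch below reproduces the token count without downloading anything. It is only a rough stand-in for the real tokenizer: it assumes every word in this sentence maps to a single vocabulary entry, which is consistent with the 16 reported above, whereas real WordPiece tokenization may split rarer words into several pieces.

```python
import re

sample_text = "Machine learning is the study of computer algorithms that improve automatically through experience."

# Rough stand-in for uncased pre-tokenization: lowercase, then split into
# words and punctuation marks. This is NOT the real WordPiece algorithm.
basic_tokens = re.findall(r"\w+|[^\w\s]", sample_text.lower())

# BERT wraps a single-sentence input in the special tokens [CLS] ... [SEP]
tokens = ["[CLS]"] + basic_tokens + ["[SEP]"]

print(tokens)
print("Sequence length:", len(tokens))  # 16: 13 words + "." + 2 special tokens
```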


## Pipeline Example: Question Answering

HuggingFace Transformers also provides `pipelines` for common tasks like classification, translation, and question answering. Below we will test another pretrained BERT model for question answering applications. 

This model uses different weights that have been fine-tuned for question answering on the [Stanford Question Answering Dataset (SQuAD)](https://rajpurkar.github.io/SQuAD-explorer/).

In the documentation, you can find a full list of [pipelines](https://huggingface.co/transformers/main_classes/pipelines.html) and [pretrained models](https://huggingface.co/transformers/pretrained_models.html), among many other useful resources.

In [6]:
# Define a question answering pipeline
# NOTE: The models will take some time to download the first time
from transformers import pipeline
model_name = "bert-large-cased-whole-word-masking-finetuned-squad"
nlp_qa = pipeline("question-answering", 
                  model=model_name, 
                  tokenizer=model_name,
                  framework="pt")

# Define the context and a set of questions
# The context is the first few paragraphs from the RoboCup Wikipedia page:
# https://en.wikipedia.org/wiki/RoboCup
context = """
RoboCup is an annual international robotics competition proposed and founded in 1996 by a group of university professors (including Hiroaki Kitano, Manuela M. Veloso, and Minoru Asada). The aim of the competition is to promote robotics and AI research by offering a publicly appealing – but formidable – challenge. 
The name RoboCup is a contraction of the competition's full name, "Robot Soccer World Cup", but there are many other areas of competition such as "RoboCupRescue", "RoboCup@Home" and "RoboCupJunior". In 2019, the international competition was held in Sydney, Australia. Peter Stone is the current president of RoboCup, and has been since 2019.
"""

questions = [
    "When was RoboCup founded?",
    "Who is the president of RoboCup?",
    "Where was the 2019 RoboCup international competition held?",
    "What is the main goal of RoboCup?"
]

# Loop through each question and display the answer and confidence score
for question in questions:
    result = nlp_qa(question=question, context=context)
    answer = result["answer"]
    score = result["score"]

    print("")
    print("Question: {}".format(question))
    print("Answer: {}".format(answer))
    print("Score: {}".format(score))


Question: When was RoboCup founded?
Answer: 1996
Score: 0.9921742685001611

Question: Who is the president of RoboCup?
Answer: Peter Stone
Score: 0.994986112397072

Question: Where was the 2019 RoboCup international competition held?
Answer: Sydney, Australia.
Score: 0.8968013221865334

Question: What is the main goal of RoboCup?
Answer: to promote robotics and AI research
Score: 0.4689098121865669
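
To give some intuition for where the answer and score come from: an extractive question answering model produces a start logit and an end logit for every token, and the answer is the span whose start and end are jointly most probable. The sketch below is a simplified, pure-Python illustration of that idea, with the score taken as the product of the start and end probabilities. The tokens, logits, and the `best_span` helper are all hypothetical, and the real pipeline does more (for example, handling long contexts and excluding question tokens).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (start, end, score) maximizing P(start) * P(end), start <= end."""
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for s, p_start in enumerate(start_probs):
        # Only consider spans that end at or after the start token
        for e in range(s, min(s + max_answer_len, len(end_probs))):
            score = p_start * end_probs[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Hypothetical tokens and logits, invented for illustration only
tokens = ["founded", "in", "1996", "by", "a", "group"]
start_logits = [0.1, 0.2, 6.0, 0.0, -1.0, -1.0]
end_logits = [0.0, 0.1, 5.5, 0.3, -1.0, 0.2]

start, end, score = best_span(start_logits, end_logits)
print("Answer:", " ".join(tokens[start:end + 1]))
print("Score: {:.4f}".format(score))
```

Under this view, a lower score such as the one for the last question above suggests that the probability mass is spread over several plausible start or end positions rather than concentrated on one span.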
