# Introduction to Transformers

This notebook uses the Hugging Face Transformers library: https://github.com/huggingface/transformers

The library is well documented, so refer to its documentation for more detailed usage after working through this short notebook.

To install the library, use `pip` inside your `conda` environment, since the `conda-forge` version is quite out of date:

```bash
pip install transformers
```

In [4]:
import torch
import transformers

## Basic Usage Example

In [5]:
from transformers import BertModel, BertTokenizer

model_name = "bert-base-uncased"
model = BertModel.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)

sample_text = "Machine learning is the study of computer algorithms that improve automatically through experience."

# Encode the text as token IDs (adding [CLS] and [SEP]) and wrap it in a batch of size 1.
input_ids = torch.tensor([tokenizer.encode(sample_text, add_special_tokens=True)])
with torch.no_grad():
    outputs = model(input_ids)
    hidden_states = outputs[0]  # last-layer hidden states: (batch, tokens, hidden size)
    pooled_output = outputs[1]  # pooled [CLS] representation: (batch, hidden size)

print("Sample text: {}".format(sample_text))
print("Hidden states: {}".format(hidden_states))
print("Hidden states shape: {}".format(hidden_states.shape))
print("Pooled output shape: {}".format(pooled_output.shape))

# Free memory before loading the next model.
del model
del tokenizer

Sample text: Machine learning is the study of computer algorithms that improve automatically through experience.
Hidden states: tensor([[[ 0.0918, -0.6651, -0.5920,  ...,  0.2772, -0.4451,  0.5621],
         [ 0.3746,  0.4496, -0.4746,  ...,  0.2892,  0.6671,  0.3278],
         [-0.1143,  0.4511, -0.6710,  ..., -0.5707, -0.0513,  0.6873],
         ...,
         [-0.1761, -0.0033,  0.0998,  ..., -0.9533, -0.9809,  0.2631],
         [ 0.8583,  0.0425, -0.4340,  ...,  0.0977, -0.7423, -0.0583],
         [ 0.6726,  0.0225, -0.1304,  ...,  0.2195, -0.8982, -0.1860]]])
Hidden states shape: torch.Size([1, 16, 768])
Pooled output shape: torch.Size([1, 768])
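
The second element returned by `BertModel` (`outputs[1]`) has shape `[1, 768]`. It is the pooled `[CLS]` representation, not the attention weights. The `output_hidden_states` and `output_attentions` flags apply to the model, not the tokenizer. The sketch below shows one way to request per-layer hidden states and attention weights, assuming a recent (v4+) release of the library where outputs expose named attributes. To run without downloading weights it uses a tiny, randomly initialised model. The layer sizes and token IDs are made up for illustration, and forcing the "eager" attention implementation is an assumption made so that attention weights are returned.

```python
import torch
from transformers import BertConfig, BertModel

# Assumption: a tiny, randomly initialised BERT (hypothetical sizes) so the
# sketch runs offline. For real use, load
# BertModel.from_pretrained("bert-base-uncased") instead.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    attn_implementation="eager",  # assumption: eager attention exposes the weights
)
model = BertModel(config)
model.eval()  # disable dropout

# Hypothetical token IDs standing in for the output of tokenizer.encode(...).
input_ids = torch.tensor([[1, 5, 6, 7, 2]])

with torch.no_grad():
    # The flags are passed to the model call, not to the tokenizer.
    outputs = model(input_ids, output_hidden_states=True, output_attentions=True)

last_hidden_state = outputs.last_hidden_state  # (batch, tokens, hidden size)
hidden_states = outputs.hidden_states          # tuple: embeddings + one tensor per layer
attentions = outputs.attentions                # tuple: one tensor per layer

print("Last hidden state shape: {}".format(last_hidden_state.shape))
print("Number of hidden-state tensors: {}".format(len(hidden_states)))
print("Attention shape per layer: {}".format(attentions[0].shape))
```

Each attention tensor has shape `(batch, heads, tokens, tokens)`, so attention has to be inspected one layer at a time rather than through a single `.shape` call.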


## Pipeline Example: Question Answering

In [6]:
from transformers import pipeline

context = """
RoboCup is an annual international robotics competition proposed and founded in 1996 by a group of university professors (including Hiroaki Kitano, Manuela M. Veloso, and Minoru Asada). The aim of the competition is to promote robotics and AI research by offering a publicly appealing – but formidable – challenge. 
The name RoboCup is a contraction of the competition's full name, "Robot Soccer World Cup", but there are many other areas of competition such as "RoboCupRescue", "RoboCup@Home" and "RoboCupJunior". In 2019, the international competition was held in Sydney, Australia. Peter Stone is the current president of RoboCup, and has been since 2019.
"""

questions = [
    "When was RoboCup founded?",
    "Who is the president of RoboCup?",
    "Where was the 2019 RoboCup international competition held?",
    "What is the main goal of RoboCup?"
]

model_name = "bert-large-cased-whole-word-masking-finetuned-squad"
nlp_qa = pipeline("question-answering", 
                  model=model_name, 
                  tokenizer=model_name,
                  framework="pt")

for question in questions:
    result = nlp_qa(question=question, context=context)
    answer = result["answer"]
    score = result["score"]

    print("")
    print("Question: {}".format(question))
    print("Answer: {}".format(answer))
    print("Score: {}".format(score))


Question: When was RoboCup founded?
Answer: 1996
Score: 0.9921742685001611

Question: Who is the president of RoboCup?
Answer: Peter Stone
Score: 0.994986112397072

Question: Where was the 2019 RoboCup international competition held?
Answer: Sydney, Australia.
Score: 0.8968013221865334

Question: What is the main goal of RoboCup?
Answer: to promote robotics and AI research
Score: 0.4689098121865669
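
Each result carries a `score`, and the last answer above scores noticeably lower (about 0.47) than the other three. As a minimal sketch of how such scores might be used, the snippet below separates confident answers from uncertain ones. The answers and scores are copied (rounded) from the output above. The 0.5 threshold and the helper name `split_by_confidence` are assumptions for illustration, not part of the library.

```python
# Answers and scores copied from the pipeline output above (scores rounded).
results = [
    {"question": "When was RoboCup founded?",
     "answer": "1996", "score": 0.9922},
    {"question": "Who is the president of RoboCup?",
     "answer": "Peter Stone", "score": 0.9950},
    {"question": "Where was the 2019 RoboCup international competition held?",
     "answer": "Sydney, Australia.", "score": 0.8968},
    {"question": "What is the main goal of RoboCup?",
     "answer": "to promote robotics and AI research", "score": 0.4689},
]

SCORE_THRESHOLD = 0.5  # assumed cut-off for illustration, not a library default


def split_by_confidence(results, threshold):
    """Split results into (confident, uncertain) lists by their score."""
    confident = [r for r in results if r["score"] >= threshold]
    uncertain = [r for r in results if r["score"] < threshold]
    return confident, uncertain


confident, uncertain = split_by_confidence(results, SCORE_THRESHOLD)

for r in uncertain:
    print("Low-confidence answer to {!r}: {!r} (score {:.2f})".format(
        r["question"], r["answer"], r["score"]))
```

A suitable threshold depends on the model and the task, so the value here would need to be tuned on real data.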
