### HuggingFace

In this part we explore another library that lets us do everything we learned earlier, but more directly.

In [1]:
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
import torch
import json
from tqdm import tqdm
import numpy as np
from pprint import pprint

In [2]:
model_name = "deepset/roberta-base-squad2" # "mrm8488/longformer-base-4096-finetuned-squadv2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

In [3]:
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
res = nlp(QA_input)

print(res)

{'score': 0.21171511709690094, 'start': 59, 'end': 84, 'answer': 'gives freedom to the user'}
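
The `start` and `end` fields in the result above are character offsets into the `context` string, so slicing the context with them should give back the `answer`. A minimal check, reusing the values printed above (no model is loaded or run here):

```python
# Context and result copied from the cell above; nothing is recomputed here.
context = ('The option to convert models between FARM and transformers gives '
           'freedom to the user and let people easily switch between frameworks.')
res = {'score': 0.21171511709690094, 'start': 59, 'end': 84,
       'answer': 'gives freedom to the user'}

# `start` is inclusive and `end` is exclusive, like a Python slice.
span = context[res['start']:res['end']]
print(span)
```

This is handy when the answer has to be highlighted in the original passage rather than just printed.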


In [4]:
# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [5]:
text = r"""
🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
TensorFlow 2.0 and PyTorch.
"""

questions = [
    "How many pretrained models are available in Transformers?",
    "What does Transformers provide?",
    "Transformers provides interoperability between which frameworks?",
]

for question in questions:
    # Encode the question/context pair as a single sequence of token ids
    inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]

    # Inference only: no gradients are needed
    with torch.no_grad():
        model_output = model(**inputs)
    answer_start_scores = model_output["start_logits"]
    answer_end_scores = model_output["end_logits"]

    # Most likely start token: argmax of the start logits
    answer_start = int(torch.argmax(answer_start_scores))
    # Most likely end token: argmax of the end logits (+1 because a slice excludes its end index)
    answer_end = int(torch.argmax(answer_end_scores)) + 1

    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))

    print(f"Question: {question}")
    print(f"Answer: {answer}\n")

Question: How many pretrained models are available in Transformers?
Answer:  32+

Question: What does Transformers provide?
Answer:  general-purpose
architectures

Question: Transformers provides interoperability between which frameworks?
Answer: <s>
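
The `<s>` answer to the last question means that both argmaxes landed on position 0, the special token that opens the sequence. Models fine-tuned on SQuAD v2 commonly use that position to signal "no answer", and taking the two argmaxes independently does nothing to exclude it. One way around this is to search only over context tokens and to score start/end pairs jointly. The sketch below uses made-up logits and a hypothetical helper, `best_span`. It is a simplified illustration of this kind of post-processing, not the code the `pipeline` actually runs.

```python
def best_span(start_scores, end_scores, is_context, max_answer_len=15):
    """Return ((start, end), score) for the best span made only of context tokens.

    Hypothetical helper: `end` is inclusive, and `is_context[i]` is True when
    token i belongs to the context rather than the question or special tokens.
    """
    best, best_score = None, float("-inf")
    for s, s_score in enumerate(start_scores):
        if not is_context[s]:
            continue
        # Only consider ends at or after the start, within the length limit.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            if not is_context[e]:
                continue
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best, best_score


# Toy logits (invented): position 0 plays the role of <s>, 1-3 the question,
# 4-6 the context.
start_scores = [5.0, 0.1, 0.2, 0.0, 3.0, 1.0, 0.5]
end_scores = [5.0, 0.0, 0.1, 0.0, 0.5, 2.5, 0.2]
is_context = [False, False, False, False, True, True, True]

# Independent argmaxes both pick position 0, i.e. the special token.
naive_start = max(range(len(start_scores)), key=start_scores.__getitem__)
naive_end = max(range(len(end_scores)), key=end_scores.__getitem__)

span, score = best_span(start_scores, end_scores, is_context)
print(naive_start, naive_end, span, score)
```

With real model outputs, the context mask could be derived from the tokenizer output, for example by locating the separator tokens; that part is left out here.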



In [6]:
for question in questions:
    QA_input = {
        'question': question,
        'context': text
    }
    answer = nlp(QA_input)["answer"]

    print(f"Question: {question}")
    print(f"Answer: {answer}\n")

Question: How many pretrained models are available in Transformers?
Answer: 32+

Question: What does Transformers provide?
Answer: general-purpose
architectures

Question: Transformers provides interoperability between which frameworks?
Answer: 
TensorFlow 2.0 and PyTorch



### Testing it on the SQuAD v2 dev set

In this part we measure the performance of this model on the SQuAD v2 dev set.

In [9]:
with open("./SQUAD/dev-v2.0.json") as f:
    squad_dev = json.load(f)

In [10]:
qa_collection = []

for passage in squad_dev["data"]:
    for paragraph in passage["paragraphs"]:
        text = paragraph["context"]
        for qas in paragraph["qas"]:
            question = qas["question"]
            # One sample per gold answer; unanswerable questions have an empty "answers" list and are skipped
            for answer_info in qas["answers"]:
                answer = answer_info["text"]
                qa_collection.append({
                    "text": text,
                    "question": question,
                    "answer": answer,
                })
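
To make the nested structure traversed above concrete, here is a sketch on a tiny hand-written dictionary that mimics the layout of the SQuAD v2 JSON file. The passage, questions and answers are invented for illustration, and `flatten_squad` is a hypothetical helper that repeats the same loop.

```python
# Toy data in the SQuAD v2 layout: data -> paragraphs -> qas -> answers.
toy_squad = {
    "data": [{
        "title": "Toy article",
        "paragraphs": [{
            "context": "The Rhine flows into Lake Constance.",
            "qas": [
                {"question": "Which lake does the Rhine flow into?",
                 "is_impossible": False,
                 "answers": [{"text": "Lake Constance", "answer_start": 21},
                             {"text": "Constance", "answer_start": 26}]},
                # Unanswerable question: "answers" is empty.
                {"question": "Which sea does the Danube flow into?",
                 "is_impossible": True,
                 "answers": []},
            ],
        }],
    }]
}


def flatten_squad(squad):
    """Build one {text, question, answer} sample per gold answer."""
    samples = []
    for passage in squad["data"]:
        for paragraph in passage["paragraphs"]:
            for qas in paragraph["qas"]:
                for answer_info in qas["answers"]:
                    samples.append({
                        "text": paragraph["context"],
                        "question": qas["question"],
                        "answer": answer_info["text"],
                    })
    return samples


toy_samples = flatten_squad(toy_squad)
for s in toy_samples:
    print(s["question"], "->", s["answer"])
```

Two consequences follow from this layout: a question with several gold answers appears several times in the collection, and unanswerable questions never enter it, so the evaluation below only covers answerable questions.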

In [11]:
# Shuffle in place, then evaluate on a random subset of 100 samples
np.random.shuffle(qa_collection)
for sample in tqdm(qa_collection[:100]):
    QA_input = {
        'question': sample["question"],
        'context': sample["text"]
    }
    # Store the full pipeline output (answer, score, start, end) on the sample
    sample["roberta_answer"] = nlp(QA_input)

100%|██████████| 100/100 [01:14<00:00,  1.34it/s]


In [12]:
exact_match = 0
soft_match = 0
wrong_prediction = []

for sample in qa_collection[:100]:
    question = sample['question']
    answer = sample['answer']
    if 'roberta_answer' in sample:
        roberta_answer = sample['roberta_answer']['answer']
    else:
        roberta_answer = "None"
        
    print("*********************")
    print(f"Question: {question}")
    print(f"Answer: {answer}")
    print(f"Roberta Answer: {roberta_answer}")
    
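    # Exact match: the prediction is identical to the gold answer
    if answer == roberta_answer:
        exact_match += 1
    # Soft match: one string contains the other (exact matches count here too)
    if roberta_answer in answer or answer in roberta_answer:
        soft_match += 1
    else:
        wrong_prediction.append(sample)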



*********************
Question: The Rhine forms an inland delta into which lake?
Answer: Lake Constance
Roberta Answer: Lake Constance
*********************
Question: How long is one term for an elected president of the CJEU?
Answer: three years
Roberta Answer: three years
*********************
Question: A formal design team may be assembled to do what?
Answer: plan the physical proceedings, and to integrate those proceedings with the other parts
Roberta Answer: plan the physical proceedings
*********************
Question: Stable and radioactive isotope studies provide insight into what?
Answer: geochemical evolution of rock units
Roberta Answer: geochemical evolution of rock units
*********************
Question: How can you find the absolute age of sedimentary rock units which do not contain radioactive isotopes?
Answer: Dating of lava and volcanic ash layers found within a stratigraphic sequence
Roberta Answer: Dating of lava and volcanic ash layers found within a stratigraphic seque

In [13]:
print(f"Exact Match accuracy: {exact_match / 100}")
print(f"Soft Match accuracy: {soft_match / 100}")

Exact Match accuracy: 0.69
Soft Match accuracy: 0.94
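
Plain string equality and substring containment are rough proxies for answer quality. The usual SQuAD evaluation first normalizes both strings (lowercasing, removing punctuation and the articles a/an/the) and, besides exact match, reports a token-level F1, taking the best score over all gold answers of a question. The following is a minimal sketch in that spirit, not the official evaluation script. The function names are chosen here for illustration, and the best-over-gold-answers step is omitted.

```python
import re
import string
from collections import Counter


def normalize_answer(s):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match_score(prediction, gold):
    return normalize_answer(prediction) == normalize_answer(gold)


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Gold answer and prediction taken from the sample output printed earlier.
gold = ("plan the physical proceedings, and to integrate those proceedings "
        "with the other parts")
pred = "plan the physical proceedings"
em = exact_match_score(pred, gold)
f1 = f1_score(pred, gold)
print(em, round(f1, 3))
```

Unlike the substring check, which counts this prediction as a full soft match, F1 gives it partial credit in proportion to the token overlap.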


In [14]:
for wrng in wrong_prediction:
    pprint(wrng)

{'answer': '100,000',
 'question': "How much of Paris' population was killed by the plague?",
 'roberta_answer': {'answer': 'Half',
                    'end': 203,
                    'score': 0.9059361219406128,
                    'start': 199},
 'text': 'The most widely accepted estimate for the Middle East, including '
         'Iraq, Iran and Syria, during this time, is for a death rate of about '
         "a third. The Black Death killed about 40% of Egypt's population. "
         "Half of Paris's population of 100,000 people died. In Italy, the "
         'population of Florence was reduced from 110–120 thousand inhabitants '
         'in 1338 down to 50 thousand in 1351. At least 60% of the population '
         'of Hamburg and Bremen perished, and a similar percentage of '
         'Londoners may have died from the disease as well. Interestingly '
         'while contemporary reports account of mass burial pits being created '
         'in response to the large numbers of dead