# Preliminaries


Write the installed package versions to a requirements file every time you run the notebook, in case you need to go back and recover the dependencies.

Requirements for each notebook are hosted in the companion GitHub repo and can be pulled down and installed here if needed. The companion repo is located at https://github.com/azunre/transfer-learning-for-nlp

In [1]:
!pip freeze > kaggle_image_requirements.txt
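If you would rather capture the versions from Python than from the shell, a minimal sketch could look like the one below. It assumes Python 3.8 or later (for `importlib.metadata`), and the helper name `freeze_requirements` is ours, not part of pip. Its output will not match `pip freeze` in every detail (editable installs, for example), so treat it as an approximation.

```python
from importlib import metadata


def freeze_requirements(path='kaggle_image_requirements.txt'):
    """Write one 'name==version' line per installed distribution to `path`."""
    lines = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata['Name']  # skip broken installs that report no name
    )
    with open(path, 'w') as handle:
        handle.writelines(line + '\n' for line in lines)
    return lines


# Usage (writes the file next to the notebook):
# pinned = freeze_requirements()
# print(len(pinned), 'packages recorded')
```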

# Question Answering

Let's run a question answering pipeline using BERT, on a COVID-19 pandemic related article from the World Economic Forum: https://www.weforum.org/agenda/2020/07/mask-mandates-and-other-lockdown-policies-reduced-the-spread-of-covid-19-in-the-us

We use only the summary of the article rather than the whole text, since question answering tends to work better on shorter passages. If no summary is available, you could generate one first with a summarization pipeline.
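A rough sketch of that summarize-first idea follows. The helper name `summarize_then_ask` is hypothetical, and it assumes that `summarizer` and `qa` behave like Hugging Face `summarization` and `question-answering` pipelines (the former returning a list of dicts with a `summary_text` key). We have not run it against the full article.

```python
def summarize_then_ask(article, question, summarizer, qa):
    """Shorten `article` with `summarizer`, then answer `question` over the summary."""
    summary = summarizer(article)[0]['summary_text']  # first (and only) summary
    return qa({'question': question, 'context': summary})


# Usage, assuming transformers is installed and `full_article_text` holds the article:
# from transformers import pipeline
# summarizer = pipeline('summarization')
# qa = pipeline('question-answering')
# print(summarize_then_ask(full_article_text, 'Which country is this article about?', summarizer, qa))
```

Passing the two pipelines in as arguments keeps the helper easy to try with different models.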

In [2]:
from transformers import pipeline



In [3]:
# These models would have been loaded by default, but we make them explicit for transparency.
# It is important to use a model that has been fine-tuned on SQuAD; otherwise results will be poor.
qNa = pipeline('question-answering',
               model='bert-large-cased-whole-word-masking-finetuned-squad',
               tokenizer='bert-large-cased-whole-word-masking-finetuned-squad')
paragraph = 'A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June. Containment policies had a large impact on the number of COVID-19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people’s behaviour. They account for roughly half the observed change in the growth rates of cases and deaths.'

In [4]:
ans = qNa({'question': 'What is this article about?',
           'context': f'{paragraph}'})
print(ans)

{'score': 0.47023460869354494, 'start': 148, 'end': 168, 'answer': 'Containment policies'}


In [5]:
ans = qNa({'question': 'Which country is this article about?',
           'context': f'{paragraph}'})
print(ans)

{'score': 0.795254447990601, 'start': 34, 'end': 36, 'answer': 'US'}


In [6]:
ans = qNa({'question': 'Which disease is discussed in this article?',
           'context': f'{paragraph}'})
print(ans)

{'score': 0.9761025334558902, 'start': 205, 'end': 213, 'answer': 'COVID-19'}


In [7]:
ans = qNa({'question': 'What time period is discussed in the article?',
           'context': f'{paragraph}'})
print(ans)

{'score': 0.21781831588181433, 'start': 71, 'end': 79, 'answer': '1 April,'}
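To make the returned fields concrete, the sketch below checks that `start` and `end` are character offsets into the context, using the four answers printed above. The `MIN_SCORE` cut-off is a hypothetical value picked only for illustration; nothing in the pipeline prescribes it.

```python
paragraph = (
    'A new study estimates that if the US had universally mandated masks on 1 April, '
    'there could have been nearly 40% fewer deaths by the start of June. '
    'Containment policies had a large impact on the number of COVID-19 cases and deaths, '
    'directly by reducing transmission rates and indirectly by constraining people’s behaviour. '
    'They account for roughly half the observed change in the growth rates of cases and deaths.'
)

# Answers copied from the outputs above.
answers = [
    {'score': 0.47023460869354494, 'start': 148, 'end': 168, 'answer': 'Containment policies'},
    {'score': 0.795254447990601, 'start': 34, 'end': 36, 'answer': 'US'},
    {'score': 0.9761025334558902, 'start': 205, 'end': 213, 'answer': 'COVID-19'},
    {'score': 0.21781831588181433, 'start': 71, 'end': 79, 'answer': '1 April,'},
]

MIN_SCORE = 0.5  # hypothetical threshold, for illustration only

verdicts = []
for ans in answers:
    span = paragraph[ans['start']:ans['end']]  # slice the context by character offsets
    assert span == ans['answer']
    verdict = 'keep' if ans['score'] >= MIN_SCORE else 'review'
    verdicts.append(verdict)
    print(f"{ans['answer']!r:25} score={ans['score']:.2f} -> {verdict}")
```

With this cut-off, the country and disease answers are kept, while the two more open-ended questions are flagged for a closer look.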


# Fill In The Blanks

Now, let's drop a word from some sentences and use the model to predict the most likely word in its place.

In [8]:
fill_mask = pipeline("fill-mask", model="bert-base-cased", tokenizer="bert-base-cased")

In [9]:
fill_mask("A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer [MASK] by the start of June")

[{'sequence': '[CLS] A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer deaths by the start of June [SEP]',
  'score': 0.19625532627105713,
  'token': 6209},
 {'sequence': '[CLS] A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer executions by the start of June [SEP]',
  'score': 0.11479416489601135,
  'token': 26107},
 {'sequence': '[CLS] A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer victims by the start of June [SEP]',
  'score': 0.0846652239561081,
  'token': 5256},
 {'sequence': '[CLS] A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer masks by the start of June [SEP]',
  'score': 0.0419488325715065,
  'token': 17944},
 {'sequence': '[CLS] A new study estimates that if the US had universally mandated masks

In [10]:
fill_mask("A new [MASK] estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June")

[{'sequence': '[CLS] A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer deaths by the start of June [SEP]',
  'score': 0.2471013069152832,
  'token': 2025},
 {'sequence': '[CLS] A new estimate estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer deaths by the start of June [SEP]',
  'score': 0.20276550948619843,
  'token': 10301},
 {'sequence': '[CLS] A new report estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer deaths by the start of June [SEP]',
  'score': 0.16086997091770172,
  'token': 2592},
 {'sequence': '[CLS] A new analysis estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40 % fewer deaths by the start of June [SEP]',
  'score': 0.0335063636302948,
  'token': 3622},
 {'sequence': '[CLS] A new survey estimates that if the US had universally mandated ma
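The full sequences are hard to scan, so here is a small sketch that pulls out only the predicted word from each result. The helper `filled_word` is ours, not part of the library. It assumes the words on either side of `[MASK]` come back from the tokenizer unchanged, which holds for this sentence but would fail next to text such as `40%`, which is returned as `40 %`.

```python
import re


def filled_word(template, sequence, mask='[MASK]'):
    """Return the word that replaced `mask`, located via its neighbouring words."""
    before, after = template.split(mask, 1)
    left_words, right_words = before.split(), after.split()
    # Fall back to the special tokens when the mask opens or closes the sentence.
    left = re.escape(left_words[-1]) if left_words else r'\[CLS\]'
    right = re.escape(right_words[0]) if right_words else r'\[SEP\]'
    match = re.search(left + r'\s+(\S+)\s+' + right, sequence)
    return match.group(1) if match else None


template = ('A new [MASK] estimates that if the US had universally mandated masks '
            'on 1 April, there could have been nearly 40% fewer deaths by the start of June')

# Shared ending of the sequences printed above.
TAIL = (' estimates that if the US had universally mandated masks on 1 April, '
        'there could have been nearly 40 % fewer deaths by the start of June [SEP]')

# First three predictions copied from the output above.
predictions = [
    {'sequence': '[CLS] A new study' + TAIL, 'score': 0.2471013069152832, 'token': 2025},
    {'sequence': '[CLS] A new estimate' + TAIL, 'score': 0.20276550948619843, 'token': 10301},
    {'sequence': '[CLS] A new report' + TAIL, 'score': 0.16086997091770172, 'token': 2592},
]

for pred in predictions:
    word = filled_word(template, pred['sequence'])
    print(f"{str(word):10} {pred['score']:.3f}")
```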

In [11]:
fill_mask("Containment [MASK] had a large impact on the number of COVID-19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people’s behaviour.")

[{'sequence': '[CLS] Containment has had a large impact on the number of COVID - 19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.2081695795059204,
  'token': 1144},
 {'sequence': '[CLS] Containment shortages had a large impact on the number of COVID - 19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.05083338916301727,
  'token': 25630},
 {'sequence': '[CLS] Containment management had a large impact on the number of COVID - 19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.02980988658964634,
  'token': 2635},
 {'sequence': '[CLS] Containment transport had a large impact on the number of COVID - 19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.028137

In [12]:
fill_mask("Containment policies had a large impact on the number of COVID-19 cases and deaths, directly by reducing [MASK] rates and indirectly by constraining people’s behaviour.")

[{'sequence': '[CLS] Containment policies had a large impact on the number of COVID - 19 cases and deaths, directly by reducing mortality rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.15623445808887482,
  'token': 14471},
 {'sequence': '[CLS] Containment policies had a large impact on the number of COVID - 19 cases and deaths, directly by reducing crime rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.08727061748504639,
  'token': 3755},
 {'sequence': '[CLS] Containment policies had a large impact on the number of COVID - 19 cases and deaths, directly by reducing birth rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.08088549971580505,
  'token': 3485},
 {'sequence': '[CLS] Containment policies had a large impact on the number of COVID - 19 cases and deaths, directly by reducing suicide rates and indirectly by constraining people ’ s behaviour. [SEP]',
  'score': 0.05626141279935837,
  'toke

# Next Sentence Prediction (NSP)

Let's try next sentence prediction: is sentence B a plausible follow-up to sentence A? This was one of the original training tasks for BERT.

We will need to upgrade transformers to version 3 or later to do next sentence prediction (NSP) in this notebook.

In [13]:
!pip install transformers==3.0.1 # upgrade transformers for NSP

Collecting transformers==3.0.1
  Downloading transformers-3.0.1-py3-none-any.whl (757 kB)
Collecting tokenizers==0.8.0-rc4
  Downloading tokenizers-0.8.0rc4-cp37-cp37m-manylinux1_x86_64.whl (3.0 MB)
ERROR: allennlp 1.0.0 has requirement transformers<2.12,>=2.9, but you'll have transformers 3.0.1 which is incompatible.
Installing collected packages: tokenizers, transformers
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.7.0
    Uninstalling tokenizers-0.7.0:
      Successfully uninstalled tokenizers-0.7.0
  Attempting uninstall: transformers
    Found existing installation: transformers 2.11.0
    Uninstalling transformers-2.11.0:
      Successfully uninstalled transformers-2.11.0
Successfully installed tokenizers-0.8.0rc4 transformers-3.0.1
You should consider upgrading via the '/opt/conda/bin/python3.7 -m p

We will need to use the Hugging Face library directly (rather than pipelines), because at the time of writing NSP has not yet been added to the pipeline API. This is a great opportunity to dig deeper, which you will need to do anyway to build anything truly novel.

In [14]:
from transformers import BertTokenizer, BertForNextSentencePrediction # NSP-specific BERT
import torch
from torch.nn.functional import softmax # for computing final probabilities from raw outputs

tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-cased')
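Before calling the model, it helps to picture how BERT lays out a sentence pair: `[CLS] A [SEP] B [SEP]`, with a segment id marking which sentence each token belongs to. The sketch below is a dependency-free illustration of that layout. The helper `pack_pair` is hypothetical, whitespace splitting stands in for WordPiece tokenization, and the first example sentence is made up.

```python
def pack_pair(tokens_a, tokens_b):
    """Lay out a sentence pair in BERT's input format: [CLS] A [SEP] B [SEP]."""
    tokens = ['[CLS]'] + tokens_a + ['[SEP]'] + tokens_b + ['[SEP]']
    # Segment ids: 0 for [CLS], sentence A and its [SEP]; 1 for sentence B and its [SEP].
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


# Whitespace splitting is an assumption made to keep the sketch self-contained.
tokens, segment_ids = pack_pair('Masks reduce transmission .'.split(),
                                'Cats are independent .'.split())
for token, segment in zip(tokens, segment_ids):
    print(f'{segment}  {token}')
```

In the library itself, the segment ids are returned as `token_type_ids` by `tokenizer.encode_plus(...)`, whereas `tokenizer.encode(...)` returns only the input ids.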

**Here is the article text again. As an experiment, we will drop the middle sentence:**

A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June. Containment policies had a large impact on the number of COVID-19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people’s behaviour. They account for roughly half the observed change in the growth rates of cases and deaths.

In [15]:
prompt = "A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June."
next_sentence = "Containment policies had a large impact on the number of COVID-19 cases and deaths, directly by reducing transmission rates and indirectly by constraining people’s behaviour."
encoding = tokenizer.encode(prompt, next_sentence, return_tensors='pt')
logits = model(encoding)[0]  # The output is a tuple; its first item holds the raw next-sentence scores for the pair
probs = softmax(logits[0], dim=0)  # Convert the raw scores into probabilities
print("Probabilities: [not plausible, plausible]")
print(probs)

Probabilities: [not plausible, plausible]
tensor([0.1725, 0.8275], grad_fn=<SoftmaxBackward>)
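For readers who want to see what `softmax` does to the two raw scores, here is a plain-Python sketch. The names `softmax_py` and `logit_gap` are ours. Since the notebook prints only the probabilities, we work backwards from them to the difference between the two logits instead of assuming the logits themselves.

```python
import math


def softmax_py(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_gap(probs):
    """For two classes, recover logit[1] - logit[0] from the probabilities."""
    return math.log(probs[1] / probs[0])


probs = [0.1725, 0.8275]  # probabilities printed above
gap = logit_gap(probs)
print(f'logit gap: {gap:.3f}')
print(softmax_py([0.0, gap]))  # reproduces the probabilities up to rounding
```

With only two classes, the probabilities depend solely on the gap between the two logits. It is also worth checking the index order against the installed library version: the Hugging Face documentation for `BertForNextSentencePrediction` describes label 0 as "sequence B is a continuation of sequence A" and label 1 as "sequence B is a random sequence".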


In [16]:
prompt = "A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June."
next_sentence = "They account for roughly half the observed change in the growth rates of cases and deaths."
encoding = tokenizer.encode(prompt, next_sentence, return_tensors='pt')
logits = model(encoding)[0]  # The output is a tuple; its first item holds the raw next-sentence scores for the pair
probs = softmax(logits[0], dim=0)  # Convert the raw scores into probabilities
print("Probabilities: [not plausible, plausible]")
print(probs)

Probabilities: [not plausible, plausible]
tensor([0.8242, 0.1758], grad_fn=<SoftmaxBackward>)


In [17]:
prompt = "A new study estimates that if the US had universally mandated masks on 1 April, there could have been nearly 40% fewer deaths by the start of June."
next_sentence = "Cats are independent."
encoding = tokenizer.encode(prompt, next_sentence, return_tensors='pt')
logits = model(encoding)[0]  # The output is a tuple; its first item holds the raw next-sentence scores for the pair
probs = softmax(logits[0], dim=0)  # Convert the raw scores into probabilities
print("Probabilities: [not plausible, plausible]")
print(probs)

Probabilities: [not plausible, plausible]
tensor([0.7666, 0.2334], grad_fn=<SoftmaxBackward>)
