<a href="https://colab.research.google.com/github/zapper59/NLP-Question-Answering/blob/master/panlp3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
%%shell
### Only needs to be run once per "runtime session"

git clone https://github.com/huggingface/pytorch-pretrained-BERT.git
cd pytorch-pretrained-BERT/
git checkout b8e2a9c5840e
python setup.py install
cd ..

git clone https://github.com/zapper59/NLP-Question-Answering.git
rm pytorch-pretrained-BERT/examples/run_squad.py
cp NLP-Question-Answering/bert_on_colab/run_squad.py pytorch-pretrained-BERT/examples/

Cloning into 'pytorch-pretrained-BERT'...
remote: Enumerating objects: 50, done.
remote: Counting objects: 100% (50/50), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 3173 (delta 19), reused 29 (delta 9), pack-reused 3123
Receiving objects: 100% (3173/3173), 1.54 MiB | 13.62 MiB/s, done.
Resolving deltas: 100% (2167/2167), done.
Note: checking out 'b8e2a9c5840e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at b8e2a9c Made --reduce_memory actually do something in finetune_on_pregenerated
running install
running bdist_egg
running egg_info
creating pytorch_pretrained_bert.egg-info
writing 



In [0]:
### Setup environment variables
import os
os.environ['pdir'] = 'NLP-Question-Answering/bert_on_colab'
SQUAD_VERSION = 2

### Get packages
!pip install fuzzywuzzy

Collecting fuzzywuzzy
  Downloading https://files.pythonhosted.org/packages/d8/f1/5a267addb30ab7eaa1beab2b9323073815da4551076554ecc890a3595ec9/fuzzywuzzy-0.17.0-py2.py3-none-any.whl
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.17.0


During training, you may see the GPU's memory usage climb high.
You can safely ignore this.
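
If you would rather check the memory usage than ignore it blindly, a helper along these lines can report it. This is only a sketch and not part of the original notebook: the names `parse_gpu_memory` and `gpu_memory_usage` are made up here, and it assumes the `nvidia-smi` tool is on the PATH (it returns `None` when it is not).

```python
import shutil
import subprocess

def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB rows (one GPU per line) into (used, total) int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(','))
        rows.append((used, total))
    return rows

def gpu_memory_usage():
    """Return [(used_mib, total_mib), ...] per GPU, or None if it cannot be queried."""
    # Assumption: nvidia-smi is installed; otherwise there is nothing to report.
    if shutil.which('nvidia-smi') is None:
        return None
    try:
        out = subprocess.run(
            ['nvidia-smi', '--query-gpu=memory.used,memory.total',
             '--format=csv,noheader,nounits'],
            check=True, capture_output=True, text=True).stdout
        return parse_gpu_memory(out)
    except (subprocess.CalledProcessError, OSError, ValueError):
        return None

print(gpu_memory_usage())
```

Running it in a separate cell before and after training gives a rough idea of how much memory the run holds on to.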

In [0]:
%%shell
### Train and also evaluate on test set

python $pdir/format_data.py --v2

rm -fr out
python pytorch-pretrained-BERT/examples/run_squad.py \
  --bert_model bert-base-cased \
  --do_train \
  --do_predict \
  --train_file $pdir/training.json \
  --predict_file $pdir/testing.json \
  --train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 25.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir out \
  --max_answer_length 8 \
  --version_2_with_negative

### Save most recently trained model
rm -fr saved_model
mv out saved_model

04/28/2019 19:40:22 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
04/28/2019 19:40:22 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt from cache at /root/.pytorch_pretrained_bert/5e8a2b4893d13790ed4150ca1906be5f7a03d6c4ddf62296c383f6db42814db2.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
04/28/2019 19:40:23 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased.tar.gz from cache at /root/.pytorch_pretrained_bert/distributed_-1/a803ce83ca27fecf74c355673c434e51c265fb8a3e0e57ac62a80e38ba98d384.681017f415dfb33ec8d0e04fe51a619f3f01532ecea04edbfd48c5d160550d9c
04/28/2019 19:40:23 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file /root/.pytorch_pretrained_bert/distributed_-1/a803ce83ca27fecf74c355673c434e51c265fb8a3e0e57a



In [0]:
### Print human friendly prediction results on test set
import json
from pprint import pprint
from fuzzywuzzy import fuzz

with open('NLP-Question-Answering/bert_on_colab/testing.json') as f:
  test = json.load(f)

if SQUAD_VERSION == 2:
  with open('saved_model/nbest_predictions.json') as f:
    nbest_preds = json.load(f)
    for i in nbest_preds:
      nbest_preds[i] = [{'text': p['text'], 'prob':p['probability']} for p in nbest_preds[i]]
else:
  with open('saved_model/predictions.json') as f:
    preds = json.load(f)

if SQUAD_VERSION == 2:
  golds = []
  for char in test['data']:
    for para in char['paragraphs']:
      qas = para['qas']
      for qa in qas:
        if not qa['is_impossible']:
          qid = qa['id']
          q = qa['question']
          gold = qa['answers'][0]['text']
          golds.append({'.question':q, 'gold': gold, 'preds': nbest_preds[qid][0:5]})
else:
  golds = []
  for char in test['data']:
    for para in char['paragraphs']:
      for qa in para['qas']:
        golds.append({'.question': qa['question'], 'gold': qa['answers'][0]['text']})

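  # Attach predictions once, after every gold entry has been collected
  # (relies on preds keeping the same question order as the test file)
  for i, p in enumerate(preds):
    golds[i]['pred'] = preds[p]['text']
    golds[i]['prob'] = preds[p]['probability']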

#pprint(golds)

### Get precision (right answer) and in-top-3 results
def print_results(golds):
  total = 0
  correct = 0
  intop3 = 0
  for j in golds:
    q = j['.question']
    g = j['gold']
    top = [e['text'] for e in j['preds']]
    top1 = top[0]
    if fuzz.token_set_ratio(g, top1) > 95:
      correct += 1

    top3 = top[0:3]
    if any(fuzz.token_set_ratio(g, x) > 95 for x in top3):
      intop3 += 1
    total += 1
  print('Precision: ' + str(correct/total))
  print('Top 3 Recall: ' + str(intop3/total))
print_results(golds)

### create list of queries for next part
with open('queries.txt', 'w') as f:
  for t in golds:
    f.write(t['.question']+'\n')

Precision: 0.6521739130434783
Top 3 Recall: 0.782608695652174
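
The two numbers above are the same measurement at two cutoffs: "Precision" counts a hit when the first prediction matches the gold answer, and "Top 3 Recall" counts a hit when any of the first three does. The sketch below shows that idea in isolation. It is not the notebook's code: the toy examples are invented, predictions are plain strings rather than dicts, and `loose_match` is a crude standard-library stand-in for `fuzz.token_set_ratio(...) > 95`, so its scores will not agree with fuzzywuzzy in general.

```python
def tokens(text):
    """Lowercased, whitespace-split token set."""
    return set(text.lower().split())

def loose_match(gold, pred):
    """Stand-in matcher: true when one non-empty token set contains the other."""
    g, p = tokens(gold), tokens(pred)
    return bool(g) and bool(p) and (g <= p or p <= g)

def top_k_hit_rate(examples, k, match=loose_match):
    """Fraction of examples whose gold answer matches any of the first k predictions."""
    if not examples:
        return 0.0
    hits = sum(
        any(match(ex['gold'], pred) for pred in ex['preds'][:k])
        for ex in examples
    )
    return hits / len(examples)

# Invented toy data, only to exercise the function
examples = [
    {'gold': 'Winterfell', 'preds': ['Winterfell', 'Pyke']},
    {'gold': 'Jon Arryn', 'preds': ['Aegon', 'Arryn', 'Wylla']},
    {'gold': 'Wylla', 'preds': ['Arya', 'Aegon', 'Pyke']},
]
print(top_k_hit_rate(examples, 1))  # first prediction only
print(top_k_hit_rate(examples, 3))  # any of the first three
```

Because the top-3 check includes the top-1 prediction, the second number can never be lower than the first.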


From this point on, training is done and you only need to run the code blocks below to test a query.

In [0]:
%%shell
### Do a single query using model in saved_model
q="Where was Jon Snow born and raised?"
python $pdir/format_query.py "$q"

rm -fr out
python pytorch-pretrained-BERT/examples/run_squad.py \
  --bert_model bert-base-cased \
  --do_predict \
  --predict_batch_size 32 \
  --predict_file $pdir/query.json \
  --max_seq_length 384 \
  --doc_stride 32 \
  --output_dir out \
  --max_answer_length 8 \
  --only_predict \
  --saved_model_dir saved_model \
  --version_2_with_negative


In [106]:
### Get answer
import collections
import json
import operator

if SQUAD_VERSION == 2:
  with open("out/nbest_predictions.json") as f:
    preds = json.load(f)
else:
  with open("out/predictions.json") as f:
    preds = json.load(f)

freq = collections.Counter()
sumscores = {}
minscores = {}
maxscores = {}

for i in preds:
  if SQUAD_VERSION == 2:
    ans = preds[i][0]['text']
    score = preds[i][0]['probability']
  else:
    ans = preds[i]['text']
    score = preds[i]['probability']
  if ans:
    freq[ans] += 1
    if ans in sumscores:
      sumscores[ans] += score
    else:
      sumscores[ans] = score
    if ans in minscores:
      minscores[ans] = min(minscores[ans], score)
    else:
      minscores[ans] = score
    if ans in maxscores:
      maxscores[ans] = max(maxscores[ans], score)
    else:
      maxscores[ans] = score

avgscores = {}
for text in freq:
  avgscores[text] = sumscores[text]/freq[text]

top_avgscores = sorted(avgscores.items(), key=operator.itemgetter(1), reverse=True)
top_minscores = sorted(minscores.items(), key=operator.itemgetter(1), reverse=True)
top_maxscores = sorted(maxscores.items(), key=operator.itemgetter(1), reverse=True)
print(top_avgscores)
print(top_minscores)
print(top_maxscores[0:5])
print(freq.most_common(1)[0][0])

[('Catelyn Tully', 0.9997352907172847), ('Wylla', 0.9974573814966258), ('Aegon', 0.9970453522698609), ('Jon Arryn', 0.9959338071990338), ('Winterfell', 0.9856000315105505), ('Arya', 0.9662199350637628), ('Aegon Targaryen', 0.9323313614996028), ('Theon was born at Pyke', 0.8062237328695417)]
[('Catelyn Tully', 0.9997352907172847), ('Wylla', 0.9974573814966258), ('Aegon', 0.9970453522698609), ('Jon Arryn', 0.9959338071990338), ('Winterfell', 0.975844259327718), ('Arya', 0.9662199350637628), ('Aegon Targaryen', 0.8480899592718012), ('Theon was born at Pyke', 0.8062237328695417)]
[('Catelyn Tully', 0.9997352907172847), ('Winterfell', 0.9993089649796849), ('Wylla', 0.9974573814966258), ('Aegon', 0.9970453522698609), ('Jon Arryn', 0.9959338071990338)]
Winterfell
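
In the output above, the ranking by average score puts 'Catelyn Tully' first, while the most frequent answer is 'Winterfell', so the choice of aggregation changes the final answer. The same bookkeeping can be written more compactly, as in the sketch below. The function names and toy scores are invented for illustration, and the tie-break by highest score in `most_frequent` is an assumption of this sketch (the cell above simply takes `freq.most_common(1)`).

```python
from collections import defaultdict

def aggregate_answers(scored_answers):
    """Group (answer, score) pairs and compute count, mean, min and max per answer."""
    grouped = defaultdict(list)
    for answer, score in scored_answers:
        if answer:  # skip empty "no answer" predictions, as the cell above does
            grouped[answer].append(score)
    return {
        answer: {
            'count': len(scores),
            'avg': sum(scores) / len(scores),
            'min': min(scores),
            'max': max(scores),
        }
        for answer, scores in grouped.items()
    }

def most_frequent(stats):
    """Answer seen most often; ties go to the higher max score (sketch assumption)."""
    if not stats:
        return None
    return max(stats, key=lambda a: (stats[a]['count'], stats[a]['max']))

# Invented toy scores, only to exercise the functions
toy = [('Winterfell', 0.99), ('Winterfell', 0.97), ('Wylla', 0.995), ('', 0.5)]
stats = aggregate_answers(toy)
print(stats)
print(most_frequent(stats))
```

Here 'Wylla' has the highest single score, but 'Winterfell' wins on frequency, which mirrors the disagreement seen in the real output.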


In [130]:
!cd NLP-Question-Answering/; git pull

remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 4 (delta 3), reused 4 (delta 3), pack-reused 0
Unpacking objects: 100% (4/4), done.
From https://github.com/zapper59/NLP-Question-Answering
   97da499..a52e00f  master     -> origin/master
Updating 97da499..a52e00f
Fast-forward
 bert_on_colab/analyze_predictions.py | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)


In [131]:
%%shell
### Do multiple queries to get results
rm predictions.txt
while read q
do
  python $pdir/format_query.py "$q"

  rm -fr out
  python pytorch-pretrained-BERT/examples/run_squad.py \
    --bert_model bert-base-cased \
    --do_predict \
    --predict_batch_size 32 \
    --predict_file $pdir/query.json \
    --max_seq_length 384 \
    --doc_stride 32 \
    --output_dir out \
    --max_answer_length 8 \
    --only_predict \
    --saved_model_dir saved_model \
    --version_2_with_negative > /dev/null 2>&1
  
  python $pdir/analyze_predictions.py "out/nbest_predictions.json"
done <queries.txt

rm: cannot remove 'predictions.txt': No such file or directory




In [132]:
with open('predictions.txt', 'r') as f:
  answers = f.readlines()
answers = [w.strip() for w in answers]
count = 0
correct = 0
for j in golds:
  q = j['.question']
  g = j['gold']
  top1 = answers[count]
  if fuzz.token_set_ratio(g, top1) > 95:
    correct +=1
  count += 1
print('Precision: ' + str(correct/count))

Precision: 0.4782608695652174
