# **This Google Colab notebook is the project work of:**

> **Srikanth Thirumalasetti**, **Roll# 2019900090**, **PGSSP Student, IIITH, Gachibowli, Hyderabad**

> **Course: CSE 573 (Spring 2020) - Natural Language Processing Applications (NLA)** 

# **About the project**
*   The project is a pilot implementation of BERT (*Base, Uncased, 12-layer, 768-hidden, 12-heads, 110M parameters*) for a closed-domain Q&A system.
*   This pre-trained BERT model is fine-tuned on the SQuAD 1.1 dataset.
*   The model is evaluated with the SQuAD v1.1 evaluation script, which compares the predictions of the fine-tuned model against the reference answers in the SQuAD dev set.
*   Additionally, textual content from Proxzar is used as an external test dataset.
*   A *qualitative* comparison of the current implementation of Q&A systems @Proxzar.ai *vis-a-vis* this BERT implementation was also planned for the performance summary in the final report.
*   The final fine-tuned model is planned to be saved as a TensorFlow 2.0 model in SavedModel format.
*   The model is trained on a single GPU provided by the Google Colab runtime environment.

# **Project Status**


> **Total 4 Action Items**



---
**Action Item 1:** 

> Fine-tune the pre-trained BERT base, uncased, 12-layer model on the SQuAD 1.1 dataset and save the result in **TF2 saved model** format.

> **Status**: 100% completed

> **ETC**: *April 18th 2020*

---

**Action Item 2:** 
> Performance Metrics Of Fine-tuned Model.

> **Status**: 100% completed

> **ETC**: *April 22nd 2020*

---

**Action Item 3:** 
> Generate a qualitative comparison report that compares the existing implementation of Q&A systems @Proxzar.ai with this BERT implementation.

> **Status**: Partially Completed (Errored out due to bug)

> **ETC**: *April 24th 2020*

---

**Action Item 4:** 
> Submit the project as per the institution guidelines.

> **Status**: TBD

> **ETC**: *May 1st 2020*

---



# **Action Item 1**

# Map Google Drive Locally



In [0]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [0]:
### IMP #####################################################
# Run the below symlink mapping only once per session i.e. 
# when runtime is started or re-started for the first time.
#############################################################
# -f: do not fail when the link does not exist yet (first run of a session).
!rm -f /mydrive
!ln -s "/content/drive/My Drive" /mydrive
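
The same mapping can also be done from Python so that re-running the cell is harmless. The following is only a minimal sketch: the helper name `ensure_symlink` is hypothetical, and the demonstration uses a scratch directory as a stand-in for the real Drive mount.

```python
import os
import tempfile


def ensure_symlink(target: str, link_path: str) -> None:
    """Create or refresh a symlink at link_path that points to target."""
    # islink() is True even for a dangling symlink, so stale links are replaced.
    if os.path.islink(link_path):
        os.remove(link_path)
    elif os.path.exists(link_path):
        # Refuse to overwrite a real file or directory.
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    os.symlink(target, link_path)


# Demonstration with stand-in paths (assumption: not the real /content/drive).
scratch = tempfile.mkdtemp()
drive_dir = os.path.join(scratch, "My Drive")
os.makedirs(drive_dir)
link = os.path.join(scratch, "mydrive")

ensure_symlink(drive_dir, link)
ensure_symlink(drive_dir, link)  # safe to repeat within a session
print(os.path.realpath(link) == os.path.realpath(drive_dir))
```

In the notebook itself the call would presumably be `ensure_symlink("/content/drive/My Drive", "/mydrive")`, provided the runtime user may write to `/`.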


# Install external libraries

In [0]:
!pip install tqdm
# Colab switched its default to TensorFlow 2.x on March 27th, 2020.
# Replace the default build with TensorFlow 2.1.0.
!pip uninstall tensorflow # remove the Colab default build
!pip install tensorflow==2.1.0 # install TensorFlow version 2 as it has tight integration with Keras

Uninstalling tensorflow-2.2.0rc3:
  Would remove:
    /usr/local/bin/estimator_ckpt_converter
    /usr/local/bin/saved_model_cli
    /usr/local/bin/tensorboard
    /usr/local/bin/tf_upgrade_v2
    /usr/local/bin/tflite_convert
    /usr/local/bin/toco
    /usr/local/bin/toco_from_protos
    /usr/local/lib/python3.6/dist-packages/tensorflow-2.2.0rc3.dist-info/*
    /usr/local/lib/python3.6/dist-packages/tensorflow/*
Proceed (y/n)? y
  Successfully uninstalled tensorflow-2.2.0rc3
Collecting tensorflow==2.1.0
  Downloading https://files.pythonhosted.org/packages/85/d4/c0cd1057b331bc38b65478302114194bd8e1b9c2bbc06e300935c0e93d90/tensorflow-2.1.0-cp36-cp36m-manylinux2010_x86_64.whl (421.8MB)
Collecting tensorboard<2.2.0,>=2.1.0
  Downloading https://files.pythonhosted.org/packages/d9/41/bbf49b61370e4f4d245d4c6051dfb6db80cec672605c91b1652ac8cc3d38/tensorboard-2.1.1-py3-none-any.whl (3.8MB)

# Import external modules

In [0]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math
import datetime
from tqdm import tqdm
import tensorflow as tf

# Define global variables

In [0]:
################################################################################################################
# The variables below are stored as ENV variables because the training script runs in a shell
# (using GPU / TPU for parallel processing), i.e. in new, independent processes. ENV variables stay
# visible to the Python scripts launched from bash / the command prompt.
# For general info, the variables used by the BERT training / prediction scripts are also listed as
# "local" variables in a later section.
################################################################################################################
os.environ['LOCAL_DIR']="/mydrive"

#os.environ['NIGHTLY_BUILD_DIR']='/usr/local/lib/python3.6/dist-packages/official' # use TF_SQUAD_DIR instead

# SQuAD specific:
os.environ['SQUAD_TRG_DATA_FILE']=os.path.join(os.environ['LOCAL_DIR'],'bert_finetuning_outputs','train-v1.1.json')
!echo "(SQUAD_TRG_DATA_FILE) SQuAD training data file with full path is: " ${SQUAD_TRG_DATA_FILE}

os.environ['SQUAD_PRED_FILE']=os.path.join(os.environ['LOCAL_DIR'],'bert_finetuning_outputs','dev-v1.1.json')
!echo "(SQUAD_PRED_FILE) SQuAD prediction file with full path is: " ${SQUAD_PRED_FILE}

############################################################
# The variables below apply when the project is cloned via GitHub
#os.environ['TF_SQUAD_DIR']=os.path.join(os.environ['LOCAL_DIR'],'bert_tf')

os.environ['TF_SQUAD_DIR']='/usr/local/lib/python3.6/dist-packages/official'
!echo "(TF_SQUAD_DIR) TensorFlow SQuAD parent directory with Python scripts used for fine-tuning is: " ${TF_SQUAD_DIR}
############################################################

os.environ['SQUAD_VERSION']='v1.1'
!echo "SQuAD version being used for fine-tuning is: " ${SQUAD_VERSION}

# BERT specific:
os.environ['BERT_MODEL_TO_FINE_TUNE_ON_TF_HUB']='https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1'
!echo "(BERT_MODEL_TO_FINE_TUNE_ON_TF_HUB) BERT model being fine-tuned is: " ${BERT_MODEL_TO_FINE_TUNE_ON_TF_HUB}

os.environ['BERT_DIR']=os.path.join(os.environ['LOCAL_DIR'],'(REL_VER)_bert_tf_OLD', 'BERT_Pretrained_Models','bert_uncased_base')
!echo "(BERT_DIR) BERT directory that has pre-trained model being fine-tuned is: " ${BERT_DIR}

os.environ['OUTPUT_DIR']=os.path.join(os.environ['LOCAL_DIR'],'bert_finetuning_outputs')
!echo "(OUTPUT_DIR) Output directory to save model and predictions is: " ${OUTPUT_DIR}

os.environ['KERAS_SAVED_MODEL_OUTPUT_DIR']=os.path.join(os.environ['LOCAL_DIR'],'bert_finetuning_outputs','keras_saved_model')
!echo "(KERAS_SAVED_MODEL_OUTPUT_DIR) Output directory to save fine-tuned model as Keras model is: " ${KERAS_SAVED_MODEL_OUTPUT_DIR}

############################################################
# Uncomment below lines when the project external scripts are cloned via github
#os.environ['CREATE_FINETUNING_DATA_SCRIPT']=os.path.join('models','official', 'nlp', 'data', 'create_finetuning_data.py')
#!echo "(CREATE_FINETUNING_DATA_SCRIPT) Python script (with relative path to project root) that tokenizes and prepares data for fine tuning is: " ${CREATE_FINETUNING_DATA_SCRIPT}

#os.environ['FINETUNING_TRG_SCRIPT']=os.path.join('models','official', 'nlp', 'bert' , 'run_squad.py')
#!echo "(FINETUNING_TRG_SCRIPT) Python script (with relative path to project root) that trains or fine tunes is: " ${FINETUNING_TRG_SCRIPT}

#os.environ['FINAL_EVALUATION_SCRIPT']=os.path.join('models','official', 'nlp', 'bert' , 'evaluate-v1.1.py')
#!echo "(FINAL_EVALUATION_SCRIPT) Python script (with relative path to project root) that evaluates the predictions is: " ${FINAL_EVALUATION_SCRIPT}

#os.environ['MODEL_SAVE_SCRIPT']=os.path.join('models','official', 'nlp', 'bert' , 'model_saving_utils.py')
#!echo "(MODEL_SAVE_SCRIPT) Python script used to save the fine-tuned model to Keras model." ${MODEL_SAVE_SCRIPT}

# Comment below lines when the project external scripts are cloned via github
os.environ['CREATE_FINETUNING_DATA_SCRIPT']=os.path.join(os.environ['TF_SQUAD_DIR'], 'nlp', 'data', 'create_finetuning_data.py')
!echo "(CREATE_FINETUNING_DATA_SCRIPT) Python script (with absolute path) that tokenizes and prepares data for fine tuning is: " ${CREATE_FINETUNING_DATA_SCRIPT}

os.environ['FINETUNING_TRG_SCRIPT']=os.path.join(os.environ['TF_SQUAD_DIR'], 'nlp', 'bert' , 'run_squad.py')
!echo "(FINETUNING_TRG_SCRIPT) Python script (with absolute path) that trains or fine tunes is: " ${FINETUNING_TRG_SCRIPT}

os.environ['FINAL_EVALUATION_SCRIPT']=os.path.join(os.environ['OUTPUT_DIR'], 'evaluate-v1.1.py')
!echo "(FINAL_EVALUATION_SCRIPT) Python script (with absolute path) that evaluates the predictions is: " ${FINAL_EVALUATION_SCRIPT}

os.environ['MODEL_SAVE_SCRIPT']=os.path.join(os.environ['TF_SQUAD_DIR'], 'nlp', 'bert' , 'model_saving_utils.py')
!echo "(MODEL_SAVE_SCRIPT) Python script used to save the fine-tuned model to Keras model." ${MODEL_SAVE_SCRIPT}

############################################################


(SQUAD_TRG_DATA_FILE) SQuAD training data file with full path is:  /mydrive/bert_finetuning_outputs/train-v1.1.json
(SQUAD_PRED_FILE) SQuAD prediction file with full path is:  /mydrive/bert_finetuning_outputs/dev-v1.1.json
(TF_SQUAD_DIR) TensorFlow SQuAD parent directory with Python scripts used for fine-tuning is:  /usr/local/lib/python3.6/dist-packages/official
SQuAD version being used for fine-tuning is:  v1.1
(BERT_MODEL_TO_FINE_TUNE_ON_TF_HUB) BERT model being fine-tuned is:  https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1
(BERT_DIR) BERT directory that has pre-trained model being fine-tuned is:  /mydrive/(REL_VER)_bert_tf_OLD/BERT_Pretrained_Models/bert_uncased_base
(OUTPUT_DIR) Output directory to save model and predictions is:  /mydrive/bert_finetuning_outputs
(KERAS_SAVED_MODEL_OUTPUT_DIR) Output directory to save fine-tuned model as Keras model is:  /mydrive/bert_finetuning_outputs/keras_saved_model
(CREATE_FINETUNING_DATA_SCRIPT) Python script (with absolute 
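
Every later cell depends on these paths, so a quick existence check can catch a missing Drive mount or an unset variable before a long run starts. The following is a minimal sketch and not part of the original notebook: the helper name `missing_paths` is hypothetical, and the list of variable names is simply copied from the cell above.

```python
import os


def missing_paths(var_names, env=os.environ):
    """Return {name: value} for variables that are unset or point to a missing path."""
    problems = {}
    for name in var_names:
        value = env.get(name)
        if value is None or not os.path.exists(value):
            problems[name] = value
    return problems


# Variable names taken from the cell above; extend the list as needed.
required = [
    "SQUAD_TRG_DATA_FILE",
    "SQUAD_PRED_FILE",
    "CREATE_FINETUNING_DATA_SCRIPT",
    "FINETUNING_TRG_SCRIPT",
    "FINAL_EVALUATION_SCRIPT",
]

for name, value in missing_paths(required).items():
    print(f"{name} is unset or points to a missing path: {value!r}")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.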

# Get TensorFlow Models from GitHub

In [0]:
### IMP ####################################################################################################
# Installing the TensorFlow Models API via GitHub gives errors about missing attributes, such as CallbackList.
# Hence, the lines below are commented out; the nightly build is installed instead, as suggested in the TF models readme.
#os.chdir(os.environ['TF_SQUAD_DIR'])
#print("\nCurrent working directory is: " + os.getcwd() + "\n")
#!git clone 'https://github.com/tensorflow/models.git'
#os.environ['PYTHONPATH'] += ":" + os.path.join(os.environ['TF_SQUAD_DIR'], 'models')
#!pip3 install --user -r models/official/requirements.txt
############################################################################################################

# Install ONLY ONCE per VM: check whether the folder os.environ['TF_SQUAD_DIR'] exists (it is reset in a new VM).
# If the folder doesn't exist, uncomment the line below to install the tf-models-nightly package.
#!pip install tf-models-nightly


Collecting tf-models-nightly
  Downloading https://files.pythonhosted.org/packages/0f/8f/ceccfb078dc3ef3e2b8c0b3b682f864e35e995bd712736e5c490b9bcf253/tf_models_nightly-2.2.0.dev20200420-py2.py3-none-any.whl (765kB)

# Variables used for training / fine-tuning BERT model (for info only)

In [0]:
### BEGIN #########################################################################################################
# The below variables are NOT required as we are using Tensorflow hub published model of BERT. These
# variables are left here for future experiments to import BERT embeddings and weights via checkpoint file.

# The config json file corresponding to the pre-trained BERT model and that specifies the model architecture.
#bert_config_file = os.path.join(LOCAL_DIR,'bertmodel','uncased_L-12_H-768_A-12','bert_config.json')

# The vocabulary file that the BERT model was trained on.
#vocab_file = os.path.join($BERT_DIR,'vocab.txt')

# Get weights and other variables from the pre-trained BERT saved model file.
#init_checkpoint = os.path.join(LOCAL_DIR,$BERT_DIR, 'bert_model.ckpt.data-00000-of-00001')
### END #########################################################################################################

# The output directory where the model checkpoints will be written.
#output_dir = os.path.join($OUTPUT_DIR, 'model_out')

#### BEGIN ##################################################################################################################################
# 'run_squad.py', as used in TF to fine-tune BERT for the QA task, writes predictions but does not score them.
#   - During fine-tuning, a predictions file (named by the variable 'output_prediction_file') is generated in the folder given by the variable 'output_dir'.
#   - After training / fine-tuning with 'run_squad.py' completes, evaluation is done by running another script, 'squad_evaluate_v1_1.py'.
#   - That evaluation script takes two args, viz., 1) SQuAD's dev-v1.1.json, and 2) predictions.json (generated during fine-tuning).

# SQuAD json for training / fine-tuning
#train_file = $SQUAD_TRG_DATA_FILE

# SQuAD json for evaluating predictions generated in the file 'predictions.json' during fine-tuning
#predict_file = $SQUAD_PRED_FILE

# Output file to log predictions.
#output_prediction_file = os.path.join(output_dir,'predictions.json')
#### END ##################################################################################################################################

# Whether to lower case the input - True for uncased models / False for cased models.
#do_lower_case = True

#### BEGIN ##################################################################################################################################
# The maximum total input sequence length after WordPiece tokenization.
# Sequences longer than this will be truncated, and sequences shorter than this will be padded.
#   - Internally, padding is taken care of by the script 'create_finetuning_data.py', which is run
#     before the main training script 'run_squad.py'; it sets the
#     input mask to 0 for padding tokens, which do not take part in attention.
#max_seq_length = $MAX_SEQ_LEN

# When splitting up a long document into chunks, how much stride to take between chunks.
#doc_stride = 128
#### END ##################################################################################################################################

# The maximum number of tokens for the question. Questions longer than this will be truncated to this length.
#max_query_length = 64

# Whether to run training / fine-tuning
#do_train = True 

# Whether to run eval on the dev set.
#do_predict = True

# Total batch size for training. 
#train_batch_size = 32 # not applicable in our case

# Total batch size for predictions
#predict_batch_size = 8

# The initial learning rate for Adam.
#learning_rate = 5e-5

# Total number of training epochs to perform.
#num_train_epochs = 3.0

# Proportion of training over which to perform linear learning rate warmup, e.g., 0.1 = 10% of training.
#warmup_proportion = 0.1 # not applicable in our case

# How often to save the model checkpoint.
#save_checkpoints_steps = 1000

# How many steps to make in each estimator call.
#iterations_per_loop = 1000

# The total number of n-best predictions to generate in the nbest_predictions.json output file.
#n_best_size = 10

# The maximum length of an answer that can be generated. 
# This is needed because the start and end predictions are not conditioned on one another.
#max_answer_length = 30

# Whether to use TPU or GPU/CPU (we are using GPU for training as this is the most practical scenario for us in future).
# Fine-tuning is estimated to take a few hours for BERT (large). Since we are fine-tuning on GPU, the model has been
# changed to BERT (base) instead of BERT (large) as originally planned.
#use_tpu = False

# The Cloud TPU to use for training. This should be either the name 
# used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 url.
#tpu_name = None

# [Optional] GCE zone where the Cloud TPU is located in. If not
# specified, we will attempt to automatically detect the GCE project from metadata.
#tpu_zone = None
# [Optional] Project name for the Cloud TPU-enabled project. If not 
# specified, we will attempt to automatically detect the GCE project from metadata.
#gcp_project = None

# [Optional] TensorFlow master URL.
#master = None

# Only used if `use_tpu` is True. Total number of TPU cores to use.
#num_tpu_cores = None

# If true, all of the warnings related to data processing will be printed. 
# A number of warnings are expected for a normal SQuAD evaluation.
#verbose_logging = False

# If true, the SQuAD examples contain some that do not have an answer (SQuAD 2.0).
#version_2_with_negative = False

# If null_score - best_non_null is greater than the threshold predict null.
#null_score_diff_threshold = 0.0
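
To make the interplay of `max_seq_length`, `max_query_length` and `doc_stride` more concrete, the sketch below shows one way a long passage can be cut into overlapping windows. It is an illustration under stated assumptions, not the code from `create_finetuning_data.py`: the function name `doc_spans` is hypothetical, and it assumes three positions are reserved for `[CLS]` and the two `[SEP]` tokens.

```python
def doc_spans(num_doc_tokens, num_query_tokens, max_seq_length=512, doc_stride=128):
    """Return (start, length) windows that cover a tokenized passage."""
    # Assumption: 3 positions are reserved for [CLS], [SEP], [SEP].
    max_tokens_for_doc = max_seq_length - num_query_tokens - 3
    if max_tokens_for_doc <= 0:
        raise ValueError("query leaves no room for passage tokens")
    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(num_doc_tokens - start, max_tokens_for_doc)
        spans.append((start, length))
        if start + length == num_doc_tokens:
            break  # the last window reaches the end of the passage
        start += min(length, doc_stride)
    return spans


# A 1000-token passage with a 12-token question.
print(doc_spans(1000, 12))
# A short passage fits into a single window.
print(doc_spans(100, 12))
```

With the defaults above, consecutive windows start 128 tokens apart, so each answer is likely to appear in at least one window with context on both sides.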


# Import SQuAD 1.1 dataset to LOCAL_DIR

In [0]:
# SQuAD 1.1 dataset is downloaded into Google drive (/content/drive/My Drive/bert_finetuned_model)

# Load, pre-process and tokenize (train) dataset for fine-tuning

In [0]:
#################################################################################################
# The below python script tokenizes the input data in the file ${SQUAD_TRG_DATA_FILE} and 
# writes inputs to a tensorflow record that is eventually written to the drive (${OUTPUT_DIR}).
# This tf.record is read by the next script 'run_squad.py', which actually does fine-tuning.
#################################################################################################
print("\nCurrent working directory is: " + os.getcwd() + "\n")
!python ${CREATE_FINETUNING_DATA_SCRIPT} \
 --squad_data_file=${SQUAD_TRG_DATA_FILE} \
 --vocab_file=${OUTPUT_DIR}/vocab.txt \
 --train_data_output_path=${OUTPUT_DIR}/squad_${SQUAD_VERSION}_train.tf_record \
 --meta_data_file_path=${OUTPUT_DIR}/squad_${SQUAD_VERSION}_meta_data \
 --fine_tuning_task_type=squad --max_seq_length=512


Current working directory is: /content

I0420 19:06:20.358043 140407150372736 squad_lib.py:353] *** Example ***
I0420 19:06:20.358261 140407150372736 squad_lib.py:354] unique_id: 1000000000
I0420 19:06:20.358927 140407150372736 squad_lib.py:355] example_index: 0
I0420 19:06:20.358995 140407150372736 squad_lib.py:356] doc_span_index: 0
I0420 19:06:20.359112 140407150372736 squad_lib.py:358] tokens: [CLS] to whom did the virgin mary allegedly appear in 1858 in lou ##rdes france ? [SEP] architectural ##ly , the school has a catholic character . atop the main building ' s gold dome is a golden statue of the virgin mary . immediately in front of the main building and facing it , is a copper statue of christ with arms up ##rai ##sed with the legend " ve ##ni ##te ad me om ##nes " . next to the main building is the basilica of the sacred heart . immediately behind the basilica is the gr ##otto , a marian place of prayer and reflection . it is a replica of the gr ##otto at lou ##rdes , france
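
The `##` pieces in the logged tokens (for example `lou ##rdes` and `gr ##otto`) come from WordPiece tokenization. As a rough illustration only, the greedy longest-match-first idea can be sketched as below. The function name `wordpiece` and the toy vocabulary are assumptions made for this example; the real script reads `vocab.txt` and also handles lower-casing, punctuation splitting and other details.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one lower-cased word into WordPiece tokens by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary (hypothetical; the real vocab.txt is far larger).
toy_vocab = {"lou", "##rdes", "gr", "##otto", "france"}
for w in ["lourdes", "grotto", "france", "xyz"]:
    print(w, "->", wordpiece(w, toy_vocab))
```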

# Start fine-tuning

In [0]:
#################################################################################################
# This script does the actual fine-tuning. However, it does not evaluate the performance of 
# training. It only writes the predictions after training / fine-tuning the model 
# to a file 'predictions.json' in the folder ${OUTPUT_DIR}.
# Actual evaluation is done by running another script that compares it with the file: ${SQUAD_PRED_FILE}.
#################################################################################################
print("\nCurrent working directory is: " + os.getcwd() + "\n")
!python ${FINETUNING_TRG_SCRIPT} \
  --input_meta_data_path=${OUTPUT_DIR}/squad_${SQUAD_VERSION}_meta_data \
  --train_data_path=${OUTPUT_DIR}/squad_${SQUAD_VERSION}_train.tf_record \
  --predict_file=${SQUAD_PRED_FILE} \
  --vocab_file=${OUTPUT_DIR}/vocab.txt \
  --bert_config_file=${OUTPUT_DIR}/bert_config.json \
  --hub_module_url=${BERT_MODEL_TO_FINE_TUNE_ON_TF_HUB} \
  --train_batch_size=8 \
  --predict_batch_size=8 \
  --learning_rate=8e-5 \
  --num_train_epochs=2 \
  --mode=train_and_predict \
  --model_dir=${OUTPUT_DIR} \
  --distribution_strategy=mirrored \
  --log_steps=50 \
  --run_eagerly=False \
  --do_lower_case=True

Streaming output truncated to the last 5000 lines.
I0420 22:02:31.359487 139904526645120 model_training_utils.py:450] Train Step: 17302/21936  / loss = 0.046177130192518234
I0420 22:02:31.899845 139904526645120 model_training_utils.py:450] Train Step: 17303/21936  / loss = 0.8529201745986938
I0420 22:02:32.437416 139904526645120 model_training_utils.py:450] Train Step: 17304/21936  / loss = 0.8002827763557434
I0420 22:02:32.974557 139904526645120 model_training_utils.py:450] Train Step: 17305/21936  / loss = 0.2637821137905121
I0420 22:02:33.517811 139904526645120 model_training_utils.py:450] Train Step: 17306/21936  / loss = 0.12173084169626236
I0420 22:02:34.055825 139904526645120 model_training_utils.py:450] Train Step: 17307/21936  / loss = 0.3473479151725769
I0420 22:02:34.599259 139904526645120 model_training_utils.py:450] Train Step: 17308/21936  / loss = 0.09881244599819183
I0420 22:02:35.136355 139904526645120 model_training_utils.py:450] Train Step: 17309/21936 
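
After training, the answers written to `predictions.json` are derived from per-token start and end scores. The sketch below illustrates the general idea behind `n_best_size` and `max_answer_length`: because start and end are predicted independently, candidate pairs are filtered and the best-scoring valid span is kept. It is a simplified assumption, not the logic of `run_squad.py`; the function name `best_span` and the toy scores are hypothetical.

```python
def best_span(start_logits, end_logits, n_best_size=10, max_answer_length=30):
    """Return (start, end, score) of the best valid span, or None if there is none."""

    def top_indexes(logits):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        return ranked[:n_best_size]

    best = None
    for s in top_indexes(start_logits):
        for e in top_indexes(end_logits):
            if e < s:
                continue  # the end cannot come before the start
            if e - s + 1 > max_answer_length:
                continue  # span longer than allowed
            score = start_logits[s] + end_logits[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


# Toy scores for a four-token passage (made up for illustration).
toy_start = [0.1, 5.0, 0.2, 0.3]
toy_end = [4.0, 0.1, 3.0, 0.2]
print(best_span(toy_start, toy_end))
```

Here the highest end score sits at token 0, before the best start at token 1, so that pair is rejected and the span covering tokens 1 to 2 is chosen instead.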

# Evaluate fine-tuned model using SQuAD official evaluation script

*   The scores below can be compared with others on the SQuAD leaderboard.

In [0]:
!python ${FINAL_EVALUATION_SCRIPT} ${SQUAD_PRED_FILE} ${OUTPUT_DIR}/predictions.json

{"exact_match": 79.60264900662251, "f1": 87.47596501756003}
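
For readers unfamiliar with the two reported numbers, the sketch below shows how exact match and token-level F1 are commonly computed for SQuAD v1.1 style answers: both strings are normalized (lower-casing, punctuation and article removal, whitespace cleanup) and then compared. This is a simplified re-implementation for illustration, not the official script itself; the official evaluation additionally takes the maximum over all reference answers of a question and averages over the dataset. The example strings are made up.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lower-case, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts tokens shared by both answers.
    common = collections.Counter(pred_tokens) & collections.Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration only.
print(exact_match("The Virgin Mary.", "virgin mary"))
print(f1_score("the Virgin Mary", "Saint Virgin Mary"))
```

The second call gives partial credit: two of the three reference tokens are recovered, so recall is 2/3 while precision is 1.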


# Save BERT fine-tuned model as TF 2.0 model in saved model format

In [0]:
#################################################################################################
# This script saves the fine-tuned model in 'saved_model' format for inference.
#################################################################################################
print("\nModel is exported to: " + os.environ['OUTPUT_DIR'] + '/exported_model' + "\n")
!python ${FINETUNING_TRG_SCRIPT} \
  --input_meta_data_path=${OUTPUT_DIR}/squad_${SQUAD_VERSION}_meta_data \
  --vocab_file=${OUTPUT_DIR}/vocab.txt \
  --bert_config_file=${OUTPUT_DIR}/bert_config.json \
  --mode=export_only \
  --model_dir=${OUTPUT_DIR} \
  --model_export_path=${OUTPUT_DIR}/exported_model \
  --init_checkpoint=${OUTPUT_DIR}/ctl_step_21936.ckpt-2

Streaming output truncated to the last 5000 lines.
      dtype=float32)>, <tf.Variable 'transformer/layer_2/output_layer_norm/gamma:0' shape=(768,) dtype=float32, numpy=
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,

# **Action Item 2**

# Run predictions using the fine-tuned model with external test set from Proxzar

Predictions were run in a separate Colab notebook, because the current version of **run_squad.py** from the *TensorFlow official* GitHub repo throws attribute errors. I had to switch from the official repo to *kamalkraj's GitHub repo*, which has a modified version of the **run_squad.py** file.
The Colab URL is: [Roll#2019900090 Project - Predictions](https://colab.research.google.com/drive/1rAwGpApMhJeOkoePWMO-rZS9dfMoQOQd#scrollTo=t4PVfgAGuida)

# Performance Metrics Of Fine-tuned Model

On the SQuAD 1.1 dev set, the fine-tuned model achieved the following scores:
**EM score**: **79.60**,
**F1 score**: **87.48**


# **Action Item 3**

# Generate qualitative comparison report with existing implementation

Even after switching to a different GitHub repo that purportedly addresses the issues with TensorFlow's official **run_squad.py**, the new repo's files still have issues *writing the predictions to disk*.

Hence, the performance summary report on the final predictions could not be generated, and this fine-tuned BERT model could not be compared with Proxzar's current implementation, which uses Stanford CoreNLP and the IIITB Research Labs SIML framework.