This Colab notebook was created by [Pragnakalp Techlabs](https://www.pragnakalp.com/).

You can copy this notebook to your Drive and then execute the commands in the given order. For more details, see our blog post [NLP Tutorial: Setup Question Answering System using BERT + SQuAD on Colab TPU](https://www.pragnakalp.com/nlp-tutorial-setup-question-answering-system-bert-squad-colab-tpu/)

Check our [BERT based Question and Answering system demo for English and other 8 languages](https://www.pragnakalp.com/demos/BERT-NLP-QnA-Demo/).

You can also [purchase the Demo of our BERT based QnA system including fine-tuned models](https://www.pragnakalp.com/bert-question-n-answering-system-in-python/).

## **BERT Fine-tuning and Prediction on SQuAD 2.0 using Cloud TPU!**

---



### **Overview**
**BERT**, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. The academic paper can be found here: https://arxiv.org/abs/1810.04805.

**SQuAD** (Stanford Question Answering Dataset) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

This Colab notebook shows how to fine-tune BERT on the SQuAD dataset and then how to run prediction. With it you can create your own **Question Answering System.**

**Prerequisite**: You will need a GCP (Google Cloud Platform) account and a GCS (Google Cloud Storage) bucket to run this notebook.

Please follow the Google Cloud documentation to create a GCP account and a GCS bucket. You get $300 in free credit to get started with any GCP product. You can learn more about it at https://cloud.google.com/tpu/docs/setup-gcp-account

You can create your GCS bucket here: http://console.cloud.google.com/storage.


### **Change Runtime to TPU**

> On the main menu, click on **Runtime** and select **Change runtime type**. Set "**TPU**" as the hardware accelerator.


### **Clone the BERT GitHub repository**


> The first step is to clone the BERT GitHub repository, using the command below.



In [6]:
!git clone https://github.com/google-research/bert.git

Cloning into 'bert'...


### **Confirm that the BERT repo is cloned properly**


> "ls -l" produces a long listing. If the BERT repo was cloned properly, you will see the bert folder in the current directory.



In [7]:
ls -l

 Volume in drive C has no label.
 Volume Serial Number is 8C95-8F49

 Directory of C:\Users\alfre\Code\jupyter



File Not Found


In [8]:
cd bert

C:\Users\alfre\Code\jupyter\bert


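### **BERT repository files**


> Use ls -l to check the contents of the BERT folder; you should see all the files related to BERT. The cell below prints the current working directory to confirm that you are inside the folder.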



In [12]:
import os as os_obj
print(os_obj.getcwd())

C:\Users\alfre\Code\jupyter\bert
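Because `ls -l` is not available in every environment, the check can also be done in Python. The following is a minimal sketch, not part of the original notebook: the helper name `missing_repo_files` is hypothetical, and the list of expected files is an assumption about what sits at the top level of the cloned repository.

```python
import os

# Assumption: these scripts sit at the top level of the cloned BERT repository.
EXPECTED_FILES = ("run_squad.py", "modeling.py", "optimization.py", "tokenization.py")


def missing_repo_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in repo_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(repo_dir, name))]


if __name__ == "__main__":
    # Run from inside the bert folder; an empty list means the clone looks complete.
    missing = missing_repo_files(os.getcwd())
    print("Missing files:", missing or "none")
```

If the list is not empty, the clone probably failed or the notebook is not inside the bert folder.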


### **Download the BERT PRETRAINED MODEL**


BERT Pretrained Model List :


*   [BERT-Large, Uncased (Whole Word Masking)](https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip) : 24-layer, 1024-hidden, 16-heads, 340M parameters
*   [BERT-Large, Cased (Whole Word Masking)](https://storage.googleapis.com/bert_models/2019_05_30/wwm_cased_L-24_H-1024_A-16.zip) : 24-layer, 1024-hidden, 16-heads, 340M parameters
*   [BERT-Base, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip) : 12-layer, 768-hidden, 12-heads, 110M parameters
*   [BERT-Large, Uncased](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-24_H-1024_A-16.zip) : 24-layer, 1024-hidden, 16-heads, 340M parameters
*   [BERT-Base, Cased](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip) : 12-layer, 768-hidden, 12-heads, 110M parameters
*   [BERT-Large, Cased](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-24_H-1024_A-16.zip) : 24-layer, 1024-hidden, 16-heads, 340M parameters
*   [BERT-Base, Multilingual Cased (New, recommended)](https://storage.googleapis.com/bert_models/2018_11_23/multi_cased_L-12_H-768_A-12.zip) : 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
*   [BERT-Base, Multilingual Uncased (Orig, not recommended; use Multilingual Cased instead)](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip) : 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
*   [BERT-Base, Chinese](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip) : Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters
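The archive names in the list above encode the model size as `L-<layers>_H-<hidden size>_A-<attention heads>`. As a small illustration, the sketch below reads those three numbers back out of a name or URL. The function `parse_model_name` is a hypothetical helper written for this example, not part of the BERT repository.

```python
import re


def parse_model_name(name):
    """Extract layer count, hidden size and head count from a BERT archive name."""
    match = re.search(r"L-(\d+)_H-(\d+)_A-(\d+)", name)
    if match is None:
        raise ValueError("not a recognised BERT model name: {!r}".format(name))
    layers, hidden_size, heads = (int(group) for group in match.groups())
    return {"layers": layers, "hidden_size": hidden_size, "attention_heads": heads}


if __name__ == "__main__":
    print(parse_model_name("uncased_L-24_H-1024_A-16"))
```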

The BERT authors have released **BERT-Base** and **BERT-Large** models. Uncased means that the text has been lowercased before WordPiece tokenization, e.g., John Smith becomes john smith, whereas Cased means that the true case and accent markers are preserved.

**When using a cased model, make sure to pass --do_lower_case=False at the time of training.**
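To make the Uncased/Cased distinction concrete, here is a rough sketch of the normalization an uncased model expects: lowercase the text, then strip accent markers. It only approximates the preprocessing in the repository's tokenization code and does not perform WordPiece splitting. The function name `uncased_normalize` is made up for this example.

```python
import unicodedata


def uncased_normalize(text):
    """Lowercase text and drop combining accent marks."""
    lowered = text.lower()
    # NFD splits an accented character into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Category "Mn" (nonspacing mark) covers the combining accents.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    print(uncased_normalize("John Smith"))
```

A cased model skips this step, which is why the lowercasing flag must be turned off for it.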

You can download any model of your choice. We have used **BERT-Large-Uncased Model.**


In [22]:
!"C:\Program Files (x86)\GnuWin32\bin\wget" --no-check-certificate https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-24_H-1024_A-16.zip

SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
--2020-05-12 16:31:32--  https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-24_H-1024_A-16.zip
Resolving storage.googleapis.com... 172.217.164.144, 2607:f8b0:4004:c09::80
Connecting to storage.googleapis.com|172.217.164.144|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 1247797031 (1.2G) [application/zip]
Saving to: `uncased_L-24_H-1024_A-16.zip'

1211350K .......... .......... .......... .......... .......... 99% 43.3M 0s
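The later step that moves the model to the bucket refers to an unpacked folder named `uncased_L-24_H-1024_A-16`, so the downloaded archive has to be extracted first. That step is not shown in this section. The following is a minimal sketch using Python's standard `zipfile` module, assuming the archive is in the current directory. The helper `extract_model` is hypothetical.

```python
import os
import zipfile


def extract_model(zip_path, dest_dir="."):
    """Extract a model archive into dest_dir and return the names it contained."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


if __name__ == "__main__":
    zip_name = "uncased_L-24_H-1024_A-16.zip"  # archive downloaded in the cell above
    if os.path.exists(zip_name):
        for name in extract_model(zip_name):
            print(name)
```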

### **Download the SQuAD 2.0 Dataset**

In [0]:
#Download the SQUAD train and dev dataset
!wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
!wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json

--2020-03-17 10:11:42--  https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
Resolving rajpurkar.github.io (rajpurkar.github.io)... 185.199.108.153, 185.199.111.153, 185.199.110.153, ...
Connecting to rajpurkar.github.io (rajpurkar.github.io)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42123633 (40M) [application/json]
Saving to: ‘train-v2.0.json’


2020-03-17 10:11:42 (158 MB/s) - ‘train-v2.0.json’ saved [42123633/42123633]

--2020-03-17 10:11:45--  https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json
Resolving rajpurkar.github.io (rajpurkar.github.io)... 185.199.108.153, 185.199.111.153, 185.199.110.153, ...
Connecting to rajpurkar.github.io (rajpurkar.github.io)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4370528 (4.2M) [application/json]
Saving to: ‘dev-v2.0.json’


2020-03-17 10:11:45 (22.3 MB/s) - ‘dev-v2.0.json’ saved [4370528/4370528]
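Before training, it can help to look at how the dataset is organised. A SQuAD 2.0 file is a JSON object whose `data` list holds articles; each article has `paragraphs`, and each paragraph has a `context` plus a list of `qas`, where unanswerable questions carry `is_impossible: true`. The sketch below counts both kinds of question. The `SAMPLE` record is invented for illustration and is not taken from the dataset.

```python
import json


def count_questions(squad):
    """Return (answerable, unanswerable) question counts for a parsed SQuAD 2.0 dict."""
    answerable = unanswerable = 0
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    unanswerable += 1
                else:
                    answerable += 1
    return answerable, unanswerable


# Invented miniature example that follows the SQuAD 2.0 layout.
SAMPLE = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "BERT was introduced in 2018.",
            "qas": [
                {"id": "q1", "question": "When was BERT introduced?",
                 "answers": [{"text": "2018", "answer_start": 23}],
                 "is_impossible": False},
                {"id": "q2", "question": "Who painted BERT?",
                 "answers": [], "is_impossible": True},
            ],
        }],
    }],
}

if __name__ == "__main__":
    print(count_questions(SAMPLE))
```

To run it on the real data, load the file with `json.load(open("train-v2.0.json"))` and pass the result to `count_questions`.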



### **Set up your TPU environment**
*   Verify that you are connected to a TPU device
*   Find your TPU address, which is used at the time of fine-tuning
*   Perform Google authentication to access your bucket
*   Upload your credentials to the TPU so that it can access your GCS bucket

In [0]:
import datetime
import json
import os
import pprint
import random
import string
import sys
import tensorflow as tf

assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
print('TPU address is => ', TPU_ADDRESS)

from google.colab import auth
auth.authenticate_user()
with tf.Session(TPU_ADDRESS) as session:
  print('TPU devices:')
  pprint.pprint(session.list_devices())

  # Upload credentials to TPU.
  with open('/content/adc.json', 'r') as f:
    auth_info = json.load(f)
  tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
  # Now credentials are set for all future sessions on this TPU.

TPU address is =>  grpc://10.29.165.74:8470
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

TPU devices:
[_DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:CPU:0, CPU, -1, 7635328332459542216),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 2874832984123831798),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 4040080567667641550),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 5679213459339138745),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 13541815098229104680),
 _DeviceAttributes(/job:tpu_worker/replica:0/task

### **Create output directory**

> You need to create an output directory in your GCS (Google Cloud Storage) bucket, where the fine-tuned model will be saved once training completes. For that, provide your BUCKET name and OUTPUT DIRECTORY name.

> You also need to move the pre-trained model to the GCS bucket, because the local file system is not supported on TPU. If you don't move your pre-trained model to the bucket, you may face an error.




In [0]:
BUCKET = 'bertnlpdemo' #@param {type:"string"}
assert BUCKET, '*** Must specify an existing GCS bucket name ***'
output_dir_name = 'bert_output' #@param {type:"string"}
BUCKET_NAME = 'gs://{}'.format(BUCKET)
OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET,output_dir_name)
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))

***** Model output directory: gs://bertnlpdemo/bert_output *****


### **Move Pretrained Model to GCS Bucket**


> The pre-trained model must be moved to the GCS (Google Cloud Storage) bucket, because the local file system is not supported on TPU. If you don't move your pre-trained model to the bucket, you may face an error.



> The **gsutil** **mv** command allows you to move data between your local file system and the cloud, move data within the cloud, and move data between cloud storage providers.




In [0]:
!gsutil mv /content/bert/uncased_L-24_H-1024_A-16 $BUCKET_NAME

Copying file:///content/bert/uncased_L-24_H-1024_A-16/bert_config.json [Content-Type=application/json]...
Removing file:///content/bert/uncased_L-24_H-1024_A-16/bert_config.json...
Copying file:///content/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.index [Content-Type=application/octet-stream]...
Removing file:///content/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.index...
Copying file:///content/bert/uncased_L-24_H-1024_A-16/vocab.txt [Content-Type=text/plain]...
Removing file:///content/bert/uncased_L-24_H-1024_A-16/vocab.txt...
Copying file:///content/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.meta [Content-Type=application/octet-stream]...
Removing file:///content/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.meta...

==> NOTE: You are performing a sequence of gsutil operations that may
run significantly faster if you instead use gsutil -m cp ... Please
see the -m section under "gsutil help options" for further information
about when gsutil -m can be advantageous.

Copying f

### **Training**

> Below is the command to run the training. To train on TPU, make sure the following flags are set: use_tpu must be True, and tpu_name must be the TPU address that we found above.

1.   --use_tpu=True
2.   --tpu_name=YOUR_TPU_ADDRESS
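
The value for `--tpu_name` is a `grpc://` address. As a rough sketch, assuming the Colab TPU runtime exposes the TPU host and port in the `COLAB_TPU_ADDR` environment variable, the address could be built like this. The function name `tpu_address_from_env` is hypothetical.

```python
import os


def tpu_address_from_env(env=os.environ):
    """Build the grpc:// TPU address from COLAB_TPU_ADDR (assumed to be set by Colab)."""
    host = env.get("COLAB_TPU_ADDR")
    if not host:
        raise RuntimeError("No TPU found; change the runtime type to TPU first.")
    return "grpc://" + host


# Example with an explicit mapping, so the sketch also runs outside Colab.
print(tpu_address_from_env({"COLAB_TPU_ADDR": "10.1.118.82:8470"}))  # grpc://10.1.118.82:8470
```

The TPU address changes between Colab sessions, so the hard-coded value in the training command needs to be replaced with your own.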





In [0]:
!python run_squad.py \
  --vocab_file=$BUCKET_NAME/uncased_L-24_H-1024_A-16/vocab.txt \
  --bert_config_file=$BUCKET_NAME/uncased_L-24_H-1024_A-16/bert_config.json \
  --init_checkpoint=$BUCKET_NAME/uncased_L-24_H-1024_A-16/bert_model.ckpt \
  --do_train=True \
  --train_file=train-v2.0.json \
  --do_predict=True \
  --predict_file=dev-v2.0.json \
  --train_batch_size=24 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --use_tpu=True \
  --tpu_name=grpc://10.1.118.82:8470 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --version_2_with_negative=True \
  --output_dir=$OUTPUT_DIR




W1203 11:55:59.995234 139660688828288 module_wrapper.py:139] From run_squad.py:1127: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.


W1203 11:55:59.995468 139660688828288 module_wrapper.py:139] From run_squad.py:1127: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.


W1203 11:55:59.995662 139660688828288 module_wrapper.py:139] From /content/bert/modeling.py:93: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.


W1203 11:56:01.214218 139660688828288 module_wrapper.py:139] From run_squad.py:1133: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related op

### **Create Testing File**


> We create input_file.json as a blank JSON file and then write the data into it in SQuAD format.


*   **touch** creates an empty file
*   **%%writefile** writes the contents of the cell to a file in Colab



> You can put your own questions and context in the file below.


In [0]:
!touch input_file.json

In [0]:
%%writefile input_file.json
{
    "version": "v2.0",
    "data": [
        {
            "title": "your_title",
            "paragraphs": [
                {
                    "qas": [
                        {
                            "question": "Who is current CEO?",
                            "id": "56ddde6b9a695914005b9628",
                            "is_impossible": ""
                        },
                        {
                            "question": "Who founded google?",
                            "id": "56ddde6b9a695914005b9629",
                            "is_impossible": ""
                        },
                        {
                            "question": "when did IPO take place?",
                            "id": "56ddde6b9a695914005b962a",
                            "is_impossible": ""
                        }
                    ],
                    "context": "Google was founded in 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California. Together they own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock. They incorporated Google as a privately held company on September 4, 1998. An initial public offering (IPO) took place on August 19, 2004, and Google moved to its headquarters in Mountain View, California, nicknamed the Googleplex. In August 2015, Google announced plans to reorganize its various interests as a conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and will continue to be the umbrella company for Alphabet's Internet interests. Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the CEO of Alphabet."                
                 }
            ]
        }
    ]
}

Overwriting input_file.json
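
If you have many questions, writing the JSON by hand gets tedious. Here is a minimal sketch that builds the same structure from a context string and a list of questions. The helper `build_squad_input`, the `q0`, `q1`, ... id scheme, and the output file name `input_file_generated.json` are all hypothetical; the ids only need to be unique.

```python
import json


def build_squad_input(context, questions, title="your_title"):
    """Build a SQuAD-style dict with one paragraph and one entry per question."""
    qas = [
        # Mirrors the hand-written file above: empty is_impossible, unique id.
        {"question": q, "id": "q{}".format(i), "is_impossible": ""}
        for i, q in enumerate(questions)
    ]
    return {
        "version": "v2.0",
        "data": [{"title": title, "paragraphs": [{"qas": qas, "context": context}]}],
    }


payload = build_squad_input(
    "Google was founded in 1998 by Larry Page and Sergey Brin.",
    ["Who founded google?", "When was google founded?"],
)
# Written to a separate file so the hand-written input_file.json is not overwritten.
with open("input_file_generated.json", "w") as f:
    json.dump(payload, f, indent=4)
```

To use the generated file, pass its name to `--predict_file` instead of `input_file.json`.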


### **Prediction**


> Below is the command to run your own custom prediction: edit input_file.json to provide your paragraph and questions, then execute the command.
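
The `--init_checkpoint` flag in the prediction command points at `model.ckpt-10859`, where the number is the final training step. As far as we can tell, run_squad.py derives the step count from the number of training examples, the batch size and the number of epochs, so the number will differ if you change any of these. The sketch below reproduces that arithmetic. The figure of 130319 training questions in train-v2.0.json is an assumption.

```python
def num_train_steps(num_examples, train_batch_size, num_train_epochs):
    """Total training steps, truncated to an integer."""
    return int(num_examples / train_batch_size * num_train_epochs)


# 130319 is the assumed number of questions in train-v2.0.json;
# 24 and 2.0 are the batch size and epochs from the training command.
print(num_train_steps(130319, 24, 2.0))  # 10859
```

If your checkpoint has a different number, check the files in your output directory and use that name instead.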



In [0]:
!python run_squad.py \
  --vocab_file=$BUCKET_NAME/uncased_L-24_H-1024_A-16/vocab.txt \
  --bert_config_file=$BUCKET_NAME/uncased_L-24_H-1024_A-16/bert_config.json \
  --init_checkpoint=$OUTPUT_DIR/model.ckpt-10859 \
  --do_train=False \
  --max_query_length=30  \
  --do_predict=True \
  --predict_file=input_file.json \
  --predict_batch_size=8 \
  --n_best_size=3 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=output/




W1207 12:10:47.304228 140202182711168 module_wrapper.py:139] From run_squad.py:1127: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.


W1207 12:10:47.304487 140202182711168 module_wrapper.py:139] From run_squad.py:1127: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.


W1207 12:10:47.304673 140202182711168 module_wrapper.py:139] From /content/bert/modeling.py:93: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.


W1207 12:10:48.564931 140202182711168 module_wrapper.py:139] From run_squad.py:1133: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related op
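
Once the prediction run finishes, run_squad.py writes its answers to a `predictions.json` file in the output directory, keyed by question id. As a minimal sketch, assuming that file layout, the answers can be matched back to the questions like this. The helper `load_predictions` is hypothetical.

```python
import json
import os


def load_predictions(output_dir, input_path):
    """Pair each question in the input file with its predicted answer text."""
    with open(os.path.join(output_dir, "predictions.json")) as f:
        answers = json.load(f)  # assumed shape: {question_id: answer_text}
    with open(input_path) as f:
        squad = json.load(f)
    rows = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Questions without a prediction get an empty string.
                rows.append((qa["question"], answers.get(qa["id"], "")))
    return rows


# Only runs when a prediction file is actually present.
if os.path.exists("output/predictions.json"):
    for question, answer in load_predictions("output", "input_file.json"):
        print("{} -> {}".format(question, answer))
```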