# Code for running the BERT Model on the Dataset

All the code for running the BERT model is uploaded to Google Drive in a folder called "BERT".

We use Google Colab in this experiment because the BERT model is memory-hungry and requires at least one GPU to run.

Note: 
* The TPU option can be used for the model to run faster.
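
Before starting a long run, it can help to confirm that the runtime actually has a GPU attached. The snippet below is a minimal sketch, assuming PyTorch is installed in the Colab runtime; `describe_accelerator` is a hypothetical helper name and is not part of the uploaded code. It only reports CUDA GPUs and does not detect a TPU.

```python
def describe_accelerator():
    """Return a short description of the GPU visible to PyTorch, if any."""
    try:
        import torch  # assumption: PyTorch is available in the runtime
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # Device 0 is the first (and on Colab usually the only) GPU.
        return "GPU available: " + torch.cuda.get_device_name(0)
    return "No GPU visible to PyTorch"


print(describe_accelerator())
```

If this reports that no GPU is visible, the hardware accelerator setting of the notebook is the first thing to check.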


First, mount your Google Drive in Google Colab using the code below.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


Change directory to the "BERT" folder, where all our code for running the BERT model was uploaded.

In [None]:
%cd /content/drive/My\ Drive/BERT/

List of files in the "BERT" folder before running the BERT model

In [None]:
!ls -lrth

total 125K
drwx------ 2 root root 4.0K Jan  6 09:15 data
-rw------- 1 root root 3.1K Jan  6 12:34 convert_tf_checkpoint_to_pytorch.py
-rw------- 1 root root  20K Jan  6 12:34 run_classifier_TABSA.py
-rw------- 1 root root 8.6K Jan  6 12:34 tokenization.py
-rw------- 1 root root  18K Jan  6 12:34 processor.py
-rw------- 1 root root 7.0K Jan  6 12:34 optimization.py
-rw------- 1 root root  20K Jan  6 12:34 modeling.py
drwx------ 2 root root 4.0K Jan  6 12:55 __pycache__
drwx------ 2 root root 4.0K Jan  6 13:01 uncased_L-12_H-768_A-12
drwx------ 2 root root 4.0K Feb 14 14:21 4_aspect_results
-rw------- 1 root root  16K Feb 14 14:46 evaluation_4aspects.py
-rw------- 1 root root  16K Feb 14 14:46 evaluation.py
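
Rather than comparing the listing by eye, the presence of the scripts and folders can be checked programmatically. This is a minimal sketch: the names in `REQUIRED` are copied from the listing above, while `missing_entries` is a hypothetical helper introduced only for illustration.

```python
from pathlib import Path

# Names taken from the folder listing above.
REQUIRED = [
    "data",
    "convert_tf_checkpoint_to_pytorch.py",
    "run_classifier_TABSA.py",
    "tokenization.py",
    "processor.py",
    "optimization.py",
    "modeling.py",
    "uncased_L-12_H-768_A-12",
    "evaluation.py",
]


def missing_entries(folder, required=REQUIRED):
    """Return the required names that do not exist inside `folder`."""
    root = Path(folder)
    return [name for name in required if not (root / name).exists()]


# "." is the current directory, i.e. the "BERT" folder after the %cd above.
print("Missing:", missing_entries("."))
```

An empty list means every expected file and folder was found.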


Checking the server uptime

In [None]:
!uptime

 14:53:14 up 5 min,  0 users,  load average: 0.03, 0.15, 0.08


# Prepare BERT-pytorch-model

This step converts the TensorFlow checkpoint to the PyTorch model used in our experiment, following the approach in https://github.com/huggingface/pytorch-pretrained-BERT

Before running this code, download and unzip the BERT-Base model (Uncased: 12-layer, 768-hidden, 12-heads, 110M parameters) into the "SENTITEL_BERT" folder. The model can be obtained from https://github.com/google-research/bert. Several models are available there; choose "BERT-Base, Uncased".
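
Since several models are offered for download, a quick look at the unzipped configuration file can confirm that the right one was picked. The sketch below assumes that `bert_config.json` stores the architecture under the keys `num_hidden_layers`, `hidden_size` and `num_attention_heads`; `config_mismatches` is a hypothetical helper, not part of the uploaded code.

```python
import json

# Values stated above for BERT-Base: 12 layers, 768 hidden units, 12 heads.
# Assumption: these are the key names used in bert_config.json.
EXPECTED = {
    "num_hidden_layers": 12,
    "hidden_size": 768,
    "num_attention_heads": 12,
}


def config_mismatches(config_path, expected=EXPECTED):
    """Return {key: (found, wanted)} for every setting that differs."""
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    return {
        key: (config.get(key), wanted)
        for key, wanted in expected.items()
        if config.get(key) != wanted
    }
```

Calling `config_mismatches("uncased_L-12_H-768_A-12/bert_config.json")` should return an empty dictionary when the downloaded model matches the description above.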

In [None]:
!python convert_tf_checkpoint_to_pytorch.py \
--tf_checkpoint_path uncased_L-12_H-768_A-12/bert_model.ckpt \
--bert_config_file uncased_L-12_H-768_A-12/bert_config.json \
--pytorch_dump_path uncased_L-12_H-768_A-12/pytorch_model.bin

Converting TensorFlow checkpoint from uncased_L-12_H-768_A-12/bert_model.ckpt
Loading bert/embeddings/LayerNorm/beta with shape [768]
Numpy array shape (768,)
Loading bert/embeddings/LayerNorm/gamma with shape [768]
Numpy array shape (768,)
Loading bert/embeddings/position_embeddings with shape [512, 768]
Numpy array shape (512, 768)
Loading bert/embeddings/token_type_embeddings with shape [2, 768]
Numpy array shape (2, 768)
Loading bert/embeddings/word_embeddings with shape [30522, 768]
Numpy array shape (30522, 768)
Loading bert/encoder/layer_0/attention/output/LayerNorm/beta with shape [768]
Numpy array shape (768,)
Loading bert/encoder/layer_0/attention/output/LayerNorm/gamma with shape [768]
Numpy array shape (768,)
Loading bert/encoder/layer_0/attention/output/dense/bias with shape [768]
Numpy array shape (768,)
Loading bert/encoder/layer_0/attention/output/dense/kernel with shape [768, 768]
Numpy array shape (768, 768)
Loading bert/encoder/layer_0/attention/self/key/bias with sh

# Training the BERT Model



Note: 
* The model runs for about 1 hour on 1 TPU. You need to be patient while it runs; maybe get a cup of coffee.
* It takes about 3 hours on 1 GPU.

The option of either using a GPU or TPU can be set when starting this notebook.

In [None]:
!CUDA_VISIBLE_DEVICES=0,1,2,3 python run_classifier_TABSA.py \
--task_name sentihood_NLI_M \
--data_dir data/bert-pair/ \
--vocab_file uncased_L-12_H-768_A-12/vocab.txt \
--bert_config_file uncased_L-12_H-768_A-12/bert_config.json \
--init_checkpoint uncased_L-12_H-768_A-12/pytorch_model.bin \
--eval_test \
--do_lower_case \
--max_seq_length 128 \
--train_batch_size 24 \
--learning_rate 2e-5 \
--num_train_epochs 6.0 \
--output_dir results/sentitel/NLI_M \
--seed 42

Streaming output truncated to the last 5000 lines.
Iteration:  11% 95/836 [00:56<07:24,  1.67it/s]
Iteration:  11% 96/836 [00:56<07:23,  1.67it/s]
Iteration:  12% 97/836 [00:57<07:22,  1.67it/s]
Iteration:  12% 98/836 [00:58<07:22,  1.67it/s]
Iteration:  12% 99/836 [00:58<07:21,  1.67it/s]
Iteration:  12% 100/836 [00:59<07:21,  1.67it/s]
Iteration:  12% 101/836 [00:59<07:20,  1.67it/s]
Iteration:  12% 102/836 [01:00<07:20,  1.67it/s]
Iteration:  12% 103/836 [01:01<07:19,  1.67it/s]
Iteration:  12% 104/836 [01:01<07:19,  1.66it/s]
Iteration:  13% 105/836 [01:02<07:19,  1.66it/s]
Iteration:  13% 106/836 [01:02<07:18,  1.66it/s]
Iteration:  13% 107/836 [01:03<07:18,  1.66it/s]
Iteration:  13% 108/836 [01:04<07:17,  1.66it/s]
Iteration:  13% 109/836 [01:04<07:17,  1.66it/s]
Iteration:  13% 110/836 [01:05<07:16,  1.66it/s]
Iteration:  13% 111/836 [01:05<07:16,  1.66it/s]
Iteration:  13% 112/836 [01:06<07:15,  1.66it/s]
Iter
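
The progress bar gives a rough way to cross-check the running time quoted above. The sketch below assumes that the 836 iterations shown cover a single epoch and that the rate of about 1.67 iterations per second stays constant across all six epochs; neither assumption is confirmed by the truncated output.

```python
iterations_per_epoch = 836    # total shown in the progress bar
iterations_per_second = 1.67  # rate shown in the progress bar
num_train_epochs = 6          # from --num_train_epochs 6.0

seconds_per_epoch = iterations_per_epoch / iterations_per_second
total_minutes = seconds_per_epoch * num_train_epochs / 60

print(f"about {seconds_per_epoch / 60:.1f} min per epoch, "
      f"{total_minutes:.0f} min in total")
```

Under these assumptions training alone takes roughly 50 minutes, which is in line with the figure of about 1 hour given in the note.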

# Evaluating the BERT model 

Evaluate the results on the test set.
The following evaluation metrics are calculated:
 * Acc
 * Macro-F1
 * Macro-AUC
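
For readers unfamiliar with these metrics, the sketch below shows the generic textbook definitions of accuracy and macro-averaged F1 in plain Python. It is not the code in `evaluation.py`, which may compute its aspect-level scores differently; the function names and the toy labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for this example only.
gold = ["Positive", "Negative", "None", "None"]
pred = ["Positive", "None", "None", "None"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

Macro averaging gives every class the same weight, so a rare class that is always missed (here "Negative") pulls the score down more than it affects accuracy.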

In [None]:
!python evaluation.py --task_name _NLI_M --pred_data_dir results/sentitel/NLI_M/test_ep_4.txt

AUC per aspect Calls, CustomerService, Data, General, Network
[0.9666567645561025, 0.95964558445812, 0.9412558587211249, 0.9577943191895306, 0.9263490315154401]
aspect_strict_Acc = 0.6734333627537511
aspect_Macro_F1 = 0.7810532651311622
aspect_Macro_AUC = 0.9503403116880638
sentiment_Acc = 0.9399815327793167
sentiment_Macro_AUC = 0.9648625266876616
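
The reported `aspect_Macro_AUC` appears to be the unweighted mean of the five per-aspect AUC values printed above. The short check below reproduces it from those numbers; the dictionary layout is only an illustration and not how `evaluation.py` stores its results.

```python
# Per-aspect AUC values copied from the output above.
auc_per_aspect = {
    "Calls": 0.9666567645561025,
    "CustomerService": 0.95964558445812,
    "Data": 0.9412558587211249,
    "General": 0.9577943191895306,
    "Network": 0.9263490315154401,
}

# Macro average: every aspect contributes equally.
macro_auc = sum(auc_per_aspect.values()) / len(auc_per_aspect)
print(f"{macro_auc:.6f}")
```

The result agrees with the reported value of 0.9503403116880638 up to floating-point rounding.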
