# Ensemble Modeling for Toxic Spans Detection

### Author: Yakoob Khan

### Load code from Google Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%cd "drive/My Drive/system/src"

/content/drive/My Drive/system/src


### Install Dependencies

In [3]:
!pip install -r requirements.txt

Collecting sacremoses==0.0.43
[?25l  Downloading https://files.pythonhosted.org/packages/7d/34/09d19aff26edcc8eb2a01bed8e98f13a1537005d31e95233fd48216eed10/sacremoses-0.0.43.tar.gz (883kB)
[K     |████████████████████████████████| 890kB 8.6MB/s 
Collecting tokenizers==0.10.1
[?25l  Downloading https://files.pythonhosted.org/packages/71/23/2ddc317b2121117bf34dd00f5b0de194158f2a44ee2bf5e47c7166878a97/tokenizers-0.10.1-cp37-cp37m-manylinux2010_x86_64.whl (3.2MB)
[K     |████████████████████████████████| 3.2MB 27.7MB/s 
Collecting transformers==4.3.3
[?25l  Downloading https://files.pythonhosted.org/packages/f9/54/5ca07ec9569d2f232f3166de5457b63943882f7950ddfcc887732fc7fb23/transformers-4.3.3-py3-none-any.whl (1.9MB)
[K     |████████████████████████████████| 1.9MB 56.3MB/s 
Building wheels for collected packages: sacremoses
  Building wheel for sacremoses (setup.py) ... [?25l[?25hdone
  Created wheel for sacremoses: filename=sacremoses-0.0.43-cp37-none-any.whl size=893262 sha256=d4

### GPU Info

In [4]:
# Credit: https://colab.research.google.com/notebooks/pro.ipynb
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('To enable a high-RAM runtime, select the Runtime > "Change runtime type"')
  print('menu, and then select High-RAM in the Runtime shape dropdown. Then, ')
  print('re-execute this cell.')
else:
  print('You are using a high-RAM runtime!')

Sun Feb 28 21:12:54 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## 1. Late Fusion of Sequence and Token Classification Models

### Fine-tune Sequence Classification Model

In [5]:
!python3 './train_sentence_classification.py' \
  --model_type 'bert-base-cased' \
  --train_dir '../data/tsd_train.csv' \
  --dev_dir '../data/tsd_trial.csv' \
  --test_dir '../data/tsd_test.csv' \
  --epochs 1 \
  --warm_up_steps 500 \
  --learning_rate 5e-5 \
  --weight_decay 0.01 \
  --batch_size 16 \

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
2021-02-28 21:13:18.300296: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Downloading: 100% 213k/213k [00:00<00:00, 851kB/s]
Downloading: 100% 436k/436k [00:00<00:00, 1.33MB/s]
Downloading: 100% 433/433 [00:00<00:00, 429kB/s]
Downloading: 100% 436M/436M [00:09<00:00, 43.8MB/s]
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or wit

### Fine-tune BERT with Late Fusion 

In [6]:
!python3 './train_bert_late_fusion.py' \
  --model_type 'bert-base-cased' \
  --train_dir '../data/tsd_train.csv' \
  --dev_dir '../data/tsd_trial.csv' \
  --test_dir '../data/tsd_test.csv' \
  --binary_classification_dataset './ensemble_modeling/binary_sentence_classifications.json' \
  --epochs 1.92 \
  --warm_up_steps 500 \
  --learning_rate 5e-5 \
  --weight_decay 0.01 \
  --batch_size 16 

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
2021-02-28 22:30:54.706948: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoi

## 2. Multi-task Learning with MT-DNN

#### Convert toxic spans data into MT-DNN format

In [7]:
!python './create_mt_dnn_datasets.py' \
  --model_type 'bert-base-cased' \
  --train_dir '../data/tsd_train.csv' \
  --dev_dir '../data/tsd_trial.csv' \
  --test_dir '../data/tsd_test.csv' \

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!

> Loading 7939 examples located at '../data/tsd_train.csv'


> Loading 690 examples located at '../data/tsd_trial.csv'


> Loading 2000 examples located at '../data/tsd_test.csv'

> Writing data for MT-DNN NER task

> Writing data for MT-DNN classification task



### Install MT-DNN dependencies

In [8]:
%cd '../mt-dnn'

/content/drive/My Drive/system/mt-dnn


In [9]:
!pip install -r requirements.txt

Collecting torch==1.5.0
[?25l  Downloading https://files.pythonhosted.org/packages/76/58/668ffb25215b3f8231a550a227be7f905f514859c70a65ca59d28f9b7f60/torch-1.5.0-cp37-cp37m-manylinux1_x86_64.whl (752.0MB)
[K     |████████████████████████████████| 752.0MB 20kB/s 
Collecting colorlog
  Downloading https://files.pythonhosted.org/packages/5e/39/0230290df0519d528d8d0ffdfd900150ed24e0076d13b1f19e279444aab1/colorlog-4.7.2-py2.py3-none-any.whl
Collecting boto3
[?25l  Downloading https://files.pythonhosted.org/packages/bd/c8/b5aac643697038ef6eb8c11c73b9ee9c2dc8cb2bc95cda2d4ee656167644/boto3-1.17.17-py2.py3-none-any.whl (130kB)
[K     |████████████████████████████████| 133kB 71.9MB/s 
[?25hCollecting pytorch-pretrained-bert==v0.6.0
[?25l  Downloading https://files.pythonhosted.org/packages/a3/ac/8d72155697620bb9b453dcde3ad8520dc1464fa3abde389afbd542c50402/pytorch_pretrained_bert-0.6.0-py3-none-any.whl (114kB)
[K     |████████████████████████████████| 122kB 75.3MB/s 
Collecting sentencepie

In [10]:
!sh download.sh

download.sh: 7: [: /content/drive/My: unexpected operator
--2021-02-28 23:06:36--  https://mrc.blob.core.windows.net/mt-dnn-model/bert_model_base_v2.pt
Resolving mrc.blob.core.windows.net (mrc.blob.core.windows.net)... 52.190.240.132
Connecting to mrc.blob.core.windows.net (mrc.blob.core.windows.net)|52.190.240.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 437961819 (418M) [application/octet-stream]
Saving to: ‘/content/drive/My Drive/system/mt-dnn/mt_dnn_models/bert_model_base_cased.pt’


2021-02-28 23:06:55 (22.7 MB/s) - ‘/content/drive/My Drive/system/mt-dnn/mt_dnn_models/bert_model_base_cased.pt’ saved [437961819/437961819]

--2021-02-28 23:06:59--  https://mrc.blob.core.windows.net/mt-dnn-model/mt_dnn_base.pt
Resolving mrc.blob.core.windows.net (mrc.blob.core.windows.net)... 52.190.240.132
Connecting to mrc.blob.core.windows.net (mrc.blob.core.windows.net)|52.190.240.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43796

In [11]:
!pip uninstall apex
!git clone https://www.github.com/nvidia/apex
!cd apex
!pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Uninstalling apex-0.9.10.dev0:
  Would remove:
    /usr/local/lib/python3.7/dist-packages/apex-0.9.10.dev0.dist-info/*
    /usr/local/lib/python3.7/dist-packages/apex/*
Proceed (y/n)? y
  Successfully uninstalled apex-0.9.10.dev0
fatal: destination path 'apex' already exists and is not an empty directory.
  cmdoptions.check_install_build_global(options)
Created temporary directory: /tmp/pip-ephem-wheel-cache-ppm5ef19
Created temporary directory: /tmp/pip-req-tracker-4_sz0j5d
Created requirements tracker '/tmp/pip-req-tracker-4_sz0j5d'
Created temporary directory: /tmp/pip-install-tpyrikza
Cleaning up...
Removed build tracker '/tmp/pip-req-tracker-4_sz0j5d'
[31mERROR: Directory './' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.[0m
Exception information:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pip/_internal/cli/base_command.py", line 153, in _main
    status = self.run(options, args)
  File "/usr/local/lib/python3.7/dist-pa

### Multi-task Training of Sequence and Token Classification Models!

In [12]:
# Multi-task training working!
!python3 train.py --data_dir "./canonical_data" --train_datasets "ner,sentenceclassification" --test_datasets "ner,sentenceclassification" --task_def ./experiments/ner/multi_task_def.yml --cuda True

2021-02-28 23:08:09.675185: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
02/28/2021 11:08:23 Launching the MT-DNN training
02/28/2021 11:08:23 Loading ./canonical_data/ner_train.json as task 0
Loaded 7939 samples out of 7939
02/28/2021 11:08:23 Loading ./canonical_data/sentenceclassification_train.json as task 1
Loaded 22940 samples out of 22940
Loaded 690 samples out of 690
Loaded 2000 samples out of 2000
Loaded 2004 samples out of 2004
Loaded 5467 samples out of 5467
02/28/2021 11:08:24 ####################
02/28/2021 11:08:24 {'log_file': 'mt-dnn-train.log', 'tensorboard': False, 'tensorboard_logdir': 'tensorboard_logdir', 'init_checkpoint': 'mt_dnn_models/bert_model_base_cased.pt', 'data_dir': './canonical_data', 'data_sort_on': False, 'name': 'farmer', 'task_def': './experiments/ner/multi_task_def.yml', 'train_datasets':

In [13]:
# Move the mt-dnn NER results to /src/ensemble_modeling/multi_task_learning
import os

epochs = 5
for i in range(epochs):
  os.rename(f"checkpoint/ner_dev_scores_epoch_{i}.json", f"../src/ensemble_modeling/multi_task_learning/ner_dev_scores_epoch_{i}.json")
  os.rename(f"checkpoint/ner_test_scores_epoch_{i}.json", f"../src/ensemble_modeling/multi_task_learning/ner_test_scores_epoch_{i}.json")

In [14]:
%cd '../src'

/content/drive/My Drive/system/src


### Compute MT-DNN Performance Metrics

In [15]:
!python3 './compute_multi_task_metrics.py'

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
2021-02-28 23:49:43.855672: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Epoch 1:
Dev - Precision: 0.6919711489170147, Recall: 0.5680267713046699, F1: 0.5788398002456524
Test - Precision: 0.6540068640001307, Recall: 0.6754281470157026, F1: 0.6425635016653741 

Epoch 2:
Dev - Precision: 0.7417035799893301, Recall: 0.6135097408467134, F1: 0.6230542448780534
Test - Precision: 0.6686601222930499, Recall: 0.6828037385230084, F1: 0.6530031770973732 

Epoch 3:
Dev - Precision: 0.7440695291640149, Recall: 0.6296454221775931, F1: 0.6340579292405968
Test - Precision: 0.6654820166654941, Recall: 0.6948333789365856, F1: 0.6561592331803717 

Epoch 4:
Dev - Precision: 0.7430871790419927, Recall: 0.6384911890073609, F1: 0.6352471620872295
Test - Precision: 0.6584347775990741, Recall: 0.7007913922724561, F1: 0.6536037851