## Comparison test

This test compares the output of `run_classifier.py` from the Hugging Face port with the output of `bert_sklearn` on a small test subset of SST-2. Both runs use the same model (`bert-base-uncased`) and the same hyperparameters.

### `run_classifier.py` from the Hugging Face port

In [1]:
%%time
%%bash
cd ..
python ./tests/run_classifier.py --task_name sst-2 \
                            --data_dir ./tests/data/sst2 \
                            --do_train  --do_eval \
                            --output_dir ./tests/comptest \
                            --bert_model bert-base-uncased \
                            --do_lower_case \
                            --learning_rate 3e-5 \
                            --gradient_accumulation_steps 1 \
                            --max_seq_length 64 \
                            --train_batch_size 16 \
                            --eval_batch_size 8 \
                            --num_train_epochs 2

Loading Pytorch checkpoint
Loading Pytorch checkpoint


07/18/2019 02:11:58 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
07/18/2019 02:11:59 - INFO - bert_sklearn.model.pytorch_pretrained.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /root/.cache/torch/pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
07/18/2019 02:12:00 - INFO - bert_sklearn.model.pytorch_pretrained.modeling -   loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at /root/.cache/torch/pytorch_pretrained_bert/aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
07/18/2019 02:12:00 - INFO - bert_sklearn.model.pytorch_pretrained.modeling -   loading configuration file https://s3.amazonaws.com

CPU times: user 16 ms, sys: 4 ms, total: 20 ms
Wall time: 16.7 s


In [2]:
!cat comptest/eval_results.txt

acc = 0.88
eval_loss = 0.31115847940628344
global_step = 26
loss = 0.1801566292460148
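
To compare these numbers programmatically rather than by eye, the `key = value` lines above can be read into a dictionary. The following is a minimal sketch: `parse_eval_results` is a hypothetical helper, not part of `run_classifier.py` or `bert_sklearn`, and it assumes every value in the file is numeric, as in the output shown here.

```python
def parse_eval_results(text):
    """Parse 'key = value' lines into a dict of floats (hypothetical helper)."""
    results = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        results[key.strip()] = float(value)
    return results


# Contents of eval_results.txt as printed above.
sample = """acc = 0.88
eval_loss = 0.31115847940628344
global_step = 26
loss = 0.1801566292460148
"""

results = parse_eval_results(sample)
print(results["acc"], results["eval_loss"])
```

In the notebook, the same function could be applied to the text read from `comptest/eval_results.txt` before the directory is removed.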


In [3]:
!rm -r comptest

###  `bert_sklearn` 

In [4]:
%%time
import os
import sys
import csv

import numpy as np
import pandas as pd
from sklearn import metrics
from sklearn.metrics import classification_report

sys.path.append("../") 
from bert_sklearn import BertClassifier
from bert_sklearn import load_model


def get_sst_test_data(train_file='./data/sst2/train.tsv',
                      dev_file='./data/sst2/dev.tsv'):
    """Load the small SST-2 train and dev subsets used for this comparison."""
    # keep_default_na=False stops pandas from turning text such as "NA" into NaN
    train = pd.read_csv(train_file, sep='\t', encoding='utf8', keep_default_na=False)
    train.columns = ['text', 'label']
    print("SST-2 train data size: %d "%(len(train)))

    dev = pd.read_csv(dev_file, sep='\t', encoding='utf8', keep_default_na=False)
    dev.columns = ['text', 'label']
    print("SST-2 dev data size: %d "%(len(dev)))

    X_train = train['text']
    y_train = train['label']
    X_dev = dev['text']
    y_dev = dev['label']

    return X_train, y_train, X_dev, y_dev


X_train,y_train, X_dev, y_dev =  get_sst_test_data()

# define model
model = BertClassifier('bert-base-uncased')
model.validation_fraction = 0.0
model.learning_rate = 3e-5 
model.gradient_accumulation_steps = 1
model.max_seq_length = 64
model.train_batch_size = 16
model.eval_batch_size = 8
model.epochs = 2

# fit
model.fit(X_train,y_train)

# score
accy = model.score(X_dev,y_dev)

SST-2 train data size: 200 
SST-2 dev data size: 100 
Building sklearn text classifier...
Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
Loading Pytorch checkpoint
train data size: 200, validation data size: 0


Training  : 100%|██████████| 13/13 [00:03<00:00,  4.03it/s, loss=0.671]
Training  : 100%|██████████| 13/13 [00:03<00:00,  4.00it/s, loss=0.36] 
Testing: 100%|██████████| 13/13 [00:00<00:00, 22.96it/s]


Loss: 0.3112, Accuracy: 88.00%
CPU times: user 9.88 s, sys: 3.45 s, total: 13.3 s
Wall time: 14.5 s
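
The two runs can be checked against each other with a few lines of arithmetic. The sketch below copies its numbers by hand from the outputs above (they are not read from either tool), and it assumes that with `gradient_accumulation_steps = 1` each batch counts as one training step. Under that assumption, 200 training examples at batch size 16 give 13 steps per epoch, so 2 epochs give 26 steps, which matches `global_step = 26` and the `13/13` progress bars.

```python
import math

# Values copied from the run_classifier.py output (eval_results.txt).
hf = {"acc": 0.88, "eval_loss": 0.31115847940628344, "global_step": 26}
# Values copied from the bert_sklearn output ("Loss: 0.3112, Accuracy: 88.00%").
sk = {"acc": 88.00 / 100, "eval_loss": 0.3112}

n_train, n_dev = 200, 100
train_batch_size, eval_batch_size, epochs = 16, 8, 2

# The last partial batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(n_train / train_batch_size)
eval_steps = math.ceil(n_dev / eval_batch_size)

steps_match = steps_per_epoch * epochs == hf["global_step"]
acc_match = math.isclose(hf["acc"], sk["acc"])
# bert_sklearn prints the loss to 4 decimals, so compare at that precision.
loss_match = math.isclose(hf["eval_loss"], sk["eval_loss"], abs_tol=5e-5)

print(steps_per_epoch, eval_steps, steps_match, acc_match, loss_match)
```

On this subset both tools report the same accuracy and the same evaluation loss to the printed precision.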



