## SST-2: Stanford Sentiment Treebank

The Stanford Sentiment Treebank (SST-2) task is a single-sentence binary classification task. It consists of sentences drawn from movie reviews, each annotated for sentiment as negative (0) or positive (1).

See the [website](https://nlp.stanford.edu/sentiment/code.html) and the [paper](https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf) for more information.

In [1]:
import numpy as np
import pandas as pd
import os
import sys
import csv
from sklearn import metrics
from sklearn.metrics import classification_report

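sys.path.append("../")  # assumes the bert_sklearn package lives one directory up
from bert_sklearn import BertClassifier
from bert_sklearn import load_model

# directory that download_glue_data.py writes to
DATADIR = os.path.join(os.getcwd(), 'glue_data')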

In [2]:
%%bash
python3 download_glue_data.py --data_dir glue_data --tasks SST 

Downloading and extracting SST...
	Completed!


In [2]:
def get_sst_data(train_file=DATADIR + '/SST-2/train.tsv',
                 dev_file=DATADIR + '/SST-2/dev.tsv'):
    """
    Load the SST-2 train and dev splits as DataFrames.

    SST-2 train data size: 67349
    SST-2 dev data size: 872
    """
    # keep_default_na=False keeps strings such as "NA" as text instead of NaN
    train = pd.read_csv(train_file, sep='\t', encoding='utf8', keep_default_na=False)
    train.columns = ['text', 'label']
    print("SST-2 train data size: %d " % (len(train)))

    dev = pd.read_csv(dev_file, sep='\t', encoding='utf8', keep_default_na=False)
    dev.columns = ['text', 'label']
    print("SST-2 dev data size: %d " % (len(dev)))

    # class labels present in the training data
    label_list = np.unique(train['label'])

    return train, dev, label_list

train,dev,label_list = get_sst_data()


SST-2 train data size: 67349 
SST-2 dev data size: 872 


In [8]:
print(label_list)

[0 1]


In [9]:
train.head()

|   | text | label |
|---|------|-------|
| 0 | hide new secretions from the parental units | 0 |
| 1 | contains no wit , only labored gags | 0 |
| 2 | that loves its characters and communicates som... | 1 |
| 3 | remains utterly satisfied to remain the same t... | 0 |
| 4 | on the worst revenge-of-the-nerds clichés the ... | 0 |
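The two-column layout that `get_sst_data` expects can also be sketched with the standard library alone. The sample below inlines the first two rows shown above plus one made-up positive row, marked as hypothetical. The `sentence`/`label` header names are an assumption about the raw file (the notebook renames the columns to `text`/`label` in any case), and `read_sst_rows` is a hypothetical helper, not part of `bert_sklearn`.

```python
import csv
import io
from collections import Counter

# Inline stand-in for SST-2/train.tsv. The header names are an assumption;
# the last row is a hypothetical example, not taken from the dataset.
SAMPLE_TSV = (
    "sentence\tlabel\n"
    "hide new secretions from the parental units\t0\n"
    "contains no wit , only labored gags\t0\n"
    "a warm and funny film (hypothetical row)\t1\n"
)


def read_sst_rows(handle):
    """Return a list of {'text': str, 'label': int} dicts from a TSV stream."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row
    return [{"text": text, "label": int(label)} for text, label in reader]


rows = read_sst_rows(io.StringIO(SAMPLE_TSV))

# Count how many examples fall into each class (0 = negative, 1 = positive).
label_counts = Counter(row["label"] for row in rows)
print(label_counts)
```

Running the same count over the full training set is a quick way to check class balance before fine-tuning.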


In [3]:

X_train = train['text']
y_train = train['label']

# define model
model = BertClassifier()
model.epochs = 3
model.validation_fraction = 0.05
model.learning_rate = 2e-5
model.max_seq_length = 128

print('\n',model,'\n')

# fit model
model.fit(X_train, y_train)

# test model on dev
test = dev
X_test = test['text']
y_test = test['label']

# make predictions
y_pred = model.predict(X_test)
print("Accuracy: %0.2f%%"%(metrics.accuracy_score(y_test, y_pred) * 100))
print(classification_report(y_test, y_pred, target_names=['negative','positive']))

Building sklearn classifier...

 BertClassifier(bert_model='bert-base-uncased', epochs=3, eval_batch_size=8,
        fp16=False, gradient_accumulation_steps=1, label_list=None,
        learning_rate=2e-05, local_rank=-1, logfile='bert_sklearn.log',
        loss_scale=0, max_seq_length=128, num_mlp_hiddens=500,
        num_mlp_layers=0, random_state=42, restore_file=None,
        train_batch_size=32, use_cuda=True, validation_fraction=0.05,
        warmup_proportion=0.1) 

Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
train data size: 63982, validation data size: 3367

Training: 100%|██████████| 2000/2000 [25:46<00:00,  1.40it/s, loss=0.259]
Epoch 1, Train loss : 0.2593, Val loss: 0.1411, Val accy = 94.83%

Training: 100%|██████████| 2000/2000 [25:45<00:00,  1.41it/s, loss=0.0992]
Epoch 2, Train loss : 0.0992, Val loss: 0.1287, Val accy = 95.49%

Training: 100%|██████████| 2000/2000 [25:43<00:00,  1.41it/s, loss=0.0671]
Epoch 3, Train loss : 0.0671, Val loss: 0.1395, Val accy = 95.75%

Accuracy: 92.32%
             precision    recall  f1-score   support

   negative       0.94      0.90      0.92       428
   positive       0.91      0.94      0.93       444

avg / total       0.92      0.92      0.92       872
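The columns of the report above follow the usual definitions: precision is the share of predictions for a class that are correct, recall is the share of true members of a class that are found, and F1 is their harmonic mean. Below is a minimal pure-Python sketch of those definitions on made-up labels (not the SST-2 predictions); the helper name `binary_report` is hypothetical.

```python
def binary_report(y_true, y_pred, positive=1):
    """Compute accuracy plus precision, recall, F1 and support for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of true examples of this class
    }


# Toy labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

positive_row = binary_report(y_true, y_pred, positive=1)
negative_row = binary_report(y_true, y_pred, positive=0)
print(positive_row)
print(negative_row)
```

Calling the helper once per class gives one row of the report each; accuracy is the same in both calls because it does not depend on which class is treated as positive.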





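## With an MLP classifier head

The run above used the default linear classifier on top of BERT (`num_mlp_layers=0`). This run sets `num_mlp_layers = 4` and keeps the default `num_mlp_hiddens=500`; all other settings are unchanged.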

In [5]:
%%time

X_train = train['text']
y_train = train['label']

# define model
model = BertClassifier()
model.epochs = 3
model.validation_fraction = 0.05
model.learning_rate = 2e-5
model.max_seq_length = 128
model.num_mlp_layers = 4

print('\n',model,'\n')

# fit model
model.fit(X_train, y_train)

# test model on dev
test = dev
X_test = test['text']
y_test = test['label']

# make predictions
y_pred = model.predict(X_test)
print("Accuracy: %0.2f%%"%(metrics.accuracy_score(y_test, y_pred) * 100))
print(classification_report(y_test, y_pred, target_names=['negative','positive']))

Building sklearn classifier...

 BertClassifier(bert_model='bert-base-uncased', epochs=3, eval_batch_size=8,
        fp16=False, gradient_accumulation_steps=1, label_list=None,
        learning_rate=2e-05, local_rank=-1, logfile='bert_sklearn.log',
        loss_scale=0, max_seq_length=128, num_mlp_hiddens=500,
        num_mlp_layers=4, random_state=42, restore_file=None,
        train_batch_size=32, use_cuda=True, validation_fraction=0.05,
        warmup_proportion=0.1) 

Loading bert-base-uncased model...
Using mlp with D=768,H=500,K=2,n=4
train data size: 63982, validation data size: 3367

Training: 100%|██████████| 1999/1999 [25:51<00:00,  1.28it/s, loss=0.309]
Epoch 1, Train loss : 0.3089, Val loss: 0.1527, Val accy = 94.18%

Training: 100%|██████████| 1999/1999 [26:13<00:00,  1.28it/s, loss=0.126]
Epoch 2, Train loss : 0.1256, Val loss: 0.1412, Val accy = 95.07%

Training: 100%|██████████| 1999/1999 [25:57<00:00,  1.29it/s, loss=0.0915]
Epoch 3, Train loss : 0.0915, Val loss: 0.1427, Val accy = 95.04%

Accuracy: 92.78%
             precision    recall  f1-score   support

   negative       0.94      0.91      0.93       428
   positive       0.92      0.94      0.93       444

avg / total       0.93      0.93      0.93       872

CPU times: user 1h 9min, sys: 34min 43s, total: 1h 43min 43s
Wall time: 1h 21min 12s
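The data sizes and step counts printed in the training logs above can be reproduced with a few lines of arithmetic. This sketch assumes that the validation set size is the floor of `validation_fraction` times the training size and that one step is one batch of `train_batch_size` examples. It is a consistency check against the logged numbers, not a description of how `bert_sklearn` splits or batches the data internally.

```python
import math

n_total = 67349             # SST-2 train size printed by get_sst_data
validation_fraction = 0.05  # model.validation_fraction
train_batch_size = 32       # from the printed BertClassifier settings

# Assumption: validation size is the floor of fraction * n_total.
n_val = int(n_total * validation_fraction)
n_train = n_total - n_val

# Batches per epoch, counting or dropping a final partial batch.
steps_with_partial = math.ceil(n_train / train_batch_size)
steps_full_only = n_train // train_batch_size

print(n_train, n_val)                       # 63982 3367
print(steps_with_partial, steps_full_only)  # 2000 1999
```

The split sizes match the logged `train data size: 63982, validation data size: 3367`, and the two step counts match the 2000 and 1999 steps per epoch shown for the two runs. The logs do not say why the step count differs between the runs.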


