Recasens as a weak labeling function
------
This is the strongest-performing weak labeling function. For a given sentence, we first identify the first biased word. Because our dataset contains ground-truth labels marking which words were edited for bias, we can look up the index of that word directly and then compute features for that particular word.
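
The lookup described above can be sketched as follows. This is a minimal illustration, not the project's code: the function name `first_biased_index` is hypothetical, and the label encoding (1 = biased, 0 = unbiased) is an assumption that may differ from the encoding used in `pre_tok_label_ids`.

```python
from typing import List, Optional


def first_biased_index(tok_labels: List[int], biased_label: int = 1) -> Optional[int]:
    """Return the position of the first token tagged as biased, or None if there is none.

    Assumes one integer label per token; `biased_label` marks an edited (biased) token.
    """
    for i, label in enumerate(tok_labels):
        if label == biased_label:
            return i
    return None


# Toy sentence with two tagged tokens; only the first one is used.
tokens = ["the", "so-called", "expert", "claimed", "otherwise"]
labels = [0, 1, 0, 1, 0]

idx = first_biased_index(labels)
first_word = tokens[idx] if idx is not None else None
```

Featurization would then be applied to the token at `idx`, with sentences that contain no tagged token handled separately.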

The features we use in this example are those proposed by Recasens et al., 2013 (https://nlp.stanford.edu/pubs/neutrality.pdf). For each word, their feature set yields 32 features. Our labeling function then feeds these features into a very simple regression model (a single-layer feed-forward neural network).
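
To make the "single-layer" idea concrete, here is a small sketch of such a model written with the Python standard library only. It is an illustration under assumptions, not the model built by `ClassificationExperiment`: one linear layer followed by a sigmoid (that is, logistic regression) trained with plain stochastic gradient descent. The function names, learning rate, epoch count, and toy data are all hypothetical.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w: List[float], b: float, x: List[float]) -> float:
    # One linear layer followed by a sigmoid.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(X: List[List[float]], y: List[int], lr: float = 0.5,
          epochs: int = 200, seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Gradient of the log loss with respect to the pre-sigmoid output.
            err = predict_proba(w, b, X[i]) - y[i]
            w = [wi - lr * err * xi for wi, xi in zip(w, X[i])]
            b -= lr * err
    return w, b


# Toy feature vectors: the label is 1 exactly when the first feature is set.
X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
y = [1, 1, 0, 0]

w, b = train(X, y)
probs = [predict_proba(w, b, x) for x in X]
```

In the notebook, each input row would be a feature vector for one sentence rather than a two-element toy vector, and the resulting probability is what a decision threshold is later applied to.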

In [1]:
import sys; sys.path.append("../../../../..")
import torch 
from src.experiment import ClassificationExperiment
from src.dataset import ExperimentDataset
from src.params import Params

%load_ext autoreload
%autoreload 2

In [2]:
params = Params.read_params("experiment_params.json")

In [3]:
# Load the dataset used in this experiment;
# typically this is the small set of ground-truth labels.
dataset = ExperimentDataset.init_dataset(params.dataset)

02/25/2020 18:40:54 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ./cache/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
386it [00:00, 4974.47it/s]


In [4]:
# importing the Featurizer created by Pryzant et al.
from src.utils.weak_labeling_utils import get_marta_featurizer, extract_marta_features

In [5]:
featurizer = get_marta_featurizer(params.dataset)



In [6]:
dataset

Length: 324 Keys: dict_keys(['pre_ids', 'masks', 'pre_lens', 'post_in_ids', 'post_out_ids', 'pre_tok_label_ids', 'post_tok_label_ids', 'rel_ids', 'pos_ids', 'categories', 'index', 'bias_label'])

In [7]:
marta_features = extract_marta_features(dataset, featurizer)

In [8]:
dataset.add_data(marta_features, "marta_features")

### This is where the classification experiment starts

In [9]:
classification_experiment = ClassificationExperiment.init_cls_experiment(params.final_task)

In [10]:
from src.utils.classification_utils import run_bootstrapping

In [11]:
marta_features.shape

torch.Size([324, 90])

In [16]:
statistics = run_bootstrapping(classification_experiment, dataset, params.final_task, num_bootstrap_iters=3, input_key='marta_features', label_key='bias_label', threshold=0.42)

In [17]:
statistics

{'auc': [(0.9503368677016261, 0.9708432171569927), 0.959598823295552],
 'accuracy': [(0.8969600340136054, 0.9377551020408164), 0.9176587301587302]}
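
Each metric above appears to be reported as an interval followed by a point estimate. The sketch below shows one common way such a summary can be produced: resample the evaluation examples with replacement, score each resample at a fixed decision threshold, and report a percentile interval together with the mean. This is an assumption for illustration only and is not necessarily what `run_bootstrapping` does internally. The helper names and toy data are hypothetical; only the threshold value 0.42 is taken from the call above.

```python
import random
from statistics import mean
from typing import List


def accuracy_at_threshold(probs: List[float], labels: List[int], threshold: float) -> float:
    # Predict the positive class when the probability reaches the threshold.
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def bootstrap_summary(probs: List[float], labels: List[int], threshold: float = 0.42,
                      n_iters: int = 1000, alpha: float = 0.05, seed: int = 0) -> list:
    rng = random.Random(seed)
    n = len(labels)
    scores = []
    for _ in range(n_iters):
        # Resample example indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(accuracy_at_threshold([probs[i] for i in idx],
                                            [labels[i] for i in idx],
                                            threshold))
    scores.sort()
    lower = scores[int((alpha / 2) * n_iters)]
    upper = scores[int((1 - alpha / 2) * n_iters) - 1]
    # Same layout as the output above: [(lower, upper), mean].
    return [(lower, upper), mean(scores)]


# Toy predictions and labels, purely illustrative.
probs = [0.9, 0.8, 0.45, 0.3, 0.1, 0.6]
labels = [1, 1, 1, 0, 0, 0]

(lower, upper), avg = bootstrap_summary(probs, labels)
```

With only a handful of resamples the interval is coarse, so a larger number of iterations gives a more stable estimate.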