## Concatenating all attention distributions 

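We extract the attention distributions directly from a model that has not been pretrained for bias detection (it is instantiated with the default HuggingFace weights). We then use windowing to extract a window of attention scores around the biased word, and the windows from all layers are concatenated.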

In [1]:
import sys; sys.path.append("../../../../..")
import torch 
from src.experiment import AttentionExperiment, ClassificationExperiment
from src.dataset import ExperimentDataset
from src.params import Params
from src.utils.attention_utils import reduce_attention_dist, return_idx_attention_dist, window_attention_dist
from src.utils.classification_utils import run_bootstrapping
from src.utils.shared_utils import get_bias_predictions

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
params = Params.read_params("linear-params-window.json")
print("model = {}".format(params.final_task['model']))
print("layers = {}".format(params.intermediary_task["attention"]["layers"]))
print("reducer = {}".format(params.intermediary_task["attention"]["reducer"]))

model = shallow_nn
layers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
reducer = concat


In [4]:
# Load the dataset used in this experiment;
# typically this is the small set of ground-truth labels
dev_dataset = ExperimentDataset.init_dataset(params.dataset)

04/02/2020 21:04:40 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ./cache/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
100it [00:00, 4344.36it/s]


In [5]:
import pickle
# Load the weakly labeled training set, closing the file handle afterwards
with open(params.dataset["weakly_labeled_data"], "rb") as f:
    train_dataset = pickle.load(f)

### Attention Experiment: 
* A class that wraps useful methods for extracting attention distributions from a given BERT-based model.
* The user has to provide two configs: one that specifies how the attention scores should be extracted and combined, and another that specifies the intermediary model from which the attention scores are extracted.
* The user needs to instantiate the attention experiment with a function that tells the experiment how to run inference on the given model. The function header is specified below:
 
 ``` def initialize_attention_experiment(cls, intermediary_task_params, dataset_params, verbose=False) ```
 


In [6]:
attention_experiment = AttentionExperiment.initialize_attention_experiment(params.intermediary_task, params.dataset, verbose=True, from_pretrained=False)

04/02/2020 21:04:41 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ./cache/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
04/02/2020 21:04:41 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz from cache at ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba
04/02/2020 21:04:41 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba to temp dir /tmp/tmp1wsviw1q
04/02/2020 21:04:45 - INFO - pytorch_pretrained_bert.modeling -   Model config {
  "attention_probs_d

Instantiated joint model with default HuggingFace weights.
Succesfully loaded in attention experiment!


In [7]:
attention_dataloader_dev = dev_dataset.return_dataloader(batch_size=params.intermediary_task['attention']['attention_extraction_batch_size'])  
attention_dataloader_train = train_dataset.return_dataloader(batch_size=params.intermediary_task['attention']['attention_extraction_batch_size'])

```extract_attention_scores()``` works out of the box because the attention experiment has the config saved. It therefore knows which BERT model to load, which layers to extract the attention scores from, and which inference function to use on this particular BERT model.

The returned attention scores (here `attention_scores_dev`) are a list of dictionaries. In each dictionary, the keys are the layers of the BERT model and the values are the attention distributions extracted from the corresponding layer.
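The structure described above can be pictured with a small toy example. This is only a sketch under assumptions that the notebook does not confirm: one dictionary per input example, and one `(seq_len, seq_len)` matrix per layer whose rows sum to 1. The sizes and the random values are purely illustrative.

```python
import torch

num_examples, seq_len = 2, 6
layers = [0, 1, 2]  # the notebook uses 12 layers; three keep the toy small

# Hypothetical stand-in for the extracted scores: one dict per example,
# mapping a layer index to a (seq_len, seq_len) attention matrix.
attention_scores = [
    {layer: torch.softmax(torch.randn(seq_len, seq_len), dim=-1) for layer in layers}
    for _ in range(num_examples)
]

first_example = attention_scores[0]
print(sorted(first_example.keys()))  # layer indices available for this example

# Attention distribution of one token (index 4) over the sequence in layer 2
token_idx = 4
row = first_example[2][token_idx]
print(row.shape, float(row.sum()))
```

Selecting a single row like this is the kind of lookup that later steps perform for the biased token.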

In [8]:
attention_scores_dev = attention_experiment.extract_attention_scores(attention_dataloader_dev)


In [9]:
# Cache the train-set attention scores on disk to speed up future runs
import os
attention_weights_file = "model_weights/attention_scores_train.pkl"
if os.path.exists(attention_weights_file):
    print("Loading in existing train attention weights")
    with open(attention_weights_file, "rb") as f:
        attention_scores_train = pickle.load(f)
else:
    print("Generating new training attention weights")
    # Create the cache directory if it does not exist yet
    os.makedirs("model_weights", exist_ok=True)
    attention_scores_train = attention_experiment.extract_attention_scores(attention_dataloader_train)
    with open(attention_weights_file, "wb") as f:
        pickle.dump(attention_scores_train, f)

Loading in existing train attention weights
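The load-or-generate pattern used in the cell above can be factored into a small helper. The sketch below is not part of the project's code; `load_or_compute` and the toy `expensive` function are hypothetical names introduced for illustration.

```python
import os
import pickle
import tempfile


def load_or_compute(path, compute_fn):
    """Return the pickled object at `path`, computing and caching it if missing."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute_fn()
    # Create the parent directory (if any) before writing the cache file
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


# Toy usage: the expensive function only runs on the first call
with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "model_weights", "scores.pkl")
    calls = []

    def expensive():
        calls.append(1)
        return {"layer_0": [0.1, 0.9]}

    first = load_or_compute(cache_path, expensive)
    second = load_or_compute(cache_path, expensive)
```

One caveat of this kind of cache is that it is keyed only by the file path: if the model, the layers, or the dataset change, the stale file has to be deleted by hand.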


Next, we get the predictions from the BERT model trained to detect bias and use them to index into the attention scores.

In [10]:
bias_predictions_train = get_bias_predictions(train_dataset, params.intermediary_task, params.dataset, batch_size=32)
bias_predictions_dev = get_bias_predictions(dev_dataset, params.intermediary_task, params.dataset, batch_size=32)

04/02/2020 21:05:25 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ./cache/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
04/02/2020 21:05:25 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz from cache at ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba
04/02/2020 21:05:25 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba to temp dir /tmp/tmp1aiqy_nb
04/02/2020 21:05:29 - INFO - pytorch_pretrained_bert.modeling -   Model config {
  "attention_probs_d


04/02/2020 21:08:12 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ./cache/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
04/02/2020 21:08:12 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz from cache at ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba
04/02/2020 21:08:12 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file ./cache/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba to temp dir /tmp/tmpztl0ihdx
04/02/2020 21:08:16 - INFO - pytorch_pretrained_bert.modeling -   Model config {
  "attention_probs_d


In [11]:
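# Index of the first token predicted as biased (label 1) in each sequence;
# argmax returns 0 when a sequence has no such token
bias_indices_train = torch.argmax((bias_predictions_train == 1).int(), dim=1).tolist()
bias_indices_dev = torch.argmax((bias_predictions_dev == 1).int(), dim=1).tolist()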
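To make the indexing step concrete, here is a toy example of how `torch.argmax` over an equality mask picks one position per sequence. It assumes the predictions are a `(batch, seq_len)` tensor of 0/1 token labels, which the notebook does not show explicitly. The `has_bias` mask is an addition for illustration, not something the notebook computes.

```python
import torch

# Toy token-level predictions: 1 marks a token predicted as biased.
bias_predictions = torch.tensor([
    [0, 0, 1, 0],   # biased token at position 2
    [0, 1, 1, 0],   # two candidates; the first one (position 1) is selected
    [0, 0, 0, 0],   # no biased token predicted
])

is_biased = bias_predictions == 1
# Cast to int so argmax is well defined across PyTorch versions.
bias_indices = torch.argmax(is_biased.int(), dim=1).tolist()
# Separate mask telling real hits apart from the "no hit" fallback index 0.
has_bias = is_biased.any(dim=1).tolist()

print(bias_indices)  # [2, 1, 0]
print(has_bias)      # [True, True, False]
```

Because a sequence without any predicted biased token also maps to index 0, a mask such as `has_bias` is one way to distinguish that case from a biased first token.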

In [12]:
attention_scores_indexed_train = return_idx_attention_dist(attention_scores_train, bias_indices_train)
reduced_attention_train = reduce_attention_dist(attention_scores_indexed_train, params.intermediary_task["attention"])
windowed_dist_train = window_attention_dist(reduced_attention_train, bias_indices_train, window_size=7, num_concat=len(params.intermediary_task["attention"]["layers"]))
attention_dist_train = windowed_dist_train

In [13]:
windowed_dist_train.shape

torch.Size([52275, 180])
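The feature width of 180 is consistent with 12 layers times a window of 15 positions, that is, 7 tokens on each side of the biased word plus the word itself. The sketch below shows one way such a window could be cut out and concatenated across layers. It is not the implementation of `window_attention_dist`: the function name `window_and_concat`, the `(num_layers, seq_len)` input layout, and the zero padding at the sequence boundaries are all assumptions.

```python
import torch
import torch.nn.functional as F


def window_and_concat(layer_dists, center, window_size):
    """Cut a window around `center` from each layer and concatenate the results.

    layer_dists: (num_layers, seq_len) tensor, one attention distribution per layer.
    Returns a flat tensor of length num_layers * (2 * window_size + 1).
    """
    # Zero-pad both ends so the window never runs off the sequence (an assumption).
    padded = F.pad(layer_dists, (window_size, window_size), value=0.0)
    # After padding, original position `center` sits at `center + window_size`,
    # so the window covers padded[center : center + 2 * window_size + 1].
    window = padded[:, center : center + 2 * window_size + 1]
    return window.reshape(-1)


num_layers, seq_len, window_size = 12, 40, 7
layer_dists = torch.softmax(torch.randn(num_layers, seq_len), dim=-1)
features = window_and_concat(layer_dists, center=3, window_size=window_size)
print(features.shape)  # torch.Size([180])
```

With `center=3` the window starts four positions before the sequence does, so the first four entries of every layer's window are padding.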

In [14]:
train_dataset.add_data(attention_dist_train, "attention_dist")
train_dataset.shuffle_data()
assert attention_dist_train.shape[1] == params.final_task['input_dim']

In [15]:
attention_scores_indexed_dev = return_idx_attention_dist(attention_scores_dev, bias_indices_dev)
reduced_attention_dev = reduce_attention_dist(attention_scores_indexed_dev, params.intermediary_task["attention"])
windowed_dist_dev = window_attention_dist(reduced_attention_dev, bias_indices_dev, window_size=7, num_concat=len(params.intermediary_task["attention"]["layers"]))
attention_dist_dev = windowed_dist_dev

In [16]:
dev_dataset.add_data(attention_dist_dev, "attention_dist")
dev_dataset.add_data(dev_dataset.get_val('bias_label'),'weak_bias_label')
dev_dataset.shuffle_data()
assert attention_dist_dev.shape[1] == params.final_task['input_dim']

### This is where the classification experiment starts

We create a classification experiment that contains useful methods for classifying bias based on the attention distributions. 
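For intuition about what such a classifier might look like, here is a minimal sketch of a shallow network over the 180-dimensional windowed features. It is not the `shallow_nn` model configured in the params file: the class name `ShallowBiasClassifier`, the hidden size of 64, the single hidden layer, and the binary cross-entropy loss are assumptions made for illustration.

```python
import torch
from torch import nn


class ShallowBiasClassifier(nn.Module):
    """Hypothetical one-hidden-layer classifier over attention features."""

    def __init__(self, input_dim=180, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        # One logit per example; apply a sigmoid to obtain a probability
        return self.net(x).squeeze(-1)


model = ShallowBiasClassifier()
features = torch.rand(8, 180)               # toy batch of windowed features
labels = torch.randint(0, 2, (8,)).float()  # toy binary bias labels

logits = model(features)
loss = nn.BCEWithLogitsLoss()(logits, labels)
loss.backward()  # gradients are now available for an optimizer step
```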

In [17]:
params = Params.read_params("linear-params-window.json")

In [18]:
train_dataloader = train_dataset.return_dataloader(batch_size=params.final_task['training_params']['batch_size'])
dev_dataloader = dev_dataset.return_dataloader(batch_size=32)

In [19]:
classification_experiment = ClassificationExperiment.init_cls_experiment(params.final_task)

In [20]:
losses, evals = classification_experiment.train_model(train_dataloader, dev_dataloader, input_key="attention_dist", label_key="weak_bias_label")


In [21]:
from src.utils.classification_utils import average_data

In [22]:
avg_evaluations = [average_data(epoch_evaluations) for epoch_evaluations in evals]

In [23]:
avg_evaluations

[{'num_examples': 81,
  'accuracy': 0.6296296296296295,
  'auc': 0.5770370566380197},
 {'num_examples': 81,
  'accuracy': 0.654320987654321,
  'auc': 0.6584481705912734},
 {'num_examples': 81,
  'accuracy': 0.6419753086419753,
  'auc': 0.6992791738577773},
 {'num_examples': 81,
  'accuracy': 0.654320987654321,
  'auc': 0.7085265745224466},
 {'num_examples': 81, 'accuracy': 0.654320987654321, 'auc': 0.71327471659429},
 {'num_examples': 81,
  'accuracy': 0.6790123456790124,
  'auc': 0.7137744926530617},
 {'num_examples': 81,
  'accuracy': 0.654320987654321,
  'auc': 0.7248702169926801},
 {'num_examples': 81,
  'accuracy': 0.6666666666666666,
  'auc': 0.7374650244963282},
 {'num_examples': 81,
  'accuracy': 0.6666666666666666,
  'auc': 0.7390142861763089},
 {'num_examples': 81,
  'accuracy': 0.6666666666666666,
  'auc': 0.756256878719899},
 {'num_examples': 81,
  'accuracy': 0.6790123456790124,
  'auc': 0.7655524487997828},
 {'num_examples': 81,
  'accuracy': 0.6790123456790124,
  'auc': 

In [24]:
classification_experiment.save_model_weights("linear-windowed-attention.weights")