# Debugging Biosbias
- **Task**: For Biosbias, the task is predicting the occupation of a given bio paragraph, i.e., whether the person is 'a surgeon' (class 0) or 'a nurse' (class 1).
- **Problem**: Because of the gender imbalance in each occupation, a classifier usually exploits gender information when making predictions. As a result, bios of female surgeons and male nurses are often misclassified. We quantify the model's bias with two metrics, false positive equality difference (FPED) and false negative equality difference (FNED); for details, see [Dixon et al., 2018](https://dl.acm.org/doi/pdf/10.1145/3278721.3278729).
- **Solution**: To reduce the model's bias, we use our framework to identify the features which detect gender information rather than occupation and disable such features.
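
To make the two metrics concrete, here is a minimal sketch of FPED and FNED following the definitions in Dixon et al. (2018): each is the sum, over subgroups, of the absolute gap between the overall error rate and the subgroup's error rate. The function names, the toy labels, and the choice of class 1 as the positive class are assumptions for illustration; this is not the code used by `find.evaluate_all_gender`.

```python
import numpy as np

def error_rates(y_true, y_pred, positive=1):
    """Return (false positive rate, false negative rate) for one set of examples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    neg = y_true != positive
    pos = y_true == positive
    fpr = float(np.mean(y_pred[neg] == positive)) if neg.any() else 0.0
    fnr = float(np.mean(y_pred[pos] != positive)) if pos.any() else 0.0
    return fpr, fnr

def equality_differences(y_true, y_pred, groups, positive=1):
    """Sum over groups of |overall rate - group rate| for FPR (FPED) and FNR (FNED)."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    fpr_all, fnr_all = error_rates(y_true, y_pred, positive)
    fped = fned = 0.0
    for g in np.unique(groups):
        in_group = groups == g
        fpr_g, fnr_g = error_rates(y_true[in_group], y_pred[in_group], positive)
        fped += abs(fpr_all - fpr_g)
        fned += abs(fnr_all - fnr_g)
    return fped, fned

# Toy data (hypothetical): all errors fall on one gender group.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 0, 1, 0, 1, 1]
gender = ['M', 'M', 'F', 'F', 'F', 'F', 'M', 'M']
fped, fned = equality_differences(y_true, y_pred, gender)
print(fped, fned)
```

Both values are 0 when every group has the same error rates as the whole test set, so lower is better.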

In [1]:
# Notebook setup
import pickle
import os
import datetime
import random
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [14, 7]
os.environ['PYTHONHASHSEED'] = '0'

# Set random seed to create reproducible results
the_seed = 1234
np.random.seed(the_seed)
random.seed(the_seed)
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
from keras import backend as K
tf.set_random_seed(the_seed)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

Using TensorFlow backend.


In [5]:
import find

## Settings

- Dataset

In [6]:
DATA_PATH = 'preprocessed_data/'
MAIN_DATASET = 'Biosbias2'
SECOND_DATASET = None
THIRD_DATASET = None
GENDER_BIAS = True

EMBEDDING_DIM = 300
EMBEDDING_PATH = f"../../CNNAnalysis/data/glove/glove.6B.{EMBEDDING_DIM}d.txt" # Path to your glove embeddings

- Model

In [7]:
MODEL_PATH = 'trained_models/'
MODEL_ARCH = 'CNN'
MAXLEN = 150
FILTERS = [(10, 2), (10, 3), (10, 4)] # Ten filters of each window size [2,3,4]
BATCH_SIZE = 128
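
As a quick sanity check on these settings, the sketch below counts the CNN features implied by `FILTERS` and the parameters of each convolution, assuming standard 1D convolutions over the embedding dimension with one bias per filter. The helper names are hypothetical (the notebook itself uses `find.num_all_filters`); the resulting numbers agree with the `Param #` and `Output Shape` columns of the model summary printed later.

```python
FILTERS = [(10, 2), (10, 3), (10, 4)]  # (number of filters, window size)
EMBEDDING_DIM = 300
MAXLEN = 150

def count_filters(filters):
    """Total number of CNN features, i.e. one per filter."""
    return sum(n for n, _ in filters)

def conv1d_params(n_filters, window, input_dim):
    """Weights (window * input_dim) plus one bias, per filter."""
    return n_filters * (window * input_dim + 1)

total_filters = count_filters(FILTERS)
params = [conv1d_params(n, w, EMBEDDING_DIM) for n, w in FILTERS]
out_lengths = [MAXLEN - w + 1 for _, w in FILTERS]  # no padding assumed

print(total_filters)  # 30
print(params)         # [6010, 9010, 12010]
print(out_lengths)    # [149, 148, 147]
```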

## Model creation and training

In [8]:
# 0. Load GloVe embeddings
embedding_matrix, vocab_size, index2word, word2index = find.get_embedding_matrix(EMBEDDING_PATH, EMBEDDING_DIM, pad_initialisation = "zeros")


Loading Glove Model


400000it [00:55, 7144.91it/s]


Done. 400000  words loaded!


In [9]:
# 1. Load datasets and prepare inputs
# 1.1 Main dataset
with open(DATA_PATH + f'all_data_{MAIN_DATASET}.pickle', 'rb') as f:
    data_1 = pickle.load(f)
class_names = data_1['class_names']
X_train_1, X_validate_1, X_test_1 = find.get_data_matrix(data_1['text_train'], word2index, MAXLEN), \
                                    find.get_data_matrix(data_1['text_validate'], word2index, MAXLEN), \
                                    find.get_data_matrix(data_1['text_test'], word2index, MAXLEN)
y_test_1 = data_1['y_test']
gender_test_1 = data_1['gender_test'] if GENDER_BIAS else None

# 1.2 Second dataset
if SECOND_DATASET is not None:
    with open(DATA_PATH + f'all_data_{SECOND_DATASET}.pickle', 'rb') as f:
        data_2 = pickle.load(f)
    X_test_2, y_test_2 = find.get_data_matrix(data_2['text_test'], word2index, MAXLEN), data_2['y_test']
    gender_test_2 = data_2['gender_test'] if GENDER_BIAS else None
else:
    X_test_2, y_test_2, gender_test_2 = None, None, None

# 1.3 Third dataset
if THIRD_DATASET is not None:
    with open(DATA_PATH + f'all_data_{THIRD_DATASET}.pickle', 'rb') as f:
        data_3 = pickle.load(f)
    X_test_3, y_test_3 = find.get_data_matrix(data_3['text_test'], word2index, MAXLEN), data_3['y_test']
    gender_test_3 = data_3['gender_test'] if GENDER_BIAS else None
else:
    X_test_3, y_test_3, gender_test_3 = None, None, None

100%|██████████| 3832/3832 [00:07<00:00, 492.91it/s]
100%|██████████| 1277/1277 [00:02<00:00, 467.36it/s]
100%|██████████| 1278/1278 [00:02<00:00, 480.26it/s]


In [10]:
# 2. Create the result directory
if not os.path.exists(MODEL_PATH):
    os.makedirs(MODEL_PATH)
result_folder = MAIN_DATASET + '_' + MODEL_ARCH + '_' + datetime.datetime.now().strftime("%Y%m%d%H%M%S") + '/'
result_path = MODEL_PATH + result_folder
os.mkdir(result_path)

In [11]:
# 3. Create a model
if MODEL_ARCH == 'CNN':
    model = find.get_CNN_model(vocab_size, EMBEDDING_DIM, embedding_matrix, MAXLEN, class_names, FILTERS)
else:
    raise ValueError(f"Unsupported model architecture: {MODEL_ARCH}")

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None)         0                                            
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 150, 300)     120000600   input_1[0][0]                    
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, 149, 10)      6010        embedding_1[0][0]                
__________________________________________________________________________________________________
conv1d_2 (Conv1D)               (None, 148, 10)      9010        embedding_1[0][0]                
__________________________________________________________________________________________________
conv1d_3 (

In [12]:
# 4. Train the model
history = find.model_train(model, result_path + f'trained_{MODEL_ARCH}.h5', X_train_1, data_1['y_train'], X_validate_1, data_1['y_validate'], BATCH_SIZE, epochs = 300)

Train on 3832 samples, validate on 1277 samples
Epoch 1/300
 - 13s - loss: 0.3877 - acc: 0.8236 - val_loss: 0.1898 - val_acc: 0.9366

Epoch 00001: val_loss improved from inf to 0.18977, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN.h5
Epoch 2/300
 - 12s - loss: 0.1461 - acc: 0.9517 - val_loss: 0.1362 - val_acc: 0.9499

Epoch 00002: val_loss improved from 0.18977 to 0.13615, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN.h5
Epoch 3/300
 - 13s - loss: 0.1070 - acc: 0.9653 - val_loss: 0.1168 - val_acc: 0.9569

Epoch 00003: val_loss improved from 0.13615 to 0.11684, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN.h5
Epoch 4/300
 - 13s - loss: 0.0868 - acc: 0.9736 - val_loss: 0.1075 - val_acc: 0.9569

Epoch 00004: val_loss improved from 0.11684 to 0.10746, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN.h5
Epoch 5/300
 - 13s - loss: 0.0698 - acc: 0.9804 - val_loss: 0.0996 - val_acc: 0.9593

Ep

In [13]:
# 5. Evaluate the model
if not GENDER_BIAS:
    find.evaluate_all(model, class_names, BATCH_SIZE, X_test_1, y_test_1, X_test_2, y_test_2, X_test_3, y_test_3, result_path = result_path, model_name = 'original')
else:
    find.evaluate_all_gender(model, class_names, BATCH_SIZE, X_test_1, y_test_1, gender_test_1, X_test_2, y_test_2, gender_test_2, result_path = result_path, model_name = 'original')

Evaluate with the original test set:
{'per_class': {0: {'all_positive': 714,
                   'all_true': 731,
                   'class_f1': 0.9550173010380623,
                   'class_name': 'surgeon',
                   'class_precision': 0.9663865546218487,
                   'class_recall': 0.9439124487004104,
                   'true_positive': 690},
               1: {'all_positive': 564,
                   'all_true': 547,
                   'class_f1': 0.9414941494149416,
                   'class_name': 'nurse',
                   'class_precision': 0.9273049645390071,
                   'class_recall': 0.9561243144424132,
                   'true_positive': 523}},
 'total': {'accuracy': 0.9491392801251957,
           'macro_f1': 0.9484294173731734,
           'macro_precision': 0.9468457595804279,
           'macro_recall': 0.9500183815714118,
           'micro_f1': 0.9491392801251957,
           'micro_precision': 0.9491392801251957,
           'micro_recall': 0.9491392
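
The per-class figures above can be reproduced from the raw counts. The sketch below assumes that `all_positive` is the number of examples predicted as the class and `all_true` is the number of examples whose gold label is the class; this reading is inferred from the numbers, not taken from the `find` source.

```python
def precision_recall_f1(true_positive, all_positive, all_true):
    """Precision, recall and F1 from the counts reported per class."""
    precision = true_positive / all_positive
    recall = true_positive / all_true
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts copied from the evaluation output of the original model.
surgeon = precision_recall_f1(true_positive=690, all_positive=714, all_true=731)
nurse = precision_recall_f1(true_positive=523, all_positive=564, all_true=547)
accuracy = (690 + 523) / (731 + 547)

print(surgeon)
print(nurse)
print(accuracy)
```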

## Model understanding and debugging

In [14]:
# 6. Generate wordclouds
settings = {
    'model_arch': MODEL_ARCH,
    'filters': FILTERS,
    'maxlen': MAXLEN,
    'result_path': result_path,
    'index2word': index2word,
    'embedding_dim': EMBEDDING_DIM,
    'batch_size': BATCH_SIZE
}
all_wordclouds = find.generate_wordclouds(model, X_train_1, settings, max_examples = 2000)


__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
embedded_text_input (InputLayer (None, 150, 300)     0                                            
__________________________________________________________________________________________________
conv1d_4 (Conv1D)               (None, 149, 10)      6010        embedded_text_input[0][0]        
__________________________________________________________________________________________________
conv1d_5 (Conv1D)               (None, 148, 10)      9010        embedded_text_input[0][0]        
__________________________________________________________________________________________________
conv1d_6 (Conv1D)               (None, 147, 10)      12010       embedded_text_input[0][0]        
__________________________________________________________________________________________________
global_max

100%|██████████| 16/16 [00:02<00:00,  7.36it/s]
100%|██████████| 30/30 [02:43<00:00,  5.45s/it]
100%|██████████| 30/30 [00:26<00:00,  1.12it/s]
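
To give an intuition for what a per-feature word cloud summarizes, the sketch below finds the n-grams that most strongly activate a single convolutional filter. It is a simplified illustration under stated assumptions (ReLU activation, no padding, toy one-hot embeddings, a hypothetical function name) and is not how `find.generate_wordclouds` is implemented.

```python
import numpy as np

def top_ngrams_for_filter(embedded, tokens, kernel, bias=0.0, top_k=3):
    """Rank the n-grams of one text by their activation for one filter.

    embedded: (T, D) array of token embeddings
    tokens:   list of T token strings
    kernel:   (w, D) filter weights, where w is the window size
    """
    w = kernel.shape[0]
    n_windows = embedded.shape[0] - w + 1
    acts = np.array([np.sum(embedded[t:t + w] * kernel) + bias
                     for t in range(n_windows)])
    acts = np.maximum(acts, 0.0)  # ReLU (assumed activation)
    order = np.argsort(-acts)[:top_k]
    return [(" ".join(tokens[t:t + w]), float(acts[t])) for t in order]

# Toy example: one-hot "embeddings" and a bigram filter tuned to "a nurse".
tokens = ["she", "is", "a", "nurse"]
embedded = np.eye(4)
kernel = np.stack([embedded[2], embedded[3]])  # matches "a" then "nurse"
top = top_ngrams_for_filter(embedded, tokens, kernel, top_k=1)
print(top)
```

Aggregating such top n-grams over many training examples, weighted by activation, would give the kind of picture a word cloud shows for each feature.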


- Get input from a human

In [15]:
# Every feature starts enabled; the human reviewer disables the irrelevant ones below
is_feature_enabled = [True] * find.num_all_filters(FILTERS)

In [16]:
# UI components from ipywidgets
import ipywidgets as wgt

def update_screen(feature_idx):
    show_action_panel(feature_idx)
    wordcloud = all_wordclouds[feature_idx]
    f, ax = plt.subplots()
    plt.rcParams['figure.figsize'] = [14, 7]
    ax.imshow(wordcloud, interpolation='bilinear')
    ax.axis("off")
    
    W = model.layers[-1].get_weights()[0] # For the final layer
    weight_plot = find.visualize_weights(W, feature_idx, class_names, show = False)
    plt.show()

def update_action(action):
    global feature_radio_button, is_feature_enabled
    feature_idx = feature_radio_button.value
    if action == 'enabled':
        print('enable')
        is_feature_enabled[feature_idx] = True
    elif action == 'disabled':
        print('disable')
        is_feature_enabled[feature_idx] = False
    else:
        raise ValueError(f"Unknown action: {action}")
    
def show_action_panel(feature_idx):
    global action_radio_button
    action_radio_button.description = f'Current status of feature {feature_idx}:'
    action_radio_button.value = 'enabled' if is_feature_enabled[feature_idx] else 'disabled'
    
feature_radio_button = wgt.RadioButtons(options=list(range(len(is_feature_enabled))), value=0, description='Feature:', disabled=False)
action_radio_button = wgt.RadioButtons(options=['enabled', 'disabled'],
    value = 'enabled' if is_feature_enabled[feature_radio_button.value] else 'disabled',
    description = f'Current status of feature {feature_radio_button.value}:',
    style = {'description_width': 'initial'},
    disabled = False
)

wgt.interactive_output(update_action, {'action':action_radio_button})
out = wgt.interactive_output(update_screen, {'feature_idx':feature_radio_button})

In [17]:
# 7. Get input from a human (disabling some features)
display(wgt.HBox([feature_radio_button, wgt.VBox([out, action_radio_button])]))


In [18]:
print(f"Total: {len(is_feature_enabled)} features \nEnabled: {sum(is_feature_enabled)} features \nDisabled: {len(is_feature_enabled)-sum(is_feature_enabled)} features")
print(f"Disabled features: {[i for i,s in enumerate(is_feature_enabled) if not s]}")

Total: 30 features 
Enabled: 23 features 
Disabled: 7 features
Disabled features: [0, 5, 8, 10, 14, 16, 26]


## Creating and fine-tuning an improved classifier

In [19]:
# 8. Create an improved model
# 8.1 Copy the existing CNN features
model_improved = find.get_CNN_model(vocab_size, EMBEDDING_DIM, embedding_matrix, MAXLEN, class_names, 
                                    FILTERS, trainable_filters = False)
model_improved.set_weights(model.get_weights()) 

# 8.2 Apply human decisions to disable irrelevant features
for idx, enable in enumerate(is_feature_enabled):
    if not enable:
        model_improved.layers[-1].disable_mask(idx)

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, None)         0                                            
__________________________________________________________________________________________________
embedding_2 (Embedding)         (None, 150, 300)     120000600   input_2[0][0]                    
__________________________________________________________________________________________________
conv1d_7 (Conv1D)               (None, 149, 10)      6010        embedding_2[0][0]                
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 148, 10)      9010        embedding_2[0][0]                
__________________________________________________________________________________________________
conv1d_9 (
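
Conceptually, disabling a feature means it can no longer contribute to the final prediction. The sketch below illustrates one way to achieve this, by multiplying the pooled features with a 0/1 mask before the final dense layer. The function name and the random toy data are hypothetical, and this is an assumption about the idea rather than the actual implementation of `disable_mask` in `find`.

```python
import numpy as np

def masked_logits(features, W, b, is_feature_enabled):
    """Final-layer logits with disabled features zeroed out.

    features: (batch, n_features) pooled CNN features
    W, b:     dense-layer weights (n_features, n_classes) and bias (n_classes,)
    """
    mask = np.asarray(is_feature_enabled, dtype=features.dtype)
    return (features * mask) @ W + b

rng = np.random.default_rng(0)
features = rng.random((4, 30))
W = rng.normal(size=(30, 2))
b = np.zeros(2)

disabled = [0, 5, 8, 10, 14, 16, 26]  # the features disabled in this run
enabled = [i not in disabled for i in range(30)]
logits = masked_logits(features, W, b, enabled)

# Changing a disabled feature has no effect on the output.
perturbed = features.copy()
perturbed[:, disabled] += 100.0
print(np.allclose(masked_logits(perturbed, W, b, enabled), logits))
```

Because the convolutional filters are frozen (`trainable_filters = False`), fine-tuning afterwards would only adjust how the remaining enabled features are weighted.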

In [20]:
# 9. Fine-tune the improved model
history = find.model_train(model_improved, result_path + f'trained_{MODEL_ARCH}_improved.h5', X_train_1, data_1['y_train'], X_validate_1, data_1['y_validate'], BATCH_SIZE, epochs = 300)

Train on 3832 samples, validate on 1277 samples
Epoch 1/300
 - 8s - loss: 0.0460 - acc: 0.9927 - val_loss: 0.1176 - val_acc: 0.9491

Epoch 00001: val_loss improved from inf to 0.11765, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN_improved.h5
Epoch 2/300
 - 7s - loss: 0.0412 - acc: 0.9945 - val_loss: 0.1181 - val_acc: 0.9491

Epoch 00002: val_loss did not improve from 0.11765
Epoch 3/300
 - 8s - loss: 0.0388 - acc: 0.9940 - val_loss: 0.1187 - val_acc: 0.9491

Epoch 00003: val_loss did not improve from 0.11765
Epoch 4/300
 - 7s - loss: 0.0365 - acc: 0.9945 - val_loss: 0.1171 - val_acc: 0.9491

Epoch 00004: val_loss improved from 0.11765 to 0.11714, saving model to trained_models/Biosbias2_CNN_20200921022749/trained_CNN_improved.h5
Epoch 5/300
 - 7s - loss: 0.0348 - acc: 0.9945 - val_loss: 0.1191 - val_acc: 0.9499

Epoch 00005: val_loss did not improve from 0.11714
Epoch 6/300
 - 8s - loss: 0.0332 - acc: 0.9948 - val_loss: 0.1186 - val_acc: 0.9499

Epoch 00006: 

In [21]:
# 10. Evaluate the improved model
if not GENDER_BIAS:
    find.evaluate_all(model_improved, class_names, BATCH_SIZE, X_test_1, y_test_1, X_test_2, y_test_2, X_test_3, y_test_3, result_path = result_path, model_name = 'debugged')
else:
    find.evaluate_all_gender(model_improved, class_names, BATCH_SIZE, X_test_1, y_test_1, gender_test_1, X_test_2, y_test_2, gender_test_2, result_path = result_path, model_name = 'debugged')

Evaluate with the original test set:
{'per_class': {0: {'all_positive': 709,
                   'all_true': 731,
                   'class_f1': 0.9374999999999999,
                   'class_name': 'surgeon',
                   'class_precision': 0.9520451339915373,
                   'class_recall': 0.9233926128590971,
                   'true_positive': 675},
               1: {'all_positive': 569,
                   'all_true': 547,
                   'class_f1': 0.9193548387096774,
                   'class_name': 'nurse',
                   'class_precision': 0.9015817223198594,
                   'class_recall': 0.9378427787934186,
                   'true_positive': 513}},
 'total': {'accuracy': 0.9295774647887324,
           'macro_f1': 0.9287116661661686,
           'macro_precision': 0.9268134281556983,
           'macro_recall': 0.9306176958262579,
           'micro_f1': 0.9295774647887324,
           'micro_precision': 0.9295774647887324,
           'micro_recall': 0.9295774