# Working on Group-CFEs

### Using datasets from "Retiring Adult: New Datasets for Fair Machine Learning" (https://papers.nips.cc/paper/2021/file/32e54441e6382a7fbacbbbaf3c450059-Paper.pdf)


## Data Prep

In [1]:
import numpy as np 
import pandas as pd
import random
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn_extra.cluster import KMedoids
from sklearn.neighbors import NearestNeighbors
from sklearn.ensemble import GradientBoostingClassifier
from scipy.spatial import distance
from collections import Counter

In [2]:
np.random.seed(0)

X_train = np.load('X_train_CA.npy')
y_train = np.load('y_train_CA.npy')
X_test = np.load('X_test_CA.npy')
y_test = np.load('y_test_CA.npy')

In [3]:
model = make_pipeline(StandardScaler(), GradientBoostingClassifier(random_state=0))
model.fit(X_train, y_train)
yhat = model.predict(X_test)

### Understanding Features:

#### Employment Type (COW)

- 1 .Employee of a private for-profit company or business, or of an individual, for wages, salary, or commissions
- 2 .Employee of a private not-for-profit, tax-exempt, or charitable organization
- 3 .Local government employee (city, county, etc.)
- 4 .State government employee
- 5 .Federal government employee
- 6 .Self-employed in own not incorporated business, professional practice, or farm
- 7 .Self-employed in own incorporated business, professional practice or farm
- 8 .Working without pay in family business or farm
- 9 .Unemployed and last worked 5 years ago or earlier or never worked


#### Education

- bb .N/A (less than 3 years old)
- 01 .No schooling completed
- 02 .Nursery school, preschool
- 03 .Kindergarten
- 04 .Grade 1
- 05 .Grade 2
- 06 .Grade 3
- 07 .Grade 4
- 08 .Grade 5
- 09 .Grade 6
- 10 .Grade 7
- 11 .Grade 8
- 12 .Grade 9
- 13 .Grade 10
- 14 .Grade 11
- 15 .12th grade - no diploma
- 16 .Regular high school diploma
- 17 .GED or alternative credential
- 18 .Some college, but less than 1 year
- 19 .1 or more years of college credit, no degree
- 20 .Associate's degree
- 21 .Bachelor's degree
- 22 .Master's degree
- 23 .Professional degree beyond a bachelor's degree
- 24 .Doctorate degree
 

#### Marital status
- 1 .Married
- 2 .Widowed
- 3 .Divorced
- 4 .Separated
- 5 .Never married or under 15 years old

#### Career 

See Page 84/85: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2014-2018.pdf

#### POB  

See Page 96: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2014-2018.pdf

#### Age
Self Explanatory

#### Weekly Hours
Self Explanatory

#### Gender

- 1 Male
- 2 Female

####  Recoded detailed race code (RAC1P)
- 1 .White alone
- 2 .Black or African American alone
- 3 .American Indian alone
- 4 .Alaska Native alone
- 5 .American Indian and Alaska Native tribes specified; or American Indian or Alaska Native, not specified and no other races
- 6 .Asian alone
- 7 .Native Hawaiian and Other Pacific Islander alone
- 8 .Some Other Race alone
- 9 .Two or More Races
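
The code lists above can be turned into lookup tables so that a raw row becomes readable. The following is a minimal sketch, assuming the column order of the DataFrame built in this notebook; `MARITAL_STATUS`, `GENDER`, `COLUMNS` and `decode_row` are illustrative names that do not appear in the original notebook, and only two of the coded features are mapped.

```python
# Illustrative lookup tables built from the code lists above (hypothetical names).
MARITAL_STATUS = {1: "Married", 2: "Widowed", 3: "Divorced", 4: "Separated",
                  5: "Never married or under 15 years old"}
GENDER = {1: "Male", 2: "Female"}

# Column order assumed to match the DataFrame used in this notebook.
COLUMNS = ['Employment Type', 'Qualification', 'Marital status', 'Career',
           'POB', 'AGE', 'Weekly Hours', 'Gender', 'Race']

def decode_row(row):
    """Return feature name -> value, with the two mapped features replaced by labels."""
    decoded = dict(zip(COLUMNS, row))
    decoded['Marital status'] = MARITAL_STATUS[int(decoded['Marital status'])]
    decoded['Gender'] = GENDER[int(decoded['Gender'])]
    return decoded

# Row 0 of the training data as printed in this notebook
example = decode_row([2.0, 22.0, 1.0, 1821.0, 6.0, 46.0, 45.0, 2.0, 9.0])
print(example['Marital status'], '|', example['Gender'])
```

The same pattern could be extended to the employment, education and race codes if readable explanations are needed.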

## Counterfactuals

### A simple baseline: NUNs (Nearest Unlike Neighbors)

In [4]:
pd.DataFrame(X_train, columns = ['Employment Type', 'Qualification', 'Marital status', 'Career', 'POB','AGE', 'Weekly Hours', 'Gender', 'Race'])

Unnamed: 0,Employment Type,Qualification,Marital status,Career,POB,AGE,Weekly Hours,Gender,Race
0,2.0,22.0,1.0,1821.0,6.0,46.0,45.0,2.0,9.0
1,1.0,21.0,3.0,4850.0,12.0,45.0,50.0,2.0,1.0
2,1.0,21.0,5.0,1021.0,215.0,40.0,40.0,2.0,6.0
3,1.0,24.0,1.0,300.0,210.0,59.0,40.0,1.0,6.0
4,1.0,19.0,5.0,3401.0,6.0,23.0,40.0,2.0,1.0
...,...,...,...,...,...,...,...,...,...
156527,1.0,16.0,1.0,3645.0,6.0,29.0,40.0,2.0,1.0
156528,1.0,21.0,1.0,2640.0,6.0,42.0,40.0,2.0,1.0
156529,1.0,21.0,5.0,630.0,24.0,60.0,60.0,1.0,6.0
156530,3.0,22.0,1.0,230.0,6.0,47.0,60.0,1.0,1.0


#### NUN instances where people make 50k + in the training data

In [5]:
negative_outcome = X_test[yhat == False] # the people in the test set who are predicted to make less than 50k
positive_outcome = X_test[yhat == True] # the people in the test set who are predicted to make more than 50k

positive_train_set = X_train[y_train == True] # the people who make 50k in the train set
negative_train_set = X_train[y_train == False] # the people who don't make 50k in the train set

In [6]:
index = 0
neighbors = NearestNeighbors(n_neighbors=30, metric='hamming').fit(positive_train_set) #nb could do with a better distance function
distances, indices = neighbors.kneighbors(X_test[index].reshape(1,-1))

list(X_test[index]), list(positive_train_set[indices[0][0]]) # a NUN

([1.0, 18.0, 1.0, 5840.0, 21.0, 62.0, 40.0, 1.0, 1.0],
 [1.0, 18.0, 1.0, 7340.0, 360.0, 62.0, 40.0, 1.0, 1.0])
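
With `metric='hamming'`, the distance between two rows is the fraction of features whose values differ, so the size of a difference plays no role. As a small illustration (a pure-Python sketch, not the scikit-learn implementation), the query and NUN printed above differ only in `Career` and `POB`, that is, in 2 of 9 features:

```python
def hamming(a, b):
    """Fraction of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

query = [1.0, 18.0, 1.0, 5840.0, 21.0, 62.0, 40.0, 1.0, 1.0]
nun = [1.0, 18.0, 1.0, 7340.0, 360.0, 62.0, 40.0, 1.0, 1.0]

d = hamming(query, nun)
# Positions of the features that differ (3 = Career, 4 = POB)
changed = [i for i, (x, y) in enumerate(zip(query, nun)) if x != y]
print(d, changed)
```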

#### Finding NNs

NB might use a custom distance function
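
One possible shape for such a custom distance is sketched below: categorical features contribute a 0/1 mismatch, as with Hamming, while continuous features contribute a scaled absolute difference. This is only an assumption about what a better distance might look like. The name `mixed_distance`, the choice of columns 5 (age) and 6 (weekly hours) as continuous, and the scaling ranges are all hypothetical.

```python
# Assumption: columns 5 (age) and 6 (weekly hours) are continuous, the rest categorical.
# The ranges used for scaling are illustrative values, not taken from the data.
CONTINUOUS_RANGES = {5: 80.0, 6: 98.0}

def mixed_distance(a, b):
    """Mean per-feature distance: 0/1 mismatch for categorical, scaled |a - b| for continuous."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in CONTINUOUS_RANGES:
            total += min(abs(x - y) / CONTINUOUS_RANGES[i], 1.0)
        else:
            total += float(x != y)
    return total / len(a)

query = [1.0, 18.0, 1.0, 5840.0, 21.0, 62.0, 40.0, 1.0, 1.0]
one_year_older = [1.0, 18.0, 1.0, 5840.0, 21.0, 63.0, 40.0, 1.0, 1.0]
forty_years_younger = [1.0, 18.0, 1.0, 5840.0, 21.0, 22.0, 40.0, 1.0, 1.0]

# Hamming would rate both neighbours as equally far (1 of 9 features differs);
# the mixed distance ranks the one-year difference as much closer.
print(mixed_distance(query, one_year_older), mixed_distance(query, forty_years_younger))
```

`NearestNeighbors` accepts a callable as `metric`, so a function of this form could in principle replace `'hamming'`, though a Python callable is likely to be much slower on a training set of this size.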

In [7]:
neighbors_negative = NearestNeighbors(n_neighbors=30, metric='hamming').fit(negative_train_set) # other instances that dont get 50k   

In [8]:
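def NUN_finder(query, outcome):
    # Return the nearest training instance from the class opposite to `outcome`
    if outcome == 'negative':
        # query is in the negative class, so search the positive training instances
        distances, indices = neighbors.kneighbors(query.reshape(1,-1))
        NUN = positive_train_set[indices[0][0]]
    elif outcome == 'positive':
        # query is in the positive class, so search the negative training instances
        distances, indices = neighbors_negative.kneighbors(query.reshape(1,-1))
        NUN = negative_train_set[indices[0][0]]
    else:
        raise ValueError("outcome must be 'negative' or 'positive'")

    return list(NUN)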

In [9]:
def explanation_generator(query, outcome):
    # Returns the query, its NUN, its 5 NNs in the same class, the NUN of each NN,
    # and the indices of the same-class neighbours (positions within the class subset)
    if outcome == 'negative': # a query predicted to be under 50k
        same_index, same_set = neighbors_negative, negative_train_set
        unlike_index, unlike_set = neighbors, positive_train_set
    elif outcome == 'positive': # a query predicted to be over 50k
        same_index, same_set = neighbors, positive_train_set
        unlike_index, unlike_set = neighbors_negative, negative_train_set
    else:
        raise ValueError("outcome must be 'negative' or 'positive'")

    # nearest neighbours in the same class as the query
    distances_same, indices_same = same_index.kneighbors(query.reshape(1,-1))
    NNs = same_set[indices_same[0][0:5]]

    # nearest unlike neighbour of the query itself
    distances_unlike, indices_unlike = unlike_index.kneighbors(query.reshape(1,-1))
    NUN = unlike_set[indices_unlike[0][0]]

    # nearest unlike neighbour of each same-class neighbour
    NUNs = [NUN_finder(instance, outcome=outcome) for instance in NNs]

    return query, NUN, NNs, NUNs, indices_same
    


In [10]:
def boarderline_cases(threshold):
    # Indices of correctly classified test instances whose highest class
    # probability is at most `threshold`, i.e. those close to the decision boundary

    # probability of the predicted class for every test instance, in one batched call
    max_proba = model.predict_proba(X_test).max(axis=1)

    near_boundary = np.where(max_proba <= threshold)[0]
    misclassified = np.where(yhat != y_test)[0]

    # drop the misclassified instances; np.setdiff1d returns sorted unique indices
    return np.setdiff1d(near_boundary, misclassified)
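
To make the selection rule concrete, here is a small self-contained sketch on made-up probabilities. The arrays `proba` and `y_true` are hypothetical and stand in for `model.predict_proba(X_test)` and `y_test`. An instance is kept when the probability of its predicted class is at most the threshold and the prediction is correct.

```python
import numpy as np

# Hypothetical class probabilities for five instances (columns: P(False), P(True))
proba = np.array([[0.55, 0.45],
                  [0.10, 0.90],
                  [0.42, 0.58],
                  [0.95, 0.05],
                  [0.49, 0.51]])
y_true = np.array([False, True, False, False, True])

y_pred = proba.argmax(axis=1).astype(bool)  # predicted class per instance
confidence = proba.max(axis=1)              # probability of the predicted class

threshold = 0.6
near_boundary = confidence <= threshold     # low-confidence predictions
correct = y_pred == y_true                  # correctly classified instances

selected = np.where(near_boundary & correct)[0]
print(selected)
```

Instance 2 is near the boundary but misclassified, so it is excluded. Instances 1 and 3 are predicted with high confidence and are excluded for that reason.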

## DiCE Counterfactuals

In [11]:
# DiCE imports
import dice_ml
from dice_ml.utils import helpers  # helper functions

In [12]:
# Getting dataset ready using pandas

feature_names = ['employment type', 'qualification', 'marital status', 'career', 'pob','age', 'weekly hours', 'gender', 'race']

# feature-only frames; the 'income' label is kept separately in y_train / y_test
x_train = pd.DataFrame(X_train, columns = feature_names)
x_test = pd.DataFrame(X_test, columns = feature_names)

In [13]:
x_train

Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race
0,2.0,22.0,1.0,1821.0,6.0,46.0,45.0,2.0,9.0
1,1.0,21.0,3.0,4850.0,12.0,45.0,50.0,2.0,1.0
2,1.0,21.0,5.0,1021.0,215.0,40.0,40.0,2.0,6.0
3,1.0,24.0,1.0,300.0,210.0,59.0,40.0,1.0,6.0
4,1.0,19.0,5.0,3401.0,6.0,23.0,40.0,2.0,1.0
...,...,...,...,...,...,...,...,...,...
156527,1.0,16.0,1.0,3645.0,6.0,29.0,40.0,2.0,1.0
156528,1.0,21.0,1.0,2640.0,6.0,42.0,40.0,2.0,1.0
156529,1.0,21.0,5.0,630.0,24.0,60.0,60.0,1.0,6.0
156530,3.0,22.0,1.0,230.0,6.0,47.0,60.0,1.0,1.0


Given the train dataset, we construct a data object for DiCE. Since continuous and discrete features have different ways of perturbation, we need to specify the names of the continuous features. DiCE also requires the name of the output variable that the ML model will predict.

In [14]:
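# Step 1: dice_ml.Data (the dataframe passed to DiCE must contain the outcome column,
# so the labels are attached here without modifying x_train itself)
d = dice_ml.Data(dataframe=x_train.assign(income=y_train), continuous_features=['age', 'weekly hours'], outcome_name='income')
# Step 2: wrap the trained sklearn pipeline
m = dice_ml.Model(model=model, backend="sklearn")
# Step 3: explainer using random sampling of feature values
exp = dice_ml.Dice(d, m, method='random')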

### Materials --- Close to Decision Boundary

In [15]:
boarderline_cases(threshold=0.6)[0:300], [y_test[instance] for instance in boarderline_cases(threshold=0.6)[0:300]]

(array([   6,   22,   35,   40,   42,   45,   66,   67,  113,  163,  174,
         185,  191,  194,  197,  214,  220,  228,  232,  276,  301,  322,
         361,  363,  382,  388,  426,  454,  468,  479,  507,  524,  526,
         549,  579,  593,  598,  599,  622,  633,  643,  664,  692,  698,
         707,  717,  721,  741,  756,  759,  774,  797,  801,  818,  829,
         833,  834,  842,  843,  845,  880,  898,  905,  943,  958,  992,
        1001, 1004, 1008, 1009, 1015, 1024, 1026, 1044, 1065, 1066, 1072,
        1097, 1112, 1128, 1129, 1141, 1156, 1190, 1196, 1238, 1245, 1255,
        1279, 1282, 1318, 1319, 1327, 1350, 1352, 1374, 1381, 1411, 1450,
        1456, 1461, 1470, 1472, 1488, 1494, 1496, 1519, 1543, 1545, 1546,
        1561, 1567, 1652, 1695, 1698, 1717, 1742, 1762, 1764, 1783, 1791,
        1814, 1822, 1833, 1840, 1841, 1842, 1843, 1844, 1852, 1882, 1893,
        1900, 1918, 1921, 1945, 1954, 1963, 1977, 1994, 2017, 2025, 2028,
        2039, 2041, 2043, 2045, 2050, 

In [16]:
model.predict(X_test[6].reshape(1,-1))

array([False])

In [17]:
# defining arguments: every feature except 'career' may be changed by DiCE
features_to_vary = ['employment type', 'qualification', 'marital status', 'pob','age', 'weekly hours', 'gender', 'race']
random_seed = 0

In [18]:
model.predict(X_test[7].reshape(1,-1))[0] == False

True

In [37]:
def cfe_generator(instance):
    # The predicted class of the query decides which training class its neighbours come from
    predicted_positive = bool(model.predict(X_test[instance].reshape(1,-1))[0])
    outcome = 'positive' if predicted_positive else 'negative'

    # Positions of the nearest same-class neighbours within the class-specific training subset
    NNs = explanation_generator((np.array(x_test[instance:instance+1])).reshape(1,-1), outcome = outcome)[4][0]

    # Map those positions back to row indices of the full training set
    class_rows = np.where(y_train == predicted_positive)[0]
    indices_cf_example = class_rows[NNs[:5]]

    # The query first, then its four nearest same-class neighbours
    # (a fifth neighbour is retrieved but not displayed)
    queries = [x_test[instance:instance+1]] + [x_train[i:i+1] for i in indices_cf_example[:4]]

    explanations = [exp.generate_counterfactuals(q, total_CFs=1, desired_class="opposite",
                                                 features_to_vary=features_to_vary, random_seed=random_seed)
                    for q in queries]

    # visualize_as_dataframe displays each table and returns None
    return tuple(e.visualize_as_dataframe(show_only_changes=True) for e in explanations)
    
    #e1.visualize_as_dataframe(show_only_changes=True)
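
The least obvious step in `cfe_generator` is the index bookkeeping: the neighbour search is fitted on one class of the training data only, so the indices it returns are positions within that subset and have to be mapped back to rows of the full training set. A toy sketch of the mapping, using made-up labels (`y_train_toy` and `subset_positions` are hypothetical):

```python
import numpy as np

y_train_toy = np.array([True, False, False, True, False, False])

# Rows of the full training set that belong to the negative class: [1, 2, 4, 5]
negative_rows = np.where(y_train_toy == False)[0]

# Neighbour positions as returned by a search fitted on the negative subset only
subset_positions = np.array([3, 0])

# Map subset positions back to row indices of the full training set
full_indices = negative_rows[subset_positions]
print(full_indices)
```

Position 3 in the negative subset is row 5 of the full set, and position 0 is row 1.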

## Below 50k ---> Above 50k

In [73]:
cfe_generator(401)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,13.0,1.0,4220.0,31.0,46.0,40.0,1.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,22.0,-,-,313.0,-,-,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,13.0,1.0,8810.0,303.0,46.0,40.0,1.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,23.0,-,-,-,-,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,4220.0,48.0,46.0,40.0,1.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,65.4,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,13.0,1.0,6442.0,303.0,46.0,40.0,1.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,19.0,-,-,-,-,95.1,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,13.0,1.0,4220.0,303.0,64.0,40.0,1.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,23.0,-,-,-,-,48.8,-,-,1


(None, None, None, None, None)
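
One simple way to read a set of counterfactuals as a group is to count how often each feature is changed across the query and its neighbours. The sketch below does this by hand for the five counterfactuals printed above for `cfe_generator(401)`; the list `changed_features` was transcribed from those tables, and the counting approach is an illustration rather than part of the original analysis.

```python
from collections import Counter

# Features changed by each of the five counterfactuals shown for cfe_generator(401)
changed_features = [
    ['qualification', 'pob'],
    ['employment type', 'qualification'],
    ['weekly hours'],
    ['qualification', 'weekly hours'],
    ['qualification', 'weekly hours'],
]

# Count how many counterfactuals change each feature
counts = Counter(feature for cf in changed_features for feature in cf)
print(counts.most_common())
```

Here `qualification` changes in four of the five counterfactuals and `weekly hours` in three, which suggests these two features dominate the explanations for this neighbourhood.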

### Material A1

In [20]:
cfe_generator(6)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,19.0,1.0,5740.0,6.0,51.0,40.0,2.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,21.0,-,-,-,-,-,-,9.0,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,19.0,1.0,5740.0,6.0,59.0,40.0,2.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,23.0,-,-,-,-,-,1.0,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,19.0,1.0,5740.0,6.0,27.0,40.0,2.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,21.0,-,-,-,-,89.7,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,19.0,1.0,5740.0,6.0,36.0,40.0,2.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,86.6,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,19.0,1.0,5240.0,6.0,31.0,40.0,2.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,24.0,-,-,-,-,88.8,-,-,1


(None, None, None, None, None)

In [130]:
model.predict(np.array([3.0, 19 ,1.0 ,5740.0 ,6.0 , 51.0, 40, 2.0 ,8.0]).reshape(1,-1))

array([False])

### Material A2

In [21]:
cfe_generator(2993)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,6515.0,303.0,55.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,47.0,-,-,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,5522.0,303.0,46.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,47.4,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,9.0,1.0,6200.0,303.0,55.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,17.0,-,81.4,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,4220.0,303.0,54.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,22.0,-,-,313.0,-,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,9.0,1.0,6050.0,303.0,55.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,20.0,-,-,1.0,-,-,-,-,1


(None, None, None, None, None)

### Material A3

In [22]:
cfe_generator(4033)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,3.0,5740.0,6.0,58.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,4.0,-,-,-,85.2,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,3.0,5740.0,6.0,62.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,76.7,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,3.0,5400.0,6.0,58.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,65.5,-,7.0,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,3.0,5740.0,6.0,57.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,62.0,-,3.0,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,3.0,5740.0,6.0,47.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,88.6,-,1.0,-,1


(None, None, None, None, None)

### Material A4

In [23]:
cfe_generator(1652)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,3.0,3421.0,6.0,65.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,53.7,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,3.0,3421.0,6.0,57.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,21.0,-,-,-,52.2,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,3.0,3421.0,6.0,32.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,82.3,84.0,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,3.0,3421.0,6.0,36.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,93.1,1.0,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,3.0,3421.0,6.0,49.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,56.6,-,-,1


(None, None, None, None, None)

### Material A5

In [24]:
cfe_generator(2369)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,5630.0,6.0,50.0,40.0,1.0,2.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,24.0,-,-,-,-,88.8,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,6260.0,6.0,50.0,40.0,1.0,8.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,21.0,-,-,-,-,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,6260.0,6.0,43.0,40.0,1.0,2.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,96.7,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,4251.0,6.0,58.0,40.0,1.0,2.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,24.0,-,-,-,67.4,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,16.0,1.0,6260.0,6.0,36.0,40.0,1.0,2.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,67.8,-,-,1


(None, None, None, None, None)

## Above 50k ---> Below 50k

### Material B1

In [25]:
cfe_generator(185)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,2.0,22.0,1.0,2012.0,370.0,28.0,40.0,2.0,8.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,5.0,4.0,-,-,-,-,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,2.0,22.0,1.0,230.0,6.0,36.0,40.0,2.0,8.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,15.0,-,-,-,35.99999999999995,-,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,2.0,22.0,1.0,230.0,303.0,58.0,40.0,2.0,8.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,35.6,24.6,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,2.0,22.0,1.0,2014.0,313.0,59.0,40.0,2.0,8.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,-,-,-,-,-,28.7,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,1.0,735.0,370.0,28.0,40.0,1.0,8.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,8.0,-,-,-,-,-,-,-,0


(None, None, None, None, None)

### Material B2

In [26]:
cfe_generator(717)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,20.0,3.0,5000.0,8.0,38.0,40.0,2.0,9.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,23.8,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,20.0,3.0,3220.0,6.0,40.0,40.0,2.0,9.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,2.0,-,-,-,16.9,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,20.0,3.0,5000.0,6.0,59.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,10.0,-,-,-,-,-,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,22.0,1.0,5000.0,226.0,38.0,40.0,2.0,9.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,29.4,16.7,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,21.0,3.0,1007.0,36.0,38.0,40.0,2.0,9.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,2.6,-,-,0


(None, None, None, None, None)

### Material B3

In [27]:
cfe_generator(3977)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,14.0,3.0,2205.0,6.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,365.0,-,49.3,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,20.0,3.0,3960.0,6.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,169.0,-,-,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,22.0,3.0,10.0,6.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,3.0,-,-,-,-,27.2,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,20.0,3.0,1108.0,6.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,18.9,-,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,24.0,3.0,2100.0,6.0,63.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,508.0,23.3,-,-,-,0


(None, None, None, None, None)

### Material B4

In [28]:
cfe_generator(3401)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,1.0,9030.0,6.0,55.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,36.2,-,-,9.0,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,1.0,7700.0,6.0,55.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,18.3,-,2.0,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,1.0,3620.0,6.0,55.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,30.9,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,1.0,9130.0,6.0,55.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,32.5,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,18.0,1.0,6355.0,6.0,55.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,18.3,-,2.0,-,0


(None, None, None, None, None)

### Material B5

In [29]:
cfe_generator(1814)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,8.0,21.0,1.0,5120.0,42.0,48.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,10.0,-,-,-,-,-,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,102.0,42.0,48.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,5.0,4.0,-,-,-,-,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,2.0,21.0,1.0,845.0,42.0,49.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,3.0,-,-,-,27.0,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,6.0,21.0,1.0,710.0,6.0,48.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,9.0,-,-,-,-,-,-,2.0,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,1006.0,34.0,48.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,4.8,-,-,0


(None, None, None, None, None)

In [78]:
cfe_generator(702)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,5.0,1021.0,110.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,27.1,13.9,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,5.0,1021.0,110.0,30.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,28.7,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,5.0,1021.0,110.0,50.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,27.1,13.9,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,1.0,1021.0,253.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,13.0,-,-,-,-,8.8,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,22.0,1.0,1021.0,48.0,53.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,22.3,7.8,-,-,0


(None, None, None, None, None)

In [88]:
cfe_generator(293)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,120.0,215.0,56.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,14.0,-,-,-,33.5,-,-,-,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,120.0,6.0,56.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,3.0,-,-,-,27.0,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,120.0,6.0,56.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,3.0,-,-,-,27.0,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,120.0,34.0,32.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,17.0,-,-,-,-,33.4,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,21.0,1.0,120.0,22.0,38.0,40.0,1.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,22.3,-,-,-,0


(None, None, None, None, None)

In [133]:
cfe_generator(88)

Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,4055.0,6.0,23.0,28.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,74.0,73.6,-,-,1.0


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,4055.0,6.0,23.0,20.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,74.0,73.6,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,2300.0,6.0,23.0,40.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,69.3,-,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,4110.0,6.0,23.0,20.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,74.0,73.6,-,-,1


Query instance (original outcome : 0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,1.0,19.0,1.0,9645.0,6.0,33.0,28.0,2.0,1.0,0



Diverse Counterfactual set (new outcome: 1.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,84.0,86.3,-,-,1


(None, None, None, None, None)

In [157]:
def cfe_generator_bias(instance):
    """Print counterfactuals for one test instance and its four nearest
    same-class training neighbours, varying only weekly hours, gender and race."""
    features_to_vary = ['weekly hours', 'gender', 'race']
    random_seed = 0

    # The model's prediction for the query decides which neighbour pool is searched.
    predicted_positive = bool(model.predict(X_test[instance].reshape(1, -1))[0])
    outcome = 'positive' if predicted_positive else 'negative'

    # Neighbour positions are relative to the training rows that share this label.
    NNs = explanation_generator(np.array(x_test[instance:instance+1]).reshape(1, -1), outcome=outcome)[4][0]

    # Map those positions back to row indices in the full training set.
    same_class_rows = np.where(y_train == predicted_positive)[0]
    indices_cf_example = [same_class_rows[NNs[i]] for i in range(4)]

    # First the query instance itself, then one query per neighbour.
    queries = [x_test[instance:instance+1]]
    queries += [x_train[i:i+1] for i in indices_cf_example]

    explanations = [exp.generate_counterfactuals(q, total_CFs=1, desired_class="opposite",
                                                 features_to_vary=features_to_vary, random_seed=random_seed)
                    for q in queries]

    # visualize_as_dataframe displays each table and returns None,
    # so the cell echoes a tuple of five None values after the tables.
    return tuple(e.visualize_as_dataframe(show_only_changes=True) for e in explanations)
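
The function above looks up neighbours inside the subset of training rows that share the query's label, then converts those subset positions into row numbers of the full training set via `np.where(...)[0][...]`. A minimal sketch of that index mapping follows. The toy labels and the neighbour positions are hypothetical values chosen for illustration; they are not taken from the dataset.

```python
import numpy as np

# Toy labels standing in for y_train (hypothetical, for illustration only).
y_toy = np.array([True, False, True, True, False, True])

# Row numbers in the full array that carry the positive label.
positive_rows = np.where(y_toy)[0]

# Suppose a neighbour search over the positive subset returned these positions
# (positions count within the subset, not within the full array).
subset_positions = [3, 0]

# Translate subset positions back into row numbers of the full array.
full_indices = positive_rows[subset_positions]
print(full_indices)
```

Indexing the full training set with the subset positions directly would pick the wrong rows whenever the two classes are interleaved, which is why the extra lookup is needed.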

In [160]:
cfe_generator_bias(21)

Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,22.0,1.0,710.0,6.0,29.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,24.7,-,8.0,0.0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,22.0,1.0,120.0,6.0,51.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,2.2,-,9.0,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,22.0,1.0,3150.0,6.0,25.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,3.9,-,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,5.0,22.0,1.0,1220.0,6.0,42.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,11.8,1.0,-,0


Query instance (original outcome : 1)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,3.0,22.0,1.0,2330.0,6.0,29.0,40.0,2.0,1.0,1



Diverse Counterfactual set (new outcome: 0.0)


Unnamed: 0,employment type,qualification,marital status,career,pob,age,weekly hours,gender,race,income
0,-,-,-,-,-,-,15.5,1.0,-,0


(None, None, None, None, None)
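
One way to read the five tables above as a group is to count how often each feature was changed across the counterfactuals. The sketch below does this for the rows printed by `cfe_generator_bias(21)`, copied by hand with the index column dropped. The helper name `changed_features` is hypothetical, and the `income` column is skipped because it holds the flipped outcome rather than an input feature.

```python
from collections import Counter

columns = ["employment type", "qualification", "marital status", "career", "pob",
           "age", "weekly hours", "gender", "race", "income"]

# Counterfactual rows from the output above; "-" means the feature is unchanged.
cf_rows = [
    "-,-,-,-,-,-,24.7,-,8.0,0.0",
    "-,-,-,-,-,-,2.2,-,9.0,0",
    "-,-,-,-,-,-,3.9,-,-,0",
    "-,-,-,-,-,-,11.8,1.0,-,0",
    "-,-,-,-,-,-,15.5,1.0,-,0",
]

def changed_features(row, columns, skip=("income",)):
    """Return the names of the features whose value differs from the query."""
    values = row.split(",")
    return [c for c, v in zip(columns, values) if v != "-" and c not in skip]

# Tally changed features over all counterfactuals in the group.
change_counts = Counter(f for row in cf_rows for f in changed_features(row, columns))
print(change_counts.most_common())
```

For this group, weekly hours changes in all five counterfactuals, while gender and race each change in two.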