# Binary classification counterfactual generation

This notebook walks through the process of generating counterfactuals for a binary classifier. It uses the Adult dataset, which is included in the repository, but any other dataset would work as well.

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

from textualizer import Textualizer
from nn_model import NNModel
from counterfactual_generator import CounterfactualGenerator
from data import MixedEncoder

#### Prepare the data

In [2]:
data = pd.read_csv('adult_frame.csv')
model_path = "model.pt"

# create the binary target variable, i.e. {0, 1}, to predict
target = np.asarray(data['income'] == '>=50k', dtype=np.float32).reshape(-1, 1)

The counterfactual generator expects an encoder, and the encoder expects a pandas DataFrame, so we build the encoder from the input columns first.

The encoder does not change the underlying data; it only determines whether each feature is categorical, numerical, or mixed.
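
To make the idea of telling feature kinds apart concrete, here is a minimal sketch that labels columns by their pandas dtype. The helper `column_kinds` is hypothetical and is not how `MixedEncoder` is implemented; in particular it does not handle mixed columns.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def column_kinds(frame: pd.DataFrame) -> dict:
    """Label each column 'numerical' or 'categorical' by dtype (hypothetical helper)."""
    return {
        name: "numerical" if is_numeric_dtype(frame[name]) else "categorical"
        for name in frame.columns
    }


# Toy frame, not the Adult data: one numeric and one string column.
toy = pd.DataFrame({"age": [39.0, 50.0], "workclass": ["Government", "Private"]})
kinds = column_kinds(toy)
print(kinds)
```

A dtype check like this is only a first approximation, because integer-coded categories would be labelled numerical.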

In [3]:
input_data = data[data.columns[0:8]]
encoder = MixedEncoder(input_data)
encoded = encoder.get_encoded_data()

# partition into train and test, y ~ target, X ~ input data
X_train, X_test, y_train, y_test = train_test_split(encoded, target, test_size=0.2, random_state=42)

#### Create the neural network
Create the neural network, then either train it or load previously saved weights.

In [4]:
model = NNModel(encoded.shape[1], hidden_sizes=[15, 10], output_size=1)
to_train = False
if to_train:
    model.train(X_train, y_train, batch_size=128, epochs=50)
    model.save(model_path)
else:
    model.load(model_path)

In [5]:
print("Train data:")
model.test(X_train, y_train)
print()
print("Test data:")
model.test(X_test, y_test)

Train data:
Testing:
Accuracy: 83.23%
Average loss: 0.35073863427054885

Test data:
Testing:
Accuracy: 83.10%
Average loss: 0.3544103077786475


#### Select a datapoint
Pick a datapoint for which to generate counterfactuals.

In [6]:
selected_i = 0
# take one from the dataset, not yet encoded
in_data = input_data.iloc[selected_i]

prediction = int(model.predict(encoder.encode_datapoint(in_data)) >= 0)
print("Prediction:", prediction)
print("True target:", int(target[selected_i][0]))

Prediction: 0
True target: 0
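
The cell above thresholds the model output at 0. Assuming `model.predict` returns a raw logit (an assumption; the source does not state it), this matches the usual rule of thresholding the sigmoid probability at 0.5. A small self-contained check, with `sigmoid` and `predict_label` as illustrative helpers:

```python
import math


def sigmoid(z: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(logit: float) -> int:
    """Same rule as in the notebook cell: class 1 when the logit is non-negative."""
    return int(logit >= 0)


# The two decision rules agree on these sample logits.
for z in (-2.0, -0.1, 0.0, 0.1, 2.0):
    assert predict_label(z) == int(sigmoid(z) >= 0.5)
```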


#### Generate the counterfactuals 
Generate counterfactuals whose objective value lies within a given relative distance of the optimal objective value.

In [7]:
relative_distance_q = 1  # relative distance to the optimum within which to search for counterfactuals
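
One plausible reading of the relative distance, stated here as an assumption rather than the generator's documented behaviour, is that a candidate is kept when its objective value is at most `(1 + q)` times the best one. The function `within_relative_distance` and the toy values below are purely illustrative:

```python
def within_relative_distance(objectives, q):
    """Keep objective values no worse than (1 + q) times the smallest one.

    Assumes a minimisation objective with non-negative values.
    """
    best = min(objectives)
    return [value for value in objectives if value <= (1 + q) * best]


# Toy distances of candidate counterfactuals from the original datapoint.
candidates = [2.0, 2.5, 3.9, 4.0, 4.1, 7.0]
kept = within_relative_distance(candidates, q=1)
print(kept)
```

Under this reading, `q = 1` admits every candidate up to twice as far from the original datapoint as the closest counterfactual.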

In [8]:
cf_generator = CounterfactualGenerator(encoder)
counterfactuals = cf_generator.generate_close_counterfactuals(in_data, 
                                                              model,
                                                              relative_distance_q,
                                                              verbose=False)

Restricted license - for non-production use only - expires 2024-10-28


Set up the mapping from encoded values to the meaning of the categorical features.

In [9]:
string_vals = {'workclass': {0: 'Government', -3: 'Other/Unknown', -2: 'Private', -1: 'Self-Employed'},
               'education': {0: 'Assoc', -7: 'Bachelors', -6: 'Doctorate', -5: 'HS-grad', -4: 'Masters', -3: 'Prof-school', -2: 'School', -1: 'Some-college'},
               'marital_status': {0: 'Divorced', -4: 'Married', -3: 'Separated', -2: 'Single', -1: 'Widowed'},
               'occupation': {0: 'Blue-Collar', -5: 'Other/Unknown', -4: 'Professional', -3: 'Sales', -2: 'Service', -1: 'White-Collar'},
               'race': {0: 'Non-White', -1: 'White'},
               'gender': {0: 'Female', -1: 'Male'}}
explainer = Textualizer(string_vals, encoder)


labels = ["BAD", "GOOD"]
for expl in explainer.formulate_list(counterfactuals, labels):
    print(expl)

You got score BAD.
One way you could have got score GOOD instead is if:
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 45.24 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 47.44 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 52.18 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)


Use a margin to find stronger counterfactuals, not just ones that barely cross the decision boundary.

E.g., previously the required output of the NN could have been as small as $0.0001$; now it must be $\ge 1$.
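
The effect of the margin can be sketched as a stricter acceptance test on the network output. This assumes a single-logit model whose sign decides the class; `reaches_target` is a hypothetical helper, not part of `CounterfactualGenerator`:

```python
def reaches_target(logit: float, target: int, cf_margin: float = 0.0) -> bool:
    """Check whether a logit reaches the target class with at least cf_margin.

    For target 1 the logit must be >= cf_margin; for target 0 it must be
    <= -cf_margin. Ties at exactly zero are not treated specially here.
    """
    signed = logit if target == 1 else -logit
    return signed >= cf_margin


# Without a margin, a barely positive output already counts as class 1.
print(reaches_target(0.0001, target=1))               # accepted
# With cf_margin=1 the same output is rejected; 1.2 is accepted.
print(reaches_target(0.0001, target=1, cf_margin=1))  # rejected
print(reaches_target(1.2, target=1, cf_margin=1))     # accepted
```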

In [10]:
cf_margin = 1  # margin: required distance between the best and second-best class
counterfactuals = cf_generator.generate_close_counterfactuals(in_data, 
                                                              model,
                                                              relative_distance_q, 
                                                              verbose=False, 
                                                              cf_margin=cf_margin)

for expl in explainer.formulate_list(counterfactuals, labels):
    print(expl)


You got score BAD.
One way you could have got score GOOD instead is if:
  age had taken value 53.24 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 39.28 rather than 39.0 and 
  workclass had taken value Private (-2) rather than Government (0) and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 39.52 rather than 39.0 and 
  workclass had taken value Private (-2) rather than Government (0) and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 54.26 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 57.39 rather than 39.0 and 
  marital_status had tak

You can limit the maximum number of generated counterfactuals.

In [11]:
counterfactuals = cf_generator.generate_close_counterfactuals(in_data, 
                                                              model,
                                                              relative_distance_q, 
                                                              verbose=False, 
                                                              n_limit=5,
                                                              cf_margin=cf_margin)

for expl in explainer.formulate_list(counterfactuals, labels):
    print(expl)

You got score BAD.
One way you could have got score GOOD instead is if:
  age had taken value 53.24 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 39.28 rather than 39.0 and 
  workclass had taken value Private (-2) rather than Government (0) and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 39.52 rather than 39.0 and 
  workclass had taken value Private (-2) rather than Government (0) and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 54.26 rather than 39.0 and 
  marital_status had taken value Married (-4) rather than Single (-2)
Another way you could have got score GOOD instead is if:
  age had taken value 57.39 rather than 39.0 and 
  marital_status had tak