# Counterfactual generation for classification

We show how to generate counterfactuals using the code in this repository.

In [1]:
import sys
from pathlib import Path
import joblib
import json

PROJECT_ROOT = Path().resolve().parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

from src import load_dataset, Counterfactuals

We start by loading sample data from scikit-learn. For classification, we use the Adult dataset and try to predict whether an individual's income is above or below $50k/yr.

The code that loads this dataset can be found in `src/data_loader/data.py`; it also preprocesses the data, mostly by encoding categorical values.

Ordinal encoding is preferred so that the number of columns does not grow too much. Only the column `sex` has been one-hot encoded, and both resulting columns, `sex_Male` and `sex_Female`, are kept in the data. This choice was made to showcase how counterfactuals can be generated while keeping the one-hot encoding consistent. We will see in more detail how to enforce co-mutation: if we start with `sex_Male=1` and `sex_Female=0` and the counterfactual mutates `sex_Male` to 0, we need to be sure `sex_Female` becomes 1.
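To make the co-mutation idea concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the function name `co_mutate` and the dictionary-based row are assumptions made for illustration only.

```python
def co_mutate(row, changed, new_value, prefix):
    """Set row[changed] = new_value and keep the one-hot group consistent.

    `prefix` identifies the one-hot group (for example "sex_").
    Hypothetical helper for illustration; not part of the repository.
    """
    mutated = dict(row)  # work on a copy, leave the input untouched
    mutated[changed] = new_value
    siblings = [name for name in row if name.startswith(prefix) and name != changed]

    if new_value == 1:
        # Activating one column switches every sibling off.
        for name in siblings:
            mutated[name] = 0.0
    elif siblings and not any(mutated[name] == 1 for name in siblings):
        # Deactivating the active column: activate a sibling so the group
        # still sums to 1. With two columns there is only one choice; for
        # larger groups picking the first sibling is an arbitrary choice.
        mutated[siblings[0]] = 1.0
    return mutated


row = {"age": 25, "sex_Female": 0.0, "sex_Male": 1.0}
flipped = co_mutate(row, changed="sex_Male", new_value=0.0, prefix="sex_")
print(flipped)
```

The point of the sketch is only that a mutation on one column of a one-hot group must be followed by a repair step on the other columns of that group.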

In [9]:
_, X, y = load_dataset("adult", preprocess=True, include_description=True)

Index(['age', 'workclass', 'fnlwgt', 'education', 'education-num',
       'marital-status', 'occupation', 'relationship', 'race', 'sex',
       'capital-gain', 'capital-loss', 'hours-per-week', 'native-country'],
      dtype='object')
Stored categorical mappings in: artifacts/categorical_encodings.json
{'age': 'Age of the individual',
 'capital-gain': 'Capital gain in USD',
 'capital-loss': 'Capital loss in USD',
 'education': 'Highest level of education achieved',
 'education-num': 'Education level as an integer',
 'fnlwgt': 'Final weight — estimation of census population',
 'hours-per-week': 'Average working hours per week',
 'marital-status': 'Marital status',
 'native-country': 'Country of origin',
 'occupation': 'Occupation',
 'race': 'Race',
 'relationship': 'Relationship status',
 'sex': 'Gender',
 'target': "Income level, '>50K' or '<=50K'",
 'workclass': 'Represents the employment status'}


The preprocessed data:

In [10]:
X

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male
0,25,7,0.0,0.0,40.0,3,4,3,2,38,0.0,1.0
1,38,9,0.0,0.0,50.0,3,2,0,4,38,0.0,1.0
2,28,12,0.0,0.0,40.0,1,2,0,4,38,0.0,1.0
3,44,10,7688.0,0.0,40.0,3,2,0,2,38,0.0,1.0
4,18,10,0.0,0.0,30.0,3,4,3,4,38,1.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
48837,27,12,0.0,0.0,38.0,3,2,5,4,38,1.0,0.0
48838,40,9,0.0,0.0,40.0,3,2,0,4,38,0.0,1.0
48839,58,9,0.0,0.0,40.0,3,6,4,4,38,1.0,0.0
48840,22,9,0.0,0.0,20.0,3,4,3,4,38,0.0,1.0


The raw data is stored in `_`, as we don't need it for the demo.

The model is an `XGBClassifier`. No hyperparameter tuning has been done, as an optimal model is not necessary for this demo.

In [11]:
ARTIFACT_DIR = Path("../artifacts")
ARTIFACT_DIR.mkdir(exist_ok=True)

model = joblib.load(ARTIFACT_DIR / "classification_model.pkl")

Just for completeness, here are the training metrics for the model:

In [17]:
metrics_path = ARTIFACT_DIR / "classification_report.json"

# Load the stored classification report
with open(metrics_path) as f:
    model_metrics = json.load(f)

model_metrics

{'0': {'precision': 0.8959938366718028,
  'recall': 0.9396714247239429,
  'f1-score': 0.9173130011831208,
  'support': 3713.0},
 '1': {'precision': 0.7739656912209889,
  'recall': 0.6544368600682594,
  'f1-score': 0.7092001849283402,
  'support': 1172.0},
 'accuracy': 0.8712384851586489,
 'macro avg': {'precision': 0.8349797639463958,
  'recall': 0.7970541423961012,
  'f1-score': 0.8132565930557305,
  'support': 4885.0},
 'weighted avg': {'precision': 0.8667170738328358,
  'recall': 0.8712384851586489,
  'f1-score': 0.8673829662495275,
  'support': 4885.0}}

We see the model clearly favors the class with larger support (class 0): recall is about 0.94 for class 0 but only about 0.65 for class 1. This could be improved with hyperparameter tuning, up-/down-sampling and/or regularization, but that is outside the scope of this demo.

What might be useful for interpreting results, though, is a look at the categorical encodings:
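The imbalance can be quantified directly from the report. The sketch below copies the per-class recall and support values from the output above and recomputes the recall gap, the macro average and the support-weighted average; the variable names are illustrative.

```python
# Per-class values copied from the classification report shown above.
report = {
    "0": {"recall": 0.9396714247239429, "support": 3713.0},
    "1": {"recall": 0.6544368600682594, "support": 1172.0},
}

total_support = sum(cls["support"] for cls in report.values())

# How much better the model recalls the majority class than the minority class.
recall_gap = report["0"]["recall"] - report["1"]["recall"]

# Share of the evaluation set that belongs to the minority class.
minority_share = report["1"]["support"] / total_support

# Macro average treats both classes equally; the weighted average follows support.
macro_recall = sum(cls["recall"] for cls in report.values()) / len(report)
weighted_recall = (
    sum(cls["recall"] * cls["support"] for cls in report.values()) / total_support
)

print(f"recall gap:      {recall_gap:.3f}")
print(f"minority share:  {minority_share:.3f}")
print(f"macro recall:    {macro_recall:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

The weighted recall coincides with the reported accuracy, which is why accuracy alone hides the weaker performance on class 1.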

In [18]:
ENCODING_PATH = ARTIFACT_DIR / "categorical_encodings.json"

with open(ENCODING_PATH) as f:
    CAT_ENCODINGS = json.load(f)

# Print the label-to-code mapping of each encoded column
for mapping in CAT_ENCODINGS.values():
    print(mapping)

{'Federal-gov': 0, 'Local-gov': 1, 'Never-worked': 2, 'Private': 3, 'Self-emp-inc': 4, 'Self-emp-not-inc': 5, 'State-gov': 6, 'Without-pay': 7}
{'Divorced': 0, 'Married-AF-spouse': 1, 'Married-civ-spouse': 2, 'Married-spouse-absent': 3, 'Never-married': 4, 'Separated': 5, 'Widowed': 6}
{'Husband': 0, 'Not-in-family': 1, 'Other-relative': 2, 'Own-child': 3, 'Unmarried': 4, 'Wife': 5}
{'Amer-Indian-Eskimo': 0, 'Asian-Pac-Islander': 1, 'Black': 2, 'Other': 3, 'White': 4}
{'Cambodia': 0, 'Canada': 1, 'China': 2, 'Columbia': 3, 'Cuba': 4, 'Dominican-Republic': 5, 'Ecuador': 6, 'El-Salvador': 7, 'England': 8, 'France': 9, 'Germany': 10, 'Greece': 11, 'Guatemala': 12, 'Haiti': 13, 'Holand-Netherlands': 14, 'Honduras': 15, 'Hong': 16, 'Hungary': 17, 'India': 18, 'Iran': 19, 'Ireland': 20, 'Italy': 21, 'Jamaica': 22, 'Japan': 23, 'Laos': 24, 'Mexico': 25, 'Nicaragua': 26, 'Outlying-US(Guam-USVI-etc)': 27, 'Peru': 28, 'Philippines': 29, 'Poland': 30, 'Portugal': 31, 'Puerto-Rico': 32, 'Scotland'
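These mappings can be inverted to turn encoded rows back into readable labels. The sketch below copies the four mappings that are printed in full above; the column names used as dictionary keys are an assumption (the output shows only the mappings, not their keys), and `decode_row` is a hypothetical helper, not part of the repository.

```python
# Mappings copied from the printed output; the keys (column names) are assumed.
shown_encodings = {
    "workclass": {"Federal-gov": 0, "Local-gov": 1, "Never-worked": 2, "Private": 3,
                  "Self-emp-inc": 4, "Self-emp-not-inc": 5, "State-gov": 6,
                  "Without-pay": 7},
    "marital_status": {"Divorced": 0, "Married-AF-spouse": 1, "Married-civ-spouse": 2,
                       "Married-spouse-absent": 3, "Never-married": 4,
                       "Separated": 5, "Widowed": 6},
    "relationship": {"Husband": 0, "Not-in-family": 1, "Other-relative": 2,
                     "Own-child": 3, "Unmarried": 4, "Wife": 5},
    "race": {"Amer-Indian-Eskimo": 0, "Asian-Pac-Islander": 1, "Black": 2,
             "Other": 3, "White": 4},
}


def decode_row(row, encodings):
    """Replace integer codes with their labels for every column we have a mapping for."""
    decoded = {}
    for column, value in row.items():
        mapping = encodings.get(column)
        if mapping is None:
            decoded[column] = value  # numeric column or unknown mapping: keep as-is
            continue
        inverse = {code: label for label, code in mapping.items()}
        decoded[column] = inverse.get(value, value)
    return decoded


# Categorical codes of the first row of the preprocessed data shown earlier.
first_row = {"age": 25, "workclass": 3, "marital_status": 4, "relationship": 3, "race": 2}
print(decode_row(first_row, shown_encodings))
```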

## Generating counterfactuals

Typically, one has a model response and asks: "what would need to be different in order to get a specific result?"

In this case, where we predict whether someone is above or below a certain income threshold, we may ask "what would this person need to do, according to the model, in order to earn more/less?"

Counterfactuals provide the answer.

We start with a data instance whose outcome we want to flip:

In [19]:
instance = X.iloc[0:1].copy()
instance_outcome = y.iloc[0:1]
instance.loc[:, 'outcome'] = instance_outcome.values

In [20]:
instance

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
0,25,7,0.0,0.0,40.0,3,4,3,2,38,0.0,1.0,0


We see this person does not earn more than $50k/yr (outcome=0).

So, what would need to be different for him to earn more?

We initialize the `Counterfactuals` class:

In [21]:
cf = Counterfactuals(X=X, y=y, model=model)

This repository currently provides two ways to explore counterfactuals.

First, there may be similar data instances that already have a different outcome. Such an instance is called a prototype: an existing data instance with the desired model outcome.

We can then search our data for instances that are close in feature space and see what we find:

In [26]:
prototypes = cf.get_counterfactuals(instance, n_counterfactuals=5, method="prototypes", desired_class=1)
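As a rough illustration of what a prototype search does, the sketch below keeps only the rows with the desired outcome and ranks them by Euclidean distance to the instance. This is an assumption about the general technique, not the repository's code: the function name, the toy data and the distance metric are all illustrative, and a real implementation would typically scale the features first.

```python
import math


def find_prototypes(instance, data, outcomes, desired_class, n=5):
    """Return the indices of the n closest rows whose outcome is desired_class."""
    candidates = []
    for index, (row, outcome) in enumerate(zip(data, outcomes)):
        if outcome != desired_class:
            continue  # only existing rows with the desired outcome qualify
        candidates.append((math.dist(row, instance), index))
    candidates.sort()  # closest first
    return [index for _, index in candidates[:n]]


# Toy rows: [age, education_num, hours_per_week]; values are made up.
toy_data = [[25, 7, 40], [20, 8, 35], [23, 10, 40], [58, 9, 40], [36, 7, 40]]
toy_outcomes = [0, 1, 1, 0, 1]
toy_instance = [25, 7, 40]

print(find_prototypes(toy_instance, toy_data, toy_outcomes, desired_class=1, n=2))
```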

In [28]:
instance

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
0,25,7,0.0,0.0,40.0,3,4,3,2,38,0.0,1.0,0


In [27]:
prototypes

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
5953,20,8,0.0,0.0,35.0,3,4,3,2,38,0.0,1.0,1
24914,23,7,14344.0,0.0,40.0,3,4,3,1,39,0.0,1.0,1
7720,23,10,0.0,0.0,40.0,3,4,3,1,38,0.0,1.0,1
31169,36,7,13550.0,0.0,40.0,3,4,1,2,38,0.0,1.0,1
28800,33,10,8614.0,0.0,40.0,3,4,1,2,38,0.0,1.0,1


We see that there are some instances with different ages, capital gains or working hours per week. This already provides some insight.
However, some counterfactuals also have a different race, native country or relationship status compared to our instance.

One may not want to change some features (relationship status) or may simply not be able to (race or native country).
For these cases, we can pass the parameter `fix_vars`, which locks certain feature values:

In [29]:
cf.get_counterfactuals(instance, n_counterfactuals=5, method="prototypes", desired_class=1, fix_vars=['relationship', 'race'])




Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
5953,20,8,0.0,0.0,35.0,3,4,3,2,38,0.0,1.0,1
1900,30,9,99999.0,0.0,40.0,3,4,3,2,38,1.0,0.0,1
47392,22,10,99999.0,0.0,55.0,5,4,3,2,38,0.0,1.0,1
42771,39,10,0.0,0.0,50.0,3,2,3,2,38,1.0,0.0,1


We see that relationship status and race are now fixed. However, a different marital status or sex might be suggested. This may be exposing some bias in the model, the data, or both, but we'll accept it for the moment and focus on the warning that was thrown: 5 counterfactuals were requested, but only 4 could be found in the data.

This is a limitation of this approach: it only returns solutions that already exist in the data. But surely, there must be other combinations. There are. And don't call me Shirley.
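The effect of locking features can be sketched as a filter applied before ranking: only rows that match the instance on every fixed feature remain candidates, so fewer rows than requested may survive. The helper below is hypothetical (its name, the toy rows and the warning text are assumptions, not the repository's behavior).

```python
import math
import warnings


def find_fixed_prototypes(instance, rows, desired_class, fix_vars, n=5):
    """Closest rows with the desired outcome that agree with `instance` on fix_vars."""
    candidates = [
        row for row in rows
        if row["outcome"] == desired_class
        and all(row[var] == instance[var] for var in fix_vars)
    ]
    features = [name for name in instance if name != "outcome"]
    # Simplification: plain Euclidean distance over all feature columns.
    candidates.sort(
        key=lambda row: math.dist(
            [row[name] for name in features], [instance[name] for name in features]
        )
    )
    if len(candidates) < n:
        warnings.warn(f"Requested {n} counterfactuals, found only {len(candidates)}.")
    return candidates[:n]


# Toy data with made-up values.
instance = {"age": 25, "relationship": 3, "race": 2, "outcome": 0}
rows = [
    {"age": 20, "relationship": 3, "race": 2, "outcome": 1},
    {"age": 23, "relationship": 3, "race": 1, "outcome": 1},  # race differs
    {"age": 36, "relationship": 1, "race": 2, "outcome": 1},  # relationship differs
    {"age": 31, "relationship": 3, "race": 2, "outcome": 1},
    {"age": 40, "relationship": 3, "race": 2, "outcome": 0},  # wrong outcome
]

found = find_fixed_prototypes(instance, rows, 1, ["relationship", "race"], n=3)
print([row["age"] for row in found])
```

In this toy run only two rows satisfy the constraints, so the function warns and returns fewer results than requested.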

We can use a genetic algorithm to generate data that may not be present already, but that is realistic enough and has the desired outcome. Links to the genetic algorithm used can be found in the README for this repository. Briefly, to ensure the generated data is realistic, we use four different fitness functions:

- *outcome fitness*: the predicted outcome of the generated data must be the one we want
- *sparsity fitness*: the fewer features we change, the better
- *point likelihood fitness*: the generated point must be close to the data distribution of the actual data
- *distance fitness*: the generated instance must be close to the base instance in feature space
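To give a feel for how four such terms might interact, here is a deliberately simple sketch. The formulas, the k-nearest-neighbour proxy for point likelihood and the product used to combine the terms are all assumptions made for illustration; the algorithm referenced in the README may define and combine them differently.

```python
import math


def outcome_fitness(prob_desired):
    """Higher when the model gives more probability to the desired class."""
    return prob_desired


def sparsity_fitness(candidate, base):
    """Fraction of features left unchanged (1.0 means nothing changed)."""
    unchanged = sum(1 for c, b in zip(candidate, base) if c == b)
    return unchanged / len(base)


def likelihood_fitness(candidate, data, k=3):
    """Proxy for point likelihood: closeness to the k nearest observed rows."""
    nearest = sorted(math.dist(candidate, row) for row in data)[:k]
    return 1.0 / (1.0 + sum(nearest) / len(nearest))


def distance_fitness(candidate, base):
    """Higher when the candidate stays close to the base instance."""
    return 1.0 / (1.0 + math.dist(candidate, base))


def total_fitness(candidate, base, data, prob_desired):
    """Combine the four terms; a product is one possible (assumed) choice."""
    return (
        outcome_fitness(prob_desired)
        * sparsity_fitness(candidate, base)
        * likelihood_fitness(candidate, data)
        * distance_fitness(candidate, base)
    )


# Toy rows: [age, education_num, hours_per_week]; values are made up.
base = [25, 7, 40]
data = [[25, 7, 40], [20, 8, 35], [23, 10, 40]]
near = [25, 8, 40]   # one feature changed slightly
far = [60, 16, 80]   # every feature changed a lot

print(total_fitness(near, base, data, prob_desired=0.9))
print(total_fitness(far, base, data, prob_desired=0.9))
```

With the same predicted probability, the candidate that changes fewer features and stays closer to both the base instance and the observed data scores higher.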

In [31]:
genetic = cf.get_counterfactuals(instance, 5, "genetic", desired_class=1, fix_vars=['relationship', 'race'], one_hot_encoded=['sex_'])
genetic

TypeError: Cannot interpret 'CategoricalDtype(categories=[0, 1, 2, 3, 4, 5], ordered=False, categories_dtype=int64)' as a data type

In [8]:
prototypes = cf.get_counterfactuals(instance, 5, "prototypes", desired_class=1)

In [9]:
genetic

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
0,25,10,11160.768924,0.0,40.0,0,4,3,4,38,0.0,1.0,0.999711
1,34,9,6135.722058,0.0,21.923877,3,2,0,2,38,0.0,1.0,0.945842
2,36,13,12123.919138,46.587135,23.520548,3,2,0,4,38,0.0,1.0,0.999804
3,25,9,5592.674458,0.0,16.539723,3,2,0,2,38,0.0,1.0,0.878579
4,32,13,6210.4793,96.576854,13.866313,3,2,3,4,38,0.0,1.0,0.782531


In [10]:
prototypes

Unnamed: 0,age,education_num,capital_gain,capital_loss,hours_per_week,workclass,marital_status,relationship,race,native_country,sex_Female,sex_Male,outcome
5953,20,8,0.0,0.0,35.0,3,4,3,2,38,0.0,1.0,1
24914,23,7,14344.0,0.0,40.0,3,4,3,1,39,0.0,1.0,1
7720,23,10,0.0,0.0,40.0,3,4,3,1,38,0.0,1.0,1
31169,36,7,13550.0,0.0,40.0,3,4,1,2,38,0.0,1.0,1
28800,33,10,8614.0,0.0,40.0,3,4,1,2,38,0.0,1.0,1
