# Counterfactual explanations
Counterfactual explanations (CEs) are an important tool from the field of explainable artificial intelligence (XAI).
This notebook explains what CEs are, why they are important, and how they can be discovered.

## To begin with: What is *XAI*?
XAI is a subfield of AI concerned with developing methods to help us use AI systems in a fair, safe, and responsible manner.
To do that, XAI aims at *explaining* why an AI system (typically a machine learning model) behaves the way it does.
There are two main categories of XAI methods:

1 - Methods to understand why very large and complex models, like deep neural nets and large ensembles of decision trees, come to certain decisions/predictions.
These models are typically called *black-box* models.

2 - Methods to generate models that are so simple that they can be interpreted directly. Models of this type are, e.g., decision trees, rule sets, and equations found by symbolic regression.
These models are typically called *glass-box* models.

## A brief intro to CEs
CEs belong to the first category mentioned above: methods to explain black-box models.
Let us consider the case in which we have a model that is a classifier, i.e., our model is a function $$f : \Omega^d \rightarrow \mathbb{C},$$
where $\Omega^d$ is our space of $d$ features (some of which are numerical and thus in $\mathbb{R}$, some of which are categorical) while $\mathbb{C}$ is the space of classes (for example for a classifier of credit risk, $\mathbb{C} = \{ \textit{High risk}, \textit{Low risk} \}$).

Say $\mathbf{x} \in \Omega^d$ is a possible input for our classifier $f$.
$\mathbf{x}$ represents a user. For example, $\mathbf{x}$ can be:
$$\mathbf{x} = ( \textit{ age : 22, gender : Female, savings : 5.000\$, job : student, } \dots ). $$
For a given $\mathbf{x}$, $f$ will predict a certain class $c$ (e.g., "$\textit{High risk}$").
Now, a CE aims to answer the question:
"What **small change** is needed to $\mathbf{x}$ such that the new input $\mathbf{x}^\prime$ will cause $f$ to produce the desired class $c^\star$?" (e.g., $f(\mathbf{x}^\prime) = \textit{Low risk}$).

A CE is a possible answer to the question above.
For example, an answer could be that the user needs to increase their savings ($\textit{5.000\$} \rightarrow \textit{8.000\$}$) and change occupation ($\textit{student} \rightarrow \textit{part-time employed}$).
However, a CE may also reveal that $f$ changes its prediction based on ethnicity or gender (all other features remaining the same), meaning that $f$ learned harmful biases (e.g., from historical data) that perpetuate discrimination against minorities (unfairness).
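
To make the idea concrete, here is a minimal sketch of a brute-force CE search on the toy user above. The classifier `toy_classifier`, its thresholds, and the lists of candidate values are all made up for illustration; they are not the model or the algorithm used later in this notebook.

```python
from itertools import product

def toy_classifier(age, savings, job):
    """Hypothetical hand-written stand-in for f (not a trained model)."""
    enough_savings = savings >= 8000
    has_job = job != "student"
    return "Low risk" if (enough_savings and has_job) else "High risk"

# The unhappy user x from the text (only three features, for brevity)
x = {"age": 22, "savings": 5000, "job": "student"}

# Assumed candidate values for the features we allow to change
savings_options = [5000, 6000, 7000, 8000, 9000]
job_options = ["student", "part-time employed", "full-time employed"]

def num_changes(candidate):
    """Number of features in which the candidate differs from x."""
    return sum(candidate[k] != x[k] for k in x)

# Enumerate all candidates (age is kept fixed) and keep those with the desired class
candidates = [
    {"age": x["age"], "savings": s, "job": j}
    for s, j in product(savings_options, job_options)
]
valid = [c for c in candidates if toy_classifier(**c) == "Low risk"]

# Prefer few changed features, then the smallest change in savings
best = min(valid, key=lambda c: (num_changes(c), abs(c["savings"] - x["savings"])))
print("f(x)  =", toy_classifier(**x))
print("x'    =", best)
print("f(x') =", toy_classifier(**best))
```

Exhaustive enumeration only works for tiny search spaces like this one, which is why dedicated search algorithms are used in practice.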

Here's a simplified depiction in a 2D feature space:
![](https://drive.google.com/uc?export=view&id=1eQTEExQhIgi-2sEoCcyMELfKXACTrxAW)


### Seeking *small* changes to $x$

We seek *small* changes to $x$ to observe how $f$ behaves in the neighborhood of an input to gain information on what the decision boundary looks like in that area.
Moreover, a very interesting property of CEs is that they prescribe a possible intervention that the user may actually want to pursue!
Thus, we wish that the cost of intervention is small for the user.
This means that $\mathbf{x}^\prime$ needs to be as close as possible to $\mathbf{x}$, under some meaningful distance function $\delta$ that captures the cost of intervention.
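
One common way to write this down, sketched here in the notation used above (with $\mathbf{z}$ introduced as a generic candidate input), is as a constrained minimization:

$$\mathbf{x}^\prime \in \arg\min_{\mathbf{z} \in \Omega^d} \; \delta(\mathbf{z}, \mathbf{x}) \quad \text{subject to} \quad f(\mathbf{z}) = c^\star.$$

The minimizer need not be unique, hence the "$\in$": several different small changes may lead to the desired class.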

## Additional reading material
An excellent and beginner-friendly starting point is the book by Christoph Molnar: "Interpretable Machine Learning".
Here's a direct link to his chapter on CEs (co-written by Susanne Dandl): https://christophm.github.io/interpretable-ml-book/counterfactual.html

## Note: CEs vs adversarial examples
CEs are similar to adversarial examples (AEs). In both cases, one searches for changes to the input $x$ that trigger a change to the prediction made by $f$. However, CEs are intended to explain $f$ and not to fool it!

## Let's get started
In this notebook we simulate a financial credit risk situation, in which a black-box model (we will be using a random forest) has been trained to tell which users are at high or low risk of default (i.e., of becoming unable to pay back the credit given by the bank).
We will then use a CE discovery algorithm to see how a user can change their (unfavorable) situation (i.e., $f(\mathbf{x}) = \textit{High risk}$).

### Set up libraries & random seed

In [1]:
import numpy as np              # arrays and matrices
import pandas as pd             # data analysis library; its main data structures are Series (1-D) and DataFrame (2-D)
from sklearn.ensemble import RandomForestClassifier         # random forest classifier
from sklearn.model_selection import train_test_split        # function to split the data into train and test sets
# accuracy_score: fraction of samples that are classified correctly
# balanced_accuracy_score: arithmetic mean of the per-class recall
from sklearn.metrics import accuracy_score, balanced_accuracy_score # sklearn.metrics is the scikit-learn module with evaluation metrics

SEED = 42
np.random.seed(SEED) # for reproducibility

### Load data
We load the data set "South German Credit", which concerns learning a model of whether providing a financial credit to a user may be safe or risky.
See https://archive.ics.uci.edu/ml/datasets/South+German+Credit+%28UPDATE%29 for more info.

We get this data from the repo of CoGS, a baseline algorithm for the discovery of CEs (more details later).

In [2]:
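# clone the repo, enter it, and install it
# ! git clone https://github.com/marcovirgolin/cogs   # download CoGS (Counterfactual Genetic Search); uncomment if the repo has not been cloned yet
# enter the cloned directory (on "UsageError: Line magic function `%` not found", remove the space between % and cd)
%cd cogs
# install the package (runs setup.py)
! pip install .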


Load the data and preprocess it a bit

In [3]:
# Load data set & do some pre-processing
df = pd.read_csv("south_german_credit.csv")     # read the CSV file of the South German Credit data set
df.drop("account_check_status",axis=1,inplace=True) # drop the column account_check_status (axis=1 selects columns; inplace=True modifies df directly)
# Categorical features take values from a finite set of options, which are usually discrete and often strings.
# Few models can handle them natively; for most models they must first be converted into numerical features.
# Common ways to encode categorical features (see https://blog.csdn.net/qq_39780701/article/details/136989309):
# 1. Ordinal encoding: used when the categories have a clear order (rank) among them.
# 2. Label encoding: assigns each category a unique integer label, without implying any order.
# 3. One-hot encoding
# 4. Binary encoding
categorical_feature_names = ['purpose', 'personal_status_sex',
    'other_debtors', 'other_installment_plans', 'telephone', 'foreign_worker']      # list of categorical features
# Note: some other features are indices (categories in which the order matters), treated as numerical here for simplicity
label_name = 'credit_risk'              # name of the label column
desired_class = 1 # this means "low risk"

for feat in categorical_feature_names: # convert categorical features into integer codes (one unique integer per category)
    df[feat] = pd.Categorical(df[feat]) # convert the column to the pandas categorical type, which stores its distinct categories
    df[feat] = df[feat].cat.codes   # the .cat accessor requires the category dtype; .codes gives the integer code of each entry
feature_names = list(df.columns)    # column names as a list
feature_names.remove(label_name)    # remove label_name from the list of features

print("Num. features: {}, feature names: {}".format(len(feature_names), feature_names)) # print the number and the names of the features

# Prepare data to be in numpy format, as typically used to train a scikit-learn model
X = df[feature_names].to_numpy()    # feature columns as a numpy array
y = df[label_name].to_numpy().astype(int)   # label column as a numpy array of integers
# Assume we have a specific train & test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=SEED) # split into train and test sets, seeded with SEED
# test_size: fraction of the full data set used as the test set (default 0.25)
# random_state: seed of the random number generator; fixing it yields the same split on every run

Num. features: 19, feature names: ['duration_in_month', 'credit_history', 'purpose', 'credit_amount', 'savings', 'present_emp_since', 'installment_as_income_perc', 'personal_status_sex', 'other_debtors', 'present_res_since', 'property', 'age', 'other_installment_plans', 'housing', 'credits_this_bank', 'job', 'people_under_maintenance', 'telephone', 'foreign_worker']
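
As a small illustration of the integer coding done in the cell above, the sketch below encodes a made-up string column. It uses `np.unique` rather than `pd.Categorical`, on the assumption that both assign codes following the sorted order of the distinct values when there are no missing entries; the `purpose` values are invented for the example.

```python
import numpy as np

# Made-up values of a categorical feature (not taken from the data set)
purpose = np.array(["car", "radio", "car", "education", "radio"])

# categories: sorted distinct values; codes: position of each entry in categories
categories, codes = np.unique(purpose, return_inverse=True)

print("categories:", categories)
print("codes:     ", codes)

# The original strings can be recovered from the codes
decoded = categories[codes]
print("decoded:   ", decoded)
```

Note that such codes carry no meaningful order: "radio" is not "larger" than "car", which is why the search needs to know which features are categorical.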


### Train the model
Here we train the model, but in a practical situation we may assume that the model has already been trained (and is, e.g., property of the bank that assesses whether to award the credit or not).

We use random forest because it is quick and easy. However, you can use any model you like, such as a deep neural net.
As classically done in XAI literature, we call this model a *black-box model*.

In [4]:
# Train black-box model (bbm)
bbm = RandomForestClassifier(random_state=SEED, class_weight="balanced", min_samples_leaf=25)   # instantiate the random forest classifier
bbm.fit(X_train, y_train)   # fit the model, i.e., build a forest of decision trees from the training data
# note: we do not one-hot encode multi-category features here for simplicity

Let's check that the model has a decent accuracy
(Note: not really needed for the purpose of CEs)

In [5]:
# print the test-set accuracy (fraction of correctly classified samples) and balanced accuracy (mean of the per-class recall)
print("acc:{:.3f}, bal.-acc:{:.3f}".format(accuracy_score(y_test, bbm.predict(X_test)), balanced_accuracy_score(y_test, bbm.predict(X_test))))

acc:0.760, bal.-acc:0.691
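
The gap between the two numbers comes from class imbalance. The following sketch, with made-up labels and helper functions written for this example (they are not scikit-learn's implementation), shows how the two metrics differ when one class is rarer than the other.

```python
# Made-up labels: class 1 is twice as frequent as class 0
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]

def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of the recall obtained on each class."""
    recalls = []
    for c in sorted(set(y_true)):
        preds_for_class = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_class) / len(preds_for_class))
    return sum(recalls) / len(recalls)

acc = accuracy(y_true, y_pred)               # 5 of 6 correct
bal_acc = balanced_accuracy(y_true, y_pred)  # recalls: 0.5 (class 0) and 1.0 (class 1)
print("acc:{:.3f}, bal.-acc:{:.3f}".format(acc, bal_acc))
```

A single mistake on the rare class costs more in balanced accuracy than in plain accuracy, because each class contributes equally to the average.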


### Pick the user
Next, we simulate a user for whom the decision of the black-box model is the undesired one.
For example, let's pick the last point in the test set for which the prediction is unfavourable.

In [6]:
# Let's consider, e.g., the last test sample for which an undesired decision is given
p = bbm.predict(X_test) # predicted classes for X_test
# print(p)
# print(np.argwhere(p != desired_class).shape)
# print(np.argwhere(p != desired_class))
# print(np.argwhere(p != desired_class).squeeze().shape)
# np.argwhere(p != desired_class) returns the indices of the test samples whose predicted class is not desired_class
# np.squeeze() removes the axes of length 1 from the array shape; axes of any other length are left untouched
idx = np.argwhere(p != desired_class).squeeze()[-1]   # index of the last sample whose predicted class is not desired_class

x = X_test[idx] # this is our unhappy user!

# show features of this user
print("Description of x:")
for i, feat_name in enumerate(feature_names):     # iterate over the feature names
  print(" ", feat_name+" "*(30-len(feat_name)), x[i])

Description of x:
  duration_in_month              48
  credit_history                 0
  purpose                        8
  credit_amount                  3844
  savings                        2
  present_emp_since              4
  installment_as_income_perc     4
  personal_status_sex            2
  other_debtors                  0
  present_res_since              4
  property                       4
  age                            34
  other_installment_plans        2
  housing                        3
  credits_this_bank              1
  job                            2
  people_under_maintenance       1
  telephone                      0
  foreign_worker                 1


### CE discovery algorithm
We use the library CoGS to find a CE.
CoGS (Counterfactual Genetic Search) is a library that is relatively quick to run and easy to use, and that makes no assumptions on the black-box model $f$ (e.g., it requires neither linearity nor gradients to work).
Moreover, CoGS can handle both numerical and categorical features.


### Setting up the search space
To set up the space in which CoGS searches, we must provide:
1) Intervals within which the search takes place (for categorical features, which categories are possible)
2) The indices of categorical features (for CoGS to know which are categorical and which are numerical)
3) Optional plausibility constraints to ensure that the discovered CE can be realized (e.g., the age of a person cannot decrease)

All three must be provided as lists that follow the same order, namely the order in which the features appear in `X_train` and `X_test`.

In [7]:
# Set up search bounds
feature_intervals = list()  # empty list that will hold the search interval of each feature
for i, feat in enumerate(feature_names):    # iterate over the feature names
  if feat in categorical_feature_names:     # categorical feature
    interval_i = np.unique(X_train[:,i])    # sorted distinct values of this column, i.e., the possible categories
  else:                                     # numerical feature
    interval_i = (np.min(X_train[:,i]), np.max(X_train[:,i])) # tuple with the minimum and maximum of this column
  feature_intervals.append(interval_i)  # add the interval to feature_intervals

# Set up which feature indices are categorical
indices_categorical_features = [i for i, feat in enumerate(feature_names) if feat in categorical_feature_names] # column indices of the categorical features

# Let's also set up a plausibility constraint for the feature "age" (can only increase) 
# and one for foreign worker (cannot change, must stay equal to what it is) 
pcs = ['>=' if feat=='age' else ('=' if feat=='foreign_worker' else None) for feat in feature_names]  # '>=' for age, '=' for foreign_worker, None (unconstrained) otherwise
# for feat in feature_names
#   if feat=='age'
#     '>=' 
#   else 
#     if feat=='foreign_worker'
#       '='
#     else 
#       None
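
To illustrate what such a list of constraints expresses, here is a minimal sketch of a check that a candidate respects them. The helper `satisfies_constraints` and the three-feature toy user are hypothetical and written for this example; this is not how CoGS enforces constraints internally.

```python
def satisfies_constraints(x, candidate, constraints):
    """Return True if candidate respects every plausibility constraint w.r.t. x.

    constraints[i] is '>=' (feature i may only increase), '=' (feature i must
    stay the same), or None (feature i is unconstrained).
    """
    for original, new, pc in zip(x, candidate, constraints):
        if pc == ">=" and new < original:
            return False
        if pc == "=" and new != original:
            return False
    return True

# Toy setting with three features only (values are made up)
toy_names = ["age", "savings", "foreign_worker"]
toy_pcs = [">=" if n == "age" else ("=" if n == "foreign_worker" else None) for n in toy_names]
x_toy = [34, 2, 1]

ok = satisfies_constraints(x_toy, [36, 3, 1], toy_pcs)             # age increases: allowed
younger = satisfies_constraints(x_toy, [30, 3, 1], toy_pcs)        # age decreases: violation
changed_status = satisfies_constraints(x_toy, [34, 3, 0], toy_pcs) # foreign_worker changes: violation
print(ok, younger, changed_status)
```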

## Hyper-parameters
We can now set up the hyper-parameters of CoGS, and then run the search!
The comments in the code below explain what they mean.

As distance $\delta$, here we use Gower's distance, which takes into account both numerical differences and categorical mismatches (see https://christophm.github.io/interpretable-ml-book/counterfactual.html#method-by-dandl-et-al.).
In a genetic algorithm, the quality of solutions is measured in terms of *fitness*, where normally higher is better.
Thus the fitness used here is set to be the opposite of Gower's distance.
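
As a rough illustration, the sketch below computes a simplified Gower distance and the corresponding fitness for a toy pair of points. The function, the feature ranges, and the values are assumptions made for this example; CoGS's actual `gower_fitness_function` may include additional terms (e.g., to account for whether the desired class is reached).

```python
def gower_distance(a, b, num_ranges, categorical_idx):
    """Simplified Gower distance between two points a and b.

    Numerical features contribute |a_i - b_i| / range_i, categorical features
    contribute 0 (same category) or 1 (different category); the result is the
    mean over all features.
    """
    per_feature = []
    for i, (a_i, b_i) in enumerate(zip(a, b)):
        if i in categorical_idx:
            per_feature.append(0.0 if a_i == b_i else 1.0)
        else:
            per_feature.append(abs(a_i - b_i) / num_ranges[i])
    return sum(per_feature) / len(per_feature)

# Toy example with features (age, savings, foreign_worker); values are made up
x_toy = [34, 2.0, 1]
x_prime_toy = [34, 2.5, 0]
num_ranges = {0: 56.0, 1: 4.0}   # assumed (max - min) of the numerical features
categorical_idx = {2}            # foreign_worker is categorical

dist = gower_distance(x_toy, x_prime_toy, num_ranges, categorical_idx)
fitness = -dist                  # higher fitness means closer to x
print("distance:", dist, "fitness:", fitness)
```

Here the per-feature contributions are 0 (age), 0.125 (savings), and 1 (foreign worker), so the distance is their mean.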

In [8]:
from cogs.evolution import Evolution
from cogs.fitness import gower_fitness_function         # a classic fitness function for counterfactual explanations

cogs = Evolution(
        ### hyper-parameters of the problem (required!) ###
        x=x,  # the starting point aka unhappy user
        fitness_function=gower_fitness_function,  # a classic fitness function for counterfactual explanations
        fitness_function_kwargs={'blackbox':bbm,'desired_class': desired_class},  # these must be passed for the fitness function to work
        feature_intervals=feature_intervals,  # intervals within which the search operates
        indices_categorical_features=indices_categorical_features,  # the indices of the features that are categorical
        plausibility_constraints=pcs, # can be "None" if no constraints need to be set
        ### hyper-parameters of the evolution (all optional) ###
        evolution_type='classic',       # the type of evolution, classic works quite well
        population_size=1000,           # how many candidate counterfactual examples to evolve simultaneously
        n_generations=25,               # number of iterations for the evolution
        selection_name='tournament_4',  # selection pressure (tournament selection; the number is presumably the tournament size)
        init_temperature=0.8, # how "far" from x we initialize
        num_features_mutation_strength=0.25, # strength of random mutations for numerical features
        num_features_mutation_strength_decay=0.5, # decay for the hyper-param. above
        num_features_mutation_strength_decay_generations=[10,15,20], # when to apply the decay
        ### other settings ###
        verbose=True  # logs progress at every generation
)
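
For readers new to genetic algorithms, the sketch below shows what tournament selection generally looks like, assuming that `tournament_4` denotes tournaments of size 4. The function and the toy population are written for this example and are not taken from the CoGS code base.

```python
import random

def tournament_selection(population, fitnesses, tournament_size, rng):
    """Select len(population) individuals; each one is the fittest of a random tournament."""
    selected = []
    for _ in range(len(population)):
        # draw tournament_size distinct contestants at random
        contestants = rng.sample(range(len(population)), tournament_size)
        # the contestant with the highest fitness wins the tournament
        winner = max(contestants, key=lambda i: fitnesses[i])
        selected.append(population[winner])
    return selected

rng = random.Random(42)
population = ["a", "b", "c", "d", "e", "f"]       # placeholders for candidate counterfactuals
fitnesses = [-0.9, -0.5, -0.1, -0.7, -0.3, -0.6]  # made-up values; higher is better

parents = tournament_selection(population, fitnesses, tournament_size=4, rng=rng)
print(parents)
```

Larger tournaments mean stronger selection pressure: with six individuals and tournaments of size 4, the three worst individuals can never win, since every tournament contains at least one of the three best.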

Ready to run!

In [9]:
cogs.run()  # run CoGS (Counterfactual Genetic Search)

generation: 1 best fitness: -0.23960241235590057 avg. fitness: -0.5655859030567522
generation: 2 best fitness: -0.23960241235590057 avg. fitness: -0.45144422711053095
generation: 3 best fitness: -0.23960241235590057 avg. fitness: -0.3932625759375893
generation: 4 best fitness: -0.18999766508484883 avg. fitness: -0.3424200187317699
generation: 5 best fitness: -0.18999766508484883 avg. fitness: -0.29326014109826054
generation: 6 best fitness: -0.10306974074439322 avg. fitness: -0.24382975966224887
generation: 7 best fitness: -0.06996051106514789 avg. fitness: -0.1966318789656929
generation: 8 best fitness: -0.059808815825190084 avg. fitness: -0.15648801037922425
generation: 9 best fitness: -0.03346219281443785 avg. fitness: -0.12237121200700284
generation: 10 best fitness: -0.029642499013702472 avg. fitness: -0.09555415482021212
generation: 11 best fitness: -0.029642499013702472 avg. fitness: -0.0736235690606997
generation: 12 best fitness: -0.029642499013702472 avg. fitness: -0.05854284

## Counterfactual explanation
Now that CoGS has terminated, we can look at its result.
The field `cogs.elite` contains the best-found counterfactual example, i.e., a point `x'` for which `bbm(x')=desired_class`.
The respective counterfactual explanation is simply `x'-x` (there exist more involved definitions of counterfactual explanations, here we use this simple one).
Let's take a look at what the user needs to do to obtain the desired class, i.e., be granted the loan.

In [13]:
# categorical is the pandas data type for categorical variables (similar to factors in R), e.g., gender or blood type
# docs: https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html
from pandas.core.arrays import categorical  # note: this import is not used in the rest of this cell
# Get the best-found counterfactual example (called elite)
cf_example = cogs.elite               # the best-found counterfactual example x'
cf_explanation = cogs.elite - x       # the respective counterfactual explanation is simply `x'-x`
# print(x)
# print(cf_example)

# Show counterfactual explanation
if bbm.predict([cf_example])[0] == desired_class:   # the counterfactual example obtains the desired class
  print("Success! Here's the explanation:")
  for i, feat in enumerate(feature_names):      # iterate over the features
    if cf_explanation[i] != 0:                  # this feature differs between x and x'
      print(" Feature '{}' should change from '{}' to '{}'".format(feat, np.round(x[i],3), np.round(cf_example[i],3)))  # print the feature name, its original value, and its target value
else:
  print("Failed to find a counterfactual explanation for the desired class :(")

[  48    0    8 3844    2    4    4    2    0    4    4   34    2    3
    1    2    1    0    1]
[4.80000000e+01 0.00000000e+00 8.00000000e+00 3.84400000e+03
 2.50040711e+00 4.00000000e+00 4.00000000e+00 2.00000000e+00
 0.00000000e+00 4.00000000e+00 4.00000000e+00 3.40000000e+01
 2.00000000e+00 3.00000000e+00 1.00000000e+00 2.00000000e+00
 1.00000000e+00 0.00000000e+00 1.00000000e+00]
Success! Here's the explanation:
 Feature 'savings' should change from '2' to '2.5'


# Exercise idea
Here's an idea for an exercise.
One of the features is called `foreign_worker`. This may be considered a sensitive feature: should $f$ be allowed to discriminate based only on that?

Try to use CoGS to check whether, for one or more users `x` who are foreign workers and for whom `bbm` predicts `high risk`, a CE can be found that recommends not being a foreign worker.
To do that, you can set the plausibility constraints to "`=`" for all features except for `foreign_worker`.
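
As a starting point, the constraint list for this exercise could be built as in the sketch below. The name `pcs_exercise` is made up; the feature names are copied from the output printed earlier in this notebook.

```python
feature_names = [
    'duration_in_month', 'credit_history', 'purpose', 'credit_amount', 'savings',
    'present_emp_since', 'installment_as_income_perc', 'personal_status_sex',
    'other_debtors', 'present_res_since', 'property', 'age',
    'other_installment_plans', 'housing', 'credits_this_bank', 'job',
    'people_under_maintenance', 'telephone', 'foreign_worker',
]

# Freeze every feature with '=' except foreign_worker, which is left unconstrained (None)
pcs_exercise = [None if feat == 'foreign_worker' else '=' for feat in feature_names]

for feat, pc in zip(feature_names, pcs_exercise):
    print(" ", feat + " " * (30 - len(feat)), pc)
```

With these constraints, any CE that is found can only differ from `x` in `foreign_worker`, so finding one would mean that the model's decision depends on that feature alone.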