## Setting Up and Importing Libraries
In this step, we import the required libraries and move the working directory one level up, so that the custom `src` modules and the relative `config/` paths used later resolve correctly.

In [1]:
# Import necessary libraries
import pandas as pd
import os
import sys
import json

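# Move one level up so that the `src` package and the `config/` folder resolve.
# Note: re-running this cell moves up one more level each time.
os.chdir(os.path.join(os.getcwd(), '..'))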
print(os.getcwd())

from src.cfsearch import CFsearch
from src.dataset import Dataset
from src.explainable_model import ExplainableModel
from src.ceinstance.instance_sampler import CEInstanceSampler
from src.config import Config
from src.transformer import Transformer
from src.ceinstance.instance_factory import InstanceFactory
from src import load_datasets

/home/rita/TRUST_AI/trustframework/trustCE
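
Because the `os.chdir` call above is relative to the current directory, running the cell twice ends up in a different place. One possible alternative, sketched here under the assumption that the project root is the nearest parent directory containing a `src` folder, is to locate that root explicitly and add it to `sys.path` only once. The helper names `find_project_root` and `add_to_sys_path` are hypothetical and not part of the project.

```python
import sys
from pathlib import Path


def find_project_root(start: Path, marker: str = "src") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"No parent of {start} contains a '{marker}' directory")


def add_to_sys_path(root: Path) -> None:
    """Put `root` at the front of sys.path once, so `from src...` imports resolve."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


# Typical use inside the notebook (left commented so the sketch has no side effects):
# project_root = find_project_root(Path.cwd())
# add_to_sys_path(project_root)
```

This only takes care of imports. Relative paths such as `config/conf.yaml` would still need to be joined with the returned root, or `os.chdir(project_root)` called once.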


## Loading Configuration
Here, we load the configuration files that set the parameters for our counterfactual search. These include dataset details, feature management, and other related settings.

In [2]:
# Load configuration
config_file_path = "config/conf.yaml"
config = Config(config_file_path)

with open("config/constraints_conf.json", 'r') as file:
    constraints = json.load(file)

print("Configuration Loaded:")
print(config)

Configuration Loaded:
<src.config.Config object at 0x7fcc9426cd60>
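
Printing the `Config` object only shows its memory address, so it does not confirm what was loaded. For the JSON constraints file, a small helper like the one below can fail early with a readable message and render the content for inspection. This is a sketch using only the standard library; the function names `load_json_config` and `pretty` are hypothetical, and the layout of `constraints_conf.json` is not shown in this notebook.

```python
import json
from pathlib import Path


def load_json_config(path):
    """Read a JSON config file, failing with a readable message if it is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path.resolve()}")
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


def pretty(config_dict):
    """Return an indented, key-sorted rendering that is easy to scan in a notebook."""
    return json.dumps(config_dict, indent=2, sort_keys=True)


# Typical use (commented out because the file only exists inside the project):
# constraints = load_json_config("config/constraints_conf.json")
# print(pretty(constraints))
```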


## Preparing Dataset and Model
In this section, we initialize our dataset, model, and the required transformers. We'll also define a sample instance for which we wish to find the counterfactuals.

In [3]:
# Set the target instance path
target_instance_json = "input_instance/instance.json"

# Download the home loan dataset
load_datasets.download("homeloan")

In [4]:
data = Dataset(config.get_config_value("dataset"), "Loan_Status")
normalization_transformer = Transformer(data, config)
instance_factory = InstanceFactory(data)
sampler = CEInstanceSampler(config, normalization_transformer, instance_factory)

model = ExplainableModel(config.get_config_value("model"))

Features verified
Continious features: ['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount', 'Loan_Amount_Term']
Categorical features: ['Gender', 'Married', 'Dependents', 'Education', 'Self_Employed', 'Property_Area', 'Credit_History']
Dataset preprocessed
Feature: Loan_Amount_Term
Range: [-5.044846090672854, 2.106513522957348]
Feature: Self_Employed
Range: [0, 1]
Feature: CoapplicantIncome
Range: [-0.548056854219573, 13.372167288446008]
Feature: LoanAmount
Range: [-1.5999485916282457, 6.4030605082645256]
Feature: Credit_History
Range: [0, 1]
Feature: Property_Area
Range: [0, 2]
Feature: Married
Range: [0, 1]
Feature: Education
Range: [0, 1]
Feature: Dependents
Range: [0, 3]
Feature: ApplicantIncome
Range: [-0.8484208485011342, 12.13039262846177]
Feature: Gender
Range: [0, 1]
Constraint Type: immutable
Sanity check for model
Model input shape is  11


  y = column_or_1d(y, warn=True)
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
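
The continuous features in the log above have ranges with negative lower bounds (for example `ApplicantIncome` from about -0.85 to 12.13). That would be consistent with z-score standardization, although this notebook does not state which scaling the `Transformer` applies. Under that assumption, the sketch below shows how a raw value maps to a standardized one and back. The income values are made up for illustration.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Return (mean, population std) of the training values for one feature."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("Cannot standardize a constant feature")
    return mu, sigma


def standardize(value, mu, sigma):
    """Map a raw value to its z-score."""
    return (value - mu) / sigma


def destandardize(z, mu, sigma):
    """Map a z-score back to the raw scale."""
    return z * sigma + mu


# Hypothetical raw incomes, NOT taken from the home loan dataset.
incomes = [2000.0, 4000.0, 6000.0, 8000.0]
mu, sigma = fit_standardizer(incomes)
z_scores = [standardize(v, mu, sigma) for v in incomes]
recovered = [destandardize(z, mu, sigma) for z in z_scores]
```

Mapping values back to the raw scale in this way makes a counterfactual easier to read, since a standardized income of 7.05 is hard to interpret on its own.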


## Finding Counterfactuals
With everything set up, we'll now search for counterfactuals for our sample instance using the CFsearch object.

In [5]:
# Create a CFsearch object
config_for_cfsearch = config.get_config_value("cfsearch")
search = CFsearch(normalization_transformer, model, sampler, config,
                  algorithm=config_for_cfsearch["optimizer"], 
                  distance_continuous=config_for_cfsearch["continuous_distance"], 
                  distance_categorical=config_for_cfsearch["categorical_distance"], 
                  loss_type=config_for_cfsearch["loss_type"], 
                  coherence=config_for_cfsearch["coherence"],
                  objective_function_weights=config_for_cfsearch["objective_function_weights"])

# Load the target instance from its JSON file
with open(target_instance_json, 'r') as file:
    target_instance_json_content = file.read()

target_instance = instance_factory.create_instance_from_json(target_instance_json_content)

In [6]:
counterfactuals = search.find_counterfactuals(target_instance, 1, "opposite", 50)

Label encoder:  Male  for feature  Gender
Label encoder:  Yes  for feature  Married
Label encoder:  2  for feature  Dependents
Label encoder:  Graduate  for feature  Education
Label encoder:  No  for feature  Self_Employed
Label encoder:  Urban  for feature  Property_Area
Label encoder:  1.0  for feature  Credit_History
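
To make the idea of a counterfactual search concrete, here is a minimal random-search sketch: perturb the target, keep only candidates whose predicted label differs from the original, and return the closest one. It is not the algorithm implemented by `CFsearch`. The toy model, the two feature names, and the L1 distance are all assumptions chosen for illustration.

```python
import random


def toy_model(instance):
    """Hypothetical classifier: label 1 when income sufficiently exceeds the loan."""
    return 1 if instance["income"] - 0.5 * instance["loan"] > 1.0 else 0


def l1_distance(a, b):
    """Sum of absolute feature differences between two instances."""
    return sum(abs(a[k] - b[k]) for k in a)


def random_search_counterfactual(model, target, n_samples=2000, scale=2.0, seed=0):
    """Sample perturbations of `target` and keep the closest one that flips the label."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    original_label = model(target)
    best, best_dist = None, float("inf")
    for _ in range(n_samples):
        candidate = {k: v + rng.uniform(-scale, scale) for k, v in target.items()}
        if model(candidate) != original_label:
            dist = l1_distance(target, candidate)
            if dist < best_dist:
                best, best_dist = candidate, dist
    return best, best_dist


# Standardized toy instance that the toy model labels 0.
target = {"income": 0.0, "loan": 0.0}
best, best_dist = random_search_counterfactual(toy_model, target)
```

A practical search usually adds further terms to the objective, such as sparsity or constraints on immutable features, which is what the configured weights and constraint files in this notebook suggest.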


## Evaluation and Visualization
Once the counterfactuals are generated, we evaluate and visualize them to see how they differ from the original instance and to assess their quality.

In [7]:
# Evaluate and visualize the counterfactuals
search.evaluate_counterfactuals(target_instance, counterfactuals)

# Display the counterfactuals and original instance in the notebook
display_df = search.visualize_as_dataframe(target_instance, counterfactuals)
display(display_df)

Feature ApplicantIncome changed its value from -0.13149591358318327 to 7.048360574578692
Feature CoapplicantIncome changed its value from -0.548056854219573 to 4.19365367062453
Feature LoanAmount changed its value from -0.15222625083722335 to 1.6498903038095103
Feature Loan_Amount_Term changed its value from 0.27283157074447584 to 1.0887094906885637
Feature Gender changed its value from 1 to 0
Feature Married changed its value from 1 to 1
Feature Dependents changed its value from 2 to 1
Feature Education changed its value from 0 to 0
Feature Self_Employed changed its value from 0 to 0
Feature Property_Area changed its value from 2 to 1
Feature Credit_History changed its value from 1 to 1
CF instance:  {'Gender': 0, 'Married': 1, 'Dependents': 1, 'Education': 0, 'Self_Employed': 0, 'Property_Area': 1, 'Credit_History': 1, 'ApplicantIncome': 7.048360574578692, 'CoapplicantIncome': 4.19365367062453, 'LoanAmount': 1.6498903038095103, 'Loan_Amount_Term': 1.0887094906885637}
Distance continu

| | Gender | Married | Dependents | Education | Self_Employed | Property_Area | Credit_History | ApplicantIncome | CoapplicantIncome | LoanAmount | Loan_Amount_Term |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 2 | 0 | 0 | 2 | 1 | -0.131496 | -0.548057 | -0.152226 | 0.272832 |



Counterfactual set (new outcome: 0)


| | Gender | Married | Dependents | Education | Self_Employed | Property_Area | Credit_History | ApplicantIncome | CoapplicantIncome | LoanAmount | Loan_Amount_Term |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | - | 1 | - | - | 1 | - | 7.048360574578692 | 4.19365367062453 | 1.6498903038095103 | 1.0887094906885637 |


None
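
The log lines above report every feature, including ones such as `Married` whose value went from 1 to 1, while the counterfactual table marks unchanged features with `-`. The sketch below shows one way to produce that kind of diff from two plain dictionaries. The helper names are hypothetical, and the values are the categorical features from the output above.

```python
def diff_row(original, counterfactual, unchanged_marker="-"):
    """Show the counterfactual value only where it differs from the original."""
    return {
        name: (unchanged_marker if counterfactual[name] == original[name]
               else counterfactual[name])
        for name in original
    }


def changed_features(original, counterfactual):
    """List the names of the features whose value actually changed."""
    return [name for name in original if counterfactual[name] != original[name]]


original = {"Gender": 1, "Married": 1, "Dependents": 2, "Education": 0,
            "Self_Employed": 0, "Property_Area": 2, "Credit_History": 1}
counterfactual = {"Gender": 0, "Married": 1, "Dependents": 1, "Education": 0,
                  "Self_Employed": 0, "Property_Area": 1, "Credit_History": 1}

row = diff_row(original, counterfactual)
changed = changed_features(original, counterfactual)
# changed == ['Gender', 'Dependents', 'Property_Area']
```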

## Storing the Results
For reproducibility and further analysis, we'll store the counterfactuals and their evaluations in designated folders.

In [8]:
# Store results
search.store_counterfactuals(config.get_config_value("output_folder"), "first_test")
search.store_evaluations(config.get_config_value("output_folder"), "first_eval")
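
The on-disk format written by `store_counterfactuals` and `store_evaluations` is not shown in this notebook. As a rough illustration of the pattern only, the sketch below writes a list of records to a named JSON file inside an output folder, creating the folder if needed. The function name `store_as_json` and the JSON format are assumptions, not the project's behaviour.

```python
import json
from pathlib import Path


def store_as_json(records, output_folder, name):
    """Write `records` to <output_folder>/<name>.json and return the file path."""
    folder = Path(output_folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    target = folder / f"{name}.json"
    with target.open("w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
    return target


# Typical use (commented out to avoid writing files when the sketch is run):
# store_as_json([{"Gender": 0, "Dependents": 1}], "output", "first_test")
```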