## Setting Up and Importing Libraries
In this step, we import the libraries and modules the notebook needs and set the working directory so that the project's custom modules and relative paths resolve.

In [2]:
# Import necessary libraries
import os
import json


In [3]:
# Set the working directory to the project root so that relative paths such as cfe/conf_bc.yaml resolve
os.chdir('/home/rodion/TrustAI/TalkToModel-TrustAI')
print(os.getcwd())

/home/rodion/TrustAI/TalkToModel-TrustAI
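The hard-coded path above only works on one machine. As a hedged alternative, the sketch below walks up from a starting directory until it finds a folder that contains `cfe`, the directory the later cells read their configuration from. The function name `find_project_root` and the choice of `cfe` as the marker are assumptions made for illustration; they are not part of the project.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "cfe") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    # Check the starting directory first, then each of its ancestors in turn.
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No parent of {start} contains '{marker}'")
```

In the notebook this could replace the absolute path with `os.chdir(find_project_root(Path.cwd()))`, provided the notebook is started somewhere inside the project tree.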


In [4]:
! conda env list

# conda environments:
#
base                     /home/rodion/miniconda3
gpgomenv                 /home/rodion/miniconda3/envs/gpgomenv
trustce               *  /home/rodion/miniconda3/envs/trustce
ttm                      /home/rodion/miniconda3/envs/ttm



In [5]:
from trustce.cfsearch import CFsearch
from trustce.dataset import Dataset
from trustce.cemodels.sklearn_model import SklearnModel
from trustce.ceinstance.instance_sampler import CEInstanceSampler
from trustce.config import Config
from trustce.transformer import Transformer
from trustce.ceinstance.instance_factory import InstanceFactory

## Loading Configuration
Next, we load the configuration file that sets the parameters for the counterfactual search: dataset details, feature handling, and search settings. Feature constraints are read from a separate JSON file.

In [6]:
# Load configuration
config_file_path = "cfe/conf_bc.yaml"
config = Config(config_file_path)

with open("cfe/constraints_conf_bc.json", 'r') as file:
    constraints = json.load(file)

print("Configuration Loaded:")
print(config)

Configuration Loaded:
<trustce.config.Config object at 0x7fe455586570>
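The printed output above is only the object's default repr, so it does not confirm which settings were loaded. A quick sanity check is to verify that the keys this notebook later reads are present. The sketch below works on a plain dictionary and assumes nothing about how `Config` stores its data; the key names are the ones passed to `config.get_config_value(...)` and read from the `cfsearch` section in the cells that follow, while the helper name `missing_config_keys` is hypothetical.

```python
# Top-level keys this notebook reads via config.get_config_value(...)
REQUIRED_KEYS = ("dataset", "model", "cfsearch", "output_folder")
# Keys read from the "cfsearch" section when the CFsearch object is built
REQUIRED_CFSEARCH_KEYS = (
    "optimizer",
    "continuous_distance",
    "categorical_distance",
    "loss_type",
    "coherence",
    "objective_function_weights",
)


def missing_config_keys(config: dict) -> list[str]:
    """Return dotted names of the keys the notebook needs but `config` lacks."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    cfsearch = config.get("cfsearch")
    # Only inspect the nested section when it exists and is a mapping.
    if isinstance(cfsearch, dict):
        missing += [
            f"cfsearch.{key}" for key in REQUIRED_CFSEARCH_KEYS if key not in cfsearch
        ]
    return missing
```

An empty list means every setting used in this walkthrough is available; anything else names what to add to the YAML file before running the search.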


## Preparing Dataset and Model
In this section, we initialize the dataset, the model, the normalization transformer, and the instance sampler. We then pick a sample instance for which we want to find counterfactuals.

In [7]:
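# Dataset wrapper built from the "dataset" section of the config
data = Dataset(config.get_config_value("dataset"), "y")
# Transformer that normalizes the features (see the ranges printed below)
normalization_transformer = Transformer(data, config)
# Factory that turns raw rows into instance objects
instance_factory = InstanceFactory(data)
# Sampler that the search uses to draw candidate instances
sampler = CEInstanceSampler(config, normalization_transformer, instance_factory)
# Wrapper around the scikit-learn model referenced in the config
model = SklearnModel(config.get_config_value("model"))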

Features verified
Continious features: ['radius', 'texture', 'perimeter', 'area', 'smoothness', 'compactness', 'concavity', 'concave_points', 'symmetry', 'fractal_dimension']
Categorical features: []
Dataset preprocessed
Feature: fractal_dimension
Range: [-1.8471166348711128, 5.081944782090663]
Feature: concavity
Range: [-1.129050066092012, 4.306684366839897]
Feature: symmetry
Range: [-2.6892934559130643, 4.460658772550865]
Feature: compactness
Range: [-1.663070884363049, 4.098612101932611]
Feature: radius
Range: [-2.020664657895016, 3.9271765688611073]
Feature: perimeter
Range: [-1.9786942699538923, 3.9429969951971064]
Feature: concave_points
Range: [-1.2659948385422117, 3.9170965902340282]
Feature: area
Range: [-1.428632714617807, 5.112255531344153]
Feature: texture
Range: [-2.3099404763139924, 3.535857910129814]
Feature: smoothness
Range: [-3.15671998869921, 3.488590578235572]


https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
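The ranges printed above are in normalized units, not raw measurements. They look consistent with z-score standardization (subtract the mean, divide by the standard deviation), but the notebook does not state which scaling `Transformer` applies, so treat that as an assumption. Under that assumption, a feature's range could be reproduced roughly as follows; the helper name `standardized_range` and the toy values are made up for illustration.

```python
from statistics import fmean, pstdev


def standardized_range(values: list[float]) -> tuple[float, float]:
    """Return (min, max) of `values` after z-score standardization."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (divides by n)
    if std == 0:
        raise ValueError("a constant feature cannot be standardized")
    return (min(values) - mean) / std, (max(values) - mean) / std


# Toy feature column, not taken from the dataset
low, high = standardized_range([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"Range: [{low}, {high}]")
```

A range such as `[-2.0, 3.9]` would then mean the smallest value lies about two standard deviations below the mean and the largest about four above it.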


In [8]:
# Load target instance 
import pandas as pd

dataset = pd.read_csv(config.get_config_value("dataset")["path"], index_col=0)
target_instance_data = dataset.iloc[0]

target_instance = instance_factory.create_instance(target_instance_data)

In [9]:
target_instance.get_values_dict()

{'radius': 11.85,
 'texture': 17.46,
 'perimeter': 75.54,
 'area': 432.7,
 'smoothness': 0.08372,
 'compactness': 0.05642,
 'concavity': 0.02688,
 'concave_points': 0.0228,
 'symmetry': 0.1875,
 'fractal_dimension': 0.05715}

## Finding Counterfactuals
With everything set up, we'll now search for counterfactuals for our sample instance using the CFsearch object.

In [10]:
# Create a CFsearch object
config_for_cfsearch = config.get_config_value("cfsearch")
search = CFsearch(normalization_transformer, model, sampler, config,
                  optimizer_name=config_for_cfsearch["optimizer"],
                  distance_continuous=config_for_cfsearch["continuous_distance"],
                  distance_categorical=config_for_cfsearch["categorical_distance"],
                  loss_type=config_for_cfsearch["loss_type"],
                  coherence=config_for_cfsearch["coherence"],
                  objective_function_weights=config_for_cfsearch["objective_function_weights"])

In [11]:
counterfactuals = search.find_counterfactuals(target_instance, number_cf=1, desired_class="opposite", maxiterations=50)

Not all conterfactuals are valid, but the closest instances reported


## Evaluation and Visualization
Once the counterfactuals are generated, we evaluate and visualize them to see how they differ from the original instance and to assess their quality.

In [12]:
# Evaluate and visualize the counterfactuals
search.evaluate_counterfactuals(target_instance, counterfactuals)

# Display the original instance and the counterfactuals in the notebook
# (the trailing None in the output is the method's return value)
display_df = search.visualize_as_dataframe(target_instance, counterfactuals)
display(display_df)

Feature radius changed its value from -0.6500347684977155 to -0.9854652716858642
probability_sign: [0. 0.], type: <class 'numpy.ndarray'>
required_label: 0, type: <class 'numpy.int64'>
Modified required_label: 0, type: <class 'numpy.int64'>
Feature texture changed its value from -0.4300675512127683 to -1.7731666069237133
probability_sign: [0. 0.], type: <class 'numpy.ndarray'>
required_label: 0, type: <class 'numpy.int64'>
Modified required_label: 0, type: <class 'numpy.int64'>
Feature perimeter changed its value from -0.6794495897759899 to -1.8102524812112857
probability_sign: [0. 0.], type: <class 'numpy.ndarray'>
required_label: 0, type: <class 'numpy.int64'>
Modified required_label: 0, type: <class 'numpy.int64'>
Feature area changed its value from -0.6262467630877121 to -0.025767735167848933
probability_sign: [0. 0.], type: <class 'numpy.ndarray'>
required_label: 0, type: <class 'numpy.int64'>
Modified required_label: 0, type: <class 'numpy.int64'>
Feature smoothness changed its v

| Unnamed: 0 | radius | texture | perimeter | area | smoothness | compactness | concavity | concave_points | symmetry | fractal_dimension |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 11.85 | 17.46 | 75.54 | 432.7 | 0.08372 | 0.05642 | 0.02688 | 0.0228 | 0.1875 | 0.05715 |



Counterfactual set (new outcome: [1])


| Unnamed: 0 | radius | texture | perimeter | area | smoothness | compactness | concavity | concave_points | symmetry | fractal_dimension |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10.658422950165445 | 11.922914198735356 | 47.90625837239946 | 649.1276861319288 | 0.0897785163716816 | 0.0211103939090948 | 0.2326656181898375 | 0.0440523375853846 | 0.1891169108108564 | 0.0598085958906692 |


None
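The evaluation log reports changes in normalized units, which are hard to compare with the raw tables. As a rough complement, the sketch below ranks features by their relative change in raw units, using the original and counterfactual values shown in this notebook. It is not the metric computed by `evaluate_counterfactuals`, and the helper name `relative_changes` is hypothetical.

```python
# Raw feature values of the original instance (from the notebook output)
original = {
    "radius": 11.85, "texture": 17.46, "perimeter": 75.54, "area": 432.7,
    "smoothness": 0.08372, "compactness": 0.05642, "concavity": 0.02688,
    "concave_points": 0.0228, "symmetry": 0.1875, "fractal_dimension": 0.05715,
}
# Raw feature values of the counterfactual (from the notebook output)
counterfactual = {
    "radius": 10.658422950165445, "texture": 11.922914198735356,
    "perimeter": 47.90625837239946, "area": 649.1276861319288,
    "smoothness": 0.08977851637168165, "compactness": 0.02111039390909482,
    "concavity": 0.23266561818983753, "concave_points": 0.04405233758538468,
    "symmetry": 0.1891169108108564, "fractal_dimension": 0.05980859589066922,
}


def relative_changes(before: dict, after: dict) -> dict:
    """Map each feature to (after - before) / |before|, largest magnitude first.

    Assumes every value in `before` is non-zero, which holds for this instance.
    """
    changes = {
        name: (after[name] - before[name]) / abs(before[name]) for name in before
    }
    return dict(sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True))


for feature, change in relative_changes(original, counterfactual).items():
    print(f"{feature:>18}: {change:+.1%}")
```

Sorting by magnitude puts the features the search moved most at the top, which makes it easier to judge whether the suggested changes are plausible.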

## Storing the Results
For reproducibility and further analysis, we store the counterfactuals and their evaluations as JSON files in the output folder set in the configuration.

In [16]:
# Store results
search.store_counterfactuals(config.get_config_value("output_folder"), "bc_first_test")
search.store_evaluations(config.get_config_value("output_folder"), "bc_first_test")

Store counterfactuals to  cfe/output/bc_first_test_0.json
Store counterfactuals evaluation to  cfe/output/bc_first_test_eval_0.json
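The `_0` suffix in the file names suggests that an index is appended to the name passed in; whether it counts runs or counterfactuals is not stated here. If the goal is to keep earlier results from being overwritten, one possible pattern is sketched below. The function `store_json` is a hypothetical stand-in and does not reflect how `store_counterfactuals` is implemented.

```python
import json
from pathlib import Path


def store_json(payload: dict, folder: str, name: str) -> Path:
    """Write `payload` to <folder>/<name>_<i>.json using the first unused index i."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    index = 0
    # Advance the index until no file with that name exists yet.
    while (out_dir / f"{name}_{index}.json").exists():
        index += 1
    path = out_dir / f"{name}_{index}.json"
    path.write_text(json.dumps(payload, indent=2))
    return path
```

Calling it twice with the same name would produce `<name>_0.json` and `<name>_1.json`, so repeated experiments stay side by side.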


In [13]:
counterfactuals[0].get_values_dict()

{'radius': 10.658422950165445,
 'texture': 11.922914198735356,
 'perimeter': 47.90625837239946,
 'area': 649.1276861319288,
 'smoothness': 0.08977851637168165,
 'compactness': 0.02111039390909482,
 'concavity': 0.23266561818983753,
 'concave_points': 0.04405233758538468,
 'symmetry': 0.1891169108108564,
 'fractal_dimension': 0.05980859589066922}

In [14]:
model.predict(counterfactuals[0].to_numpy_array().reshape(1, -1))

array([1])

In [15]:
model.predict(target_instance.to_numpy_array().reshape(1, -1))

array([1])
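Both calls return class 1, so this counterfactual leaves the prediction unchanged. That matches the earlier warning that not all counterfactuals are valid, assuming `model.predict` accepts raw feature values as passed here. A small check like the following makes the comparison explicit; `is_valid_counterfactual` is a hypothetical helper, and its `desired_class` argument mirrors the `"opposite"` option used in the search call.

```python
def is_valid_counterfactual(original_label, cf_label, desired_class="opposite") -> bool:
    """True if the counterfactual's predicted label reaches the desired class."""
    if desired_class == "opposite":
        # Any label different from the original prediction counts as a flip.
        return cf_label != original_label
    return cf_label == desired_class


# Labels taken from the two predictions above
print(is_valid_counterfactual(original_label=1, cf_label=1))
```

With the notebook's objects, the labels could be obtained with `int(model.predict(...)[0])` for the target instance and for each counterfactual, and invalid counterfactuals filtered out before they are stored or interpreted.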