This notebook shows how to use the CARLA library with your own data: loading a dataset, training a model, and generating counterfactuals.

## Data

Before doing anything else, we need some data.
You could import one of the datasets in our [catalog](https://carla-counterfactual-and-recourse-library.readthedocs.io/en/latest/data.html#module-data.catalog.catalog),
but here we load our own data from a CSV file instead.

In [1]:
import warnings
warnings.filterwarnings('ignore')

from carla.data.catalog import CsvCatalog

Using TensorFlow backend.


[INFO] Using Python-MIP package version 1.12.0 [model.py <module>]


In [2]:
continuous = ["age", "fnlwgt", "education-num", "capital-gain", "hours-per-week", "capital-loss"]
categorical = ["marital-status", "native-country", "occupation", "race", "relationship", "sex", "workclass"]
immutable = ["age", "sex"]

dataset = CsvCatalog(file_path="adult.csv",
                     continuous=continuous,
                     categorical=categorical,
                     immutables=immutable,
                     target='income')

print(dataset.df)

            age    fnlwgt  education-num  capital-gain  capital-loss  ...  \
0      0.301370  0.044131       0.800000      0.021740           0.0  ...   
1      0.452055  0.048052       0.800000      0.000000           0.0  ...   
2      0.287671  0.137581       0.533333      0.000000           0.0  ...   
3      0.493151  0.150486       0.400000      0.000000           0.0  ...   
4      0.150685  0.220635       0.800000      0.000000           0.0  ...   
...         ...       ...            ...           ...           ...  ...   
48827  0.301370  0.137428       0.800000      0.000000           0.0  ...   
48828  0.643836  0.209130       0.533333      0.000000           0.0  ...   
48829  0.287671  0.245379       0.800000      0.000000           0.0  ...   
48830  0.369863  0.048444       0.800000      0.054551           0.0  ...   
48831  0.246575  0.114919       0.800000      0.000000           0.0  ...   

       occupation_Other  race_White  relationship_Non-Husband  sex_Male  \
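
The printed frame shows continuous features in the range [0, 1] and indicator columns such as `sex_Male`, which is consistent with min-max scaling of continuous features and a single 0/1 column per binary categorical feature. The sketch below illustrates both transformations in plain Python. It is not CARLA's implementation, and the helper names and toy values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_binary(values, positive):
    """Encode a two-valued categorical column as a single 0/1 indicator."""
    return [1 if v == positive else 0 for v in values]


# Toy values for illustration, not read from adult.csv.
ages = [17, 39, 90]
sexes = ["Male", "Female", "Male"]

scaled_age = min_max_scale(ages)         # smallest age -> 0.0, largest -> 1.0
sex_male = encode_binary(sexes, "Male")  # one indicator column instead of two
print(scaled_age, sex_male)
```

Because every continuous feature ends up on the same scale, distances between a factual and its counterfactual are not dominated by features with large raw values such as `fnlwgt`.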


## Model

Now that the data is loaded, we also need a classification model.
You could define your own [model](https://carla-counterfactual-and-recourse-library.readthedocs.io/en/latest/examples.html#black-box-model),
but here we show how to train one of our [catalog](https://carla-counterfactual-and-recourse-library.readthedocs.io/en/latest/mlmodel.html#module-models.catalog.catalog) models.
Note that, depending on your data, you might need to tweak the training hyperparameters.

In [3]:
from carla.recourse_methods.catalog.focus.tree_model import ForestModel, XGBoostModel



In [4]:
ml_model = XGBoostModel(dataset)


[0]	validation_0-logloss:0.58379	validation_1-logloss:0.58367
[1]	validation_0-logloss:0.52351	validation_1-logloss:0.52342
[2]	validation_0-logloss:0.48537	validation_1-logloss:0.48545
[3]	validation_0-logloss:0.46049	validation_1-logloss:0.45962
[4]	validation_0-logloss:0.44151	validation_1-logloss:0.44143
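
The values logged above are log-loss scores on two evaluation sets (the `validation_0` and `validation_1` labels in the log). As a reference point for reading them, the sketch below computes binary log loss by hand. The function name and toy predictions are hypothetical and only serve the illustration.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
uninformed = binary_log_loss(labels, [0.5, 0.5, 0.5, 0.5])  # equals ln 2
confident = binary_log_loss(labels, [0.9, 0.1, 0.8, 0.2])   # lower is better
print(uninformed, confident)
```

A model that always predicts 0.5 scores ln 2, roughly 0.693, so a score that keeps falling below that value on both sets suggests the model is still learning rather than overfitting.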


## Recourse

Now that we have both the data and a model, we can start using CARLA to generate counterfactuals.
You can pick a [recourse method](https://carla-counterfactual-and-recourse-library.readthedocs.io/en/latest/recourse.html) from the catalog, or implement one yourself.
In the following example, we select samples that the model labels as negative and generate counterfactuals for the first five of them.
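
Conceptually, selecting factuals amounts to keeping the rows that the model assigns to the negative class. The sketch below shows that idea with a stand-in classifier. `select_negative_instances`, `toy_predict`, and the 0.5 threshold are assumptions made for illustration and are not part of the CARLA API.

```python
def select_negative_instances(predict_proba, rows, threshold=0.5):
    """Return the rows whose predicted positive-class probability is below the threshold."""
    return [row for row in rows if predict_proba(row) < threshold]


def toy_predict(row):
    """Hypothetical stand-in model: more education means a higher positive probability."""
    return row["education-num"]


rows = [
    {"age": 0.30, "education-num": 0.80},
    {"age": 0.45, "education-num": 0.40},
    {"age": 0.15, "education-num": 0.20},
]

negatives = select_negative_instances(toy_predict, rows)
print(negatives)  # the two rows with education-num below 0.5
```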

In [5]:
from carla.models.negative_instances import predict_negative_instances
import carla.recourse_methods.catalog as recourse_catalog

In [6]:
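# get factuals: instances the model predicts as the negative class
factuals = predict_negative_instances(ml_model, dataset.df)
# use only the first five factuals for this example
test_factual = factuals.iloc[:5]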

hyperparams = {
    "optimizer": "adam",
    "lr": 0.001,
    "n_class": 2,
    "n_iter": 1000,
    "sigma": 1.0,
    "temperature": 1.0,
    "distance_weight": 0.01,
    "distance_func": "l1",
}

focus = recourse_catalog.FOCUS(ml_model, dataset, hyperparams)
df_cfs = focus.get_counterfactuals(test_factual)

AttributeError: module 'carla.recourse_methods.catalog' has no attribute 'FeatureTweak'
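
The hyperparameters passed to `FOCUS` include `sigma`, `distance_func`, and `distance_weight`. One plausible reading, assuming the method optimizes over a differentiable approximation of the tree ensemble, is that `sigma` controls how sharply a sigmoid approximates each hard split, while `distance_weight` scales an L1 penalty that keeps the counterfactual close to the factual. The sketch below illustrates only those two ingredients. The function names, the sign convention, and the toy numbers are assumptions, not CARLA's implementation.

```python
import math


def soft_split(x, threshold, sigma=1.0):
    """Smooth stand-in for the hard test `x > threshold`; larger sigma is sharper."""
    return 1.0 / (1.0 + math.exp(-sigma * (x - threshold)))


def l1_distance(factual, counterfactual):
    """Sum of absolute feature-wise differences."""
    return sum(abs(a - b) for a, b in zip(factual, counterfactual))


def penalized_loss(prediction_loss, factual, counterfactual, distance_weight=0.01):
    """Prediction loss plus a weighted L1 proximity penalty."""
    return prediction_loss + distance_weight * l1_distance(factual, counterfactual)


at_threshold = soft_split(0.5, 0.5)              # exactly 0.5 at the split point
sharp = soft_split(0.6, 0.5, sigma=100.0)        # approaches 1 as sigma grows
distance = l1_distance([0.3, 0.8], [0.5, 0.7])   # 0.2 + 0.1
print(at_threshold, sharp, distance)
```

Under this reading, raising `distance_weight` favors counterfactuals that change fewer or smaller feature values, at the cost of making it harder to flip the prediction.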