# Implementation examples for different usages of CARLA

## CARLA as recourse library

The following cell shows how to use CARLA with the black-box models and datasets from its catalog.

In [1]:
import os
os.chdir("../")

from carla import DataCatalog, MLModelCatalog, log
from carla.recourse_methods import GrowingSpheres

# load catalog dataset
data_name = "adult"
dataset = DataCatalog(data_name)

# load artificial neural network from catalog
model = MLModelCatalog(dataset, "ann")

# get some factuals from the data to generate counterfactual examples
factuals = dataset.raw.iloc[:10]

# load the recourse method with model-specific hyperparameters
gs = GrowingSpheres(model)

# generate counterfactual examples
counterfactuals = gs.get_counterfactuals(factuals)

log.info(counterfactuals.head(5))

        age    fnlwgt  education-num  capital-gain  capital-loss  ...  \
0  0.301370  0.075433       0.891804      0.036070      0.065560  ...   
1  0.452055  0.071319       0.781081      0.041491      0.022579  ...   
2  0.

If users want to work with their own black-box model or dataset, CARLA provides an easy-to-use interface to
wrap arbitrary models and datasets. The pseudo-code below sketches such a use case.


In [None]:
from typing import Union

import numpy as np
import pandas as pd

from carla import Data, MLModel
from carla.recourse_methods import GrowingSpheres

# first, implement the dataset wrapper
class MyOwnData(Data):
    def __init__(self):
        # the dataset could be loaded in the constructor
        self._dataset = load_dataset_from_disk()

    @property
    def categoricals(self):
        # this property contains a list of all categorical features
        return [...]

    @property
    def continous(self):
        # this property contains a list of all continuous features
        return [...]

    @property
    def immutables(self):
        # this property contains a list of features which should not be changed by the recourse method
        return [...]

    @property
    def target(self):
        # this property contains the feature name of the target column
        return "label"

    @property
    def raw(self):
        # this property contains the raw dataset, neither encoded nor normalized
        return self._dataset

# second, implement the black-box-model wrapper
class MyOwnModel(MLModel):
    def __init__(self, data):
        super().__init__(data)
        # the constructor can be used to load or build an arbitrary black-box-model
        self._mymodel = load_model()

        # this attribute contains a fitted scaler to normalize input data
        # MinMaxScaler from sklearn is predefined, but can be replaced by any other sklearn scaler
        self.scaler = MySklearnScaler().fit()

        # the same is possible for data encoding
        # OneHotEncoder from sklearn with dropped first column for binary data is predefined, but can be
        # replaced by any other sklearn encoder
        self.encoder = MySklearnEncoder().fit()

    @property
    def feature_input_order(self):
        # this property contains a list of the correct input order of features for the ml model
        return [...]

    @property
    def backend(self):
        # this property contains a string with the used backend of the model
        return "pytorch"

    @property
    def raw_model(self):
        # this property contains the fitted or loaded black-box-model
        return self._mymodel

    def predict(self, x: Union[np.ndarray, pd.DataFrame]):
        # the predict method outputs the continuous prediction of the model, similar to sklearn
        return self._mymodel.predict(x)

    def predict_proba(self, x: Union[np.ndarray, pd.DataFrame]):
        # the predict_proba method outputs the prediction as class probabilities, similar to sklearn
        return self._mymodel.predict_proba(x)


# after implementing the user-specific model and dataset, calling the recourse method
# and generating counterfactuals work exactly as before
dataset = MyOwnData()
model = MyOwnModel(dataset)

# get some factuals from the data to generate counterfactual examples
factuals = dataset.raw.iloc[:10]

# load the recourse method with model-specific hyperparameters
gs = GrowingSpheres(model)

# generate counterfactual examples
counterfactuals = gs.get_counterfactuals(factuals)
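
The comments in the wrapper above describe each member as a property, and the usage `dataset.raw.iloc[:10]` reads `raw` without calling it. The following is a minimal, standard-library-only sketch of that pattern. `DataInterface`, `ToyData`, and `IncompleteData` are hypothetical stand-ins invented for illustration and are not CARLA's actual classes; the toy records are made up as well.

```python
from abc import ABC, abstractmethod


class DataInterface(ABC):
    """Hypothetical stand-in for an abstract data wrapper (not carla.Data itself)."""

    @property
    @abstractmethod
    def categoricals(self):
        """Names of the categorical features."""

    @property
    @abstractmethod
    def continous(self):
        """Names of the continuous features (spelled as in the cell above)."""

    @property
    @abstractmethod
    def immutables(self):
        """Names of the features a recourse method must not change."""

    @property
    @abstractmethod
    def target(self):
        """Name of the target column."""

    @property
    @abstractmethod
    def raw(self):
        """The raw, unencoded and unnormalized records."""


class ToyData(DataInterface):
    """Complete wrapper around a tiny, made-up in-memory dataset."""

    def __init__(self):
        self._rows = [
            {"age": 39, "sex": "Male", "label": 0},
            {"age": 50, "sex": "Female", "label": 1},
        ]

    @property
    def categoricals(self):
        return ["sex"]

    @property
    def continous(self):
        return ["age"]

    @property
    def immutables(self):
        return ["sex"]

    @property
    def target(self):
        return "label"

    @property
    def raw(self):
        return self._rows


class IncompleteData(DataInterface):
    """Implements only one of the five required properties."""

    @property
    def target(self):
        return "label"


toy = ToyData()
# properties are read like attributes, without parentheses
feature_names = toy.categoricals + toy.continous

# an incomplete wrapper cannot be instantiated at all
try:
    IncompleteData()
    incomplete_error = None
except TypeError as exc:
    incomplete_error = str(exc)
```

With abstract properties, a wrapper that forgets one of the required members fails at construction time instead of later inside a recourse method.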

## CARLA for research groups

New recourse methods can be implemented via a simple interface, which makes it possible to benchmark them against the existing ones.
The following pseudo-code shows how to integrate a new recourse method into CARLA.

In [None]:
import pandas as pd

from carla import RecourseMethod

# similar to the data and model wrappers, subclass the recourse method wrapper
class MyRecourseMethod(RecourseMethod):
    def __init__(self, mlmodel):
        super().__init__(mlmodel)
        # the constructor can be used to load the recourse method,
        # or construct everything necessary

    def get_counterfactuals(self, factuals: pd.DataFrame):
        # this method generates and returns
        # encoded and scaled counterfactual examples
        # as a pandas DataFrame
        counterfactual_examples = ...  # compute the counterfactuals here
        return counterfactual_examples
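
To make the interface above more concrete, here is a minimal, self-contained sketch of a toy recourse method. Everything in it is an assumption made for illustration: `RecourseMethodInterface` is a hypothetical stand-in for the CARLA base class, `ToyModel` is a made-up linear black box, `IncrementFeature` is a naive search and not one of the methods shipped with CARLA, and plain dictionaries replace the pandas DataFrames that CARLA works with.

```python
from abc import ABC, abstractmethod


class RecourseMethodInterface(ABC):
    """Hypothetical stand-in for a recourse method base class."""

    def __init__(self, mlmodel):
        self._mlmodel = mlmodel

    @abstractmethod
    def get_counterfactuals(self, factuals):
        """Return one counterfactual (or None) per factual."""


class ToyModel:
    """Made-up black box: predicts 1 when 0.6 * income + 0.4 * savings >= 0.5."""

    def predict(self, row):
        score = 0.6 * row["income"] + 0.4 * row["savings"]
        return 1 if score >= 0.5 else 0


class IncrementFeature(RecourseMethodInterface):
    """Naive search: raise a single feature step by step until the prediction flips."""

    def __init__(self, mlmodel, feature, step=0.05, max_steps=40):
        super().__init__(mlmodel)
        self._feature = feature
        self._step = step
        self._max_steps = max_steps

    def get_counterfactuals(self, factuals):
        counterfactuals = []
        for factual in factuals:
            result = None  # stays None if no counterfactual is found
            for k in range(1, self._max_steps + 1):
                candidate = dict(factual)  # copy, leave the factual untouched
                candidate[self._feature] = factual[self._feature] + k * self._step
                if self._mlmodel.predict(candidate) == 1:
                    result = candidate
                    break
            counterfactuals.append(result)
        return counterfactuals


toy_model = ToyModel()
toy_factuals = [
    {"income": 0.2, "savings": 0.3},
    {"income": 0.0, "savings": 0.0},
]

toy_counterfactuals = IncrementFeature(toy_model, "income").get_counterfactuals(toy_factuals)

# with a small search budget the method may fail, which is reported as None
short_search = IncrementFeature(toy_model, "income", max_steps=5).get_counterfactuals(toy_factuals)
```

The sketch returns exactly one entry per factual, so a benchmark can align factuals and counterfactuals row by row and count the `None` entries as failures.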


## Benchmarking recourse methods

The following cell shows a simple way to use the Benchmark class with any wrapped recourse method.

In [2]:
from carla import Benchmark

# first initialize the benchmarking class by passing
# black-box-model, recourse method, and factuals into it
benchmark = Benchmark(model, gs, factuals)

# now you can decide if you want to run all measurements
# or just specific ones.

# let's first compute the distance measures
distances = benchmark.compute_distances()

# now run all implemented measurements and create a
# DataFrame which consists of all results
results = benchmark.run_benchmark()

print(results.head(5))

   Distance_1  Distance_2  Distance_3  Distance_4  Constraint_Violation  \
0         6.0    1.241542    1.015364         1.0                     1   
1         6.0    1.086494    1.001986         1.0                     0   
2         7.0    2.152139    2.006682         1.0                     0   
3         6.0    1.166120    1.007252         1.0                     0   
4         6.0    1.050702    1.000691         1.0                     0   

   Redundancy  y-Nearest-Neighbours  Success_Rate  Average_Time  
0           4              0.285714           0.7      0.010788  
1           4                   NaN           NaN           NaN  
2           4                   NaN           NaN           NaN  
3           3                   NaN           NaN           NaN  
4           5                   NaN           NaN           NaN  
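
The notebook does not define the four distance columns. The sketch below assumes that `Distance_1` to `Distance_4` correspond to the L0 norm (number of changed features), the L1 norm, the squared L2 norm, and the L-infinity norm of the difference between factual and counterfactual. That mapping is an assumption, and `distance_measures` is a hypothetical helper written for illustration, not the function behind `compute_distances`.

```python
def distance_measures(factual, counterfactual, tol=1e-9):
    """Compute four distances between two equally long numeric feature vectors.

    Assumed mapping (not confirmed by the notebook):
    Distance_1 = L0, Distance_2 = L1, Distance_3 = squared L2, Distance_4 = L-infinity.
    """
    deltas = [abs(c - f) for f, c in zip(factual, counterfactual)]
    return {
        # number of features that changed by more than the tolerance
        "Distance_1": float(sum(1 for d in deltas if d > tol)),
        # sum of absolute changes
        "Distance_2": sum(deltas),
        # sum of squared changes
        "Distance_3": sum(d * d for d in deltas),
        # largest single change
        "Distance_4": max(deltas),
    }


# made-up, already normalized feature vectors
example_factual = [0.30, 0.0, 1.0, 0.5]
example_counterfactual = [0.40, 1.0, 1.0, 0.5]

example_distances = distance_measures(example_factual, example_counterfactual)
```

In this made-up example two features change, one of them by a full unit (such as a flipped binary feature), so `Distance_1` is 2.0 and `Distance_4` is 1.0.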
