
<center><img src="https://raw.githubusercontent.com/dssg/aequitas/master/src/aequitas_webapp/static/images/aequitas_flow_header.svg"></center>

Aequitas Flow is an open-source project for **research data scientists** and **practitioners** to experiment with Fair ML and aid in finding the best models or methods for a given dataset with fairness concerns.

In this notebook, we will **perform an experiment** in **Aequitas Flow**.

To this end, the steps are:
1. Define and configure the included methods in the experiment.
2. Configure the dataset or datasets in the experiment.
3. Configure the experiment parameters.
4. Run the experiment.
5. Read and visualize the final results.

The objective of this project is to **standardize** and **democratize** the usage of FairML methods. This is done by abstracting away boilerplate code, such as dataset loading and splitting, method instantiation, fitting and predicting, and hyperparameter optimization, into a common and easy-to-use interface. This culminates in the configuration and running of an Aequitas Flow `Experiment`.



We will start off by installing the appropriate version of the `aequitas` package in the Colab runtime environment. This step is common to all notebooks in the package that are run in Colab.

In [None]:
# Install Aequitas and Fairflow (TODO: change to release version afterwards)
!pip install git+https://github.com/dssg/aequitas.git@release-fixes &> /dev/null
# This only needs to run once, or after you lose your runtime environment in Colab

In [None]:
# Cleaning the default logger of Google Colab (logs appear repeated otherwise)
from aequitas.flow.utils.logging import clean_handlers

clean_handlers()

---

## FairML Methods

In Fairflow, methods are the algorithms implemented for the solution of FairML problems. These are separated into **three different** categories, depending on the main tasks they perform:

- **Pre-processing** methods alter the input data.
- **In-processing** methods score the input data.
- **Post-processing** methods transform the input scores.

All methods, independently of the category they are included in, have a fitting step.

Let's see how to **create a configuration** for a generic method. First, we specify the **classpath** of the method. The implemented methods in Fairflow are listed [in the official documentation](). Then, we specify the **arguments**, which define the values or ranges sampled from during the experiment. A configuration should follow this format:

##### **`method_config.yaml`**
```yaml
<method_name>:
    classpath: <python.path.to.Class>
    args:
        <arg_1>:  # Categorical argument
            - <cat_value_1>
            - <cat_value_2>
            - ...
            - <cat_value_n>
        <arg_2>:
            type: int  # Numerical argument (int)
            range: [1, 10]  # Range of possible values
            log: False  # Sample from uniform distribution
        <arg_3>:
            type: float  # Numerical argument (float)
            range: [0.01, 1]
            log: True  # sample from log-uniform distribution
        ...
        <arg_n>:
            - True  # We want a fixed value of True in this arg.
```
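To make the three argument formats concrete, here is a minimal sketch of how one set of values could be drawn from such a search space. It is an illustration only, not the package's implementation (the experiment configuration later in this notebook names an Optuna sampler for this job); `sample_args` is a hypothetical helper, and rounding the draw for `int` arguments is a simplification.

```python
import math
import random


def sample_args(search_space, rng):
    """Draw one value per argument from a search space in the format above.

    Hypothetical helper for illustration; not part of the aequitas package.
    """
    sampled = {}
    for name, spec in search_space.items():
        if isinstance(spec, list):
            # Categorical argument: pick one of the listed values.
            # A single-element list therefore acts as a fixed value.
            sampled[name] = rng.choice(spec)
        elif isinstance(spec, dict):
            low, high = spec["range"]
            if spec.get("log", False):
                # Log-uniform: sample uniformly in log space, then map back.
                value = math.exp(rng.uniform(math.log(low), math.log(high)))
            else:
                value = rng.uniform(low, high)
            if spec["type"] == "int":
                value = int(round(value))  # Simplification for the sketch.
            # Guard against floating-point drift outside the range.
            sampled[name] = min(max(value, low), high)
        else:
            raise TypeError(f"Unsupported spec for {name!r}: {spec!r}")
    return sampled


search_space = {
    "boosting_type": ["dart", "gbdt"],
    "n_estimators": {"type": "int", "range": [10, 100]},
    "learning_rate": {"type": "float", "range": [0.001, 0.1], "log": True},
    "random_state": [42],
}

params = sample_args(search_space, random.Random(0))
print(params)
```

Each call returns one candidate set of hyperparameters; repeating the draw once per trial gives a random search over the space.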

We provide example configurations as Python dictionaries below.

Note that the package accepts configurations in both `.yaml` and Python `dict` format.

We will be creating configurations in the dictionary format for:

- **LightGBM**
- **FairGBM** (in-)
- **Sampling** (pre-) **+ Random Forest**
- **Logistic Regression + Thresholding** (post-)

In this example, we will have a mix of pre-, in-, and post-processing FairML techniques. We will also train LightGBM models, which maximize only performance.

In [None]:
lightgbm_config = {
    'lightgbm': {
        'classpath': 'aequitas.flow.methods.base_estimator.lightgbm.LightGBM',
        'args': {
            'boosting_type': ['dart', 'gbdt'],
            'enable_bundle': [False,],
            'n_estimators': {'type': 'int', 'range': [10, 100]},
            'min_child_samples': {'type': 'int', 'range': [1, 500], 'log': True},
            'learning_rate': {'type': 'float', 'range': [0.001, 0.1]},
            'random_state': [42,],
        }
    }
}

fairgbm_config = {
    'fairgbm': {
        'classpath': 'aequitas.flow.methods.inprocessing.fairgbm.FairGBM',
        'args': {
            'global_constraint_type': ['FPR,FNR'],
            'global_target_fpr': [0.05],
            'global_target_fnr': {'type': 'float', 'range': [0.4, 0.6]},
            'constraint_type': ['fpr'],
            'constraint_fpr_threshold': [0],
            'proxy_margin': [1],
            'multiplier_learning_rate': {'type': 'float', 'range': [0.00001, 1.0], 'log': True},
            'constraint_stepwise_proxy': 'cross_entropy',
            'boosting_type': ['dart', 'gbdt'],
            'enable_bundle': [False,],
            'n_estimators': {'type': 'int', 'range': [10, 100]},
            'min_child_samples': {'type': 'int', 'range': [1, 500], 'log': True},
            'learning_rate': {'type': 'float', 'range': [0.001, 0.1]},
            'random_state': [42,]
        }
    }
}

sampling_config = {
    'sampling': {
        'classpath': 'aequitas.flow.methods.preprocessing.PrevalenceSampling',
        'args': {}
    }
}

random_forest_config = {
    'random_forest': {
        'classpath': 'aequitas.flow.methods.base_estimator.random_forest.RandomForest',
        'args': {
            'n_estimators': {'type': 'int', 'range': [10, 100]},
            'min_samples_leaf': {'type': 'int', 'range': [1, 500], 'log': True},
            'random_state': [42,]
        }
    }
}

group_threshold_config = {
    'group_threshold': {
        'classpath': 'aequitas.flow.methods.postprocessing.GroupThreshold',
        'args': {
            'threshold_type': 'fpr',
            'threshold_value': 0.05,
        }
    }
}

logistic_regression_config = {
    'logistic_regression': {
        'classpath': 'aequitas.flow.methods.base_estimator.logistic_regression.LogisticRegression',
        'args': {
            'penalty': ['l2', None],
            'tol': {'type': 'float', 'range': [1e-5, 1], 'log': True},
            'C': {'type': 'float', 'range': [1e-5, 1e2], 'log': True},
            'random_state': [42, ]
        }
    }
}

One important aspect of Fairflow is the combination of different types of methods into a single method: the **pre-processing** method **transforms** the input data, the **in-processing** method scores it, and the **post-processing** method transforms the scores. To do so, we must create an additional configuration that references the configurations of the individual methods.

##### **`composed_method_config.yaml`**
```yaml
<composed_method_name>:
    type: "pre, in, post-processing"

defaults:
    <composed_method_name>/preprocessing: <preprocessing_method>
    <composed_method_name>/inprocessing: <inprocessing_method>
    <composed_method_name>/postprocessing: <postprocessing_method>
```

For this configuration, the composed method will fetch the hyperparameter search space of each method from a path relative to its own. So, for example, if the path of the config is `~/composed_method_config.yaml`, the package expects the following files to exist:
-  `~/<composed_method_name>/preprocessing/<preprocessing_method>.yaml`
-  `~/<composed_method_name>/inprocessing/<inprocessing_method>.yaml`
-  `~/<composed_method_name>/postprocessing/<postprocessing_method>.yaml`


Note that for these methods, only an in-processing method is required. If the pre-processing or post-processing method is omitted, the transformation of the input data or of the scores, respectively, is skipped.
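
The chaining and skipping behaviour described above can be sketched with plain Python callables. This is a toy stand-in under stated assumptions, not the package's classes: `ComposedMethod`, `scale`, `mean_score`, and `binarize` are hypothetical names, and the fitting step that every real method has is left out for brevity.

```python
class ComposedMethod:
    """Toy stand-in: optional pre-, required in-, optional post-processing."""

    def __init__(self, inprocessing, preprocessing=None, postprocessing=None):
        self.preprocessing = preprocessing
        self.inprocessing = inprocessing
        self.postprocessing = postprocessing

    def predict(self, rows):
        # Pre-processing transforms the input data (skipped if omitted).
        if self.preprocessing is not None:
            rows = self.preprocessing(rows)
        # In-processing scores the (possibly transformed) data.
        scores = self.inprocessing(rows)
        # Post-processing transforms the scores (skipped if omitted).
        if self.postprocessing is not None:
            scores = self.postprocessing(scores)
        return scores


def scale(rows):
    return [[x / 10 for x in row] for row in rows]


def mean_score(rows):
    return [sum(row) / len(row) for row in rows]


def binarize(scores):
    return [int(s >= 0.5) for s in scores]


rows = [[2, 4], [8, 6]]
full = ComposedMethod(mean_score, preprocessing=scale, postprocessing=binarize)
only_in = ComposedMethod(mean_score)

print(full.predict(rows))     # [0, 1]
print(only_in.predict(rows))  # [3.0, 7.0]
```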

Let's see how to do this in a Python dictionary, with the previously created methods.

In [None]:
composed_config_lgbm = {'lightgbm':
  {
    'inprocessing': lightgbm_config,
    'type': 'base estimator',
  },
}

composed_config_fairgbm = {'fairgbm':
  {
    'inprocessing': fairgbm_config,
    'type': 'in-processing',
  },
}

composed_config_random_forest = {'Random Forest + Undersampling':
  {
    'preprocessing': sampling_config,
    'inprocessing': random_forest_config,
    'type': 'pre-processing',
  },
}

composed_config_logistic_regression = {'Logistic Regression':
  {
    'inprocessing': logistic_regression_config,
    'postprocessing': group_threshold_config,
    'type': 'post-processing',
  },
}

---

## Datasets

We will now **configure** a dataset to be used.

These follow a pattern similar to the methods, with **classpath**, **arguments**, and an additional keyword related to the **thresholding rule**.


##### **`dataset.yaml`**
```yaml
<dataset_name>:
    classpath: <python.path.to.Class>
    threshold:
        threshold_type: <type_of_threshold>
        threshold_value: <value>
    args:
        <arg_1>: <val_1>
        <arg_2>: <val_2>
        ...
        <arg_n>: <val_n>
```
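To illustrate what a thresholding rule of type `fpr` could mean in practice, the sketch below picks the lowest score cutoff whose false positive rate does not exceed a target value. It assumes that an instance is predicted positive when its score is at or above the cutoff; `threshold_for_fpr` is a hypothetical helper and the data is made up, so the package's own rule may differ in its details.

```python
def threshold_for_fpr(scores, labels, target_fpr):
    """Lowest score cutoff whose false positive rate stays at or below target_fpr.

    Hypothetical helper: an instance is predicted positive when score >= cutoff.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("At least one negative instance is needed to compute FPR.")
    best = float("inf")  # Predict nothing as positive if no cutoff qualifies.
    # FPR can only grow as the cutoff decreases, so we can stop at the first miss.
    for cutoff in sorted(set(scores), reverse=True):
        fpr = sum(s >= cutoff for s in negatives) / len(negatives)
        if fpr > target_fpr:
            break
        best = cutoff
    return best


# Made-up scores and labels (1 = positive instance).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

print(threshold_for_fpr(scores, labels, target_fpr=0.2))  # 0.5
```

With five negatives, a target of 0.2 allows exactly one false positive, which is the negative scored 0.6.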

In this notebook, we will use a sample of the Bank Account Fraud dataset. In this dataset, the objective is to determine if a given individual made a fraudulent attempt at opening a bank account (positive instance).

Here, we must define the dataset variant in the keyword arguments for the class, passed in the **args** key. This determines the dataset that will be selected.

Note that this is the configuration you should change to try a different dataset, including user-defined ones.

In [None]:
baf_sample_config = {
    'baf_sample': {
        'classpath': 'aequitas.flow.datasets.BankAccountFraud',
        'threshold': {
            'threshold_type': 'fpr',
            'threshold_value': 0.05,
        },
        'args': {'variant': 'Sample'},
    }
}

---
## Experiment

Finally, we can create a configuration for the experiment. It is a simple configuration, with the number of trials to run per algorithm, the datasets, and the algorithms themselves. We can also set the number of parallel jobs used for training.



##### **`experiment.yaml`**
```yaml
optimization:
  n_trials: <trials_per_method>
  n_jobs: <parallel_jobs>
  sampler: <optuna_sampler>
  sampler_args:
    <arg_1>: <val_1>
    <arg_2>: <val_2>
    ...
    <arg_n>: <val_n>
    
datasets:
  - <dataset_1>
  - <dataset_2>
  ...
  - <dataset_n>

methods:
  - <method_1>
  - <method_2>
  ...
  - <method_n>
```

The experiment configuration shares a property with the composed method configurations: it reads the configurations linked to some of its fields, in this case `datasets` and `methods`. To this end, the configurations of these components **must be** in directories at the same level as the experiment configuration. For example, if the experiment file is `~/experiment.yaml`, then the dataset files must be at `~/datasets/<dataset_1>.yaml`, and so on. The same applies to methods, which should be at `~/methods/<method_1>.yaml`, and so on.
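
The directory layout described above can be checked with a few lines of `pathlib`. This is a minimal sketch: `expected_config_files` is a hypothetical helper, and the `configs/` directory and the method file names are placeholders chosen for the example.

```python
from pathlib import Path


def expected_config_files(experiment_file, datasets, methods):
    """List the YAML files expected next to an experiment configuration.

    Hypothetical helper that mirrors the layout described in the text.
    """
    root = Path(experiment_file).parent
    files = [root / "datasets" / f"{name}.yaml" for name in datasets]
    files += [root / "methods" / f"{name}.yaml" for name in methods]
    return files


files = expected_config_files(
    "configs/experiment.yaml",
    datasets=["baf_sample"],
    methods=["lightgbm", "fairgbm"],
)
for path in files:
    print(path.as_posix())
```

Calling `path.exists()` on each entry before launching would flag a misplaced file early.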

The config can also be defined as a dictionary, as shown below.

In [None]:
config = {
  'optimization': {
      'n_trials': 50,  # Number of runs per algorithm.
      'n_jobs': 1,
      'sampler': 'RandomSampler',  # The sampler for hyperparameters.
      'sampler_args': {'seed': 42},
  },
  'datasets': [baf_sample_config],
  'methods': [composed_config_lgbm, composed_config_fairgbm, composed_config_logistic_regression, composed_config_random_forest],
}
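
Before launching, it can help to estimate how much work a configuration implies. The sketch below assumes that every method is run `n_trials` times on every dataset, which matches the "runs per algorithm" comment but is an assumption for the multi-dataset case; `total_trials` and `example_config` are hypothetical names that only mirror the shape of the dictionary above.

```python
def total_trials(config):
    """Estimate the number of models trained, assuming n_trials per method per dataset."""
    n_trials = config["optimization"]["n_trials"]
    return n_trials * len(config["datasets"]) * len(config["methods"])


# Same shape as the experiment configuration, with names in place of nested dicts.
example_config = {
    "optimization": {"n_trials": 50, "n_jobs": 1},
    "datasets": ["baf_sample"],
    "methods": ["lightgbm", "fairgbm", "logistic_regression", "random_forest"],
}

print(total_trials(example_config))  # 200
```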

You can see all the configurations above as `yaml` files by running the download cell below. They will be available in the directory `examples/configs/colab_configs`.

In [None]:
# This cell downloads the example configurations from the repository. You do not need to run it if you have your own configurations.
from aequitas.flow.utils.colab import get_examples

get_examples("configs")

---
#### Instantiating the Experiment

Now, with the configurations correctly set, all that is left to do is to instantiate an `Experiment` object with the configurations and run it.

In [None]:
from aequitas.flow.experiment import Experiment
from pathlib import Path
from omegaconf import DictConfig

experiment = Experiment(config=DictConfig(config), name="baf_exp")

experiment.run()

Note that it is possible to run a pre-defined grid and number of trials with the `DefaultExperiment` class.

We can now check some of the results from the experiment we ran.

In [None]:
from aequitas.flow.utils.results import read_results

from pathlib import Path

results = read_results(Path("artifacts/baf_exp"))

In [None]:
# Some of the models aren't able to reach the target 5% FPR
# We will filter these results out, with a slack of +/-2% FPR
dataset_results = results["baf_sample"]

for method, method_results in dataset_results.items():
    filtered_method_results = []
    for iteration in method_results:
        if 0.03 < iteration.validation_results["fpr"] < 0.07:  # Here we implement the slack
            filtered_method_results.append(iteration)
    dataset_results[method] = filtered_method_results

results["baf_sample"] = dataset_results

In [None]:
from aequitas.flow.plots.pareto import Plot

plot = Plot(
    results,
    "baf_sample",
    "Predictive Equality",
    "TPR",
    split="validation"
)

plot.visualize()
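
The plot class is imported from a module named `pareto`, so it may help to recall what a Pareto front is: the set of models for which no other model is at least as good on both axes and strictly better on one. The sketch below computes it for made-up (fairness, performance) pairs; `pareto_front` is a hypothetical helper, and it is an assumption that the plot highlights these points in the same way.

```python
def pareto_front(points):
    """Return the points not dominated by any other (higher is better on both axes)."""
    front = []
    for i, (fair_i, perf_i) in enumerate(points):
        dominated = any(
            fair_j >= fair_i
            and perf_j >= perf_i
            and (fair_j > fair_i or perf_j > perf_i)
            for j, (fair_j, perf_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((fair_i, perf_i))
    return front


# Made-up (fairness, performance) pairs, one per trained model.
trials = [(0.9, 0.40), (0.7, 0.55), (0.6, 0.50), (0.95, 0.30), (0.8, 0.40)]

print(pareto_front(trials))  # [(0.9, 0.4), (0.7, 0.55), (0.95, 0.3)]
```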

We can also perform a bias audit on a specific model in the results.

In [None]:
from aequitas.flow.datasets import BankAccountFraud

dataset = BankAccountFraud("Sample")
dataset._download = False
dataset.load_data()
dataset.create_splits()

plot.bias_audit(119, dataset.test, "customer_age_bin", metrics=["fpr"], results_path="artifacts/baf_exp")