# Mitigating Disparities

This demo shows how to run the `mitigate_disparity` script on a development dataset. 
Below, we demonstrate how to run `mitigate_disparity.py` from the command line using a model trained to predict risk of admission to the emergency department using the freely available [MIMIC-IV repository](https://www.nature.com/articles/s41597-022-01899-x). 

## Inputs

In addition to providing a dataset, the user should identify protected features by providing a list of column names corresponding to demographics and/or other variables over which fairness should be sought.
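
As a minimal sketch of how such a list can be checked before training, the snippet below splits a comma-separated string of protected features and confirms that every name is a column in the dataset header. The helper `parse_protected_features`, the inline CSV sample, and the `age` column are hypothetical and serve only as illustration; this is not the parsing code used by `mitigate_disparity.py`.

```python
import csv
import io


def parse_protected_features(arg, columns):
    """Split a comma-separated string and check each name against the columns."""
    names = [name.strip() for name in arg.split(",") if name.strip()]
    missing = [name for name in names if name not in columns]
    if missing:
        raise ValueError(f"Protected features not found in dataset: {missing}")
    return names


# Hypothetical header and row standing in for the development dataset.
sample_csv = io.StringIO(
    "age,ethnicity,gender,insurance,binary outcome\n"
    "54,A,F,Medicare,0\n"
)
header = next(csv.reader(sample_csv))

protected = parse_protected_features("ethnicity,gender,insurance", header)
print(protected)
```

Validating the names up front surfaces a misspelled column immediately rather than partway through a training run.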

## Continuous Updating

This script may also be used to update a model with new data by passing a `starting_point` parameter. 
This allows models to be continuously updated over time as new biases arise and dataset shift occurs, without having to start from scratch. 
Under the hood, this is done by setting `checkpoint=True` in the `FomoClassifier` object. 
See the [Fomo docs](https://cavalab.org/fomo/) for more information on options. 

You may also [browse the API](https://cavalab.org/interfair/api.html) for `mitigate_disparity.py`. 


Below, we run `mitigate_disparity.py` using a development dataset and specifying that we want to ensure fairness with respect to the features named ethnicity, gender, and insurance. 

In [None]:
%run ../mitigate_disparity.py \
    --dataset ../data/mimic/development_dataset.train.csv \
    --protected_features ethnicity,gender,insurance 

Calling `mitigate_disparity.py` will produce an `estimator.pkl` file that can be loaded for further analysis. 
We demonstrate this below.

## Visualize fairness/error trade-offs

Once training is done, we can view a set of candidate models. 
The red dot indicates the model that was selected. 
In addition to the default "PseudoWeights" approach, FOMO provides other multi-criteria decision making (MCDM) algorithms via pymoo.

In [None]:
import pickle
with open('../estimator.pkl','rb') as f:
    est = pickle.load(f)
est.plot().show()
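
To give a sense of how a pseudo-weight rule can pick one model from a set of candidates, here is a small NumPy sketch. It follows the general pseudo-weight idea: score each candidate by its normalized distance from the worst value on each objective, rescale the scores to sum to one, then choose the candidate whose scores are closest to the desired weights. The function name `pseudo_weight_choice` and the objective values are hypothetical, and this is an illustration rather than the code used by FOMO or pymoo.

```python
import numpy as np


def pseudo_weight_choice(F, weights):
    """Return the index of the row of F whose pseudo-weights best match `weights`.

    F holds one row per candidate model and one column per objective,
    with every objective to be minimized.
    """
    F = np.asarray(F, dtype=float)
    weights = np.asarray(weights, dtype=float)
    ideal = F.min(axis=0)   # best observed value per objective
    nadir = F.max(axis=0)   # worst observed value per objective
    span = np.where(nadir > ideal, nadir - ideal, 1.0)  # avoid dividing by zero
    closeness = (nadir - F) / span  # 1 = best on that objective, 0 = worst
    totals = closeness.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0
    pseudo = closeness / totals     # each row now sums to 1
    return int(np.argmin(np.abs(pseudo - weights).sum(axis=1)))


# Hypothetical candidates: columns are (error, unfairness), both minimized.
front = [[0.10, 0.30],
         [0.14, 0.18],
         [0.22, 0.10]]

balanced = pseudo_weight_choice(front, [0.5, 0.5])
print("Balanced preference selects candidate", balanced)
```

Shifting the weights toward the first objective favors the lowest-error candidate, while shifting them toward the second favors the fairest one.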

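## Check test set performance

This cell loads the saved estimator and the held-out test split, then passes them to `make_measure_dataset`, which is imported from `utils`. 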

In [None]:
import os
import pickle
import sys

import pandas as pd

# Add the parent directory to sys.path so that local modules can be imported
sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), '..')))
from utils import make_measure_dataset

# Load the estimator saved by mitigate_disparity.py
with open('../estimator.pkl', 'rb') as f:
    est = pickle.load(f)

# Load the held-out test split and separate the features from the label
df_test = pd.read_csv('../data/mimic/development_dataset.test.csv')
X_test = df_test.drop(columns='binary outcome')
y_test = df_test['binary outcome']
make_measure_dataset(est, 'fomo', X_test, y_test)

## Measure change in disparity measures

Now that we have an updated model, we can check how our disparity measures have changed. 
Below we run `measure_disparity.py` with our new results and compare the results to the old ones. 

In [None]:
from measure_disparity import measure_disparity
measure_disparity('../fomo_model_mimic4_admission.csv', save_file='df_fairness.post.csv')

## Improvements over Baseline Model

If we compare with results from our baseline model in [demo_measure_disparity.ipynb](https://cavalab.org/interfair/demo_measure_disparity.html), we see that we have made a marked improvement to the maximum subgroup deviations on the test set:


In [None]:
from tabulate import tabulate

print(
    tabulate(
        [
            ["Max Subgroup Deviation in Metric (%)", "Original", "New"],
            ["Brier Score (MSE)", 19.9, 19.3],
            ["Subgroup FNR", 20.4, 10.9],
            ["Subgroup FPR", 86.0, 62.3],
            ["Positivity Rate", 44.9, 28.8],
        ],
        headers="firstrow",
        tablefmt="rounded_outline",
    )
)

In summary, our new model has a more equal false negative rate among groups than before, which was our goal. 
In addition, we see reductions in the false positive rate deviations and differences in positivity rates. 
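
To make the idea of a subgroup false negative rate deviation concrete, the sketch below computes the FNR within each group and reports the group whose rate differs most from the overall FNR. This is one plausible definition, assumed for illustration; it is not necessarily the calculation performed by `measure_disparity.py`, and the labels, predictions, and group names are made up.

```python
import math
from collections import defaultdict


def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), computed over the positive examples only."""
    positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not positives:
        return float("nan")
    return sum(1 for p in positives if p == 0) / len(positives)


def max_subgroup_fnr_deviation(y_true, y_pred, groups):
    """Return (group, deviation) for the group furthest from the overall FNR."""
    overall = false_negative_rate(y_true, y_pred)
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    deviations = {}
    for g, (ts, ps) in by_group.items():
        rate = false_negative_rate(ts, ps)
        if not math.isnan(rate):  # skip groups with no positive examples
            deviations[g] = abs(rate - overall)
    worst = max(deviations, key=deviations.get)
    return worst, deviations[worst]


# Made-up labels, predictions, and group membership.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

worst_group, deviation = max_subgroup_fnr_deviation(y_true, y_pred, groups)
print(worst_group, deviation)
```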

In terms of overall performance, we see a slight decrease, as we would also expect:

- AUROC: 0.881 -> 0.859
- AUPRC: 0.77 -> 0.74
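
For a quick sense of scale, the following sketch computes the relative change for each figure reported above, covering both the maximum subgroup deviations and the overall performance metrics. The numbers are copied from this demo; the helper `relative_change` is a hypothetical name introduced here.

```python
# (original, new) values as reported in this demo
reported = {
    "Brier Score (MSE) deviation": (19.9, 19.3),
    "Subgroup FNR deviation": (20.4, 10.9),
    "Subgroup FPR deviation": (86.0, 62.3),
    "Positivity Rate deviation": (44.9, 28.8),
    "AUROC": (0.881, 0.859),
    "AUPRC": (0.77, 0.74),
}


def relative_change(old, new):
    """Percent change from old to new, relative to old."""
    return 100.0 * (new - old) / old


for name, (old, new) in reported.items():
    print(f"{name:<30} {old:>7} -> {new:>7}  ({relative_change(old, new):+.1f}%)")
```

For the deviation rows a negative change is an improvement, whereas for AUROC and AUPRC a negative change is a loss in performance.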


Using the model visualization tools above, decision makers can decide whether this model, or another within the set, is a better fit for their use case. 