# Using Aequitas for Bias and Fairness in Data Science Systems
## Exploring Aequitas's visualization tools to audit a single model for fairness and to examine trade-offs between fairness and accuracy metrics

## What does this notebook do?

This notebook shows how to visualize fairness metrics for your models from Aequitas results. Before using it, you must have run Triage with a bias_audit_config so that the Aequitas database tables are populated. This example uses the 2014 Donors Choose data.

## Install dependencies, import packages and data

In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:75% !important; }</style>"))
import yaml
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import triage.component.postmodeling.fairness.aequitas_utils as au
import aequitas.plot as ap
import sqlalchemy

## Load Database and Select Data From Aequitas
Your database configuration file should be in the following form:
<h3 align="center">
host: __  <br>
user: __  <br>
db: __  <br>
pass: __  <br>
port: __  <br>
</h3>
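
As an alternative to passing the fields as separate connection arguments, the same five values can be combined into a single connection URL. The sketch below is only an illustration: the helper name `build_postgres_url` and the placeholder values are hypothetical and not part of Triage or Aequitas. It percent-encodes the user and password so that special characters do not break the URL.

```python
from urllib.parse import quote_plus


def build_postgres_url(cfg):
    """Build a PostgreSQL connection URL from a dict shaped like database.yaml."""
    # Percent-encode user and password so characters such as '@' or ':' are URL-safe.
    user = quote_plus(str(cfg["user"]))
    password = quote_plus(str(cfg["pass"]))
    return f"postgresql://{user}:{password}@{cfg['host']}:{cfg['port']}/{cfg['db']}"


# Placeholder values for illustration only; they do not refer to a real server.
example_cfg = {
    "host": "localhost",
    "user": "analyst",
    "db": "donors",
    "pass": "p@ss:word",
    "port": 5432,
}
example_url = build_postgres_url(example_cfg)
print(example_url)
```

The resulting string could then be passed to `sqlalchemy.create_engine` in place of the bare dialect URL plus `connect_args`.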

In [2]:
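# Configure Database
with open('database.yaml') as file:
    # safe_load is sufficient for a plain key/value config file
    dbconfig = yaml.safe_load(file)
# Map the keys in database.yaml to the keyword arguments the driver expects
dbconfig = {
    'host': dbconfig['host'],
    'user': dbconfig['user'],
    'database': dbconfig['db'],
    'password': dbconfig['pass'],
    'port': dbconfig['port']
    }
# Use the 'postgresql://' dialect name; SQLAlchemy 1.4+ no longer accepts 'postgres://'
conn = sqlalchemy.create_engine('postgresql://', connect_args=dbconfig)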

# Load Data
bdf = au.get_aequitas_results(conn, parameter = "100_abs", model_id = 405)
bdf.head(5)

| | model_id | subset_hash | tie_breaker | evaluation_start_time | evaluation_end_time | matrix_uuid | parameter | attribute_name | attribute_value | total_entities | ... | Impact_Parity | FDR_Parity | FPR_Parity | FOR_Parity | FNR_Parity | TypeI_Parity | TypeII_Parity | Equalized_Odds | Unsupervised_Fairness | Supervised_Fairness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 405 | | worst | 2012-05-01 | 2012-05-31 | 107dc250b46e9ef9b5bf4ac4b5d323fd | 100_abs | school_metro | urban | 5461 | ... | | | | | | | | | | |
| 1 | 405 | | worst | 2012-05-01 | 2012-05-31 | 107dc250b46e9ef9b5bf4ac4b5d323fd | 100_abs | school_metro | suburban | 5461 | ... | | | | | | | | | | |
| 2 | 405 | | worst | 2012-05-01 | 2012-05-31 | 107dc250b46e9ef9b5bf4ac4b5d323fd | 100_abs | school_metro | rural | 5461 | ... | | | | | | | | | | |
| 3 | 405 | | worst | 2012-05-01 | 2012-05-31 | 107dc250b46e9ef9b5bf4ac4b5d323fd | 100_abs | poverty_level | moderate poverty | 5461 | ... | | | | | | | | | | |
| 4 | 405 | | worst | 2012-05-01 | 2012-05-31 | 107dc250b46e9ef9b5bf4ac4b5d323fd | 100_abs | poverty_level | low poverty | 5461 | ... | | | | | | | | | | |


## Select fairness metric(s) that we care about

In [3]:
metrics = ['tpr']

## Define Disparity Tolerance

In [4]:
disparity_tolerance = 1.30
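
To illustrate what a tolerance of 1.30 implies, the sketch below assumes the threshold is applied symmetrically, so that a disparity ratio counts as acceptable when it falls between 1/1.30 (about 0.77) and 1.30. This symmetric reading is an assumption about how the threshold is interpreted, the function name `within_tolerance` is hypothetical, and the ratios are made up for illustration.

```python
def within_tolerance(disparity, tolerance):
    """Return True if a disparity ratio lies in the band [1/tolerance, tolerance]."""
    if tolerance < 1:
        raise ValueError("tolerance is expected to be >= 1")
    return (1.0 / tolerance) <= disparity <= tolerance


tolerance = 1.30
lower_bound = 1.0 / tolerance  # roughly 0.77

# Hypothetical disparity ratios, chosen only to illustrate the band.
for ratio in (0.70, 0.90, 1.00, 1.25, 1.40):
    print(ratio, within_tolerance(ratio, tolerance))
```

Under this assumption, a group whose TPR is less than roughly 77% of the reference group's TPR, or more than 130% of it, would be flagged.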

## Look at Audit Results

Now we focus on the fairness metric of interest in this case study: TPR (true positive rate) across different groups. The aequitas plot module provides disparity(), which plots each group's disparity relative to a reference group, and absolute(), which plots the group-wise metric values; both are used below.
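
To make the two quantities concrete, the following sketch computes a group-wise TPR and the corresponding disparity by hand. It is not how Aequitas computes them internally; the toy labels and predictions are invented for illustration, and the choice of "urban" as the reference group is an assumption made only for this example.

```python
def true_positive_rate(labels, predictions):
    """TPR = true positives / actual positives; returns None if there are no positives."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        return None
    true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    return true_positives / positives


# Toy (labels, predictions) per group, invented purely for illustration.
toy_groups = {
    "urban": ([1, 1, 1, 1, 0], [1, 1, 1, 0, 0]),     # 3 of 4 positives caught
    "suburban": ([1, 1, 0, 0, 0], [1, 0, 0, 1, 0]),  # 1 of 2 positives caught
}

group_tpr = {name: true_positive_rate(y, p) for name, (y, p) in toy_groups.items()}

# Disparity is a group's metric divided by the reference group's metric.
reference = "urban"  # assumed reference group for this sketch
tpr_disparity = {name: tpr / group_tpr[reference] for name, tpr in group_tpr.items()}

print(group_tpr)
print(tpr_disparity)
```

In this toy case the suburban group's TPR is two thirds of the urban group's, so its disparity is about 0.67, while the reference group's disparity is 1 by construction.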

### Check for Fairness by School Metro Type (where the school is located)

In [5]:
%matplotlib inline
ap.disparity(bdf, metrics, 'school_metro', fairness_threshold = disparity_tolerance)

In [6]:
%matplotlib inline
ap.absolute(bdf, metrics, 'school_metro', fairness_threshold = disparity_tolerance)