# Using Aequitas for Bias and Fairness in Data Science Systems
## Exploring Aequitas's visualization tools to audit a single model for fairness and to look at trade-offs between fairness and accuracy metrics

## What does this notebook do?

This notebook describes how to use the results of Aequitas to visualize fairness metrics for your models. Before using it, you must have run Triage with a `bias_audit_config` so that the Aequitas database tables are populated. This visualization example uses the 2014 Donors Choose data.

## Install dependencies, import packages and data

In [None]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:75% !important; }</style>"))
import yaml  # required below to read database.yaml
#import os
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
#from aequitas.group import Group
#from aequitas.bias import Bias
#from aequitas.fairness import Fairness
# Import the module by name so that `aequitas_utils` is defined for the cells below
from triage.component.postmodeling.fairness import aequitas_utils
import aequitas.plot as ap
import sqlalchemy
#DATAPATH = 'https://github.com/dssg/fairness_tutorial/raw/master/data/'
#DPI = 200

## Load Database and Select Data From Aequitas
Your database configuration file should be in the following form:
<h3 align="center">
host: __ 
user: __
db: __
pass: __
port: __
</h3>

In [None]:
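# Read the connection settings (host, user, db, pass, port) from the YAML file
with open('database.yaml') as file:
    dbconfig = yaml.safe_load(file)

# The YAML keys (db, pass) are not valid psycopg2 connection arguments, so build the
# URL explicitly; SQLAlchemy 1.4+ also requires 'postgresql' rather than 'postgres'
db_url = 'postgresql://{user}:{pass}@{host}:{port}/{db}'.format(**dbconfig)
conn = sqlalchemy.create_engine(db_url)
bdf = aequitas_utils.get_aequitas_results(conn)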
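
A missing key in `database.yaml` tends to surface as an opaque driver error at connection time. The sketch below shows one way to check the configuration before connecting. `build_postgres_url` is a hypothetical helper, not part of Triage or Aequitas, and the example values are made up; the key names are the ones listed in the configuration form above.

```python
from urllib.parse import quote_plus

# Keys expected in database.yaml, as listed in the configuration form above
REQUIRED_KEYS = ("host", "user", "db", "pass", "port")


def build_postgres_url(config):
    """Hypothetical helper: validate a config dict and return a PostgreSQL URL."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"database config is missing keys: {missing}")
    # Percent-encode credentials so characters such as '@' or '/' do not break the URL
    user = quote_plus(str(config["user"]))
    password = quote_plus(str(config["pass"]))
    return (
        f"postgresql://{user}:{password}"
        f"@{config['host']}:{config['port']}/{config['db']}"
    )


# Made-up values for illustration only
example_config = {
    "host": "localhost",
    "user": "analyst",
    "db": "donors",
    "pass": "p@ss/word",
    "port": 5432,
}
example_url = build_postgres_url(example_config)
```

The resulting string can be passed to `sqlalchemy.create_engine` in place of a hand-built URL.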

## Define Attributes to Audit and Reference Group for each Attribute

In [None]:
# Map each attribute to audit to the group used as its reference for disparities
attributes_and_reference_groups = {
    'poverty_level': 'lower',
    'metro_type': 'suburban_rural',
    'teacher_sex': 'male',
}
attributes_to_audit = list(attributes_and_reference_groups.keys())

## Select fairness metric(s) that we care about

In [None]:
metrics = ['tpr']
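
For reference, the true positive rate (TPR) of a group is TP / (TP + FN): the share of actual positives in that group that the model flags. A minimal sketch of the per-group computation on made-up data follows. The column names `label` and `prediction` and the group value `highest` are assumptions for illustration and do not come from the Aequitas tables.

```python
import pandas as pd

# Hypothetical toy data: one row per project, with a true label and a binary prediction
toy = pd.DataFrame({
    "poverty_level": ["lower", "lower", "lower", "highest", "highest", "highest", "highest"],
    "label":         [1, 1, 0, 1, 1, 1, 0],
    "prediction":    [1, 0, 0, 1, 1, 1, 1],
})

# TPR only considers rows whose true label is positive
positives = toy[toy["label"] == 1]

# Within each group, the mean prediction over positives equals TP / (TP + FN)
tpr_by_group = positives.groupby("poverty_level")["prediction"].mean()
print(tpr_by_group)
```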

## Define Disparity Tolerance

In [None]:
disparity_tolerance = 1.30
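
To make the tolerance concrete: a disparity is the ratio of a group's metric to the reference group's metric. The sketch below assumes the tolerance is applied symmetrically, so that a group counts as within tolerance when its disparity lies between 1/1.30 and 1.30. That is a reading of how `fairness_threshold` is used and worth confirming against the Aequitas documentation. The TPR values and the `moderate` and `highest` groups are made up.

```python
import pandas as pd

# Hypothetical per-group TPR values, for illustration only
tpr_by_group = pd.Series({"lower": 0.50, "moderate": 0.55, "highest": 0.70})
reference_group = "lower"
disparity_tolerance = 1.30

# Disparity: each group's metric divided by the reference group's metric
disparity = tpr_by_group / tpr_by_group[reference_group]

# Assumed symmetric band: 1 / tolerance <= disparity <= tolerance
within_tolerance = disparity.between(1 / disparity_tolerance, disparity_tolerance)

print(pd.DataFrame({"tpr_disparity": disparity, "within_tolerance": within_tolerance}))
```

Under these assumptions the reference group always has a disparity of 1, and only groups whose ratio falls outside the band are flagged.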

## Look at Audit Results

We now focus the analysis on the fairness metric of interest in this case study: TPR across different groups. For each attribute, the cells below call two functions from the `aequitas.plot` module: `ap.disparity()`, which plots each group's disparity relative to the reference group, and `ap.absolute()`, which plots the group-wise metric values themselves.

### Check for Fairness in Poverty Level 

In [None]:
ap.disparity(bdf, metrics, 'poverty_level', fairness_threshold = disparity_tolerance)

In [None]:
ap.absolute(bdf, metrics, 'poverty_level', fairness_threshold = disparity_tolerance)

### Check for Fairness in Metro_Type (where the school is based)

In [None]:
ap.disparity(bdf, metrics, 'metro_type', fairness_threshold = disparity_tolerance)

In [None]:
ap.absolute(bdf, metrics, 'metro_type', fairness_threshold = disparity_tolerance)

### Check for Fairness in the Sex of the Teacher submitting the project 

In [None]:
ap.disparity(bdf, metrics, 'teacher_sex', fairness_threshold = disparity_tolerance)

In [None]:
ap.absolute(bdf, metrics, 'teacher_sex', fairness_threshold = disparity_tolerance)

### Deeper Dive into the audit results

#### Look at the underlying data: Disparities for all metrics 

In [None]:
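from aequitas.bias import Bias

# list_disparities returns the names of the disparity columns present in bdf
b = Bias()
bdf[['attribute_name', 'attribute_value'] + b.list_disparities(bdf)]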

#### Look at the underlying data: All Metrics

In [None]:
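from aequitas.group import Group

g = Group()
# Assumption: bdf also carries the group-level (crosstab) columns, so it stands in for xtab
xtab = bdf
absolute_metrics = g.list_absolute_metrics(xtab)
xtab[['attribute_name', 'attribute_value'] + absolute_metrics]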

#### Look at the underlying data: All raw counts

In [None]:
xtab[[col for col in xtab.columns if col not in absolute_metrics]]