# 1. Install DEON

https://alecokas.github.io/ethics/2020/03/11/tools-for-ethical-machine-learning-development.html

In [38]:
!pip install deon



In [39]:
!python -m deon --help

Usage: python -m deon [OPTIONS]

  Easily create an ethics checklist for your data science project.

  The checklist will be printed to standard output by default. Use the
  --output option to write to a file instead.

Options:
  -l, --checklist PATH  Override default checklist file with a path to a
                        custom checklist.yml file.
  -f, --format TEXT     Output format. Default is "markdown". Can be one of
                        [ascii, html, jupyter, markdown, rmarkdown, rst].
                        Ignored and file extension used if --output is passed.
  -o, --output PATH     Output file path. Extension can be one of [.txt,
                        .html, .ipynb, .md, .rmd, .rst]. The checklist is
                        appended if the file exists.
  -w, --overwrite       Overwrite output file if it exists. Default is False,
                        which will append to existing file.
  -m, --multicell       For use with Jupyter format only. Write checklist with
  

# 2. Create an ethics checklist
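
The command in this section reads a custom checklist from `my_checklist.yml`, which the notebook does not show. As a rough sketch, assuming a custom file follows the same layout as Deon's bundled default checklist (a `title` plus `sections`, each with a `section_id` and `lines` carrying `line_id`, `line_summary`, and `line`), a minimal file could be generated as follows. The file name `example_checklist.yml` and the checklist wording are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical checklist content. The key names are assumed to mirror
# Deon's default checklist.yml and should be checked against the Deon docs.
CHECKLIST_YAML = """\
title: Example Ethics Checklist
sections:
  - title: Data Collection
    section_id: A
    lines:
      - line_id: A.1
        line_summary: Informed consent
        line: Have the people in the dataset agreed to this use of their data?
  - title: Modeling
    section_id: B
    lines:
      - line_id: B.1
        line_summary: Fairness across groups
        line: Have we compared model results across relevant groups?
"""

# Write to a temporary directory so that no existing file is overwritten.
checklist_path = Path(tempfile.mkdtemp()) / "example_checklist.yml"
checklist_path.write_text(CHECKLIST_YAML, encoding="utf-8")
print(f"Wrote {checklist_path}")
```

The resulting path could then be passed to `--checklist` in place of `my_checklist.yml`.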

In [40]:
!python -m deon --output Lista.ipynb --checklist my_checklist.yml --overwrite

Checklist successfully written to file Lista.ipynb.


# 1. Install TransparentAI

In [41]:
!pip install transparentai --user



In [42]:
import numpy as np
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from transparentai.datasets import load_adult
from transparentai import fairness

In [43]:
data = load_adult()
X, Y = data.drop(columns='income'), data['income']

In [44]:
X = X.select_dtypes('number')
Y = Y.replace({'>50K':1, '<=50K':0})

In [45]:
X_train, X_valid, Y_train, Y_valid = train_test_split(X, Y, test_size=0.33, random_state=42)

In [46]:
clf = RandomForestClassifier()
clf.fit(X_train,Y_train)

In [69]:
y_true       = Y_train.astype(float)
y_true_valid = Y_valid.astype(float)
y_pred       = clf.predict_proba(X_train)
y_pred_valid = clf.predict_proba(X_valid)

In [101]:
privileged_group = {
    'gender':['Male'],                
    'marital-status': lambda x: 'Married' in x,
    'race':['White']
}

df_valid = data.loc[X_valid.index,:]
df_train = data.loc[X_train.index,:]
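
Each entry in `privileged_group` is either a list of values or a callable. One plausible reading, stated here as an assumption rather than as TransparentAI's documented behavior, is that a row counts as privileged for an attribute when its value is in the list, or when the callable returns a truthy result. The helper `is_privileged` and the two sample rows below are hypothetical.

```python
def is_privileged(value, rule):
    """Return True if `value` satisfies `rule` (a callable or a list of values)."""
    if callable(rule):
        return bool(rule(value))
    return value in rule

privileged_group = {
    'gender': ['Male'],
    'marital-status': lambda x: 'Married' in x,
    'race': ['White'],
}

# Two hand-written rows using category names found in the Adult dataset.
rows = [
    {'gender': 'Male', 'marital-status': 'Married-civ-spouse', 'race': 'White'},
    {'gender': 'Female', 'marital-status': 'Never-married', 'race': 'Black'},
]

# One boolean mask per protected attribute.
masks = {
    attr: [is_privileged(row[attr], rule) for row in rows]
    for attr, rule in privileged_group.items()
}
print(masks)
```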

In [103]:
res_train = fairness.model_bias(y_true, y_pred, df_train, privileged_group)
res_train['gender']

{'statistical_parity_difference': -0.06841011619362844,
 'disparate_impact': 0.5872402985025538,
 'equal_opportunity_difference': -0.04130756838043226,
 'average_odds_difference': -0.02065378419021613}

In [104]:
res_train = fairness.model_bias(y_true, y_pred, df_train, privileged_group, returns_text=True)
print(res_train['gender'])

The privileged group is predicted with the positive output 6.84% more often than the unprivileged group. This is considered to be fair.
The privileged group is predicted with the positive output 1.70 times more often than the unprivileged group. This is considered to be not fair.
For a person in the privileged group, the model predict a correct positive output 4.13% more often than a person in the unprivileged group. This is considered to be fair.
For a person in the privileged group, the model predict a correct positive output or a correct negative output 2.07% more often than a person in the unprivileged group. This is considered to be fair.
The model has 3 fair metrics over 4 (75%).
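
The four metrics can be reproduced by hand on a toy example. The sketch below uses the common textbook definitions, with every difference taken as unprivileged minus privileged. That sign convention is consistent with the negative values above, and `1 / 0.587 ≈ 1.70` matches the "1.70 times more often" in the text report. Whether TransparentAI computes the metrics in exactly this way, including how it turns probabilities into class labels, is an assumption. The function names and the data are hypothetical.

```python
def rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

def group_rates(y_true, y_pred, mask):
    """Positive-prediction rate, TPR and FPR for the rows selected by mask."""
    preds = [p for p, m in zip(y_pred, mask) if m]
    on_true_pos = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == 1]
    on_true_neg = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == 0]
    return rate(preds), rate(on_true_pos), rate(on_true_neg)

def fairness_metrics(y_true, y_pred, privileged):
    """Group fairness metrics, each computed as unprivileged vs. privileged."""
    unprivileged = [not m for m in privileged]
    pos_p, tpr_p, fpr_p = group_rates(y_true, y_pred, privileged)
    pos_u, tpr_u, fpr_u = group_rates(y_true, y_pred, unprivileged)
    return {
        'statistical_parity_difference': pos_u - pos_p,
        'disparate_impact': pos_u / pos_p,
        'equal_opportunity_difference': tpr_u - tpr_p,
        'average_odds_difference': 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

# Toy labels and hard predictions (not taken from the Adult dataset).
y_true     = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0]
y_pred     = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
privileged = [True] * 5 + [False] * 5

metrics = fairness_metrics(y_true, y_pred, privileged)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy data the privileged group receives a positive prediction 60% of the time and the unprivileged group 20% of the time, so the statistical parity difference is about -0.4 and the disparate impact about 0.33.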


In [84]:
# Return the raw metric values (no returns_text) so the printed result is a dict
res_valid = fairness.model_bias(y_true_valid, y_pred_valid, df_valid, privileged_group)

In [85]:
print(res_valid['gender'])

{'statistical_parity_difference': -0.13273015839435442, 'disparate_impact': 0.3365011427333495, 'equal_opportunity_difference': -0.04619575675547505, 'average_odds_difference': -0.023097878377737524}
