## Overview 

In this demo, we will show:

1. How to represent actions that are available to a person using an `ActionSet` 
2. How to use a `Flipset` to give a consumer who is denied a loan by a machine learning model a list of actionable changes that would flip their prediction
3. How to verify that a model will provide recourse to all of its decision subjects using the `RecourseAuditor`




Our library provides tools for recourse reporting and verification.

We'll start by building a machine learning model for loan approval that we'll use for the demo. 
We'll use a processed version of the `german` credit dataset from the [UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)). 
We'll predict repayment with a simple logistic regression model. 

In [16]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
import recourse as rs
from IPython.display import display, HTML
pd.options.display.float_format = '{:,.3f}'.format

# import data
url = 'https://raw.githubusercontent.com/ustunb/actionable-recourse/master/examples/paper/data/credit_processed.csv'
df = pd.read_csv(url)
y, X = df.iloc[:, 0], df.iloc[:, 1:]

In [17]:
# train a classifier
clf = LogisticRegression(max_iter = 1000, C=1./1000, solver='liblinear', penalty='l1')
clf.fit(X, y)
yhat = clf.predict(X)

In [18]:
pd.Series(clf.coef_[0], index=X.columns).to_frame('Coefficients')

| Feature | Coefficients |
|---|---|
| Married | 0.0 |
| Single | 0.0 |
| Age_lt_25 | 0.0 |
| Age_in_25_to_40 | 0.0 |
| Age_in_40_to_59 | 0.0 |
| Age_geq_60 | 0.0 |
| EducationLevel | 0.332 |
| MaxBillAmountOverLast6Months | 0.0 |
| MaxPaymentAmountOverLast6Months | 0.0 |
| MonthsWithZeroBalanceOverLast6Months | 0.0 |


# Action Set

In [19]:
# customize the set of actions
A = rs.ActionSet(X)  ## matrix of features. ActionSet will set bounds and step sizes by default

# specify immutable variables
A['Married'].actionable = False

# education level
A['EducationLevel'].step_direction = 1  ## force conditional immutability.
A['EducationLevel'].step_size = 1  ## set step-size to a custom value.
A['EducationLevel'].step_type = "absolute"  ## discretize on absolute values of feature rather than percentile values
A['EducationLevel'].bounds = (0, 3)

A['TotalMonthsOverdue'].step_size = 1  ## set step-size to a custom value.
A['TotalMonthsOverdue'].step_type = "absolute"  ## discretize on absolute values of feature rather than percentile values
A['TotalMonthsOverdue'].bounds = (0, 12)  ## set bounds to a custom value.
A['MonthsWithLowSpendingOverLast6Months'].bounds = (0, 4)

# can only specify properties for multiple variables using a list
A[['Age_lt_25', 'Age_in_25_to_40', 'Age_in_40_to_59', 'Age_geq_60']].actionable = False
A[['TotalMonthsOverdue', 'TotalOverdueCounts', 'HistoryOfOverduePayments']].actionable = False
# todo: add one-hot constraint

In [27]:
(
    A.df[['name', 'variable_type', 'actionable', 'lb', 'ub']]
    .assign(lb=lambda df: df['lb'].astype(int), ub=lambda df: df['ub'].astype(int))
    .style.hide_index()  # on pandas >= 1.4, use .hide(axis='index') instead
)

| name | variable_type | actionable | lb | ub |
|---|---|---|---|---|
| Married | | False | 0 | 1 |
| Single | | True | 0 | 1 |
| Age_lt_25 | | False | 0 | 1 |
| Age_in_25_to_40 | | False | 0 | 1 |
| Age_in_40_to_59 | | False | 0 | 1 |
| Age_geq_60 | | False | 0 | 1 |
| EducationLevel | | True | 0 | 3 |
| MaxBillAmountOverLast6Months | | True | 0 | 11321 |
| MaxPaymentAmountOverLast6Months | | True | 0 | 5480 |
| MonthsWithZeroBalanceOverLast6Months | | True | 0 | 4 |
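To make the role of bounds, step size, step direction, and actionability concrete, the sketch below lists the values a single feature could move to under those settings. The helper `feasible_values` is hypothetical and written only for illustration; it is not part of the `recourse` API, and `ActionSet` may build its grid differently.

```python
def feasible_values(current, lb, ub, step_size=1, step_direction=0, actionable=True):
    """Hypothetical helper: values a feature may take, given its settings.

    step_direction: 1 = may only increase, -1 = may only decrease, 0 = either way.
    """
    if not actionable:
        # An immutable feature can only keep its current value.
        return [current]

    # Build the grid lb, lb + step_size, ... up to ub.
    grid = []
    value = lb
    while value <= ub:
        grid.append(value)
        value += step_size

    # Drop values that would require moving in a forbidden direction.
    if step_direction > 0:
        grid = [v for v in grid if v >= current]
    elif step_direction < 0:
        grid = [v for v in grid if v <= current]

    # Staying put is always allowed.
    if current not in grid:
        grid.append(current)
    return sorted(grid)


# Settings mirroring EducationLevel above: bounds (0, 3), step size 1, increase only.
print(feasible_values(current=1, lb=0, ub=3, step_size=1, step_direction=1))  # [1, 2, 3]

# Settings mirroring an immutable binary feature such as Married.
print(feasible_values(current=1, lb=0, ub=1, actionable=False))  # [1]
```

Restricting the direction is what the comments above call conditional immutability: the feature can change, but only in one direction from its current value.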


In [28]:
# Person #13 is denied a loan (bad luck)
x = X.values[[13]]
yhat = clf.predict(x)[0]
yhat

0.0

In [29]:
# Let's produce a list of actions that can change this person's predictions
fs = rs.Flipset(x, action_set = A, clf = clf)
fs.populate(enumeration_type = 'distinct_subsets', total_items = 5)
html_str = fs.to_html()
display(HTML(html_str));

obtained 5 items in 0.3 seconds


| Features to Change | Current Value | to | Required Value |
|---|---|---|---|
| MaxBillAmountOverLast6Months | 2060 | → | 2166 |
| MaxBillAmountOverLast6Months | 2060 | → | 2166 |
| MaxPaymentAmountOverLast6Months | 100 | → | 110 |
| MaxBillAmountOverLast6Months | 2060 | → | 2166 |
| MostRecentBillAmount | 2010 | → | 1926 |
| MaxBillAmountOverLast6Months | 2060 | → | 2166 |
| MostRecentPaymentAmount | 100 | → | 105 |
| MonthsWithLowSpendingOverLast6Months | 0 | → | 1 |


In [31]:
def get_score(x, clf=clf):
    # linear score w.x + b for a single-row array x; the model predicts 1 when the score is positive
    return (x.dot(clf.coef_[0]) + clf.intercept_[0])[0]

In [32]:
get_score(x)

-0.00910430380250779

In [33]:
get_score(x + fs.actions[0])

0.010600351114092882
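The two scores above change sign, which is what it means for an action to flip a linear classifier's prediction. The sketch below spells out that check in plain Python. The coefficients, intercept, feature vector, and action are made-up numbers, and `linear_score` and `flips_prediction` are hypothetical helpers, not functions from `recourse`.

```python
def linear_score(x, coefficients, intercept):
    """Score w.x + b of a linear classifier for one feature vector."""
    return sum(w * v for w, v in zip(coefficients, x)) + intercept


def flips_prediction(x, action, coefficients, intercept):
    """True if x is predicted negative and x + action is predicted positive."""
    before = linear_score(x, coefficients, intercept)
    after = linear_score([xi + ai for xi, ai in zip(x, action)], coefficients, intercept)
    return before <= 0 < after


# Made-up model and person (assumptions for illustration only).
coefficients = [0.5, -0.25]
intercept = -1.0
x = [1.0, 2.0]
action = [3.0, 0.0]  # increase the first feature by 3, leave the second alone

print(linear_score(x, coefficients, intercept))              # -1.0
print(flips_prediction(x, action, coefficients, intercept))  # True
```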

In [34]:
# Person for whom the flipset is empty
# To-do: find a person who has no recourse
# no recourse = #2020

# These are cases where people have no recourse:
# no available action leads to the desired outcome.
# We could provide them with principal reasons for the denial, but it would be misleading.

In [50]:
p_threshold = .95
score_threshold = np.log(p_threshold / (1. - p_threshold))

x = X.values[[649]]
fs = rs.Flipset(x, action_set = A, coefficients=clf.coef_[0], intercept=clf.intercept_[0] - score_threshold)
fs.populate(enumeration_type = 'distinct_subsets', total_items = 5)
html_str = fs.to_html()
display(HTML(html_str))

recovered all minimum-cost items
obtained 0 items in 0.0 seconds
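The cell above subtracts `score_threshold` from the intercept so that a positive shifted score corresponds to a predicted probability of at least `p_threshold`. The sketch below checks that equivalence for a logistic model using only the standard library; the example scores are arbitrary values chosen for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(score):
    """Probability implied by a logistic-regression score."""
    return 1.0 / (1.0 + math.exp(-score))


p_threshold = 0.95
score_threshold = logit(p_threshold)
print(round(score_threshold, 3))  # 2.944

# Arbitrary example scores: requiring probability >= 0.95 is the same as
# requiring the shifted score (score - score_threshold) to be non-negative.
for score in [-1.0, 2.9, 3.0]:
    meets_probability = sigmoid(score) >= p_threshold
    meets_shifted_score = (score - score_threshold) >= 0
    print(score, meets_probability, meets_shifted_score)
```

Raising the threshold this way makes approval harder to reach, which is why a person can end up with an empty flipset under the stricter rule.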


# Recourse Verification

## Basic Audits

In [59]:
# Basic Recourse Verification with 1 Model
# Use the auditor on 100 points (live)
# It's super easy

# How many people are denied?
# How many have recourse?
# How difficult is that recourse?

from recourse import RecourseAuditor
pred_neg = clf.predict(X) == 0

ra = RecourseAuditor(
    action_set=A,
    coefficients=clf.coef_[0],
    intercept=clf.intercept_[0] - score_threshold,
    solver='python-mip'
)

audit_output = ra.audit(X.sample(100))


In [60]:
audit_output['feasible'].value_counts()

True     97
False     1
Name: feasible, dtype: int64
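To illustrate what the `feasible` column means, the sketch below audits two made-up people by brute force: a person has recourse if some combination of the values they can reach yields a positive score. This exhaustive enumeration is a stand-in chosen for readability, not the method `RecourseAuditor` uses (the cell above configures it with a `python-mip` solver), and `has_recourse`, the grids, and the model are all hypothetical.

```python
from itertools import product


def has_recourse(grids, coefficients, intercept):
    """True if any combination of reachable feature values gives a positive score.

    grids[j] lists the values feature j may take; a one-element list
    means the feature is immutable for this person.
    """
    for candidate in product(*grids):
        score = sum(w * v for w, v in zip(coefficients, candidate)) + intercept
        if score > 0:
            return True
    return False


# Made-up linear model (assumption for illustration only).
coefficients = [1.0, 2.0]
intercept = -4.0

# Reachable values per feature for two denied people.
people = {
    "A": [[0], [0, 1, 2, 3]],  # feature 0 fixed at 0, feature 1 can reach 3
    "B": [[0], [0, 1]],        # feature 0 fixed at 0, feature 1 can only reach 1
}

feasible = {name: has_recourse(grids, coefficients, intercept)
            for name, grids in people.items()}
print(feasible)  # {'A': True, 'B': False}
```

Enumeration grows exponentially with the number of actionable features, so it only suits toy cases like this one.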

## Internal Audits for Model Development

In [None]:
# Show normal graphs 2 x 1

In [None]:
# Show normal graphs + recourse graphs (2 x 2)