# Onboard a Credit Approval Model to Evaluate Fairness

In this notebook, we present the steps for onboarding a model to evaluate model fairness.  

Fiddler is the pioneer in the **AI Observability** space, offering a unified platform that enables Data Science, MLOps, Risk, Compliance, Analytics, and LOB teams to **monitor, explain, analyze, and improve ML deployments at enterprise scale**. 
Obtain contextual insights at any stage of the ML lifecycle, improve predictions, increase transparency and fairness, and optimize business revenue.

---

You can experience Fiddler's Fairness Offering ***in minutes*** by following these six quick steps:

1. Connect to Fiddler
2. Define Your Model Specifications
3. Add Your Model
4. Add Fairness Metrics
5. Publish Production Events
6. Get Fairness Insights


## 0. Imports

In [None]:
!pip install -q fiddler-client

import fiddler as fdl
import pandas as pd
import yaml
import datetime
import time
from IPython.display import clear_output

print(f"Running Fiddler client version {fdl.__version__}")

## 1. Connect to Fiddler

Before you can add information about your model to Fiddler, you'll need to connect using our Python client.

---

**We need a few pieces of information to get started.**
1. The URL you're using to connect to Fiddler
2. Your authorization token

Your authorization token can be found by pointing your browser to your Fiddler URL and navigating to the **Settings** page.

In [None]:
URL = ''  # Make sure to include the full URL (including https://).
TOKEN = ''
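
If you would rather not paste a token into the notebook, one option is to read both values from the environment. The sketch below is only an illustration: the variable names `FIDDLER_URL` and `FIDDLER_TOKEN` and the helper `load_fiddler_credentials` are hypothetical and not part of the Fiddler client.

```python
import os
from urllib.parse import urlparse


def load_fiddler_credentials(env=os.environ):
    """Return (url, token) read from the environment, with basic validation."""
    # FIDDLER_URL and FIDDLER_TOKEN are hypothetical names chosen for this sketch.
    url = env.get("FIDDLER_URL", "").strip()
    token = env.get("FIDDLER_TOKEN", "").strip()

    # The URL must be complete, including the https:// scheme.
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("URL must be a full address starting with https://")
    if not token:
        raise ValueError("Authorization token is empty")

    # Drop any trailing slash so the URL has one canonical form.
    return url.rstrip("/"), token
```

With those variables set, `URL, TOKEN = load_fiddler_credentials()` would replace the two assignments above and fail early if either value is missing.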

Now just run the following code block to connect the client to your Fiddler environment.

In [None]:
fdl.init(
    url=URL,
    token=TOKEN)

Once you connect, you can create a new project by specifying a project name using Fiddler's [Project API](https://docs.fiddler.ai/client-guide/create-a-project-and-model#create-a-project).

In [None]:
PROJECT_NAME = 'credit_approval'

project = fdl.Project(
    name=PROJECT_NAME)

project.create()

## 2. Define Your Model

In this example, we'll be considering the case where we're a bank and we have **a model that predicts credit approval worthiness**.
  
In order to get insights into the model's performance, **Fiddler needs a small sample of data that can serve as a baseline** for making comparisons with data in production.


---


*For more information on how to design a sample dataset, [click here](https://docs.fiddler.ai/client-guide/creating-a-baseline-dataset).*

In [None]:
# Path to the events CSV; replace this with the file's location on your machine.
PATH_TO_CSV = '/Users/anushrav/Projects/fiddler-examples/quickstart/data/v3/intersectionally_unfair_events.csv'

events_df = pd.read_csv(PATH_TO_CSV)
sample_df = events_df.sample(50)
sample_df

### 2.a Define Model Specification
In order to add your model to Fiddler, simply create a ModelSpec object with information about what each column of your data sample should be used for.

Fiddler supports four column types:
1. **Inputs**
2. **Outputs** (Model predictions)
3. **Targets** (Ground truth values)
4. **Metadata**

In [None]:
model_spec = fdl.ModelSpec(
    inputs=['flag_own_car', 'flag_own_realty', 'name_income_type',
       'name_education_type', 'name_family_status', 'name_housing_type',
       'days_birth', 'days_employed', 'cnt_fam_members', 'income', 'paid_off',
       '__of_pastdues', 'no_loan'],
    outputs=['approve_probability_of_credit_request'],
    targets=['target'],
    metadata=['gender','race'])
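
Before building a spec, it can help to confirm that every listed column exists in the sample and that no column is assigned two roles. The helper below is a hypothetical sanity check written for this notebook, not a Fiddler API, and it assumes each column should carry exactly one role. The toy column list uses a handful of names from the spec above.

```python
def check_spec_columns(sample_columns, inputs, outputs, targets, metadata):
    """Return (missing, duplicated) column names for a proposed spec."""
    roles = {
        "inputs": inputs,
        "outputs": outputs,
        "targets": targets,
        "metadata": metadata,
    }
    seen = {}
    duplicated = set()
    for role, names in roles.items():
        for name in names:
            if name in seen:
                duplicated.add(name)  # listed more than once across roles
            seen[name] = role
    # Columns referenced by the spec but absent from the data sample.
    missing = sorted(set(seen) - set(sample_columns))
    return missing, sorted(duplicated)


# Toy column list standing in for sample_df.columns.
toy_columns = ["income", "paid_off", "approve_probability_of_credit_request",
               "target", "gender", "race", "ts"]

missing, duplicated = check_spec_columns(
    toy_columns,
    inputs=["income", "paid_off", "no_loan"],
    outputs=["approve_probability_of_credit_request"],
    targets=["target"],
    metadata=["gender", "race"],
)
print("missing:", missing)
print("duplicated:", duplicated)
```

In the notebook itself you would pass `sample_df.columns` as the first argument; here `no_loan` is reported as missing only because the toy column list omits it.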

### 2.b Define Model Task

Fiddler supports a variety of model tasks. In this case, we're adding a binary classification model.

For this, we'll create a ModelTask object and an additional ModelTaskParams object to specify the ordering of our positive and negative labels.

For a detailed breakdown of all supported model tasks, see the [Fiddler documentation](https://docs.fiddler.ai/).

In [None]:
model_task = fdl.ModelTask.BINARY_CLASSIFICATION

task_params = fdl.ModelTaskParams(target_class_order=[0, 1])

## 3. Add Your Model

Create a Model object and publish it to Fiddler, passing in:
1. Your data sample
2. Your ModelSpec object
3. Your ModelTask and ModelTaskParams objects

In [None]:
MODEL_NAME = 'intersectionally_unfair'

model = fdl.Model.from_data(
    name=MODEL_NAME,
    project_id=fdl.Project.from_name(PROJECT_NAME).id,
    source=sample_df,
    spec=model_spec,
    task=model_task,
    task_params=task_params,
    event_ts_col='ts')

model.create()

## 4. Track Fairness Metrics 

Fiddler's [Custom Metric API](https://docs.fiddler.ai/python-client-3-x/api-methods-30#custommetric) allows for the creation of custom metrics that can be leveraged for creating **Fairness Metrics**.  These fairness metrics will be tracked over time and can be charted and alerted upon just like any other out-of-the-box metric offered by the Fiddler platform. You can configure these through the Fiddler UI too.

### 4.a Create Fairness Metrics

Here are some examples of industry-standard metrics that you can create with Fiddler:
- Group Benefit
- Demographic Parity

In [None]:
group_benefit = fdl.CustomMetric(
    name='Group Benefit',
    model_id=model.id,
    description='Measures Occurrence Rate vs Prediction Rate',
    definition="sum(if(tp(), 1, 0)+if(fp(), 1, 0)) / sum(if(tp(), 1, 0)+if(fn(), 1, 0))" 
)

group_benefit.create()
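
The Group Benefit definition above divides predicted positives (TP + FP) by actual positives (TP + FN). The following sketch reproduces that arithmetic on a few toy rows in plain Python. It assumes a score of 0.5 or higher counts as a positive prediction; how Fiddler's `tp()`, `fp()`, and `fn()` threshold scores internally is not shown in this notebook, so treat that cutoff as an assumption.

```python
def confusion_counts(rows, threshold=0.5):
    """Count TP, FP, FN, TN for (score, target) pairs at an assumed threshold."""
    tp = fp = fn = tn = 0
    for score, target in rows:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and target == 1:
            tp += 1
        elif predicted == 1 and target == 0:
            fp += 1
        elif predicted == 0 and target == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def group_benefit(rows, threshold=0.5):
    """(TP + FP) / (TP + FN): predicted positives over actual positives."""
    tp, fp, fn, _ = confusion_counts(rows, threshold)
    actual_positives = tp + fn
    if actual_positives == 0:
        return float("nan")  # undefined when there are no actual positives
    return (tp + fp) / actual_positives


# Toy (score, target) pairs, not drawn from the real dataset.
toy_rows = [(0.9, 1), (0.7, 0), (0.6, 1), (0.3, 1), (0.2, 0), (0.1, 0)]
print(confusion_counts(toy_rows))
print(group_benefit(toy_rows))
```

A value of 1 means the model predicts approvals at the same rate at which they actually occur; values above or below 1 indicate over- or under-prediction for the group being measured.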

In [None]:
demographic_parity = fdl.CustomMetric(
    name='Demographic Parity',
    model_id=model.id,
    description='Ratio of Passing Rate and Total Applicants',
    definition="sum(if(\"approve_probability_of_credit_request\">0.8, 1, 0))/count(\"approve_probability_of_credit_request\")" 
)

demographic_parity.create()
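
The Demographic Parity definition above is the share of applicants whose approval score exceeds 0.8. As a rough illustration of that arithmetic, here is the same calculation in plain Python on toy scores; the helper name `approval_rate` is made up for this sketch.

```python
def approval_rate(scores, cutoff=0.8):
    """Fraction of scores strictly above the cutoff, mirroring the definition above."""
    if not scores:
        return float("nan")  # undefined for an empty group
    return sum(1 for s in scores if s > cutoff) / len(scores)


# Toy scores, not taken from the real dataset.
toy_scores = [0.95, 0.85, 0.80, 0.60, 0.30]
print(approval_rate(toy_scores))
```

Note that a score of exactly 0.8 does not count as approved, because the definition uses a strict `>` comparison.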

### 4.b Define Segments for Intersectional Fairness 

You can also define segments to visualize these fairness metrics for different intersections of user identities.

In [None]:
caucasian_male = fdl.Segment(
    name='Caucasian Male',
    model_id=model.id,
    description='Applicants Identifying as Caucasian and Male',
    definition="\"race\"=='Caucasian' and \"gender\"=='M'"
)

caucasian_male.create()

In [None]:
caucasian_female = fdl.Segment(
    name='Caucasian Female',
    model_id=model.id,
    description='Applicants Identifying as Caucasian and Female',
    definition="\"race\"=='Caucasian' and \"gender\"=='F'"
)

caucasian_female.create()
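
To see what applying a segment to a metric amounts to, the sketch below filters toy rows with the same conditions as the two segment definitions and compares the approval rate (score above 0.8) between them. It is a plain-Python illustration on made-up rows, not a description of how Fiddler evaluates segments.

```python
SCORE_COL = "approve_probability_of_credit_request"


def segment_approval_rate(rows, race, gender, cutoff=0.8):
    """Approval rate among rows matching both the race and gender conditions."""
    scores = [r[SCORE_COL] for r in rows
              if r["race"] == race and r["gender"] == gender]
    if not scores:
        return float("nan")  # no rows fall in this segment
    return sum(1 for s in scores if s > cutoff) / len(scores)


# Toy rows, not taken from the real dataset.
toy_events = [
    {"race": "Caucasian", "gender": "M", SCORE_COL: 0.90},
    {"race": "Caucasian", "gender": "M", SCORE_COL: 0.85},
    {"race": "Caucasian", "gender": "M", SCORE_COL: 0.50},
    {"race": "Caucasian", "gender": "M", SCORE_COL: 0.95},
    {"race": "Caucasian", "gender": "F", SCORE_COL: 0.90},
    {"race": "Caucasian", "gender": "F", SCORE_COL: 0.40},
    {"race": "Caucasian", "gender": "F", SCORE_COL: 0.70},
    {"race": "Caucasian", "gender": "F", SCORE_COL: 0.60},
]

male_rate = segment_approval_rate(toy_events, "Caucasian", "M")
female_rate = segment_approval_rate(toy_events, "Caucasian", "F")
rate_ratio = female_rate / male_rate
print(male_rate, female_rate, rate_ratio)
```

A ratio far from 1 suggests that the two segments receive approvals at noticeably different rates, which is the kind of gap the segmented charts are meant to surface.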

## 5. Publish Production Events

Finally, let's send in some production data that we can track these fairness metrics for.

---

Each record sent to Fiddler is called **an event**.
  
We will publish the full dataset that we sampled from at the start of this notebook.

In [None]:
# Shift the timestamps of the production events to be as recent as today
events_df['ts'] = events_df['ts'] + (int(time.time()) - events_df['ts'].max())
events_df
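
The shift above adds a single offset to every timestamp so that the latest event lands at the current time while the spacing between events stays the same. Here is a minimal sketch of that arithmetic on a plain list, assuming the `ts` values are epoch seconds (which is what `int(time.time())` returns); the helper `shift_to_now` is hypothetical.

```python
import time


def shift_to_now(timestamps, now=None):
    """Shift epoch-second timestamps so the latest one equals `now`."""
    now = int(time.time()) if now is None else now
    offset = now - max(timestamps)
    return [ts + offset for ts in timestamps]


# Toy timestamps: a start time, one hour later, and one day later.
old = [1_700_000_000, 1_700_003_600, 1_700_086_400]
shifted = shift_to_now(old, now=1_800_000_000)
print(shifted)
```

If your timestamps were stored in milliseconds instead, the current time would need to be expressed in milliseconds as well before computing the offset.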

In [None]:
model.publish(events_df)

## 6. Get Fairness Insights

**You're all done!**
  
Now head to your Fiddler environment, where you can chart your fairness metrics and apply segments to them to better understand your model's outcomes.

<table>
    <tr>
        <td>
            <img src="https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/images/fariness_disparate_impact.png"/>
        </td>
    </tr>
</table>



---


**Questions?**  
  
Check out [our docs](https://docs.fiddler.ai/) for a more detailed explanation of what Fiddler has to offer.

If you're still looking for answers, fill out a ticket on [our support page](https://fiddlerlabs.zendesk.com/) and we'll get back to you shortly.