This notebook illustrates the steps of a **Fairness Model Review** applied to a **credit risk model**. The review checks compliance with fairness standards and helps prevent discriminatory decision-making:

1. Identify and remove **prohibited variables** (such as **race**, **gender**, **age**) from the model to prevent biased decision-making.

2. Analyze if any variables act as **proxy variables** that might indirectly reflect prohibited characteristics (e.g., ZIP code could correlate with race).

3. Document and justify the variables used in the model to ensure they are non-discriminatory and comply with fairness standards.

4. Evaluate the model for potential biases in predictions. If disparities are identified, apply mitigation strategies or alternative approaches to balance fairness across different groups.

This review process promotes **fairness** in decision-making.



In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

In [None]:

# Create a dataset with some potentially sensitive variables (e.g., race, gender)
data = {
    'income': [45000, 54000, 60000, 67000, 75000],
    'credit_score': [620, 650, 700, 750, 800],
    'age': [25, 40, 35, 50, 60],
    'loan_amount': [20000, 25000, 30000, 35000, 40000],
    'gender': ['male', 'female', 'female', 'male', 'female'],  # Prohibited attribute
    'race': ['Caucasian', 'African American', 'Caucasian', 'Hispanic', 'Caucasian'],  # Prohibited attribute
    'approved': [1, 0, 1, 1, 0]  # Target variable
}

df = pd.DataFrame(data)

# Step 1: Identify & Remove prohibited variables (race and gender)
df_clean = df.drop(columns=['gender', 'race'])

print("Cleaned dataset (after removing prohibited variables):")
print(df_clean)


Cleaned dataset (after removing prohibited variables):
   income  credit_score  age  loan_amount  approved
0   45000           620   25        20000         1
1   54000           650   40        25000         0
2   60000           700   35        30000         1
3   67000           750   50        35000         1
4   75000           800   60        40000         0


Step 2: Analyze whether any variables might act as proxy variables for sensitive attributes such as race. For example, income or ZIP code may correlate with race, potentially introducing indirect bias.

In [None]:
# Check for potential proxy variables via pairwise correlations (e.g., income vs. age)
correlations = df[['income', 'credit_score', 'age', 'loan_amount']].corr()

print("\nCorrelation matrix to identify proxy variables:")
print(correlations)



Correlation matrix to identify proxy variables:
                income  credit_score       age  loan_amount
income        1.000000      0.990916  0.949069     0.998222
credit_score  0.990916      1.000000  0.927740     0.996241
age           0.949069      0.927740  1.000000     0.936329
loan_amount   0.998222      0.996241  0.936329     1.000000


The retained features are all highly correlated with one another (every pairwise correlation exceeds 0.92), so further investigation is needed, for example into whether income could act as a proxy for age.
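The matrix above only relates the retained features to each other. A proxy check usually also needs the association between each retained feature and the prohibited attributes themselves. The following is a minimal sketch on the same toy data, assuming a 0/1 indicator for `gender` and simple group means by `race` as the association measures (both are illustrative choices, not the notebook's established method). With five rows, the numbers are far too noisy to support any conclusion.

```python
import pandas as pd

# Same toy data as above, rebuilt so this cell runs on its own
df = pd.DataFrame({
    'income': [45000, 54000, 60000, 67000, 75000],
    'credit_score': [620, 650, 700, 750, 800],
    'age': [25, 40, 35, 50, 60],
    'loan_amount': [20000, 25000, 30000, 35000, 40000],
    'gender': ['male', 'female', 'female', 'male', 'female'],
    'race': ['Caucasian', 'African American', 'Caucasian', 'Hispanic', 'Caucasian'],
})

features = ['income', 'credit_score', 'age', 'loan_amount']

# Pearson correlation of each retained feature with a binary gender indicator
# (this is the point-biserial correlation)
is_female = (df['gender'] == 'female').astype(int)
gender_corr = df[features].corrwith(is_female)

# Mean of each retained feature within each race group
race_means = df.groupby('race')[features].mean()

print("Correlation of each feature with the female indicator:")
print(gender_corr)
print("\nFeature means by race:")
print(race_means)
```

A feature whose value differs sharply across groups, or correlates strongly with the indicator, is a candidate proxy and would warrant closer review.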



Step 3: Document and justify each variable used in the model to ensure non-discrimination

In [None]:

variable_documentation = {
    'income': 'Represents the borrower’s ability to repay the loan. It is not a proxy for race or gender.',
    'credit_score': 'A commonly used metric in assessing credit risk, not discriminatory.',
    'age': 'Age is a protected characteristic under fair lending law (e.g., ECOA). It is retained here for illustration only; production use requires legal review.',
    'loan_amount': 'The loan amount requested is a non-discriminatory factor for loan approval.',
}

print("\nVariable documentation and justification:")
for var, justification in variable_documentation.items():
    print(f"{var}: {justification}")



Variable documentation and justification:
income: Represents the borrower’s ability to repay the loan. It is not a proxy for race or gender.
credit_score: A commonly used metric in assessing credit risk, not discriminatory.
age: Age is a protected characteristic under fair lending law (e.g., ECOA). It is retained here for illustration only; production use requires legal review.
loan_amount: The loan amount requested is a non-discriminatory factor for loan approval.


Step 4: Check for Disparate Impact (DI) in the predictions

If outcomes differ materially between groups (e.g., Caucasian vs. African American applicants), fairness mitigation is applied.

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Prepare features and target
X = df_clean.drop(columns=['approved'])
y = df_clean['approved']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a simple logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate the model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)


# Step 4: Disparate Impact analysis
# A real check compares favorable-outcome rates across groups (e.g., by race or gender)
# using a fairness metric such as statistical parity.
# Accuracy is used below only as a placeholder trigger; it is not a fairness metric.

if accuracy < 0.75:  # Placeholder condition to apply fairness mitigation
    print("Disparate Impact detected, applying fairness mitigation...")
    # Placeholder for fairness remediation (e.g., apply re-weighting or re-sampling)


Disparate Impact detected, applying fairness mitigation...


Note: This simple example uses an accuracy threshold as a stand-in for a fairness check. Accuracy does not measure fairness; in practice the check would use a fairness metric such as DI.

If the disparate impact analysis finds significant disparities, then:

Step 5: Implement mitigation techniques (such as re-weighting, re-sampling, or modifying the model) to improve fairness (alternative analysis).
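Of the mitigation techniques named above, re-weighting is the simplest to sketch. The example below follows the commonly described reweighing scheme, in which each row receives the weight P(group) · P(label) / P(group, label), so that group and label look statistically independent to the learner. Using `gender` as the sensitive attribute and fitting on all five toy rows are assumptions made for illustration; this is not presented as the notebook's own method.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy data from above, rebuilt so this cell runs on its own
df = pd.DataFrame({
    'income': [45000, 54000, 60000, 67000, 75000],
    'credit_score': [620, 650, 700, 750, 800],
    'age': [25, 40, 35, 50, 60],
    'loan_amount': [20000, 25000, 30000, 35000, 40000],
    'gender': ['male', 'female', 'female', 'male', 'female'],
    'approved': [1, 0, 1, 1, 0],
})

# Marginal probability of each row's group and of each row's label
group_p = df['gender'].map(df['gender'].value_counts(normalize=True))
label_p = df['approved'].map(df['approved'].value_counts(normalize=True))

# Observed joint probability of each row's (group, label) combination
joint_p = df.groupby(['gender', 'approved'])['approved'].transform('count') / len(df)

# Reweighing: expected frequency under independence / observed frequency
weights = group_p * label_p / joint_p

# The sensitive attribute is used only to build weights, not as a feature
X = df[['income', 'credit_score', 'age', 'loan_amount']]
y = df['approved']

model = LogisticRegression(max_iter=1000)
model.fit(X, y, sample_weight=weights)

print(weights)
```

Rows from under-represented (group, label) combinations, here approved female applicants, receive weights above 1, while over-represented combinations are down-weighted.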

Key metrics:

**Disparate Impact (DI)**:

$\text{Disparate Impact} = \frac{\text{Approval Rate for Minority Group}}{\text{Approval Rate for Majority Group}}$

If DI < 0.8, it may indicate disparate impact, suggesting that the model might be biased against certain groups.
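The ratio above can be computed directly with pandas. The sketch below uses the notebook's toy data and assumes `gender` as the grouping attribute, with `female` as the minority group and `male` as the majority group (both assumptions for illustration). It uses the observed `approved` labels; in a real review the same calculation would typically be applied to model decisions. The helper name `disparate_impact` is hypothetical.

```python
import pandas as pd

# Outcome and sensitive attribute from the toy dataset above
df = pd.DataFrame({
    'gender': ['male', 'female', 'female', 'male', 'female'],
    'approved': [1, 0, 1, 1, 0],
})

def disparate_impact(frame, group_col, outcome_col, minority, majority):
    """Ratio of the minority group's approval rate to the majority group's."""
    rates = frame.groupby(group_col)[outcome_col].mean()
    return rates[minority] / rates[majority]

di = disparate_impact(df, 'gender', 'approved', minority='female', majority='male')

print(f"Disparate Impact (female vs. male): {di:.2f}")
if di < 0.8:
    print("DI is below 0.8, which may indicate disparate impact.")
```

With only five applicants the ratio is not statistically meaningful; the point is the mechanics of the calculation.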

Two further **fairness metrics** help assess potential biases or disparities in a model's decisions across groups or protected classes (e.g., race, gender):

The **Adverse Impact Ratio (AIR)** evaluates the fairness of a decision-making process in terms of DI between two groups:

$\text{AIR} = \frac{\text{Selection Rate for Protected Group}}{\text{Selection Rate for Reference Group}}$

where *Selection Rate* = proportion of applicants in a group who receive the favorable decision (e.g., loan approval, promotion).

- **AIR = 1**: Perfect equality between the protected and reference groups.
- **AIR < 0.8** is often considered an indication of **adverse impact**, where the protected group is unfairly disadvantaged in terms of the favorable decision.
- **AIR > 1**: Indicates that the protected group is favored over the reference group.

Example: If the selection rate for a protected group (e.g., women) is 40% and for the reference group (e.g., men) is 60%, then:
$\text{AIR} = \frac{0.40}{0.60} = 0.67$.
An AIR below 0.8 would indicate adverse impact against women in this case.
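The worked example can be reproduced from raw counts. In the sketch below, the counts (40 of 100 women and 60 of 100 men selected) are hypothetical values chosen only to match the 40% and 60% selection rates; the function name `adverse_impact_ratio` is likewise illustrative.

```python
def adverse_impact_ratio(selected_protected, total_protected,
                         selected_reference, total_reference):
    """Selection rate of the protected group divided by that of the reference group."""
    rate_protected = selected_protected / total_protected
    rate_reference = selected_reference / total_reference
    return rate_protected / rate_reference

# Hypothetical counts matching the 40% vs. 60% example
air = adverse_impact_ratio(40, 100, 60, 100)

# Flag when AIR falls below the 0.8 threshold
flagged = air < 0.8

print(f"AIR = {air:.2f}, below 0.8 threshold: {flagged}")
# prints: AIR = 0.67, below 0.8 threshold: True
```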


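**Standardized Mean Difference (SMD)** expresses the **difference in means** between two groups in units of standard deviation, which indicates the degree of disparity or imbalance between the groups on a particular feature or outcome:

$\text{SMD} = \frac{\mu_1 - \mu_2}{\sigma}$

where $\mu_1$ and $\mu_2$ are the group means and $\sigma$ is the standard deviation used for scaling (commonly the pooled standard deviation of the two groups).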

- **SMD = 0**: No difference between the two groups.
- **|SMD| between 0 and 0.2**: Small difference between groups.
- **|SMD| between 0.2 and 0.5**: Moderate difference between groups.
- **|SMD| > 0.5**: Large difference between groups, suggesting significant imbalance or disparity in the variable being assessed.
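A minimal sketch of the SMD calculation with NumPy follows. It assumes the scaling term is the pooled standard deviation, computed as the square root of the average of the two sample variances; this is one common convention, and other definitions of the denominator exist. The income values are taken from the toy dataset above, split by `gender`.

```python
import numpy as np

def standardized_mean_difference(a, b):
    """(mean(a) - mean(b)) divided by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pooled SD: square root of the average of the two sample variances (ddof=1)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

# Income by gender from the toy dataset
income_female = [54000, 60000, 75000]
income_male = [45000, 67000]

smd = standardized_mean_difference(income_female, income_male)
print(f"SMD for income (female vs. male): {smd:.2f}")
```

The sign shows which group has the higher mean; the magnitude is what gets compared against the thresholds listed above.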


Summary:
- **AIR** helps evaluate whether one group is disproportionately affected by adverse decisions.
- **SMD** provides insights into whether there is a significant imbalance or disparity in the distribution of a key feature between different groups.

Together, these two fairness metrics help identify where fairness interventions might be necessary.


**Conclusion:**

For more complex or production-level datasets, it's highly recommended to use fairness auditing toolkits like **AI Fairness 360 (AIF360)** - an open-source library developed by IBM to help developers and data scientists evaluate and improve the fairness of their AI systems.


**AIF360** and **Fairlearn** libraries offer a wide range of metrics (e.g., **Disparate Impact**, **Equalized Odds**, **Demographic Parity**) and mitigation techniques (like **reweighing**, **disparate impact remover**, and **adversarial debiasing**) that help assess and reduce unfair bias in machine learning models. They are especially useful when working with **non-linear models** like **Gradient Boosted Machines (GBM)**, where bias can be subtle and embedded in feature interactions.

For example, *!pip install aif360[Reductions]* installs AI Fairness 360 together with the additional reduction-based fairness algorithms, which can be used to detect and mitigate bias in ML models.
