# Counterfactual Fairness Analysis

This notebook evaluates counterfactual fairness by swapping a sensitive attribute (gender or ethnicity) while keeping the AQ-10 answers fixed. The resulting change in predicted probability (Δ probability) is recorded for each of the supervised models. A Δ close to zero means the prediction is largely insensitive to the swapped attribute; a large Δ means the attribute alone shifts the model's output.
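
In symbols (the notation here is introduced for illustration and does not appear in the original notebook), for row $i$ with features $x_i$ and sensitive attribute value $a$ swapped to $a'$:

$$
\Delta p_i = \hat{p}\left(x_i^{a \to a'}\right) - \hat{p}(x_i), \qquad \hat{p}(x) = P(\hat{y} = 1 \mid x)
$$

Here $x_i^{a \to a'}$ is the same row with only the sensitive attribute changed, so $\Delta p_i = 0$ means the prediction for that row is unaffected by the swap.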

# Counterfactual Fairness Analysis Function

The `counterfactual_fairness_analysis` function evaluates the fairness of a model by swapping sensitive attributes (e.g., gender or ethnicity) and measuring the change in predicted probabilities. It returns a DataFrame containing the original and swapped probabilities along with their differences.

In [48]:
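import pandas as pd


def counterfactual_fairness_analysis(data, model, sensitive_column, fixed_columns):
    # fixed_columns (e.g. the AQ-10 answers) keep their observed values; only the
    # sensitive attribute differs between the original and the swapped prediction.
    # Dropping them here would zero-fill those features in the reindex below.
    feature_names = list(model.feature_names_in_)
    results = []
    for index, row in data.iterrows():
        # Original prediction, restricted to the features the model was trained on
        X_original = pd.DataFrame([row]).reindex(columns=feature_names, fill_value=0)
        original_prob = model.predict_proba(X_original)[:, 1][0]

        # Swap the sensitive attribute on a copy so the original row is untouched
        swapped = row.copy()
        if sensitive_column == 'gender':
            swapped[sensitive_column] = 1 - swapped[sensitive_column]  # assumes 0/1 encoding
        elif sensitive_column == 'ethnicity':
            # Move the one-hot flag to the next ethnicity category (cyclically),
            # so exactly one ethnicity_* column is active after the swap
            eth_cols = [col for col in swapped.index if col.startswith('ethnicity_')]
            if eth_cols:
                active = [col for col in eth_cols if swapped[col] == 1]
                current = eth_cols.index(active[0]) if active else -1
                swapped[eth_cols] = 0
                swapped[eth_cols[(current + 1) % len(eth_cols)]] = 1

        X_swapped = pd.DataFrame([swapped]).reindex(columns=feature_names, fill_value=0)
        swapped_prob = model.predict_proba(X_swapped)[:, 1][0]

        delta_prob = swapped_prob - original_prob

        results.append({
            'Index': index,
            'Original_Probability': original_prob,
            'Swapped_Probability': swapped_prob,
            'Delta_Probability': delta_prob
        })

    return pd.DataFrame(results)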
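
Because the gender swap is the same operation on every row, it can also be applied to the whole frame at once, which avoids one `predict_proba` call per row. The following is a minimal sketch under stated assumptions: `ToyModel`, `gender_delta_vectorized`, and the toy column names are hypothetical stand-ins, and `gender` is assumed to be encoded as 0/1. It is not the notebook's own implementation.

```python
import numpy as np
import pandas as pd


class ToyModel:
    """Stand-in for a fitted classifier (hypothetical, for illustration only)."""

    feature_names_in_ = np.array(["A1_Score", "A2_Score", "gender"])

    def predict_proba(self, X):
        # Logistic score with a deliberate dependence on gender
        z = 1.0 * X["A1_Score"] + 1.0 * X["A2_Score"] + 0.5 * X["gender"] - 1.0
        p = 1.0 / (1.0 + np.exp(-z.to_numpy(dtype=float)))
        return np.column_stack([1.0 - p, p])


def gender_delta_vectorized(data, model, column="gender"):
    """Flip a binary 0/1 column for all rows at once; return swapped - original."""
    X = data.reindex(columns=model.feature_names_in_, fill_value=0)
    X_swapped = X.copy()
    X_swapped[column] = 1 - X_swapped[column]
    original = model.predict_proba(X)[:, 1]
    swapped = model.predict_proba(X_swapped)[:, 1]
    return pd.Series(swapped - original, index=data.index, name="Delta_Probability")


# Made-up rows; the "label" column is dropped by the reindex step
toy = pd.DataFrame({
    "A1_Score": [1, 0, 1],
    "A2_Score": [0, 0, 1],
    "gender": [0, 1, 0],
    "label": [0, 0, 1],
})
delta = gender_delta_vectorized(toy, ToyModel())
print(delta)
```

In this toy model the gender coefficient is positive, so rows flipped from 0 to 1 get a positive Δ and rows flipped from 1 to 0 get a negative one.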

# Test Model Function

The `testModel` function performs counterfactual fairness analysis for both gender and ethnicity. It calculates the mean change in probabilities (Δ probability) for each sensitive attribute and prints the results.

In [49]:
def testModel(model, test_df):
    # Columns held constant: everything except gender and the one-hot ethnicity columns
    fixed_columns = [col for col in test_df.columns if col != 'gender' and not col.startswith('ethnicity')]

    # Perform counterfactual fairness analysis for gender,
    # using the model and data passed in rather than notebook globals
    results_gender = counterfactual_fairness_analysis(test_df, model, 'gender', fixed_columns)
    print(results_gender)

    # Perform counterfactual fairness analysis for ethnicity
    results_ethnicity = counterfactual_fairness_analysis(test_df, model, 'ethnicity', fixed_columns)
    print(results_ethnicity)

    # Calculate mean delta probabilities
    mean_delta_gender = results_gender['Delta_Probability'].mean()
    mean_delta_ethnicity = results_ethnicity['Delta_Probability'].mean()

    print(f"Mean Delta Probability for Gender: {mean_delta_gender}")
    print(f"Mean Delta Probability for Ethnicity: {mean_delta_ethnicity}")
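
A signed mean can hide per-row effects, because positive and negative shifts cancel. As a hedged sketch (the helper `summarize_deltas` and the example numbers are made up for illustration and are not part of the notebook), the mean absolute Δ and the largest absolute Δ could be reported alongside the signed mean:

```python
import pandas as pd


def summarize_deltas(results):
    """Summarize a results frame that has a 'Delta_Probability' column."""
    delta = results["Delta_Probability"]
    return {
        "mean": delta.mean(),              # signed: opposite shifts cancel
        "mean_abs": delta.abs().mean(),    # average size of the shift
        "max_abs": delta.abs().max(),      # worst-case row
    }


# Made-up deltas whose signed mean is approximately zero
example = pd.DataFrame({"Delta_Probability": [0.2, -0.2, 0.1, -0.1]})
summary = summarize_deltas(example)
print(summary)
```

In this example every row moves by 0.1 to 0.2, yet the signed mean is approximately zero, which is why the absolute summaries are worth reading next to it.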

# Random Forest Model Testing

This section loads the Random Forest model from the `SupervisedRandomForest` notebook and performs counterfactual fairness analysis using the `testModel` function.

In [None]:
# Load the Random Forest model 
%run SupervisedRandomForest.ipynb
testModel(rf_model, cleanTest)

# CatBoost Model Testing

This section loads the CatBoost model from the `SupervisedCatBoost` notebook and performs counterfactual fairness analysis using the `testModel` function.

In [None]:
# Load the CatBoost model
%run SupervisedCatBoost.ipynb
testModel(catboost_model, cleanTest)

# XGBoost Model Testing

This section loads the XGBoost model from the `SupervisedXGBoost` notebook and performs counterfactual fairness analysis using the `testModel` function.

In [None]:
# Load the XGBoost model
%run SupervisedXGBoost.ipynb
testModel(xgb_model, cleanTest)

# Logistic Regression Model Testing

This section loads the Logistic Regression model from the `SupervisedLogisticRegression` notebook and performs counterfactual fairness analysis using the `testModel` function.

In [None]:
# Load the Logistic Regression model 
%run SupervisedlogisticRegression.ipynb
testModel(model, cleanTest)