# grail.fairness.data.metrics

> This module contains functions for computing fairness metrics for datasets.

In [None]:
# | default_exp fairness.data.metrics

In [None]:
# | hide
from nbdev.showdoc import *  # noqa

In [None]:
# | export
import pandas as pd
from scipy.stats import entropy
import numpy as np
from grail.fairness.data.utils import remark_spiel_generator

#### Sample Data Loading

In [None]:
from grail.fairness.data.utils import create_biased_dataset

df = create_biased_dataset(100)
df.head()

Unnamed: 0,id,age,loan_application_date,loan_type,feature_x,target,gender,location
0,0,48,2013-05-18,ggives,97,1,female,loc3
1,1,53,2015-01-01,gloan,29,1,male,loc2
2,2,28,2016-02-26,gcredit,76,1,male,loc1
3,3,58,2014-03-05,gcredit,43,1,male,loc3
4,4,27,2015-01-25,borrowload,57,1,male,loc2


## Class Imbalance

Class imbalance (CI) bias occurs when a class value *d* has fewer training samples than another class value *a* in the dataset. We call the class with fewer samples the **disadvantaged class or the risk group** and the class with more samples the **advantaged class**. Models preferentially fit the larger classes at the expense of the smaller ones, which can result in a higher training error for class *d*. Models are also at higher risk of overfitting the smaller data sets, which can cause a larger test error for class *d*. For example, a machine learning model trained primarily on data from middle-aged individuals (class *a*) might be less accurate when making predictions involving younger and older people (class *d*).

The formula for the (normalized) facet imbalance measure:

$$CI = (n_a - n_d)/(n_a + n_d)$$

Where $n_a$ is the number of members of the advantaged class and $n_d$ the number for the disadvantaged class. Its values range over the interval [-1, 1].

**Note:** If a feature is multi-class, the disadvantaged class is compared against all the other classes combined, so the advantaged count is the sum over every class other than *d*: $n_a = \sum_{k \neq d} n_k$

- Positive values indicate the advantaged class has more training samples in the dataset.
- Values near zero indicate the classes are balanced in the number of training samples in the dataset.
- Negative values indicate the disadvantaged class has more training samples in the dataset.
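As a quick worked example of the formula, the sketch below evaluates CI for three hypothetical pairs of counts, one for each interpretation listed above. The helper name `class_imbalance_ratio` and the counts are illustrative assumptions, not part of the module.

```python
def class_imbalance_ratio(n_adv: int, n_disadv: int) -> float:
    """Normalized class imbalance: (n_a - n_d) / (n_a + n_d)."""
    return (n_adv - n_disadv) / (n_adv + n_disadv)


# Hypothetical counts, chosen only to illustrate the three cases
ci_skewed = class_imbalance_ratio(70, 30)    # positive: advantaged class dominates
ci_balanced = class_imbalance_ratio(50, 50)  # zero: classes are balanced
ci_reversed = class_imbalance_ratio(30, 70)  # negative: "disadvantaged" class is larger

print(ci_skewed, ci_balanced, ci_reversed)
```

The sign tells which class is larger, and the magnitude tells how lopsided the split is.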


Document Source: [AWS Clarify][AWS Clarify]<br>
Code Inspiration: [Medium][Medium]

[AWS Clarify]: https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-bias-metric-class-imbalance.html
[Medium]: https://medium.com/@corymaklin/pretraining-data-bias-18e1d1dfc350

In [None]:
# | export
def compute_class_imbalance(num_class_adv: int, num_class_disadv: int) -> float:
    """
    Compute class imbalance.

    Parameters
    ----------
    num_class_adv
        number of records for the advantaged class
    num_class_disadv
        number of records for the disadvantaged class

    Returns
    -------
    class_imbalance_value
        The class imbalance metric
    """
    metric_value = (num_class_adv - num_class_disadv) / (
        num_class_adv + num_class_disadv
    )
    return metric_value


def class_imbalance(
    df: pd.DataFrame,
    protected_feature: str,
    underpriviledged_group: str = None,
    threshold: float = 0.5,
) -> pd.DataFrame:
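    """
    Compute class imbalance for a categorical protected feature of a DataFrame.

    Parameters
    ----------
    df
        the DataFrame containing the protected feature column
    protected_feature
        name of the column with the advantaged and disadvantaged classes
    underpriviledged_group
        optional. the disadvantaged class. If omitted, every class is
        evaluated in turn against all other classes combined
    threshold
        absolute metric value above which the imbalance is flagged

    Returns
    -------
    pd.DataFrame
        One row per evaluated group with the class imbalance metric
    """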
    METRIC_NAME = "class_imbalance"
    df_col = df[protected_feature]
    if underpriviledged_group:
        underpriviledged_group = [underpriviledged_group]
    else:
        underpriviledged_group = df_col.unique()

    num_class = df_col.value_counts().to_dict()
    num_results = len(underpriviledged_group)
    metric_vals = []
    exceeds_thredhold = []
    remarks = []
    for val in underpriviledged_group:
        num_class_disadv = num_class[val]
        num_class_adv = len(df_col) - num_class_disadv
        metric_val = compute_class_imbalance(num_class_adv, num_class_disadv)
        # Flag the group when the metric falls outside [-threshold, threshold]
        exceeds_flag = abs(metric_val) > threshold
        remarks.append(
            remark_spiel_generator(
                feature_name=protected_feature,
                group=val,
                metric_name=METRIC_NAME,
                exceeds_flag=exceeds_flag,
                threshold=threshold,
            )
        )
        metric_vals.append(metric_val)
        exceeds_thredhold.append(exceeds_flag)

    result = pd.DataFrame(
        {
            "Protected Feature": df_col.name,
            "Metric": METRIC_NAME,
            "Metric Value": metric_vals,
            "Underpriviledged Group": underpriviledged_group,
            "Threshold": threshold,
            "Exceeds Threshold": exceeds_thredhold,
            "Remarks": remarks,
        },
        index=list(range(0, num_results)),
    )

    return result

### Computing Class Imbalance for 2 Classes

In the example below, we compute the class imbalance on the feature **gender** with **female** as the risk group. The resulting value of 0.56 is positive and above the 0.5 threshold, meaning that the risk group is markedly under-represented (imbalanced) compared to the advantaged group, in this case **male**.

In [None]:
class_imbalance(df, protected_feature="gender", underpriviledged_group="female")

Unnamed: 0,Protected Feature,Metric,Metric Value,Underpriviledged Group,Threshold,Exceeds Threshold,Remarks
0,gender,class_imbalance,0.56,female,0.5,True,Group female of Feature gender exceeded the cl...


### Computing Class Imbalance for Multiple Classes

In the example below, we compute the class imbalance on the feature **location** without specifying a risk group. This means that the risk group is computed against all the other classes in the feature.

**loc3** has the highest value among all the classes (0.56), meaning it is the most imbalanced compared to the other classes, followed by **loc2** (0.30).

**loc1** is the closest to balanced (0.14): about 43% of the records are loc1, while the remaining 57% are either loc2 or loc3.

In [None]:
class_imbalance(df, protected_feature="location")

Unnamed: 0,Protected Feature,Metric,Metric Value,Underpriviledged Group,Threshold,Exceeds Threshold,Remarks
0,location,class_imbalance,0.56,loc3,0.5,True,Group loc3 of Feature location exceeded the cl...
1,location,class_imbalance,0.14,loc1,0.5,False,Acceptable
2,location,class_imbalance,0.3,loc2,0.5,False,Acceptable
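The multi-class values in the output above can be checked by hand. The sketch below applies the one-vs-rest rule to per-location counts. The counts (22, 43, 35) are back-solved from the metric values shown for the 100-row sample, so they are an assumption, as is the helper name `one_vs_rest_ci`.

```python
def one_vs_rest_ci(counts: dict) -> dict:
    """Class imbalance of each class against all other classes combined."""
    total = sum(counts.values())
    result = {}
    for group, n_disadv in counts.items():
        n_adv = total - n_disadv  # every record outside the group
        result[group] = (n_adv - n_disadv) / (n_adv + n_disadv)
    return result


# Assumed counts, inferred from the metric values shown above (100 rows)
location_counts = {"loc3": 22, "loc1": 43, "loc2": 35}
ci_by_location = one_vs_rest_ci(location_counts)

for group, value in ci_by_location.items():
    print(f"{group}: {value:.2f}")
```

Because each group is compared against everything else, a group holding exactly half of the records would score 0, and smaller groups score closer to 1.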


## Conditional Demographic Disparity in Labels

Conditional Demographic Disparity in Labels (CDDL) builds on Demographic Disparity to avoid Simpson's paradox by accounting for subgroup differences. The CDDL metric examines disparities within subgroups to provide a more nuanced understanding of potential bias.

The formula for CDDL is:

$$CDD = \frac{1}{n} \sum_i n_i \cdot DD_i$$

where:
- $n$ is the total number of observations
- $n_i$ is the number of observations in subgroup $i$
- $DD_i$ is the demographic disparity for subgroup $i$

A classic example of why this conditional analysis is important comes from the Berkeley admissions case. While initial analysis showed men were accepted at higher rates overall (suggesting gender bias), examining admission rates by department revealed women actually had higher acceptance rates in many departments. The Simpson's paradox arose because women tended to apply to more competitive departments with lower overall acceptance rates.

- Positive CDDL score indicates bias against the protected group
- Zero indicates no disparity
- Negative score indicates bias in favor of the protected group
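To make the Simpson's paradox argument concrete, the sketch below compares an aggregate disparity with the conditional (size-weighted) disparity on a small, entirely hypothetical admissions table. It assumes the sign convention used in this notebook (unprotected rate minus protected rate), with men as the unprotected group and women as the protected group. All counts and names are made up for illustration.

```python
# Hypothetical admissions counts: (applicants, admitted) per department and gender
admissions = {
    "dept_A": {"men": (80, 60), "women": (20, 16)},
    "dept_B": {"men": (20, 4), "women": (80, 20)},
}


def rate(applicants: int, admitted: int) -> float:
    """Acceptance rate for one group."""
    return admitted / applicants


# Aggregate disparity: pool all departments, then compare acceptance rates
men_total = [sum(d["men"][i] for d in admissions.values()) for i in (0, 1)]
women_total = [sum(d["women"][i] for d in admissions.values()) for i in (0, 1)]
aggregate_disparity = rate(*men_total) - rate(*women_total)

# Conditional disparity: size-weighted average of per-department disparities
n_total = men_total[0] + women_total[0]
cdd = 0.0
for counts in admissions.values():
    n_i = counts["men"][0] + counts["women"][0]
    dd_i = rate(*counts["men"]) - rate(*counts["women"])
    cdd += n_i * dd_i
cdd /= n_total

print(f"aggregate disparity:   {aggregate_disparity:+.2f}")
print(f"conditional disparity: {cdd:+.2f}")
```

With these numbers the pooled comparison is positive (apparent bias against women), while the conditional value is slightly negative, because women in this toy table mostly apply to the department with the lower acceptance rate.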

Document Source: [Conditional Demographic Disparity in Predicted Labels (CDDPL)](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-cddpl.html)

In [None]:
# | export
def compute_conditional_demographic_disparity_in_labels(
    df: pd.DataFrame,
    protected_attribute: str,
    target: str,
    group_variable: str,
    protected_value: str,
    positive_label: int,
) -> pd.DataFrame:
    """
    Compute the Conditional Demographic Disparity in Labels (CDDL) metric.

    Parameters
    ----------
    df
        Input DataFrame containing features and target
    protected_attribute
        Column name of protected attribute (e.g., 'sex', 'race')
    target
        Column name of target variable
    group_variable
        Column name for subgroup analysis (e.g., 'age_group', 'department')
    protected_value
        Value in protected_attribute that indicates protected group
    positive_label
        Value in target that indicates positive outcome

    Returns
    -------
    pd.DataFrame
        DataFrame containing one row per subgroup plus a final "Overall" row
        holding the weighted CDDL score, with columns:
        - feature: The protected attribute being analyzed
        - subgroup: The subgroup being analyzed
        - subgroup_size: Number of observations in subgroup
        - protected_rate: Positive-outcome rate for protected group
        - unprotected_rate: Positive-outcome rate for unprotected group
        - disparity: Disparity value for subgroup (unprotected minus protected rate)
        - weight: Subgroup weight in final calculation
    """
    # Input validation
    required_cols = {protected_attribute, target, group_variable}
    if not required_cols.issubset(df.columns):
        missing = required_cols - set(df.columns)
        raise ValueError(f"Missing required columns: {missing}")

    results = []
    total_observations = len(df)
    weighted_disparity_sum = 0

    # Compute disparity for each subgroup
    for subgroup in df[group_variable].unique():
        # Subgroup mask
        subgroup_mask = df[group_variable] == subgroup
        subgroup_size = subgroup_mask.sum()

        if subgroup_size == 0:
            continue

        # Protected group positive rate
        protected_mask = df[protected_attribute] == protected_value
        positive_mask = df[target] == positive_label

        protected_positive = (protected_mask & positive_mask & subgroup_mask).sum()
        protected_total = (protected_mask & subgroup_mask).sum()
        protected_rate = (
            protected_positive / protected_total if protected_total > 0 else 0.0
        )

        # Unprotected group positive rate
        unprotected_positive = ((~protected_mask) & positive_mask & subgroup_mask).sum()
        unprotected_total = ((~protected_mask) & subgroup_mask).sum()
        unprotected_rate = (
            unprotected_positive / unprotected_total if unprotected_total > 0 else 0.0
        )

        # Compute demographic disparity for subgroup
        disparity = unprotected_rate - protected_rate
        weight = subgroup_size / total_observations
        weighted_disparity_sum += disparity * subgroup_size

        # Store results
        results.append(
            {
                "feature": protected_attribute,
                "subgroup": subgroup,
                "subgroup_size": int(subgroup_size),
                "protected_rate": f"{protected_rate:.2%}",
                "unprotected_rate": f"{unprotected_rate:.2%}",
                "disparity": disparity,
                "weight": weight,
            }
        )

    # Create results DataFrame
    results_df = pd.DataFrame(results)

    # Add overall score row
    overall_score = weighted_disparity_sum / total_observations
    overall_row = pd.DataFrame(
        [
            {
                "feature": protected_attribute,
                "subgroup": "Overall",
                "subgroup_size": total_observations,
                "protected_rate": "-",
                "unprotected_rate": "-",
                "disparity": overall_score,
                "weight": 1.0,
            }
        ]
    )

    return pd.concat([results_df, overall_row], ignore_index=True)

### Compute Gender Bias in Loan Approvals by Age Group Example

This example examines potential gender bias in loan approvals across different age categories using CDDL analysis. By grouping ages into categories, we can see if certain age ranges show different patterns of gender disparity in loan decisions.

In [None]:
df = create_biased_dataset(1000)

# Create age groups for analysis
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 40, 50, 100], labels=["18-30", "31-40", "41-50", "50+"]
)

compute_conditional_demographic_disparity_in_labels(
    df=df,
    protected_attribute="gender",
    target="target",
    group_variable="age_group",
    protected_value="female",
    positive_label=1,
)

Unnamed: 0,feature,subgroup,subgroup_size,protected_rate,unprotected_rate,disparity,weight
0,gender,18-30,338,28.21%,97.31%,0.691026,0.338
1,gender,50+,200,23.40%,97.39%,0.739814,0.2
2,gender,41-50,230,17.39%,97.28%,0.798913,0.23
3,gender,31-40,232,14.29%,99.47%,0.85188,0.232
4,gender,Overall,1000,-,-,0.762915,1.0


The output shows several key aspects of CDDL analysis:

1. **Subgroup Distribution**:
   - The data is split into four age groups: 18-30, 31-40, 41-50, and 50+
   - Each group's size is indicated in the 'subgroup_size' column, showing the distribution of applicants across age ranges

2. **Rate Comparison**:
   - 'protected_rate' shows approval rates for females in each age group
   - 'unprotected_rate' shows approval rates for males in each age group
   - These rates help identify if disparities are consistent across age groups or if certain age groups show larger gaps

3. **Disparity Measures**:
   - Positive disparity values indicate higher approval rates for males
   - The magnitude of disparity can vary across age groups
   - The 'weight' column shows how much each age group contributes to the overall CDDL score

4. **Overall Score**:
   - The final row ("Overall") provides the weighted CDDL score across all age groups
   - This score helps determine if there's systemic bias in the loan approval process
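The "Overall" row can be reproduced directly from the per-subgroup rows using the CDD formula. The sketch below copies the subgroup sizes and disparities from the output table above and recomputes the weighted score. The displayed disparities are rounded, so the result is expected to agree with the reported value only up to rounding.

```python
# (subgroup, subgroup_size, disparity) copied from the output table above
rows = [
    ("18-30", 338, 0.691026),
    ("50+", 200, 0.739814),
    ("41-50", 230, 0.798913),
    ("31-40", 232, 0.85188),
]

n_total = sum(size for _, size, _ in rows)

# CDD = (1/n) * sum_i n_i * DD_i
overall = sum(size * disparity for _, size, disparity in rows) / n_total

print(n_total, round(overall, 4))
```

Equivalently, the overall score is the sum of `weight * disparity` over the subgroup rows, since each weight is the subgroup size divided by the total.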

### Displaying Simpson's Paradox in Location-based Analysis

This example demonstrates how CDDL helps avoid Simpson's paradox by examining loan approvals across locations. While overall rates might suggest strong gender bias, examining rates by location reveals that the disparity varies significantly by area.

In [None]:
compute_conditional_demographic_disparity_in_labels(
    df=df,
    protected_attribute="gender",
    target="target",
    group_variable="location",
    protected_value="female",
    positive_label=1,
)

Unnamed: 0,feature,subgroup,subgroup_size,protected_rate,unprotected_rate,disparity,weight
0,gender,loc2,319,17.95%,95.85%,0.779019,0.319
1,gender,loc1,460,39.66%,99.50%,0.598473,0.46
2,gender,loc3,221,12.99%,96.53%,0.835408,0.221
3,gender,Overall,1000,-,-,0.70843,1.0


The output shows the CDDL analysis across different locations, revealing:

1. **Location Distribution**:
  - Each location's proportion of the total data is shown in the 'weight' column

2. **Rate Comparison**:
  - 'protected_rate' shows female approval rates by location
  - 'unprotected_rate' shows male approval rates by location
  - The rates vary across locations, which is the condition under which aggregated data can mislead (Simpson's paradox)

3. **Disparity Measures**:
  - Each location has its own disparity score
  - Differences in disparities across locations suggest that bias isn't uniform

4. **Overall Score**:
  - The weighted CDDL score provides a balanced measure accounting for location differences
  - This helps avoid misleading conclusions from aggregated data alone

This location-based analysis helps identify whether apparent gender bias might be influenced by location-specific factors, demonstrating the importance of conditional analysis in fairness metrics.

## Statistical Parity Difference

The Statistical Parity Difference (SPD) metric calculates the difference in the rate of favorable outcomes between two groups: the **monitored or unprivileged group** and the **reference or privileged group**. The favorable outcome rate is the proportion of individuals in a group that receive a positive outcome (e.g., being approved for a loan). SPD is often used to assess the fairness of a decision-making process or outcome where there are two groups of interest, such as men and women or people of different racial groups.

The formula for the difference according to Ruivieira:

$$SPD = p(\hat{y} = 1 \mid D_u) - p(\hat{y} = 1 \mid D_p)$$

Where $\hat{y} = 1$ is the favourable outcome and $D_u, D_p$ are respectively the unprivileged and privileged group data.

Another way to look at the formula according to Data Platform:

$$SPD = (np_u / ni_u) - (np_p / ni_p)$$

Where $np_u, np_p$ are respectively the number of positive outcomes for the underprivileged and privileged groups, and $ni_u, ni_p$ are the number of instances, or total number of members, in each group.

**Note:** One key difference between SPD and other statistical parity methods is that it is specifically designed to compare the fairness of only two groups.

- An SPD of 0 indicates perfect fairness, meaning both groups have the same favorable outcome rate.
- A positive SPD indicates that the unprivileged group has a higher favorable outcome rate than the privileged group.
- A negative SPD indicates that the unprivileged group has a lower favorable outcome rate than the privileged group.
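A small worked example may help with the sign convention. The sketch below follows the first formula above, taking SPD as the unprivileged group's favorable outcome rate minus the privileged group's rate. The helper name `spd` and all counts are hypothetical.

```python
def spd(pos_unpriv: int, n_unpriv: int, pos_priv: int, n_priv: int) -> float:
    """Difference in favorable-outcome rates: unprivileged minus privileged."""
    return pos_unpriv / n_unpriv - pos_priv / n_priv


# Hypothetical counts: 40 of 200 unprivileged and 720 of 800 privileged approved
value = spd(pos_unpriv=40, n_unpriv=200, pos_priv=720, n_priv=800)
print(round(value, 2))  # rates are 0.20 and 0.90

# Equal rates in both groups give an SPD of zero
parity = spd(pos_unpriv=10, n_unpriv=20, pos_priv=50, n_priv=100)
print(parity)
```

The first case is strongly negative because the unprivileged group is approved far less often. The second is zero even though the groups differ in size, since only the rates are compared.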

Document Source: [Data Platform][Data Platform]<br>
Code Inspiration: [Ruivieira][Ruivieira]

[Data Platform]: https://dataplatform.cloud.ibm.com/docs/content/wsj/model/wos-stat-parity-diff.html?context=cpdaas
[Ruivieira]: https://ruivieira.dev/fairness-in-machine-learning.html

In [None]:
# | export
def compute_statistical_parity_diff(
    num_pos_under: int, num_pos_priv: int, num_inst_under: int, num_inst_priv: int
) -> float:
    """
    Compute statistical parity difference.

    Parameters
    ----------
    num_pos_under
        number of positive outcomes for the underprivileged class
    num_pos_priv
        number of positive outcomes for the privileged class
    num_inst_under
        total number of instances for the underprivileged class
    num_inst_priv
        total number of instances for the privileged class

    Returns
    -------
    statistical_imbalance_difference
        The statistical parity difference metric
    """
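    # Difference in favorable-outcome rates: P(pos | unprivileged) - P(pos | privileged)
    metric_value = (num_pos_under / num_inst_under) - (num_pos_priv / num_inst_priv)
    # Return a float (not a formatted string) to match the annotated return type
    return round(float(metric_value), 2)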


def statistical_parity_difference(
    df_col: pd.Series,
    pos_col: pd.Series,
    pos_outcome_val,
    privileged_group: str,
    underprivileged_group: str = None,
) -> pd.DataFrame:
    """
    Compute statistical parity difference given a pandas series containing a categorical feature column.

    Parameters
    ----------
    df_col
        the feature column with the privileged and underprivileged class
    pos_col
        the column to determine whether a given row has a positive outcome or not
    pos_outcome_val
        the value to determine if an outcome is positive or not
    underprivileged_group
        optional. the underprivileged class; if omitted, every non-privileged class is evaluated
    privileged_group
        the privileged class

    Returns
    -------
    pd.DataFrame
        The statistical parity difference metric
    """
    if underprivileged_group:
        num_results = len([underprivileged_group])
        num_class = df_col.value_counts().to_dict()

        num_inst_priv = num_class[privileged_group]
        num_pos_priv = df_col[
            (pos_col == pos_outcome_val) & (df_col == privileged_group)
        ].count()
        num_inst_under = num_class[underprivileged_group]
        num_pos_under = df_col[
            (pos_col == pos_outcome_val) & (df_col == underprivileged_group)
        ].count()

        underprivileged_groups = [underprivileged_group]
        metric_vals = compute_statistical_parity_diff(
            num_pos_under=num_pos_under,
            num_pos_priv=num_pos_priv,
            num_inst_under=num_inst_under,
            num_inst_priv=num_inst_priv,
        )

    else:
        num_class = df_col.value_counts().to_dict()

        num_inst_priv = num_class[privileged_group]
        num_pos_priv = df_col[
            (pos_col == pos_outcome_val) & (df_col == privileged_group)
        ].count()

        del num_class[privileged_group]
        num_results = len(num_class)

        underprivileged_groups = []
        metric_vals = []
        for key in num_class:
            underprivileged_groups.append(key)

            num_inst_under = num_class[key]
            num_pos_under = df_col[
                (pos_col == pos_outcome_val) & (df_col == key)
            ].count()

            metric_val = compute_statistical_parity_diff(
                num_pos_under=num_pos_under,
                num_pos_priv=num_pos_priv,
                num_inst_under=num_inst_under,
                num_inst_priv=num_inst_priv,
            )
            metric_vals.append(metric_val)

    result = pd.DataFrame(
        {
            "feature": df_col.name,
            "metric": "statistical_parity_difference",
            "risk_group": underprivileged_groups,
            "metric_value": metric_vals,
        },
        index=list(range(0, num_results)),
    )

    return result

### Example 1

In the example below, we compute the statistical parity difference on the feature **gender** with **female** as the underprivileged group. The **target** column is used to count how many members of the underprivileged and privileged (in this case **male**) groups have a positive outcome of 1.

In [None]:
df_col = df.gender
pos_col = df.target
statistical_parity_difference(
    df_col=df_col,
    pos_col=pos_col,
    pos_outcome_val=1,
    privileged_group="male",
    underprivileged_group="female",
)

Unnamed: 0,feature,metric,risk_group,metric_value
0,gender,statistical_parity_difference,female,-0.2


### Example 2

In the example below, we compute the statistical parity difference on the feature **gender** with the underprivileged group **unspecified**. The **target** column is used to count how many members of the underprivileged and privileged (in this case **male**) groups have a positive outcome of 1.

In [None]:
df_col = df.gender
pos_col = df.target
statistical_parity_difference(
    df_col=df_col, pos_col=pos_col, pos_outcome_val=1, privileged_group="male"
)

Unnamed: 0,feature,metric,risk_group,metric_value
0,gender,statistical_parity_difference,female,-0.2


## Disparate Impact

**Disparate Impact (DI)** is a metric used to assess whether outcomes differ across groups or classes even when there is no intentional bias. It compares the proportion of individuals receiving favorable outcomes from ML predictions between two groups: a *privileged* (or *majority*) group and an *unprivileged* (or *minority*) group. Disparate Impact is calculated as the ratio of the favorable outcome rate for the unprivileged group to the favorable outcome rate for the privileged group.

The formula for Disparate Impact (DI) is shown below:

$$
DI = \frac{\text{P}(\text{favorable} = 1 \mid \text{C} = \text{unprivileged})}{\text{P}(\text{favorable} = 1 \mid \text{C} = \text{privileged})}
$$

where,

$\text{P}(\text{favorable} = 1 \mid \text{C} = \text{unprivileged})$ represents the proportion of favorable outcomes for the unprivileged group

$\text{P}(\text{favorable} = 1 \mid \text{C} = \text{privileged})$ represents the proportion of favorable outcomes for the privileged group


<br />

The formula could also be written below which illustrates how the proportion or rates are determined:

$$
DI = \frac{\#\text{ favorable outcomes for unprivileged group} \,\div\,\#\text{ unprivileged group}}{\#\text{ favorable outcomes for privileged group}\,\div\,\#\text{ privileged group}}
$$

<br />

To compute the Disparate Impact metric in GRAIL, use the `compute_disparate_impact` function. The function requires a pandas DataFrame `df` containing the column that holds the privileged and unprivileged groups (`protected_feature`) and the column that holds the model's predictions (`target_col`), whose favorable prediction equals `positive_label` (defaults to 1). The function calculates the DI metric of the group `underprivileged_group` against each of the other groups and, when the feature has more than two groups, against all other groups combined. If `underprivileged_group` is not provided, it iterates over every unique group and calculates the DI metrics of each one with respect to the others. The function takes a lower threshold (`threshold_lower`) and an upper threshold (`threshold_upper`), whose default values are 0.8 and 1.2, respectively. The DI metric is deemed acceptable if it is within the threshold range. See the examples below for more information.
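Before looking at the full function, the core arithmetic can be sketched in a few lines. This is a minimal illustration of the ratio and the threshold check described above, not the module's implementation. The helper names `disparate_impact` and `outside_band` and the rates are hypothetical.

```python
import math


def disparate_impact(rate_unpriv: float, rate_priv: float) -> float:
    """Ratio of favorable-outcome rates; NaN when the privileged rate is zero."""
    return rate_unpriv / rate_priv if rate_priv != 0 else math.nan


def outside_band(di: float, lower: float = 0.8, upper: float = 1.2) -> bool:
    """True when DI falls outside the acceptable [lower, upper] range."""
    return di < lower or di > upper


# Hypothetical rates: 30 of 100 unprivileged and 60 of 100 privileged favorable
di = disparate_impact(30 / 100, 60 / 100)
print(di, outside_band(di))

# Equal rates give DI = 1, which sits inside the default band
print(outside_band(disparate_impact(0.5, 0.5)))
```

One detail worth keeping in mind with this kind of check: a NaN compares as false against both bounds, so an undefined ratio would not be flagged as exceeding the threshold.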

In [None]:
# | export


def compute_disparate_impact(
    df: pd.DataFrame,
    protected_feature: str,
    target_col: str,
    underprivileged_group: str = None,
    positive_label: int = 1,
    threshold_upper: float = 1.2,
    threshold_lower: float = 0.8,
) -> pd.DataFrame:
    """
    Computes the Disparate Impact (DI) metric for a given dataset and underprivileged group

    Parameters
    ----------
    df
        The pandas DataFrame containing the privileged and unprivileged groups and
        the output predictions
    protected_feature
        The column name of the protected feature where the underprivileged group is found
    target_col
        The column name of the predictions
    positive_label
        The positive label value in target_col. Defaults to 1
    threshold_upper
        Upper threshold for Disparate Impact. Defaults to 1.2
    threshold_lower
        Lower threshold for Disparate Impact. Defaults to 0.8
    underprivileged_group
        (Optional) The underprivileged group to calculate the DI metric for. If not provided,
        the function calculates the DI metric for all groups


    Returns
    -------
    pd.DataFrame
        The Disparate Impact metric
    """
    METRIC_NAME = "disparate_impact"

    # Check for required columns
    if protected_feature not in df.columns:
        raise ValueError(f"Column '{protected_feature}' not in 'df'")
    if target_col not in df.columns:
        raise ValueError(f"Column '{target_col}' not in 'df'")

    # Check for valid values
    if positive_label not in df[target_col].unique():
        raise ValueError(
            f"Value '{positive_label}' not in column '{target_col}' of 'df'"
        )
    if (
        underprivileged_group
        and underprivileged_group not in df[protected_feature].unique()
    ):
        raise ValueError(
            f"Value '{underprivileged_group}' not in column '{protected_feature}' of 'df'"
        )

    # Calculate the mean (rate), sum (total_positive), and count (total) for each group
    group_stats = df.groupby(protected_feature)[target_col].agg(
        rate="mean", total_positive="sum", total="count"
    )

    # Calculate the Disparate Impact metrics for each group
    disparate_impact_metrics = []

    # If a specific underprivileged group is provided, calculate DI only for that group
    if underprivileged_group:
        current_rate = group_stats.loc[underprivileged_group, "rate"]

        # Calculate the Disparate Impact metrics with respect to other groups
        for group in group_stats.index:
            if group != underprivileged_group:
                group_rate = group_stats.loc[group, "rate"]
                di_metric = current_rate / group_rate if group_rate != 0 else np.nan
                exceeds_threshold = (
                    di_metric < threshold_lower or di_metric > threshold_upper
                )

                disparate_impact_metrics.append(
                    {
                        "Protected Faeture": protected_feature,
                        "Metric": METRIC_NAME,
                        "Metric Value": di_metric,
                        "Underprivileged Group": underprivileged_group,
                        "Privileged Group": group,
                        "Threshold": [threshold_lower, threshold_upper],
                        "Exceeds Threshold": exceeds_threshold,
                        "Remarks": remark_spiel_generator(
                            feature_name=protected_feature,
                            group=underprivileged_group,
                            metric_name=METRIC_NAME,
                            exceeds_flag=exceeds_threshold,
                            lower_bound=threshold_lower,
                            upper_bound=threshold_upper,
                        ),
                    }
                )

        # Calculate the Disparate Impact metrics for combination of all other groups
        if len(df[protected_feature].unique()) > 2:
            other_total_positives = (
                df[target_col].sum()
                - group_stats.loc[underprivileged_group, "total_positive"]
            )
            other_total = len(df) - group_stats.loc[underprivileged_group, "total"]
            other_rate = (
                other_total_positives / other_total if other_total != 0 else np.nan
            )

            di_metric = current_rate / other_rate if other_rate != 0 else np.nan
            exceeds_threshold = (
                di_metric < threshold_lower or di_metric > threshold_upper
            )

            disparate_impact_metrics.append(
                {
                    "Protected Faeture": protected_feature,
                    "Metric": METRIC_NAME,
                    "Metric Value": di_metric,
                    "Underprivileged Group": underprivileged_group,
                    "Privileged Group": "All",
                    "Threshold": [threshold_lower, threshold_upper],
                    "Exceeds Threshold": exceeds_threshold,
                    "Remarks": remark_spiel_generator(
                        feature_name=protected_feature,
                        group=underprivileged_group,
                        metric_name=METRIC_NAME,
                        exceeds_flag=exceeds_threshold,
                        lower_bound=threshold_lower,
                        upper_bound=threshold_upper,
                    ),
                }
            )
    else:
        for current_group in group_stats.index:
            current_rate = group_stats.loc[current_group, "rate"]

            # Calculate the Disparate Impact metrics with respect to other groups
            for group in group_stats.index:
                if group != current_group:
                    group_rate = group_stats.loc[group, "rate"]

                    di_metric = current_rate / group_rate if group_rate != 0 else np.nan
                    exceeds_threshold = (
                        di_metric < threshold_lower or di_metric > threshold_upper
                    )

                    disparate_impact_metrics.append(
                        {
                            "Protected Faeture": protected_feature,
                            "Metric": METRIC_NAME,
                            "Metric Value": di_metric,
                            "Underprivileged Group": current_group,
                            "Privileged Group": group,
                            "Threshold": [threshold_lower, threshold_upper],
                            "Exceeds Threshold": exceeds_threshold,
                            "Remarks": remark_spiel_generator(
                                feature_name=protected_feature,
                                group=current_group,
                                metric_name=METRIC_NAME,
                                exceeds_flag=exceeds_threshold,
                                lower_bound=threshold_lower,
                                upper_bound=threshold_upper,
                            ),
                        }
                    )

            # Calculate the Disparate Impact metrics for combination of all other groups if more than 2 groups
            if len(df[protected_feature].unique()) > 2:
                other_total_positives = (
                    df[target_col].sum()
                    - group_stats.loc[current_group, "total_positive"]
                )
                other_total = len(df) - group_stats.loc[current_group, "total"]
                other_rate = (
                    other_total_positives / other_total if other_total != 0 else np.nan
                )

                di_metric = current_rate / other_rate if other_rate != 0 else np.nan
                exceeds_threshold = (
                    di_metric < threshold_lower or di_metric > threshold_upper
                )

                disparate_impact_metrics.append(
                    {
                        "Protected Faeture": protected_feature,
                        "Metric": METRIC_NAME,
                        "Metric Value": di_metric,
                        "Underprivileged Group": current_group,
                        "Privileged Group": "All",
                        "Threshold": [threshold_lower, threshold_upper],
                        "Exceeds Threshold": exceeds_threshold,
                        "Remarks": remark_spiel_generator(
                            feature_name=protected_feature,
                            group=current_group,
                            metric_name=METRIC_NAME,
                            exceeds_flag=exceeds_threshold,
                            lower_bound=threshold_lower,
                            upper_bound=threshold_upper,
                        ),
                    }
                )

    return pd.DataFrame(disparate_impact_metrics)

### Computing Disparate Impact for 2 Classes
The example below computes the Disparate Impact metric for the protected feature `gender`, which has two classes. The underprivileged group, **female**, is passed to the `underprivileged_group` argument, and the `target` column, which holds the ML model's prediction results, is passed to `target_col` (`positive_label` defaults to 1). The function returns a pandas DataFrame containing the DI metric calculated for the given underprivileged group.

In [None]:
df = create_biased_dataset(100)

compute_disparate_impact(
    df=df,
    protected_feature="gender",
    target_col="target",
    underprivileged_group="female",
)

Unnamed: 0,Protected Faeture,Metric,Metric Value,Underprivileged Group,Privileged Group,Threshold,Exceeds Threshold,Remarks
0,gender,disparate_impact,0.205479,female,male,"[0.8, 1.2]",True,Group female of Feature gender exceeded the di...


If `underprivileged_group` is not provided, the function calculates the Disparate Impact metric for each group, treating each group in turn as the underprivileged group.

In [None]:
compute_disparate_impact(df=df, protected_feature="gender", target_col="target")

Unnamed: 0,Protected Faeture,Metric,Metric Value,Underprivileged Group,Privileged Group,Threshold,Exceeds Threshold,Remarks
0,gender,disparate_impact,0.205479,female,male,"[0.8, 1.2]",True,Group female of Feature gender exceeded the di...
1,gender,disparate_impact,4.866667,male,female,"[0.8, 1.2]",True,Group male of Feature gender exceeded the disp...
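
The two rows above are reciprocals of each other, because each DI value is the ratio of one group's positive-outcome rate to the other's. A minimal sketch of that arithmetic on hypothetical toy data (the `toy` frame and its values are made up for illustration, and a binary target with positive label 1 is assumed so that the mean equals the positive rate):

```python
import pandas as pd

# Hypothetical toy data: 1 = positive outcome, 0 = negative outcome.
toy = pd.DataFrame(
    {
        "gender": ["female"] * 5 + ["male"] * 5,
        "target": [1, 0, 0, 0, 0, 1, 1, 1, 1, 0],
    }
)

# Positive-outcome rate per group (mean of a 0/1 column).
rates = toy.groupby("gender")["target"].mean()

# Disparate Impact in both directions: rate(group) / rate(other group).
di_female_vs_male = rates["female"] / rates["male"]
di_male_vs_female = rates["male"] / rates["female"]

# Flag values outside the [0.8, 1.2] band used in the outputs above.
female_exceeds = di_female_vs_male < 0.8 or di_female_vs_male > 1.2

print(di_female_vs_male, di_male_vs_female, female_exceeds)
```

Because the two values multiply to 1, a group that falls below the lower bound always has a counterpart that lands above the upper bound when the band is roughly symmetric around 1.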


### Computing Disparate Impact for Multiple Classes
The following example calculates the Disparate Impact metric on a multi-class protected feature, `location`, with a given `underprivileged_group`. For more than 2 groups, it also calculates the DI metric of the underprivileged group against all the other groups combined and treated as a single privileged group (shown as `All` in the `Privileged Group` column).

In [None]:
compute_disparate_impact(
    df=df,
    protected_feature="location",
    target_col="target",
    underprivileged_group="loc1",
)

Unnamed: 0,Protected Faeture,Metric,Metric Value,Underprivileged Group,Privileged Group,Threshold,Exceeds Threshold,Remarks
0,location,disparate_impact,1.126749,loc1,loc2,"[0.8, 1.2]",False,Acceptable
1,location,disparate_impact,1.057143,loc1,loc3,"[0.8, 1.2]",False,Acceptable
2,location,disparate_impact,1.102981,loc1,All,"[0.8, 1.2]",False,Acceptable
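
The `All` row compares the chosen group against every other row pooled together, not against the whole dataset. A small sketch of that one-vs-rest comparison, using hypothetical toy data and again assuming a 0/1 target with positive label 1:

```python
import pandas as pd

# Hypothetical toy data with three locations.
toy = pd.DataFrame(
    {
        "location": ["loc1"] * 4 + ["loc2"] * 4 + ["loc3"] * 2,
        "target": [1, 1, 0, 0, 1, 1, 1, 0, 1, 0],
    }
)

group = "loc1"
in_group = toy["location"] == group

# Positive rate inside the group: 2 of 4 rows.
group_rate = toy.loc[in_group, "target"].mean()

# Positive rate over all remaining rows pooled together: 4 of 6 rows.
rest_rate = toy.loc[~in_group, "target"].mean()

# One-vs-rest Disparate Impact for the group.
di_vs_rest = group_rate / rest_rate

print(group_rate, rest_rate, di_vs_rest)
```

Pooling the rows weights each remaining group by its size, so the result generally differs from the average of the pairwise DI values.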


If `underprivileged_group` is not provided, the function calculates the DI metric for each unique group with respect to each of the other groups, as well as with respect to all the other groups combined.

In [None]:
compute_disparate_impact(df=df, protected_feature="location", target_col="target")

Unnamed: 0,Protected Faeture,Metric,Metric Value,Underprivileged Group,Privileged Group,Threshold,Exceeds Threshold,Remarks
0,location,disparate_impact,1.126749,loc1,loc2,"[0.8, 1.2]",False,Acceptable
1,location,disparate_impact,1.057143,loc1,loc3,"[0.8, 1.2]",False,Acceptable
2,location,disparate_impact,1.102981,loc1,All,"[0.8, 1.2]",False,Acceptable
3,location,disparate_impact,0.887509,loc2,loc1,"[0.8, 1.2]",False,Acceptable
4,location,disparate_impact,0.938224,loc2,loc3,"[0.8, 1.2]",False,Acceptable
5,location,disparate_impact,0.901431,loc2,All,"[0.8, 1.2]",False,Acceptable
6,location,disparate_impact,0.945946,loc3,loc1,"[0.8, 1.2]",False,Acceptable
7,location,disparate_impact,1.065844,loc3,loc2,"[0.8, 1.2]",False,Acceptable
8,location,disparate_impact,0.996528,loc3,All,"[0.8, 1.2]",False,Acceptable


## KL Divergence

KL Divergence measures how one probability distribution $Q$ differs from an expected distribution $P$, representing the "information lost" when $Q$ approximates $P$. A value of zero means $Q$ perfectly matches $P$, with higher values indicating more divergence. It is asymmetric, so $D_{KL}(P || Q)$ is generally not equal to $D_{KL}(Q || P)$. In the context of fairness, KL Divergence can be used to determine which demographics have different target distributions relative to other demographics, which may need to be investigated for bias in training data.

$$
D_{KL}(P || Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}
$$

or, in the continuous case:

$$
D_{KL}(P || Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx
$$

where:
- P and Q are the two probability distributions.
- p(x) and q(x) are the probability density functions of P and Q, respectively.
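
As a quick check of the discrete formula, the sketch below evaluates it by hand and with `scipy.stats.entropy`, which returns the KL divergence when given two distributions and uses the natural logarithm by default. The distributions `p` and `q` are arbitrary illustrative values.

```python
import numpy as np
from scipy.stats import entropy

# Two illustrative discrete distributions over the same two outcomes.
p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

# Discrete formula: sum_i P(i) * log(P(i) / Q(i)), natural log.
kl_pq_manual = float(np.sum(p * np.log(p / q)))

# Same quantity via SciPy, plus the reverse direction.
kl_pq = float(entropy(p, q))
kl_qp = float(entropy(q, p))

# Identical distributions give zero divergence.
kl_pp = float(entropy(p, p))

print(kl_pq_manual, kl_pq, kl_qp, kl_pp)
```

The two directions give different values here, which illustrates the asymmetry noted above.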

In the context of GRAIL, the KL Divergence is computed by comparing the probability distribution of the target variable for the entire dataset ($P$) against the probability distribution of the target variable for a specific category of a feature ($Q$). The higher the value of the KL Divergence, the larger the difference between the target distribution of that demographic and the target distribution of the entire dataset. This implies that the demographic group could have a relationship with the target that is different from other groups and might need to be investigated further (e.g., people from loc1 could be more or less likely to get their loans rejected in the training data).

The function's output is a dataframe containing the KL Divergence value per category and a column to specify if the KL Divergence has exceeded the set threshold. For GRAIL, the default threshold is 0.20 but can be adjusted by the user.
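
The comparison described above can be sketched on toy data without any GRAIL helpers. The `toy` frame, the `results` dictionary, and the group names are hypothetical; the overall target distribution plays the role of $P$ and each group's target distribution plays the role of $Q$.

```python
import pandas as pd
from scipy.stats import entropy

# Hypothetical toy data: two locations with different positive rates.
toy = pd.DataFrame(
    {
        "location": ["loc1"] * 4 + ["loc2"] * 4,
        "target": [1, 1, 1, 0, 1, 0, 0, 0],
    }
)

threshold = 0.20

# P: target distribution over the entire dataset.
overall = toy["target"].value_counts(normalize=True)

results = {}
for name, rows in toy.groupby("location"):
    # Q: target distribution within one group.
    group_dist = rows["target"].value_counts(normalize=True)

    # Align both distributions on the same set of target values.
    idx = overall.index.union(group_dist.index)
    p = overall.reindex(idx, fill_value=0)
    q = group_dist.reindex(idx, fill_value=0)

    kl = float(entropy(p, q))
    results[name] = {"kl_divergence": kl, "exceeds": kl > threshold}

print(results)
```

In this toy case both groups deviate from the overall 50/50 split by the same amount, so they receive the same KL value and neither crosses the 0.20 threshold.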

In [None]:
# | export


def compute_kl_divergence(
    df: pd.DataFrame,
    target_col: str,
    protected_feature: str,
    underprivileged_group: str = None,
    threshold: float = 0.20,
):
    """
    Calculate the Kullback-Leibler divergence between the overall target
    distribution and the target distribution of one or all groups of a
    protected feature.

    Parameters
    ----------
    df : pandas.DataFrame
        The input dataframe containing the target and protected feature columns.
    target_col : str
        The target variable column name.
    protected_feature : str
        The protected feature column name.
    underprivileged_group : str, optional
        The specific group to analyze within the protected feature. If None, KL Divergence will be calculated for all groups.
    threshold : float, optional
        The threshold for determining if the KL divergence exceeds a specified limit. Default is 0.20.

    Returns
    -------
    pd.DataFrame
        A dataframe containing the KL divergence results and threshold status.
    """
    METRIC_NAME = "KL Divergence"
    if underprivileged_group is not None:
        target_distribution = df[target_col].value_counts(normalize=True)
        group_distribution = df[df[protected_feature] == underprivileged_group][
            target_col
        ].value_counts(normalize=True)
        combined_index = target_distribution.index.union(group_distribution.index)
        p = target_distribution.reindex(combined_index, fill_value=0)
        q = group_distribution.reindex(combined_index, fill_value=0)
        kl_divergence = entropy(p, q)
        exceeds_flag = kl_divergence > threshold
        return pd.DataFrame(
            {
                "target": target_col,
                "protected_feature": protected_feature,
                "underprivileged_group": underprivileged_group,
                "kl_divergence": kl_divergence,
                "threshold": threshold,
                "exceeds": exceeds_flag,
                "remarks": remark_spiel_generator(
                    feature_name=protected_feature,
                    group=underprivileged_group,
                    metric_name=METRIC_NAME,
                    exceeds_flag=exceeds_flag,
                    threshold=threshold,
                ),
            },
            index=[0],
        )
    else:
        groups = df[protected_feature].unique()
        kl_divs = []
        exceeds = []
        remarks = []
        for risk_group in groups:
            target_distribution = df[target_col].value_counts(normalize=True)
            group_distribution = df[df[protected_feature] == risk_group][
                target_col
            ].value_counts(normalize=True)
            combined_index = target_distribution.index.union(group_distribution.index)
            p = target_distribution.reindex(combined_index, fill_value=0)
            q = group_distribution.reindex(combined_index, fill_value=0)
            kl_divergence = entropy(p, q)
            kl_divs.append(kl_divergence)
            exceeds_flag = kl_divergence > threshold
            exceeds.append(exceeds_flag)
            remarks.append(
                remark_spiel_generator(
                    feature_name=protected_feature,
                    group=risk_group,
                    metric_name=METRIC_NAME,
                    exceeds_flag=exceeds_flag,
                    threshold=threshold,
                )
            )
        return pd.DataFrame(
            {
                "target": target_col,
                "risk_feature": protected_feature,
                "risk_group": groups,
                "kl_divergence": kl_divs,
                "threshold": threshold,
                "exceeds": exceeds,
                "remarks": remarks,
            },
            index=list(range(0, len(groups))),
        )

### Computing KL Divergence for 1 Category
In this example, the target distribution of the full dataset is compared against the target distribution of females. The calculated KL divergence is approximately 0.79, which is higher than the default threshold of 0.2. The data for females may then be investigated for potential bias.

In [None]:
compute_kl_divergence(df, "target", "gender", "female")

Unnamed: 0,target,protected_feature,underprivileged_group,kl_divergence,threshold,exceeds,remarks
0,target,gender,female,0.789931,0.2,True,Group female of Feature gender exceeded the KL...


### Computing KL Divergence for Multiple Categories
In the example below, we calculate the KL Divergence for each group of the `location` feature. All 3 KL Divergence values are under the default threshold, which implies that the target distributions for rows in loc1, loc2, and loc3 are consistent with the target distribution of the entire population.

In [None]:
compute_kl_divergence(df, "target", "location")

Unnamed: 0,target,risk_feature,risk_group,kl_divergence,threshold,exceeds,remarks
0,target,location,loc3,0.006715,0.2,False,Acceptable
1,target,location,loc1,0.017287,0.2,False,Acceptable
2,target,location,loc2,0.007069,0.2,False,Acceptable
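
One edge case may be worth keeping in mind when reading these results. If a group never shows a target value that does occur in the full dataset, its distribution has a zero where the overall distribution does not, and `scipy.stats.entropy` returns infinity. The sketch below uses made-up distributions to show this; it is an illustration of SciPy's behavior, not a statement about how GRAIL intends to handle such groups.

```python
import numpy as np
from scipy.stats import entropy

# Overall distribution: both outcomes occur.
p = np.array([0.5, 0.5])

# Group distribution: only the first outcome was ever observed.
q = np.array([1.0, 0.0])

kl = float(entropy(p, q))

# An infinite divergence always exceeds any finite threshold.
exceeds = kl > 0.20

print(kl, exceeds)
```

Small groups are the most likely to hit this case, so an infinite value may say more about sample size than about bias.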


In [None]:
# | hide
import nbdev

nbdev.nbdev_export()