# Fairness Evaluation with Uniform Sampling
This notebook applies several sampling strategies to rebalance a dataset before fairness evaluation, with a focus on uniform sampling. The goal is to see how the choice of sampling technique affects fairness metrics, specifically gender-based fairness in income prediction.

### Steps in this notebook:
1. Load the dataset and preprocess it (encoding categorical variables, filtering by hours worked).
2. Apply oversampling and uniform sampling to rebalance the dataset.
3. Evaluate fairness metrics based on gender.

In [8]:
!pip install imbalanced-learn kagglehub gower 'aif360[all]'



In [9]:
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler
from group_fairness import evaluate_group_fairness
import kagglehub

## Data Preprocessing Functions

These functions handle preprocessing tasks such as:
- Encoding categorical columns (e.g., `sex` and `income`).
- Filtering the dataset based on hours worked (e.g., only including rows where `hours.per.week == 40`).
- Implementing sampling strategies for fairness evaluations.

In [10]:
# Function to encode the 'income' and 'sex' columns as binary values
def encode_all_categorical_columns(data):
    """
    Encodes 'income' and 'sex' as binary values and drops every other column.

    Despite its name, this function does not encode the remaining categorical
    columns; they are discarded.

    Parameters:
    data (pd.DataFrame): The dataset containing the 'income' and 'sex' columns.

    Returns:
    pd.DataFrame: A two-column dataset ('sex', 'income') with numerical values.
    """
    # Work on a copy of the two columns so the caller's DataFrame is not modified
    data = data[['sex', 'income']].copy()

    # Encode 'income' (>50K = 1) and 'sex' (Male = 1)
    data['income'] = data['income'].map({">50K": 1, "<=50K": 0})
    data['sex'] = data['sex'].map({"Male": 1, "Female": 0})

    # Unmapped categories become NaN; warn and drop those rows
    if data.isna().sum().sum() > 0:
        print("Warning: NaN values detected after encoding! Check your categories.")
        data = data.dropna()

    return data
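
The NaN check above matters because `Series.map` with a dictionary returns NaN for any value that is not one of its keys, so labels with stray whitespace or a trailing period silently become missing. A minimal sketch on made-up labels (these strings are illustrative and not taken from the dataset):

```python
import pandas as pd

# Toy labels: the second has a leading space, the third a trailing period.
income = pd.Series([">50K", " <=50K", ">50K.", "<=50K"])
mapping = {">50K": 1, "<=50K": 0}

# Mapping the raw strings leaves the two malformed labels unmapped.
raw = income.map(mapping)

# Normalizing first (strip whitespace, then a trailing period) recovers them.
cleaned = income.str.strip().str.rstrip(".").map(mapping)

print(raw.isna().sum())
print(cleaned.tolist())
```

If the warning fires on real data, inspecting `data['income'].unique()` before mapping is a quick way to see which labels need normalizing.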

### Filtering the dataset by hours worked

To check whether hours worked is a factor in the income disparity, the dataset can be filtered to a single value of `hours.per.week` (40 by default).

In [11]:
# Function to filter dataset based on 'hours.per.week'
def modify_hours(data, hours=40):
    """
    Filters the dataset to only include rows where 'hours.per.week' is equal to the specified value.
    
    Parameters:
    data (pd.DataFrame): The dataset containing the 'hours.per.week' column.
    hours (int): The number of weekly hours to keep (default 40).

    Returns:
    pd.DataFrame: The rows of the dataset where 'hours.per.week' equals `hours`.
    """
    return data[data['hours.per.week'] == hours]

### Sampling Methods

The following methods are used to balance the dataset. The uniform sampling method is particularly relevant, as it is designed to ensure that the dataset is evenly distributed across different protected groups (e.g., males and females) and is listed as the main preprocessing strategy in the paper.
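
As a minimal sketch of what "evenly distributed across protected groups" means in practice, the snippet below draws the same number of rows from each group of a made-up DataFrame. The data and variable names are illustrative assumptions, and the sketch uses `pd.concat` over the groups instead of `groupby(...).apply(...)`.

```python
import pandas as pd

# Toy data: 6 rows with sex == 1 and 2 rows with sex == 0 (made-up values).
toy = pd.DataFrame({
    "sex":    [1, 1, 1, 1, 1, 1, 0, 0],
    "income": [1, 0, 1, 0, 0, 1, 0, 1],
})

# Size of the smallest group; every group is cut down to this size.
min_size = toy.groupby("sex").size().min()

# Draw min_size rows from each group without replacement.
balanced = pd.concat(
    [group.sample(n=min_size, random_state=42) for _, group in toy.groupby("sex")],
    ignore_index=True,
)

print(balanced["sex"].value_counts().to_dict())
```

Equal group sizes do not by themselves equalize the favorable-outcome rate within each group; that depends on which rows each group contributes.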

In [12]:
def preprocess_sampling(data, target, protected_attrs, sampling_strategy='oversample'):
    """
    Applies a sampling technique to rebalance the dataset before fairness metrics are computed.
    Supported strategies are oversampling, undersampling, stratified sampling, and uniform
    sampling (uniform sampling was explicitly mentioned in the paper).

    Parameters:
    data (pd.DataFrame): DataFrame containing the features and target
    target (str): The target column name
    protected_attrs (list): List of protected attribute column names (e.g., ['sex'])
    sampling_strategy (str): 'oversample', 'undersample', 'stratified', or 'uniform'

    Returns:
    pd.DataFrame: Processed dataset
    """
    X = data.drop(columns=[target])
    y = data[target]

    if sampling_strategy == 'oversample':
        # Balances the target classes (not the protected groups). No random_state
        # is set, so results vary slightly between runs.
        sampler = RandomOverSampler()
    elif sampling_strategy == 'undersample':
        # Also balances the target classes, by dropping majority-class rows
        sampler = RandomUnderSampler()
    elif sampling_strategy == 'stratified':
        # Bootstrap within each protected group, keeping every group at its original size
        return data.groupby(protected_attrs, group_keys=False).apply(lambda x: x.sample(len(x), replace=True)).reset_index(drop=True)
    elif sampling_strategy == 'uniform':
        # Randomly select an equal number of rows from each protected group
        min_size = data.groupby(protected_attrs).size().min()  # Size of the smallest group
        return data.groupby(protected_attrs, group_keys=False).apply(lambda x: x.sample(n=min_size, random_state=42)).reset_index(drop=True)
    else:
        raise ValueError("Invalid sampling strategy. Choose 'oversample', 'undersample', 'stratified', or 'uniform'.")

    X_resampled, y_resampled = sampler.fit_resample(X, y)
    return pd.concat([pd.DataFrame(X_resampled, columns=X.columns), pd.DataFrame(y_resampled, columns=[target])], axis=1)

def uniform_sampling(data, target, protected_attr):
    """
    Applies uniform sampling over the four partitions defined by the protected attribute and the target class, following the method described in the paper.

    Parameters:
    data (pd.DataFrame): The dataset containing the features and target
    target (str): The target column name
    protected_attr (str): The protected attribute column name (e.g., 'sex')

    Returns:
    pd.DataFrame: New dataset after uniform sampling
    """
    # Step 1: Partition the dataset into four groups by protected attribute and target.
    # With 'sex' encoded as Male = 1, DP/DN hold the males and FP/FN the females.
    DP = data[(data[protected_attr] == 1) & (data[target] == 1)]  # protected_attr == 1, favorable outcome
    DN = data[(data[protected_attr] == 1) & (data[target] == 0)]  # protected_attr == 1, unfavorable outcome
    FP = data[(data[protected_attr] == 0) & (data[target] == 1)]  # protected_attr == 0, favorable outcome
    FN = data[(data[protected_attr] == 0) & (data[target] == 0)]  # protected_attr == 0, unfavorable outcome

    # Step 2: Calculate each partition's share of the dataset
    total_samples = len(data)
    weight_DP = len(DP) / total_samples
    weight_DN = len(DN) / total_samples
    weight_FP = len(FP) / total_samples
    weight_FN = len(FN) / total_samples

    # Step 3: Sample each partition uniformly with replacement.
    # Note: weight * total_samples equals the partition's original size (up to float
    # rounding), so the group and class proportions of the input are preserved.
    samples_DP = DP.sample(n=int(weight_DP * total_samples), replace=True, random_state=42)
    samples_DN = DN.sample(n=int(weight_DN * total_samples), replace=True, random_state=42)
    samples_FP = FP.sample(n=int(weight_FP * total_samples), replace=True, random_state=42)
    samples_FN = FN.sample(n=int(weight_FN * total_samples), replace=True, random_state=42)

    # Step 4: Combine the samples from all partitions
    uniform_sampled_data = pd.concat([samples_DP, samples_DN, samples_FP, samples_FN])

    return uniform_sampled_data
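
One common formulation of uniform sampling for fairness sets each partition's target size to the size it would have if the protected attribute and the label were independent, that is `|group| * |class| / N`. Whether this matches the paper the notebook follows is an assumption; the sketch below only illustrates that arithmetic, and `expected_partition_sizes` and the counts are hypothetical.

```python
def expected_partition_sizes(counts):
    """counts maps (group, label) -> number of rows in that partition."""
    total = sum(counts.values())
    group_totals, label_totals = {}, {}
    for (group, label), n in counts.items():
        group_totals[group] = group_totals.get(group, 0) + n
        label_totals[label] = label_totals.get(label, 0) + n
    # Size each partition would have if group and label were independent.
    return {
        (group, label): round(group_totals[group] * label_totals[label] / total)
        for (group, label) in counts
    }

# Hypothetical counts: group 1 has a 40% favorable rate, group 0 only 10%.
counts = {(1, 1): 40, (1, 0): 60, (0, 1): 10, (0, 0): 90}
targets = expected_partition_sizes(counts)
print(targets)
```

Sampling each partition to its target size (with replacement where the target exceeds the partition's size) would give both groups the same favorable-outcome rate in this toy case, which is not what resampling each partition at its original size does.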


## Loading and Preprocessing the Dataset

This cell loads the dataset, optionally filters it by hours worked, encodes the `sex` and `income` columns, and applies oversampling and uniform sampling.

In [13]:
path = kagglehub.dataset_download("uciml/adult-census-income")
df = pd.read_csv(path + "/adult.csv") # Load the dataset

# df = modify_hours(df)  # Optional: keep only rows with 'hours.per.week' == 40 (must run before encoding, which drops that column)
df = encode_all_categorical_columns(df)  # Encode 'sex' and 'income'; all other columns are dropped
df_oversampled = preprocess_sampling(df, target="income", protected_attrs=["sex"], sampling_strategy='oversample')
# Apply uniform sampling to balance the dataset based on paper
df_uniform_sampled = uniform_sampling(df, target="income", protected_attr="sex")

# Evaluate fairness metrics based on sex
fairness_metrics_over = evaluate_group_fairness(df_oversampled, target="income", protected_attr="sex")
fairness_metrics_uni = evaluate_group_fairness(df_uniform_sampled, target="income", protected_attr="sex")

print(f"Oversampling: {fairness_metrics_over}")
print(f"Uniform Sampling: {fairness_metrics_uni}")

Oversampling: {'Statistical Parity Difference': -0.3043882229682283, 'Disparate Impact': 0.47674277008926685, 'Demographic Parity': -0.3043882229682283}
Uniform Sampling: {'Statistical Parity Difference': -0.19627598779361352, 'Disparate Impact': 0.3580225496813511, 'Demographic Parity': -0.19627598779361352}


### Results and Discussion

After applying uniform sampling and evaluating the fairness metrics, we compared the results with those obtained from oversampling. The key fairness metrics we are focusing on are:

1. **Statistical Parity Difference**: Measures the difference in favorable outcomes across protected groups. A negative value indicates that the protected group (in this case, females) is less likely to receive a favorable outcome compared to the unprotected group (males).
2. **Disparate Impact**: Measures the ratio of favorable outcomes between the protected and unprotected groups. A value less than 1 indicates that the protected group is disadvantaged.
3. **Demographic Parity**: Checks whether each group receives favorable outcomes at the same rate. In this notebook it is reported as the same difference as statistical parity, so the two values are identical.
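
The definitions above can be sketched in a few lines of pandas. This is not the implementation of `evaluate_group_fairness`, which lives in a local module not shown here; `parity_metrics`, its parameters, and the toy data are hypothetical, and the sketch assumes the protected group is the one encoded as 0.

```python
import pandas as pd

def parity_metrics(data, target, protected_attr, protected_value=0):
    """Hypothetical helper: compare favorable-outcome rates across two groups."""
    is_protected = data[protected_attr] == protected_value
    rate_protected = data.loc[is_protected, target].mean()
    rate_unprotected = data.loc[~is_protected, target].mean()
    return {
        # Difference of rates: negative when the protected group fares worse.
        "spd": rate_protected - rate_unprotected,
        # Ratio of rates: below 1 when the protected group fares worse.
        "di": rate_protected / rate_unprotected,
    }

# Toy data: 1 of 4 rows with sex == 0 and 2 of 4 rows with sex == 1 are favorable.
toy = pd.DataFrame({
    "sex":    [0, 0, 0, 0, 1, 1, 1, 1],
    "income": [1, 0, 0, 0, 1, 1, 0, 0],
})
metrics = parity_metrics(toy, target="income", protected_attr="sex")
print(metrics)
```

Because one metric is a difference and the other a ratio, a resampling step that raises both groups' rates can move them in opposite directions.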

Here are the results for each sampling strategy:

#### Oversampling:
- **Statistical Parity Difference**: -0.3044
- **Disparate Impact**: 0.4767
- **Demographic Parity**: -0.3044

#### Uniform Sampling:
- **Statistical Parity Difference**: -0.1963
- **Disparate Impact**: 0.3580
- **Demographic Parity**: -0.1963

#### Interpretation of Results:

1. **Statistical Parity Difference**:
   - **Oversampling**: Oversampling gives a more negative statistical parity difference (-0.3044) than uniform sampling (-0.1963). `RandomOverSampler` balances the income classes rather than the protected groups: it duplicates `>50K` rows until both classes are the same size, which raises the favorable-outcome rate of both groups and widens the absolute gap between females (protected) and males (unprotected).
   - **Uniform Sampling**: Uniform sampling gives a less negative value (-0.1963), i.e. a smaller absolute gap in favorable-outcome rates. Note that `uniform_sampling` as written resamples each of the four partitions at its original size, so it preserves the group and class proportions of the input rather than equalizing them.

2. **Disparate Impact**:
   - **Oversampling**: The disparate impact for oversampling (0.4767) is well below 1, indicating that the protected group (females) is disadvantaged, but to a lesser extent than with uniform sampling.
   - **Uniform Sampling**: The disparate impact for uniform sampling is 0.3580, which is farther from 1 than the oversampling value and indicates a clear disadvantage for females compared to males. On this ratio-based metric, uniform sampling does not improve on oversampling.

3. **Demographic Parity**:
   - **Oversampling**: Because demographic parity is reported as the same quantity as statistical parity difference, oversampling again gives the more negative value (-0.3044), showing a larger gap in favorable outcomes.
   - **Uniform Sampling**: The demographic parity for uniform sampling is less negative (-0.1963), showing a smaller gap in the proportion of favorable outcomes between the protected and unprotected groups than under oversampling.
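
To see why the difference and the ratio disagree, the two group rates can be recovered from the printed metrics. The sketch below assumes the definitions given earlier (statistical parity difference = protected rate minus unprotected rate, disparate impact = protected rate divided by unprotected rate); `rates_from_metrics` is a hypothetical helper.

```python
def rates_from_metrics(spd, di):
    """Solve spd = p - u and di = p / u for the two favorable-outcome rates."""
    unprotected = spd / (di - 1)
    protected = di * unprotected
    return protected, unprotected

# Values copied from the printed output above.
over = rates_from_metrics(-0.3043882229682283, 0.47674277008926685)
uni = rates_from_metrics(-0.19627598779361352, 0.3580225496813511)

print(f"oversampling: female {over[0]:.3f}, male {over[1]:.3f}")
print(f"uniform:      female {uni[0]:.3f}, male {uni[1]:.3f}")
```

Under these assumptions the favorable-outcome rates are roughly 28% (female) versus 58% (male) after oversampling, and roughly 11% versus 31% after uniform sampling: oversampling raises both rates, which widens the absolute gap while bringing the ratio closer to 1.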

### Conclusion:
- **Oversampling** widens the absolute gap in favorable outcomes between the protected and unprotected groups (statistical parity difference and demographic parity), even though its disparate impact ratio is closer to 1. It increases the representation of the under-represented income class, not of the protected group, so it does not address the group disparity directly.
- **Uniform Sampling**, on the other hand, shows a smaller disparity in terms of statistical parity and demographic parity. It still shows a disadvantage for females, and its disparate impact is farther from 1, but on the difference-based metrics it is the more balanced of the two approaches.

Thus, uniform sampling appears to be the better method for reducing the absolute disparity between protected and unprotected groups, with the caveat that the ratio-based disparate impact metric favors oversampling in this run.