# Responsible AI/ML: Ensuring Transparency, Fairness, and Privacy in Machine Learning

## Overview
This notebook demonstrates a responsible AI/ML workflow using the **Pima Indians Diabetes dataset**, focusing on key principles such as:
- **Transparency** with **model explainability** using **SHAP**.
- **Fairness** with **AIF360** for bias detection.
- **Privacy** through **differentially private models** with **Diffprivlib**.
- **Accountability** by mitigating **bias** in model training.

This end-to-end example highlights how to go beyond just **accuracy** to build **ethical** and **trustworthy AI systems**.


## Learning Objectives
By the end of this tutorial, you will be able to:
- Load and preprocess a medical dataset for machine learning.
- Implement a baseline Random Forest model and evaluate its accuracy.
- Use SHAP to generate explainability insights.
- Perform fairness analysis using AIF360 to detect bias.
- Apply differential privacy to a Logistic Regression model.
- Simulate bias mitigation techniques using reweighting strategies.
- Visualize and compare model performance across responsible AI dimensions.


## Prerequisites
- Python 3.x
- Basic knowledge of machine learning with scikit-learn.
- Familiarity with fairness and privacy in AI/ML.
- Required libraries: pandas, numpy, scikit-learn, matplotlib, shap, aif360, diffprivlib.

## Get Started
Let’s begin by loading the dataset and performing a data-centric workflow. The workflow includes:
- **Data Cleaning**: Handling missing values and biological impossibilities
- **Baseline Model**: Standard Random Forest classifier
- **Explainability**: SHAP values for model interpretability
- **Fairness Analysis**: Disparate impact and statistical parity metrics
- **Privacy Protection**: Differentially private logistic regression (ε=1.0)
- **Bias Mitigation**: Sample reweighting based on age groups

### Install required packages


In [None]:
# Install required Python libraries for the project
# -------------------------------
# Library Installation Comments
# -------------------------------
# 1. pandas: For data manipulation and analysis (e.g., loading CSV files, cleaning data)
# 2. numpy: For numerical computations (e.g., handling arrays, mathematical operations)
# 3. scikit-learn: For machine learning tasks (e.g., model training, evaluation, preprocessing)
# 4. matplotlib: For data visualization (e.g., plotting graphs, charts)
# 5. shap: For model explainability (e.g., SHAP values to interpret model predictions)
# 6. aif360: For fairness analysis (e.g., measuring bias, disparate impact)
# 7. diffprivlib: For privacy-preserving machine learning (e.g., differential privacy)
%pip install pandas numpy scikit-learn matplotlib shap aif360 diffprivlib

### Import Libraries and Suppress Warnings

In [None]:
# Import necessary libraries

import pandas as pd  # For data manipulation and analysis
import numpy as np  # For numerical computations

# Import machine learning utilities
from sklearn.model_selection import train_test_split  # To split data into training and testing sets
from sklearn.ensemble import RandomForestClassifier  # Random Forest classifier for modeling
from sklearn.metrics import accuracy_score  # To evaluate model accuracy

# Import visualization tools
import matplotlib.pyplot as plt  # For plotting data and model results

# Import SHAP for model explainability
import shap  # Helps interpret model predictions by showing feature importance

# Import AIF360 for fairness evaluation
from aif360.datasets import BinaryLabelDataset  # Used to create datasets for fairness analysis
from aif360.metrics import BinaryLabelDatasetMetric  # Provides fairness-related metrics

# Import Differential Privacy library
from diffprivlib.models import LogisticRegression  # Privacy-preserving logistic regression

# Suppress warnings to keep output clean
import warnings
warnings.filterwarnings("ignore", category=UserWarning)


### Load and Clean the Dataset

In [None]:
def load_diabetes_data():
    """
    Loads the Pima Indians Diabetes dataset from a CSV file, assigns column names,
    marks '?' entries as missing values, and prints basic dataset information.
    
    Returns:
        pd.DataFrame: A pandas DataFrame containing the dataset.
    """
    
    # Define the file path to the dataset
    diabetes_data = '../../Data/pima-indians-diabetes.csv'
    
    # Define column names for the dataset
    columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
               'DiabetesPedigreeFunction', 'Age', 'Outcome']
    
    # Load the dataset into a pandas DataFrame
    # - `header=None` ensures that the first row is not treated as column names
    # - `names=columns` assigns custom column names
    # - `na_values='?'` treats '?' as missing values (NaN)
    # - `sep=','` specifies that the CSV file is comma-separated
    df = pd.read_csv(diabetes_data, header=None, names=columns, na_values='?', sep=',')
    
    # Print the shape of the dataset (number of rows and columns)
    print('Dataset Shape:', df.shape)
    
    # Print the number of missing values in each column
    print('Initial Missing Values:', df.isnull().sum())
    
    return df  # Return the loaded DataFrame


# Load dataset
df = load_diabetes_data()  # Load the diabetes dataset

### The Responsible AI/ML Workflow

#### Step 1: Data Cleaning (Basic Quality Check)

In [None]:
# Replace zeros in specific columns with the median value of that column
# Some medical measurements should not be zero, so we treat zeros as missing values

for col in ['Glucose', 'BloodPressure', 'BMI', 'SkinThickness', 'Insulin']:
    # Replace 0s in the column with the median of the respective column
    # This helps handle invalid zero values in medical data
    df[col] = df[col].replace(0, df[col].median())  
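
One caveat with the loop above: `df[col].median()` is computed while the zeros are still in the column, so the fill value is pulled toward zero in columns where many readings are missing. A minimal sketch of an alternative, assuming the zeros really are missing readings, is to convert them to NaN first and then fill with the median of the remaining values. The helper name `impute_zeros_with_nonzero_median` and the toy data are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd


def impute_zeros_with_nonzero_median(frame, columns):
    """Return a copy where zeros in `columns` are replaced by the median of the non-zero values."""
    cleaned = frame.copy()
    for col in columns:
        # Mark zeros as missing so they do not influence the median
        as_missing = cleaned[col].replace(0, np.nan)
        # Series.median() skips NaN by default
        cleaned[col] = as_missing.fillna(as_missing.median())
    return cleaned


# Toy data (hypothetical values): three of five insulin readings are recorded as 0
toy = pd.DataFrame({"Insulin": [0, 0, 0, 100, 200], "Age": [21, 35, 29, 50, 62]})
cleaned_toy = impute_zeros_with_nonzero_median(toy, ["Insulin"])

print(toy["Insulin"].median())          # median with the zeros included
print(cleaned_toy["Insulin"].tolist())  # zeros filled with the non-zero median
```

In the notebook this could be called as `df = impute_zeros_with_nonzero_median(df, ['Glucose', 'BloodPressure', 'BMI', 'SkinThickness', 'Insulin'])`. Doing so changes the training data, so the accuracies quoted later would need to be re-run.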
  

#### Step 2: Define features and target

In [None]:
# Separate features (X) and target variable (y)

# Features: All columns except 'Outcome' (the target variable)
X = df.drop('Outcome', axis=1) 

# Target variable: 'Outcome' column (1 = diabetes, 0 = no diabetes)
y = df['Outcome']

# Split the dataset into training and testing sets
# 80% of the data is used for training, and 20% is used for testing
# The random_state ensures reproducibility of the split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)     

#### Step 3: Baseline Model (Transparency)

In [None]:
# Initialize and train a Random Forest classifier

# Create a Random Forest model with 100 decision trees (n_estimators=100)
# The random_state ensures reproducibility of results
model = RandomForestClassifier(n_estimators=100, random_state=42)

# Train (fit) the model using the training data
model.fit(X_train, y_train)  

# Make predictions on the test data
y_pred = model.predict(X_test)  

# Evaluate model performance by calculating accuracy
baseline_acc = accuracy_score(y_test, y_pred)  

# Print the baseline model accuracy
print("\nBaseline Model Accuracy:", baseline_acc)  

#### Step 4: Explainability with SHAP

In [None]:
# Convert test data to a NumPy array for SHAP analysis
# SHAP requires NumPy arrays rather than pandas DataFrames for some explainability functions
X_test_np = X_test.to_numpy()

# Initialize a SHAP explainer to interpret the model’s predictions
# Uses model.predict to generate explanations for predictions
# Converts training data to a NumPy array for SHAP compatibility
explainer = shap.Explainer(model.predict, X_train.to_numpy())  

# Compute SHAP values for the test dataset
shap_values = explainer(X_test_np)  

# Print status update for SHAP analysis
print("\nGenerating SHAP Summary Plot for Explainability...")

# Debugging information: Print the shape and type of SHAP values
print("X_test_np shape:", X_test_np.shape)  # Verify test data shape
print("shap_values type:", type(shap_values))  # Ensure correct SHAP object type
print("shap_values shape:", shap_values.shape)  # Confirm SHAP values' dimensions
print("Feature names:", list(X.columns))  # Display feature names for reference

# If shap_values is a SHAP Explanation object, extract the .values attribute
# This ensures that we have a NumPy array for further processing
shap_vals = shap_values.values if hasattr(shap_values, 'values') else shap_values

# Validate that the shape of SHAP values matches the shape of the test dataset
assert shap_vals.shape == X_test_np.shape, "SHAP values shape mismatch!"

# Generate a summary plot using SHAP values
# Displays feature importance using a bar plot
shap.summary_plot(shap_vals, X_test_np, feature_names=list(X.columns), plot_type="bar")

The chart indicates the relative importance of various features in influencing the model's predictions, with Glucose being the most impactful feature (mean SHAP value around 0.25), followed by BMI (around 0.15), and Age (around 0.12). Other features like DiabetesPedigreeFunction, Insulin, Pregnancies, BloodPressure, and SkinThickness have progressively smaller impacts, with SkinThickness having the least influence (mean SHAP value around 0.03).

This suggests that:
- **Glucose** levels are the strongest predictor in the model, likely indicating their critical role in determining outcomes such as diabetes risk or presence.
- **BMI** and **Age** are also significant contributors to the model’s predictions, but to a lesser extent than Glucose.
- Features like **SkinThickness**, **BloodPressure**, **Pregnancies**, **Insulin**, and **DiabetesPedigreeFunction** have minimal impact compared to the top three features, though they still contribute to the model’s output.
 
Overall, the model prioritizes metabolic and demographic factors (like glucose, BMI, and age) over other physiological measurements (like skin thickness or blood pressure) when making predictions. This hierarchy can help guide clinical or research focus on the most influential factors for diabetes prediction or management in the context of this specific model and dataset.
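
The bar heights in a SHAP bar summary plot are the mean absolute SHAP value per feature. As a rough sketch of that calculation, the snippet below ranks features from a SHAP matrix with NumPy. The function name `rank_features_by_mean_abs` and the toy matrix are illustrative; in this notebook the inputs would be `shap_vals` and `list(X.columns)`.

```python
import numpy as np


def rank_features_by_mean_abs(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value, largest first."""
    mean_abs = np.abs(shap_matrix).mean(axis=0)  # one value per feature (column)
    order = np.argsort(mean_abs)[::-1]           # indices from largest to smallest
    return [(feature_names[i], float(mean_abs[i])) for i in order]


# Toy SHAP matrix (hypothetical): 2 samples x 3 features
toy_shap = np.array([
    [0.30, -0.10, 0.02],
    [-0.20, 0.20, -0.04],
])
ranking = rank_features_by_mean_abs(toy_shap, ["Glucose", "BMI", "SkinThickness"])

for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

Taking the absolute value before averaging matters: positive and negative contributions would otherwise cancel out and hide a feature that moves predictions strongly in both directions.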

#### Step 5: Fairness Analysis (Bias Detection)

In [None]:
# Create a dataset with labels for fairness analysis

# Make a copy of the feature matrix (X) to avoid modifying the original dataset
df_with_labels = X.copy()

# Add the target variable ('Outcome') to the copied dataset for fairness evaluation
df_with_labels['Outcome'] = y

# Binarize the 'Age' feature: Convert it into a binary protected attribute
# 1 if Age >= 40 (privileged group), 0 otherwise (unprivileged group)
df_with_labels['Age'] = (df_with_labels['Age'] >= 40).astype(int)  

# Create a BinaryLabelDataset, a structured dataset format required by AIF360 for fairness analysis
# - 'df' contains the dataset
# - 'label_names' specifies the target variable
# - 'protected_attribute_names' defines the attributes under fairness evaluation
dataset = BinaryLabelDataset(df=df_with_labels, label_names=['Outcome'],
                             protected_attribute_names=['Age'])

# Define privileged and unprivileged groups based on the 'Age' feature
# Privileged group: Individuals aged 40 and above (Age = 1)
privileged_groups = [{'Age': 1}]
# Unprivileged group: Individuals younger than 40 (Age = 0)
unprivileged_groups = [{'Age': 0}]

# Compute fairness metrics using BinaryLabelDatasetMetric
metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged_groups,
                                  privileged_groups=privileged_groups)

# Print fairness metrics
print("\nFairness Metrics:")
# Disparate Impact: Measures if one group receives favorable outcomes at a significantly different rate
print("Disparate Impact:", metric.disparate_impact())
# Statistical Parity Difference: Measures the difference in favorable outcomes between groups
print("Statistical Parity Difference:", metric.statistical_parity_difference())

## Fairness Metrics (Pima Indians Diabetes Dataset - Age as Protected Attribute)

These metrics describe how the diabetes label (`Outcome`) is distributed across the two age groups (>= 40 vs. < 40) in the Pima Indians dataset. Because `BinaryLabelDatasetMetric` is computed on the dataset's labels, the values reflect base rates in the data, not the predictions of the trained model. The dataset is also built with AIF360's default `favorable_label` of 1, so a diabetes diagnosis (`Outcome = 1`) is the outcome counted as "favorable" here.

**1. Disparate Impact: 0.75 (Example Value)**

* **What it means (specifically):** In this context, a Disparate Impact of 0.75 means that individuals younger than 40 (the *unprivileged* group) are labeled as having diabetes at 75% of the rate of individuals 40 and older (the *privileged* group).

* **Implication:** Diabetes is recorded less often among younger individuals than among older individuals. A value closer to 1.0 would indicate greater parity. A model trained on this data may learn the same gap, so it is worth investigating whether the difference reflects genuine prevalence or bias in how the data was collected.

**2. Statistical Parity Difference: -0.15 (Example Value)**

* **What it means (specifically):** A Statistical Parity Difference of -0.15 indicates that the proportion of younger individuals (age < 40) labeled as having diabetes is 15 percentage points *lower* than the proportion of older individuals (age >= 40) labeled as having diabetes.

* **Implication:** This reinforces the finding from the Disparate Impact metric. It expresses the same gap as an absolute difference in rates rather than as a ratio.
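
Both metrics reduce to simple arithmetic on group-level rates. The sketch below reproduces the example values with pandas on a toy frame, assuming the same conventions as the metric code above: `Outcome = 1` is the favorable label and `Age = 1` is the privileged group. The helper `group_rate_metrics` and the toy counts are hypothetical.

```python
import pandas as pd


def group_rate_metrics(frame, label_col, group_col, favorable=1, privileged=1):
    """Compute disparate impact and statistical parity difference from label rates."""
    is_favorable = frame[label_col] == favorable
    in_privileged = frame[group_col] == privileged
    privileged_rate = is_favorable[in_privileged].mean()
    unprivileged_rate = is_favorable[~in_privileged].mean()
    return {
        "disparate_impact": unprivileged_rate / privileged_rate,
        "statistical_parity_difference": unprivileged_rate - privileged_rate,
    }


# Toy data: 12 of 20 older and 9 of 20 younger individuals have Outcome = 1
toy = pd.DataFrame({
    "Age": [1] * 20 + [0] * 20,
    "Outcome": [1] * 12 + [0] * 8 + [1] * 9 + [0] * 11,
})
metrics = group_rate_metrics(toy, label_col="Outcome", group_col="Age")
print(metrics)  # group rates are 0.60 and 0.45: ratio near 0.75, difference near -0.15
```

The same function can be pointed at a column of model predictions instead of `Outcome` to measure the model rather than the data.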

**Key Considerations for this Specific Scenario:**

* **Age Binarization:** Age has been simplified into two categories (>= 40 and < 40).  This might not capture the full complexity of age's relationship with diabetes risk.
* **Dataset Bias:** The Pima Indians dataset itself might contain biases that influence these metrics.  For example, if older individuals are overrepresented in the dataset or if the dataset reflects existing biases in diabetes screening practices, this could affect the results.
* **Causation vs. Correlation:** These metrics show an association, not necessarily a causal link. They don't prove that the data, or a model trained on it, is *unfair*, but they highlight a potential disparity that requires further scrutiny. The lower rate of diabetes labels among younger individuals might reflect a genuinely lower prevalence of diabetes in that age group.
* **Clinical Implications:**  If the model under-predicts diabetes in younger individuals, this could lead to delayed diagnosis and treatment, which has serious health consequences.

**Next Steps:**

* **Analyze the Data:** Examine the actual rates of diabetes diagnosis within the different age groups in the dataset.
* **Consider Other Metrics:** Use additional fairness metrics (like Equalized Odds Difference) to gain a more comprehensive understanding.
* **Investigate Potential Bias:** If bias is suspected, explore mitigation techniques, but do so carefully and thoughtfully in a medical context, consulting with domain experts.  Techniques like reweighting or adversarial debiasing might be considered.
* **Explainability:** Use methods like SHAP values to understand *why* the model makes the predictions it does. This can help identify potential sources of bias.
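
As one possible reading of the "additional metrics" step, the sketch below compares true positive and false positive rates between the two age groups using model predictions rather than dataset labels. It reports the two gaps separately and does not reproduce any particular library's aggregate definition of equalized odds. The function names and toy arrays are hypothetical; in this notebook the inputs could be `y_test`, `y_pred`, and `(X_test['Age'] >= 40).astype(int)`.

```python
import numpy as np


def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary 0/1 arrays."""
    tpr = y_pred[y_true == 1].mean()  # share of actual positives predicted positive
    fpr = y_pred[y_true == 0].mean()  # share of actual negatives predicted positive
    return float(tpr), float(fpr)


def odds_gaps(y_true, y_pred, group):
    """Unprivileged (group == 0) minus privileged (group == 1) TPR and FPR."""
    y_true, y_pred, group = (np.asarray(a) for a in (y_true, y_pred, group))
    tpr_u, fpr_u = tpr_fpr(y_true[group == 0], y_pred[group == 0])
    tpr_p, fpr_p = tpr_fpr(y_true[group == 1], y_pred[group == 1])
    return {"tpr_gap": tpr_u - tpr_p, "fpr_gap": fpr_u - fpr_p}


# Toy example: four older (group 1) and four younger (group 0) individuals
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_hat = np.array([1, 1, 1, 0, 1, 0, 0, 0])

gaps = odds_gaps(y_true, y_hat, group)
print(gaps)
```

A negative `tpr_gap` means the model misses more true diabetes cases in the younger group, which is the clinical risk described under "Clinical Implications" above.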

#### Step 6: Privacy-Preserving Model with Differential Privacy

In [None]:
# Initialize and train a logistic regression model with differential privacy

# Create a logistic regression model with differential privacy
# - `epsilon=1.0` controls the privacy level (lower values increase privacy but may reduce accuracy)
# - `random_state=42` ensures reproducibility
dp_model = LogisticRegression(epsilon=1.0, random_state=42)

# Train (fit) the differentially private model using the training data
dp_model.fit(X_train, y_train)

# Use the trained model to make predictions on the test dataset
dp_pred = dp_model.predict(X_test)

# Evaluate model performance by computing the accuracy score
dp_acc = accuracy_score(y_test, dp_pred)

# Print the accuracy of the differentially private model
print("\nDifferential Privacy Model Accuracy:", dp_acc)

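This means the differentially private model achieved about 31.8% accuracy.
- **Low Accuracy**: 31.8% is low even for a private model. Differential privacy usually costs some accuracy, but a binary classifier that scores below 50% would do better with its predictions inverted, so privacy noise is unlikely to be the only cause. Plausible contributors are the unscaled features and the missing `data_norm` argument, which Diffprivlib then infers from the data.
- **Epsilon's Impact**: Epsilon's value affects accuracy. Lower epsilon (stronger privacy) usually means lower accuracy.
- **Context Matters**: Whether a given accuracy is acceptable depends on the application. High privacy might be more important than high accuracy in some situations, but a model that performs worse than a trivial baseline is not useful in any of them.

The differentially private logistic regression model has a low accuracy (31.8%), which the privacy mechanism alone probably does not explain. Consider scaling the features, setting `data_norm` explicitly, adjusting epsilon, or trying other models to balance privacy and accuracy.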
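
To make the role of epsilon concrete, here is a minimal sketch of the Laplace mechanism applied to a bounded mean. This is a simpler mechanism than the one needed to train a private logistic regression and is not Diffprivlib's implementation; it only illustrates that the noise scale grows as epsilon shrinks. The function names, the clipping bounds, and the synthetic values are assumptions made for illustration.

```python
import numpy as np


def laplace_scale(lower, upper, n, epsilon):
    """Noise scale for the mean of n values clipped to [lower, upper]: sensitivity / epsilon."""
    sensitivity = (upper - lower) / n  # largest change one record can cause in the mean
    return sensitivity / epsilon


def private_mean(values, lower, upper, epsilon, rng):
    """Clip values to the stated bounds, then add Laplace noise to their mean."""
    clipped = np.clip(values, lower, upper)
    scale = laplace_scale(lower, upper, len(clipped), epsilon)
    return float(clipped.mean() + rng.laplace(loc=0.0, scale=scale))


rng = np.random.default_rng(0)
values = rng.uniform(70, 180, size=100)  # synthetic values, not the real dataset

for eps in (10.0, 1.0, 0.1):
    scale = laplace_scale(0, 200, len(values), eps)
    estimate = private_mean(values, 0, 200, eps, rng)
    print(f"epsilon={eps}: noise scale={scale:.2f}, private mean={estimate:.2f}")
```

The bounds are fixed in advance rather than read from the data, since bounds derived from the data would themselves reveal information about it.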

#### Step 7: Accountability - Simulate Bias Mitigation

In [None]:
# Re-weight training samples to mitigate bias based on age

# Initialize an array of weights with default value 1 for all training samples
weights = np.ones(len(y_train))  

# Increase the weight of samples where 'Age' is 40 or older to give them more influence
weights[X_train['Age'] >= 40] = 1.2  

# Train a reweighted Random Forest model
# - `n_estimators=100`: Uses 100 decision trees in the forest
# - `random_state=42`: Ensures reproducibility
model_reweighted = RandomForestClassifier(n_estimators=100, random_state=42)

# Fit the model using the reweighted training data
# - `sample_weight=weights` applies different importance to training samples based on age
model_reweighted.fit(X_train, y_train, sample_weight=weights)  

# Predict labels for the test set using the reweighted model
y_pred_reweighted = model_reweighted.predict(X_test)  

# Calculate the accuracy of the reweighted model
reweighted_acc = accuracy_score(y_test, y_pred_reweighted)  

# Print the accuracy after applying bias mitigation
print("Reweighted Model Accuracy (Bias Mitigated):", reweighted_acc)

The accuracy of 0.7597 for the reweighted model means that it correctly predicted the outcome (whether a patient has diabetes or not) for about 75.97% of the test samples.

- **Accuracy**: A standard performance metric showing how often the model's predictions are correct. Here, the reweighted model predicted the diabetes outcome correctly 75.97% of the time.
- **Bias Mitigation Effect**: The code gives samples with Age >= 40 a weight of 1.2 and all other samples a weight of 1.0, so it is the older group (treated as privileged in Step 5) that gains influence, not the younger one. Reweighting for bias mitigation normally increases the weight of underrepresented or disadvantaged groups, so this step is best read as a simulation of the mechanics. Whether it makes the model less biased toward either age group has not been measured here.

Accuracy alone does not show whether the disparity between age groups changed. The 75.97% figure is the model's performance after reweighting; it should be compared with the baseline model on the same test set, together with fairness metrics computed for both models.

Next Steps:
- To assess the effectiveness of the bias mitigation, compare the reweighted model with the baseline (non-reweighted) model on both accuracy and fairness metrics. A lower accuracy on its own does not show that bias was reduced; it only shows what the reweighting cost in overall performance.
- If the model performs similarly to the baseline while showing less disparity across age groups (as confirmed by fairness metrics like statistical parity difference or disparate impact), then you can conclude that bias mitigation has been successful without severely harming predictive accuracy.
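
The fixed 1.2 weight used in this step is an ad hoc choice. A more principled option is the reweighing scheme of Kamiran and Calders, which is also the idea behind AIF360's `Reweighing` preprocessor: each (group, label) combination gets the weight `P(group) * P(label) / P(group, label)`, so that group and label look statistically independent in the weighted data. The sketch below is a hand-rolled pandas version under that assumption, not AIF360's code; `reweighing_weights` and the toy frame are hypothetical.

```python
import pandas as pd


def reweighing_weights(frame, group_col, label_col):
    """Per-row weight: P(group) * P(label) / P(group, label)."""
    p_group = frame[group_col].map(frame[group_col].value_counts(normalize=True))
    p_label = frame[label_col].map(frame[label_col].value_counts(normalize=True))
    # Number of rows sharing each row's (group, label) combination
    joint_size = frame.groupby([group_col, label_col])[label_col].transform("size")
    p_joint = joint_size / len(frame)
    return p_group * p_label / p_joint


# Toy data: Age is already binarized (1 = 40 or older, 0 = younger)
toy = pd.DataFrame({
    "Age": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "Outcome": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
})
toy["weight"] = reweighing_weights(toy, "Age", "Outcome")

# Weighted rate of Outcome = 1 within each age group
weighted_positive = (toy["weight"] * toy["Outcome"]).groupby(toy["Age"]).sum()
weighted_total = toy["weight"].groupby(toy["Age"]).sum()
weighted_rates = weighted_positive / weighted_total
print(weighted_rates)  # both groups land at approximately the overall rate of 0.4
```

Weights computed this way on the training rows could be passed to `fit` through `sample_weight`, in place of the hand-set values.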

#### Visualize model comparison

In [None]:
# Plot a bar chart comparing the accuracy of baseline, differential privacy, and bias-mitigated models

plt.bar(
    ['Baseline', 'Differential Privacy', 'Bias Mitigated'],  # X-axis labels for each model
    [baseline_acc, dp_acc, reweighted_acc]  # Corresponding accuracy values
)

# Set the y-axis limits to range from 0 to 1 (accuracy is a proportion between 0 and 1)
plt.ylim(0, 1)

# Label the y-axis to indicate what is being measured
plt.ylabel('Accuracy')

# Add a title to describe what the chart represents
plt.title('Model Performance Across Responsible AI Approaches')

# Display the bar chart
plt.show()

### Findings from the Diagram

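#### 1. Fairness Metrics (Dataset Labels)
- **Disparate Impact**: 0.5466
  - Indicates a large gap: the unprivileged group (Age < 40) receives the outcome AIF360 treats as favorable (`Outcome = 1`, a diabetes diagnosis) at only 54.66% of the rate of the privileged group (Age >= 40).
- **Statistical Parity Difference**: -0.2365
  - Shows that the unprivileged group's rate of that outcome is 23.65 percentage points lower than the privileged group's. Because these values come from the dataset's labels rather than the baseline model's predictions, they describe the data and do not by themselves establish that the model is unfair.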

#### 2. Model Accuracies Across Responsible AI Approaches
- **Baseline Model**:
  - Accuracy: ~80% (highest among the three).
  - Trade-off: High accuracy, but the fairness of its predictions was not measured; the training data shows a large gap in label rates between age groups.
- **Differential Privacy Model**:
  - Accuracy: 31.82% (lowest among the three).
  - Trade-off: Poor predictive performance in exchange for training with a privacy budget of ε = 1.0. Its effect on fairness is unknown (fairness metrics not provided).
- **Reweighted (Bias Mitigated) Model**:
  - Accuracy: 75.97% (moderate, slightly lower than baseline).
  - Trade-off: Slightly lower accuracy than the baseline. Whether fairness improved is unconfirmed, since no fairness metrics were computed for its predictions.

#### 3. Visual Insights from the Bar Chart
- The chart shows model performance (accuracy) across three approaches:
  - Baseline: Highest accuracy (~0.8 or 80%).
  - Differential Privacy: Lowest accuracy (~0.3 or 30%).
  - Bias Mitigated (Reweighted): Moderate accuracy (~0.76, or 75.97%).
- This highlights a trade-off between accuracy, fairness, and privacy:
  - The baseline prioritizes accuracy and makes no attempt to address fairness or privacy.
  - Differential privacy prioritizes privacy but, as configured here, sacrifices most of the accuracy.
  - The reweighted model keeps accuracy close to the baseline while attempting to mitigate bias.

#### 4. Overall Interpretation
- The baseline model performs best in terms of accuracy, but the data it was trained on shows a large gap in outcome rates between age groups (Age >= 40 vs. < 40), which the model may reproduce.
- The reweighted model is intended to reduce this gap (fairness metrics on its predictions are needed for confirmation) at the cost of a modest accuracy drop (75.97% vs. ~80%).
- The differential privacy model severely reduces accuracy (31.82%) for privacy benefits, with unclear fairness impact.
- The choice of approach depends on the priorities: accuracy, fairness, or privacy.
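
Accuracy figures like these are easier to judge against a trivial reference point. The sketch below, using only the standard library, computes the accuracy of always predicting the most common class. The function name and toy labels are hypothetical; in this notebook the input would be `list(y_test)`. For binary labels, any accuracy below 0.5 also implies that inverting the predictions would score above 0.5.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)


# Toy labels (hypothetical split): 65 negatives and 35 positives
toy_labels = [0] * 65 + [1] * 35
baseline = majority_class_accuracy(toy_labels)
print(baseline)

# Accuracy after flipping every prediction of a binary classifier that scores 0.3182
flipped = 1 - 0.3182
print(round(flipped, 4))
```

A model that cannot beat the majority-class figure on the real test labels adds no predictive value, whatever its privacy or fairness properties.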

## Conclusion
In this tutorial, we explored a comprehensive Responsible AI workflow:
- Achieved model transparency using SHAP.
- Evaluated fairness and bias with AIF360.
- Implemented differential privacy with Diffprivlib.
- Simulated bias mitigation through sample reweighting as a step toward model accountability.
- Showed that responsible AI practices can carry an accuracy cost: modest for reweighting, large for the differentially private model as configured here.

By incorporating ethical AI principles, we can build more reliable and fair AI systems that are suitable for real-world deployment, especially in sensitive domains like healthcare.

## Clean up

Remember to shut down your Jupyter Notebook environment and delete any unnecessary files or resources once you've completed the tutorial.
