Fairlearn provides tools for measuring and mitigating bias in machine learning models. This step-by-step example shows how to compute Fairlearn’s group fairness metrics for a simple classifier on a sample dataset.

### Setup and Dependencies

First, ensure you have the necessary packages installed. You can install Fairlearn and other required libraries using pip:

```bash
pip install fairlearn scikit-learn pandas matplotlib
```

### Demo Code: Fairlearn Bias Metrics

This example uses the Adult Income dataset to evaluate fairness metrics for a simple classifier. The dataset is often used for fairness benchmarking and includes features such as `age`, `sex`, and `occupation`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate, false_positive_rate, false_negative_rate
from fairlearn.datasets import fetch_adult

# Load dataset
data = fetch_adult(as_frame=True)
X = data.data
y = data.target

# Convert target to binary (0 = <=50K, 1 = >50K)
y = (y == '>50K').astype(int)

# Define protected attribute before encoding (the gender column is named 'sex')
sensitive = X['sex']

# One-hot encode the categorical columns so the classifier receives numeric input
X_encoded = pd.get_dummies(X)

# Split features, target, and protected attribute together so the rows stay aligned
X_train, X_test, y_train, y_test, sensitive_train, sensitive_test = train_test_split(
    X_encoded, y, sensitive, test_size=0.3, random_state=42
)

# Train a classifier
classifier = RandomForestClassifier(random_state=42)
classifier.fit(X_train, y_train)

# Make predictions
y_pred = classifier.predict(X_test)

# Create a MetricFrame for fairness metrics, grouped by the protected attribute
metric_frame = MetricFrame(
    metrics={
        'selection_rate': selection_rate,
        'false_positive_rate': false_positive_rate,
        'false_negative_rate': false_negative_rate
    },
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_test
)

# Print fairness metrics
print("Fairness Metrics:")
print(metric_frame.by_group)

# Overall accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Overall Accuracy: {accuracy:.2f}")
```

### Explanation

1. **Loading the Dataset:**
   - We use `fetch_adult` from Fairlearn to get the Adult Income dataset.

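2. **Preprocessing:**
   - Convert the target to a binary format for simplicity.
   - Categorical features must be one-hot encoded (for example with `pd.get_dummies`) before training, because scikit-learn's random forest requires numeric input.
   - Split the data into training and testing sets.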

3. **Model Training:**
   - Train a `RandomForestClassifier` on the training set.

4. **Make Predictions:**
   - Use the trained model to predict outcomes on the test set.

5. **Fairness Metrics:**
   - Define a protected attribute (the dataset stores gender in the `sex` column).
   - Use `MetricFrame` from Fairlearn to compute fairness metrics like selection rate, false positive rate, and false negative rate by group.
   - Print the results to assess fairness across different groups.
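
To make the three metrics concrete, the sketch below computes them by hand with pandas on a tiny made-up sample. The values and the group names `A` and `B` are illustrative assumptions, not taken from the Adult dataset, and `group_rates` is a hypothetical helper. Selection rate is the fraction of samples predicted positive, false positive rate is the fraction of true negatives predicted positive, and false negative rate is the fraction of true positives predicted negative.

```python
import pandas as pd

# Toy labels and predictions (illustrative values only)
toy = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 0, 1],
    "y_pred": [1, 1, 0, 0, 0, 0, 0, 1],
})

def group_rates(df):
    """Return selection rate, FPR, and FNR for the rows of one group."""
    negatives = df[df["y_true"] == 0]
    positives = df[df["y_true"] == 1]
    return pd.Series({
        "selection_rate": (df["y_pred"] == 1).mean(),
        "false_positive_rate": (negatives["y_pred"] == 1).mean(),
        "false_negative_rate": (positives["y_pred"] == 0).mean(),
    })

# One row per group, one column per metric
rates = pd.DataFrame({name: group_rates(rows) for name, rows in toy.groupby("group")}).T
print(rates)
```

With these toy values, group `A` has a selection rate of 0.5 and group `B` has 0.25, which is the kind of gap the per-group table is meant to surface.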

### Running the Code

Save the code in a Python script or Jupyter notebook, and run it to see the output. The `MetricFrame` will show you how the model’s predictions are distributed among different groups defined by the protected attribute.

### Conclusion

This demo provides a simple way to evaluate fairness metrics using Fairlearn. You can extend this example to use other datasets, different models, and additional fairness metrics as needed. For more advanced usage and customization, refer to the [Fairlearn documentation](https://fairlearn.org/) for comprehensive guidelines and examples.

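Fitting `RandomForestClassifier` directly on the raw Adult features fails with `ValueError: could not convert string to float: 'Private'`. The dataset contains string-valued categorical columns (for example `workclass`, where `'Private'` is one of the categories), and scikit-learn's random forest requires numeric input. Encode those columns, for example with `pd.get_dummies`, before calling `fit`.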
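
One way to avoid this error is to put the encoding inside a scikit-learn `Pipeline`, so that string columns are one-hot encoded automatically at fit and predict time. This is a minimal sketch on a made-up two-column frame (the data, `X_toy`, and `y_toy` are illustrative assumptions), not the only way to preprocess the Adult dataset:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy frame with one numeric and one string column (illustrative values only)
X_toy = pd.DataFrame({
    "age": [25, 40, 33, 51, 29, 46],
    "workclass": ["Private", "State-gov", "Private", "Self-emp", "Private", "State-gov"],
})
y_toy = [0, 1, 0, 1, 0, 1]

# One-hot encode every non-numeric column; pass numeric columns through unchanged
preprocess = ColumnTransformer(
    transformers=[
        (
            "categorical",
            OneHotEncoder(handle_unknown="ignore"),
            make_column_selector(dtype_exclude=np.number),
        ),
    ],
    remainder="passthrough",
)

model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=42)),
])

# Fitting succeeds because the strings are encoded inside the pipeline
model.fit(X_toy, y_toy)
predictions = model.predict(X_toy)
print(predictions)
```

Because the encoder is fitted only on the data passed to `fit`, this approach also keeps the category vocabulary learned from the training split, and `handle_unknown="ignore"` lets prediction proceed when a test row contains a category not seen during training.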