# Bias Metric Example

This notebook demonstrates how to use the **Bias** metric from Fair Forge to detect and analyze bias in AI assistant responses across protected attributes like gender, race, religion, nationality, and sexual orientation.

## Installation

First, install Fair Forge and the required dependencies.

In [None]:
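import sys

# Install the locally built wheel, with its "bias" extra, into the environment of the running kernel
!uv pip install --python {sys.executable} --force-reinstall "$(ls ../../dist/*.whl)[bias]" -q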

## Setup

Import the required modules and configure your credentials.

In [None]:
import os
import sys

# Add examples directory to path for helpers import
sys.path.insert(0, os.path.dirname(os.getcwd()))

from fair_forge.metrics.bias import Bias
from fair_forge.guardians import IBMGranite
from fair_forge.guardians.llms.providers import OpenAIGuardianProvider
from fair_forge.schemas.bias import GuardianLLMConfig
from helpers.retriever import LocalRetriever

In [None]:
# Configure your Guardian model credentials
# The Guardian uses an LLM endpoint (e.g., IBM Granite or LLamaGuard) for bias detection
GUARDIAN_URL = os.environ.get("GUARDIAN_URL", "https://your-guardian-endpoint")
GUARDIAN_MODEL_NAME = os.environ.get("GUARDIAN_MODEL_NAME", "ibm-granite/granite-guardian-3.1-8b")
GUARDIAN_API_KEY = os.environ.get("GUARDIAN_API_KEY", "your-api-key-here")
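
Because the cell above falls back to placeholder strings, a misconfigured environment only surfaces later as a failed request. The sketch below shows one way to catch that early. The helper name `find_unconfigured` and the `PLACEHOLDERS` table are illustrative and not part of Fair Forge.

```python
import os

# Placeholder defaults copied from the configuration cell above. A setting that
# still equals its placeholder has not been configured.
PLACEHOLDERS = {
    "GUARDIAN_URL": "https://your-guardian-endpoint",
    "GUARDIAN_API_KEY": "your-api-key-here",
}


def find_unconfigured(env=None):
    """Return the names of settings that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    return [
        name
        for name, placeholder in PLACEHOLDERS.items()
        if env.get(name, placeholder) in ("", placeholder)
    ]


missing = find_unconfigured()
if missing:
    print(f"Warning: set these before running the metric: {', '.join(missing)}")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.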

## Configure the Guardian

The Bias metric uses a Guardian model to detect potential bias in responses. Fair Forge supports:
- **IBMGranite**: Uses IBM's Granite Guardian model
- **LLamaGuard**: Uses Meta's Llama Guard model

In [None]:
guardian_config = GuardianLLMConfig(
    model=GUARDIAN_MODEL_NAME,
    api_key=GUARDIAN_API_KEY,
    url=GUARDIAN_URL,
    temperature=0.5,
    provider=OpenAIGuardianProvider,
    logprobs=True,
)
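
The configuration requests `logprobs=True`, and the interactions inspected later carry a `certainty` value. This notebook does not show how Fair Forge derives that value. One common approach with guardian-style classifiers is to normalize the log-probabilities of the "yes" and "no" answer tokens into a single probability. The sketch below illustrates that idea only. The function name and its two-token input are assumptions, not the library's API.

```python
import math


def certainty_from_logprobs(logprob_yes, logprob_no):
    """Normalize two token log-probabilities into P(yes).

    Illustrative only: the token pair and the normalization are assumptions,
    not Fair Forge's documented behavior.
    """
    # Subtract the larger value before exponentiating for numerical stability.
    largest = max(logprob_yes, logprob_no)
    weight_yes = math.exp(logprob_yes - largest)
    weight_no = math.exp(logprob_no - largest)
    return weight_yes / (weight_yes + weight_no)


# Example: a model that puts 80% of its mass on "yes" and 20% on "no".
print(round(certainty_from_logprobs(math.log(0.8), math.log(0.2)), 3))
```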

## Run the Bias Metric

The Bias metric analyzes each Q&A interaction for potential bias across protected attributes:
- Gender
- Race
- Religion
- Nationality
- Sexual orientation

In [None]:
metrics = Bias.run(
    LocalRetriever,
    guardian=IBMGranite,
    config=guardian_config,
    confidence_level=0.80,
    verbose=True,
)

## Analyze Results

Each `BiasMetric` in the returned list contains:
- `confidence_intervals`: Statistical confidence intervals for each protected attribute
- `guardian_interactions`: Detailed bias assessments per Q&A interaction
- `cluster_profiling`: Clustering analysis of biased content
- `assistant_space`: Embedding space analysis
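
To build intuition for the `confidence_intervals` entries, the sketch below computes a Wilson score interval for a proportion of `k_success` out of `samples` at a given `confidence_level`. The Wilson method is an assumption chosen for illustration. This notebook does not state which interval Fair Forge uses, so the numbers need not match the library's output.

```python
import math
from statistics import NormalDist


def wilson_interval(k_success, samples, confidence_level=0.80):
    """Return (lower, upper) Wilson score bounds for a binomial proportion.

    Illustrative only: Fair Forge may use a different interval method.
    """
    if samples <= 0:
        raise ValueError("samples must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    p = k_success / samples
    denom = 1 + z**2 / samples
    center = (p + z**2 / (2 * samples)) / denom
    half_width = z * math.sqrt(p * (1 - p) / samples + z**2 / (4 * samples**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


lower, upper = wilson_interval(k_success=8, samples=10, confidence_level=0.80)
print(f"[{lower:.3f}, {upper:.3f}]")
```

A higher `confidence_level` widens the interval, and more samples narrow it. Keep this in mind when comparing attributes that have very different sample counts.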

In [None]:
print(f"Total metrics generated: {len(metrics)}\n")

for metric in metrics:
    print(f"Session: {metric.session_id}")
    print(f"Assistant: {metric.assistant_id}")
    print("\nConfidence Intervals by Protected Attribute:")
    for ci in metric.confidence_intervals:
        print(f"  - {ci.protected_attribute}: [{ci.lower_bound:.3f}, {ci.upper_bound:.3f}]")
        print(f"    Probability: {ci.probability:.3f}, Samples: {ci.samples}")
    print("-" * 50)

## Inspect Guardian Interactions

View detailed bias assessments for each Q&A interaction.

In [None]:
for metric in metrics:
    print(f"\nBias Interactions for {metric.assistant_id}:")
    for attribute, interactions in metric.guardian_interactions.items():
        biased_count = sum(1 for i in interactions if i.is_biased)
        print(f"\n  {attribute}: {biased_count}/{len(interactions)} flagged as biased")
        for interaction in interactions:
            if interaction.is_biased:
                print(f"    - QA {interaction.qa_id}: certainty={interaction.certainty:.3f}")
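
When many interactions are flagged, it can help to review the highest-certainty cases first. The sketch below ranks flagged interactions across all attributes. It runs on a hypothetical `Interaction` stand-in that mirrors only the three fields used above (`qa_id`, `is_biased`, `certainty`). The helper `top_flagged` is illustrative and not part of Fair Forge.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """Hypothetical stand-in exposing the fields read in the cell above."""
    qa_id: str
    is_biased: bool
    certainty: float


def top_flagged(guardian_interactions, limit=3):
    """Return up to `limit` (attribute, interaction) pairs, highest certainty first."""
    flagged = [
        (attribute, interaction)
        for attribute, interactions in guardian_interactions.items()
        for interaction in interactions
        if interaction.is_biased
    ]
    flagged.sort(key=lambda pair: pair[1].certainty, reverse=True)
    return flagged[:limit]


# Made-up sample data, shaped like metric.guardian_interactions.
sample = {
    "gender": [Interaction("qa-1", True, 0.62), Interaction("qa-2", False, 0.91)],
    "religion": [Interaction("qa-3", True, 0.88)],
}
for attribute, interaction in top_flagged(sample):
    print(f"{attribute}: QA {interaction.qa_id} (certainty={interaction.certainty:.3f})")
```

With real results, `top_flagged(metric.guardian_interactions)` should work unchanged as long as the interaction objects expose the same three attributes.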

## Export Results

Export the results to JSON for further analysis.

In [None]:
import json

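# Custom serializer: flatten the nested result objects into plain dicts and lists
def serialize_metric(metric):
    """Convert one bias metric into a JSON-serializable dict."""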
    data = {
        "session_id": metric.session_id,
        "assistant_id": metric.assistant_id,
        "confidence_intervals": [
            {
                "protected_attribute": ci.protected_attribute,
                "lower_bound": ci.lower_bound,
                "upper_bound": ci.upper_bound,
                "probability": ci.probability,
                "samples": ci.samples,
                "k_success": ci.k_success,
                "confidence_level": ci.confidence_level,
            }
            for ci in metric.confidence_intervals
        ],
        "guardian_interactions": {
            attr: [{"qa_id": i.qa_id, "is_biased": i.is_biased, "certainty": i.certainty} for i in interactions]
            for attr, interactions in metric.guardian_interactions.items()
        },
    }
    return data

results = [serialize_metric(m) for m in metrics]

with open("bias_results.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

print("Results exported to bias_results.json")
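
As a starting point for further analysis, the exported file can be loaded back and reduced to a flagged rate per assistant and attribute. The sketch below assumes the JSON layout written by `serialize_metric` above. The helper `flagged_rates` is illustrative and not part of Fair Forge.

```python
import json
from pathlib import Path


def flagged_rates(results):
    """Map (assistant_id, attribute) to the fraction of interactions flagged as biased."""
    rates = {}
    for record in results:
        for attribute, interactions in record["guardian_interactions"].items():
            if not interactions:
                continue  # avoid dividing by zero for attributes with no data
            flagged = sum(1 for item in interactions if item["is_biased"])
            rates[(record["assistant_id"], attribute)] = flagged / len(interactions)
    return rates


path = Path("bias_results.json")
if path.exists():
    exported = json.loads(path.read_text(encoding="utf-8"))
    for (assistant_id, attribute), rate in sorted(flagged_rates(exported).items()):
        print(f"{assistant_id} / {attribute}: {rate:.1%} flagged")
```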