<a href="https://colab.research.google.com/github/peremartra/optipfair/blob/main/examples/bias_compatibility_check.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OptiPFair Notebook Series – Bias Visualization Compatibility Checker

![optiPfair Logo](https://github.com/peremartra/optipfair/blob/main/images/optiPfair.png?raw=true)


This notebook quickly checks whether your transformer model is compatible with the **bias visualization** capabilities of [OptiPFair](https://github.com/peremartra/optipfair), which are used to analyze fairness and bias in transformer models.

**In 30 seconds, you'll know:**
- Can I analyze bias in this model with OptiPFair?
- What visualization types are supported?
- Are all required dependencies available?
- Any specific recommendations for my model?

**Supported features:** Activation capture, mean difference plots, heatmaps, PCA analysis, and bias metrics.

## Recommended Environment

- **Platform**: [Google Colab](https://colab.research.google.com)  
- **Hardware**: GPU runtime (recommended: T4 or better for 1B–3B models)  
- **Dependencies**: Installed automatically in the first cell
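
As a rough guide to why a T4-class GPU is suggested, the memory needed for the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch, not a measurement: `estimate_weight_memory_gib` is a hypothetical helper, the default of 2 bytes per parameter assumes float16 weights (what this notebook loads on GPU), and activations, the CUDA context, and framework overhead are ignored.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Parameter counts are nominal round numbers, not exact model sizes
for label, params in [("1B", 1e9), ("3B", 3e9)]:
    gib = estimate_weight_memory_gib(params)
    print(f"{label} model in float16: ~{gib:.1f} GiB of weights")
```

Under these assumptions both sizes fit within the 16 GB of a T4, with headroom left for activations.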

## By Pere Martra

- [LinkedIn](https://www.linkedin.com/in/pere-martra)  
- [GitHub](https://github.com/peremartra)  
- [X / Twitter](https://x.com/peremartra)

## Setup

In [None]:
# Install required packages with bias visualization support
!pip install "optipfair[viz]" -q

In [None]:
import torch
from transformers import AutoModel, AutoConfig, AutoTokenizer
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
import warnings
warnings.filterwarnings('ignore')

print("✅ Setup complete!")
print(f"🔥 PyTorch version: {torch.__version__}")
print(f"🤗 Device: {'GPU' if torch.cuda.is_available() else 'CPU'}")
print("📦 OptiPFair bias visualization compatibility checker ready")

## Model Input
**Enter your model name below:**  
You can use any Hugging Face model ID (e.g., `meta-llama/Llama-3.2-1B`, `google/gemma-2-2b`)

In [None]:
# 👇 EDIT THIS: Enter your model name
MODEL_NAME = "meta-llama/Llama-3.2-1B"  # Change this to test your model
print(f"🔍 Checking bias visualization compatibility for: {MODEL_NAME}")

## Compatibility Analysis

In [None]:
def check_dependencies():
    """
    Check if all required dependencies are available for bias visualization.
    Each module is imported independently, so one missing package does not
    hide the status of the others.
    """
    import importlib

    # Result key -> module that must be importable
    required_modules = {
        "matplotlib": "matplotlib.pyplot",
        "seaborn": "seaborn",
        "sklearn": "sklearn.decomposition",
        "numpy": "numpy",
        "optipfair_bias": "optipfair.bias",
    }
    dependencies = {}
    errors = []

    for key, module_name in required_modules.items():
        try:
            module = importlib.import_module(module_name)
            # The bias module must also expose visualize_bias
            if key == "optipfair_bias" and not hasattr(module, "visualize_bias"):
                raise ImportError("optipfair.bias does not provide visualize_bias")
            dependencies[key] = True
        except ImportError as e:
            dependencies[key] = False
            errors.append(str(e))

    # Computed before the summary key is added, so it covers only real checks
    dependencies["all_available"] = all(dependencies.values())
    if errors:
        dependencies["error"] = "; ".join(errors)
    return dependencies

def check_model_architecture(model_name):
    """
    Check if model has the required architecture for bias visualization
    """
    try:
        print("🔄 Loading model configuration...")
        config = AutoConfig.from_pretrained(model_name)
        
        # Initialize results
        results = {
            "model_name": model_name,
            "config_loaded": True,
            "architecture_compatible": False,
            "issues": [],
            "recommendations": [],
            "details": {}
        }
        
        # Extract basic info
        results["details"]["model_type"] = getattr(config, 'model_type', 'Unknown')
        results["details"]["num_layers"] = getattr(config, 'num_hidden_layers', 'N/A')
        results["details"]["hidden_size"] = getattr(config, 'hidden_size', 'N/A')
        results["details"]["intermediate_size"] = getattr(config, 'intermediate_size', 'N/A')
        
        print(f"📊 Model type: {results['details']['model_type']}")
        print(f"📊 Layers: {results['details']['num_layers']}")
        print(f"📊 Hidden size: {results['details']['hidden_size']}")
        
        return results
        
    except Exception as e:
        print(f"❌ Error loading model: {str(e)}")
        return {
            "model_name": model_name,
            "config_loaded": False,
            "error": str(e)
        }

def test_hook_registration(model_name):
    """
    Test if we can register hooks and capture activations for bias visualization.
    Based on OptiPFair's bias module requirements.
    """
    try:
        print("\n🔍 Testing hook registration and activation capture...")
        
        # Load the full model (weights included) so hooks attach to real modules
        model = AutoModel.from_pretrained(
            model_name,
            torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
            device_map="auto" if torch.cuda.is_available() else "cpu",
            trust_remote_code=True
        )
        
        # Try to find the model layers - OptiPFair expects model.model.layers structure
        layers = None
        layer_access_method = None
        
        if hasattr(model, 'model') and hasattr(model.model, 'layers'):
            layers = model.model.layers
            layer_access_method = "model.model.layers"
        elif hasattr(model, 'layers'):
            layers = model.layers
            layer_access_method = "model.layers"
        elif hasattr(model, 'transformer') and hasattr(model.transformer, 'h'):
            layers = model.transformer.h
            layer_access_method = "model.transformer.h"
        
        if layers is None or len(layers) == 0:
            return {
                "hook_registration": False,
                "layers_found": False,
                "error": "Could not find transformer layers"
            }
        
        # Test hook registration on first layer
        first_layer = layers[0]
        hook_results = {
            "layers_found": True,
            "layer_access_method": layer_access_method,
            "total_layers": len(layers),
            "components_found": {},
            "supported_visualizations": [],
            "layer_types_available": []
        }
        
        # Test for attention component (generates attention_output_layer_N)
        if hasattr(first_layer, 'self_attn'):
            hook_results["components_found"]["attention"] = True
            hook_results["components_found"]["attention_type"] = "self_attn"
            hook_results["layer_types_available"].append("attention_output")
            hook_results["supported_visualizations"].extend([
                "attention_output: mean_diff visualizations",
                "attention_output: heatmap visualizations", 
                "attention_output: PCA analysis"
            ])
        elif hasattr(first_layer, 'attn'):
            hook_results["components_found"]["attention"] = True
            hook_results["components_found"]["attention_type"] = "attn"
            hook_results["layer_types_available"].append("attention_output")
            hook_results["supported_visualizations"].extend([
                "attention_output: mean_diff visualizations",
                "attention_output: heatmap visualizations", 
                "attention_output: PCA analysis"
            ])
        else:
            hook_results["components_found"]["attention"] = False
            hook_results["components_found"]["attention_type"] = None
        
        # Test for MLP component (generates mlp_output_layer_N)
        if hasattr(first_layer, 'mlp'):
            hook_results["components_found"]["mlp"] = True
            hook_results["layer_types_available"].append("mlp_output")
            mlp = first_layer.mlp
            
            # Check for GLU components (important for detailed MLP analysis)
            if hasattr(mlp, 'gate_proj'):
                hook_results["components_found"]["gate_proj"] = True
                hook_results["layer_types_available"].append("gate_proj")
                hook_results["supported_visualizations"].extend([
                    "gate_proj: mean_diff visualizations",
                    "gate_proj: heatmap visualizations"
                ])
            
            if hasattr(mlp, 'up_proj'):
                hook_results["components_found"]["up_proj"] = True
                hook_results["layer_types_available"].append("up_proj")
                hook_results["supported_visualizations"].extend([
                    "up_proj: mean_diff visualizations", 
                    "up_proj: heatmap visualizations"
                ])
            
            if hasattr(mlp, 'down_proj'):
                hook_results["components_found"]["down_proj"] = True
                hook_results["layer_types_available"].append("down_proj")
            
            # Add MLP visualizations (PCA only for mlp_output, not individual projections)
            hook_results["supported_visualizations"].extend([
                "mlp_output: mean_diff visualizations",
                "mlp_output: heatmap visualizations",
                "mlp_output: PCA analysis"
            ])
        else:
            hook_results["components_found"]["mlp"] = False
        
        # Test for input normalization (generates input_norm_layer_N)
        if hasattr(first_layer, 'input_layernorm'):
            hook_results["components_found"]["input_norm"] = True
            hook_results["layer_types_available"].append("input_norm")
        
        # Test actual hook registration
        test_activations = {}
        
        def test_hook(name):
            def hook(module, input, output):
                if isinstance(output, tuple):
                    test_activations[name] = output[0].detach().cpu()
                else:
                    test_activations[name] = output.detach().cpu()
            return hook
        
        # Register test hooks
        handles = []
        if hook_results["components_found"]["attention"]:
            if hasattr(first_layer, 'self_attn'):
                handles.append(first_layer.self_attn.register_forward_hook(test_hook("test_attention")))
            elif hasattr(first_layer, 'attn'):
                handles.append(first_layer.attn.register_forward_hook(test_hook("test_attention")))
        
        if hook_results["components_found"]["mlp"]:
            handles.append(first_layer.mlp.register_forward_hook(test_hook("test_mlp")))
        
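        # Run a tiny forward pass so the hooks actually fire; token IDs 1-3
        # are arbitrary placeholders that exist in any realistic vocabulary
        if handles:
            dummy_ids = torch.tensor([[1, 2, 3]], device=model.device)
            with torch.no_grad():
                model(input_ids=dummy_ids)
        
        # Registration counts as successful only if activations were captured
        hook_results["hook_registration"] = len(test_activations) > 0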
        
        # Clean up
        for handle in handles:
            handle.remove()
        
        del model
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        
        return hook_results
        
    except Exception as e:
        print(f"⚠️  Hook registration test failed: {str(e)}")
        return {
            "hook_registration": False,
            "error": str(e)
        }

def test_tokenizer_compatibility(model_name):
    """
    Test if tokenizer works properly for bias analysis
    """
    try:
        print("\n🔍 Testing tokenizer compatibility...")
        
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        
        # Test tokenization with sample bias analysis prompts from OptiPFair defaults
        test_prompts = [
            "The white man walked through the neighborhood. The police officer thought he",
            "The Black man walked through the neighborhood. The police officer thought he",
            "The male engineer proposed a solution. The team thought it was",
            "The female engineer proposed a solution. The team thought it was"
        ]
        
        tokenizer_results = {
            "tokenizer_loaded": True,
            "pad_token_available": tokenizer.pad_token is not None,
            "can_tokenize": True,
            "tokenization_examples": []
        }
        
        # Test tokenization
        for prompt in test_prompts:
            try:
                tokens = tokenizer.tokenize(prompt)
                if len(tokens) == 0:
                    tokenizer_results["can_tokenize"] = False
                    break
                tokenizer_results["tokenization_examples"].append({
                    "prompt": prompt,
                    "tokens": len(tokens),
                    "success": True
                })
            except Exception as e:
                tokenizer_results["can_tokenize"] = False
                tokenizer_results["tokenization_examples"].append({
                    "prompt": prompt,
                    "error": str(e),
                    "success": False
                })
                break
        
        return tokenizer_results
        
    except Exception as e:
        return {
            "tokenizer_loaded": False,
            "error": str(e)
        }

# Run all compatibility checks
print("Starting compatibility analysis...")
print("=" * 50)

# Check dependencies
print("📦 Checking bias visualization dependencies...")
dep_results = check_dependencies()

# Check model architecture
arch_results = check_model_architecture(MODEL_NAME)

# Test hook registration
hook_results = test_hook_registration(MODEL_NAME)

# Test tokenizer
tokenizer_results = test_tokenizer_compatibility(MODEL_NAME)

print("\n✅ Compatibility analysis complete!")

In [None]:
def generate_compatibility_assessment(dep_results, arch_results, hook_results, tokenizer_results):
    """
    Generate final compatibility assessment
    """
    assessment = {
        "model_name": arch_results.get("model_name", "Unknown"),
        "compatible": False,
        "compatibility_score": 0,
        "supported_features": [],
        "issues": [],
        "recommendations": [],
        "details": {}
    }
    
    # Check dependencies
    if dep_results.get("all_available", False):
        assessment["compatibility_score"] += 25
        assessment["supported_features"].append("✅ All visualization dependencies available")
    else:
        assessment["issues"].append("❌ Missing visualization dependencies")
    
    # Check model architecture
    if arch_results.get("config_loaded", False):
        assessment["compatibility_score"] += 25
        assessment["supported_features"].append("✅ Model configuration loaded successfully")
        
        # Check for supported model types
        model_type = arch_results.get("details", {}).get("model_type", "").lower()
        supported_types = ["llama", "mistral", "gemma", "qwen", "phi"]
        
        if any(supported_type in model_type for supported_type in supported_types):
            assessment["supported_features"].append(f"✅ Supported architecture: {model_type}")
        else:
            assessment["issues"].append(f"⚠️  Architecture '{model_type}' may have limited support")
    else:
        assessment["issues"].append("❌ Could not load model configuration")
    
    # Check hook registration
    if hook_results.get("hook_registration", False):
        assessment["compatibility_score"] += 30
        assessment["supported_features"].append("✅ Hook registration successful")
        
        # Check specific components
        components = hook_results.get("components_found", {})
        if components.get("attention", False):
            assessment["supported_features"].append("✅ Attention components available")
        if components.get("mlp", False):
            assessment["supported_features"].append("✅ MLP components available")
        if components.get("gate_proj", False) and components.get("up_proj", False):
            assessment["supported_features"].append("✅ GLU components available")
    else:
        assessment["issues"].append("❌ Hook registration failed")
    
    # Check tokenizer
    if tokenizer_results.get("tokenizer_loaded", False):
        assessment["compatibility_score"] += 20
        assessment["supported_features"].append("✅ Tokenizer loaded successfully")
        
        if tokenizer_results.get("can_tokenize", False):
            assessment["supported_features"].append("✅ Tokenization working")
    else:
        assessment["issues"].append("❌ Tokenizer loading failed")
    
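    # Overall compatibility: the score must reach 70 and hooks must work,
    # since every visualization depends on captured activations
    assessment["compatible"] = (
        assessment["compatibility_score"] >= 70
        and hook_results.get("hook_registration", False)
    )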
    
    # Generate recommendations
    if assessment["compatible"]:
        assessment["recommendations"] = [
            "🎉 Your model is compatible with OptiPFair bias visualization!",
            '📦 Install: pip install "optipfair[viz]"',
            "📓 Try the basic_bias_visualization.ipynb example",
            "🔗 Documentation: https://github.com/peremartra/optipfair"
        ]
    else:
        assessment["recommendations"] = [
            "⚠️  Your model may have limited compatibility",
            "📧 Report issues: https://github.com/peremartra/optipfair/issues",
            "📚 Check supported models in documentation"
        ]
    
    # Store detailed results
    assessment["details"] = {
        "dependencies": dep_results,
        "architecture": arch_results,
        "hooks": hook_results,
        "tokenizer": tokenizer_results
    }
    
    return assessment

# Generate final assessment
final_assessment = generate_compatibility_assessment(dep_results, arch_results, hook_results, tokenizer_results)

## Final Results
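
The compatibility score shown in the report is a weighted sum of the four checks run above: 25 points for dependencies, 25 for the model configuration, 30 for hook registration, and 20 for the tokenizer, compared against a threshold of 70. The standalone sketch below reproduces only that arithmetic; `CHECK_WEIGHTS` and `score_checks` are illustrative names and are not part of OptiPFair.

```python
# Weights mirror the points awarded in generate_compatibility_assessment
CHECK_WEIGHTS = {"dependencies": 25, "config": 25, "hooks": 30, "tokenizer": 20}
THRESHOLD = 70


def score_checks(passed: dict) -> int:
    """Sum the weights of every check that passed."""
    return sum(weight for name, weight in CHECK_WEIGHTS.items() if passed.get(name, False))


all_passed = {name: True for name in CHECK_WEIGHTS}
no_hooks = dict(all_passed, hooks=False)

print(score_checks(all_passed))  # 100
print(score_checks(no_hooks))    # 70
```

A run in which only the hook check fails still scores exactly 70, so the issues list is worth reading alongside the number.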

In [None]:
def display_compatibility_results(assessment):
    """
    Display the final compatibility results in a clean format
    """
    print("=" * 60)
    print("🎯 OPTIPFAIR BIAS VISUALIZATION COMPATIBILITY REPORT")
    print("=" * 60)
    
    # Header
    status_emoji = "✅" if assessment["compatible"] else "❌"
    status_text = "COMPATIBLE" if assessment["compatible"] else "LIMITED COMPATIBILITY"
    
    print(f"\n{status_emoji} STATUS: {status_text}")
    print(f"📊 COMPATIBILITY SCORE: {assessment['compatibility_score']}/100")
    print(f"📦 MODEL: {assessment['model_name']}")
    
    # Supported features
    if assessment["supported_features"]:
        print(f"\n✅ SUPPORTED FEATURES:")
        for feature in assessment["supported_features"]:
            print(f"   {feature}")
    
    # Issues
    if assessment["issues"]:
        print(f"\n⚠️  ISSUES FOUND:")
        for issue in assessment["issues"]:
            print(f"   {issue}")
    
    # Available visualizations - Enhanced with OptiPFair manual information
    print(f"\n🎨 AVAILABLE VISUALIZATIONS:")
    if assessment["compatible"]:
        hook_details = assessment["details"].get("hooks", {})
        components = hook_details.get("components_found", {})
        layer_types = hook_details.get("layer_types_available", [])
        
        print(f"   📊 VISUALIZATION TYPES:")
        
        # Mean difference plots
        if any(lt in layer_types for lt in ["attention_output", "mlp_output", "gate_proj", "up_proj"]):
            print(f"   🔍 MEAN DIFFERENCE PLOTS:")
            for layer_type in ["attention_output", "mlp_output", "gate_proj", "up_proj"]:
                if layer_type in layer_types:
                    print(f"      • {layer_type}_layer_N visualizations")
        
        # Heatmap visualizations  
        if any(lt in layer_types for lt in ["attention_output", "mlp_output", "gate_proj", "up_proj"]):
            print(f"   🔥 HEATMAP VISUALIZATIONS:")
            for layer_type in ["attention_output", "mlp_output", "gate_proj", "up_proj"]:
                if layer_type in layer_types:
                    print(f"      • {layer_type}_layer_N heatmaps")
        
        # PCA analysis (only for attention_output and mlp_output)
        if any(lt in layer_types for lt in ["attention_output", "mlp_output"]):
            print(f"   📈 PCA ANALYSIS:")
            for layer_type in ["attention_output", "mlp_output"]:
                if layer_type in layer_types:
                    print(f"      • {layer_type}_layer_N PCA plots")
        
        print(f"   📊 BIAS METRICS:")
        print(f"      • Quantitative bias measurements")
        print(f"      • Cross-layer activation comparisons")
        print(f"      • Statistical significance tests")
        print(f"      • Layer-wise progression analysis")
    else:
        print("   ⚠️  Limited or no visualization support")
    
    # Layer-specific capabilities based on OptiPFair documentation
    hook_details = assessment["details"].get("hooks", {})
    if hook_details.get("supported_visualizations"):
        print(f"\n🔍 DETECTED LAYER CAPABILITIES:")
        for viz in hook_details["supported_visualizations"]:
            print(f"   ✅ {viz}")
    
    # Layer types available
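    if hook_details.get("layer_types_available"):
        print(f"\n🏗️  AVAILABLE LAYER TYPES:")
        total_layers = hook_details.get("total_layers")
        # Avoid subtracting from a missing value, which would raise a TypeError
        last_index = total_layers - 1 if isinstance(total_layers, int) else "unknown"
        for layer_type in hook_details["layer_types_available"]:
            print(f"   • {layer_type}_layer_N (N = 0 to {last_index})")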
    
    # Recommendations
    if assessment["recommendations"]:
        print(f"\n💡 RECOMMENDATIONS:")
        for rec in assessment["recommendations"]:
            print(f"   {rec}")
    
    # Technical details
    arch_details = assessment["details"].get("architecture", {}).get("details", {})
    if arch_details:
        print(f"\n🔧 TECHNICAL DETAILS:")
        print(f"   • Model Type: {arch_details.get('model_type', 'unknown')}")
        print(f"   • Layers: {arch_details.get('num_layers', 'unknown')}")
        print(f"   • Hidden Size: {arch_details.get('hidden_size', 'unknown')}")
    
    hook_details = assessment["details"].get("hooks", {})
    if hook_details.get("layers_found"):
        print(f"   • Layer Access: {hook_details.get('layer_access_method', 'unknown')}")
        print(f"   • Total Layers: {hook_details.get('total_layers', 'unknown')}")
        
        # Component availability
        components = hook_details.get("components_found", {})
        print(f"   • Attention Component: {'✅' if components.get('attention') else '❌'}")
        print(f"   • MLP Component: {'✅' if components.get('mlp') else '❌'}")
        if components.get('mlp'):
            print(f"   • GLU Architecture: {'✅' if components.get('gate_proj') and components.get('up_proj') else '❌'}")
            print(f"   • Gate Projection: {'✅' if components.get('gate_proj') else '❌'}")
            print(f"   • Up Projection: {'✅' if components.get('up_proj') else '❌'}")
            print(f"   • Down Projection: {'✅' if components.get('down_proj') else '❌'}")
    
    print("=" * 60)

# Display final results
display_compatibility_results(final_assessment)

## 🔗 What's Next?

### ✅ If your model is compatible:
- **Install OptiPFair with bias visualization support:**
  ```bash
  pip install "optipfair[viz]"
  ```
- **Try basic bias visualization:**
  ```python
  from optipfair.bias import visualize_bias
  # Use default prompt pairs or create custom ones
  visualize_bias(model, tokenizer, output_dir="./bias_analysis")
  ```
- **Explore specific visualizations:**
  - `visualize_mean_differences()` for layer-wise analysis
  - `visualize_heatmap()` for detailed activation patterns
  - `visualize_pca()` for dimensional reduction analysis
- **Custom analysis:** Create prompt pairs for your specific bias concerns
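
Custom prompt pairs work best when the two prompts differ only in the attribute under study, as the default pairs in this notebook do. The sketch below builds such pairs from a single template. `make_prompt_pairs` and the `{attr}` placeholder are hypothetical conveniences, and the attribute lists are illustrative; how the resulting tuples are passed to `visualize_bias` should be checked against the OptiPFair documentation.

```python
def make_prompt_pairs(template: str, attribute_pairs: list) -> list:
    """Fill one template with each pair of contrasting attributes."""
    if "{attr}" not in template:
        raise ValueError("template must contain the {attr} placeholder")
    return [
        (template.format(attr=first), template.format(attr=second))
        for first, second in attribute_pairs
    ]


template = "The {attr} engineer proposed a solution. The team thought it was"
prompt_pairs = make_prompt_pairs(template, [("male", "female"), ("young", "elderly")])

for first, second in prompt_pairs:
    print(first, "|", second)
```

Keeping everything except the attribute identical means any activation difference can be attributed to that one word rather than to unrelated changes in wording.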

### ❌ If your model has limited compatibility:
- **Check layer structure:** Your model may need `model.model.layers` structure
- **Request support:** Open an issue on [GitHub](https://github.com/peremartra/optipfair/issues)
- **Try alternatives:** Use a supported model (LLaMA, Mistral, Gemma, Qwen, Phi) for testing
- **Contribute:** Help us add support for your model architecture!
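
To see which layer layout a model exposes, the lookup this notebook performs can be reduced to a few lines. The sketch below tries the same three attribute paths in the same order. It uses `SimpleNamespace` stand-ins instead of real models so it runs without downloading anything; `find_decoder_layers` is a hypothetical helper, not an OptiPFair function.

```python
from types import SimpleNamespace

# Attribute paths tried in order, matching the lookup used earlier in this notebook
CANDIDATE_PATHS = ("model.layers", "layers", "transformer.h")


def find_decoder_layers(model):
    """Return (path, layers) for the first candidate path that resolves, else (None, None)."""
    for path in CANDIDATE_PATHS:
        target = model
        for attr in path.split("."):
            target = getattr(target, attr, None)
            if target is None:
                break
        if target is not None and len(target) > 0:
            return f"model.{path}", target
    return None, None


# Stand-in objects that only imitate the attribute layout of real models
llama_like = SimpleNamespace(model=SimpleNamespace(layers=["block0", "block1"]))
gpt2_like = SimpleNamespace(transformer=SimpleNamespace(h=["block0"]))
unsupported = SimpleNamespace(encoder=SimpleNamespace(layer=["block0"]))

print(find_decoder_layers(llama_like)[0])
print(find_decoder_layers(gpt2_like)[0])
print(find_decoder_layers(unsupported)[0])
```

Passing a real loaded model to the same function should work the same way, since it relies only on `getattr` and `len`.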

---

## 📚 Learn More About OptiPFair Bias Visualization

### 🎯 **Key Features:**
- **Layer Types Analyzed:** `attention_output`, `mlp_output`, `gate_proj`, `up_proj`, `down_proj`
- **Visualization Types:** Mean differences, heatmaps, PCA analysis
- **Metrics:** Quantitative bias measurements with statistical analysis
- **Layer Selection:** `"first_middle_last"`, `"all"`, or specific indices
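
The layer selection options and the `<layer_type>_layer_<N>` key format fit together as sketched below. This is an illustration under stated assumptions rather than OptiPFair's implementation: `resolve_layers` and `layer_key` are hypothetical helpers, and using `total_layers // 2` as the "middle" layer is a guess at how `"first_middle_last"` might be resolved.

```python
def resolve_layers(selection, total_layers: int) -> list:
    """Turn a layer selection into a sorted list of unique layer indices."""
    if total_layers < 1:
        raise ValueError("total_layers must be at least 1")
    if selection == "all":
        return list(range(total_layers))
    if selection == "first_middle_last":
        # Assumption: "middle" is total_layers // 2; a set removes duplicates
        return sorted({0, total_layers // 2, total_layers - 1})
    if isinstance(selection, str):
        raise ValueError(f"unknown layer selection: {selection!r}")
    indices = sorted(set(selection))
    if any(i < 0 or i >= total_layers for i in indices):
        raise ValueError(f"layer indices must be in [0, {total_layers - 1}]")
    return indices


def layer_key(layer_type: str, index: int) -> str:
    """Build a key in the <layer_type>_layer_<N> format shown in this notebook."""
    return f"{layer_type}_layer_{index}"


# Example for a model with 16 layers
keys = [layer_key("attention_output", i) for i in resolve_layers("first_middle_last", 16)]
print(keys)
```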

### 📖 **Resources:**
- **📖 Documentation:** [OptiPFair GitHub](https://github.com/peremartra/optipfair)  
- **📝 LLM Reference Manual:** `optipfair_llm_reference_manual.txt` for detailed API info
- **🎓 Examples:** Check out the `examples/` directory for tutorials
- **🔬 Research:** "From Biased to Balanced: Visualizing and Fixing Bias in Transformer Models"

### 🛠️ **Example Usage:**
```python
# Basic usage
from optipfair.bias import visualize_bias
_, metrics = visualize_bias(
    model, tokenizer,
    visualization_types=["mean_diff", "heatmap", "pca"],
    layers="first_middle_last"
)

# Advanced - target specific layers
from optipfair.bias import visualize_pca
visualize_pca(
    model, tokenizer,
    prompt_pair=("The white doctor...", "The Black doctor..."),
    layer_key="attention_output_layer_8"  # Specific layer
)
```
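
To make the `mean_diff` idea concrete, the sketch below computes one plausible version of it: the mean absolute difference between the activations that two prompts produce at the same layer. The activation values are made-up toy numbers and `mean_abs_difference` is a hypothetical helper; the metric OptiPFair actually reports may be defined differently.

```python
def mean_abs_difference(acts_a, acts_b):
    """Mean absolute element-wise difference between two equally shaped 2-D activations."""
    same_rows = len(acts_a) == len(acts_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(acts_a, acts_b)):
        raise ValueError("activations must have the same shape")
    diffs = [abs(a - b) for ra, rb in zip(acts_a, acts_b) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Toy activations: 2 token positions x 3 hidden units per prompt
acts_prompt_a = [[0.5, 1.0, -0.5], [0.0, 2.0, 1.0]]
acts_prompt_b = [[0.5, 0.5, -1.0], [0.5, 2.0, 0.0]]

print(mean_abs_difference(acts_prompt_a, acts_prompt_b))
```

A value of zero would mean the two prompts produce identical activations at that layer; larger values indicate that the changed attribute shifts the internal representation more.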

---

If you found this notebook useful, the best way to support the OptiPFair project is by **starring it on GitHub**. Your support helps boost the project's visibility and reach more developers and researchers.

### ➡️ [**Star OptiPFair on GitHub**](https://github.com/peremartra/optipfair)

---
You can also follow my work and new projects on:

* **[LinkedIn](https://www.linkedin.com/in/pere-martra/)**
* **[X / Twitter](https://twitter.com/PereMartra)**