# LBMD SOTA Framework - Quick Start Tutorial

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lbmd-research/lbmd-sota/blob/main/docs/tutorials/01_quickstart.ipynb)

Welcome to the LBMD SOTA Enhancement Framework! This tutorial will get you started with the basic concepts and functionality of Latent Boundary Manifold Decomposition for instance segmentation interpretability.

## What You'll Learn

- Basic LBMD concepts and terminology
- How to load models and datasets
- Running your first boundary analysis
- Visualizing and interpreting results
- Understanding boundary manifolds

## Prerequisites

- Basic Python knowledge
- Familiarity with PyTorch (helpful but not required)
- Understanding of instance segmentation (helpful but not required)

**Estimated Time**: 15-20 minutes

## 1. Installation and Setup

First, install the framework if needed (the `pip` line is commented out by default) and import the libraries used in this tutorial:

In [None]:
# Install the framework (uncomment if running in Colab or fresh environment)
# !pip install lbmd-sota

# Import required libraries
import torch
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

# Import LBMD components
from lbmd_sota.core import LBMDConfig, load_global_config
from lbmd_sota.empirical_validation import MultiDatasetEvaluator
from lbmd_sota.visualization import InteractiveManifoldExplorer

print("✅ LBMD SOTA Framework imported successfully!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
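
If the import cell above fails, the most common cause is a missing package. The following is a minimal sketch, using only the standard library, that reports which top-level packages cannot be found before you run the rest of the notebook. The helper name `missing_packages` is made up for this example and is not part of the framework.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages imported by this tutorial.
required = ["torch", "numpy", "matplotlib", "lbmd_sota"]
missing = missing_packages(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

`find_spec` only locates a package; it does not import it, so this check is cheap and has no side effects.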

## 2. Understanding LBMD Concepts

Before diving into code, let's understand the key concepts:

### Latent Boundary Manifold Decomposition (LBMD)

LBMD is a mechanistic interpretability method that:
- **Identifies boundary-responsive neurons** in instance segmentation models
- **Constructs manifolds** in the latent space representing object boundaries
- **Analyzes transitions** between different object regions
- **Provides insights** into how models process object boundaries

### Key Terms

- **Boundary Responsiveness**: How much a neuron activates near object boundaries
- **Manifold**: A geometric structure in high-dimensional space representing boundary patterns
- **Transition Strength**: The clarity of boundaries between different object regions
- **Cluster**: A group of similar boundary patterns in the manifold
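
To make "boundary responsiveness" concrete, here is a small NumPy sketch. This tutorial does not state how the framework computes the score, so the definition below (mean absolute activation on boundary pixels minus the mean elsewhere) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def boundary_from_labels(labels):
    """Mark pixels that have a horizontal or vertical neighbour with a different label."""
    boundary = np.zeros(labels.shape, dtype=bool)
    diff_h = labels[:, :-1] != labels[:, 1:]   # differences between left/right neighbours
    diff_v = labels[:-1, :] != labels[1:, :]   # differences between upper/lower neighbours
    boundary[:, :-1] |= diff_h
    boundary[:, 1:] |= diff_h
    boundary[:-1, :] |= diff_v
    boundary[1:, :] |= diff_v
    return boundary


def responsiveness(activations, boundary):
    """Assumed score per channel: mean |activation| on the boundary minus off the boundary."""
    magnitude = np.abs(activations)                 # shape (C, H, W)
    on_boundary = magnitude[:, boundary].mean(axis=1)
    off_boundary = magnitude[:, ~boundary].mean(axis=1)
    return on_boundary - off_boundary


# Toy instance mask: two regions split down the middle.
labels = np.zeros((8, 8), dtype=int)
labels[:, 4:] = 1
boundary = boundary_from_labels(labels)

# Toy activations: six channels of noise, with channel 2 firing on the boundary.
rng = np.random.default_rng(0)
acts = rng.normal(0.0, 0.1, size=(6, 8, 8))
acts[2][boundary] += 5.0

scores = responsiveness(acts, boundary)
top_k = np.argsort(scores)[::-1][:2]   # indices of the two highest-scoring channels
print("Scores:", np.round(scores, 3))
print("Top-2 channels:", top_k)
```

Selecting the `k_neurons` highest scores, as `top_k` does here, mirrors the "top-k boundary-responsive neurons" idea used later in the configuration.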

## 3. Configuration Setup

Let's create a basic configuration for our analysis:

In [None]:
# Create a basic configuration
config = {
    'datasets': {
        'data_dir': './sample_data',
        'cache_dir': './cache',
        'batch_size': 4
    },
    'models': {
        'checkpoint_dir': './models',
        'architecture': 'maskrcnn_r50_fpn'
    },
    'lbmd_parameters': {
        'k_neurons': 20,  # Top-k boundary-responsive neurons
        'epsilon': 0.1,   # Boundary detection threshold
        'tau': 0.5,       # Transition strength threshold
        'manifold_method': 'umap'  # Manifold learning method
    },
    'visualization': {
        'output_dir': './results',
        'figure_format': 'png',
        'interactive': True
    }
}

# Create LBMD configuration object
lbmd_config = LBMDConfig(config)
print("✅ Configuration created successfully!")
print(f"Using {lbmd_config.lbmd_parameters.k_neurons} top boundary-responsive neurons")
print(f"Manifold method: {lbmd_config.lbmd_parameters.manifold_method}")
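
Typos in a nested configuration dictionary are easy to miss, so it can help to validate the `lbmd_parameters` section before running an analysis. The sketch below is not the framework's `LBMDConfig`; `LBMDParams` is a hypothetical stand-in, and the accepted ranges for `epsilon` and `tau` are assumptions based on the values used in this tutorial.

```python
from dataclasses import dataclass

# Methods named in this tutorial; the framework may support others.
ALLOWED_METHODS = ("umap", "tsne", "pca")


@dataclass(frozen=True)
class LBMDParams:
    """Hypothetical stand-in for the 'lbmd_parameters' section of the config."""
    k_neurons: int = 20
    epsilon: float = 0.1
    tau: float = 0.5
    manifold_method: str = "umap"

    def __post_init__(self):
        if self.k_neurons < 1:
            raise ValueError("k_neurons must be a positive integer")
        # Assumed ranges: both thresholds are treated as fractions in (0, 1].
        for name in ("epsilon", "tau"):
            value = getattr(self, name)
            if not 0.0 < value <= 1.0:
                raise ValueError(f"{name} must be in (0, 1], got {value}")
        if self.manifold_method not in ALLOWED_METHODS:
            raise ValueError(f"unknown manifold_method: {self.manifold_method!r}")


# Same values as the 'lbmd_parameters' section above.
params = LBMDParams(k_neurons=20, epsilon=0.1, tau=0.5, manifold_method="umap")
print(params)

try:
    LBMDParams(manifold_method="isomap")
except ValueError as err:
    print("Rejected:", err)
```

Because the dataclass is frozen, a validated parameter set cannot be modified by accident between runs.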

## 4. Loading Sample Data

For this tutorial, we'll use a small sample dataset. In practice, you would use larger datasets like COCO or Cityscapes:

In [None]:
# Download and prepare sample data
from lbmd_sota.empirical_validation.dataset_loaders import COCODatasetLoader

# Initialize dataset loader
dataset_loader = COCODatasetLoader(lbmd_config.datasets)

# Load a small sample for demonstration
try:
    # This will download a small COCO sample if not available
    sample_data = dataset_loader.load_sample_data(num_images=5)
    print(f"✅ Loaded {len(sample_data)} sample images")
    
    # Display first image
    sample_image, sample_mask = sample_data[0]
    
    fig, axes = plt.subplots(1, 2, figsize=(12, 5))
    axes[0].imshow(sample_image)
    axes[0].set_title('Sample Image')
    axes[0].axis('off')
    
    axes[1].imshow(sample_mask, cmap='tab20')
    axes[1].set_title('Instance Masks')
    axes[1].axis('off')
    
    plt.tight_layout()
    plt.show()
    
except Exception as e:
    print(f"⚠️ Could not load sample data: {e}")
    print("Using synthetic data for demonstration...")
    
    # Create synthetic data for demonstration
    sample_image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # upper bound is exclusive
    sample_mask = np.random.randint(0, 5, (224, 224), dtype=np.uint8)
    sample_data = [(sample_image, sample_mask)]
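
Purely random pixels and labels contain no coherent objects, so the fallback above exercises the code path but produces boundaries almost everywhere. As an alternative, the sketch below builds a synthetic sample with two clean instances (a rectangle and a disc). The function name `make_synthetic_sample` and the shapes are illustrative choices, not part of the framework.

```python
import numpy as np


def make_synthetic_sample(size=224, seed=0):
    """Return an (image, mask) pair with two non-overlapping instances on a dark background."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[:size, :size]

    mask = np.zeros((size, size), dtype=np.uint8)
    mask[40:120, 30:110] = 1                                     # instance 1: rectangle
    mask[(yy - 150) ** 2 + (xx - 150) ** 2 <= 40 ** 2] = 2       # instance 2: disc

    # One base colour per label, plus mild per-pixel noise.
    palette = np.array([[30, 30, 30], [200, 60, 60], [60, 120, 220]], dtype=np.int16)
    noise = rng.integers(-20, 21, size=(size, size, 3))
    image = np.clip(palette[mask] + noise, 0, 255).astype(np.uint8)
    return image, mask


image, mask = make_synthetic_sample()
print(image.shape, image.dtype, np.unique(mask))
```

The returned pair has the same layout as the `(sample_image, sample_mask)` tuple used in the rest of this tutorial, so it can be substituted directly.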

## 5. Loading a Pre-trained Model

Let's load a pre-trained instance segmentation model:

In [None]:
from lbmd_sota.empirical_validation.architecture_manager import ArchitectureManager

# Initialize architecture manager
arch_manager = ArchitectureManager(lbmd_config.models)
arch_manager.initialize()

try:
    # Load pre-trained model
    model = arch_manager.load_model('maskrcnn_r50_fpn')
    model.eval()  # Set to evaluation mode
    
    print("✅ Model loaded successfully!")
    print(f"Model type: {type(model).__name__}")
    print(f"Device: {next(model.parameters()).device}")
    
    # Display model architecture summary
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Total parameters: {total_params:,}")
    print(f"Trainable parameters: {trainable_params:,}")
    
except Exception as e:
    print(f"⚠️ Could not load pre-trained model: {e}")
    print("Creating a simple demonstration model...")
    
    # Create a simple model for demonstration
    import torch.nn as nn
    
    class SimpleSegmentationModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 64, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(64, 128, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(128, 256, 3, padding=1),
                nn.ReLU()
            )
            self.head = nn.Conv2d(256, 80, 1)  # 80 COCO classes
        
        def forward(self, x):
            features = self.backbone(x)
            return self.head(features)
    
    model = SimpleSegmentationModel()
    print("✅ Simple demonstration model created!")
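
The parameter count for the fallback `SimpleSegmentationModel` can be checked by hand. A `Conv2d` layer with bias has `out_channels * in_channels * k * k` weights plus one bias per output channel. The helper below is a plain-Python sketch of that arithmetic and does not require PyTorch.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights (out * in * k * k) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


# (in_channels, out_channels, kernel_size) for each convolution in the demo model.
layers = [
    (3, 64, 3),      # backbone.0
    (64, 128, 3),    # backbone.2
    (128, 256, 3),   # backbone.4
    (256, 80, 1),    # head
]

per_layer = [conv2d_params(*spec) for spec in layers]
total = sum(per_layer)

print("Per layer:", per_layer)
print(f"Total parameters: {total:,}")
```

The total is 391,376, which should match `sum(p.numel() for p in model.parameters())` for the demonstration model, since the ReLU layers have no parameters.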

## 6. Running LBMD Analysis

Now let's run our first LBMD analysis:

In [None]:
from lbmd_sota.empirical_validation.multi_dataset_evaluator import MultiDatasetEvaluator

# Initialize the evaluator
evaluator = MultiDatasetEvaluator(lbmd_config)
evaluator.initialize()

# Prepare input tensor
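# np.array() also accepts PIL images, in case the loader does not return a NumPy array
input_tensor = torch.as_tensor(np.array(sample_image)).permute(2, 0, 1).float() / 255.0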
input_tensor = input_tensor.unsqueeze(0)  # Add batch dimension

print(f"Input tensor shape: {input_tensor.shape}")
print("Running LBMD analysis...")

try:
    # Run LBMD analysis
    with torch.no_grad():
        lbmd_results = evaluator.analyze_boundary_manifolds(
            model=model,
            input_tensor=input_tensor,
            target_layer='backbone.2'  # Module at index 2 of the backbone (the second Conv2d in the demo model)
        )
    
    print("✅ LBMD analysis completed!")
    print(f"Analyzed layer: {lbmd_results.layer_name}")
    print(f"Number of boundary neurons found: {len(lbmd_results.boundary_scores)}")
    print(f"Manifold coordinates shape: {lbmd_results.manifold_coords.shape}")
    print(f"Number of clusters: {len(np.unique(lbmd_results.clusters))}")
    
except Exception as e:
    print(f"⚠️ LBMD analysis failed: {e}")
    print("Creating synthetic results for demonstration...")
    
    # Create synthetic LBMD results
    from lbmd_sota.core.data_models import LBMDResults
    
    n_points = 1000
    lbmd_results = LBMDResults(
        layer_name='demo_layer',
        boundary_scores=np.random.rand(20),
        boundary_mask=np.random.rand(224, 224) > 0.7,
        manifold_coords=np.random.randn(n_points, 2),
        pixel_coords=np.random.randint(0, 224, (n_points, 2)),
        is_boundary=np.random.rand(n_points) > 0.5,
        clusters=np.random.randint(0, 5, n_points),
        transition_strengths={(0, 1): 0.8, (1, 2): 0.6, (2, 3): 0.9},
        cluster_hulls={},
        statistical_metrics=None,
        topological_properties=None
    )
    print("✅ Synthetic LBMD results created for demonstration!")
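
The `target_layer` string follows PyTorch's dotted module naming. How the framework resolves it internally is not shown in this tutorial, so the following is a generic PyTorch sketch of the usual approach: look the module up via `named_modules()` and record its output with a forward hook. `DemoBackbone` and `capture_activations` are hypothetical names used only here.

```python
import torch
import torch.nn as nn


class DemoBackbone(nn.Module):
    """Small model whose submodules are named backbone.0 ... backbone.3."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.backbone(x)


def capture_activations(model, layer_name, x):
    """Run one forward pass and return the output of the named submodule."""
    modules = dict(model.named_modules())
    if layer_name not in modules:
        raise KeyError(f"{layer_name!r} not found; available: {sorted(modules)}")

    captured = {}

    def hook(module, inputs, output):
        captured["out"] = output.detach()

    handle = modules[layer_name].register_forward_hook(hook)
    try:
        with torch.no_grad():
            model(x)
    finally:
        handle.remove()   # always remove the hook, even if the forward pass fails
    return captured["out"]


demo = DemoBackbone().eval()
acts = capture_activations(demo, "backbone.2", torch.rand(1, 3, 32, 32))
print(type(dict(demo.named_modules())["backbone.2"]).__name__, tuple(acts.shape))
```

Printing `sorted(dict(model.named_modules()))` is a quick way to find valid layer names for other architectures, where `backbone.2` may not exist.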

## 7. Visualizing Results

Let's visualize the LBMD analysis results:

In [None]:
# Create visualizations
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# 1. Boundary responsiveness scores
axes[0, 0].bar(range(len(lbmd_results.boundary_scores)), 
               sorted(lbmd_results.boundary_scores, reverse=True))
axes[0, 0].set_title('Top Boundary-Responsive Neurons')
axes[0, 0].set_xlabel('Neuron Rank')
axes[0, 0].set_ylabel('Boundary Responsiveness Score')
axes[0, 0].grid(True, alpha=0.3)

# 2. Boundary mask overlay
axes[0, 1].imshow(sample_image)
boundary_overlay = np.ma.masked_where(~lbmd_results.boundary_mask, 
                                     lbmd_results.boundary_mask)
axes[0, 1].imshow(boundary_overlay, alpha=0.6, cmap='Reds')
axes[0, 1].set_title('Detected Boundaries (Red Overlay)')
axes[0, 1].axis('off')

# 3. Manifold visualization (2D projection)
scatter = axes[1, 0].scatter(lbmd_results.manifold_coords[:, 0], 
                            lbmd_results.manifold_coords[:, 1],
                            c=lbmd_results.clusters, 
                            cmap='tab10', 
                            alpha=0.6,
                            s=20)
axes[1, 0].set_title('Boundary Manifold (2D Projection)')
axes[1, 0].set_xlabel('Manifold Dimension 1')
axes[1, 0].set_ylabel('Manifold Dimension 2')
plt.colorbar(scatter, ax=axes[1, 0], label='Cluster ID')

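# 4. Cluster distribution
# np.unique also handles non-contiguous or negative labels (e.g. -1 for noise), unlike np.bincount
cluster_ids, cluster_counts = np.unique(lbmd_results.clusters, return_counts=True)
axes[1, 1].pie(cluster_counts, labels=[f'Cluster {i}' for i in cluster_ids],
               autopct='%1.1f%%', startangle=90)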
axes[1, 1].set_title('Cluster Distribution')

plt.tight_layout()
plt.show()

print("📊 Visualization complete!")
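
The scatter plot above shows `manifold_coords`, a 2D embedding of high-dimensional activation vectors. The tutorial lists `'pca'` as one of the available manifold methods; the sketch below shows what a plain PCA projection to two dimensions looks like in NumPy. It illustrates the general technique on made-up data and is not the framework's implementation.

```python
import numpy as np


def pca_2d(features):
    """Project (n_points, n_features) data onto its first two principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:2].T
    explained = singular_values[:2] ** 2 / np.sum(singular_values ** 2)
    return coords, explained


# Made-up activation vectors: 500 points, 20 features, one dominant direction.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 20))
features[:, 0] *= 10.0

coords, explained = pca_2d(features)
print("Embedding shape:", coords.shape)
print("Explained variance ratio:", np.round(explained, 3))
```

The explained variance ratio indicates how much structure a 2D view retains; a low value suggests that clusters seen in the plot should be interpreted with caution.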

## 8. Interpreting the Results

Let's analyze what these results tell us:

In [None]:
# Analyze the results
print("🔍 LBMD Analysis Summary")
print("=" * 50)

# Boundary responsiveness analysis
top_scores = sorted(lbmd_results.boundary_scores, reverse=True)[:5]
print(f"\n📈 Top 5 Boundary Responsiveness Scores:")
for i, score in enumerate(top_scores, 1):
    print(f"  {i}. {score:.3f}")

# Boundary coverage
boundary_coverage = np.mean(lbmd_results.boundary_mask)
print(f"\n🎯 Boundary Coverage: {boundary_coverage:.1%} of image pixels")

# Cluster analysis
n_clusters = len(np.unique(lbmd_results.clusters))
print(f"\n🔗 Manifold Structure:")
print(f"  - Number of clusters: {n_clusters}")
print(f"  - Points per cluster (avg): {len(lbmd_results.clusters) / n_clusters:.1f}")

# Transition strength analysis
if lbmd_results.transition_strengths:
    avg_transition = np.mean(list(lbmd_results.transition_strengths.values()))
    print(f"\n⚡ Transition Analysis:")
    print(f"  - Average transition strength: {avg_transition:.3f}")
    print(f"  - Number of transitions: {len(lbmd_results.transition_strengths)}")
    
    # Show strongest and weakest transitions
    transitions = list(lbmd_results.transition_strengths.items())
    strongest = max(transitions, key=lambda x: x[1])
    weakest = min(transitions, key=lambda x: x[1])
    print(f"  - Strongest transition: {strongest[0]} → {strongest[1]:.3f}")
    print(f"  - Weakest transition: {weakest[0]} → {weakest[1]:.3f}")

print("\n" + "=" * 50)
print("✅ Analysis complete! You've successfully run your first LBMD analysis.")
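
The configuration defines `tau` as a transition strength threshold. One way to use it when reading the results, sketched below under the assumption that a transition counts as strong when its strength is at least `tau`, is to split the transitions into strong and weak groups. The helper name `split_by_threshold` is hypothetical, and the dictionary reuses the synthetic values from Section 6.

```python
def split_by_threshold(transition_strengths, tau):
    """Partition cluster-pair transitions into those at or above tau and those below."""
    strong = {pair: s for pair, s in transition_strengths.items() if s >= tau}
    weak = {pair: s for pair, s in transition_strengths.items() if s < tau}
    return strong, weak


# Synthetic values from the fallback results in Section 6.
transition_strengths = {(0, 1): 0.8, (1, 2): 0.6, (2, 3): 0.9}
strong, weak = split_by_threshold(transition_strengths, tau=0.7)

for pair, strength in sorted(strong.items(), key=lambda item: item[1], reverse=True):
    print(f"strong: clusters {pair[0]} and {pair[1]}: {strength:.2f}")
for pair, strength in weak.items():
    print(f"weak:   clusters {pair[0]} and {pair[1]}: {strength:.2f}")
```

With `tau=0.7`, the pairs `(0, 1)` and `(2, 3)` are strong and `(1, 2)` is weak; weak pairs are natural candidates for the "weak boundary representations" discussed under Research Applications.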

## 9. Interactive Exploration

Let's create an interactive visualization for deeper exploration:

In [None]:
# Create interactive manifold explorer
from lbmd_sota.visualization import InteractiveManifoldExplorer

try:
    # Initialize interactive explorer
    explorer = InteractiveManifoldExplorer(lbmd_config.visualization)
    explorer.initialize()
    
    # Create interactive visualization
    interactive_viz = explorer.create_interactive_manifold(
        manifold_data={
            'coords': lbmd_results.manifold_coords,
            'clusters': lbmd_results.clusters,
            'boundary_flags': lbmd_results.is_boundary,
            'pixel_coords': lbmd_results.pixel_coords
        }
    )
    
    # Display interactive plot
    interactive_viz.show()
    
    print("🎮 Interactive visualization created!")
    print("   - Hover over points to see details")
    print("   - Click and drag to rotate (if 3D)")
    print("   - Use zoom controls to explore")
    
except Exception as e:
    print(f"⚠️ Interactive visualization not available: {e}")
    print("This might be due to missing dependencies or environment limitations.")
    print("The static visualizations above show the same information.")

## 10. Key Takeaways

Congratulations! You've completed your first LBMD analysis. Here's what you learned:

### 🎯 Core Concepts
- **Boundary Responsiveness**: Neurons that activate strongly near object boundaries
- **Manifold Structure**: How boundary patterns organize in high-dimensional space
- **Cluster Analysis**: Groups of similar boundary processing patterns
- **Transition Strength**: How clearly the model distinguishes between regions

### 🔧 Technical Skills
- Setting up LBMD configuration
- Loading models and datasets
- Running boundary analysis
- Visualizing and interpreting results

### 🚀 What's Next?
- **Tutorial 02**: Learn advanced configuration options
- **Tutorial 03**: Work with real datasets (COCO, Cityscapes)
- **Tutorial 04**: Deep dive into boundary analysis methods
- **Tutorial 05**: Interactive manifold exploration

### 💡 Research Applications
LBMD can help you:
- **Understand** how models process object boundaries
- **Identify** weak boundary representations
- **Improve** model architectures based on insights
- **Debug** segmentation failures
- **Compare** different interpretability methods

## 11. Exercise: Try It Yourself!

Now it's your turn to experiment:

In [None]:
# Exercise: Modify parameters and observe changes
print("🎯 Exercise: Experiment with different parameters")
print("Try modifying these parameters and re-running the analysis:")
print()

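# Create a new configuration with different parameters
# (deep copy so that nested dictionaries are not shared with the original config)
import copy
experimental_config = copy.deepcopy(config)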
experimental_config['lbmd_parameters'] = {
    'k_neurons': 50,      # Try: 10, 30, 50, 100
    'epsilon': 0.05,      # Try: 0.01, 0.05, 0.1, 0.2
    'tau': 0.7,           # Try: 0.3, 0.5, 0.7, 0.9
    'manifold_method': 'tsne'  # Try: 'umap', 'tsne', 'pca'
}

print(f"Original k_neurons: {config['lbmd_parameters']['k_neurons']}")
print(f"New k_neurons: {experimental_config['lbmd_parameters']['k_neurons']}")
print()
print(f"Original epsilon: {config['lbmd_parameters']['epsilon']}")
print(f"New epsilon: {experimental_config['lbmd_parameters']['epsilon']}")
print()
print(f"Original manifold method: {config['lbmd_parameters']['manifold_method']}")
print(f"New manifold method: {experimental_config['lbmd_parameters']['manifold_method']}")
print()
print("💡 Questions to explore:")
print("  - How does increasing k_neurons affect the analysis?")
print("  - What happens with different epsilon values?")
print("  - How do different manifold methods compare?")
print("  - Which parameters give the clearest boundary detection?")

# Uncomment the lines below to run the experimental analysis
# experimental_lbmd_config = LBMDConfig(experimental_config)
# experimental_evaluator = MultiDatasetEvaluator(experimental_lbmd_config)
# experimental_evaluator.initialize()
# experimental_results = experimental_evaluator.analyze_boundary_manifolds(
#     model=model, input_tensor=input_tensor, target_layer='backbone.2'
# )
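
To explore the suggested values systematically rather than one at a time, you can generate one configuration per parameter combination. The sketch below uses only the standard library; `sweep` is a hypothetical helper, and the base configuration is trimmed to the `lbmd_parameters` section for brevity.

```python
import copy
from itertools import product

base_config = {
    "lbmd_parameters": {
        "k_neurons": 20,
        "epsilon": 0.1,
        "tau": 0.5,
        "manifold_method": "umap",
    }
}

# Candidate values taken from the "Try:" comments in the exercise above.
grid = {
    "k_neurons": [10, 30, 50, 100],
    "epsilon": [0.01, 0.05, 0.1, 0.2],
    "tau": [0.3, 0.5, 0.7, 0.9],
    "manifold_method": ["umap", "tsne", "pca"],
}


def sweep(base, grid):
    """Yield an independent deep copy of the base config for every grid combination."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        cfg = copy.deepcopy(base)
        cfg["lbmd_parameters"].update(zip(keys, values))
        yield cfg


configs = list(sweep(base_config, grid))
print(f"{len(configs)} configurations generated")
print("First:", configs[0]["lbmd_parameters"])
```

The full grid has 4 × 4 × 4 × 3 = 192 combinations, so in practice you may want to vary one parameter at a time while holding the others at their defaults.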

## 12. Resources and Next Steps

### 📚 Documentation
- [API Reference](../api/) - Detailed API documentation
- [Configuration Guide](../configuration.md) - Complete configuration options
- [Installation Guide](../installation.md) - Setup instructions

### 🎓 More Tutorials
- [Tutorial 02: Configuration](02_configuration.ipynb) - Advanced configuration
- [Tutorial 03: Data Loading](03_data_loading.ipynb) - Working with datasets
- [Tutorial 04: Boundary Analysis](04_boundary_analysis.ipynb) - Deep dive into analysis

### 💬 Community
- [GitHub Issues](https://github.com/lbmd-research/lbmd-sota/issues) - Bug reports and feature requests
- [GitHub Discussions](https://github.com/lbmd-research/lbmd-sota/discussions) - Questions and discussions
- [Research Papers](https://lbmd-research.github.io/papers/) - Academic publications

### 🔬 Application Domains
- Medical imaging analysis
- Autonomous driving systems
- Industrial quality control
- Scientific image analysis

---

**Congratulations on completing the LBMD Quick Start Tutorial!** 🎉

You now have the foundation to explore the powerful capabilities of the LBMD SOTA Enhancement Framework. Continue with the next tutorials to deepen your understanding and unlock advanced features.