# Induction Head Detector Tutorial

Learn to find and analyze induction heads in transformers!

## What are Induction Heads?

Induction heads are attention heads that implement a simple form of in-context learning in two steps:
1. **Pattern Matching**: Find a previous occurrence of the current token
2. **Copying**: Attend to the token that followed that occurrence and predict it as the next token

Example: `The cat sat on the mat. The cat` -> predicts `sat`
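The two steps can be mimicked on a plain token list. The sketch below is a hand-written rule, not what a trained head computes internally, and `induction_predict` is a hypothetical helper name used only for illustration.

```python
def induction_predict(tokens):
    """Copy the token that followed the most recent earlier occurrence
    of the last token; return None if the last token has not appeared before."""
    current = tokens[-1]
    # Pattern matching: scan backwards over earlier positions only
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            # Copying: predict whatever came right after the match
            return tokens[i + 1]
    return None


tokens = ["The", "cat", "sat", "on", "the", "mat", ".", "The", "cat"]
prediction = induction_predict(tokens)
print(prediction)  # sat
```

The match is case-sensitive, so the lowercase `the` before `mat` is ignored here.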

## Why Do They Matter?

Induction heads are believed to be the primary mechanism for:
- In-context learning
- Few-shot prompting
- Pattern completion

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from induction_head_detector import InductionHeadDetector, DetectorConfig

print("Libraries loaded!")

## 1. Create Test Data

In [None]:
# Model parameters
n_layers = 12
n_heads = 12
seq_len = 64

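# Generate random attention patterns; each row is normalized to sum to 1.
# Note: these rows are not causally masked, unlike attention in a decoder-only model.
attention = np.random.rand(n_layers, n_heads, seq_len, seq_len)
attention = attention / attention.sum(axis=-1, keepdims=True)

# Plant a synthetic head at layer 5, head 7.
# Caveat: attending to key i-1 is strictly a previous-token pattern. A true induction
# head attends to the token *after* an earlier copy of the current token. The
# sub-diagonal is used here only as a simple stand-in signal.
for i in range(1, seq_len):
    attention[5, 7, i, i-1] = 0.9  # Strong previous-token attention
# Renormalize so rows sum to 1 again (the planted weight ends up just under 0.5)
attention[5, 7] /= attention[5, 7].sum(axis=-1, keepdims=True)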

print(f"Created attention patterns: {attention.shape}")
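The loop above puts weight on key `i-1`, which is the simplest diagonal to plant. For comparison, here is a sketch of the stripe an induction head would be expected to produce on a sequence whose second half repeats its first half. The `period` variable, the 90/10 mass split, and the causal masking are all assumptions made for this illustration; whether `InductionHeadDetector` expects this layout is not stated in the tutorial.

```python
import numpy as np

seq_len = 64
period = seq_len // 2  # assumed: tokens[i] == tokens[i - period] for i >= period

rng = np.random.default_rng(0)
# Random causal attention: zero out keys after the query, then normalize each row
pattern = np.tril(rng.random((seq_len, seq_len)))
pattern /= pattern.sum(axis=-1, keepdims=True)

# Induction stripe: query i attends to key i - period + 1,
# i.e. the token that followed the earlier copy of token i
for i in range(period, seq_len):
    pattern[i, :] *= 0.1               # shrink the background to 10% of the row
    pattern[i, i - period + 1] += 0.9  # put the remaining 90% on the target key

row_sums = pattern.sum(axis=-1)
print(f"Rows sum to 1: {np.allclose(row_sums, 1.0)}")
```

Plotted with `imshow`, this pattern shows a bright line offset from the main diagonal by `period - 1` in the lower half of the matrix, rather than a line hugging the diagonal.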

## 2. Detect Induction Heads

In [None]:
# Configure detector
config = DetectorConfig(
    induction_threshold=0.3,
    analyze_composition=True,
)

detector = InductionHeadDetector(config)

# Detect induction heads
heads = detector.detect(attention)

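print(f"Found {len(heads)} induction heads:")
# Sort explicitly in case detect() does not return heads ordered by score
for head in sorted(heads, key=lambda h: h.score, reverse=True)[:5]:
    print(f"  Layer {head.layer}, Head {head.head}: score={head.score:.3f}")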
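The tutorial does not say how `detect` computes its score. One common attention-only heuristic is to average the weight a head places on a fixed diagonal and keep heads above a threshold. The sketch below shows that idea with hypothetical helpers `diagonal_score` and `detect_by_threshold`; it is not the library's implementation.

```python
import numpy as np


def diagonal_score(pattern, offset):
    """Mean attention that query i places on key i - offset (offset >= 1)."""
    # A negative offset in np.diagonal reads entries below the main diagonal
    return float(np.diagonal(pattern, offset=-offset).mean())


def detect_by_threshold(attention, offset, threshold=0.3):
    """Return (layer, head, score) for heads scoring above threshold, best first."""
    found = []
    n_layers, n_heads = attention.shape[:2]
    for layer in range(n_layers):
        for head in range(n_heads):
            score = diagonal_score(attention[layer, head], offset)
            if score > threshold:
                found.append((layer, head, score))
    return sorted(found, key=lambda item: item[2], reverse=True)


# Toy input: uniform attention everywhere, plus one head with all mass on key i-1
demo = np.full((2, 3, 8, 8), 1 / 8)
demo[1, 2] = np.eye(8, k=-1)
demo[1, 2, 0, 0] = 1.0  # the first query has no previous key, so it attends to itself
detected = detect_by_threshold(demo, offset=1)
print(detected)
```

With `offset=1` this measures previous-token attention. For a repeated sequence of period `P`, `offset=P - 1` would measure the induction stripe instead.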

## 3. Visualize Attention Pattern

In [None]:
# Visualize the induction head we created
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Induction head
im1 = axes[0].imshow(attention[5, 7, :32, :32], cmap='viridis')
axes[0].set_title('Layer 5, Head 7 (Induction Head)')
axes[0].set_xlabel('Key Position')
axes[0].set_ylabel('Query Position')
plt.colorbar(im1, ax=axes[0])

# Random head for comparison
im2 = axes[1].imshow(attention[0, 0, :32, :32], cmap='viridis')
axes[1].set_title('Layer 0, Head 0 (Random)')
axes[1].set_xlabel('Key Position')
axes[1].set_ylabel('Query Position')
plt.colorbar(im2, ax=axes[1])

plt.tight_layout()
plt.show()

## 4. Analyze Head Properties

In [None]:
# Analyze the highest-scoring detected head
if heads:
    top_head = max(heads, key=lambda h: h.score)
    analysis = detector.analyze_head(
        attention[top_head.layer, top_head.head],
        layer=top_head.layer,
        head=top_head.head,
    )
    
    print("Induction Head Analysis:")
    print(f"  Pattern type: {analysis.pattern_type}")
    print(f"  Copying strength: {analysis.copying_strength:.3f}")
    print(f"  Previous token attention: {analysis.prev_token_attention:.3f}")

## 5. Score All Heads

In [None]:
# Create heatmap of induction scores.
# Only heads returned by detect() get a value; all other heads stay at 0.
scores = np.zeros((n_layers, n_heads))
for head in heads:
    scores[head.layer, head.head] = head.score

plt.figure(figsize=(10, 8))
plt.imshow(scores, cmap='Reds', aspect='auto')
plt.colorbar(label='Induction Score')
plt.xlabel('Head')
plt.ylabel('Layer')
plt.title('Induction Scores by Layer and Head')
plt.show()
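A heatmap is easy to scan but hard to read exact rankings from. The sketch below pulls the top entries out of a `(n_layers, n_heads)` score matrix. `top_k_heads` is a hypothetical helper, and the values in `demo_scores` are made up for illustration rather than produced by the detector.

```python
import numpy as np


def top_k_heads(scores, k=5):
    """Return the k highest-scoring (layer, head, score) triples, best first."""
    # Sort the flattened matrix, reverse to descending order, keep the first k
    flat_order = np.argsort(scores, axis=None)[::-1][:k]
    layers, head_ids = np.unravel_index(flat_order, scores.shape)
    return [(int(l), int(h), float(scores[l, h])) for l, h in zip(layers, head_ids)]


# Illustrative values only
demo_scores = np.zeros((12, 12))
demo_scores[5, 7] = 0.8
demo_scores[9, 2] = 0.4
ranking = top_k_heads(demo_scores, k=2)
for layer, head, score in ranking:
    print(f"Layer {layer}, Head {head}: score={score:.3f}")
```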

## Conclusion

This tutorial covered:
1. How induction heads work (pattern matching + copying)
2. How to detect induction heads via attention pattern analysis
3. How to visualize and compare attention patterns
4. How to score heads for induction behavior

For real models, use the full API to analyze actual attention weights!