# Day 23: Interpretability Dashboard

## 🔬 Objective
Visualize **attention weights** to understand how a model routes information between token positions. For each position, the weights answer the question: "While processing token X, which earlier tokens was I looking at?"
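
To make the idea concrete, here is a minimal NumPy sketch of where such weights come from in scaled dot-product attention with a causal mask. The function name `causal_attention_weights` and the random `q`/`k` inputs are illustrative assumptions, not part of this project's code.

```python
import numpy as np

def causal_attention_weights(q, k):
    """Return a (seq_len, seq_len) matrix of attention weights.

    Row i holds the weights that position i places on positions 0..i.
    `q` and `k` are hypothetical query/key matrices of shape (seq_len, d_k).
    """
    seq_len, d_k = q.shape
    scores = q @ k.T / np.sqrt(d_k)

    # Causal mask: position i may not look at any position j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Row-wise softmax (subtract the row max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Illustrative random inputs: 4 positions, 8-dimensional queries and keys.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
weights = causal_attention_weights(q, k)
print(weights.round(2))
```

Each row sums to 1 and every entry above the diagonal is 0, which is the shape of matrix a dashboard like this one would display per layer and head.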

## 🧠 Induction Heads
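One key circuit in LLMs is the "induction head", which completes repeated patterns by copying what followed an earlier occurrence. E.g., if the text contains "Harry Potter", then encountering "Harry" again makes the model attend back to "Potter" (the token that followed the first "Harry") and predict it as the next token.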
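
One rough way to quantify this behavior is to measure how much weight each repeated token places on the position right after its previous occurrence. The sketch below is an illustrative heuristic; `induction_targets`, `induction_score`, and the example matrix are hypothetical and not part of `AttentionVisualizer`.

```python
import numpy as np

def induction_targets(tokens):
    """Map each repeated token's position to the position that followed
    its most recent earlier occurrence."""
    last_seen = {}
    targets = {}
    for i, tok in enumerate(tokens):
        # Skip immediate repeats, where the "next" position is the token itself.
        if tok in last_seen and last_seen[tok] + 1 < i:
            targets[i] = last_seen[tok] + 1
        last_seen[tok] = i
    return targets

def induction_score(tokens, attn):
    """Average attention weight placed on the induction targets (0.0 if none)."""
    targets = induction_targets(tokens)
    if not targets:
        return 0.0
    return float(np.mean([attn[q, k] for q, k in targets.items()]))

tokens = ["Harry", "Potter", "ate", "Harry", "Potter"]
print(induction_targets(tokens))  # {3: 1, 4: 2}

# Hypothetical head that follows the induction pattern closely.
attn = np.zeros((5, 5))
attn[3, 1] = 0.9  # 2nd "Harry" looks at the token after the 1st "Harry"
attn[4, 2] = 0.8  # 2nd "Potter" looks at the token after the 1st "Potter"
print(induction_score(tokens, attn))
```

A score near 1 suggests the head behaves like an induction head on this sequence; a score near 0 suggests it does not.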

```python
import os
import sys

import numpy as np

# Make the repository root importable when running from the notebook's directory.
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../../")))

from src.observability.interpretability import AttentionVisualizer
```

### Step 1: Simulate Attention
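Create a sequence that follows an [A, B, ..., A, B] pattern: "Harry", "Potter", "ate", "Harry", "Potter". In this simplified simulation, the row for the 2nd "Potter" (index 4) places most of its weight on the 1st "Potter" (index 1). Note that in a real induction head the query position would be the 2nd "Harry" (index 3), which attends to index 1 in order to predict the token at index 4.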

```python
tokens = ["Harry", "Potter", "ate", "Harry", "Potter"]
viz = AttentionVisualizer(tokens)

# Build a hand-crafted 5x5 attention matrix for our simulated induction head.
# Row i holds the weights that token i places on every token j.
matrix = np.zeros((5, 5))

# Causal mask: a token cannot attend to later positions. It is not enforced
# explicitly here, but both entries set below lie on or below the diagonal.

# Token 4 (2nd 'Potter') attends strongly to token 1 (1st 'Potter')
matrix[4, 1] = 0.95
matrix[4, 4] = 0.05  # Remaining weight goes to self-attention

viz.add_attention(layer=1, head=1, matrix=matrix)
```
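
The matrix above skips the causal mask and leaves most rows empty. If a fully valid attention matrix is wanted, one option is to mask and normalize it first. The helper below is a hypothetical sketch (`causal_normalize` is not part of `AttentionVisualizer`), and it assumes non-negative weights and that rows with no weight should fall back to pure self-attention.

```python
import numpy as np

def causal_normalize(matrix):
    """Zero out future positions and rescale each row to sum to 1."""
    # np.tril returns a copy that keeps entries on or below the diagonal.
    m = np.tril(np.asarray(matrix, dtype=float))

    # Assumption: a row with no weight at all attends only to itself.
    empty = np.flatnonzero(m.sum(axis=1) == 0)
    m[empty, empty] = 1.0

    return m / m.sum(axis=1, keepdims=True)

raw = np.zeros((5, 5))
raw[4, 1] = 0.95
raw[4, 4] = 0.05
cleaned = causal_normalize(raw)
print(cleaned.round(2))
```

The result can be passed wherever the raw matrix would be used; row 4 keeps its 0.95 / 0.05 split, and rows 0 through 3 become pure self-attention.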

### Step 2: Analyze Focus
Check where the last token is looking.

```python
focus = viz.get_strongest_focus(token_index=4, layer=1, head=1)
print(
    f"Token '{focus['source_token']}' (idx 4) is focusing on "
    f"'{focus['focused_token']}' (idx {focus['focused_index']}) "
    f"with weight {focus['weight']}"
)
```
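
For intuition, the lookup above can be approximated with a plain `argmax` over one row of the matrix. This is a hypothetical stand-in that mirrors the dictionary keys used above; the actual `get_strongest_focus` implementation may differ.

```python
import numpy as np

def strongest_focus(tokens, matrix, token_index):
    """Return the token that `token_index` attends to most strongly."""
    row = np.asarray(matrix)[token_index]
    focused_index = int(np.argmax(row))  # first index of the largest weight
    return {
        "source_token": tokens[token_index],
        "focused_token": tokens[focused_index],
        "focused_index": focused_index,
        "weight": float(row[focused_index]),
    }

tokens = ["Harry", "Potter", "ate", "Harry", "Potter"]
matrix = np.zeros((5, 5))
matrix[4, 1] = 0.95
matrix[4, 4] = 0.05

result = strongest_focus(tokens, matrix, token_index=4)
print(result)
```

With the simulated matrix, the strongest focus of token 4 is index 1 ("Potter") with weight 0.95.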