# Experiment 035B: Context-Controlled Excitation (Extended)

**AKIRA Project - Oscar Goldman - Shogu Research Group @ Datamutant.ai**

---

## The Core Insight: AQ Are About DISCRIMINATION

From `ACTION_QUANTA.md`:

```
An Action Quantum (AQ) is:
  THE MINIMUM PATTERN THAT ENABLES CORRECT DECISION
  
Key aspects:
  - MINIMUM: Cannot be reduced further
  - PATTERN: Has internal structure
  - ENABLES: Makes action possible
  - CORRECT: Leads to appropriate response
  - DECISION: Discriminates between alternatives
```

The key word is **DISCRIMINATION**. An AQ must discriminate between action alternatives.

---

## Why The Previous 035B Was Weak

We measured "The answer is" - a structural phrase that appears in many contexts.
This phrase doesn't DISCRIMINATE actions. It's scaffolding, not content.

From `LANGUAGE_ACTION_CONTEXT.md`:

```
AQ don't exist in the signal alone.
AQ don't exist in the context alone.
AQ CRYSTALLIZE from signal + context interaction.
```

The phrase "The answer is" is AQ-neutral. It doesn't carry discrimination.

---

## What We Need to Test

We need to test at the **DECISION POINT** - where the model must commit to a specific output.

Three approaches:

1. **Same word, different required action** - "bank" in finance vs river context
2. **Last token before prediction** - where AQ must crystallize
3. **Disambiguation points** - where context resolves ambiguity

---

## 1. Setup

In [None]:
# Install dependencies (needed on Colab; comment out if already installed)
!pip install transformers torch numpy scikit-learn matplotlib seaborn -q

In [None]:
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score, pairwise_distances
from sklearn.preprocessing import StandardScaler
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass, field
import warnings

warnings.filterwarnings('ignore')

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {DEVICE}")
print(f"PyTorch version: {torch.__version__}")
if DEVICE == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

## 2. Configuration

In [None]:
@dataclass
class ExperimentConfig:
    """Configuration for extended context-controlled excitation experiment."""
    model_name: str = "gpt2-medium"  # Larger model for A100
    layers_to_probe: List[int] = field(default_factory=list)
    random_seed: int = 42
    
    def __post_init__(self) -> None:
        if not self.layers_to_probe:
            if "gpt2-medium" in self.model_name.lower():
                # GPT-2 Medium has 24 layers
                self.layers_to_probe = [0, 4, 8, 12, 16, 20, 23]
            elif "gpt2-large" in self.model_name.lower():
                # GPT-2 Large has 36 layers
                self.layers_to_probe = [0, 6, 12, 18, 24, 30, 35]
            elif "gpt2-xl" in self.model_name.lower():
                # GPT-2 XL has 48 layers
                self.layers_to_probe = [0, 8, 16, 24, 32, 40, 47]
            elif "gpt2" in self.model_name.lower():
                self.layers_to_probe = [0, 3, 6, 9, 11]
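            else:
                # Unknown model: fall back to the 24-layer spacing.
                # ActivationCapture asserts that the model is deep enough.
                self.layers_to_probe = [0, 4, 8, 12, 16, 20, 23]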
        
        np.random.seed(self.random_seed)
        torch.manual_seed(self.random_seed)


config = ExperimentConfig()
print(f"Model: {config.model_name}")
print(f"Layers to probe: {config.layers_to_probe}")
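
The hardcoded lists above all follow the same pattern: a fixed step from layer 0, plus the final layer. A minimal sketch of that rule as a function, which could replace the fallback branch for models of other depths. `evenly_spaced_layers` is a hypothetical helper, not part of the experiment code; the depth would come from something like `len(model.transformer.h)`.

```python
from typing import List


def evenly_spaced_layers(n_layers: int, n_probes: int) -> List[int]:
    """Pick n_probes layer indices: fixed step from layer 0, ending at the last layer."""
    if n_probes < 2 or n_layers < n_probes:
        raise ValueError("need 2 <= n_probes <= n_layers")
    step = n_layers // (n_probes - 1)
    layers = list(range(0, n_layers - 1, step))[: n_probes - 1]
    layers.append(n_layers - 1)  # always include the final layer
    return layers


print(evenly_spaced_layers(24, 7))  # same depth as GPT-2 Medium
print(evenly_spaced_layers(12, 5))  # same depth as GPT-2 small
```

For depths 12, 24, 36 and 48 this reproduces the lists in `__post_init__`; for other depths the spacing before the final layer may be uneven.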

## 3. Activation Capture

In [None]:
class ActivationCapture:
    """Captures activations from specified layers using forward hooks."""
    
    def __init__(self, model: nn.Module, layer_indices: List[int]) -> None:
        assert len(layer_indices) > 0, "Must specify at least one layer to probe"
        
        self.activations: Dict[int, torch.Tensor] = {}
        self.hooks: List[torch.utils.hooks.RemovableHandle] = []
        self.layer_indices = layer_indices
        
        if hasattr(model, 'transformer'):
            layers = model.transformer.h
        elif hasattr(model, 'gpt_neox'):
            layers = model.gpt_neox.layers
        else:
            raise ValueError(f"Unknown model architecture: {type(model)}")
        
        assert len(layers) > max(layer_indices), \
            f"Model has {len(layers)} layers but requested layer {max(layer_indices)}"
        
        for idx in layer_indices:
            layer = layers[idx]
            hook = layer.register_forward_hook(self._make_hook(idx))
            self.hooks.append(hook)
        
        print(f"Registered hooks on layers: {layer_indices}")
    
    def _make_hook(self, layer_idx: int):
        def hook(module, input, output):
            if isinstance(output, tuple):
                self.activations[layer_idx] = output[0].detach()
            else:
                self.activations[layer_idx] = output.detach()
        return hook
    
    def clear(self) -> None:
        self.activations = {}
    
    def remove_hooks(self) -> None:
        for hook in self.hooks:
            hook.remove()
        self.hooks = []
    
    def get_last_token_activation(self, layer_idx: int) -> np.ndarray:
        """Get activation at the LAST token (decision point)."""
        assert layer_idx in self.activations, f"Layer {layer_idx} not captured"
        act = self.activations[layer_idx]
        return act[0, -1, :].cpu().numpy()
    
    def get_token_activation(self, layer_idx: int, position: int) -> np.ndarray:
        """Get activation at a specific token position."""
        assert layer_idx in self.activations, f"Layer {layer_idx} not captured"
        act = self.activations[layer_idx]
        return act[0, position, :].cpu().numpy()
    
    def get_all_activations(self, layer_idx: int) -> np.ndarray:
        """Get all token activations."""
        assert layer_idx in self.activations, f"Layer {layer_idx} not captured"
        return self.activations[layer_idx][0].cpu().numpy()

## 4. Analysis Functions

In [None]:
def compute_metrics(activations: np.ndarray, labels: List[str]) -> Dict[str, float]:
    """Compute clustering metrics."""
    within_distances = []
    between_distances = []
    
    distances = pairwise_distances(activations, metric='euclidean')
    
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] == labels[j]:
                within_distances.append(distances[i, j])
            else:
                between_distances.append(distances[i, j])
    
    within_mean = np.mean(within_distances) if within_distances else 0
    between_mean = np.mean(between_distances) if between_distances else 0
    ratio = between_mean / within_mean if within_mean > 0 else float('inf')
    
    # Silhouette
    unique_labels = list(set(labels))
    label_to_int = {l: i for i, l in enumerate(unique_labels)}
    int_labels = [label_to_int[l] for l in labels]
    silhouette = silhouette_score(activations, int_labels) if len(set(int_labels)) >= 2 else 0.0
    
    return {
        'within_distance': within_mean,
        'between_distance': between_mean,
        'distance_ratio': ratio,
        'silhouette': silhouette
    }


def run_pca(activations: np.ndarray, n_components: int = 2) -> Tuple[np.ndarray, float]:
    """Apply PCA and return reduced activations and explained variance."""
    scaler = StandardScaler()
    scaled = scaler.fit_transform(activations)
    pca = PCA(n_components=n_components, random_state=42)
    reduced = pca.fit_transform(scaled)
    explained = sum(pca.explained_variance_ratio_) * 100
    return reduced, explained
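
To make the two metrics concrete, here is a toy example on four 2-D points in two well-separated clusters. It is a dependency-free sketch that mirrors what `compute_metrics` reports, not the notebook's implementation: `distance_ratio` and `silhouette` are hypothetical stand-ins written with the standard library, and the silhouette follows the usual definition (b - a) / max(a, b) per point.

```python
from math import dist
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def distance_ratio(points: Sequence[Point], labels: Sequence[str]) -> float:
    """Mean between-category distance divided by mean within-category distance."""
    within, between = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            (within if labels[i] == labels[j] else between).append(d)
    return mean(between) / mean(within)


def silhouette(points: Sequence[Point], labels: Sequence[str]) -> float:
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, p in enumerate(points):
        by_label: Dict[str, List[float]] = {}
        for j, q in enumerate(points):
            if j != i:
                by_label.setdefault(labels[j], []).append(dist(p, q))
        own = by_label.pop(labels[i], None)
        if not own or not by_label:
            scores.append(0.0)  # singleton cluster or only one category
            continue
        a = mean(own)                                   # cohesion
        b = min(mean(ds) for ds in by_label.values())   # nearest other category
        scores.append((b - a) / max(a, b))
    return mean(scores)


toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = ["a", "a", "b", "b"]
print(f"distance ratio: {distance_ratio(toy_points, toy_labels):.2f}")
print(f"silhouette:     {silhouette(toy_points, toy_labels):.2f}")
```

A ratio near 1 and a silhouette near 0 would indicate that the categories are not separated; the toy clusters give a ratio of about 10 and a silhouette of about 0.9.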

## 5. Visualization

In [None]:
def plot_scatter(activations_2d: np.ndarray, labels: List[str], 
                 colors: Dict[str, str], title: str, texts: Optional[List[str]] = None) -> None:
    """Create scatter plot with optional annotations."""
    plt.figure(figsize=(12, 10))
    
    for category in colors:
        mask = [l == category for l in labels]
        if any(mask):
            indices = [i for i, m in enumerate(mask) if m]
            plt.scatter(
                activations_2d[indices, 0],
                activations_2d[indices, 1],
                c=colors[category],
                label=category.replace('_', ' '),
                alpha=0.7,
                s=150,
                edgecolors='white',
                linewidth=1
            )
            
            # Add text annotations if provided
            if texts:
                for idx in indices:
                    # Truncate text for display
                    short_text = texts[idx][:30] + "..." if len(texts[idx]) > 30 else texts[idx]
                    plt.annotate(short_text, 
                                (activations_2d[idx, 0], activations_2d[idx, 1]),
                                fontsize=7, alpha=0.6)
    
    plt.xlabel('PC1', fontsize=12)
    plt.ylabel('PC2', fontsize=12)
    plt.title(title, fontsize=14)
    plt.legend(loc='best', fontsize=10)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()


def plot_similarity_heatmap(activations: np.ndarray, labels: List[str], title: str) -> None:
    """Create similarity heatmap."""
    norms = np.linalg.norm(activations, axis=1, keepdims=True)
    normalized = activations / (norms + 1e-8)
    similarity = normalized @ normalized.T
    
    sorted_idx = sorted(range(len(labels)), key=lambda i: labels[i])
    similarity_sorted = similarity[sorted_idx][:, sorted_idx]
    labels_sorted = [labels[i] for i in sorted_idx]
    
    plt.figure(figsize=(14, 12))
    sns.heatmap(similarity_sorted, cmap='RdYlBu_r', vmin=-1, vmax=1, square=True,
                cbar_kws={'label': 'Cosine Similarity'})
    
    # Category boundaries
    unique = []
    boundaries = [0]
    for i, l in enumerate(labels_sorted):
        if l not in unique:
            unique.append(l)
            if i > 0:
                boundaries.append(i)
    boundaries.append(len(labels_sorted))
    
    for b in boundaries[1:-1]:
        plt.axhline(y=b, color='black', linewidth=2)
        plt.axvline(x=b, color='black', linewidth=2)
    
    plt.title(title, fontsize=14)
    plt.tight_layout()
    plt.show()


def plot_layer_progression(metrics_by_layer: Dict[int, Dict[str, float]], title: str) -> None:
    """Plot metrics across layers."""
    layers = sorted(metrics_by_layer.keys())
    silhouettes = [metrics_by_layer[l]['silhouette'] for l in layers]
    ratios = [metrics_by_layer[l]['distance_ratio'] for l in layers]
    
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    axes[0].plot(layers, silhouettes, 'o-', color='#3498db', linewidth=2, markersize=10)
    axes[0].set_xlabel('Layer Index', fontsize=12)
    axes[0].set_ylabel('Silhouette Score', fontsize=12)
    axes[0].set_title(f'{title} - Clustering Quality', fontsize=14)
    axes[0].grid(True, alpha=0.3)
    axes[0].set_ylim(-0.2, 1.0)
    
    axes[1].plot(layers, ratios, 'o-', color='#e74c3c', linewidth=2, markersize=10)
    axes[1].set_xlabel('Layer Index', fontsize=12)
    axes[1].set_ylabel('Between/Within Distance Ratio', fontsize=12)
    axes[1].set_title(f'{title} - Category Separation', fontsize=14)
    axes[1].grid(True, alpha=0.3)
    axes[1].axhline(y=1.0, color='gray', linestyle='--', alpha=0.5)
    
    plt.tight_layout()
    plt.show()

## 6. Load Model

In [None]:
print(f"Loading {config.model_name}...")
tokenizer = AutoTokenizer.from_pretrained(config.model_name)
model = AutoModelForCausalLM.from_pretrained(config.model_name)
model = model.to(DEVICE)
model.eval()

print(f"Model loaded on {DEVICE}")
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")
print(f"Layers: {len(model.transformer.h)}")

---

# EXPERIMENT A: Polysemous Words at Decision Point

## The Test

The word "bank" can mean:
- Financial institution
- River edge
- To tilt (aircraft)
- To rely on

The probes below cover the first two senses of "bank", plus two senses each of "spring" and "bat".

Same word. Different DISCRIMINATIONS required. Different ACTIONS enabled.

If AQ theory is correct:
- The SAME token "bank" should produce DIFFERENT activations
- Activations should cluster by MEANING (action required), not by token identity

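We measure at the LAST TOKEN (decision point) where the model must commit. In most probes this token is not the polysemous word itself (e.g. "We sat on the bank watching the river" ends in "river"), so the measurement reflects the sentence-level context.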
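
To read the activation at the polysemous word rather than at the end of the sentence, the word's token index is needed. A minimal sketch, assuming character offsets of the kind a Hugging Face fast tokenizer returns with `return_offsets_mapping=True`. `find_target_token` and `whitespace_offsets` are hypothetical helpers, and the whitespace offsets only stand in for real tokenizer output so the example runs without a model.

```python
from typing import List, Tuple

Offsets = List[Tuple[int, int]]


def whitespace_offsets(text: str) -> Offsets:
    """Stand-in for tokenizer offsets: one (start, end) span per word."""
    offsets, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        offsets.append((start, start + len(word)))
        pos = start + len(word)
    return offsets


def find_target_token(prompt: str, target: str, offsets: Offsets) -> int:
    """Index of the last token overlapping the last occurrence of `target`."""
    start = prompt.lower().rfind(target.lower())
    if start < 0:
        raise ValueError(f"{target!r} not found in prompt")
    end = start + len(target)
    hits = [i for i, (s, e) in enumerate(offsets) if s < end and e > start]
    if not hits:
        raise ValueError("no token overlaps the target span")
    return hits[-1]


prompt = "We sat on the bank watching the river"
offsets = whitespace_offsets(prompt)
position = find_target_token(prompt, "bank", offsets)
print(f"target token index: {position}, last token index: {len(offsets) - 1}")
```

The resulting index could be passed to `get_token_activation(layer_idx, position)`. In a causal model such as GPT-2, the activation at that position only sees the preceding tokens, so disambiguating context that follows the word is invisible there. The substring match is also naive: "bank" would match inside "banker".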

In [None]:
# Polysemous word probes - same word, different meanings
POLYSEMY_PROBES = {
    'bank_financial': [
        "I need to deposit money at the bank",
        "The bank approved my loan application",
        "She works as a teller at the bank",
        "The bank charges high interest rates",
        "I opened a savings account at the bank",
        "The bank is closed on Sundays",
        "He withdrew cash from the bank",
        "The bank sent my monthly statement",
    ],
    'bank_river': [
        "We sat on the bank watching the river",
        "The fisherman stood on the bank",
        "Flowers grew along the river bank",
        "The children played on the bank",
        "We walked along the bank of the stream",
        "Trees lined the bank on both sides",
        "The boat was tied to the bank",
        "Ducks swam near the bank",
    ],
    'spring_season': [
        "The flowers bloom in spring",
        "Spring is my favorite season",
        "The birds return in spring",
        "We plant gardens in spring",
        "Spring brings warmer weather",
        "The days get longer in spring",
        "Spring rain helps the crops grow",
        "Children play outside in spring",
    ],
    'spring_coil': [
        "The spring in the mattress broke",
        "A spring provides tension in the mechanism",
        "The spring bounced back into shape",
        "He compressed the spring tightly",
        "The spring stores mechanical energy",
        "A broken spring caused the malfunction",
        "The spring mechanism needs repair",
        "Metal springs are used in cars",
    ],
    'bat_animal': [
        "A bat flew out of the cave",
        "Bats are nocturnal mammals",
        "The bat uses echolocation",
        "We saw a bat hanging upside down",
        "The bat caught insects in flight",
        "Bats sleep during the day",
        "A bat colony lives in the attic",
        "The bat spread its wings",
    ],
    'bat_sports': [
        "He swung the bat and hit a home run",
        "The baseball bat was made of wood",
        "She gripped the bat tightly",
        "The bat cracked on impact",
        "He practiced with the bat for hours",
        "The bat connected with the ball",
        "A new bat improved his hitting",
        "The bat felt heavy in his hands",
    ],
}

POLYSEMY_COLORS = {
    'bank_financial': '#3498db',  # blue
    'bank_river': '#2ecc71',      # green
    'spring_season': '#f39c12',   # orange
    'spring_coil': '#9b59b6',     # purple
    'bat_animal': '#e74c3c',      # red
    'bat_sports': '#1abc9c',      # teal
}

print(f"Polysemy probes: {len(POLYSEMY_PROBES)} categories")
print(f"Total probes: {sum(len(v) for v in POLYSEMY_PROBES.values())}")

In [None]:
# Run polysemy experiment
capture = ActivationCapture(model, config.layers_to_probe)

polysemy_activations = {layer: [] for layer in config.layers_to_probe}
polysemy_labels = []
polysemy_texts = []

print("Running polysemy probes (measuring at LAST token - decision point)...")

for category, prompts in POLYSEMY_PROBES.items():
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(DEVICE)
        
        capture.clear()
        with torch.no_grad():
            outputs = model(**inputs)
        
        # Get LAST token activation (decision point)
        for layer_idx in config.layers_to_probe:
            act = capture.get_last_token_activation(layer_idx)
            polysemy_activations[layer_idx].append(act)
        
        polysemy_labels.append(category)
        polysemy_texts.append(prompt)

for layer_idx in config.layers_to_probe:
    polysemy_activations[layer_idx] = np.array(polysemy_activations[layer_idx])

print(f"Collected {len(polysemy_labels)} activations")

In [None]:
# Analyze polysemy results
print("=" * 70)
print("EXPERIMENT A: POLYSEMOUS WORDS AT DECISION POINT")
print("=" * 70)
print("\nQuestion: Does the SAME word produce DIFFERENT activations")
print("          based on the DISCRIMINATION required?\n")

polysemy_metrics = {}
polysemy_pca = {}

for layer_idx in config.layers_to_probe:
    print(f"--- Layer {layer_idx} ---")
    
    acts = polysemy_activations[layer_idx]
    metrics = compute_metrics(acts, polysemy_labels)
    polysemy_metrics[layer_idx] = metrics
    
    print(f"Silhouette: {metrics['silhouette']:.4f}")
    print(f"Distance ratio: {metrics['distance_ratio']:.4f}")
    
    pca_2d, explained = run_pca(acts)
    polysemy_pca[layer_idx] = pca_2d
    print(f"PCA variance explained: {explained:.1f}%\n")

In [None]:
# Visualize polysemy results
final_layer = config.layers_to_probe[-1]

plot_scatter(polysemy_pca[final_layer], polysemy_labels, POLYSEMY_COLORS,
             f"Polysemous Words - Layer {final_layer} (Decision Point)")

In [None]:
# Show early vs late layer comparison
early_layer = config.layers_to_probe[1]  # Second probed layer (layer 4 for gpt2-medium)
plot_scatter(polysemy_pca[early_layer], polysemy_labels, POLYSEMY_COLORS,
             f"Polysemous Words - Layer {early_layer} (Early)")

In [None]:
plot_similarity_heatmap(polysemy_activations[final_layer], polysemy_labels,
                        f"Polysemy Similarity - Layer {final_layer}")

In [None]:
plot_layer_progression(polysemy_metrics, "Polysemy")

---

# EXPERIMENT B: Required Action Discrimination

## The Test

Different contexts require different ACTIONS:
- Arithmetic: compute a number
- Yes/No: produce true/false
- Completion: continue text naturally
- Fact recall: state a known fact

Same final structure "...is" but different AQ must crystallize because different DISCRIMINATIONS are required.
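
The claim that every category shares the same final word can be checked mechanically before running the model. A minimal sketch: `final_word_counts` is a hypothetical helper, and `sample` copies two prompts per category from the probe set used in this experiment.

```python
from collections import Counter
from typing import Dict, List


def final_word_counts(probes: Dict[str, List[str]]) -> Counter:
    """Count the last whitespace-separated word of every prompt."""
    return Counter(p.split()[-1] for prompts in probes.values() for p in prompts)


sample = {
    "compute_number": ["2 + 3 is", "Half of 20 is"],
    "answer_yesno": ["The sky is blue. This is", "Fire is cold. This is"],
    "complete_sentence": ["Life is", "Music is"],
    "provide_fact": ["The capital of France is", "The largest planet is"],
}

counts = final_word_counts(sample)
print(counts)
assert set(counts) == {"is"}, "every probe should end in the same word"
```

With an identical final word, separation at the last position cannot come from token identity. It has to come from the preceding context and, because the prompts differ in length, partly from GPT-2's position embeddings.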

In [None]:
# Action discrimination probes
ACTION_PROBES = {
    'compute_number': [
        "2 + 3 is",
        "10 - 4 is",
        "5 times 2 is",
        "8 divided by 2 is",
        "The sum of 7 and 3 is",
        "Half of 20 is",
        "3 squared is",
        "The product of 4 and 5 is",
    ],
    'answer_yesno': [
        "The sky is blue. This is",
        "Fire is cold. This is",
        "Water is wet. This is",
        "Ice is hot. This is",
        "Grass is green. This is",
        "Snow is black. This is",
        "The sun is bright. This is",
        "Night is dark. This is",
    ],
    'complete_sentence': [
        "The weather today is",
        "My favorite color is",
        "The best food is",
        "Life is",
        "The world is",
        "Music is",
        "Love is",
        "Time is",
    ],
    'provide_fact': [
        "The capital of France is",
        "The largest planet is",
        "The chemical formula for water is",
        "The speed of light is",
        "The first president of the USA is",
        "The tallest mountain is",
        "The deepest ocean is",
        "The longest river is",
    ],
}

ACTION_COLORS = {
    'compute_number': '#3498db',   # blue
    'answer_yesno': '#e74c3c',     # red
    'complete_sentence': '#2ecc71', # green
    'provide_fact': '#9b59b6',     # purple
}

print(f"Action probes: {len(ACTION_PROBES)} categories")
print(f"Total probes: {sum(len(v) for v in ACTION_PROBES.values())}")

In [None]:
# Run action discrimination experiment
action_activations = {layer: [] for layer in config.layers_to_probe}
action_labels = []
action_texts = []

print("Running action discrimination probes...")

for category, prompts in ACTION_PROBES.items():
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(DEVICE)
        
        capture.clear()
        with torch.no_grad():
            outputs = model(**inputs)
        
        for layer_idx in config.layers_to_probe:
            act = capture.get_last_token_activation(layer_idx)
            action_activations[layer_idx].append(act)
        
        action_labels.append(category)
        action_texts.append(prompt)

for layer_idx in config.layers_to_probe:
    action_activations[layer_idx] = np.array(action_activations[layer_idx])

print(f"Collected {len(action_labels)} activations")

In [None]:
# Analyze action results
print("=" * 70)
print("EXPERIMENT B: ACTION DISCRIMINATION")
print("=" * 70)
print("\nQuestion: Do different REQUIRED ACTIONS produce different AQ patterns?\n")

action_metrics = {}
action_pca = {}

for layer_idx in config.layers_to_probe:
    print(f"--- Layer {layer_idx} ---")
    
    acts = action_activations[layer_idx]
    metrics = compute_metrics(acts, action_labels)
    action_metrics[layer_idx] = metrics
    
    print(f"Silhouette: {metrics['silhouette']:.4f}")
    print(f"Distance ratio: {metrics['distance_ratio']:.4f}")
    
    pca_2d, explained = run_pca(acts)
    action_pca[layer_idx] = pca_2d
    print(f"PCA variance explained: {explained:.1f}%\n")

In [None]:
# Visualize action results
plot_scatter(action_pca[final_layer], action_labels, ACTION_COLORS,
             f"Action Discrimination - Layer {final_layer}")

In [None]:
plot_similarity_heatmap(action_activations[final_layer], action_labels,
                        f"Action Discrimination Similarity - Layer {final_layer}")

In [None]:
plot_layer_progression(action_metrics, "Action Discrimination")

---

# EXPERIMENT C: Disambiguation at Critical Token

## The Test

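Some sentences have ambiguity that gets resolved at a specific token.
The AQ should crystallize differently based on how ambiguity resolves.
As in Experiments A and B, activations are read at the last token of the full sentence, after the ambiguity has been resolved.

Example of the idea (the probes below use "Flying planes", "Time flies", and "Visiting relatives" instead):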
- "I saw her duck" - could be the animal or the action
- "I saw her duck under the fence" - action resolved
- "I saw her duck swimming in the pond" - animal resolved
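
One rough way to locate where two readings part ways is to find the first word at which a pair of probes differs. A minimal sketch under a strong assumption: the first divergent word is only a proxy for the point where the reading is actually resolved, and it works at word level rather than on model tokens. `first_divergence` is a hypothetical helper.

```python
def first_divergence(a: str, b: str) -> int:
    """Index of the first word at which two sentences differ."""
    words_a, words_b = a.split(), b.split()
    for i, (x, y) in enumerate(zip(words_a, words_b)):
        if x != y:
            return i
    # One sentence is a prefix of the other (or they are identical).
    return min(len(words_a), len(words_b))


activity = "Flying planes can be dangerous for pilots"
aircraft = "Flying planes can be seen from the ground"
idx = first_divergence(activity, aircraft)
print(idx, activity.split()[idx], aircraft.split()[idx])
```

For this pair the sentences share "Flying planes can be" and diverge at "dangerous" versus "seen".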

In [None]:
# Disambiguation probes
DISAMBIGUATION_PROBES = {
    'flying_planes_people': [  # gerund reading: the activity of piloting planes
        "Flying planes can be dangerous for pilots",
        "Flying planes require skilled operators",
        "Flying planes involves risk for the crew",
        "Flying planes takes training and practice",
    ],
    'flying_planes_aircraft': [  # participle reading: planes that are in flight
        "Flying planes can be seen from the ground",
        "Flying planes leave contrails in the sky",
        "Flying planes are visible overhead",
        "Flying planes cross the ocean daily",
    ],
    'time_flies_speed': [
        "Time flies when you are having fun",
        "Time flies during enjoyable activities",
        "Time flies while playing games",
        "Time flies on vacation",
    ],
    'time_flies_insects': [
        "Time flies like fruit and sweet things",
        "Time flies are attracted to rotting food",
        "Time flies buzz around the garbage",
        "Time flies are a type of small insect",
    ],
    'visiting_relatives_action': [
        "Visiting relatives can be tiring for hosts",
        "Visiting relatives requires hospitality",
        "Visiting relatives means preparing guest rooms",
        "Visiting relatives is a social obligation",
    ],
    'visiting_relatives_people': [
        "Visiting relatives bring gifts and stories",
        "Visiting relatives arrived from out of town",
        "Visiting relatives stayed for the weekend",
        "Visiting relatives gathered for the holiday",
    ],
}

DISAMBIGUATION_COLORS = {
    'flying_planes_people': '#3498db',
    'flying_planes_aircraft': '#2ecc71',
    'time_flies_speed': '#e74c3c',
    'time_flies_insects': '#f39c12',
    'visiting_relatives_action': '#9b59b6',
    'visiting_relatives_people': '#1abc9c',
}

print(f"Disambiguation probes: {len(DISAMBIGUATION_PROBES)} categories")
print(f"Total probes: {sum(len(v) for v in DISAMBIGUATION_PROBES.values())}")

In [None]:
# Run disambiguation experiment
disambig_activations = {layer: [] for layer in config.layers_to_probe}
disambig_labels = []
disambig_texts = []

print("Running disambiguation probes...")

for category, prompts in DISAMBIGUATION_PROBES.items():
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(DEVICE)
        
        capture.clear()
        with torch.no_grad():
            outputs = model(**inputs)
        
        for layer_idx in config.layers_to_probe:
            act = capture.get_last_token_activation(layer_idx)
            disambig_activations[layer_idx].append(act)
        
        disambig_labels.append(category)
        disambig_texts.append(prompt)

for layer_idx in config.layers_to_probe:
    disambig_activations[layer_idx] = np.array(disambig_activations[layer_idx])

print(f"Collected {len(disambig_labels)} activations")

In [None]:
# Analyze disambiguation results
print("=" * 70)
print("EXPERIMENT C: DISAMBIGUATION")
print("=" * 70)
print("\nQuestion: Do ambiguous phrases produce different patterns")
print("          based on how they are RESOLVED?\n")

disambig_metrics = {}
disambig_pca = {}

for layer_idx in config.layers_to_probe:
    print(f"--- Layer {layer_idx} ---")
    
    acts = disambig_activations[layer_idx]
    metrics = compute_metrics(acts, disambig_labels)
    disambig_metrics[layer_idx] = metrics
    
    print(f"Silhouette: {metrics['silhouette']:.4f}")
    print(f"Distance ratio: {metrics['distance_ratio']:.4f}")
    
    pca_2d, explained = run_pca(acts)
    disambig_pca[layer_idx] = pca_2d
    print(f"PCA variance explained: {explained:.1f}%\n")

In [None]:
# Visualize disambiguation results
plot_scatter(disambig_pca[final_layer], disambig_labels, DISAMBIGUATION_COLORS,
             f"Disambiguation - Layer {final_layer}")

In [None]:
plot_layer_progression(disambig_metrics, "Disambiguation")

---

# SUMMARY

In [None]:
print("=" * 70)
print("EXPERIMENT 035B EXTENDED - SUMMARY")
print("=" * 70)

final_layer = config.layers_to_probe[-1]

experiments = [
    ("A: Polysemous Words", polysemy_metrics[final_layer]),
    ("B: Action Discrimination", action_metrics[final_layer]),
    ("C: Disambiguation", disambig_metrics[final_layer]),
]

print(f"\nFinal Layer: {final_layer}")
print("\n" + "-" * 58)
print(f"{'Experiment':<25} | {'Silhouette':>12} | {'Distance Ratio':>15}")
print("-" * 58)

# Heuristic evidence score: one point per experiment for each metric
# that clears its threshold (silhouette > 0.1, distance ratio > 1.3).
total_evidence = 0
for name, metrics in experiments:
    sil = metrics['silhouette']
    ratio = metrics['distance_ratio']
    print(f"{name:<25} | {sil:>12.3f} | {ratio:>15.3f}")
    
    if sil > 0.1:
        total_evidence += 1
    if ratio > 1.3:
        total_evidence += 1

print("-" * 58)
print(f"\nTotal Evidence Score: {total_evidence}/6")

print("\n" + "=" * 70)
print("INTERPRETATION")
print("=" * 70)

print("""
These experiments test AQ theory more rigorously:

A. POLYSEMOUS WORDS: Same token, different meaning
   - "bank" (financial) vs "bank" (river)
   - If AQ theory is correct: cluster by MEANING, not token

B. ACTION DISCRIMINATION: Different required outputs
   - Compute number vs Answer yes/no vs Complete sentence vs Provide fact
   - If AQ theory is correct: cluster by ACTION TYPE

C. DISAMBIGUATION: Same ambiguous phrase, different resolution
   - "Flying planes" as activity vs aircraft
   - If AQ theory is correct: cluster by RESOLVED MEANING

KEY INSIGHT:
   We measure at the LAST TOKEN (decision point) because that's
   where the AQ must crystallize to enable correct action.
   
   The previous 035B failed because "The answer is" is structural
   scaffolding - it doesn't discriminate actions.
""")

In [None]:
# Cleanup
capture.remove_hooks()

print("\n" + "=" * 70)
print("EXPERIMENT 035B EXTENDED COMPLETE")
print("=" * 70)
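
The scoring rule from the summary cell can be isolated as a small function, which makes the thresholds explicit and easy to vary. A minimal sketch: `evidence_score` is a hypothetical helper, the thresholds are the ones used in the summary cell, and the numbers in `placeholder` are invented for illustration only. They are not results of this experiment.

```python
from typing import Dict

SIL_THRESHOLD = 0.1    # silhouette threshold from the summary cell
RATIO_THRESHOLD = 1.3  # distance-ratio threshold from the summary cell


def evidence_score(results: Dict[str, Dict[str, float]]) -> int:
    """One point per experiment for each metric that clears its threshold."""
    score = 0
    for metrics in results.values():
        score += int(metrics["silhouette"] > SIL_THRESHOLD)
        score += int(metrics["distance_ratio"] > RATIO_THRESHOLD)
    return score


# Placeholder numbers for illustration only; NOT measured results.
placeholder = {
    "A": {"silhouette": 0.25, "distance_ratio": 1.5},
    "B": {"silhouette": 0.05, "distance_ratio": 1.4},
    "C": {"silhouette": -0.02, "distance_ratio": 1.0},
}
print(f"{evidence_score(placeholder)}/{2 * len(placeholder)}")
```

Because the thresholds are heuristic, it is worth checking how the score changes when they are moved slightly.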

---

## Relation to AQ Theory

From `ACTION_QUANTA.md`:

```
AQ (pattern) -> enables DISCRIMINATION -> enables ACTION

STRUCTURAL: AQ = Minimum PATTERN (what it IS)
FUNCTIONAL: Discrimination = ATOMIC ABSTRACTION (what it DOES)
```

These experiments test whether:
1. The same surface form ("bank") produces different AQ based on discrimination required
2. Different action types (compute vs classify) produce different AQ patterns
3. Ambiguity resolution crystallizes different AQ

From `LANGUAGE_ACTION_CONTEXT.md`:

```
Linguistic signal + Experiential context -> AQ crystallizes -> Action

Language provides the TRIGGER.
Context provides the DISCRIMINATION SPACE.
AQ is what enables the ACTION CHOICE.
```

The extended experiments test this by varying:
- Same trigger ("bank"), different discrimination space
- Different triggers, same structure ("...is"), different required action
- Ambiguous trigger, different resolution
