# Experiment 035G: Belief Crystallization Strength

**AKIRA Project - Oscar Goldman - Shogu Research Group @ Datamutant.ai**

---

## Core Insight

AQ are **crystallized beliefs from data patterns**. They emerge when sufficient
consistent evidence accumulates:

```
3 cups sugar + 1/2 gallon milk + 2 eggs = CAKE
late night + noise closer + 8 people approaching = THREAT
cold + looks like rain = GO_HOME
```

The key rule: **if X is X and X is doing X, then X**. When every piece of
evidence points to the same conclusion, the belief forms.

When patterns are:
- **Complete and consistent** -> Strong crystallization, clear AQ
- **Incomplete** -> Weak crystallization, uncertain AQ
- **Contradictory** -> No crystallization, interference/hallucination

---

## The Question

**How do I X if X isn't X?**

When the pattern is ambiguous or contradictory, the belief cannot crystallize.
The model should show:
- Phase spreading (not tight clustering)
- Lower confidence
- Higher activation variance
- Potential hallucination

---

## Experimental Design

Test belief crystallization with varying **pattern completeness**:

| Level | Pattern State | Example | Expected |
|:------|:--------------|:--------|:---------|
| 5 | Complete + Consistent | All components present, all point same way | Strong crystallization |
| 4 | Mostly complete | Missing 1 component | Good crystallization |
| 3 | Partial | Missing 2 components (3 of 5 present) | Weak crystallization |
| 2 | Ambiguous | All components present, each hedged ("maybe", "possibly") | Interference |
| 1 | Contradictory | Three components followed by a contradicting clause ("but also safe") | No crystallization |
| 0 | Impossible | Logical impossibility | Failure/hallucination |
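
The six levels recur throughout the notebook as bare integers and as a repeated list of tick labels. A minimal sketch of naming them once, assuming a hypothetical `CrystallizationLevel` enum and `LEVEL_LABELS` list (neither is part of the original notebook):

```python
from enum import IntEnum


class CrystallizationLevel(IntEnum):
    """Hypothetical names for the six pattern states in the table above."""
    IMPOSSIBLE = 0
    CONTRADICTORY = 1
    AMBIGUOUS = 2
    PARTIAL = 3
    MOSTLY_COMPLETE = 4
    COMPLETE = 5


# Labels ordered by level value, usable as plot tick labels
LEVEL_LABELS = [level.name.replace("_", " ").title() for level in CrystallizationLevel]

print(LEVEL_LABELS)
print(int(CrystallizationLevel.COMPLETE))
```

Because `IntEnum` members compare equal to plain integers, the existing `range(6)` loops would keep working if such an enum were adopted.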

---

## 1. Setup

In [None]:
!pip install transformers torch numpy matplotlib seaborn scipy -q

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass, field
from scipy import stats
from scipy.stats import circvar, circmean
from tqdm import tqdm
import warnings
import gc

warnings.filterwarnings('ignore')

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {DEVICE}")
print(f"PyTorch version: {torch.__version__}")
if DEVICE == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

## 2. Belief Pattern Definitions

Define patterns that should crystallize into clear beliefs vs patterns that
should create interference.

In [None]:
# Belief domains - each has components that together crystallize a belief

BELIEF_PATTERNS = {
    "THREAT_FLEE": {
        "description": "Threat situation requiring escape",
        "expected_belief": "FLEE",
        "components": {
            "danger_source": ["predator", "attacker", "fire", "flood", "armed person", "collapsing structure"],
            "proximity": ["approaching", "nearby", "close", "getting closer", "meters away"],
            "urgency": ["now", "immediately", "quickly", "right now", "urgent"],
            "direction": ["coming toward you", "heading your way", "approaching from behind"],
            "capability": ["you can run", "exit available", "path clear", "you are mobile"]
        }
    },
    
    "RECIPE_CAKE": {
        "description": "Recipe pattern for making cake",
        "expected_belief": "MAKE_CAKE",
        "components": {
            "flour": ["2 cups flour", "flour sifted", "all-purpose flour"],
            "sugar": ["1 cup sugar", "granulated sugar", "sugar measured"],
            "eggs": ["2 eggs", "eggs beaten", "large eggs"],
            "butter": ["half cup butter", "butter softened", "melted butter"],
            "method": ["mix together", "combine ingredients", "bake at 350"]
        }
    },
    
    "WEATHER_SHELTER": {
        "description": "Weather pattern requiring shelter",
        "expected_belief": "GO_INSIDE",
        "components": {
            "temperature": ["cold", "freezing", "temperature dropping"],
            "precipitation": ["rain coming", "storm approaching", "dark clouds"],
            "wind": ["strong wind", "gusty", "wind picking up"],
            "time": ["getting dark", "evening", "late"],
            "location": ["outside", "in the open", "exposed"]
        }
    },
    
    "TRUST_PERSON": {
        "description": "Pattern indicating trustworthy person",
        "expected_belief": "TRUST",
        "components": {
            "history": ["known for years", "long friendship", "proven reliable"],
            "behavior": ["always honest", "keeps promises", "never lied"],
            "reputation": ["respected by others", "good reputation", "trusted by community"],
            "alignment": ["shares your values", "similar goals", "mutual benefit"],
            "vulnerability": ["has trusted you", "shared secrets", "been vulnerable"]
        }
    },
    
    "WEALTH_PATH": {
        "description": "Pattern for achieving wealth - inherently ambiguous",
        "expected_belief": "UNCERTAIN",
        "components": {
            "education": ["studied hard", "got degree", "learned skills"],
            "work": ["work hard", "put in hours", "dedicated effort"],
            "saving": ["save money", "invest wisely", "budget carefully"],
            "opportunity": ["right place", "good timing", "lucky break"],
            "connections": ["know right people", "network", "mentors"]
        }
    }
}

print(f"Defined {len(BELIEF_PATTERNS)} belief patterns")
for name, pattern in BELIEF_PATTERNS.items():
    print(f"  {name}: {len(pattern['components'])} components -> {pattern['expected_belief']}")
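
The prompt generator in the next section describes its subsets as "4 of 5" and "3 of 5", so it appears to assume five components per pattern, each with at least one option. A hedged sketch of a sanity check for that assumption; `validate_patterns` and the toy dictionary are hypothetical and only mirror the shape of `BELIEF_PATTERNS`:

```python
from typing import Dict, List


def validate_patterns(patterns: Dict[str, dict], expected_components: int = 5) -> List[str]:
    """Return a list of problems found; an empty list means the patterns look usable."""
    problems = []
    for name, pattern in patterns.items():
        components = pattern.get("components", {})
        if len(components) != expected_components:
            problems.append(f"{name}: {len(components)} components, expected {expected_components}")
        for comp_name, options in components.items():
            if not options:
                problems.append(f"{name}.{comp_name}: no options")
        if "expected_belief" not in pattern:
            problems.append(f"{name}: missing expected_belief")
    return problems


# Toy patterns with the same shape as BELIEF_PATTERNS (values are illustrative)
toy = {
    "GOOD": {"expected_belief": "GO", "components": {c: ["x"] for c in "abcde"}},
    "BAD": {"expected_belief": "STOP", "components": {"a": ["x"], "b": []}},
}
print(validate_patterns(toy))
```

Calling `validate_patterns(BELIEF_PATTERNS)` right after the definitions would surface a malformed pattern before any model is loaded.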

## 3. Prompt Generation by Crystallization Level

In [None]:
def generate_crystallization_prompts(pattern_name: str, n_per_level: int = 50) -> Dict[int, List[str]]:
    """Generate prompts at different crystallization levels for a belief pattern.
    
    Level 5: All components present, consistent
    Level 4: 4 of 5 components
    Level 3: 3 of 5 components  
    Level 2: Components present but with ambiguity markers
    Level 1: Components with contradictions
    Level 0: Logical impossibility
    
    Args:
        pattern_name: Name of the belief pattern
        n_per_level: Number of prompts per crystallization level
        
    Returns:
        Dict mapping level to list of prompts
    """
    assert pattern_name in BELIEF_PATTERNS, f"Unknown pattern: {pattern_name}"
    
    pattern = BELIEF_PATTERNS[pattern_name]
    components = pattern["components"]
    component_names = list(components.keys())
    
    prompts = {level: [] for level in range(6)}
    # Local generator: reproducible subsets without reseeding NumPy's global RNG
    rng = np.random.default_rng(42)
    
    for i in range(n_per_level):
        # Level 5: All components
        parts = []
        for comp_name in component_names:
            options = components[comp_name]
            parts.append(options[i % len(options)])
        prompt_5 = "Situation: " + ", ".join(parts) + ". What should you do?"
        prompts[5].append(prompt_5)
        
        # Level 4: 4 of 5 components
        subset = rng.choice(component_names, 4, replace=False)
        parts = [components[c][i % len(components[c])] for c in subset]
        prompt_4 = "Situation: " + ", ".join(parts) + ". What should you do?"
        prompts[4].append(prompt_4)
        
        # Level 3: 3 of 5 components
        subset = rng.choice(component_names, 3, replace=False)
        parts = [components[c][i % len(components[c])] for c in subset]
        prompt_3 = "Situation: " + ", ".join(parts) + ". What should you do?"
        prompts[3].append(prompt_3)
        
        # Level 2: All components but with ambiguity markers
        ambiguity_markers = ["maybe", "possibly", "not sure if", "might be", "could be"]
        parts = []
        for j, comp_name in enumerate(component_names):
            options = components[comp_name]
            marker = ambiguity_markers[j % len(ambiguity_markers)]
            parts.append(f"{marker} {options[i % len(options)]}")
        prompt_2 = "Situation: " + ", ".join(parts) + ". What should you do?"
        prompts[2].append(prompt_2)
        
        # Level 1: Components with contradictions
        parts = []
        contradictions = {
            "THREAT_FLEE": ["but also safe", "but friendly", "but not dangerous", "but harmless"],
            "RECIPE_CAKE": ["but no oven", "but making soup", "but without heat", "but raw"],
            "WEATHER_SHELTER": ["but sunny", "but warm", "but clear skies", "but pleasant"],
            "TRUST_PERSON": ["but lied before", "but unreliable", "but suspicious", "but secretive"],
            "WEALTH_PATH": ["but economy crashed", "but no jobs", "but skills outdated", "but unlucky"]
        }
        for j, comp_name in enumerate(component_names[:3]):
            options = components[comp_name]
            parts.append(options[i % len(options)])
        contras = contradictions.get(pattern_name, ["but opposite", "but not really"])
        parts.append(contras[i % len(contras)])
        prompt_1 = "Situation: " + ", ".join(parts) + ". What should you do?"
        prompts[1].append(prompt_1)
        
        # Level 0: Logical impossibility
        impossibilities = {
            "THREAT_FLEE": [
                "A friendly predator is safely attacking you with harmless danger. What should you do?",
                "The approaching threat has already passed before it arrives. What should you do?",
                "You must flee from safety toward the danger to escape. What should you do?"
            ],
            "RECIPE_CAKE": [
                "To make this cake, first remove the ingredients after baking. What should you do?",
                "Mix the unbaked batter that was already eaten. What should you do?",
                "The cake is done before you start making it. What should you do?"
            ],
            "WEATHER_SHELTER": [
                "The dry rain is making the sunny storm wet with warmth. What should you do?",
                "Go inside the outside where the indoor rain is dry. What should you do?",
                "The freezing heat is warmly cold. What should you do?"
            ],
            "TRUST_PERSON": [
                "This honest liar always tells truthful lies. What should you do?",
                "Trust the untrustworthy person who reliably betrays. What should you do?",
                "Their proven dishonesty demonstrates integrity. What should you do?"
            ],
            "WEALTH_PATH": [
                "To become wealthy, first spend all your earnings on poverty. What should you do?",
                "Achieve success by failing at everything successfully. What should you do?",
                "The guaranteed path to wealth requires losing all money. What should you do?"
            ]
        }
        impossible_list = impossibilities.get(pattern_name, ["Impossible situation. What should you do?"])
        prompts[0].append(impossible_list[i % len(impossible_list)])
    
    return prompts


# Test prompt generation
test_prompts = generate_crystallization_prompts("THREAT_FLEE", n_per_level=3)
print("\n=== THREAT_FLEE Examples ===")
for level in range(5, -1, -1):
    print(f"\nLevel {level}:")
    print(f"  {test_prompts[level][0][:100]}...")
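
Level 5 picks `options[i % len(options)]` for every component, and Level 0 cycles through three fixed sentences, so prompts repeat once the indices wrap around. Since generation is greedy (`do_sample=False`), a repeated prompt yields identical metrics, and the effective sample size may be smaller than `prompts_per_level`. A small sketch that counts the distinct Level 5 prompts; `unique_cycled_prompts` is a hypothetical helper, and the option counts are read off `BELIEF_PATTERNS`:

```python
from typing import List


def unique_cycled_prompts(option_counts: List[int], n_prompts: int) -> int:
    """Count distinct index tuples (i % k per component) for i in range(n_prompts)."""
    seen = {tuple(i % k for k in option_counts) for i in range(n_prompts)}
    return len(seen)


# Option-list lengths per component, as defined in this notebook
threat_flee = [6, 5, 5, 3, 4]
recipe_cake = [3, 3, 3, 3, 3]

print(unique_cycled_prompts(threat_flee, 100))  # → 60
print(unique_cycled_prompts(recipe_cake, 100))  # → 3
```

With 100 prompts per level, `RECIPE_CAKE` therefore contributes only three distinct Level 5 prompts, which is worth keeping in mind when reading the p-values later on.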

## 4. Model Loading and Metrics

In [None]:
@dataclass
class ExperimentConfig:
    """Configuration for belief crystallization experiment."""
    
    models: Dict[str, str] = field(default_factory=lambda: {
        "gpt2-medium": "gpt2-medium",
        "gpt2-large": "gpt2-large",
    })
    
    prompts_per_level: int = 100
    max_new_tokens: int = 50
    random_seed: int = 42
    
    # Layers to analyze (skip layer 0 - it's just embeddings)
    layer_ratios: List[float] = field(default_factory=lambda: [0.15, 0.3, 0.5, 0.7, 0.85, 1.0])
    
    def __post_init__(self):
        np.random.seed(self.random_seed)
        torch.manual_seed(self.random_seed)


config = ExperimentConfig()
print(f"Models: {list(config.models.keys())}")
print(f"Prompts per level: {config.prompts_per_level}")
print(f"Total prompts per pattern: {config.prompts_per_level * 6}")
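
The `layer_ratios` are later turned into hidden-state indices with `max(1, int(r * n_layers))`. Hugging Face models return `n_layers + 1` hidden states (the embedding output plus one per block), so index `n_layers` is the last block. A sketch of that mapping, using a hypothetical `select_layers` helper and the block counts of the two configured models:

```python
from typing import List


def select_layers(n_layers: int, ratios: List[float]) -> List[int]:
    """Map depth ratios to hidden-state indices, skipping index 0 (embeddings)."""
    return sorted({max(1, int(r * n_layers)) for r in ratios})


ratios = [0.15, 0.3, 0.5, 0.7, 0.85, 1.0]
print(select_layers(24, ratios))  # 24 blocks, as in gpt2-medium → [3, 7, 12, 16, 20, 24]
print(select_layers(36, ratios))  # 36 blocks, as in gpt2-large  → [5, 10, 18, 25, 30, 36]
```

Because `int()` truncates, a ratio such as 0.7 lands on layer 16 rather than 17 for a 24-block model.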

In [None]:
def load_model(model_name: str) -> Tuple:
    """Load model and tokenizer."""
    print(f"Loading {model_name}...")
    
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16 if DEVICE == "cuda" else torch.float32,
        device_map="auto" if DEVICE == "cuda" else None,
        output_hidden_states=True,
        output_attentions=True
    )
    model.eval()
    
    if hasattr(model.config, 'n_layer'):
        n_layers = model.config.n_layer
    elif hasattr(model.config, 'num_hidden_layers'):
        n_layers = model.config.num_hidden_layers
    else:
        n_layers = 24
    
    print(f"  Layers: {n_layers}, Hidden: {model.config.hidden_size}")
    return model, tokenizer, n_layers

In [None]:
@dataclass
class CrystallizationMetrics:
    """Metrics for measuring belief crystallization strength."""
    
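    # Phase metrics (from activations). Note: analyze_prompt() below does not
    # populate these; they need activations pooled across several prompts
    # (see compute_phase_from_activations).
    phase_variance: float = 0.0  # Lower = more crystallized
    phase_coherence: float = 0.0  # Higher = more crystallized (Rayleigh R)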
    
    # Activation metrics
    activation_magnitude: float = 0.0
    activation_variance: float = 0.0  # Higher variance = less crystallized
    
    # Cross-layer coherence
    layer_agreement: float = 0.0  # Do layers agree? Higher = more crystallized
    
    # Output metrics
    confidence: float = 0.0  # Token probability
    entropy: float = 0.0  # Lower entropy = more certain
    
    # Response analysis
    response_text: str = ""
    contains_hedging: bool = False  # "maybe", "possibly", "not sure"
    contains_action: bool = False  # Clear action verb


def compute_phase_from_activations(activations: np.ndarray) -> Tuple[float, float]:
    """Compute phase metrics from activation vectors.
    
    Uses PCA to find dominant directions, then computes phase angles.
    
    Args:
        activations: Array of shape (n_samples, hidden_size)
        
    Returns:
        Tuple of (circular_variance, rayleigh_r)
    """
    if len(activations) < 2:
        return 0.0, 1.0
    
    # Center the activations
    centered = activations - np.mean(activations, axis=0)
    
    # Use first two principal components to define phase
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    
    # Project onto first two components
    proj = centered @ Vt[:2].T  # (n_samples, 2)
    
    # Compute phase angles
    phases = np.arctan2(proj[:, 1], proj[:, 0])
    
    # Circular variance (0 = perfectly aligned, 1 = uniform)
    circ_var = circvar(phases)
    
    # Rayleigh R (mean resultant length, 1 = perfectly aligned, 0 = uniform)
    mean_cos = np.mean(np.cos(phases))
    mean_sin = np.mean(np.sin(phases))
    rayleigh_r = np.sqrt(mean_cos**2 + mean_sin**2)
    
    return float(circ_var), float(rayleigh_r)


def analyze_prompt(model, tokenizer, prompt: str, layers: List[int]) -> CrystallizationMetrics:
    """Analyze a single prompt for crystallization metrics."""
    metrics = CrystallizationMetrics()
    
    # Tokenize
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=256)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    
    with torch.no_grad():
        # Get hidden states
        outputs = model(**inputs, output_hidden_states=True, output_attentions=True)
        hidden_states = outputs.hidden_states
        
        # Collect activations from specified layers (last token)
        layer_activations = []
        for layer_idx in layers:
            if layer_idx < len(hidden_states):
                act = hidden_states[layer_idx][0, -1, :].cpu().numpy().astype(np.float32)
                layer_activations.append(act)
        
        if layer_activations:
            layer_activations = np.array(layer_activations)
            
            # Activation magnitude (mean across layers)
            metrics.activation_magnitude = float(np.mean(np.linalg.norm(layer_activations, axis=1)))
            
            # Activation variance, pooled over all hidden units of the selected layers
            metrics.activation_variance = float(np.var(layer_activations))
            
            # Layer agreement (cosine similarity between consecutive layers)
            if len(layer_activations) > 1:
                agreements = []
                for i in range(len(layer_activations) - 1):
                    a, b = layer_activations[i], layer_activations[i+1]
                    sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
                    agreements.append(sim)
                metrics.layer_agreement = float(np.mean(agreements))
        
        # Generate response
        gen_outputs = model.generate(
            **inputs,
            max_new_tokens=config.max_new_tokens,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.pad_token_id
        )
        
        # Decode response
        generated_ids = gen_outputs.sequences[0][inputs['input_ids'].shape[1]:]
        metrics.response_text = tokenizer.decode(generated_ids, skip_special_tokens=True)
        
        # Confidence (mean probability of generated tokens)
        if gen_outputs.scores:
            probs = []
            for i, score in enumerate(gen_outputs.scores):
                if i < len(generated_ids):
                    token_id = generated_ids[i].item()
                    prob = torch.softmax(score[0], dim=-1)[token_id].item()
                    probs.append(prob)
            if probs:
                metrics.confidence = float(np.mean(probs))
                
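                # Entropy of first token distribution. Cast to float32 first: under
                # float16 the 1e-10 epsilon rounds to zero and log(0) can yield NaN.
                first_probs = torch.softmax(gen_outputs.scores[0][0].float(), dim=-1)
                entropy = -torch.sum(first_probs * torch.log(first_probs + 1e-10)).item()
                metrics.entropy = entropy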
    
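    # Analyze response text
    import re  # not imported in the setup cell, so imported here
    response_lower = metrics.response_text.lower()
    hedging_words = ["maybe", "possibly", "perhaps", "might", "could", "not sure", 
                    "uncertain", "depends", "it depends", "hard to say"]
    action_words = ["run", "flee", "go", "leave", "stay", "do", "make", "take",
                   "should", "must", "need to", "have to", "will"]
    
    def contains_any(text: str, terms: List[str]) -> bool:
        # Whole-word match: a plain substring test would count "go" in "good"
        # or "do" in "done" as an action word
        return any(re.search(rf"\b{re.escape(t)}\b", text) for t in terms)
    
    metrics.contains_hedging = contains_any(response_lower, hedging_words)
    metrics.contains_action = contains_any(response_lower, action_words)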
    
    return metrics
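
`compute_phase_from_activations` reports the Rayleigh R (mean resultant length) of the phase angles, but `analyze_prompt` never calls it, so the phase fields keep their defaults. As a reminder of how the statistic behaves, here is a minimal sketch on synthetic angles; the `rayleigh_r` helper and the two test arrays are illustrative assumptions, not model output:

```python
import numpy as np


def rayleigh_r(phases: np.ndarray) -> float:
    """Mean resultant length: 1 = all angles identical, 0 = evenly spread."""
    return float(np.hypot(np.mean(np.cos(phases)), np.mean(np.sin(phases))))


# Synthetic angles in radians (illustrative only)
tight = np.linspace(-0.1, 0.1, 50)                        # clustered near 0
spread = np.linspace(0.0, 2 * np.pi, 50, endpoint=False)  # evenly around the circle

for name, phases in [("tight", tight), ("spread", spread)]:
    r = rayleigh_r(phases)
    print(f"{name}: R = {r:.3f}, circular variance (1 - R) = {1 - r:.3f}")
```

To fill the phase fields, one option would be to stack the last-token activations of all prompts at a given level into one `(n_samples, hidden_size)` array and pass it to `compute_phase_from_activations`.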

## 5. Run Experiment

In [None]:
def run_crystallization_experiment(model_name: str, model_path: str) -> Dict:
    """Run crystallization experiment for one model."""
    
    print(f"\n{'='*60}")
    print(f"Running: {model_name}")
    print(f"{'='*60}")
    
    model, tokenizer, n_layers = load_model(model_path)
    
    # Select layers (skip layer 0)
    layers = [max(1, int(r * n_layers)) for r in config.layer_ratios]
    layers = sorted(set(layers))  # Remove duplicates
    print(f"Analyzing layers: {layers}")
    
    results = {
        "model": model_name,
        "layers": layers,
        "patterns": {}
    }
    
    for pattern_name in BELIEF_PATTERNS.keys():
        print(f"\nPattern: {pattern_name}")
        
        # Generate prompts
        prompts_by_level = generate_crystallization_prompts(
            pattern_name, 
            n_per_level=config.prompts_per_level
        )
        
        pattern_results = {level: [] for level in range(6)}
        
        for level in range(5, -1, -1):
            print(f"  Level {level}...", end=" ")
            
            level_metrics = []
            for prompt in tqdm(prompts_by_level[level], desc=f"L{level}", leave=False):
                metrics = analyze_prompt(model, tokenizer, prompt, layers)
                level_metrics.append(metrics)
            
            pattern_results[level] = level_metrics
            
            # Quick summary
            avg_conf = np.mean([m.confidence for m in level_metrics])
            avg_layer_agree = np.mean([m.layer_agreement for m in level_metrics])
            hedge_rate = np.mean([m.contains_hedging for m in level_metrics])
            print(f"conf={avg_conf:.3f}, agree={avg_layer_agree:.3f}, hedge={hedge_rate:.2f}")
        
        results["patterns"][pattern_name] = pattern_results
    
    # Cleanup
    del model
    gc.collect()
    if DEVICE == "cuda":
        torch.cuda.empty_cache()
    
    return results


# Run experiment
all_results = {}
for model_name, model_path in config.models.items():
    try:
        all_results[model_name] = run_crystallization_experiment(model_name, model_path)
    except Exception as e:
        print(f"Error with {model_name}: {e}")
        import traceback
        traceback.print_exc()

## 6. Visualization

In [None]:
def plot_crystallization_results(results: Dict) -> None:
    """Plot crystallization metrics by level."""
    
    for model_name, model_results in results.items():
        n_patterns = len(model_results["patterns"])
        fig, axes = plt.subplots(n_patterns, 4, figsize=(16, 4 * n_patterns))
        
        for row, (pattern_name, pattern_data) in enumerate(model_results["patterns"].items()):
            levels = list(range(6))
            
            # Collect metrics by level
            confidences = [np.mean([m.confidence for m in pattern_data[l]]) for l in levels]
            conf_stds = [np.std([m.confidence for m in pattern_data[l]]) for l in levels]
            
            layer_agrees = [np.mean([m.layer_agreement for m in pattern_data[l]]) for l in levels]
            agree_stds = [np.std([m.layer_agreement for m in pattern_data[l]]) for l in levels]
            
            entropies = [np.mean([m.entropy for m in pattern_data[l]]) for l in levels]
            
            hedge_rates = [np.mean([m.contains_hedging for m in pattern_data[l]]) for l in levels]
            action_rates = [np.mean([m.contains_action for m in pattern_data[l]]) for l in levels]
            
            # Plot confidence
            ax = axes[row, 0] if n_patterns > 1 else axes[0]
            ax.errorbar(levels, confidences, yerr=conf_stds, marker='o', capsize=5)
            ax.set_xlabel('Crystallization Level')
            ax.set_ylabel('Confidence')
            ax.set_title(f'{pattern_name}\nConfidence by Level')
            ax.set_xticks(levels)
            ax.set_xticklabels(['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete'])
            ax.tick_params(axis='x', rotation=45)
            ax.grid(True, alpha=0.3)
            
            # Plot layer agreement
            ax = axes[row, 1] if n_patterns > 1 else axes[1]
            ax.errorbar(levels, layer_agrees, yerr=agree_stds, marker='s', capsize=5, color='green')
            ax.set_xlabel('Crystallization Level')
            ax.set_ylabel('Layer Agreement')
            ax.set_title(f'{pattern_name}\nLayer Agreement (Coherence)')
            ax.set_xticks(levels)
            ax.set_xticklabels(['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete'])
            ax.tick_params(axis='x', rotation=45)
            ax.grid(True, alpha=0.3)
            
            # Plot entropy
            ax = axes[row, 2] if n_patterns > 1 else axes[2]
            ax.bar(levels, entropies, color='purple', alpha=0.7)
            ax.set_xlabel('Crystallization Level')
            ax.set_ylabel('Entropy')
            ax.set_title(f'{pattern_name}\nOutput Entropy (Lower = More Certain)')
            ax.set_xticks(levels)
            ax.set_xticklabels(['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete'])
            ax.tick_params(axis='x', rotation=45)
            ax.grid(True, alpha=0.3)
            
            # Plot hedging vs action rates
            ax = axes[row, 3] if n_patterns > 1 else axes[3]
            x = np.arange(len(levels))
            width = 0.35
            ax.bar(x - width/2, hedge_rates, width, label='Hedging', color='red', alpha=0.7)
            ax.bar(x + width/2, action_rates, width, label='Action', color='blue', alpha=0.7)
            ax.set_xlabel('Crystallization Level')
            ax.set_ylabel('Rate')
            ax.set_title(f'{pattern_name}\nHedging vs Action in Response')
            ax.set_xticks(x)
            ax.set_xticklabels(['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete'])
            ax.tick_params(axis='x', rotation=45)
            ax.legend()
            ax.grid(True, alpha=0.3)
        
        plt.suptitle(f'{model_name}: Belief Crystallization Analysis', fontsize=14, fontweight='bold')
        plt.tight_layout()
        plt.savefig(f'035G_crystallization_{model_name}.png', dpi=150, bbox_inches='tight')
        plt.show()


if all_results:
    plot_crystallization_results(all_results)

In [None]:
def plot_crystallization_summary(results: Dict) -> None:
    """Summary plot across all patterns."""
    
    fig, axes = plt.subplots(1, len(results), figsize=(7 * len(results), 5))
    if len(results) == 1:
        axes = [axes]
    
    for ax, (model_name, model_results) in zip(axes, results.items()):
        levels = list(range(6))
        
        # Average across all patterns
        all_confidences = {l: [] for l in levels}
        all_agreements = {l: [] for l in levels}
        
        for pattern_data in model_results["patterns"].values():
            for l in levels:
                all_confidences[l].extend([m.confidence for m in pattern_data[l]])
                all_agreements[l].extend([m.layer_agreement for m in pattern_data[l]])
        
        conf_means = [np.mean(all_confidences[l]) for l in levels]
        agree_means = [np.mean(all_agreements[l]) for l in levels]
        
        ax.plot(levels, conf_means, 'b-o', label='Confidence', linewidth=2, markersize=8)
        ax.plot(levels, agree_means, 'g-s', label='Layer Agreement', linewidth=2, markersize=8)
        
        ax.set_xlabel('Crystallization Level', fontsize=12)
        ax.set_ylabel('Score', fontsize=12)
        ax.set_title(f'{model_name}\nCrystallization Metrics Summary', fontsize=14)
        ax.set_xticks(levels)
        ax.set_xticklabels(['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete'])
        ax.tick_params(axis='x', rotation=45)
        ax.legend()
        ax.grid(True, alpha=0.3)
        ax.set_ylim(0, 1)
    
    plt.tight_layout()
    plt.savefig('035G_crystallization_summary.png', dpi=150, bbox_inches='tight')
    plt.show()


if all_results:
    plot_crystallization_summary(all_results)

## 7. Statistical Analysis

In [None]:
def statistical_summary(results: Dict) -> None:
    """Compute and print statistical summary."""
    
    print("\n" + "="*70)
    print("STATISTICAL SUMMARY: BELIEF CRYSTALLIZATION")
    print("="*70)
    
    for model_name, model_results in results.items():
        print(f"\n### {model_name} ###")
        
        # Collect all metrics by level
        all_by_level = {l: {"conf": [], "agree": [], "entropy": [], "hedge": [], "action": []} 
                       for l in range(6)}
        
        for pattern_data in model_results["patterns"].values():
            for l in range(6):
                for m in pattern_data[l]:
                    all_by_level[l]["conf"].append(m.confidence)
                    all_by_level[l]["agree"].append(m.layer_agreement)
                    all_by_level[l]["entropy"].append(m.entropy)
                    all_by_level[l]["hedge"].append(m.contains_hedging)
                    all_by_level[l]["action"].append(m.contains_action)
        
        print("\nMetrics by Crystallization Level:")
        print("-" * 70)
        print(f"{'Level':<12} {'Confidence':>12} {'LayerAgree':>12} {'Entropy':>12} {'Hedge%':>10} {'Action%':>10}")
        print("-" * 70)
        
        level_names = ['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete']
        for l in range(6):
            conf = np.mean(all_by_level[l]["conf"])
            agree = np.mean(all_by_level[l]["agree"])
            ent = np.mean(all_by_level[l]["entropy"])
            hedge = np.mean(all_by_level[l]["hedge"]) * 100
            action = np.mean(all_by_level[l]["action"]) * 100
            print(f"{level_names[l]:<12} {conf:>12.4f} {agree:>12.4f} {ent:>12.2f} {hedge:>9.1f}% {action:>9.1f}%")
        
        # Correlation analysis
        print("\nCorrelation with Crystallization Level:")
        
        all_levels = []
        all_confs = []
        all_agrees = []
        
        for l in range(6):
            for conf, agree in zip(all_by_level[l]["conf"], all_by_level[l]["agree"]):
                all_levels.append(l)
                all_confs.append(conf)
                all_agrees.append(agree)
        
        r_conf, p_conf = stats.pearsonr(all_levels, all_confs)
        r_agree, p_agree = stats.pearsonr(all_levels, all_agrees)
        
        print(f"  Confidence: r = {r_conf:.3f}, p = {p_conf:.6f}")
        print(f"  Layer Agreement: r = {r_agree:.3f}, p = {p_agree:.6f}")
        
        # Compare extremes (Level 0 vs Level 5)
        print("\nComparing Extremes (Impossible vs Complete):")
        
        # Separate names for the t-test p-values, so the correlation p-values
        # (p_conf, p_agree) are still intact for the interpretation below
        t_conf, p_t_conf = stats.ttest_ind(all_by_level[0]["conf"], all_by_level[5]["conf"])
        t_agree, p_t_agree = stats.ttest_ind(all_by_level[0]["agree"], all_by_level[5]["agree"])
        
        # Cohen's d with sample standard deviations (ddof=1); averaging the two
        # variances assumes equal group sizes, which holds for Level 0 vs Level 5
        def cohens_d(a, b):
            pooled_std = np.sqrt((np.std(a, ddof=1)**2 + np.std(b, ddof=1)**2) / 2)
            return (np.mean(b) - np.mean(a)) / pooled_std if pooled_std > 0 else 0
        
        d_conf = cohens_d(all_by_level[0]["conf"], all_by_level[5]["conf"])
        d_agree = cohens_d(all_by_level[0]["agree"], all_by_level[5]["agree"])
        
        print(f"  Confidence: t = {t_conf:.3f}, p = {p_t_conf:.6f}, Cohen's d = {d_conf:.3f}")
        print(f"  Layer Agreement: t = {t_agree:.3f}, p = {p_t_agree:.6f}, Cohen's d = {d_agree:.3f}")
        
        # Interpretation
        print("\nInterpretation:")
        if r_conf > 0.1 and p_conf < 0.05:
            print(f"  Confidence INCREASES with crystallization level (r={r_conf:.3f})")
        elif r_conf < -0.1 and p_conf < 0.05:
            print(f"  Confidence DECREASES with crystallization level (r={r_conf:.3f})")
        else:
            print(f"  No significant relationship between confidence and crystallization")
        
        if r_agree > 0.1 and p_agree < 0.05:
            print(f"  Layer agreement INCREASES with crystallization level (r={r_agree:.3f})")
            print(f"  -> SUPPORTS belief crystallization hypothesis")
        elif r_agree < -0.1 and p_agree < 0.05:
            print(f"  Layer agreement DECREASES with crystallization level (r={r_agree:.3f})")
        else:
            print(f"  No significant relationship between layer agreement and crystallization")


if all_results:
    statistical_summary(all_results)
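
The effect size in the summary averages the two group variances, which matches the pooled estimate only when both groups have the same size. If levels ever end up with different numbers of prompts (for example after de-duplication), a size-weighted pooled variance may be the safer choice. A minimal sketch, with made-up confidence values:

```python
import numpy as np


def cohens_d(a, b) -> float:
    """Standardized mean difference (b minus a) using the size-weighted pooled sample SD."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((b.mean() - a.mean()) / np.sqrt(pooled_var)) if pooled_var > 0 else 0.0


# Made-up confidence values for two levels (illustrative only)
level_0 = [0.1, 0.2, 0.3]
level_5 = [0.2, 0.3, 0.4]
print(f"d = {cohens_d(level_0, level_5):.2f}")  # → d = 1.00
```

A positive value here means the second group (Level 5) has the higher mean, the same sign convention as in the notebook's own helper.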

## 8. Example Responses Analysis

In [None]:
def show_example_responses(results: Dict, n_examples: int = 3) -> None:
    """Show example responses at different crystallization levels."""
    
    print("\n" + "="*70)
    print("EXAMPLE RESPONSES BY CRYSTALLIZATION LEVEL")
    print("="*70)
    
    for model_name, model_results in results.items():
        print(f"\n### {model_name} ###")
        
        # Pick one pattern to show
        pattern_name = "THREAT_FLEE"
        pattern_data = model_results["patterns"][pattern_name]
        
        level_names = ['Impossible', 'Contradict', 'Ambiguous', 'Partial', 'Mostly', 'Complete']
        
        for level in [5, 3, 1, 0]:  # Show key levels
            print(f"\n--- Level {level} ({level_names[level]}) ---")
            
            for i in range(min(n_examples, len(pattern_data[level]))):
                m = pattern_data[level][i]
                print(f"\nResponse {i+1}:")
                print(f"  Text: {m.response_text[:150]}..." if len(m.response_text) > 150 else f"  Text: {m.response_text}")
                print(f"  Confidence: {m.confidence:.3f}")
                print(f"  Hedging: {m.contains_hedging}, Action: {m.contains_action}")


if all_results:
    show_example_responses(all_results)

## 9. Conclusions

This experiment tests **belief crystallization strength** - the idea that AQ emerge
when sufficient consistent pattern evidence accumulates.

**Key predictions:**

1. **Complete patterns (Level 5)** should show:
   - High confidence
   - High layer agreement (crystallized belief)
   - Clear action in response
   - Low hedging

2. **Impossible patterns (Level 0)** should show:
   - Lower confidence (or overconfident hallucination)
   - Lower layer agreement (no crystallization)
   - Hedging or nonsensical responses

3. **Gradient between levels** should show:
   - Monotonic relationship between pattern completeness and crystallization metrics
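
The third prediction can be checked directly on the per-level means. A hedged sketch, assuming a hypothetical `is_monotonic_increasing` helper and made-up means for Levels 0 to 5:

```python
import numpy as np


def is_monotonic_increasing(level_means, tol: float = 0.0) -> bool:
    """True if each level's mean is at least the previous level's mean (minus tol)."""
    diffs = np.diff(np.asarray(level_means, dtype=float))
    return bool(np.all(diffs >= -tol))


# Made-up per-level means, ordered Level 0 .. Level 5 (illustrative only)
print(is_monotonic_increasing([0.20, 0.25, 0.31, 0.40, 0.48, 0.55]))  # → True
print(is_monotonic_increasing([0.20, 0.35, 0.31, 0.40, 0.48, 0.55]))  # → False
```

The `tol` argument allows small dips between adjacent levels to be treated as noise; what tolerance is appropriate depends on the spread of the measured metrics.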

**Connection to AKIRA theory:**

- AQ crystallize from **signal + context**
- Incomplete context = incomplete crystallization
- Contradictory context = interference, no crystallization
- The model needs sufficient consistent AQ to construct coherent response

**The question "How do I X if X isn't X?"** captures the failure mode:
when patterns contradict, belief cannot crystallize, and the model either
hallucinates, hedges, or fails.