# Buildlog Learning Analysis

Analysis of buildlog's learning dynamics using the real data sources:
- **SQLite tables**: `review_learnings`, `mistakes`, `reward_events`
- **Signal log**: `~/.buildlog/emissions/signal.jsonl` (time-series backbone)
- **Seed files**: Treatment intervention dates from filenames

**Key metrics:**
- `reinforcement_count` vs `contradiction_count` per learning → rule strength over time
- `was_repeat` on mistakes → persistent blind spots
- `corrected_by_rule` → direct positive attribution
- `rules_active` on reward events → which rules were on during success/failure

In [None]:
import json
import sqlite3
from pathlib import Path
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.max_columns', None)

BUILDLOG_DIR = Path.home() / ".buildlog"
DB_PATH = BUILDLOG_DIR / "buildlog.db"
SIGNAL_LOG = BUILDLOG_DIR / "emissions" / "signal.jsonl"
SEEDS_DIR = BUILDLOG_DIR / "seeds"

## 1. Load SQLite Data

In [None]:
conn = sqlite3.connect(DB_PATH)

# Review learnings - the rule strength time series
learnings = pd.read_sql_query("""
    SELECT * FROM review_learnings
""", conn)
learnings["first_seen"] = pd.to_datetime(learnings["first_seen"])
learnings["last_reinforced"] = pd.to_datetime(learnings["last_reinforced"])

# Mistakes
mistakes = pd.read_sql_query("""
    SELECT * FROM mistakes
""", conn)
mistakes["timestamp"] = pd.to_datetime(mistakes["timestamp"])

# Reward events
rewards = pd.read_sql_query("""
    SELECT * FROM reward_events
""", conn)
rewards["timestamp"] = pd.to_datetime(rewards["timestamp"])
rewards["rules_active"] = rewards["rules_active"].apply(
    lambda x: json.loads(x) if x else []
)

conn.close()

print(f"Learnings: {len(learnings)}")
print(f"Mistakes: {len(mistakes)}")
print(f"Reward events: {len(rewards)}")
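
One caveat with the loading cell: `sqlite3.connect` creates an empty database file when the path does not exist, so a wrong `DB_PATH` only surfaces later as a "no such table" error. A minimal pre-flight check might look like the sketch below. `missing_tables` is a hypothetical helper (not part of buildlog), and the in-memory database stands in for the real file.

```python
import sqlite3

# Table names taken from the data-source list at the top of this notebook.
REQUIRED_TABLES = {"review_learnings", "mistakes", "reward_events"}


def missing_tables(conn, required=REQUIRED_TABLES):
    """Return the required tables that are absent from the connected database.

    Hypothetical helper: it queries SQLite's built-in sqlite_master catalog.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    present = {name for (name,) in rows}
    return sorted(set(required) - present)


# Demo on an in-memory database that has only one of the three tables.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE mistakes (id INTEGER PRIMARY KEY)")
print("Missing tables:", missing_tables(demo))
demo.close()
```

Running a check like this before the `read_sql_query` calls turns a confusing SQL error into an explicit message about the path.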

## 2. Rule Strength: Reinforcement vs Contradiction

This is the key view. Rules that accumulate reinforcements are getting stronger; rules that accumulate contradictions are being challenged.
The smoothed ratio `reinforcements / (reinforcements + contradictions + 1)` indicates which rules are reliable.

In [None]:
# Summary by category
category_stats = learnings.groupby("category").agg(
    rules=("id", "count"),
    total_reinforcements=("reinforcement_count", "sum"),
    total_contradictions=("contradiction_count", "sum"),
).reset_index()
category_stats["strength_ratio"] = (
    category_stats["total_reinforcements"] / 
    (category_stats["total_reinforcements"] + category_stats["total_contradictions"] + 1)
).round(2)
category_stats = category_stats.sort_values("total_reinforcements", ascending=False)
category_stats
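
The `+ 1` in the denominator acts as a smoothing term: it keeps the ratio defined when a category has no evidence at all, and it pulls low-evidence categories toward zero. A small worked sketch, where `strength_ratio` is an illustrative name that mirrors the formula in the cell above:

```python
def strength_ratio(reinforcements, contradictions):
    """Smoothed strength ratio, mirroring the category_stats formula."""
    return reinforcements / (reinforcements + contradictions + 1)


# The first three cases have zero contradictions but different evidence counts.
for r, c in [(0, 0), (1, 0), (9, 0), (9, 9)]:
    print(f"reinforcements={r}, contradictions={c} -> {strength_ratio(r, c):.2f}")
```

A single uncontradicted reinforcement scores 0.50 while nine score 0.90, so the ratio rewards accumulated evidence and not just a clean record.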

In [None]:
# Scatter: reinforcement vs contradiction per learning, colored by category
fig, ax = plt.subplots(figsize=(10, 6))

categories = learnings["category"].unique()
colors = plt.cm.tab10(range(len(categories)))
color_map = dict(zip(categories, colors))

for cat in categories:
    subset = learnings[learnings["category"] == cat]
    ax.scatter(
        subset["reinforcement_count"], 
        subset["contradiction_count"],
        label=cat,
        color=[color_map[cat]],  # use the per-category color built above
        alpha=0.7,
        s=50
    )

ax.set_xlabel("Reinforcement Count")
ax.set_ylabel("Contradiction Count")
ax.set_title("Rule Strength: Reinforcements vs Contradictions")
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# Add diagonal line (equal reinforcement/contradiction)
max_val = max(learnings["reinforcement_count"].max(), learnings["contradiction_count"].max())
ax.plot([0, max_val], [0, max_val], 'k--', alpha=0.3, label='_nolegend_')

plt.tight_layout()
plt.show()

## 3. Learning Timeline

When were rules first seen? When were they last reinforced?
The gap between `first_seen` and `last_reinforced` shows rule longevity.

In [None]:
# Cumulative rules over time
learnings_sorted = learnings.sort_values("first_seen")
learnings_sorted["cumulative_rules"] = range(1, len(learnings_sorted) + 1)

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(learnings_sorted["first_seen"], learnings_sorted["cumulative_rules"], linewidth=2)
ax.set_xlabel("Date")
ax.set_ylabel("Cumulative Rules Learned")
ax.set_title("Rule Discovery Over Time")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d"))
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

In [None]:
# Rule lifespan: days between first_seen and last_reinforced
learnings["lifespan_days"] = (
    learnings["last_reinforced"] - learnings["first_seen"]
).dt.total_seconds() / 86400

print("Rule Lifespan Statistics (days):")
print(learnings["lifespan_days"].describe())

## 4. Mistake Analysis

Which error classes have repeats? Which have attribution to rules?

In [None]:
if len(mistakes) > 0:
    mistake_stats = mistakes.groupby("error_class").agg(
        total=("id", "count"),
        repeats=("was_repeat", "sum"),
        attributed=("corrected_by_rule", lambda x: x.notna().sum()),
    ).reset_index()
    mistake_stats["repeat_rate"] = (mistake_stats["repeats"] / mistake_stats["total"] * 100).round(1)
    mistake_stats["attribution_rate"] = (mistake_stats["attributed"] / mistake_stats["total"] * 100).round(1)
    print(mistake_stats.to_string(index=False))
else:
    print("No mistakes in SQLite yet. Check emission files for pending data.")

## 5. Reward Events: Rule Activation

Which rules were active during successful vs failed outcomes?

In [None]:
# Outcome distribution
print("Reward Outcomes:")
print(rewards["outcome"].value_counts())
print()

# Average reward by outcome
print("Average Reward Value by Outcome:")
print(rewards.groupby("outcome")["reward_value"].mean().round(2))

In [None]:
# Explode rules_active to see which rules are most often active
if rewards["rules_active"].apply(len).sum() > 0:
    rules_exploded = rewards.explode("rules_active")
    rules_exploded = rules_exploded[rules_exploded["rules_active"].notna()]
    
    rule_outcomes = rules_exploded.groupby(["rules_active", "outcome"]).size().unstack(fill_value=0)
    print("Rule Activation by Outcome:")
    print(rule_outcomes)
else:
    print("No rules_active data yet. Rules haven't been activated in tracked sessions.")
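
The same tally can be sketched without pandas, which makes the explode step explicit. The rows below are invented for illustration: the outcome labels (`accepted`, `rejected`) and rule ids are assumptions, since the actual values stored in `reward_events` are not shown in this notebook.

```python
import json
from collections import Counter, defaultdict

# Hypothetical reward_events rows: (outcome, rules_active as stored JSON text).
rows = [
    ("accepted", '["rule-a", "rule-b"]'),
    ("rejected", '["rule-a"]'),
    ("accepted", '["rule-b"]'),
    ("accepted", None),  # no rules recorded for this event
]

counts = defaultdict(Counter)  # rule id -> Counter of outcomes
for outcome, raw in rows:
    # Same decoding rule as the loading cell: empty or NULL means no rules.
    for rule in (json.loads(raw) if raw else []):
        counts[rule][outcome] += 1

for rule, by_outcome in sorted(counts.items()):
    total = sum(by_outcome.values())
    share = by_outcome["accepted"] / total
    print(f"{rule}: {dict(by_outcome)}, accepted share = {share:.2f}")
```

These counts are co-occurrence only. A rule that is active during many successes is not shown to have caused them.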

## 6. Signal Log: Time-Series Backbone

The signal.jsonl file is the append-only event log. Every emission gets a line.

In [None]:
signals = []
if SIGNAL_LOG.exists():
    with open(SIGNAL_LOG) as f:
        for line in f:
            if line.strip():
                signals.append(json.loads(line))

signals_df = pd.DataFrame(signals)
if len(signals_df) > 0:
    signals_df["ts"] = pd.to_datetime(signals_df["ts"])
    print(f"Signal events: {len(signals_df)}")
    print(f"Date range: {signals_df['ts'].min()} to {signals_df['ts'].max()}")
    print()
    print("Event types:")
    print(signals_df["type"].value_counts())
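
Because the log is append-only, its last line can be half-written if the notebook reads it during an emission, and `json.loads` then raises `json.JSONDecodeError`. A tolerant reader could look like the sketch below. `read_jsonl` is a hypothetical helper, and the sample values (including the `mistake` type) are made up; only the `ts` and `type` field names come from the cell above.

```python
import io
import json


def read_jsonl(lines):
    """Parse JSONL, returning (records, number_of_skipped_lines)."""
    records, skipped = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            skipped += 1  # e.g. a truncated final line
    return records, skipped


# Demo: one good line, one blank line, one line cut off mid-write.
sample = io.StringIO(
    '{"ts": "2026-02-06T23:55:30", "type": "mistake"}\n'
    "\n"
    '{"ts": "2026-02-06T23:56:0'
)
records, skipped = read_jsonl(sample)
print(len(records), "parsed,", skipped, "skipped")
```

Reporting the skipped count, instead of silently dropping lines, keeps data loss visible.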

In [None]:
# Events over time
if len(signals_df) > 0:
    signals_df["hour"] = signals_df["ts"].dt.floor("h")
    hourly = signals_df.groupby(["hour", "type"]).size().unstack(fill_value=0)
    
    fig, ax = plt.subplots(figsize=(12, 5))
    hourly.plot(kind="bar", stacked=True, ax=ax, width=0.8)
    ax.set_xlabel("Hour")
    ax.set_ylabel("Events")
    ax.set_title("Emission Events by Hour")
    ax.legend(title="Type")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

## 7. Intervention Points: Seed Ingestion Dates

Seed filenames carry timestamps. These are the treatment dates for the before/after comparison.

In [None]:
# Parse seed filenames for intervention dates
interventions = []
if SEEDS_DIR.exists():
    for f in SEEDS_DIR.glob("*.yaml"):
        # Format: persona_name_2026-02-06T23-55-30.yaml
        name = f.stem
        parts = name.rsplit("_", 1)
        if len(parts) == 2:
            persona, ts_str = parts
            try:
                ts = datetime.strptime(ts_str, "%Y-%m-%dT%H-%M-%S")
                interventions.append({"persona": persona, "ingested_at": ts, "file": f.name})
            except ValueError:
                # Not a timestamped file
                interventions.append({"persona": name, "ingested_at": None, "file": f.name})
        else:
            interventions.append({"persona": name, "ingested_at": None, "file": f.name})

interventions_df = pd.DataFrame(interventions)
if len(interventions_df) > 0:
    print("Seed Files (Intervention Points):")
    print(interventions_df.to_string(index=False))
else:
    print("No seed files found. Ingest qortex rules to create intervention points.")
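
The filename parsing can be isolated into a small function, which makes the fallback cases easy to check. This is a sketch of the same logic as the loop above; `parse_seed_stem` is an illustrative name, and the example stems are made up apart from the format shown in the cell's comment.

```python
from datetime import datetime

SEED_TS_FORMAT = "%Y-%m-%dT%H-%M-%S"


def parse_seed_stem(stem):
    """Split a seed filename stem into (persona, ingested_at or None)."""
    persona, sep, ts_str = stem.rpartition("_")
    if sep:
        try:
            return persona, datetime.strptime(ts_str, SEED_TS_FORMAT)
        except ValueError:
            pass  # the suffix after the last underscore is not a timestamp
    return stem, None


print(parse_seed_stem("persona_name_2026-02-06T23-55-30"))
print(parse_seed_stem("persona_name"))
print(parse_seed_stem("untimestamped"))
```

Splitting on the last underscore matters because persona names may themselves contain underscores.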

## 8. Summary: Current State

In [None]:
summary = {
    "learnings": {
        "total": len(learnings),
        "categories": learnings["category"].nunique(),
        "total_reinforcements": int(learnings["reinforcement_count"].sum()),
        "total_contradictions": int(learnings["contradiction_count"].sum()),
    },
    "mistakes": {
        "total": len(mistakes),
        "with_attribution": int(mistakes["corrected_by_rule"].notna().sum()) if len(mistakes) > 0 else 0,
    },
    "rewards": {
        "total": len(rewards),
        "with_rules_active": int(rewards["rules_active"].apply(len).gt(0).sum()),
    },
    "signals": len(signals_df) if len(signals_df) > 0 else 0,
    "interventions": len(interventions_df) if len(interventions_df) > 0 else 0,
}

print(json.dumps(summary, indent=2))

## 9. Next Steps

**What we can measure now:**
- Rule strength trends (reinforcement_count over time per learning)
- Category-level evidence accumulation
- Emission volume over time

**What we need for causal attribution:**
- `corrected_by_rule` populated on mistakes (direct attribution)
- `rules_active` populated on reward events (which rules were on)
- More seed ingestion events (intervention points)
- Longer time series (days/weeks, not hours)

**The experiment:**
1. Ingest qortex rules for iterator_visitor_patterns, factory_patterns
2. Continue normal work, accumulating mistakes and rewards
3. Re-run this notebook after a week
4. Compare: repeat rates before vs after intervention
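
Step 4 of the experiment can be sketched as a split of the `mistakes` rows at an intervention timestamp. The rows below are invented, and the function names are illustrative; only the `timestamp` and `was_repeat` column names come from this notebook.

```python
from datetime import datetime


def repeat_rate(mistakes):
    """Share of mistakes flagged was_repeat, or None if the list is empty."""
    if not mistakes:
        return None
    return sum(1 for m in mistakes if m["was_repeat"]) / len(mistakes)


def split_repeat_rates(mistakes, intervention_at):
    """Repeat rate before, and at or after, the intervention timestamp."""
    before = [m for m in mistakes if m["timestamp"] < intervention_at]
    after = [m for m in mistakes if m["timestamp"] >= intervention_at]
    return repeat_rate(before), repeat_rate(after)


# Made-up rows for illustration.
mistakes_demo = [
    {"timestamp": datetime(2026, 2, 1), "was_repeat": 1},
    {"timestamp": datetime(2026, 2, 3), "was_repeat": 1},
    {"timestamp": datetime(2026, 2, 5), "was_repeat": 0},
    {"timestamp": datetime(2026, 2, 8), "was_repeat": 0},
    {"timestamp": datetime(2026, 2, 9), "was_repeat": 1},
]
before, after = split_repeat_rates(mistakes_demo, datetime(2026, 2, 6, 23, 55, 30))
print(f"repeat rate before: {before:.2f}, after: {after:.2f}")
```

With only a handful of mistakes on either side, a difference like this is suggestive at best, which is why the list above asks for a longer time series.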

## 10. Fisher Manifold: Belief Trajectories in Information Space

The bandit maintains Beta(α, β) posteriors for each arm. As rewards arrive, these
posteriors trace a path through the space of all Beta distributions.

But plotting (α, β) in Euclidean space is misleading: the "true" distance between
two beliefs is measured by the **Fisher information metric**, not Euclidean distance.

**Why it matters:**
- Beta(1,1) → Beta(2,1) is a *huge* belief shift (ignorance → "maybe good")
- Beta(100,100) → Beta(101,100) is a *tiny* shift (already very confident)
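- Euclidean distance is 1.0 in both cases, but the first-order Fisher step length differs by more than an order of magnitude: about 1.00 at Beta(1,1) versus about 0.071 at Beta(100,100).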

The Fisher information metric for Beta(α, β) is:

```
G(α, β) = | ψ'(α) - ψ'(α+β)       -ψ'(α+β)       |
          |    -ψ'(α+β)        ψ'(β) - ψ'(α+β)   |
```

where ψ' is the trigamma function (scipy.special.polygamma(1, x)).

This section loads real bandit trajectories and visualizes them with
Fisher-aware metrics: speed, distance, and convergence.

In [None]:
import numpy as np
from scipy.special import polygamma
from collections import defaultdict

# ── Fisher Information Metric for Beta(α, β) ──────────────────────────
#
# The trigamma function ψ'(x) = d²/dx² ln Γ(x) measures how
# "sensitive" the distribution is to parameter changes at that point.

def fisher_metric(alpha, beta):
    """Compute the 2x2 Fisher information matrix for Beta(α, β).
    
    Returns G such that ds² = Σᵢⱼ Gᵢⱼ dθᵢ dθⱼ gives the
    information-theoretic distance between nearby distributions.
    """
    psi1_a = polygamma(1, alpha)       # ψ'(α)
    psi1_b = polygamma(1, beta)        # ψ'(β)
    psi1_ab = polygamma(1, alpha + beta)  # ψ'(α+β)
    
    G = np.array([
        [psi1_a - psi1_ab,  -psi1_ab],
        [-psi1_ab,           psi1_b - psi1_ab]
    ])
    return G


def fisher_speed(alpha1, beta1, alpha2, beta2):
    """Fisher speed between two consecutive posterior states.
    
    Computes ds = sqrt(Δθᵀ G(θ₁) Δθ), a first-order approximation to the
    arc length on the Fisher manifold, using the metric at the starting point.
    """
    G = fisher_metric(alpha1, beta1)
    dtheta = np.array([alpha2 - alpha1, beta2 - beta1])
    
    # ds² = Δθᵀ G Δθ
    ds_sq = dtheta @ G @ dtheta
    return np.sqrt(max(ds_sq, 0))  # Clamp for numerical safety


def posterior_entropy(alpha, beta):
    """Differential entropy of Beta(α, β) — measures uncertainty.
    
    Lower entropy = more concentrated belief = higher confidence.
    As the system learns, entropy should decrease.
    """
    from scipy.special import betaln
    from scipy.special import digamma
    
    return (betaln(alpha, beta) 
            - (alpha - 1) * digamma(alpha) 
            - (beta - 1) * digamma(beta) 
            + (alpha + beta - 2) * digamma(alpha + beta))


print("Fisher metric at Beta(1,1) — uniform prior (maximum ignorance):")
print(fisher_metric(1.0, 1.0).round(4))
print()
print("Fisher metric at Beta(10,10) — moderate confidence:")
print(fisher_metric(10.0, 10.0).round(4))
print()
print("Fisher metric at Beta(100,100) — high confidence:")
print(fisher_metric(100.0, 100.0).round(4))
print()
print("Notice: G shrinks as confidence grows.")
print("A step of Δα=1 near Beta(1,1) covers MUCH more Fisher distance")
print(f"than near Beta(100,100): {fisher_speed(1,1,2,1):.4f} vs {fisher_speed(100,100,101,100):.4f}")
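
As a cross-check that does not depend on SciPy, the trigamma function can be approximated with the recurrence ψ'(x) = ψ'(x+1) + 1/x² and a standard asymptotic series. The sketch below is an independent re-implementation for verification only; `trigamma` and `step_length` are illustrative names, not functions used elsewhere in this notebook.

```python
import math


def trigamma(x):
    """Approximate ψ'(x) for x > 0 via recurrence plus an asymptotic series."""
    total = 0.0
    while x < 10.0:  # shift upward using ψ'(x) = ψ'(x + 1) + 1/x²
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x²) + 1/(6x³) - 1/(30x⁵) + 1/(42x⁷)
    series = inv + inv2 / 2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))
    return total + series


def step_length(alpha, beta, d_alpha, d_beta):
    """First-order Fisher length of a step, with the metric at the start point."""
    t_ab = trigamma(alpha + beta)
    g11 = trigamma(alpha) - t_ab
    g22 = trigamma(beta) - t_ab
    g12 = -t_ab
    ds_sq = g11 * d_alpha**2 + 2 * g12 * d_alpha * d_beta + g22 * d_beta**2
    return math.sqrt(max(ds_sq, 0.0))


print("ψ'(1) matches π²/6:", abs(trigamma(1.0) - math.pi**2 / 6) < 1e-9)
print("step at Beta(1,1):    ", round(step_length(1, 1, 1, 0), 4))
print("step at Beta(100,100):", round(step_length(100, 100, 1, 0), 4))
```

The value at Beta(1,1) follows directly from the recurrence: G₁₁ = ψ'(1) - ψ'(2) = 1, so a unit step in α has first-order length 1.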

### 10.1 Load Bandit Trajectories

The bandit state is an append-only JSONL file. Each line is a posterior snapshot
after an update. We reconstruct trajectories per (context, rule_id).

In [None]:
# Load bandit state — try project-local first, then global
BANDIT_CANDIDATES = [
    Path("../buildlog/bandit_state.jsonl"),           # from qortex/notebooks/
    Path.home() / ".buildlog" / "bandit_state.jsonl",  # global
    Path("../../buildlog-template/buildlog/bandit_state.jsonl"),
]

bandit_path = None
for p in BANDIT_CANDIDATES:
    if p.exists():
        bandit_path = p
        break

if bandit_path is None:
    print("No bandit_state.jsonl found. Checked:")
    for p in BANDIT_CANDIDATES:
        print(f"  {p}")
    trajectories = {}
else:
    print(f"Loading bandit state from: {bandit_path}")
    
    # Parse JSONL — each line is a snapshot of one arm's posterior
    snapshots = []
    with open(bandit_path) as f:
        for i, line in enumerate(f):
            if line.strip():
                record = json.loads(line)
                record["step"] = i  # Ordering proxy (append-only)
                snapshots.append(record)
    
    bandit_df = pd.DataFrame(snapshots)
    bandit_df["updated_at"] = pd.to_datetime(bandit_df["updated_at"])
    
    # Group into trajectories: (context, rule_id) → sequence of (α, β)
    # Each trajectory starts at Beta(1,1) implicitly (the prior)
    trajectories = {}
    for (ctx, rid), group in bandit_df.groupby(["context", "rule_id"]):
        group = group.sort_values("step")
        key = f"{ctx}:{rid}"
        
        # Prepend the implicit prior Beta(1,1)
        points = [(1.0, 1.0)]
        for _, row in group.iterrows():
            points.append((row["alpha"], row["beta"]))
        
        trajectories[key] = points
    
    print(f"\nTrajectories loaded: {len(trajectories)}")
    for key, pts in trajectories.items():
        a, b = pts[-1]
        print(f"  {key}: {len(pts)} points, current Beta({a:.2f}, {b:.2f}), "
              f"mean={a/(a+b):.3f}")
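
The grouping step can also be sketched with the standard library alone, which is handy for testing the reconstruction logic on a few hand-written lines. The snapshots below are invented (the context and rule ids are placeholders); the field names `context`, `rule_id`, `alpha`, and `beta` are the ones the cell above reads.

```python
import io
import json
from collections import defaultdict

PRIOR = (1.0, 1.0)  # implicit Beta(1,1) starting point, as in the cell above


def build_trajectories(lines):
    """Group JSONL snapshots into {"context:rule_id": [(alpha, beta), ...]}."""
    paths = defaultdict(lambda: [PRIOR])
    for line in lines:  # file order is taken as the update order
        if not line.strip():
            continue
        rec = json.loads(line)
        key = f'{rec["context"]}:{rec["rule_id"]}'
        paths[key].append((float(rec["alpha"]), float(rec["beta"])))
    return dict(paths)


# Made-up snapshots for two arms in one context.
sample = io.StringIO(
    '{"context": "review", "rule_id": "r1", "alpha": 2.0, "beta": 1.0}\n'
    '{"context": "review", "rule_id": "r2", "alpha": 1.0, "beta": 2.0}\n'
    '{"context": "review", "rule_id": "r1", "alpha": 3.0, "beta": 1.0}\n'
)
for key, pts in build_trajectories(sample).items():
    a, b = pts[-1]
    print(key, pts, f"mean={a / (a + b):.3f}")
```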

### 10.2 Trajectory Plot with Fisher Speed Coloring

Each arm's posterior traces a path through (α, β) space. The color at each
segment encodes **Fisher speed** — how much *information-theoretic* distance
that update covered. Hot segments = large belief shifts. Cool segments = refinement.

In [None]:
from matplotlib.collections import LineCollection
from matplotlib.colors import Normalize

if trajectories:
    fig, ax = plt.subplots(figsize=(10, 8))
    
    cmap = plt.cm.plasma
    all_speeds = []
    
    # First pass: compute all speeds for normalization
    for key, pts in trajectories.items():
        for i in range(1, len(pts)):
            a1, b1 = pts[i - 1]
            a2, b2 = pts[i]
            all_speeds.append(fisher_speed(a1, b1, a2, b2))
    
    if all_speeds:
        norm = Normalize(vmin=0, vmax=max(all_speeds))
    else:
        norm = Normalize(vmin=0, vmax=1)
    
    # Second pass: plot each trajectory
    markers = ['o', 's', 'D', '^', 'v', 'P', '*', 'X']
    for idx, (key, pts) in enumerate(trajectories.items()):
        alphas = [p[0] for p in pts]
        betas = [p[1] for p in pts]
        
        # Compute per-segment Fisher speeds
        speeds = [0.0]  # No speed for the first point
        for i in range(1, len(pts)):
            a1, b1 = pts[i - 1]
            a2, b2 = pts[i]
            speeds.append(fisher_speed(a1, b1, a2, b2))
        
        # Draw colored line segments
        points = np.array(list(zip(alphas, betas)))
        if len(points) > 1:
            segments = np.array([[points[i], points[i + 1]] 
                                 for i in range(len(points) - 1)])
            seg_speeds = speeds[1:]  # Drop the placeholder 0.0: one speed per segment
            
            lc = LineCollection(segments, cmap=cmap, norm=norm, 
                               linewidths=2, alpha=0.8)
            lc.set_array(np.array(seg_speeds))
            ax.add_collection(lc)
        
        # Mark start and end
        marker = markers[idx % len(markers)]
        ax.plot(alphas[0], betas[0], marker=marker, color='gray', 
                markersize=10, markeredgecolor='black', linewidth=0,
                label=f'{key} (start)', zorder=5)
        ax.plot(alphas[-1], betas[-1], marker=marker, 
                color=cmap(norm(speeds[-1])), markersize=14,
                markeredgecolor='black', markeredgewidth=1.5, 
                linewidth=0, zorder=5)
        
        # Label the endpoint
        ax.annotate(key.split(":")[-1][:12], 
                    (alphas[-1], betas[-1]),
                    textcoords="offset points", xytext=(8, 5),
                    fontsize=7, alpha=0.8)
    
    # Colorbar
    sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
    sm.set_array([])
    cbar = plt.colorbar(sm, ax=ax, label="Fisher Speed (info-theoretic distance per update)")
    
    # Reference lines
    max_coord = max(max(p[0] for pts in trajectories.values() for p in pts),
                    max(p[1] for pts in trajectories.values() for p in pts))
    diag = np.linspace(0.5, max_coord + 0.5, 50)
    ax.plot(diag, diag, 'k--', alpha=0.2, label='α = β (unbiased)')
    
    ax.set_xlabel("α (success evidence)", fontsize=12)
    ax.set_ylabel("β (failure evidence)", fontsize=12)
    ax.set_title("Belief Trajectories on Beta Parameter Space\n(colored by Fisher speed)", fontsize=13)
    ax.set_xlim(0.5, max_coord + 0.5)
    ax.set_ylim(0.5, max_coord + 0.5)
    ax.set_aspect('equal')
    plt.tight_layout()
    plt.show()
    
    # Print summary
    print("\nFisher speed summary per trajectory:")
    for key, pts in trajectories.items():
        speeds = []
        for i in range(1, len(pts)):
            a1, b1 = pts[i - 1]
            a2, b2 = pts[i]
            speeds.append(fisher_speed(a1, b1, a2, b2))
        if speeds:
            total = sum(speeds)
            print(f"  {key}: total path length = {total:.4f}, "
                  f"max speed = {max(speeds):.4f}, "
                  f"final speed = {speeds[-1]:.4f}")
else:
    print("No trajectories to plot.")

### 10.3 Pairwise Fisher Distance Between Arms

How "far apart" are the current beliefs about each arm? Arms with similar
Fisher positions have similar evidence profiles — they might be candidates
for merging or represent redundant rules. Arms that are far apart have
diverged significantly in the system's belief about their effectiveness.

In [None]:
from scipy.integrate import quad_vec

def fisher_distance_approx(a1, b1, a2, b2, n_steps=50):
    """Approximate Fisher (geodesic) distance via numerical path integral.
    
    Integrates ds along a straight line in parameter space. The straight
    line is not the true geodesic, so the result is an upper bound on the
    geodesic distance (up to discretization error) and a good
    approximation for nearby points.
    """
    total = 0.0
    for i in range(n_steps):
        t0 = i / n_steps
        t1 = (i + 1) / n_steps
        a_mid = a1 + (a2 - a1) * (t0 + t1) / 2
        b_mid = b1 + (b2 - b1) * (t0 + t1) / 2
        
        da = (a2 - a1) / n_steps
        db = (b2 - b1) / n_steps
        
        G = fisher_metric(a_mid, b_mid)
        dtheta = np.array([da, db])
        ds = np.sqrt(max(dtheta @ G @ dtheta, 0))
        total += ds
    return total

if trajectories and len(trajectories) > 1:
    keys = list(trajectories.keys())
    n = len(keys)
    
    # Get current (final) posterior for each arm
    endpoints = {k: pts[-1] for k, pts in trajectories.items()}
    
    # Compute pairwise distances
    dist_matrix = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            a1, b1 = endpoints[keys[i]]
            a2, b2 = endpoints[keys[j]]
            d = fisher_distance_approx(a1, b1, a2, b2)
            dist_matrix[i, j] = d
            dist_matrix[j, i] = d
    
    # Plot heatmap
    fig, ax = plt.subplots(figsize=(8, 6))
    
    # Short labels for readability
    short_labels = [k.split(":")[-1][:15] for k in keys]
    
    im = ax.imshow(dist_matrix, cmap='YlOrRd', aspect='auto')
    ax.set_xticks(range(n))
    ax.set_yticks(range(n))
    ax.set_xticklabels(short_labels, rotation=45, ha='right', fontsize=9)
    ax.set_yticklabels(short_labels, fontsize=9)
    
    # Annotate cells with distance values
    for i in range(n):
        for j in range(n):
            color = 'white' if dist_matrix[i, j] > dist_matrix.max() * 0.6 else 'black'
            ax.text(j, i, f'{dist_matrix[i, j]:.2f}', 
                    ha='center', va='center', fontsize=10, color=color)
    
    plt.colorbar(im, ax=ax, label='Approx. Fisher Distance')
    ax.set_title('Pairwise Fisher Distance Between Arm Beliefs\n(current posteriors)', fontsize=13)
    plt.tight_layout()
    plt.show()
    
    # Interpretation
    max_idx = np.unravel_index(np.argmax(dist_matrix), dist_matrix.shape)
    min_idx = None
    min_val = float('inf')
    for i in range(n):
        for j in range(i + 1, n):
            if dist_matrix[i, j] < min_val:
                min_val = dist_matrix[i, j]
                min_idx = (i, j)
    
    print(f"\nMost divergent pair: {keys[max_idx[0]]} ↔ {keys[max_idx[1]]} "
          f"(distance = {dist_matrix[max_idx]:.4f})")
    if min_idx:
        print(f"Most similar pair:   {keys[min_idx[0]]} ↔ {keys[min_idx[1]]} "
              f"(distance = {dist_matrix[min_idx]:.4f})")
elif trajectories:
    print("Only one trajectory — need at least 2 for pairwise distances.")
else:
    print("No trajectories to compare.")

### 10.4 Entropy Convergence: Is the System Learning?

Posterior entropy measures uncertainty. If the system is learning, entropy should
**decrease over time** as evidence accumulates and beliefs sharpen.

The rate of decrease matters too: fast initial drops followed by plateaus suggest
the system quickly formed opinions and is now refining them. Entropy that
*increases* would signal contradictory evidence — the system is getting *confused*.

In [None]:
if trajectories:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    colors = plt.cm.Set2(np.linspace(0, 1, len(trajectories)))
    
    # ── Left: Entropy over update steps ──
    ax = axes[0]
    for idx, (key, pts) in enumerate(trajectories.items()):
        entropies = [posterior_entropy(a, b) for a, b in pts]
        steps = range(len(pts))
        ax.plot(steps, entropies, 'o-', color=colors[idx], 
                label=key.split(":")[-1][:15], linewidth=2, markersize=6)
    
    ax.set_xlabel("Update Step", fontsize=11)
    ax.set_ylabel("Posterior Entropy (nats)", fontsize=11)
    ax.set_title("Entropy Convergence per Arm", fontsize=13)
    ax.legend(fontsize=8, loc='best')
    ax.axhline(y=posterior_entropy(1.0, 1.0), color='gray', linestyle=':', 
               alpha=0.5, label='_nolegend_')
    ax.annotate("Beta(1,1) prior", 
                xy=(0, posterior_entropy(1.0, 1.0)),
                xytext=(0.5, posterior_entropy(1.0, 1.0) + 0.02),
                fontsize=8, color='gray')
    
    # ── Right: Fisher speed over update steps ──
    ax = axes[1]
    for idx, (key, pts) in enumerate(trajectories.items()):
        speeds = [0.0]  # First point has no speed
        for i in range(1, len(pts)):
            a1, b1 = pts[i - 1]
            a2, b2 = pts[i]
            speeds.append(fisher_speed(a1, b1, a2, b2))
        steps = range(len(pts))
        ax.plot(steps, speeds, 's-', color=colors[idx],
                label=key.split(":")[-1][:15], linewidth=2, markersize=6)
    
    ax.set_xlabel("Update Step", fontsize=11)
    ax.set_ylabel("Fisher Speed", fontsize=11)
    ax.set_title("Learning Velocity per Arm\n(should decrease as beliefs stabilize)", fontsize=13)
    ax.legend(fontsize=8, loc='best')
    
    plt.tight_layout()
    plt.show()
    
    # Numerical summary
    print("Convergence summary:")
    print(f"{'Arm':<35} {'H(start)':>10} {'H(end)':>10} {'ΔH':>10} {'Converging?':>12}")
    print("-" * 80)
    for key, pts in trajectories.items():
        h_start = posterior_entropy(*pts[0])
        h_end = posterior_entropy(*pts[-1])
        delta = h_end - h_start
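        # pts[0] is always the Beta(1,1) prior (entropy 0, the maximum for a
        # Beta), so delta <= 0 here and "CONFUSED" cannot trigger from start vs end.
        status = "yes" if delta < 0 else "CONFUSED" if delta > 0.01 else "flat"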
        short_key = key.split(":")[-1][:33]
        print(f"{short_key:<35} {h_start:>10.4f} {h_end:>10.4f} {delta:>+10.4f} {status:>12}")
else:
    print("No trajectories to analyze.")
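
Entropy increases only show up between consecutive updates, because the Beta(1,1) starting point has entropy 0, the largest value any Beta distribution can have. A per-step check is sketched below using only the standard library; `digamma`, `beta_entropy`, and `entropy_increases` are illustrative re-implementations, and the trajectory is made up.

```python
import math


def digamma(x):
    """Approximate ψ(x) for x > 0 via recurrence plus an asymptotic series."""
    total = 0.0
    while x < 10.0:  # shift upward using ψ(x) = ψ(x + 1) - 1/x
        total -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x²) + 1/(120x⁴) - 1/(252x⁶)
    series = math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return total + series


def beta_entropy(a, b):
    """Differential entropy of Beta(a, b), same formula as posterior_entropy."""
    ln_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (ln_beta - (a - 1) * digamma(a) - (b - 1) * digamma(b)
            + (a + b - 2) * digamma(a + b))


def entropy_increases(points, tol=1e-9):
    """Indices i where entropy rose going from points[i-1] to points[i]."""
    h = [beta_entropy(a, b) for a, b in points]
    return [i for i in range(1, len(h)) if h[i] - h[i - 1] > tol]


# Made-up trajectory: a success, then a failure that contradicts it.
path = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (3.0, 2.0)]
print([round(beta_entropy(a, b), 4) for a, b in path])
print("steps where entropy rose:", entropy_increases(path))
```

In this example the move from Beta(2,1) to Beta(2,2) raises the entropy even though the final entropy is still below the prior's, so a start-versus-end comparison would miss it.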

### 10.5 Reading the Fisher Manifold

**Trajectory plot (10.2):**
- Each path starts at (1, 1) — the uniform prior, maximum ignorance
- Movement **right** (↑α) = accumulating success evidence
- Movement **up** (↑β) = accumulating failure evidence
- **Hot colors** = the system's beliefs shifted dramatically on that update
- **Cool colors** = refinement, diminishing marginal information

**Distance heatmap (10.3):**
- High distance = the system has strongly differentiated between two arms
- Low distance = similar evidence profiles — possible redundancy
- Zero diagonal = identity (same arm)

**Entropy convergence (10.4):**
- **Decreasing entropy** = the system is learning, beliefs are sharpening
- **Flat entropy** = stasis — no new information arriving
- **Increasing entropy** = contradictory signals (deserves investigation)
- **Fisher speed decreasing** = the same Δα or Δβ covers less ground as confidence grows

**What this enables for Singularity:**
- The observability layer can render these plots in real-time as the bandit updates
- Entropy convergence rate is a direct, quantitative answer to "is the system learning?"
- Fisher distance between arms informs automated rule merging/pruning decisions
- Speed anomalies (sudden spikes after plateau) flag regime changes worth investigating
- All of this generalizes beyond rules to any arm: tools, prompts, models, strategies

## 11. Dynamical Systems: The Flow Field of Learning

The bandit update rule defines a **vector field** on (α, β) space. At every point,
there's an expected direction the posterior will move given the arm's true reward rate.

For an arm with true reward probability p:
- E[Δα] = p (expected successes per trial)
- E[Δβ] = 1 - p (expected failures per trial)

This vector field tells us: if you're standing at Beta(α, β) and the true rate is p,
which way does the flow push you?

We can plot this TODAY. No geodesics, no Christoffel symbols needed.
Just matplotlib quiver plots on real parameter space.

In [None]:
# ── 11.1 Phase Portrait: Update Flow Field ──────────────────────────
#
# For a given true reward rate p, the expected update at (α, β) is:
#   Δα = p,  Δβ = 1 - p
#
# The flow is CONSTANT in Euclidean coordinates (same vector everywhere).
# But in Fisher coordinates, it covers different distances depending on
# where you are. We show both: the Euclidean flow AND the Fisher-weighted
# flow (arrows scaled by Fisher speed).

fig, axes = plt.subplots(1, 3, figsize=(18, 6))

# Grid for quiver plot
a_range = np.linspace(1.0, 6.0, 12)
b_range = np.linspace(1.0, 6.0, 12)
A, B = np.meshgrid(a_range, b_range)

for ax_idx, (p_true, title) in enumerate([
    (0.8, "Good arm (p=0.8)"),
    (0.5, "Neutral arm (p=0.5)"),
    (0.2, "Bad arm (p=0.2)"),
]):
    ax = axes[ax_idx]
    
    # Expected update direction (constant in Euclidean coords)
    DA = np.full_like(A, p_true)
    DB = np.full_like(B, 1.0 - p_true)
    
    # Compute Fisher speed at each grid point for this update
    speeds = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            speeds[i, j] = fisher_speed(A[i,j], B[i,j], 
                                         A[i,j] + p_true, B[i,j] + 1 - p_true)
    
    # Scale arrows by Fisher speed (information-weighted flow)
    # Normalize so arrows are visible
    max_speed = speeds.max() if speeds.max() > 0 else 1
    DA_scaled = DA * speeds / max_speed
    DB_scaled = DB * speeds / max_speed
    
    # Plot Fisher-weighted flow
    q = ax.quiver(A, B, DA_scaled, DB_scaled, speeds,
                  cmap='plasma', alpha=0.8, scale=15, width=0.004)
    
    # Overlay real trajectories if we have them
    if trajectories:
        for key, pts in trajectories.items():
            alphas = [pt[0] for pt in pts]
            betas = [pt[1] for pt in pts]
            ax.plot(alphas, betas, 'k-', alpha=0.3, linewidth=1)
            ax.plot(alphas[-1], betas[-1], 'ko', markersize=4, alpha=0.5)
    
    # The fixed point of the MEAN dynamics: α/(α+β) = p
    # This is the line α = p * (α + β), i.e., β = α * (1-p)/p
    fp_a = np.linspace(1, 6, 50)
    fp_b = fp_a * (1 - p_true) / p_true
    mask = fp_b <= 6
    ax.plot(fp_a[mask], fp_b[mask], 'r--', linewidth=2, alpha=0.6,
            label=f'mean = {p_true} (attractor line)')
    
    ax.plot(1, 1, 'w*', markersize=15, markeredgecolor='black',
            markeredgewidth=1.5, zorder=10, label='Prior Beta(1,1)')
    ax.set_xlabel("α", fontsize=11)
    ax.set_ylabel("β", fontsize=11)
    ax.set_title(f"{title}\nArrow size = Fisher speed", fontsize=12)
    ax.set_xlim(0.5, 6.5)
    ax.set_ylim(0.5, 6.5)
    ax.legend(fontsize=8, loc='upper left')
    ax.set_aspect('equal')

plt.suptitle("Phase Portraits: Expected Update Flow (Fisher-weighted)",
             fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

print("Key observations:")
print("  - Arrows near (1,1) are LARGE: early updates cover huge Fisher distance")
print("  - Arrows far from (1,1) are tiny: same Euclidean step, negligible information gain")
print("  - The red dashed line is the attractor: where α/(α+β) = p_true")
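print("  - In expectation every trajectory's MEAN converges to p_true: paths run parallel to the red line, so the gap closes in ratio, not in Euclidean distance. There are no repellers.")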
print("  - The attractor is a LINE, not a point: concentration grows forever, mean stabilizes")
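
The sense in which the red line attracts can be made concrete by following the expected update from the prior. The sketch below assumes the constant expected step (p, 1 - p) described above; `expected_path` and `offset_from_attractor` are illustrative names.

```python
import math


def expected_path(a0, b0, p, n_steps):
    """Expected posterior after 0..n_steps trials at true reward rate p."""
    return [(a0 + p * t, b0 + (1 - p) * t) for t in range(n_steps + 1)]


def offset_from_attractor(a, b, p):
    """Euclidean distance from (a, b) to the line a / (a + b) = p."""
    return abs(a * (1 - p) - b * p) / math.hypot(p, 1 - p)


p_true = 0.8
for t, (a, b) in enumerate(expected_path(1.0, 1.0, p_true, 100)):
    if t in (0, 1, 10, 100):
        print(f"t={t:>3}  mean={a / (a + b):.3f}  "
              f"offset={offset_from_attractor(a, b, p_true):.3f}")
```

The posterior mean approaches p while the Euclidean offset from the line stays constant, because the expected step is parallel to the line. Convergence here is a statement about the ratio α/(α+β), not about reaching the line in (α, β) coordinates.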

### 11.2 Separatrix: The Geometry of Stubbornness

If two arms compete, there's a **separatrix** in belief space: on one side,
the system will eventually conclude "arm A is better." On the other side, "arm B is better."

The separatrix is the boundary of the basins of attraction. How much evidence
does it take to cross it? That's a measure of how *stubborn* the system is:
how hard it is to change its mind once it's formed an opinion.
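
One way to quantify how decided the system is between two arms is the probability that one arm's Thompson sample exceeds the other's. The Monte Carlo sketch below uses `random.betavariate` from the standard library. The two posteriors are made up, and `prob_a_beats_b` is an illustrative helper, not part of the bandit implementation.

```python
import random


def prob_a_beats_b(arm_a, arm_b, n_draws=20_000, seed=0):
    """Monte Carlo estimate of P(θ_A > θ_B) under independent Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*arm_a) > rng.betavariate(*arm_b)
        for _ in range(n_draws)
    )
    return wins / n_draws


# Made-up posteriors: A has the higher mean, B has less evidence behind it.
arm_a = (8.0, 4.0)  # posterior mean 0.667
arm_b = (2.0, 2.0)  # posterior mean 0.500
p = prob_a_beats_b(arm_a, arm_b)
print(f"P(A sampled above B) is about {p:.2f}; "
      f"B still wins about {1 - p:.2f} of draws")
```

Because B's posterior is wide, it still wins a noticeable share of draws despite its lower mean, which is why a boundary defined only by equal means understates how often the system will explore the weaker arm.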

In [None]:
# ── 11.2 Separatrix & Basins of Attraction ───────────────────────────
#
# For Thompson Sampling, arm A is preferred when its sampled value
# exceeds arm B's. The DECISION BOUNDARY in belief space is where
# the posterior means are equal: α_A/(α_A + β_A) = α_B/(α_B + β_B).
#
# But it's more nuanced than the mean: variance matters too.
# A high-variance arm can still "win" a sample even with lower mean.
# The TRUE separatrix depends on the full posterior overlap.
#
# For now: visualize the mean-based separatrix and the Fisher distance
# to cross it from various starting points.

fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# ── Left: Decision regions in (mean, concentration) space ──
ax = axes[0]

# Transform trajectories to (mean, concentration) coordinates
# mean = α/(α+β),  concentration = α+β
if trajectories:
    for idx, (key, pts) in enumerate(trajectories.items()):
        means = [a / (a + b) for a, b in pts]
        concs = [a + b for a, b in pts]
        color = plt.cm.Set2(idx / max(len(trajectories) - 1, 1))
        ax.plot(means, concs, 'o-', color=color, linewidth=2, markersize=5,
                label=key.split(":")[-1][:15])
        # Arrow showing direction of last step
        if len(means) > 1:
            ax.annotate('', xy=(means[-1], concs[-1]),
                       xytext=(means[-2], concs[-2]),
                       arrowprops=dict(arrowstyle='->', color=color, lw=2))

# The separatrix in mean-concentration space is just mean = 0.5
# (where two equal arms would be indistinguishable by mean)
ax.axvline(x=0.5, color='red', linestyle='--', linewidth=2, alpha=0.6,
           label='mean = 0.5 (indifference)')

# Shade basins
ax.axvspan(0, 0.5, alpha=0.05, color='blue')
ax.axvspan(0.5, 1, alpha=0.05, color='green')
ax.text(0.25, 0.5, '"bad arm"\nbasin', ha='center', fontsize=10, 
        color='blue', alpha=0.5, transform=ax.get_xaxis_transform())
ax.text(0.75, 0.5, '"good arm"\nbasin', ha='center', fontsize=10,
        color='green', alpha=0.5, transform=ax.get_xaxis_transform())

ax.set_xlabel("Posterior Mean α/(α+β)", fontsize=11)
ax.set_ylabel("Concentration α+β (total evidence)", fontsize=11)
ax.set_title("Basins of Attraction\n(mean-concentration coordinates)", fontsize=12)
ax.legend(fontsize=8, loc='upper left')
ax.set_xlim(0, 1)

# ── Right: Fisher distance to separatrix ──
ax = axes[1]

# For each point on a grid, compute the Fisher distance to a reference point
# on the separatrix (mean = 0.5 line, i.e., α = β) at the same concentration
a_range = np.linspace(1.0, 8.0, 20)
b_range = np.linspace(1.0, 8.0, 20)
dist_to_sep = np.zeros((len(b_range), len(a_range)))

for i, b_val in enumerate(b_range):
    for j, a_val in enumerate(a_range):
        # Reference point on separatrix: same concentration, mean = 0.5
        # (a convenient proxy; not guaranteed to be the Fisher-nearest point)
        conc = a_val + b_val
        a_sep = conc / 2
        b_sep = conc / 2
        dist_to_sep[i, j] = fisher_distance_approx(a_val, b_val, a_sep, b_sep)

im = ax.imshow(dist_to_sep, extent=[1, 8, 1, 8], origin='lower',
               cmap='RdYlGn', aspect='equal')
plt.colorbar(im, ax=ax, label='Fisher distance to separatrix')

# The separatrix itself (α = β diagonal)
ax.plot([1, 8], [1, 8], 'k--', linewidth=2, label='Separatrix (α = β)')

# Overlay real trajectories
if trajectories:
    for key, pts in trajectories.items():
        alphas = [p[0] for p in pts]
        betas = [p[1] for p in pts]
        ax.plot(alphas, betas, 'k-', alpha=0.5, linewidth=1.5)
        ax.plot(alphas[-1], betas[-1], 'ko', markersize=5)

ax.set_xlabel("α", fontsize=11)
ax.set_ylabel("β", fontsize=11)
ax.set_title("Fisher Distance to Decision Boundary\n(green = far from indifference = committed)", fontsize=12)
ax.legend(fontsize=8)

plt.tight_layout()
plt.show()

# Stubbornness metric: Fisher distance from current position to separatrix
if trajectories:
    print("\nStubbornness (Fisher distance to separatrix):")
    for key, pts in trajectories.items():
        a, b = pts[-1]
        conc = a + b
        d = fisher_distance_approx(a, b, conc/2, conc/2)
        mean = a / (a + b)
        side = "good" if mean > 0.5 else "bad" if mean < 0.5 else "neutral"
        print(f"  {key}: mean={mean:.3f} ({side}), "
              f"Fisher dist to boundary={d:.4f}, "
              f"evidence={conc:.1f}")
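
The comment about variance in the cell above can be made concrete. A hedged sketch, using only the standard library: estimate by Monte Carlo how often one arm's Thompson sample exceeds another's. The function name `prob_a_beats_b` and the example parameters are hypothetical, chosen for illustration rather than taken from the buildlog data.

```python
import random


def prob_a_beats_b(a_params, b_params, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(theta_A > theta_B) for two Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        theta_a = rng.betavariate(*a_params)
        theta_b = rng.betavariate(*b_params)
        if theta_a > theta_b:
            wins += 1
    return wins / n_draws


# Both "A" arms have posterior mean 0.4; the opponent has mean 0.5.
p_diffuse = prob_a_beats_b((2, 3), (50, 50))         # little evidence, wide posterior
p_concentrated = prob_a_beats_b((40, 60), (50, 50))  # lots of evidence, narrow posterior
# Equal means, different concentration: a coin flip by symmetry.
p_symmetric = prob_a_beats_b((2, 2), (20, 20))

print(f"diffuse arm wins:      {p_diffuse:.3f}")
print(f"concentrated arm wins: {p_concentrated:.3f}")
print(f"equal-mean arms:       {p_symmetric:.3f}")
```

Two arms on the same side of the mean-based boundary can therefore behave very differently under sampling: the diffuse arm still wins a substantial share of draws, while the concentrated one rarely does. That gap is what the mean-based separatrix leaves out.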

## 12. Symmetry Checks: What's Conserved?

Before we can build the interoception layer, we need to know what the system's
symmetries actually ARE. Noether's theorem says: each continuous symmetry = one conserved quantity.
Each conserved quantity = one thing interoception should monitor.

We can verify three symmetries computationally right now:

1. **Conjugation symmetry**: swapping α ↔ β is equivalent to relabeling
   success/failure. The Fisher metric should be symmetric under this swap.

2. **Scaling behavior**: doubling (α, β) → (2α, 2β) doesn't change the mean
   but doubles the concentration. How does the metric scale?

3. **Permutation symmetry breaking**: all arms start at Beta(1,1). As evidence
   arrives, the pairwise distance matrix evolves. We can track HOW the symmetry
   breaks: gradually or in sharp transitions.

In [None]:
# ── 12.1 Conjugation Symmetry: G(α, β) vs G(β, α) ──────────────────
#
# If swapping α ↔ β just swaps the rows/columns of G, the metric
# is conjugation-symmetric. This means the manifold looks the same
# whether you call something "success" or "failure."

print("=== Conjugation Symmetry Check ===\n")

test_points = [(1.0, 1.0), (2.0, 5.0), (3.7, 1.2), (10.0, 3.0)]

for a, b in test_points:
    G_orig = fisher_metric(a, b)
    G_swap = fisher_metric(b, a)
    
    # Under conjugation, G should transform as:
    # G_swap[0,0] = G_orig[1,1]  (α-α component becomes β-β)
    # G_swap[1,1] = G_orig[0,0]  (β-β becomes α-α)
    # G_swap[0,1] = G_orig[0,1]  (off-diagonal unchanged)
    
    P = np.array([[0, 1], [1, 0]])  # Permutation matrix
    G_conjugated = P @ G_orig @ P.T
    
    diff = np.abs(G_swap - G_conjugated).max()
    status = "EXACT" if diff < 1e-12 else f"BROKEN (diff={diff:.2e})"
    print(f"  G({a}, {b}) vs G({b}, {a}): {status}")

print("\nConjugation symmetry holds: the manifold is symmetric about α = β.")
print("Relabeling success/failure doesn't change the geometry.")
print("→ The α = β diagonal (the mean = 0.5 separatrix) is the fixed line of this reflection.")

In [None]:
# ── 12.2 Scaling Behavior: How Does the Metric Change with Concentration? ──
#
# If we scale (α, β) → (kα, kβ), the mean stays the same but
# concentration increases. How does the Fisher metric scale?
# 
# If G(kα, kβ) = f(k) * G(α, β), the metric scales homogeneously.
# The function f(k) tells us how "information density" changes with
# evidence accumulation.

print("=== Scaling Behavior ===\n")

base_points = [(2.0, 3.0), (1.0, 1.0), (3.0, 1.0)]
scale_factors = [1.0, 2.0, 5.0, 10.0, 50.0]

for a0, b0 in base_points:
    print(f"Base point ({a0}, {b0}), mean = {a0/(a0+b0):.3f}:")
    G0 = fisher_metric(a0, b0)
    det_G0 = np.linalg.det(G0)
    
    for k in scale_factors:
        Gk = fisher_metric(k * a0, k * b0)
        det_Gk = np.linalg.det(Gk)
        
        # Ratio of determinants: det(Gk) / det(G0)
        # If homogeneous: det(Gk) = k^p * det(G0) for some power p
        if det_G0 > 0 and det_Gk > 0:
            log_ratio = np.log(det_Gk / det_G0) / np.log(k) if k > 1 else float("nan")  # exponent undefined at k = 1
            print(f"  k={k:5.1f}: det(G) ratio = {det_Gk/det_G0:.6f}, "
                  f"implied power ≈ {log_ratio:.2f}")
    print()

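print("Read the implied power off the table above: it is the exponent of det(G), not of G itself.")
print("For large k, det(G) ≈ 1/(2αβ(α+β)), so the implied power should approach -3.")
print("The scaling is not homogeneous: the entries of G shrink like k^(-1), so doubling")
print("all evidence shrinks the Fisher length of a single unit update by roughly 1/sqrt(2).")
print("This is the geometric reason for diminishing returns:")
print("the manifold literally compresses as evidence accumulates.")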
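
One way to cross-check the implied power is to compare det(G) with its large-parameter approximation 1/(2αβ(α+β)), which follows from ψ'(x) ≈ 1/x + 1/(2x²) for large x. The sketch below assumes `fisher_metric` is the standard Beta Fisher information in (α, β) coordinates; `trigamma`, `beta_fisher_det`, and `det_asymptotic` are illustrative names, not notebook functions.

```python
import math


def trigamma(x):
    """ψ'(x) for x > 0: upward recurrence, then an asymptotic series."""
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return total + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))


def beta_fisher_det(alpha, beta):
    """Determinant of the Beta Fisher information matrix in (α, β) coordinates."""
    t_a, t_b, t_s = trigamma(alpha), trigamma(beta), trigamma(alpha + beta)
    # G = [[t_a - t_s, -t_s], [-t_s, t_b - t_s]]
    return (t_a - t_s) * (t_b - t_s) - t_s * t_s


def det_asymptotic(alpha, beta):
    """Large-parameter approximation of det(G)."""
    return 1.0 / (2.0 * alpha * beta * (alpha + beta))


for k in (1, 2, 5, 10, 50):
    a, b = 2.0 * k, 3.0 * k
    exact = beta_fisher_det(a, b)
    approx = det_asymptotic(a, b)
    print(f"k={k:3d}: det(G)={exact:.3e}, approximation={approx:.3e}, "
          f"ratio={exact / approx:.3f}")

# Local scaling exponent of det(G) between k = 50 and k = 100
slope = math.log(beta_fisher_det(200.0, 300.0) / beta_fisher_det(100.0, 150.0)) / math.log(2.0)
print(f"local exponent near k=50: {slope:.3f}")
```

If the ratio column tends to 1 and the local exponent sits close to -3, the determinant is following the k⁻³ law rather than a homogeneous rescaling of G.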

In [None]:
# ── 12.3 Symmetry Breaking: How the Prior's Permutation Symmetry Dissolves ──
#
# At t=0, all arms are Beta(1,1). The pairwise Fisher distance matrix is
# all zeros. As evidence arrives, arms differentiate. The total "spread"
# of the distance matrix measures how much the original symmetry has broken.
#
# We track: Frobenius norm of the pairwise distance matrix at each step.
# This is the symmetry-breaking curve.

if trajectories and len(trajectories) > 1:
    keys = list(trajectories.keys())
    n_arms = len(keys)
    
    # Find the maximum trajectory length
    max_len = max(len(pts) for pts in trajectories.values())
    
    # At each step, compute pairwise distances between all arms
    # (using whatever state each arm is at, or its last known state)
    symmetry_breaking = []
    
    for step in range(max_len):
        # Get each arm's state at this step (or last known)
        states = {}
        for key, pts in trajectories.items():
            idx = min(step, len(pts) - 1)
            states[key] = pts[idx]
        
        # Pairwise Fisher distances
        total_dist = 0.0
        pairs = 0
        for i in range(n_arms):
            for j in range(i + 1, n_arms):
                a1, b1 = states[keys[i]]
                a2, b2 = states[keys[j]]
                d = fisher_distance_approx(a1, b1, a2, b2, n_steps=20)
                total_dist += d ** 2
                pairs += 1
        
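        # Each unordered pair appears twice in the symmetric distance matrix,
        # hence the factor of 2 in the Frobenius norm.
        frobenius = np.sqrt(2.0 * total_dist)
        symmetry_breaking.append({
            'step': step,
            'frobenius': frobenius,
            # Root-mean-square pairwise distance (not the arithmetic mean)
            'mean_dist': np.sqrt(total_dist / pairs) if pairs > 0 else 0,
        })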
    
    sb_df = pd.DataFrame(symmetry_breaking)
    
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    # Left: Symmetry-breaking curve
    ax = axes[0]
    ax.plot(sb_df['step'], sb_df['frobenius'], 'o-', color='darkred', 
            linewidth=2, markersize=6)
    ax.set_xlabel("Update Step", fontsize=11)
    ax.set_ylabel("Frobenius Norm of Distance Matrix", fontsize=11)
    ax.set_title("Symmetry Breaking Over Time\n(0 = all arms identical, ↑ = differentiated)", 
                 fontsize=12)
    ax.axhline(y=0, color='gray', linestyle=':', alpha=0.5)
    ax.annotate("Perfect symmetry\n(all arms at prior)", xy=(0, 0),
                xytext=(1, sb_df['frobenius'].max() * 0.3),
                fontsize=9, color='gray',
                arrowprops=dict(arrowstyle='->', color='gray'))
    
    # Right: Rate of symmetry breaking (first differences)
    ax = axes[1]
    if len(sb_df) > 1:
        rates = sb_df['frobenius'].diff().fillna(0)
        colors_rate = ['green' if r > 0 else 'blue' for r in rates]
        ax.bar(sb_df['step'], rates, color=colors_rate, alpha=0.7)
        ax.set_xlabel("Update Step", fontsize=11)
        ax.set_ylabel("Δ(Frobenius Norm)", fontsize=11)
        ax.set_title("Rate of Symmetry Breaking\n(green = differentiating, blue = converging)", 
                     fontsize=12)
        ax.axhline(y=0, color='gray', linestyle='-', alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Summary
    print("Symmetry breaking summary:")
    print(f"  Initial spread: {sb_df['frobenius'].iloc[0]:.4f}")
    print(f"  Final spread:   {sb_df['frobenius'].iloc[-1]:.4f}")
    if len(sb_df) > 1:
        # `rates` is only defined when there are at least two steps
        print(f"  Max rate of change: {rates.abs().max():.4f} at step {rates.abs().argmax()}")
    
    # Is it gradual or sharp?
    if len(sb_df) > 2:
        mid = len(sb_df) // 2
        early_rate = sb_df['frobenius'].iloc[mid] - sb_df['frobenius'].iloc[0]
        late_rate = sb_df['frobenius'].iloc[-1] - sb_df['frobenius'].iloc[mid]
        if early_rate > 0 and late_rate / early_rate < 0.5:
            print("  Pattern: SHARP early differentiation, then plateau (first-order-like)")
        elif early_rate > 0:
            print("  Pattern: Gradual differentiation (second-order-like)")
        else:
            print("  Pattern: Insufficient data for classification")
else:
    print("Need 2+ trajectories for symmetry-breaking analysis.")
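
To see what the symmetry-breaking curve looks like on data with a known answer, here is a small synthetic sketch. It assumes three hypothetical arms with deterministic outcomes (always succeed, always fail, alternate) and uses the Fisher length of the straight segment between two belief states as a stand-in for `fisher_distance_approx`, whose actual implementation lives elsewhere in the notebook and may differ. The straight-segment length is an upper bound on the geodesic distance.

```python
import math
from itertools import combinations


def trigamma(x):
    """ψ'(x) for x > 0: upward recurrence, then an asymptotic series."""
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return total + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))


def segment_fisher_length(p, q, n_steps=20):
    """Fisher length of the straight segment from p to q (midpoint rule)."""
    (a1, b1), (a2, b2) = p, q
    da, db = (a2 - a1) / n_steps, (b2 - b1) / n_steps
    length = 0.0
    for i in range(n_steps):
        a = a1 + (i + 0.5) * da
        b = b1 + (i + 0.5) * db
        t_s = trigamma(a + b)
        g_aa, g_bb, g_ab = trigamma(a) - t_s, trigamma(b) - t_s, -t_s
        ds2 = g_aa * da * da + 2.0 * g_ab * da * db + g_bb * db * db
        length += math.sqrt(max(ds2, 0.0))  # guard against round-off
    return length


def spread(states):
    """Root-sum-of-squares of pairwise lengths over unordered pairs.

    Multiply by sqrt(2) for the Frobenius norm of the full symmetric matrix.
    """
    return math.sqrt(sum(segment_fisher_length(p, q) ** 2
                         for p, q in combinations(states, 2)))


n_updates = 10
always_win = [(1.0 + t, 1.0) for t in range(n_updates + 1)]
always_lose = [(1.0, 1.0 + t) for t in range(n_updates + 1)]
# Alternating arm: success on odd-numbered updates, failure on even-numbered ones
alternating = [(1.0 + (t + 1) // 2, 1.0 + t // 2) for t in range(n_updates + 1)]

curve = [spread([always_win[t], always_lose[t], alternating[t]])
         for t in range(n_updates + 1)]
for t, value in enumerate(curve):
    print(f"step {t:2d}: spread = {value:.4f}")
```

The curve starts at exactly zero, because all three arms sit at Beta(1, 1), and grows as the arms separate. Comparing its shape against the real trajectories gives a rough baseline for what "gradual" differentiation looks like when the outcomes are as clean as they can be.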

### 12.4 What the Symmetries Tell Us (and What We Want to Know Next)

**Verified today:**

| Symmetry | Status | Implication |
|----------|--------|-------------|
| Conjugation (α ↔ β) | Exact | Manifold is mirror-symmetric about the diagonal. Success and failure are geometrically interchangeable. |
| Scaling (kα, kβ) | det(G) ~ k⁻³ for large k (not homogeneous) | The manifold compresses as evidence grows. Diminishing returns are geometric, not a heuristic. |
| Permutation (arm identity) | Breaks with evidence | All arms start equal. Symmetry-breaking pattern (sharp vs gradual) classifies the learning regime. |

**What we want to know (requires more math):**

1. **Full symmetry group**: are there symmetries beyond conjugation, scaling, and time-translation? Each one we find = one more conserved quantity for interoception.

2. **Lie algebra structure**: the infinitesimal generators of the symmetry group form a Lie algebra. Its dimension = the number of independent conserved quantities. Its structure constants = how the symmetries interact.

3. **Phase transition classification**: the symmetry-breaking curve from 12.3 is either sharp (first-order, like ice melting) or gradual (second-order, like a magnet losing magnetization). This determines whether the system "snaps" to a conclusion or gradually drifts. The curvature at the transition point predicts which.

4. **Representation decomposition**: the symmetry group acts on the space of all possible belief states. Decomposing this action into irreducible representations tells us the "modes" of the system: independent channels of information that can be learned separately. Hidden modes = structural blind spots.

**The punchline for interoception:** every row in the "Verified today" table is a conserved quantity the system can monitor. Every row in "What we want to know" is a POTENTIAL conserved quantity we haven't found yet. The full table, once complete, IS the specification of the interoception layer's sensors.

**Roadmap connections:**
- Phase 3 (dynamical systems) gives fixed points and stability → enables 11.1, 11.2
- Phase 4 (curvature) gives K(α,β) → enables phase transition classification
- Phase 5 (Noether/Hamiltonian) gives the conserved quantities → enables full interoception spec
- Phase 7 (group theory) gives the Lie algebra → enables representation decomposition
- Aegir Arc 3 Module 3.3-3.4 → the mathematical training for all of the above
- Aegir Arc 4 → category-theoretic formalization (functors, natural transformations)