# Buildlog Learning Analysis

Analysis of buildlog's learning dynamics using the real data sources:
- **SQLite tables**: `review_learnings`, `mistakes`, `reward_events`
- **Signal log**: `~/.buildlog/emissions/signal.jsonl` (time-series backbone)
- **Seed files**: Treatment intervention dates from filenames

**Key metrics:**
- `reinforcement_count` vs `contradiction_count` per learning → rule strength over time
- `was_repeat` on mistakes → persistent blind spots
- `corrected_by_rule` → direct positive attribution
- `rules_active` on reward events → which rules were on during success/failure

In [None]:
import json
import sqlite3
from pathlib import Path
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.max_columns', None)

BUILDLOG_DIR = Path.home() / ".buildlog"
DB_PATH = BUILDLOG_DIR / "buildlog.db"
SIGNAL_LOG = BUILDLOG_DIR / "emissions" / "signal.jsonl"
SEEDS_DIR = BUILDLOG_DIR / "seeds"

## 1. Load SQLite Data

In [None]:
conn = sqlite3.connect(DB_PATH)

# Review learnings - the rule strength time series
learnings = pd.read_sql_query("""
    SELECT * FROM review_learnings
""", conn)
learnings["first_seen"] = pd.to_datetime(learnings["first_seen"])
learnings["last_reinforced"] = pd.to_datetime(learnings["last_reinforced"])

# Mistakes
mistakes = pd.read_sql_query("""
    SELECT * FROM mistakes
""", conn)
mistakes["timestamp"] = pd.to_datetime(mistakes["timestamp"])

# Reward events
rewards = pd.read_sql_query("""
    SELECT * FROM reward_events
""", conn)
rewards["timestamp"] = pd.to_datetime(rewards["timestamp"])
rewards["rules_active"] = rewards["rules_active"].apply(
    lambda x: json.loads(x) if x else []
)

conn.close()

print(f"Learnings: {len(learnings)}")
print(f"Mistakes: {len(mistakes)}")
print(f"Reward events: {len(rewards)}")

## 2. Rule Strength: Reinforcement vs Contradiction

This is the key view. Rules accumulating reinforcements are getting stronger; rules accumulating contradictions are being challenged.
The ratio of reinforcements to total evidence indicates which rules are reliable.
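
For intuition, the strength ratio in the next cell is reinforcements divided by reinforcements plus contradictions plus one. The sketch below applies the same formula to made-up counts; the function name `strength_ratio` is hypothetical and not part of buildlog.

```python
def strength_ratio(reinforcements: int, contradictions: int) -> float:
    """Same formula as the category table: r / (r + c + 1), rounded to 2 places."""
    return round(reinforcements / (reinforcements + contradictions + 1), 2)


# Made-up counts, for illustration only
print(strength_ratio(0, 0))   # no evidence yet
print(strength_ratio(1, 0))   # a single reinforcement
print(strength_ratio(9, 0))   # many reinforcements, no contradictions
print(strength_ratio(9, 10))  # contested rule
```

Because of the extra one in the denominator, the ratio never reaches 1.0, and a rule with a single reinforcement scores well below a rule with many. When reading the table, keep in mind that low-evidence categories are penalized by construction.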

In [None]:
# Summary by category
category_stats = learnings.groupby("category").agg(
    rules=("id", "count"),
    total_reinforcements=("reinforcement_count", "sum"),
    total_contradictions=("contradiction_count", "sum"),
).reset_index()
# The +1 in the denominator avoids division by zero and pulls low-evidence categories toward 0
category_stats["strength_ratio"] = (
    category_stats["total_reinforcements"]
    / (category_stats["total_reinforcements"] + category_stats["total_contradictions"] + 1)
).round(2)
category_stats = category_stats.sort_values("total_reinforcements", ascending=False)
category_stats

In [None]:
# Scatter: reinforcement vs contradiction per learning, colored by category
fig, ax = plt.subplots(figsize=(10, 6))

categories = learnings["category"].unique()
colors = plt.cm.tab10(range(len(categories)))
color_map = dict(zip(categories, colors))

for cat in categories:
    subset = learnings[learnings["category"] == cat]
    ax.scatter(
        subset["reinforcement_count"],
        subset["contradiction_count"],
        label=cat,
        color=color_map[cat],  # use the per-category color built above
        alpha=0.7,
        s=50,
    )

ax.set_xlabel("Reinforcement Count")
ax.set_ylabel("Contradiction Count")
ax.set_title("Rule Strength: Reinforcements vs Contradictions")
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# Add diagonal line (equal reinforcement/contradiction)
max_val = max(learnings["reinforcement_count"].max(), learnings["contradiction_count"].max())
ax.plot([0, max_val], [0, max_val], 'k--', alpha=0.3, label='_nolegend_')

plt.tight_layout()
plt.show()

## 3. Learning Timeline

When were rules first seen, and when were they last reinforced?
The gap between `first_seen` and `last_reinforced` shows rule longevity.

In [None]:
# Cumulative rules over time
learnings_sorted = learnings.sort_values("first_seen")
learnings_sorted["cumulative_rules"] = range(1, len(learnings_sorted) + 1)

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(learnings_sorted["first_seen"], learnings_sorted["cumulative_rules"], linewidth=2)
ax.set_xlabel("Date")
ax.set_ylabel("Cumulative Rules Learned")
ax.set_title("Rule Discovery Over Time")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d"))
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

In [None]:
# Rule lifespan: days between first_seen and last_reinforced
learnings["lifespan_days"] = (
    learnings["last_reinforced"] - learnings["first_seen"]
).dt.total_seconds() / 86400

print("Rule Lifespan Statistics (days):")
print(learnings["lifespan_days"].describe())

## 4. Mistake Analysis

Which error classes have repeats? Which have attribution to rules?

In [None]:
if len(mistakes) > 0:
    mistake_stats = mistakes.groupby("error_class").agg(
        total=("id", "count"),
        repeats=("was_repeat", "sum"),
        attributed=("corrected_by_rule", lambda x: x.notna().sum()),
    ).reset_index()
    mistake_stats["repeat_rate"] = (mistake_stats["repeats"] / mistake_stats["total"] * 100).round(1)
    mistake_stats["attribution_rate"] = (mistake_stats["attributed"] / mistake_stats["total"] * 100).round(1)
    print(mistake_stats.to_string(index=False))
else:
    print("No mistakes in SQLite yet. Check emission files for pending data.")

## 5. Reward Events: Rule Activation

Which rules were active during successful vs failed outcomes?
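
The cells below rely on `DataFrame.explode` to turn each list of active rules into one row per rule. A minimal sketch on toy data shows the shape of the result; the rule IDs and outcome labels here are invented for illustration and do not come from the buildlog schema.

```python
import pandas as pd

# Toy reward events; rule IDs and outcome labels are made up.
toy = pd.DataFrame({
    "outcome": ["accepted", "rejected", "accepted"],
    "rules_active": [["r1", "r2"], ["r1"], []],
})

# explode() yields one row per (event, rule); an empty list becomes NaN.
exploded = toy.explode("rules_active")
exploded = exploded[exploded["rules_active"].notna()]

# Count events per rule and outcome.
counts = exploded.groupby(["rules_active", "outcome"]).size().unstack(fill_value=0)
print(counts)
```

These counts are co-occurrences: they show which rules were active when an outcome was recorded, not that a rule caused the outcome.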

In [None]:
# Outcome distribution
print("Reward Outcomes:")
print(rewards["outcome"].value_counts())
print()

# Average reward by outcome
print("Average Reward Value by Outcome:")
print(rewards.groupby("outcome")["reward_value"].mean().round(2))

In [None]:
# Explode rules_active to see which rules are most often active
if rewards["rules_active"].apply(len).sum() > 0:
    rules_exploded = rewards.explode("rules_active")
    rules_exploded = rules_exploded[rules_exploded["rules_active"].notna()]
    
    rule_outcomes = rules_exploded.groupby(["rules_active", "outcome"]).size().unstack(fill_value=0)
    print("Rule Activation by Outcome:")
    print(rule_outcomes)
else:
    print("No rules_active data yet. Rules haven't been activated in tracked sessions.")

## 6. Signal Log: Time-Series Backbone

The `signal.jsonl` file is the append-only event log. Every emission appends one line.
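
An append-only log can end in a partially written line if a writer is interrupted, and a single bad line would make a strict reader fail. The sketch below is one tolerant way to read such a file, assuming one JSON object per line; the helper name `read_jsonl` is hypothetical.

```python
import json
from pathlib import Path


def read_jsonl(path: Path) -> tuple[list[dict], int]:
    """Return (records, skipped): parsed objects plus the count of unparseable lines."""
    records, skipped = [], 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                skipped += 1  # count, rather than crash on, malformed lines
    return records, skipped
```

Usage would look like `records, skipped = read_jsonl(SIGNAL_LOG)` followed by `pd.DataFrame(records)`. Reporting `skipped` makes silent data loss visible.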

In [None]:
signals = []
if SIGNAL_LOG.exists():
    with open(SIGNAL_LOG, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                signals.append(json.loads(line))

signals_df = pd.DataFrame(signals)
if len(signals_df) > 0:
    signals_df["ts"] = pd.to_datetime(signals_df["ts"])
    print(f"Signal events: {len(signals_df)}")
    print(f"Date range: {signals_df['ts'].min()} to {signals_df['ts'].max()}")
    print()
    print("Event types:")
    print(signals_df["type"].value_counts())

In [None]:
# Events over time
if len(signals_df) > 0:
    signals_df["hour"] = signals_df["ts"].dt.floor("h")
    hourly = signals_df.groupby(["hour", "type"]).size().unstack(fill_value=0)
    
    fig, ax = plt.subplots(figsize=(12, 5))
    hourly.plot(kind="bar", stacked=True, ax=ax, width=0.8)
    ax.set_xlabel("Hour")
    ax.set_ylabel("Events")
    ax.set_title("Emission Events by Hour")
    ax.legend(title="Type")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

## 7. Intervention Points: Seed Ingestion Dates

Seed filenames carry ingestion timestamps. These serve as the treatment dates for before/after analysis.
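
The filename convention can be checked in isolation. The sketch below assumes the `persona_name_2026-02-06T23-55-30` stem format noted in the next cell; the helper name `parse_seed_stem` and the second example stem are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_seed_stem(stem: str) -> tuple[str, Optional[datetime]]:
    """Split '<persona>_<timestamp>' on the last underscore; timestamp is None if absent."""
    persona, sep, ts_str = stem.rpartition("_")
    if sep:
        try:
            return persona, datetime.strptime(ts_str, "%Y-%m-%dT%H-%M-%S")
        except ValueError:
            pass  # last segment is not a timestamp
    return stem, None


print(parse_seed_stem("persona_name_2026-02-06T23-55-30"))
print(parse_seed_stem("factory_patterns"))  # no timestamp suffix
```

Splitting on the last underscore matters because persona names can themselves contain underscores, while the timestamp uses only hyphens and `T`.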

In [None]:
# Parse seed filenames for intervention dates
interventions = []
if SEEDS_DIR.exists():
    for f in SEEDS_DIR.glob("*.yaml"):
        # Format: persona_name_2026-02-06T23-55-30.yaml
        name = f.stem
        parts = name.rsplit("_", 1)
        if len(parts) == 2:
            persona, ts_str = parts
            try:
                ts = datetime.strptime(ts_str, "%Y-%m-%dT%H-%M-%S")
                interventions.append({"persona": persona, "ingested_at": ts, "file": f.name})
            except ValueError:
                # Not a timestamped file
                interventions.append({"persona": name, "ingested_at": None, "file": f.name})
        else:
            interventions.append({"persona": name, "ingested_at": None, "file": f.name})

interventions_df = pd.DataFrame(interventions)
if len(interventions_df) > 0:
    print("Seed Files (Intervention Points):")
    print(interventions_df.to_string(index=False))
else:
    print("No seed files found. Ingest qortex rules to create intervention points.")

## 8. Summary: Current State

In [None]:
summary = {
    "learnings": {
        "total": len(learnings),
        "categories": learnings["category"].nunique(),
        "total_reinforcements": int(learnings["reinforcement_count"].sum()),
        "total_contradictions": int(learnings["contradiction_count"].sum()),
    },
    "mistakes": {
        "total": len(mistakes),
        "with_attribution": int(mistakes["corrected_by_rule"].notna().sum()) if len(mistakes) > 0 else 0,
    },
    "rewards": {
        "total": len(rewards),
        "with_rules_active": int(rewards["rules_active"].apply(len).gt(0).sum()),
    },
    "signals": len(signals_df),
    "interventions": len(interventions_df),
}

print(json.dumps(summary, indent=2))

## 9. Next Steps

**What we can measure now:**
- Rule strength trends (reinforcement_count over time per learning)
- Category-level evidence accumulation
- Emission volume over time

**What we need for causal attribution:**
- `corrected_by_rule` populated on mistakes (direct attribution)
- `rules_active` populated on reward events (which rules were on)
- More seed ingestion events (intervention points)
- Longer time series (days/weeks, not hours)

**The experiment:**
1. Ingest qortex rules for iterator_visitor_patterns, factory_patterns
2. Continue normal work, accumulating mistakes and rewards
3. Re-run this notebook after a week
4. Compare: repeat rates before vs after intervention
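
The comparison in step 4 could be sketched as follows, assuming the `timestamp` and `was_repeat` columns from the `mistakes` table and a single intervention time taken from a seed filename. The function name `repeat_rate_around` and the toy rows are made up for illustration.

```python
import pandas as pd


def repeat_rate_around(mistakes: pd.DataFrame, intervention: pd.Timestamp) -> pd.Series:
    """Percent of mistakes flagged was_repeat, before vs. after the intervention."""
    period = mistakes["timestamp"].ge(intervention).map({False: "before", True: "after"})
    rate = mistakes.groupby(period)["was_repeat"].mean() * 100
    # reindex keeps both labels even if one period has no mistakes (NaN)
    return rate.reindex(["before", "after"]).round(1)


# Toy data: three mistakes before and three after a hypothetical ingestion time
toy = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2026-02-01", "2026-02-03", "2026-02-05",
        "2026-02-08", "2026-02-10", "2026-02-12",
    ]),
    "was_repeat": [1, 1, 0, 0, 1, 0],
})
rates = repeat_rate_around(toy, pd.Timestamp("2026-02-06T23:55:30"))
print(rates)
```

This is a simple before/after split, not a randomized comparison, and with only a handful of mistakes on either side the difference is mostly noise.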