# Homework 02: Open & Axial Coding Walkthrough

Grounded-theory error analysis for the email summariser workshop. Use this notebook to link qualitative insights from the annotation tool back into the Analyze → Measure → Improve loop.

### 0. Prerequisites
- Run Notebook 00 or 01a to prepare the filtered or synthetic email CSV, respectively.
- Use `tools/generate_email_traces.py` to create traces for the current Git commit (short SHA stored as the `run_id`). The optional cell below lets you run it in place during a workshop demo.
- Launch `tools/email_annotation_app.py` to collect open coding notes and failure modes in the browser, then return here to analyse the DuckDB tables.

In [1]:
from pathlib import Path
import pandas as pd
import duckdb
from IPython.display import display, Markdown
import ipywidgets as widgets

DATA_DIR = Path("../data")
TRACE_ROOT = Path("../annotation/traces")
DUCKDB_PATH = DATA_DIR / "email_annotations.duckdb"

### 1. Choose data source
Pick between the filtered production slice and the synthetic seed set from Notebook 01a.

In [2]:
DATA_OPTIONS = {
    "filtered": DATA_DIR / "filtered_emails_sample.csv",
    "synthetic": DATA_DIR / "synthetic_emails.csv",
}

source_toggle = widgets.ToggleButtons(
    options=[("Filtered Emails", "filtered"), ("Synthetic Seed Set", "synthetic")],
    # Default to the filtered slice only if the file listed in DATA_OPTIONS exists.
    value="filtered" if DATA_OPTIONS["filtered"].exists() else "synthetic",
    description="Dataset:",
    style={"description_width": "initial"},
)

display(source_toggle)


In [3]:
SELECTED_KEY = source_toggle.value
DATA_SOURCE_PATH = DATA_OPTIONS[SELECTED_KEY]
if not DATA_SOURCE_PATH.exists():
    raise FileNotFoundError(
        f"Expected dataset at {DATA_SOURCE_PATH}. Run Notebook 00 or 01a first."
    )

Markdown(f"**Using:** `{DATA_SOURCE_PATH}`")

**Using:** `../data/filtered_emails_sample.csv`

### 2. (Optional) Regenerate traces for this run
Set the flag to `True` to call `tools/generate_email_traces.py` using the selected dataset. The script writes trace JSON under `annotation/traces/<run_id>` and upserts rows into DuckDB.

In [5]:
RUN_TRACE_GENERATOR = True  # True regenerates traces (live demos); set to False to skip

if RUN_TRACE_GENERATOR:
    import subprocess
    import sys

    cmd = [
        sys.executable,
        str(Path("../tools/generate_email_traces.py")),
        "--emails",
        str(DATA_SOURCE_PATH.resolve()),
        "--out",
        str(TRACE_ROOT.resolve()),
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)

Running: /Users/luvsuneja/Documents/repos/evals-workshop/.venv/bin/python ../tools/generate_email_traces.py --emails /Users/luvsuneja/Documents/repos/evals-workshop/data/filtered_emails_sample.csv --out /Users/luvsuneja/Documents/repos/evals-workshop/annotation/traces


Working tree has uncommitted changes. Commit or stash before generating traces.


CalledProcessError: Command '['/Users/luvsuneja/Documents/repos/evals-workshop/.venv/bin/python', '../tools/generate_email_traces.py', '--emails', '/Users/luvsuneja/Documents/repos/evals-workshop/data/filtered_emails_sample.csv', '--out', '/Users/luvsuneja/Documents/repos/evals-workshop/annotation/traces']' returned non-zero exit status 1.
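
The run above stopped because the generator reported uncommitted changes in the working tree. A pre-flight check along the following lines can surface that before the subprocess is launched. This is a minimal sketch, not the logic inside `tools/generate_email_traces.py`: the helper names `is_clean`, `working_tree_is_clean`, and `git_short_sha` are hypothetical, and it assumes `git` is on the `PATH` and that the notebook sits one level below the repo root.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` printed nothing.

    Note: untracked files also show up in porcelain output, so this check
    may be stricter than whatever the generator script actually enforces.
    """
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = "..") -> bool:
    """Ask Git whether the working tree has uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return is_clean(result.stdout)


def git_short_sha(repo_dir: str = "..") -> str:
    """Return the abbreviated commit hash of HEAD."""
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `working_tree_is_clean()` before the trace-generation cell gives a quick yes/no answer, and `git_short_sha()` shows the value you would expect to see as the trace directory name.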

### 3. Verify available trace runs
Each trace directory name equals the short Git SHA captured when the generator script ran.

In [None]:
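available_runs = []
if TRACE_ROOT.exists():
    # Order by modification time so the last entry is the newest run.
    # Sorting by name would order the SHAs alphabetically, not chronologically.
    run_dirs = sorted(
        (p for p in TRACE_ROOT.iterdir() if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
    )
    available_runs = [p.name for p in run_dirs]

if not available_runs:
    raise RuntimeError(
        f"No trace runs detected under {TRACE_ROOT}. Generate traces before proceeding."
    )

ACTIVE_RUN_ID = available_runs[-1]
Markdown(f"**Active run:** `{ACTIVE_RUN_ID}` (most recently modified trace directory)")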


### Annotation Tool (Browser UI)
- **Launch:** `python tools/email_annotation_app.py` from the repo root.
- **Navigate:** open `http://localhost:5000`.
- **Controls:** `A` adds an annotation (Enter saves, Esc cancels), `Z` marks **pass**, `X` marks **fail**, `F` links failure modes, `←/→` navigation.
- **Auto-save:** every change is written directly to DuckDB.
- Capture notes/failure modes before returning to this notebook to analyze results.


### 4. Connect to DuckDB and inspect trace runs

In [None]:
conn = duckdb.connect(str(DUCKDB_PATH))

runs_df = conn.execute(
    "SELECT run_id, prompt_path, source_csv, generated_at FROM trace_runs ORDER BY generated_at DESC"
).df()

display(runs_df)

### 5. Preview the chosen dataset

In [None]:
emails_df = pd.read_csv(DATA_SOURCE_PATH)
# Only the last expression in a cell renders on its own, so display this explicitly.
display(Markdown(f"Loaded **{len(emails_df):,}** emails from `{DATA_SOURCE_PATH.name}`"))
emails_df.head()

### 6. Emails registered for the active run

In [None]:
emails_raw_df = conn.execute(
    """
    SELECT email_hash, subject, metadata, run_id, ingested_at
    FROM emails_raw
    WHERE run_id = ?
    ORDER BY email_hash
    """,
    (ACTIVE_RUN_ID,),
).df()

if emails_raw_df.empty:
    raise RuntimeError(
        f"No emails found in DuckDB for run_id={ACTIVE_RUN_ID}. Re-run the trace generator."
    )

emails_raw_df.head()

### 7. Open coding inventory
Pull annotations captured through the web tool (auto-saved to DuckDB).

In [None]:
annotations_df = conn.execute(
    """
    SELECT email_hash, annotation_id, labeler_id, open_code, pass_fail, created_at
    FROM annotations
    WHERE run_id = ?
    ORDER BY created_at
    """,
    (ACTIVE_RUN_ID,),
).df()

# Only the last expression in a cell renders on its own, so display this explicitly.
display(Markdown(f"Collected **{len(annotations_df):,}** annotations for run `{ACTIVE_RUN_ID}`"))
annotations_df.head()

### 8. Pass/Fail mix and annotation velocity

In [None]:
if annotations_df.empty:
    display(
        Markdown(
            "⚠️ No annotations logged yet—open the web tool and capture a few observations."
        )
    )
else:
    # Map booleans to labels; NULL pass_fail values fall through as NaN and become "Unknown".
    annotations_df["status"] = (
        annotations_df["pass_fail"].map({True: "Pass", False: "Fail"}).fillna("Unknown")
    )
    summary = annotations_df.groupby("status").size().reset_index(name="count")
    display(summary)
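
The summary above covers the pass/fail mix only. For annotation velocity, one option is to bucket `created_at` into fixed windows per labeler and count the rows in each. The sketch below uses a small hand-made frame in place of `annotations_df`; the labeler IDs and timestamps are invented for illustration, and the hourly window is an arbitrary choice.

```python
import pandas as pd

# Toy stand-in for annotations_df; values are invented.
demo_annotations = pd.DataFrame(
    {
        "labeler_id": ["alice", "alice", "bob", "alice"],
        "created_at": pd.to_datetime(
            [
                "2024-01-01 10:05",
                "2024-01-01 10:40",
                "2024-01-01 10:50",
                "2024-01-01 11:15",
            ]
        ),
    }
)

# Floor each timestamp to the start of its hour, then count per labeler and hour.
velocity = (
    demo_annotations
    .assign(hour=demo_annotations["created_at"].dt.floor("60min"))
    .groupby(["labeler_id", "hour"])
    .size()
    .reset_index(name="annotations")
)
```

Swapping `demo_annotations` for `annotations_df` should work as long as `created_at` is a datetime column; if it arrives as text, convert it with `pd.to_datetime` first.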

### 9. Join annotations with email metadata
Use JSON metadata stored in `emails_raw` to segment open codes by designation, tone, intent, etc.

In [None]:
display(
    Markdown(
        "Optional metadata pivots (Intent/Designation/Tone) are only available when those fields exist—skipping for this dataset."
    )
)
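
When the metadata fields are present, the join described above could look roughly like this. The sketch assumes the `metadata` column holds one JSON object per email, stored as a string, with optional keys such as `Intent` and `Tone`. That layout is an assumption, so check a few rows of `emails_raw_df` first. The toy frames and the `parse_metadata` helper are hypothetical stand-ins.

```python
import json

import pandas as pd

# Toy stand-ins for emails_raw_df and annotations_df; values are invented.
demo_emails = pd.DataFrame(
    {
        "email_hash": ["h1", "h2", "h3"],
        "metadata": [
            '{"Intent": "request", "Tone": "formal"}',
            '{"Intent": "complaint"}',
            None,  # metadata may be missing entirely
        ],
    }
)
demo_annotations = pd.DataFrame(
    {
        "annotation_id": [1, 2, 3],
        "email_hash": ["h1", "h2", "h3"],
        "open_code": ["missed deadline", "wrong recipient", "too long"],
    }
)


def parse_metadata(raw):
    """Decode one metadata cell; return {} for missing or malformed JSON."""
    if not isinstance(raw, str):
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


# Expand the JSON keys into columns, one row per email.
metadata_wide = pd.DataFrame([parse_metadata(raw) for raw in demo_emails["metadata"]])
metadata_wide.insert(0, "email_hash", demo_emails["email_hash"].to_numpy())

# A left join keeps every annotation even when its email has no metadata.
annotated = demo_annotations.merge(metadata_wide, on="email_hash", how="left")
```

Checking `"Intent" in annotated.columns` before pivoting keeps the later cells from failing on datasets that lack the field.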

### 10. Axial coding snapshot

In [None]:
failure_modes_df = conn.execute(
    """
    SELECT fm.display_name, count(*) AS occurrences
    FROM axial_links al
    JOIN failure_modes fm ON al.failure_mode_id = fm.failure_mode_id
    WHERE al.run_id = ?
    GROUP BY 1
    ORDER BY occurrences DESC
    """,
    (ACTIVE_RUN_ID,),
).df()

if failure_modes_df.empty:
    display(
        Markdown(
            "⚠️ No failure modes linked yet—use the web tool (F) to start axial coding."
        )
    )
else:
    display(failure_modes_df)

### 11. Failure-mode × Intent co-occurrence

In [None]:
if not failure_modes_df.empty and not annotations_df.empty:
    axial_df = conn.execute(
        """
        SELECT al.annotation_id, al.failure_mode_id, fm.display_name, a.email_hash
        FROM axial_links al
        JOIN failure_modes fm ON al.failure_mode_id = fm.failure_mode_id
        JOIN annotations a ON al.annotation_id = a.annotation_id
        WHERE al.run_id = ?
        """,
        (ACTIVE_RUN_ID,),
    ).df()

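    if axial_df.empty:
        display(Markdown("No axial links found for this run."))
    elif "Intent" not in annotations_df.columns:
        # Intent comes from the email metadata and is absent unless it was
        # joined onto annotations_df; selecting it blindly raises a KeyError.
        display(
            Markdown(
                "Skipping co-occurrence matrix: `annotations_df` has no `Intent` column."
            )
        )
    else:
        merged = axial_df.merge(
            annotations_df[["annotation_id", "Intent"]], on="annotation_id", how="left"
        )
        co_matrix = pd.crosstab(merged["display_name"], merged["Intent"])
        display(co_matrix)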

### 12. Export for sharing
Write flattened tables so facilitators/stakeholders can review outside the notebook.

In [None]:
EXPORT = False  # toggle to True to write CSV snapshots

if EXPORT:
    export_annotations = DATA_DIR / f"email_annotations_{ACTIVE_RUN_ID}.csv"
    export_failure_modes = DATA_DIR / f"failure_modes_{ACTIVE_RUN_ID}.csv"
    annotations_df.to_csv(export_annotations, index=False)
    failure_modes_df.to_csv(export_failure_modes, index=False)
    # A bare Markdown(...) inside an if-block is never rendered; display it explicitly.
    display(
        Markdown(
            f"Exported annotations → `{export_annotations.name}` and failure modes → `{export_failure_modes.name}`"
        )
    )

---
**Next steps:**
1. Ensure each major failure mode has ≥20 failing examples before moving to Notebook 03.
2. Re-run the prompt (Notebook 01) + trace generator after prompt fixes so you can compare runs (`run_id` == short Git SHA).
3. Log qualitative takeaways and TODOs in the facilitation tracker / plan.
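
The threshold in step 1 can be checked mechanically. The sketch below uses a toy frame with the same columns as `failure_modes_df`; the failure-mode names and counts are invented. Note that `occurrences` counts axial links, which may differ from the number of distinct failing emails if one email carries several annotations.

```python
import pandas as pd

MIN_EXAMPLES = 20  # threshold from step 1

# Toy frame shaped like failure_modes_df; names and counts are invented.
demo_failure_modes = pd.DataFrame(
    {
        "display_name": ["Missing action item", "Wrong tone", "Hallucinated date"],
        "occurrences": [27, 20, 8],
    }
)

# shortfall = how many more examples each failure mode needs to reach the threshold.
coverage = demo_failure_modes.assign(
    shortfall=(MIN_EXAMPLES - demo_failure_modes["occurrences"]).clip(lower=0)
)
under_covered = coverage[coverage["shortfall"] > 0]
```

Any rows left in `under_covered` point to failure modes that need more annotated examples before moving on.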