# Week 2 — Part 03: Compare Runs and Report Lab

**Estimated time:** 60–90 minutes

---

## Pre-study (Level 0)

Level 1 assumes you have completed Level 0. For a refresher on evaluation metrics (accuracy, precision, recall, F1), see:

- [Level 1 Pre-study index](../PRESTUDY.md)
- [Level 0 — Evaluation metrics](../../level_0/Chapters/4/02_core_concepts.md)

---

## What success looks like (end of Part 03)

- You can load or create a list of runs with consistent fields.
- You can compute a summary (best run + average metric).
- You can write two artifacts under `output/compare_runs/`:
  - `runs.json`
  - `report.md`

### Checkpoint

After running this notebook, you should be able to open:

- `output/compare_runs/runs.json`
- `output/compare_runs/report.md`
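If you prefer to check the checkpoint programmatically rather than by opening the files, something like the following could work. It is only a sketch: `missing_artifacts` and `EXPECTED_ARTIFACTS` are hypothetical names introduced here, and treating an empty file as missing is an assumption, not a requirement of the lab.

```python
from pathlib import Path

# Artifact names taken from the checkpoint list above.
EXPECTED_ARTIFACTS = ("runs.json", "report.md")


def missing_artifacts(out_dir: Path) -> list[str]:
    """Return the expected artifact names that are absent or empty in out_dir."""
    missing = []
    for name in EXPECTED_ARTIFACTS:
        path = out_dir / name
        # Assumption: an empty file counts as "not written".
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


print("missing:", missing_artifacts(Path("output/compare_runs")))
```

An empty list means both files were written. Before you run the notebook cells, both names are expected to be reported as missing.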

## Learning Objectives

- Compare experiment runs using consistent metrics
- Summarize best/worst runs with clear reasoning
- Build a simple report artifact (JSON/Markdown)
- Practice root-cause analysis across runs

## Overview

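Runs are only comparable when every run records the same fields (here `run_id`, `model`, `accuracy`, `f1`, and `notes`) and every artifact is written to a predictable location.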

In this lab you will:

- load or create a small list of runs
- select the best run using a clear rule
- compute a summary
- write report artifacts under `output/compare_runs/`

If you need a refresher on evaluation metrics, see the Level 0 links in the Pre-study section at the top of this notebook.
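One way to make "consistent fields" concrete is to check each run's keys against a required set before comparing anything. The sketch below assumes the five field names used by the run dictionaries in this lab are all required; `REQUIRED_FIELDS` and `find_inconsistent_runs` are hypothetical names, not part of the lab's TODO functions.

```python
# Field names mirror the run dictionaries used in this lab; treating all of
# them as required is an assumption made for this sketch.
REQUIRED_FIELDS = {"run_id", "model", "accuracy", "f1", "notes"}


def find_inconsistent_runs(runs):
    """Return (index, missing_fields, extra_fields) for each run whose keys differ."""
    problems = []
    for i, run in enumerate(runs):
        keys = set(run)
        missing = sorted(REQUIRED_FIELDS - keys)
        extra = sorted(keys - REQUIRED_FIELDS)
        if missing or extra:
            problems.append((i, missing, extra))
    return problems


example_runs = [
    {"run_id": "run_001", "model": "logreg", "accuracy": 0.84, "f1": 0.82, "notes": "baseline"},
    # "acc" instead of "accuracy": a typical inconsistency between runs.
    {"run_id": "run_002", "model": "logreg", "acc": 0.87, "f1": 0.86, "notes": "renamed metric"},
]
print(find_inconsistent_runs(example_runs))  # → [(1, ['accuracy'], ['acc'])]
```

Catching a renamed or missing metric here is cheaper than debugging a `KeyError` later in the summary step.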

In [None]:
import json
from pathlib import Path

runs = [
    {"run_id": "run_001", "model": "logreg", "accuracy": 0.84, "f1": 0.82, "notes": "baseline"},
    {"run_id": "run_002", "model": "logreg", "accuracy": 0.87, "f1": 0.86, "notes": "more iterations"},
    {"run_id": "run_003", "model": "rf", "accuracy": 0.89, "f1": 0.88, "notes": "higher depth"},
]

out_dir = Path("output/compare_runs")
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "runs.json").write_text(json.dumps(runs, indent=2), encoding="utf-8")
print("wrote", out_dir / "runs.json")
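The cell above creates the runs in memory and saves them. For the "load" half of "load or create", a minimal sketch might read the same file back. `load_runs` is a hypothetical helper, and the check that the file holds a JSON list is an assumption about how you want malformed input handled.

```python
import json
from pathlib import Path


def load_runs(path: Path) -> list[dict]:
    """Read a JSON file that contains a list of run dictionaries."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON list of runs in {path}")
    return data


runs_path = Path("output/compare_runs/runs.json")
if runs_path.exists():
    loaded = load_runs(runs_path)
    print("loaded", len(loaded), "runs")
else:
    print("no runs.json yet; run the cell above first")
```

Loading from disk lets later cells work from the saved artifact instead of relying on the in-memory `runs` variable still being defined.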

In [None]:
def select_best_run_todo(runs):
    """TODO: return the best run.

    Criteria (suggested):

    - highest accuracy
    - tie-break: highest f1
    """
    return runs[0]


def summarize_runs_todo(runs):
    """TODO: return a small summary dict used for reporting."""
    best = select_best_run_todo(runs)
    avg_acc = sum(r["accuracy"] for r in runs) / len(runs)
    return {"best": best, "avg_accuracy": round(avg_acc, 3), "n": len(runs)}


summary = summarize_runs_todo(runs)
print(summary)
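Before peeking at the solutions, you may want a quick way to test your own `select_best_run_todo`. The checker below is a sketch under the suggested criteria (highest accuracy, then highest f1); `check_select_best` and its two hand-made cases are hypothetical and not part of the lab.

```python
def check_select_best(select_fn) -> bool:
    """Return True if select_fn picks the expected run on two small hand-made cases."""
    # Case 1: the clear winner on accuracy is not the first element,
    # and it has the lower f1, so sorting by f1 alone would fail.
    clear = [
        {"run_id": "a", "accuracy": 0.80, "f1": 0.90},
        {"run_id": "b", "accuracy": 0.85, "f1": 0.70},
    ]
    # Case 2: accuracy is tied, so f1 should decide.
    tied = [
        {"run_id": "c", "accuracy": 0.85, "f1": 0.70},
        {"run_id": "d", "accuracy": 0.85, "f1": 0.75},
    ]
    return select_fn(clear)["run_id"] == "b" and select_fn(tied)["run_id"] == "d"


# The placeholder behaviour (always return the first run) should not pass.
print(check_select_best(lambda runs: runs[0]))  # → False
```

Once your implementation is in place, `check_select_best(select_best_run_todo)` should print `True`.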

In [None]:
def write_report_todo(path: Path, summary: dict) -> None:
    """TODO: write a markdown report.

    Suggested sections:

    - Total runs
    - Average accuracy
    - Best run (with run_id, model, metrics, notes)
    """
    lines = ["# Run Comparison Report", "", f"Total runs: {summary['n']}"]
    path.write_text("\n".join(lines), encoding="utf-8")


write_report_todo(out_dir / "report.md", summary)
print("wrote", out_dir / "report.md")
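As an optional extension beyond the suggested sections, a report can also list every run side by side. The helper below is a sketch: `runs_to_markdown_table` is a hypothetical name, the default column order is an assumption, and it does not escape `|` characters that might appear in `notes`.

```python
def runs_to_markdown_table(runs, columns=("run_id", "model", "accuracy", "f1", "notes")):
    """Render a list of run dicts as a Markdown table, one row per run, in input order."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    rows = [
        # Missing fields render as empty cells rather than raising KeyError.
        "| " + " | ".join(str(run.get(col, "")) for col in columns) + " |"
        for run in runs
    ]
    return "\n".join([header, divider, *rows])


example_runs = [
    {"run_id": "run_001", "model": "logreg", "accuracy": 0.84, "f1": 0.82, "notes": "baseline"},
    {"run_id": "run_003", "model": "rf", "accuracy": 0.89, "f1": 0.88, "notes": "higher depth"},
]
print(runs_to_markdown_table(example_runs))
```

The returned string can be appended to the `lines` list in `write_report_todo` if you want the table in `report.md`.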

## Appendix: Solutions (peek only after trying)

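Reference implementations for the TODO functions in this notebook. Running the cell below redefines the three `*_todo` functions, so it replaces your own versions for the rest of the session. The solution report is written to `report_solution.md`, which leaves your `report.md` untouched.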

In [None]:
def select_best_run_todo(runs):
    # Tuples compare element by element: accuracy decides first, f1 breaks ties.
    return max(runs, key=lambda r: (r["accuracy"], r["f1"]))


def summarize_runs_todo(runs):
    best = select_best_run_todo(runs)
    avg_acc = sum(r["accuracy"] for r in runs) / len(runs)
    avg_f1 = sum(r["f1"] for r in runs) / len(runs)
    return {
        "best": best,
        "avg_accuracy": round(avg_acc, 3),
        "avg_f1": round(avg_f1, 3),
        "n": len(runs),
    }


def write_report_todo(path: Path, summary: dict) -> None:
    lines = ["# Run Comparison Report", ""]
    lines.append(f"Total runs: {summary['n']}")
    lines.append(f"Average accuracy: {summary['avg_accuracy']}")
    if "avg_f1" in summary:
        lines.append(f"Average f1: {summary['avg_f1']}")
    lines.append("")

    best = summary["best"]
    lines.append("## Best run")
    lines.append(f"- run_id: {best['run_id']}")
    lines.append(f"- model: {best['model']}")
    lines.append(f"- accuracy: {best['accuracy']}")
    lines.append(f"- f1: {best['f1']}")
    lines.append(f"- notes: {best['notes']}")

    # End the file with a newline so it displays cleanly in terminals and diffs.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


summary_solution = summarize_runs_todo(runs)
write_report_todo(out_dir / "report_solution.md", summary_solution)
print("wrote", out_dir / "report_solution.md")
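The learning objectives also mention worst runs and root-cause analysis, which the reference solution does not cover. A possible starting point, sketched below, mirrors the best-run rule for the worst run and reports each run's metric change relative to a baseline. `select_worst_run` and `deltas_vs_baseline` are hypothetical helpers, and using `run_001` as the baseline is an assumption based on its `notes` field.

```python
def select_worst_run(runs):
    """Mirror of the best-run rule: lowest accuracy, tie-break on lowest f1."""
    return min(runs, key=lambda r: (r["accuracy"], r["f1"]))


def deltas_vs_baseline(runs, baseline_id):
    """Return accuracy and f1 differences of every other run relative to baseline_id."""
    baseline = next((r for r in runs if r["run_id"] == baseline_id), None)
    if baseline is None:
        raise ValueError(f"no run with run_id={baseline_id!r}")
    return [
        {
            "run_id": r["run_id"],
            "notes": r["notes"],
            # Rounded to 3 places to hide floating-point noise.
            "d_accuracy": round(r["accuracy"] - baseline["accuracy"], 3),
            "d_f1": round(r["f1"] - baseline["f1"], 3),
        }
        for r in runs
        if r["run_id"] != baseline_id
    ]


example_runs = [
    {"run_id": "run_001", "model": "logreg", "accuracy": 0.84, "f1": 0.82, "notes": "baseline"},
    {"run_id": "run_002", "model": "logreg", "accuracy": 0.87, "f1": 0.86, "notes": "more iterations"},
    {"run_id": "run_003", "model": "rf", "accuracy": 0.89, "f1": 0.88, "notes": "higher depth"},
]
print("worst:", select_worst_run(example_runs)["run_id"])
for row in deltas_vs_baseline(example_runs, "run_001"):
    print(row)
```

Reading the deltas next to the `notes` field is a simple way to tie a metric change to the configuration change that likely caused it. With only three runs that differ in several ways, treat any such attribution as a hypothesis rather than a conclusion.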