# RePath Model Retraining Workflow

This notebook walks through benchmark preparation, annotation seeding, candidate training and export, and candidate evaluation.

## 1. Paths and Environment
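By default, this notebook resolves each root directory in this order: environment variable override, then the local `repath-model` path, then the fallback `repath-mobile` path. An environment override is used as given, even if the path does not exist. If neither candidate path exists, the local `repath-model` path is returned anyway.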
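
The lookup order can be checked in isolation. The sketch below repeats the notebook's `resolve_root` logic and runs it against temporary directories; `DEMO_ROOT_KEY` and the directory names are hypothetical and exist only for this demo.

```python
import os
import tempfile
from pathlib import Path


def resolve_root(env_key, candidates):
    """Env override first, then the first existing candidate, else candidates[0]."""
    value = os.environ.get(env_key, "").strip()
    if value:
        return Path(value).expanduser().resolve()
    candidates = [Path(c).resolve() for c in candidates]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return candidates[0]


# Hypothetical variable name, used only for this demo.
ENV_KEY = "DEMO_ROOT_KEY"

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    local = base / "repath-model" / "assets"      # intentionally not created
    fallback = base / "repath-mobile" / "assets"  # exists, so it wins over `local`
    fallback.mkdir(parents=True)
    override = base / "override"
    override.mkdir()

    # No env override: the first existing candidate is chosen.
    os.environ.pop(ENV_KEY, None)
    picked_fallback = resolve_root(ENV_KEY, [local, fallback])

    # With an env override: the candidates are ignored.
    os.environ[ENV_KEY] = str(override)
    picked_override = resolve_root(ENV_KEY, [local, fallback])
    os.environ.pop(ENV_KEY, None)

    fallback_name = picked_fallback.parent.name
    override_name = picked_override.name

print(fallback_name, override_name)
```

Comparing directory names rather than full paths keeps the demo stable on systems where the temporary directory is itself a symlink.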


In [None]:
import os
import subprocess
from pathlib import Path


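def resolve_repo_root(start: Path) -> Path:
    """Walk up from `start` to the first directory with pyproject.toml and scripts/."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / "pyproject.toml").exists() and (candidate / "scripts").is_dir():
            return candidate
    # No marker found: fall back to the starting directory.
    return start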


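def resolve_root(env_key: str, candidates) -> Path:
    """Return the env override if set, else the first existing candidate, else candidates[0]."""
    value = os.environ.get(env_key, "").strip()
    if value:
        # The override is trusted as given; it is not checked for existence.
        return Path(value).expanduser().resolve()
    for candidate in candidates:
        candidate = candidate.resolve()
        if candidate.exists():
            return candidate
    return candidates[0].resolve()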


ROOT = resolve_repo_root(Path.cwd())
os.chdir(ROOT)
MODEL_ASSETS_ROOT = resolve_root(
    "REPATH_MODEL_ASSETS_ROOT",
    [ROOT / "assets/models", ROOT / "../repath-mobile/assets/models"],
)
BENCHMARK_ROOT = resolve_root(
    "REPATH_BENCHMARK_ROOT",
    [ROOT / "test/benchmarks", ROOT / "../repath-mobile/test/benchmarks"],
)
TRAINING_ARTIFACTS_ROOT = resolve_root(
    "REPATH_TRAINING_ARTIFACTS_ROOT",
    [ROOT / "artifacts", ROOT / "../repath-mobile/ml/artifacts"],
)
# Optional Kaggle dataset directory; when empty, --kaggle-dir is omitted below.
KAGGLE_DIR = os.environ.get("KAGGLE_WASTE_DIR", "")

print({
    "repath_model": str(ROOT),
    "model_assets_root": str(MODEL_ASSETS_ROOT),
    "benchmark_root": str(BENCHMARK_ROOT),
    "training_artifacts_root": str(TRAINING_ARTIFACTS_ROOT),
    "kaggle_dir": KAGGLE_DIR,
})



## 2. Benchmark Prep and Coverage
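Run the steps in order: sync benchmark progress, build the supported-holdout manifest, build the resolved manifest, then run the coverage and audit checks. Every call uses `check=True`, so a failing script raises `subprocess.CalledProcessError` and stops the cell.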
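
Because these steps run in sequence, a missing input may only surface several scripts in. A pre-flight check along the following lines can report all missing files up front. This is a minimal sketch: `missing_inputs` and `require_inputs` are hypothetical helpers, not part of the repo's scripts.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the subset of paths that do not exist on disk, preserving order."""
    return [Path(p) for p in paths if not Path(p).exists()]


def require_inputs(paths):
    """Raise FileNotFoundError listing every missing path at once."""
    missing = missing_inputs(paths)
    if missing:
        listing = "\n".join(f"  - {p}" for p in missing)
        raise FileNotFoundError(f"Missing benchmark inputs:\n{listing}")
```

In this notebook it could be called before the first step, for example `require_inputs([BENCHMARK_ROOT / "municipal-benchmark-manifest-v2.json", BENCHMARK_ROOT / "benchmark-labeled.csv", MODEL_ASSETS_ROOT / "yolo-repath.labels.json", MODEL_ASSETS_ROOT / "municipal-taxonomy-v1.json"])`.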

In [None]:
subprocess.run([
    "python3", "scripts/evaluation/sync_benchmark_progress.py",
    "--manifest", str(BENCHMARK_ROOT / "municipal-benchmark-manifest-v2.json"),
    "--completed", str(BENCHMARK_ROOT / "benchmark-labeled.csv"),
], check=True)


In [None]:
cmd = [
    "python3", "scripts/evaluation/build_supported_holdout_manifest.py",
    "--labels", str(MODEL_ASSETS_ROOT / "yolo-repath.labels.json"),
    "--input-csv", str(BENCHMARK_ROOT / "benchmark-labeled.csv"),
    "--retraining-manifest", str(TRAINING_ARTIFACTS_ROOT / "retraining/retraining-manifest.json"),
    "--cache-dir", str(BENCHMARK_ROOT / "images/supported-holdout"),
    "--out", str(BENCHMARK_ROOT / "benchmark-manifest.supported-holdout.json"),
]
if KAGGLE_DIR:
    cmd.extend(["--kaggle-dir", KAGGLE_DIR])
subprocess.run(cmd, check=True)


In [None]:
subprocess.run([
    "python3", "scripts/evaluation/build_resolved_benchmark_manifest.py",
    "--manifest", str(BENCHMARK_ROOT / "municipal-benchmark-manifest-v2.json"),
    "--completed", str(BENCHMARK_ROOT / "benchmark-labeled.csv"),
    "--cache-dir", str(BENCHMARK_ROOT / "images"),
    "--out", str(BENCHMARK_ROOT / "municipal-benchmark-manifest.resolved.json"),
], check=True)


In [None]:
subprocess.run([
    "python3", "scripts/evaluation/check_benchmark_coverage.py",
    "--taxonomy", str(MODEL_ASSETS_ROOT / "municipal-taxonomy-v1.json"),
    "--manifest", str(BENCHMARK_ROOT / "municipal-benchmark-manifest.resolved.json"),
    "--out", str(BENCHMARK_ROOT / "benchmark-coverage-report.resolved.json"),
], check=True)

subprocess.run([
    "python3", "scripts/evaluation/audit_benchmark_dataset.py",
    "--manifest", str(BENCHMARK_ROOT / "municipal-benchmark-manifest.resolved.json"),
    "--taxonomy", str(MODEL_ASSETS_ROOT / "municipal-taxonomy-v1.json"),
    "--out", str(BENCHMARK_ROOT / "benchmark-dataset-audit.resolved.json"),
], check=True)


## 3. Annotation + Training
These commands expect the annotation bundle under `TRAINING_ARTIFACTS_ROOT / "retraining/annotation-bundle"`. Seeding uses the current `yolo-repath.tflite` model and its labels file from `MODEL_ASSETS_ROOT`.
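
Both the seeding and the training step read from this directory, so it may be worth confirming that it exists and is non-empty first. The sketch below is a hypothetical helper (`bundle_file_count` is not part of the repo) and makes no assumption about the bundle's internal layout; it only counts files.

```python
from pathlib import Path


def bundle_file_count(bundle_root):
    """Count regular files anywhere under bundle_root; 0 if the directory is absent."""
    root = Path(bundle_root)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.rglob("*") if p.is_file())
```

A result of `0` from `bundle_file_count(TRAINING_ARTIFACTS_ROOT / "retraining/annotation-bundle")` would suggest the bundle has not been placed yet.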


In [None]:
subprocess.run([
    "python3", "scripts/annotation/seed_annotation_boxes.py",
    "--bundle-root", str(TRAINING_ARTIFACTS_ROOT / "retraining/annotation-bundle"),
    "--model", str(MODEL_ASSETS_ROOT / "yolo-repath.tflite"),
    "--labels", str(MODEL_ASSETS_ROOT / "yolo-repath.labels.json"),
], check=True)


In [None]:
# Optional quick dry run. Remove --dry-run for a full training run.
subprocess.run([
    "python3", "scripts/training/train_detector_from_annotation.py",
    "--bundle-root", str(TRAINING_ARTIFACTS_ROOT / "retraining/annotation-bundle"),
    "--candidate-root", str(TRAINING_ARTIFACTS_ROOT / "models/candidates"),
    "--project", str(TRAINING_ARTIFACTS_ROOT / "training-runs"),
    "--dry-run",
], check=True)


## 4. Candidate Evaluation
Run the candidate benchmark and its analysis, then compare the candidate results against the baseline (`latest-results.json`).
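
To make "comparison against baseline" concrete, here is a minimal sketch of a per-metric delta. It assumes, hypothetically, that each results file can be reduced to a flat mapping of metric name to number; the real schema written by the scripts is not shown here, and the metric names and values below are made up for illustration.

```python
def metric_deltas(baseline, candidate, ndigits=4):
    """Return candidate minus baseline for each numeric metric present in both mappings."""
    deltas = {}
    for name in sorted(baseline.keys() & candidate.keys()):
        b, c = baseline[name], candidate[name]
        # Skip non-numeric entries such as model names.
        if isinstance(b, (int, float)) and isinstance(c, (int, float)):
            deltas[name] = round(c - b, ndigits)
    return deltas


# Hypothetical metric names and values, for illustration only.
baseline = {"precision": 0.75, "recall": 0.5, "model": "baseline"}
candidate = {"precision": 0.8, "recall": 0.5, "model": "candidate"}
print(metric_deltas(baseline, candidate))
```

A positive delta means the candidate scored higher on that metric; whether higher is better depends on the metric.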

In [None]:
subprocess.run([
    "python3", "scripts/evaluation/benchmark_candidate_model.py",
    "--candidates-root", str(TRAINING_ARTIFACTS_ROOT / "models/candidates"),
    "--manifest", str(BENCHMARK_ROOT / "municipal-benchmark-manifest.resolved.json"),
    "--out", str(BENCHMARK_ROOT / "latest-results.candidate.json"),
    "--supported-only",
], check=True)

subprocess.run([
    "python3", "scripts/evaluation/compare_benchmark_results.py",
    "--baseline", str(BENCHMARK_ROOT / "latest-results.json"),
    "--candidate", str(BENCHMARK_ROOT / "latest-results.candidate.json"),
    "--out", str(BENCHMARK_ROOT / "latest-results.compare.json"),
], check=True)


## What To Look For
- `benchmark-coverage-report.resolved.json`: coverage gaps and unknown labels.
- `benchmark-dataset-audit.resolved.json`: duplicates, missing URLs, class balance recommendations.
- `latest-results.candidate.analysis.json`: missed labels and false-positive priorities.
- `latest-results.compare.json`: baseline vs candidate metric deltas.
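
For a first look at these files without opening each one, something like the following could be used. It is a schema-agnostic sketch: `summarize_report` is a hypothetical helper that only lists top-level keys, since the report formats are not documented here.

```python
import json
from pathlib import Path


def summarize_report(path):
    """Return a one-line summary of a JSON report: top-level keys, or length if a list."""
    path = Path(path)
    if not path.exists():
        return f"{path.name}: not found"
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return f"{path.name}: keys = {sorted(data)}"
    if isinstance(data, list):
        return f"{path.name}: list of {len(data)} items"
    return f"{path.name}: {type(data).__name__}"
```

For example, `print(summarize_report(BENCHMARK_ROOT / "latest-results.compare.json"))` prints the keys to inspect next, or `not found` if the step that writes the file has not run.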