# ScopeBench Dataset + Calibration + Plugin Lab

This notebook is a hands-on continuation of the quickstart. It walks through:

1. dataset case bootstrapping (`dataset-suggest`),
2. validation and contribution checks,
3. calibration tuning from telemetry, and
4. plugin authoring and verification.


## 0) Optional environment bootstrap


In [None]:
# Optional: uncomment to install the project in editable mode (run from the repo root).
# %pip install -U pip
# %pip install -e ".[dev]"

import json
import subprocess
from pathlib import Path

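def run_json(cmd: str):
    """Run a shell command and parse its stdout as JSON."""
    # check_output raises CalledProcessError on a non-zero exit code.
    raw = subprocess.check_output(cmd, shell=True, text=True)
    return json.loads(raw)

def run_text(cmd: str):
    """Run a shell command and return its stdout, stripped of surrounding whitespace."""
    return subprocess.check_output(cmd, shell=True, text=True).strip()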
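
The helpers above pass one string through the shell, so any argument containing spaces or quotes (such as `--instruction`) has to be quoted by hand. As a minimal sketch, assuming a POSIX shell, the standard-library `shlex.join` can do the quoting. `build_cmd` is a hypothetical helper introduced here for illustration; it is not part of ScopeBench.

```python
import shlex

def build_cmd(*args: str) -> str:
    """Join arguments into one shell-safe command string (hypothetical helper)."""
    return shlex.join(args)

cmd = build_cmd(
    "scopebench", "dataset-suggest",
    "--id", "swe-notebook-001",
    "--instruction", "Stabilize flaky checkout tests without broad refactors",
    "--json",
)
print(cmd)  # the --instruction value is quoted automatically
```

The resulting string can be handed to `run_json` or `run_text` unchanged.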


## 1) Generate a draft dataset case with `dataset-suggest`


In [None]:
cmd = (
    "scopebench dataset-suggest "
    "--id swe-notebook-001 "
    "--domain swe "
    "--instruction 'Stabilize flaky checkout tests without broad refactors' "
    "--contract examples/coding_test_stabilization.contract.yaml "
    "--plan examples/coding_test_stabilization.plan.yaml "
    "--expected-decision ASK "
    "--expected-rationale 'Needs stronger rollback and blast-radius controls' "
    "--json"
)
case = run_json(cmd)
case.keys()


In [None]:
# Uncomment to append a generated case to a working dataset file.
# run_text(
#     "scopebench dataset-suggest "
#     "--id swe-notebook-001 "
#     "--domain swe "
#     "--instruction 'Stabilize flaky checkout tests without broad refactors' "
#     "--contract examples/coding_test_stabilization.contract.yaml "
#     "--plan examples/coding_test_stabilization.plan.yaml "
#     "--expected-decision ASK "
#     "--expected-rationale 'Needs stronger rollback and blast-radius controls' "
#     "--append-to /tmp/community_cases.jsonl"
# )


## 2) Validate dataset quality gates


In [None]:
# Point this at your own dataset file; the path below matches the --append-to target used above.
# validation = run_json("scopebench dataset-validate /tmp/community_cases.jsonl --json")
# validation
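
Before calling `dataset-validate`, it can help to confirm that the file is well-formed JSONL at all. The sketch below only checks that each non-blank line parses as a JSON object. It does not reproduce ScopeBench's quality gates, and `check_jsonl` is a hypothetical helper, not part of the CLI.

```python
import json
from pathlib import Path

def check_jsonl(path):
    """Return (line_number, message) pairs for lines that are not JSON objects."""
    problems = []
    with Path(path).open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((lineno, "expected a JSON object"))
    return problems

# Usage (uncomment once the file exists):
# check_jsonl("/tmp/community_cases.jsonl")
```

An empty list means every line parsed; the real schema checks still belong to `dataset-validate`.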


## 3) Tune calibration from telemetry


In [None]:
# Provide a real telemetry JSONL file to run this.
# weekly = run_json("scopebench weekly-calibrate telemetry.jsonl --out axis_calibration.json --json")
# weekly


In [None]:
# Apply saved calibration during evaluation.
# calibrated = run_json(
#     "scopebench run examples/ops_rotate_key.contract.yaml examples/ops_rotate_key.plan.yaml "
#     "--calibration-file axis_calibration.json --json"
# )
# calibrated["decision"]
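
To see whether a calibration file changes the outcome, one option is to run the same contract and plan twice and compare the two decisions. This is a sketch under two assumptions: that `scopebench run ... --json` also returns a `decision` key when no calibration file is given, and that `run` is a callable like `run_json` from section 0. `compare_decisions` is a hypothetical helper.

```python
def compare_decisions(run, base_cmd: str, calibration_file: str):
    """Return (baseline_decision, calibrated_decision) for one command.

    `run` is any callable that takes a command string and returns a dict,
    for example the notebook's run_json helper.
    """
    baseline = run(f"{base_cmd} --json")
    calibrated = run(f"{base_cmd} --calibration-file {calibration_file} --json")
    return baseline["decision"], calibrated["decision"]

# Usage (uncomment once axis_calibration.json exists):
# compare_decisions(
#     run_json,
#     "scopebench run examples/ops_rotate_key.contract.yaml examples/ops_rotate_key.plan.yaml",
#     "axis_calibration.json",
# )
```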


## 4) Plugin authoring workflow


In [None]:
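# Interactive plugin generator. run_text captures stdout, so prompts may not be
# visible here; running the command in a terminal may be easier.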
# run_text("scopebench plugin-generate --out /tmp/robotics-starter.yaml")

# Lint + compatibility harness
# run_text("scopebench plugin-lint /tmp/robotics-starter.yaml")
# run_text("scopebench plugin-harness /tmp/robotics-starter.yaml --max-golden-cases 100")
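
The lint and harness steps are naturally sequential: there is little point running the harness on a plugin that fails lint. A minimal sketch of that gating follows, assuming the CLI signals failure with a non-zero exit code (not confirmed here). `run_steps` is a hypothetical helper that takes argument lists instead of shell strings.

```python
import subprocess

def run_steps(commands):
    """Run commands in order and stop at the first non-zero exit code.

    Returns a list of (command, returncode) for the commands that ran.
    """
    results = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode))
        if proc.returncode != 0:
            print(proc.stderr.strip())  # surface the failing step's error output
            break
    return results

# Usage (commented, like the cell above):
# run_steps([
#     ["scopebench", "plugin-lint", "/tmp/robotics-starter.yaml"],
#     ["scopebench", "plugin-harness", "/tmp/robotics-starter.yaml",
#      "--max-golden-cases", "100"],
# ])
```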


## 5) API equivalents to automate onboarding

If you are building internal onboarding portals, pair this notebook with the equivalent HTTP endpoints:

- `POST /dataset/suggest`
- `POST /dataset/validate`
- `GET /calibration/dashboard`
- `POST /calibration/adjust`
- `POST /plugins/wizard/generate`
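
As an illustration of calling one of these endpoints from Python, the sketch below builds a JSON `POST` request with the standard library. The base URL and the payload field names are assumptions (the fields simply mirror the CLI flags), so check the API schema before relying on them. `build_post` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: wherever the ScopeBench API is served

def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Assumption: field names mirror the CLI flags; the real schema may differ.
req = build_post("/dataset/suggest", {"id": "swe-notebook-001", "domain": "swe"})
print(req.get_method(), req.full_url)

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```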

See the markdown tutorial: `docs/tutorials/dataset_calibration_plugin_lab.md`.
