Release v0.2.0

github-actions released this 24 Jun 03:16

6c7202b

Added

skills.sh: install the Mira agent skill into a Claude Code skills directory
so an agent can author and run evals. --global targets ~/.claude/skills/mira;
--local (the default) targets ./.claude/skills/mira. It copies from a local
checkout when present, else fetches from GitHub raw (--ref), so
curl -fsSL .../skills.sh | sh works on a box that only has the prebuilt
binary. Each run is a clean replace, so it also serves as the upgrade path.

Native TypeScript SDK (sdks/typescript, mira-eval): author eval studies in
TypeScript/Node with no Rust dependency. It is a zero-runtime-dep library over
the protocol, with wire types and protocol metadata generated from schema/v1/
(a self-contained codegen.mjs --check drift guard, the TS dual of the
Rust/Python guards), a serve() loop (incl. the execute/score split and
list_samples pagination), a parity authoring API, and conformance + behaviour
tests. Worked example: examples/greet-typescript. Publishes to npm as
mira-eval via OIDC trusted publishing (publish-typescript in publish.yml),
mirroring the Python PyPI flow.

Named launchers in mira.toml: [launchers.NAME] saves a study invocation
(bin/example/cmd/uv/python/python3 + package/manifest_path),
selected with --launcher NAME. default_launcher makes a bare mira run
work; explicit launch flags override the named launcher, mirroring --preset.

cargo binstall mira-cli support: [package.metadata.binstall] points binstall
at the prebuilt release tarballs, so the mira binary installs without a compile.

Polyglot launcher flags: mira --uv / --python / --python3 SCRIPT
drive a non-Rust study directly (e.g. mira --python3 study.py run), replacing
the verbose --cmd "python3 study.py". --cmd still works for an arbitrary
command line.

mira help --full now surfaces a GUIDES section (each docs/ guide with a
one-line scope, for progressive disclosure) and a link to the mira agent skill
in LINKS, so an agent can self-orient to the docs and skill in one read. A
drift guard keeps the guide list in sync with docs/README.md.

Run folders, save-by-default, and resume. Every mira run/mira score now
saves a run folder under the results dir by default: <run_id>/ with
meta.json, report.json/report.html, and one cases/<key>/result.json per
finished case (written atomically as it lands). --dry-run opts out.
mira run --resume <run_id> reopens an interrupted run's folder, skips the cases
already recorded under cases/, and runs only what's missing.

mira report <run_id>: new subcommand that re-renders a saved run's reports
from its stored per-case results, with no study process and no re-execution.
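
The launcher precedence described above (explicit launch flags, then --launcher NAME, then default_launcher) can be sketched in a few lines of Python. This is an illustration only, not mira's implementation. The table and key names come from these notes. The value shapes, the top-level placement of default_launcher, the key-by-key merge, and the function name resolve_launcher are all assumptions.

```python
# Hypothetical parsed form of a mira.toml such as:
#
#   default_launcher = "greet"
#
#   [launchers.greet]
#   python3 = "study.py"
#
#   [launchers.rusty]
#   example = "greet"
#   package = "my-study"
#
# Key names follow the release notes; the values are made up for illustration.
CONFIG = {
    "default_launcher": "greet",
    "launchers": {
        "greet": {"python3": "study.py"},
        "rusty": {"example": "greet", "package": "my-study"},
    },
}


def resolve_launcher(config, launcher_name=None, explicit_flags=None):
    """Return the launch settings for one invocation.

    Precedence, highest first: explicit launch flags, then the launcher
    chosen with --launcher NAME, then default_launcher.
    Assumption: explicit flags override the named launcher key by key.
    """
    launchers = config.get("launchers", {})
    name = launcher_name or config.get("default_launcher")
    if name is not None and name not in launchers:
        raise KeyError(f"unknown launcher: {name}")
    settings = dict(launchers[name]) if name is not None else {}
    settings.update(explicit_flags or {})
    return settings
```

With this sketch, a bare call such as resolve_launcher(CONFIG) falls back to the default launcher, while passing explicit_flags={"python3": "other.py"} replaces only that key.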
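
The run-folder behaviour above (one cases/<key>/result.json per finished case, written atomically, and a resume that skips recorded cases) can be illustrated with a minimal Python sketch. It models the idea rather than mira's actual code: the temp-file-plus-rename technique is one common way to get an atomic write and is an assumption here, as are the function names and the shape of the result payload.

```python
import json
import os
import tempfile
from pathlib import Path


def write_result_atomically(run_dir, key, result):
    """Write <run_dir>/cases/<key>/result.json without exposing a partial file."""
    case_dir = Path(run_dir) / "cases" / key
    case_dir.mkdir(parents=True, exist_ok=True)
    target = case_dir / "result.json"
    # Write to a temp file in the same directory, then rename it over the
    # target, so a reader sees either no file or the complete one.
    fd, tmp_path = tempfile.mkstemp(dir=case_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(result, handle)
        os.replace(tmp_path, target)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file on any failure
        raise
    return target


def pending_cases(run_dir, planned_keys):
    """Return the planned case keys with no recorded result, in plan order."""
    cases_dir = Path(run_dir) / "cases"
    return [
        key for key in planned_keys
        if not (cases_dir / key / "result.json").is_file()
    ]
```

A resumed run would then execute only pending_cases(run_dir, planned_keys). Because a result file appears only after the rename, an interrupted case leaves nothing behind that could be mistaken for a finished one.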

Changed

The execution unit (one eval × sample × target × axis × trial) is now called a
case (was "cell"): Cell/CellSpec → Case/CaseSpec, run_cells →
run_cases, etc. The dataset-row builder .case(id, prompt) is now
.sample(id, prompt), and the prebuilt-Sample adder .sample(Sample) is now
.add_sample(Sample). The pub type Case = Sample alias is removed.
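
To make the new terminology concrete, the sketch below enumerates cases as the cross product eval × sample × target × axis × trial. It is a Python illustration of the concept only. The Case dataclass, its field names, and enumerate_cases are hypothetical and are not the Rust Case/CaseSpec types.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Case:
    """One execution unit: a single eval, sample, target, axis and trial."""
    eval_name: str
    sample_id: str
    target: str
    axis: str
    trial: int


def enumerate_cases(evals, sample_ids, targets, axes, trials):
    """Expand the five dimensions into the full list of cases."""
    return [
        Case(eval_name, sample_id, target, axis, trial)
        for eval_name, sample_id, target, axis, trial in product(
            evals, sample_ids, targets, axes, range(trials)
        )
    ]


cases = enumerate_cases(
    evals=["greet"],
    sample_ids=["s1", "s2"],
    targets=["model-a", "model-b"],
    axes=["default"],
    trials=3,
)
```

Here one eval, two samples, two targets, one axis and three trials expand to 1 × 2 × 2 × 1 × 3 = 12 cases. A sample is a dataset row, and a case is one concrete execution of it.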

Removed

--checkpoint, --fresh, and --save on mira run/mira score, plus the
mira::session::Session type. The single-file checkpoint is superseded by the
always-saved run folder; resume is now explicit via --resume <run_id> (a fresh
run mints a new id and reuses nothing, so there is no silent stale-result reuse).
Configure the results dir via [results].dir in mira.toml (the --save <dir>
override is gone).
