Yard v0.5.0
Added
- Rule auto-learn + yard harness review (harness H4 completion). The
learning loop already auto-recorded worker-proposed skills (S3); now a
run's harness_suggestions of kind "rule" are auto-recorded too, as
.agents/rules/learned-<slug>.md: an always-apply constraint that H1 inlines
into every packet (the worker proposes, Yard's deterministic core writes; no
clobber; gated by auto_rule, default on). Because a rule is always on, it
has no per-task attribution to score, so learned rules are kept until removed
(git-reversible) rather than auto-pruned like skills. New yard harness review
shows the learned rules and the learned skills with their eval scores in one
place. (Deterministic-observation candidate mining, i.e. turning failure
themes into candidates, remains the open part of H4.)
- Workspace hooks (harness H3): deterministic guards that bind every
worker. Executables in .agents/hooks/pre-run.d/* run before a worker
spawns; a non-zero exit blocks the run (the task fails with the hook's
reason, so the drain stops on it; fix the cause and re-run). Executables in
post-run.d/* run during evaluation; a non-zero exit is a fatal check
that blocks Done, folded into the evaluation. Each hook runs in the repo
root with YARD_TASK_ID / YARD_RUN_DIR / YARD_WORKER set, a 30s timeout
(a hook that runs longer is killed and fails), and stdout/stderr captured to
<run_dir>/hooks/<phase>/. Only executable files run, in sorted filename
order. Unlike a single CLI's hooks, these bind Codex, Claude Code, and any
generic-adapter worker alike. Yard ships no enabled hooks: yard init
lays down empty pre-run.d / post-run.d and a documented README.md. Turn
hooks off with hooks: false in yard.yaml.
- Explicit skill authoring: yard skill research / create / apply (S2/S3).
On-demand skills without hand-writing a SKILL.md. yard skill research
"<topic>" runs a researcher-role worker that drafts a candidate skill to a
run dir and installs nothing; yard skill apply <run-id> installs that
draft; yard skill create <name> [--from "<topic>"] authors and installs in
one step. The run is queue-isolated: like the planner it spawns one
worker, but it derives no intent/queue, so authoring a skill never disturbs
the live intent (the gap that deferred this). The worker proposes the content;
Yard (the deterministic core) is the sole writer. Authored skills are tagged
source: created (not learned), so they are user-chosen and never
auto-pruned; they persist like a library equip until unequip.
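
The workspace-hook rules described above (executables only, sorted filename order, repo-root working directory, the three YARD_* variables, a 30s timeout, captured output, stop on the first non-zero exit) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Yard's actual implementation: the function name run_hooks, the (ok, reason) return shape, and the <hook>.stdout / <hook>.stderr log file names are hypothetical.

```python
import os
import subprocess
from pathlib import Path

HOOK_TIMEOUT_S = 30  # timeout stated in the release notes


def run_hooks(repo_root, phase, run_dir, task_id, worker, timeout=HOOK_TIMEOUT_S):
    """Run the hooks for one phase ("pre-run" or "post-run").

    Returns (ok, reason); ok is False on the first hook that exits
    non-zero or exceeds the timeout. Hypothetical helper, for illustration.
    """
    root = Path(repo_root).resolve()
    hook_dir = root / ".agents" / "hooks" / f"{phase}.d"
    log_dir = Path(run_dir) / "hooks" / phase  # <run_dir>/hooks/<phase>/
    if not hook_dir.is_dir():
        return True, ""

    env = dict(
        os.environ,
        YARD_TASK_ID=task_id,
        YARD_RUN_DIR=str(run_dir),
        YARD_WORKER=worker,
    )

    # Sorted filename order; anything that is not an executable file is skipped.
    for hook in sorted(hook_dir.iterdir(), key=lambda p: p.name):
        if not (hook.is_file() and os.access(hook, os.X_OK)):
            continue
        log_dir.mkdir(parents=True, exist_ok=True)
        # Log file names are an assumption; the notes only name the directory.
        with open(log_dir / f"{hook.name}.stdout", "wb") as out, \
                open(log_dir / f"{hook.name}.stderr", "wb") as err:
            try:
                # subprocess.run kills the child if the timeout expires.
                proc = subprocess.run(
                    [str(hook)], cwd=root, env=env,
                    stdout=out, stderr=err, timeout=timeout,
                )
            except subprocess.TimeoutExpired:
                return False, f"{hook.name}: timed out after {timeout}s"
        if proc.returncode != 0:
            return False, f"{hook.name}: exit {proc.returncode}"
    return True, ""
```

In this sketch a pre-run caller would fail the task with the returned reason, while a post-run caller would fold a False result into the evaluation as a fatal check.
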
Fixed
- Recover a task wrongly stuck Failed by a dead orchestrator. If Yard
exited after a worker finished but before the result was evaluated, the
task could end up Failed even though its run produced a clean done
result, and neither restart-recovery nor yard recover could salvage it
(recovery only looked at Running tasks), forcing a wasteful full re-run.
Recovery now also re-evaluates such a task's stranded result, detected by an
unfinalized orphan run (worker.pid is still on disk, which a finalized run
would have removed, and the process is gone). It routes through the
evaluator, so a genuinely bad result stays failed; only real, completed work
is reclaimed. Surfaced by dogfooding Deadline12, where a completed map task
sat Failed.
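
The orphan-run test described above (worker.pid still present, process gone) could look something like the sketch below. It is a POSIX-only illustration rather than Yard's code: the helper names are hypothetical, and it assumes worker.pid holds a single decimal PID, which the notes do not specify.

```python
import os
from pathlib import Path


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def is_orphan_run(run_dir) -> bool:
    """True when worker.pid is on disk but its process is gone.

    Assumption: a missing or malformed pid file means "not an orphan",
    since a finalized run removes the file.
    """
    pid_file = Path(run_dir) / "worker.pid"
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False
    if pid <= 0:
        return False  # not a valid single-process PID
    return not pid_alive(pid)
```

A run flagged this way would still go through the evaluator, so the check only selects candidates and does not itself decide that the work was good.
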