Yard v0.5.0

zzunkie released this 16 Jun 12:52

f4f4a08

Added

Rule auto-learn + yard harness review (harness H4 completion). The
learning loop already auto-recorded worker-proposed skills (S3); now a
run's harness_suggestions of kind "rule" are auto-recorded too, as
.agents/rules/learned-<slug>.md, an always-apply constraint that H1 inlines
into every packet. The worker proposes and Yard's deterministic core writes;
existing files are not clobbered; the behavior is gated by auto_rule
(default on). Because a rule is always on, it has no per-task attribution to
score, so learned rules are kept until removed (git-reversible) rather than
auto-pruned like skills. The new yard harness review command shows the
learned rules and the learned skills with their eval scores in one place.
(Deterministic-observation candidate mining, which turns failure themes into
candidates, remains the open part of H4.)
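
The rule auto-learn write path described above can be sketched roughly as follows. This is a minimal illustration, not Yard's actual code: the suggestion fields ("title", "body"), the slug scheme, and the function names are all assumptions; only the kind "rule" filter, the learned-<slug>.md filename, the no-clobber behavior, and the auto_rule gate come from the notes.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs to hyphens.

    The slug scheme is an assumption; the notes only show "learned-<slug>.md".
    """
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "rule"


def record_learned_rules(suggestions, rules_dir, auto_rule=True):
    """Write each suggestion of kind "rule" to learned-<slug>.md.

    Returns the list of paths actually written. Existing files are never
    overwritten (the "no clobber" behavior).
    """
    if not auto_rule:  # gate named in the release notes, default on
        return []
    rules_dir = Path(rules_dir)
    rules_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for suggestion in suggestions:
        if suggestion.get("kind") != "rule":
            continue
        # "title" and "body" are hypothetical field names.
        path = rules_dir / f"learned-{slugify(suggestion['title'])}.md"
        try:
            # Mode "x" raises FileExistsError instead of overwriting.
            with open(path, "x", encoding="utf-8") as f:
                f.write(suggestion["body"].rstrip() + "\n")
        except FileExistsError:
            continue
        written.append(path)
    return written
```

Opening the file in exclusive-create mode makes the no-clobber check and the write a single step, so a rule file that already exists is left untouched.
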
Workspace hooks (harness H3): deterministic guards that bind every
worker. Executables in .agents/hooks/pre-run.d/* run before a worker
spawns; a non-zero exit blocks the run (the task fails with the hook's
reason, so the drain stops on it; fix the cause and re-run). Executables in
post-run.d/* run during evaluation; a non-zero exit is a fatal check
that blocks Done, folded into the evaluation. Each hook runs in the repo
root with YARD_TASK_ID / YARD_RUN_DIR / YARD_WORKER set and a 30s timeout
(a hook that runs longer is killed and counts as failed), and its
stdout/stderr are captured to <run_dir>/hooks/<phase>/. Only executable
files run, in sorted filename order. Unlike a single CLI's hooks, these bind
Codex, Claude Code, and any generic-adapter worker alike. Yard ships no
enabled hooks: yard init lays down empty pre-run.d/post-run.d directories
and a documented README.md. Disable hooks with hooks: false in yard.yaml.
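
The hook contract above (sorted order, executables only, 30s timeout, captured output, non-zero exit blocks) might be implemented along these lines. This is a sketch under assumptions, not Yard's runner: the function name, the per-hook log filenames, passing the YARD_* values as environment variables, and stopping at the first failing hook are all guesses for illustration. It assumes a POSIX system.

```python
import os
import subprocess
from pathlib import Path

HOOK_TIMEOUT_S = 30  # timeout stated in the release notes


def run_hooks(hooks_root, phase, repo_root, run_dir, task_id, worker,
              timeout=HOOK_TIMEOUT_S):
    """Run the executables in <hooks_root>/<phase>.d in sorted filename order.

    Returns (ok, reason). Stopping at the first failure is an assumption.
    """
    phase_dir = Path(hooks_root) / f"{phase}.d"
    if not phase_dir.is_dir():
        return True, ""
    log_dir = Path(run_dir) / "hooks" / phase
    log_dir.mkdir(parents=True, exist_ok=True)
    # Assumed: the YARD_* values are passed as environment variables.
    env = dict(os.environ, YARD_TASK_ID=task_id,
               YARD_RUN_DIR=str(run_dir), YARD_WORKER=worker)
    for hook in sorted(phase_dir.iterdir(), key=lambda p: p.name):
        if not (hook.is_file() and os.access(hook, os.X_OK)):
            continue  # only executable files run
        try:
            proc = subprocess.run([str(hook.resolve())], cwd=repo_root, env=env,
                                  capture_output=True, text=True,
                                  timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return False, f"{hook.name}: timed out after {timeout}s"
        # Log filenames are hypothetical; the notes only name the directory.
        (log_dir / f"{hook.name}.stdout").write_text(proc.stdout)
        (log_dir / f"{hook.name}.stderr").write_text(proc.stderr)
        if proc.returncode != 0:
            detail = proc.stderr.strip()
            return False, f"{hook.name}: exit {proc.returncode}: {detail}"
    return True, ""
```

A caller would treat a False result from the pre-run phase as a blocked run and from the post-run phase as a fatal evaluation check, using the returned reason as the failure message.
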
Explicit skill authoring: yard skill research / create / apply (S2/S3).
Author skills on demand without hand-writing a SKILL.md.
yard skill research "<topic>" runs a researcher-role worker that drafts a
candidate skill to a run dir and installs nothing; yard skill apply <run-id>
installs that draft; yard skill create <name> [--from "<topic>"] authors and
installs in one step. The run is queue-isolated: like the planner it spawns
one worker, but it derives no intent/queue, so authoring a skill never
disturbs the live intent (the gap that had deferred this feature). The
worker proposes the content; Yard (the deterministic core) is the sole
writer. Authored skills are tagged source: created (not learned), so they
are user-chosen and never auto-pruned; they persist, like a library equip,
until unequipped.
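
The effect of the source tag on pruning can be illustrated with a small filter. Everything here except the two source values is hypothetical: the Skill record, the eval_score field, and the min_score threshold are assumptions, since the notes do not say how pruning decides.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    source: str        # "learned" or "created", per the release notes
    eval_score: float  # hypothetical score field


def prune_candidates(skills, min_score=0.5):
    """Return the skills eligible for auto-pruning.

    Only learned skills are ever eligible; created skills are skipped
    regardless of score. The threshold rule is an assumption.
    """
    return [s for s in skills
            if s.source == "learned" and s.eval_score < min_score]
```

With this shape, a created skill with a low score is still kept, which matches the "user-chosen, never auto-pruned" behavior described above.
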

Fixed

Recover a task wrongly stuck Failed by a dead orchestrator. If Yard
exited after a worker finished but before the result was evaluated, the
task could end up Failed even though its run produced a clean done
result — and neither restart-recovery nor yard recover could salvage it
(recovery only looked at Running tasks), forcing a wasteful full re-run.
Recovery now also re-evaluates such a task's stranded result. It detects the
case by an unfinalized orphan run: worker.pid is still on disk (a finalized
run removes it) but the process is gone. The result is routed through the
evaluator, so a genuinely bad result stays Failed; only real, completed work
is reclaimed.
Surfaced by dogfooding Deadline12, where a completed map task sat Failed.
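
The orphan-run check described above could look something like the sketch below. It is an illustration only: the function names and the assumption that worker.pid holds a plain decimal PID are not from the notes, and the liveness probe is POSIX-specific.

```python
import os
from pathlib import Path


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def is_unfinalized_orphan(run_dir) -> bool:
    """True when worker.pid is still on disk but its process is gone."""
    pid_file = Path(run_dir) / "worker.pid"
    if not pid_file.exists():
        return False  # a finalized run removes worker.pid
    try:
        # Assumed format: a single decimal PID.
        pid = int(pid_file.read_text().strip())
    except ValueError:
        return False  # unreadable PID: this sketch does not try to recover
    if pid <= 0:
        return False  # guard: PIDs <= 0 address process groups in os.kill
    return not pid_alive(pid)
```

A run flagged this way would then be handed to the evaluator rather than marked Done directly, so the evaluator still decides whether the stranded result is good.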
