
Yard v0.5.0


@zzunkie zzunkie released this 16 Jun 12:52
· 12 commits to main since this release

Added

  • Rule auto-learn + yard harness review (harness H4 completion). The
    learning loop already auto-recorded worker-proposed skills (S3); now a
    run's harness_suggestions of kind "rule" are auto-recorded too, as
    .agents/rules/learned-<slug>.md — an always-apply constraint H1 inlines
    into every packet (the worker proposes, Yard's deterministic core writes; no
    clobber; gated by auto_rule, default on). Because a rule is always-on it
    has no per-task attribution to score, so learned rules are kept until removed
    (git-reversible) rather than auto-pruned like skills. New yard harness
    review shows the learned rules and the learned skills with their eval
    scores in one place. (Deterministic-observation candidate mining — failure
    themes into candidates — remains the open part of H4.)
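
The "no clobber" write of a learned rule could be sketched roughly as follows. This is an illustrative Python sketch, not Yard's implementation: the `slugify` and `record_learned_rule` helpers, their parameters, and the slug format are assumptions; only the `.agents/rules/learned-<slug>.md` path and the `auto_rule` gate come from the note above.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase, turn runs of non-alphanumerics into '-', trim dashes.

    Hypothetical slug format; the release note does not specify one.
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def record_learned_rule(repo_root, title, body, auto_rule=True):
    """Write .agents/rules/learned-<slug>.md unless it already exists.

    Returns the path written, or None when the write was skipped.
    """
    if not auto_rule:
        return None  # gate switched off: record nothing
    slug = slugify(title)
    if not slug:
        return None
    rules_dir = Path(repo_root) / ".agents" / "rules"
    rules_dir.mkdir(parents=True, exist_ok=True)
    path = rules_dir / f"learned-{slug}.md"
    try:
        # Mode "x" raises FileExistsError instead of overwriting,
        # which is one simple way to get no-clobber behaviour.
        with open(path, "x", encoding="utf-8") as fh:
            fh.write(body.rstrip() + "\n")
    except FileExistsError:
        return None
    return path
```

Because the file is only ever created and never rewritten, a rule edited by hand survives later runs that propose the same slug, and removing a rule stays an ordinary git-reversible file deletion.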

  • Workspace hooks (harness H3) — deterministic guards that bind every
    worker. Executables in .agents/hooks/pre-run.d/* run before a worker
    spawns; a non-zero exit blocks the run (the task fails with the hook's
    reason, so the drain stops on it — fix the cause and re-run). Executables in
    post-run.d/* run during evaluation; a non-zero exit is a fatal check
    that blocks Done, folded into the evaluation. Each hook runs in the repo
    root with YARD_TASK_ID / YARD_RUN_DIR / YARD_WORKER, a 30s timeout
    (a hook that runs longer is killed and fails), and stdout/stderr captured to
    <run_dir>/hooks/<phase>/. Only executable files run, in sorted filename
    order. Unlike a single CLI's hooks, these bind Codex, Claude Code, and any
    generic-adapter worker alike. Yard ships no enabled hooks — yard init
    lays down empty pre-run.d/post-run.d and a documented README.md. Off
    with hooks: false in yard.yaml.
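
The hook semantics described above (sorted order, executables only, repo-root working directory, the three YARD_* variables, a 30s timeout, captured output) can be illustrated with a small runner. This is a sketch under stated assumptions, not Yard's code: the `run_hooks` function, the `<name>.stdout` / `<name>.stderr` log file names, and stopping at the first failing hook are all hypothetical choices made for the example.

```python
import os
import subprocess
from pathlib import Path

HOOK_TIMEOUT_SECONDS = 30  # timeout stated in the release note


def run_hooks(repo_root, phase, run_dir, task_id, worker,
              timeout=HOOK_TIMEOUT_SECONDS):
    """Run executables in .agents/hooks/<phase>.d/ in sorted filename order.

    Returns (ok, reason). Assumption: the first failing hook stops the phase.
    """
    root = Path(repo_root).resolve()
    hooks_dir = root / ".agents" / "hooks" / f"{phase}.d"
    if not hooks_dir.is_dir():
        return True, ""
    log_dir = Path(run_dir) / "hooks" / phase
    log_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ, YARD_TASK_ID=task_id,
               YARD_RUN_DIR=str(run_dir), YARD_WORKER=worker)

    for hook in sorted(hooks_dir.iterdir(), key=lambda p: p.name):
        # Only regular files with an executable bit are run.
        if not (hook.is_file() and os.access(hook, os.X_OK)):
            continue
        # Log file names are an assumption made for this sketch.
        out_path = log_dir / f"{hook.name}.stdout"
        err_path = log_dir / f"{hook.name}.stderr"
        with open(out_path, "wb") as out, open(err_path, "wb") as err:
            try:
                proc = subprocess.run([str(hook)], cwd=root, env=env,
                                      stdout=out, stderr=err, timeout=timeout)
            except subprocess.TimeoutExpired:
                # subprocess.run kills the child before raising.
                return False, f"{phase} hook {hook.name} timed out after {timeout}s"
        if proc.returncode != 0:
            return False, f"{phase} hook {hook.name} exited {proc.returncode}"
    return True, ""
```

In this sketch a caller would treat `(False, reason)` from the `pre-run` phase as a blocked run and from the `post-run` phase as a fatal check. Because a hook is just an executable file, the same guard applies regardless of which worker CLI is being driven.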

  • Explicit skill authoring: yard skill research / create / apply (S2/S3).
    On-demand skills without hand-writing a SKILL.md. yard skill research
    "<topic>" runs a researcher-role worker that drafts a candidate skill to a
    run dir and installs nothing; yard skill apply <run-id> installs that
    draft; yard skill create <name> [--from "<topic>"] authors and installs in
    one step. The run is queue-isolated — like the planner it spawns one
    worker, but derives no intent/queue, so authoring a skill never disturbs the
    live intent (the gap that deferred this). The worker proposes the content;
    Yard (the deterministic core) is the sole writer. Authored skills are tagged
    source: created (not learned), so they are user-chosen and never
    auto-pruned — they persist like a library equip until unequip.

Fixed

  • Recover a task wrongly stuck Failed by a dead orchestrator. If Yard
    exited after a worker finished but before the result was evaluated, the
    task could end up Failed even though its run produced a clean done
    result — and neither restart-recovery nor yard recover could salvage it
    (recovery only looked at Running tasks), forcing a wasteful full re-run.
    Recovery now also re-evaluates such a task's stranded result, detected by an
    unfinalized orphan run (worker.pid still on disk — a finalized run
    removes it — with the process gone). It routes through the evaluator, so a
    genuinely-bad result stays failed; only real, completed work is reclaimed.
    Surfaced by dogfooding Deadline12, where a completed map task sat Failed.
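
The orphan-run test described above (worker.pid still on disk, process gone) might look like the following on a POSIX system. This is a minimal sketch, not Yard's recovery code: `process_alive` and `is_unfinalized_orphan` are hypothetical names, and treating an unreadable pid file as "not an orphan" is an assumption.

```python
import os
from pathlib import Path


def process_alive(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX signal-0 probe)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def is_unfinalized_orphan(run_dir) -> bool:
    """True when worker.pid is still on disk but its process is gone."""
    pid_file = Path(run_dir) / "worker.pid"
    try:
        pid = int(pid_file.read_text().strip())
    except FileNotFoundError:
        return False  # a finalized run has removed worker.pid
    except ValueError:
        return False  # assumption: leave a garbled pid file for manual review
    return not process_alive(pid)
```

A run flagged this way would then be handed to the evaluator rather than marked done directly, which matches the note's point that a genuinely bad result stays failed. One caveat of a pid probe is pid reuse: an unrelated process that later receives the same pid makes the run look alive, so a stricter check could also compare process start times.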