Agentao 0.4.6 — agentao run M0 + -p shim

@jin-bo jin-bo released this 09 May 07:17
· 87 commits to main since this release
39dbc7b

Agentao 0.4.6

A non-interactive automation release on top of 0.4.5. No public
Python API or wire-format change, but agentao -p callers that
script on exit codes must read the migration note below: max_iterations
moved from exit 2 to exit 4, and exit 2 now means "invalid
usage / spec validation failed". Everything else upgrades in place
via pip install -U agentao.

The headline:

  • agentao run — new structured automation surface. Spec on
    stdin or --spec FILE, merged with explicit CLI flags, executed
    as one Agentao turn, returning either the final assistant text
    (--format text) or one machine-readable envelope
    (--format json). Pydantic-validated spec with extra="forbid",
    Pydantic envelope with forward-compatible consumer policy.
  • agentao -p is now a thin shim over agentao run --format text --prompt …. Both surfaces share one exit-code table:
    0/1/2/3/4/130.
  • Plugin runtime/loader import boundary is now an executable
    invariant. A new contract test asserts that
    import agentao.plugins does not transitively pull in the loader
    package or the YAML parser.
  • Logger / agentao.log silencing knobs documented across the
    embedding entrypoints. No code change: the
    LLMClient.__init__ short-circuit on logger= and log_file=None
    both already existed; this release makes them visible to host
    authors with one canonical anchor in docs/EMBEDDING.md §2 and
    five crosslinking pages.
  • §7.7 Multi-Agent Kanban Scheduling — new cookbook chapter in
    Part 7 of the developer guide pointing at the external derivative
    project jin-bo/agentao-kanban.

Why this release

Two threads landed in the 2026-05-07 → 2026-05-08 window:

  1. The non-interactive automation contract
    (docs/implementation/NON_INTERACTIVE_RUN_PLAN.{md,zh.md}).
    agentao -p is intentionally simple, which makes it
    inadequate for automation: runtime settings split across flags,
    env vars, cwd, and CLI defaults; stdin treated as prompt text
    rather than a structured task object; callers parsing
    natural-language output to recover status, replay path, token
    usage, or failure reason. M0 of this plan ships in 0.4.6 — the
    subcommand, spec loader, merge rules, text/json output,
    exit codes, the non-interactive abort path, the -p shim, and
    the test surface. Post-MVP scope (jsonl event stream,
    attachments, multi-provider selector, per-run plugin dirs,
    session resume, a checked-in RunSpec schema snapshot) stays
    out of 0.4.6 — see Out of scope below.
  2. Post-0.4.5 boundary-review polish. 0.4.5 shipped items
    #1–#5b plus #6 prep of core-boundary-review. This release
    adds the executable invariant test that locks in the plugin
    runtime/loader split (item #5b's deliverable), trims
    simplify-pass docstrings on the host and plugin batches, and
    documents the import boundary as a tier table so future
    contributors do not regress it.

agentao run — structured automation surface

Shape

# Spec from stdin
agentao run --format json < task.yaml

# Spec from file with explicit flag overrides
agentao run --spec .agentao/tasks/review.yaml --model gpt-5.5 --format json

# Inline prompt convenience (no spec file)
agentao run --prompt "Summarize the current directory" --format json

--spec and piped stdin are mutually exclusive — passing both
exits 2.

Spec contract (M0)

prompt: string                 # required (or pass via --prompt)
cwd: string
model: string
base_url: string
permission_mode: read-only | workspace-write | full-access | plan
interaction_policy: reject     # M0 only accepts "reject"
permissions:
  allow:
    - tool: string             # glob — same syntax as ~/.agentao/permissions.json
      args: { ... }
      domain:
        url_arg: string
        allowlist: [string]
        blocklist: [string]
  deny:
    - tool: string
      args: { ... }
max_iterations: int            # default 100
skills: [string]               # appended to discovered active skills
replay: boolean
output:
  format: text | json

extra="forbid" — unknown spec fields fail with exit 2.
Secrets (api_key) are never accepted in the spec; they stay
in the environment or in a host-injected client.
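The effect of extra="forbid" can be approximated by a host that wants to pre-check a spec before handing it to agentao run. The following is a minimal stdlib-only sketch, not Agentao's Pydantic model: the names ALLOWED_TOP_LEVEL, validate_spec, and exit_code_for are hypothetical, and only the top-level field names, the permission_mode values, and exit code 2 are taken from the contract above.

```python
# Hypothetical pre-check that mirrors extra="forbid" for top-level keys only.
# This is an illustration, not the validator Agentao ships.
ALLOWED_TOP_LEVEL = {
    "prompt", "cwd", "model", "base_url", "permission_mode",
    "interaction_policy", "permissions", "max_iterations",
    "skills", "replay", "output",
}
PERMISSION_MODES = {"read-only", "workspace-write", "full-access", "plan"}
EXIT_INVALID_USAGE = 2  # from the unified exit-code table


def validate_spec(spec: dict) -> list[str]:
    """Return human-readable problems; an empty list means the spec passes."""
    problems = []
    # Unknown keys (including secrets such as api_key) are rejected outright.
    for key in sorted(set(spec) - ALLOWED_TOP_LEVEL):
        problems.append(f"unknown field: {key}")
    mode = spec.get("permission_mode")
    if mode is not None and mode not in PERMISSION_MODES:
        problems.append(f"invalid permission_mode: {mode}")
    # M0 only accepts "reject"; a missing value is treated as acceptable here.
    if spec.get("interaction_policy", "reject") != "reject":
        problems.append("interaction_policy: M0 only accepts 'reject'")
    return problems


def exit_code_for(spec: dict) -> int:
    """Map validation problems onto the documented exit code."""
    return EXIT_INVALID_USAGE if validate_spec(spec) else 0
```

Because api_key is not in the allowed set, a spec carrying it fails the same way as any other unknown field.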

CLI flags only override spec values when explicitly provided
(argparse defaults do not erase spec fields). The merge order is
built-in defaults → spec → explicit CLI flags.
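One way to get "explicit flags only" semantics out of argparse is to suppress defaults so that unset flags never reach the merge. The sketch below illustrates that technique under stated assumptions: build_parser, merge_settings, and BUILTIN_DEFAULTS are hypothetical names, the parser covers only two flags, and this is not necessarily how agentao run implements the merge.

```python
import argparse

# Only max_iterations=100 comes from the spec contract; other defaults are omitted.
BUILTIN_DEFAULTS = {"max_iterations": 100}


def build_parser() -> argparse.ArgumentParser:
    # argument_default=SUPPRESS means a flag that was not passed leaves no
    # attribute on the namespace, so it cannot overwrite a spec value.
    parser = argparse.ArgumentParser(
        prog="run-sketch", argument_default=argparse.SUPPRESS
    )
    parser.add_argument("--model")
    parser.add_argument("--max-iterations", type=int, dest="max_iterations")
    return parser


def merge_settings(spec: dict, argv: list[str]) -> dict:
    """Apply built-in defaults, then the spec, then explicitly passed flags."""
    explicit = vars(build_parser().parse_args(argv))
    merged = dict(BUILTIN_DEFAULTS)
    merged.update(spec)      # spec overrides built-in defaults
    merged.update(explicit)  # explicit CLI flags override the spec
    return merged
```

With this shape, merge_settings({"model": "spec-model"}, []) keeps the spec's model, while passing ["--model", "cli-model"] replaces it.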

Output

--format text writes only the final assistant text to stdout.
Diagnostics go to stderr. This is the closest analog to
agentao -p.

--format json emits one envelope after the run completes:

{
  "status": "ok",
  "session_id": "...",
  "turn_id": "...",
  "cwd": "/abs/path/to/project",
  "model": "gpt-5.5",
  "final_text": "...",
  "replay_path": ".agentao/replays/<id>.jsonl",
  "usage": {
    "prompt_tokens": 12000,
    "completion_tokens": 900,
    "total_tokens": 12900
  },
  "tool_calls": 7,
  "warnings": []
}

On failure, final_text is null and error carries
{ type, message, tool_name?, tool_call_id?, question?, matched_rule? }. error.type is one of permission_required,
permission_denied, interaction_required, max_iterations,
runtime_error, invalid_spec, interrupted. Consumers should
treat the envelope as forward-compatible (extra="ignore" on the
model) and silently drop newly-added fields.
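A consumer can follow the forward-compatible policy without Pydantic by keeping only the fields it knows about. This is a minimal stdlib sketch: EnvelopeView and parse_envelope are hypothetical consumer-side names, and the field subset is chosen for illustration from the envelope shown above.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EnvelopeView:
    """Consumer-side view of the envelope; a subset of the documented fields."""
    status: str
    final_text: Optional[str] = None
    error: Optional[dict] = None
    replay_path: Optional[str] = None
    tool_calls: int = 0


def parse_envelope(raw: str) -> EnvelopeView:
    """Parse one JSON envelope, silently dropping fields this view does not know."""
    data = json.loads(raw)
    known = {f.name for f in fields(EnvelopeView)}
    return EnvelopeView(**{k: v for k, v in data.items() if k in known})
```

A field added in a later release is ignored by parse_envelope instead of raising, which is the behavior extra="ignore" gives a Pydantic consumer.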

Exit codes (unified across agentao run and agentao -p)

Code  Meaning
0     Completed normally
1     Runtime error
2     Invalid usage / spec validation failed / unknown spec field
3     Permission or interaction required (no interactive approval)
4     Max tool iterations reached; answer may be incomplete
130   Interrupted (SIGINT / SIGTERM)
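A calling script might branch on these codes roughly as follows. The enum values come from the table above; the RunExit and classify names and the action labels are hypothetical, and run_prompt simply shells out with the flags shown earlier in these notes.

```python
import subprocess
from enum import IntEnum


class RunExit(IntEnum):
    """Exit codes shared by agentao run and agentao -p (from the table above)."""
    OK = 0
    RUNTIME_ERROR = 1
    INVALID_USAGE = 2
    NEEDS_APPROVAL = 3
    MAX_ITERATIONS = 4
    INTERRUPTED = 130


# Hypothetical caller policy; adjust the labels to your own pipeline.
_ACTIONS = {
    RunExit.OK: "accept",
    RunExit.RUNTIME_ERROR: "fail",
    RunExit.INVALID_USAGE: "fix-invocation",
    RunExit.NEEDS_APPROVAL: "review-permissions",
    RunExit.MAX_ITERATIONS: "accept-with-warning",
    RunExit.INTERRUPTED: "aborted",
}


def classify(returncode: int) -> str:
    """Translate a process return code into a caller-side action label."""
    try:
        return _ACTIONS[RunExit(returncode)]
    except ValueError:
        return "unknown"  # a code outside the documented table


def run_prompt(prompt: str) -> tuple[str, str]:
    """Run one prompt in text mode and return (action, stdout)."""
    proc = subprocess.run(
        ["agentao", "run", "--format", "text", "--prompt", prompt],
        capture_output=True, text=True,
    )
    return classify(proc.returncode), proc.stdout
```

Scripts written against 0.4.5 that treated exit 2 as "answer may be incomplete" would map to "fix-invocation" under this table, which is why the migration note matters.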

agentao -p becomes a thin shim

run_print_mode in agentao/cli/entrypoints.py is now two lines:

def run_print_mode(prompt: str) -> int:
    from .run import execute as _run_execute
    return _run_execute(["--format", "text", "--prompt", prompt])

The success path (prompt → final_text → exit 0) and the
runtime-error path (exit 1) are byte-identical to 0.4.5. The
behavior delta is the exit-code mapping above and the appearance of
exit 3 for permission / interaction requirements that previously
never surfaced from -p (it had no permission rejection path).

Migration for -p exit-code consumers

The only mapping that changed is max_iterations. If your CI
checks [ $? -eq 2 ] to detect "answer may be incomplete", change
it to [ $? -eq 4 ]. Treat 2 as "invalid usage / spec error"
going forward. New exits 3 (permission / interaction) and 130
(SIGINT) are additive — they never appeared from -p before.

agentao -p does not accept --spec, structured output, or
permission rules in the spec; it is intentionally the
inline-prompt shape. For anything more structured, switch to
agentao run.

Plugin runtime/loader boundary — now an executable invariant

docs/design/core-boundary-review.md items 5a/5b separated
runtime-path plugin code (agentao.plugins.*) from the loader path
(agentao.embedding.plugins.*). The split was a convention
through 0.4.5 — a reviewer reading a PR could catch a
loader-import sneaking onto the runtime hot path, but nothing in
the test surface caught it.

tests/test_plugin_boundary_contract.py (new in 0.4.6) closes
that gap: it imports agentao.plugins in a clean subprocess (so
it's isolated from pytest's already-loaded modules) and asserts:

  • none of agentao.embedding.plugins.{manager, manifest, diagnostics, mcp}.* is in sys.modules afterward
  • none of agentao.embedding.plugins.resolvers.* is in
    sys.modules
  • the YAML parser (yaml) is not loaded transitively
  • the runtime path itself remains intact —
    agentao.plugins.{models, skills, agents, hooks} are reachable

docs/design/core-boundary-review.{md,zh.md} gain a tier table
under "Import map after 5a/5b" so future contributors can see at a
glance which paths are public, first-party-runtime, and
first-party-loader.

Logger silencing knobs documented across embedding surfaces

By default, Agentao(...) writes a rolling debug log to
<working_directory>/agentao.log and elevates the package-root
"agentao" logger to DEBUG. Both behaviors are knobs an embedded
host can disable, but the docs glossed over two facts:

  1. LLMClient.__init__ short-circuits the file-handler branch
    when logger= is non-None, so injecting a logger silences
    agentao.log for free — there is no need to also pass
    log_file=None.
  2. Passing log_file=None without logger= still elevates
    getLogger("agentao") to DEBUG, which surprises hosts that
    only wanted to suppress the file.

0.4.6 consolidates the contract under one canonical anchor at
docs/EMBEDDING.md §2 → "Optional: silencing or redirecting agentao.log" (knob matrix, Agentao(...) injection example,
LLMClient direct-build example, the gotcha). Five crosslinking
pages now point at it: docs/LOGGING.md,
developer-guide/{en,zh}/part-2/2-constructor-reference.md,
developer-guide/{en,zh}/part-6/6-observability.md. Pure
documentation — no runtime behavior change.
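For a host that wants neither the file nor the DEBUG elevation, injecting a logger is the single switch described above. The sketch below builds such a logger with the standard logging module; build_host_logger and the logger name are hypothetical, and the Agentao(...) call is left as a comment because the import path and the remaining constructor arguments are assumptions not confirmed here.

```python
import logging


def build_host_logger(name: str = "myhost.agentao") -> logging.Logger:
    """Create a quiet, host-owned logger to hand to Agentao(logger=...)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.WARNING)
    # Do not bubble records up to the host's root logger.
    logger.propagate = False
    # A NullHandler avoids the "no handlers" fallback output to stderr.
    if not logger.handlers:
        logger.addHandler(logging.NullHandler())
    return logger


host_logger = build_host_logger()

# Assumed usage; other constructor arguments omitted:
# from agentao import Agentao
# agent = Agentao(logger=host_logger)
```

Per the two facts above, passing logger= alone is enough to skip agentao.log, whereas log_file=None on its own still raises the "agentao" logger to DEBUG.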

§7.7 Multi-Agent Kanban Scheduling — new cookbook chapter

Part 7 of the developer guide is the integration-blueprints
cookbook. Through 0.4.5 it had six blueprints, all answering the
same question: "how do I embed one Agentao instance into my
product?"

§7.7 adds the first blueprint that answers a different question:
"how do I run many specialized agents as a system, with state,
retries, and isolation, where Agentao is one of the backends?"
It anchors on the external derivative project jin-bo/agentao-kanban,
which turns three Agentao sub-agents (planner / worker /
reviewer) into a self-running kanban board with no human in the
loop.

The chapter intentionally pins to architectural choices and the
Agentao surfaces consumed (sub-agent format, §3.3 host-client
architecture, §4.7 host contract, ACP server, MCP) rather than to
specific kanban CLI flags or filenames — kanban evolves on its own
release cadence.

What did not change

  • No public API or wire-format change. agentao.host Pydantic
    models, the host.events.v1.json / host.acp.v1.json schemas,
    and the Agentao(...) constructor signature are unchanged from
    0.4.5. Eight legacy callback kwargs continue to emit one
    DeprecationWarning per construction.
  • No required code change to upgrade. pip install -U agentao
    is the only step. Hosts that already use agentao -p for
    success-path scripting see zero behavior change; only callers
    that script on the max-iterations exit code need the migration
    above.
  • No CLI command rename. /replay, /sessions, /resume,
    /mcp, /skills, /agent, /permissions, /help all behave
    identically.
  • The agentao.harness deprecated alias is still alive. Its
    removal stays scheduled for 0.5.0.

Migration notes

  • -p exit-code consumers: see the table above. Only
    max_iterations changed mapping (2 → 4); 2 now means
    "invalid usage / spec error".
  • Hosts that want structured output from a one-shot run: stop
    parsing -p's natural-language output, switch to
    agentao run --format json, and consume the documented
    envelope. Pin the consumer to extra="ignore" semantics so
    forward-compatible field additions don't break older parsers.
  • Hosts that build PermissionEngine from a spec-shaped
    object: the new RunPermissionRule.to_engine_dict("allow" | "deny")
    returns an engine-ready dict.
  • Hosts that don't want Agentao to mutate their root logger:
    pass logger= to Agentao(...). That single switch also
    silences the default <wd>/agentao.log file. See
    docs/EMBEDDING.md §2 for the full matrix.

Tests

The four release gates from 0.4.4 / 0.4.5 are preserved:

AGENTAO_TEST_LIVE_MODELS=0 AGENTAO_TEST_LIVE_LLM=0 \
  uv run python -m pytest tests/
uv run mypy --strict --package agentao.host
uv run python scripts/write_host_schema.py --check
uv run python scripts/write_replay_schema.py --check

tests/test_run_subcommand.py (new) covers spec loading from
stdin and --spec, explicit-flag override semantics, unknown-field
failure, output formats, error envelopes, the unified exit-code
table, and -p shim parity. tests/test_plugin_boundary_contract.py
(new) runs the plugin runtime/loader invariant in a clean
subprocess.

scripts/write_host_schema.py --check and
scripts/write_replay_schema.py --check both report up-to-date —
M0 does not change host.events.v1.json or host.acp.v1.json.

Upgrade

pip install -U agentao

Out of scope (deferred)

  • --format jsonl live event stream + RunLifecycleEvent.
    Tracked in NON_INTERACTIVE_RUN_PLAN.md Post-MVP.
  • Spec attachments: / provider: / per-run plugins:
    fields. Same tracker.
  • SIGINT-precise JSONL termination. M0 ships best-effort signal
    routing through CancellationToken; sub-event-boundary
    termination is post-MVP.
  • Session resume from agentao run. Today the run produces a
    session id but cannot be resumed by --resume.
  • A checked-in JSON Schema snapshot for RunSpec /
    RunResult. The Pydantic models are the contract today; a
    snapshot under docs/schema/ and a drift gate similar to the one
    for host.events.v1.json are post-MVP.
  • agentao.harness alias removal. Still scheduled for 0.5.0.
  • Eight legacy callback kwargs — signature surgery on the
    Agentao(...) constructor. Scheduled for 0.5.0 alongside the
    agentao.harness alias removal.
  • agentao/session.py shim removal plus the Path.cwd()
    fallback removal. Scheduled for 0.5.0; it also carries the
    migration of four ACP tests.
  • PermissionEngine legacy auto-load path tightening.
    Conversion into a hard error is deferred; the convenience form
    (PermissionEngine(project_root=...)) stays accepted.
  • bashlex-based supersedence of the workspace-write
    sensitive-write preset's regex tier. Carried over from 0.4.3.
  • PreCompact gate, http-type Stop hooks, plugin-hook events in
    the host public model, hook attachment pipeline. All carried
    over from 0.4.4 unchanged.
  • docs/releases/v0.4.0.md and v0.4.1.md — backfilling
    these remains deferred; carried over from 0.4.5.