Agentao 0.4.4

@jin-bo jin-bo released this 06 May 05:21
· 82 commits to main since this release
6f1cd3a

A Claude-Code compatibility + tool-hardening release on top of 0.4.3.
No breaking changes; no public API or wire-format change.
pip install -U agentao upgrades in place from any 0.4.x release.

The headline features:

  • Two new lifecycle hook events, Stop and PreCompact, join the six
    existing plugin-hook events. The wire shape is Claude Code's flat
    snake_case top-level payload, so a hook script written against
    Claude Code's documented stdin shape runs unchanged.
  • A full Stop control gate: exit code 2, JSON
    decision: "block", continue: false, stopReason,
    suppressOutput, systemMessage,
    hookSpecificOutput.additionalContext — all honored, with
    per-chat() re-entry capping so a runaway force_continue can't
    spin the loop forever.
  • SearchTextTool argv hardening — patterns beginning with -
    can no longer be parsed as flags by git grep or rg.
  • SearchTextTool rg source-level skip pruning: --glob '!dir'
    flags are passed to rg directly so node_modules / build /
    target / friends are excluded by the engine rather than
    post-filtered out of its output.
  • EditTool unicode-fuzzy tier-3 match — typographic codepoints
    (smart quotes, em-/en-dash, NBSP, ideographic space, …) are
    normalized to ASCII before comparison, mirroring git apply fuzzy
    behaviour.

Plus a doom-loop double-dispatch fix and a hook-parser
non-string-trigger isinstance guard.

Why this release

Two threads landed in the 2026-05-05 → 2026-05-06 window:

  1. Stop / PreCompact hooks plan
    (docs/implementation/STOP_PRECOMPACT_HOOKS_PLAN.{md,zh.md}) —
    shipped as PR-1 (Phase A, event surface, #27) then PR-2 (Phase B,
    control-aware gate, #28). The split lets hosts that only need
    observation run on Phase A; hosts that need Claude-Code-style
    control wire on Phase B without changing their hook scripts.
  2. pi-mono tool-tier borrow review
    (docs/design/pi-mono-tools-review.{md,zh.md}) — verdicts on
    tool-side correctness gaps. Two of the verdicts ship in this
    release: the Edit tier-3 fuzzy match and the Search argv
    hardening pass. The rest stay deferred per the review.

Stop / PreCompact lifecycle hooks

Two new events on the existing plugin-hook surface

SUPPORTED_HOOK_EVENTS now includes Stop and PreCompact. They
ride the same PluginHookDispatcher plumbing as the existing six
events, with one structural difference: their on-wire payload uses
Claude Code's flat snake_case top-level schema instead of
Agentao's {event, data} envelope. Other adapter methods are
unchanged — flipping the whole adapter would break every existing
event consumer (and _matches's data["toolName"] path), so Stop
and PreCompact are made Claude-compatible from the start while the
others keep their envelope.

The _matches filter is extended to handle PreCompact's
manual|auto matcher alongside the existing tool_name /
prompt_text matchers.
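
As a rough illustration of the flat shape, a Stop hook body might look like the sketch below. Only stop_hook_active and the decision / reason output keys come from these notes; the hook_event_name key and the respond_to_stop helper are assumptions made for the example, not Agentao API.

```python
import json


def respond_to_stop(raw_stdin: str) -> str:
    """Return the JSON text a Stop hook might print for one stdin payload."""
    payload = json.loads(raw_stdin)
    # Flat shape: keys sit at the top level, with no {event, data} envelope.
    if payload.get("hook_event_name") != "Stop":  # assumed key name
        return ""
    # Do not force a second continuation if this hook already forced one.
    if payload.get("stop_hook_active"):
        return ""
    return json.dumps(
        {"decision": "block", "reason": "Run the tests before finishing."}
    )
```

A real hook script would feed sys.stdin.read() into this function and print the result; it is kept as a pure function here so it can be exercised without a subprocess.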

Dispatch sites

| Event | Dispatch sites |
| --- | --- |
| Stop | turn-end finalizer for final_response / max_iterations / doom_loop |
| PreCompact | microcompact, full (compression_threshold), full (api_overflow), minimal_history |

The compaction_type field on the PreCompact emit lets hosts
distinguish benign compaction from emergency truncation; hosts that
snapshot context for forensic replay want both.

Per-event hook-type allowlist

SUPPORTED_HOOK_TYPES_BY_EVENT is a per-event allowlist;
events not listed fall back to SUPPORTED_HOOK_TYPES. Stop and
PreCompact deliberately exclude prompt — at runtime
_dispatch_lifecycle (and Phase B's Stop-specific runner) only
invoke command hooks for these events, so a prompt-type rule
would parse as supported but be silently dropped at dispatch. The
parser now warns and skips at parse time.
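
The fallback lookup and the parse-time skip can be pictured with a minimal sketch. The two constant names come from the text above; their contents, the rule shape, and the filter_rules helper are assumptions for illustration only.

```python
# Assumed contents; the real sets live in Agentao's hook parser.
SUPPORTED_HOOK_TYPES = frozenset({"command", "prompt"})
SUPPORTED_HOOK_TYPES_BY_EVENT = {
    "Stop": frozenset({"command"}),
    "PreCompact": frozenset({"command"}),
}


def allowed_hook_types(event):
    """Per-event allowlist, falling back to the global set when unlisted."""
    return SUPPORTED_HOOK_TYPES_BY_EVENT.get(event, SUPPORTED_HOOK_TYPES)


def filter_rules(event, rules):
    """Hypothetical parse-time filter: keep supported rules, warn on the rest."""
    allowed = allowed_hook_types(event)
    kept, warnings = [], []
    for rule in rules:
        hook_type = rule.get("type")
        if hook_type in allowed:
            kept.append(rule)
        else:
            warnings.append(f"{event}: skipping unsupported hook type {hook_type!r}")
    return kept, warnings
```

Rejecting the rule at parse time means the author sees a warning once, instead of a rule that loads cleanly and never fires.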

Stop control gate

Stop hooks honor Claude Code's full Stop output schema:

| Mechanism | Effect |
| --- | --- |
| Exit code 2 | Block; stderr becomes the reason |
| JSON decision: "block" + reason | Block; reason is preserved |
| JSON continue: false | Stop the chat loop (precedence over decision) |
| JSON stopReason | Recorded; surfaces to host on the emit payload |
| JSON suppressOutput: true | Suppress echo of additionalContext onto the assistant final answer |
| JSON systemMessage | Recorded for replay fidelity |
| JSON hookSpecificOutput.additionalContext | Appended to the assistant message |
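
One way to read the table is as a small parser with a fixed precedence order. The sketch below is not Agentao's _parse_stop_command_output; the StopResultSketch type, its outcome labels ("allow", "block", "stop"), and the handling of non-JSON stdout are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StopResultSketch:
    outcome: str = "allow"  # hypothetical labels: "allow" | "block" | "stop"
    reason: Optional[str] = None
    stop_reason: Optional[str] = None
    system_message: Optional[str] = None
    suppress_output: bool = False
    additional_contexts: List[str] = field(default_factory=list)


def parse_stop_output_sketch(exit_code: int, stdout: str, stderr: str) -> StopResultSketch:
    # Exit code 2 blocks outright; stderr becomes the reason.
    if exit_code == 2:
        return StopResultSketch(outcome="block", reason=stderr.strip())
    result = StopResultSketch()
    try:
        data = json.loads(stdout) if stdout.strip() else {}
    except json.JSONDecodeError:
        return result  # assumption: non-JSON stdout is treated as a plain allow
    if not isinstance(data, dict):
        return result
    result.stop_reason = data.get("stopReason")
    result.system_message = data.get("systemMessage")
    result.suppress_output = bool(data.get("suppressOutput", False))
    specific = data.get("hookSpecificOutput")
    if isinstance(specific, dict) and isinstance(specific.get("additionalContext"), str):
        result.additional_contexts.append(specific["additionalContext"])
    # continue: false takes precedence over decision: "block".
    if data.get("continue") is False:
        result.outcome = "stop"
    elif data.get("decision") == "block":
        result.outcome = "block"
        result.reason = data.get("reason")
    return result
```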

force_continue re-enters the chat loop with follow_up_message
appended as a user turn. Re-entry is capped per chat() invocation
(stop_reentry_cap=3 by default) — a runaway hook that always
returns force_continue produces a reentry_capped outcome rather
than spinning the loop.

stop_hook_active flips to True on the second and subsequent
dispatches within one chat() invocation, so a hook script written
against Claude Code's documented stop_hook_active semantics ("True
if I am being re-entered after my own previous force-continue")
sees the matching value without any extra plumbing on the host
side.
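
The interaction between the re-entry cap and stop_hook_active can be sketched as a loop. The stop_reentry_cap name and default come from the text above; the function itself, the callable hook, and the exact point at which the cap trips are assumptions, and the real loop would also append follow_up_message and call the model between dispatches.

```python
from typing import Callable, List, Tuple


def run_stop_gate_sketch(
    stop_hook: Callable[[bool], bool],
    stop_reentry_cap: int = 3,
) -> Tuple[str, List[bool]]:
    """stop_hook(stop_hook_active) returns True to force a continuation.

    Returns the final outcome and the stop_hook_active value seen on each dispatch.
    """
    reentries = 0
    seen_flags: List[bool] = []
    while True:
        # False on the first dispatch, True on every later one in this chat().
        stop_hook_active = len(seen_flags) > 0
        seen_flags.append(stop_hook_active)
        if not stop_hook(stop_hook_active):
            return "allow", seen_flags
        if reentries >= stop_reentry_cap:
            # A hook that always forces continuation ends here, not in a spin.
            return "reentry_capped", seen_flags
        reentries += 1
        # (real loop: append follow_up_message as a user turn, run the model)
```

A hook that honors the flag, such as `lambda active: not active`, forces exactly one continuation and then lets the turn end.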

PLUGIN_HOOK_FIRED schema

Phase A → Phase B is additive. Phase A's emit schema is
preserved; Phase B's Stop emit extends it:

| Field | Source |
| --- | --- |
| hook_name | "Stop" or "PreCompact" |
| outcome | Stop: allow / block / continue / continue_at_max_iter / reentry_capped (Phase B). PreCompact: always allow (observe-only) |
| turn_end_reason | Stop only: final_response / max_iterations / doom_loop (discriminator for continue across exit sites) |
| at_max_iter | Stop only: derived from turn_end_reason == "max_iterations" |
| matched_rule_count | Selection count; gates emission (when zero, no event is emitted) |
| added_context_count | Stop only: len(stop_result.additional_contexts) |
| suppress_output | Stop only: from JSON suppressOutput |
| compaction_type | PreCompact only: microcompact / full / minimal_history |
| trigger | PreCompact only: auto (no manual site exists) |

matched_rule_count is a selection count, not an execution count.
It is computed via select_matching_rules("Stop", ...)
before the dispatcher fork, so a no-match dispatch produces no
subprocess and no event. The dispatcher result also carries the
count as defense-in-depth for a future refactor.
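
A minimal sketch of selection gating emission, under stated assumptions: the rule dictionaries, the regex treatment of the matcher, and both function names are hypothetical stand-ins, not the real select_matching_rules or dispatcher.

```python
import re


def select_matching_rules_sketch(event, rules, subject=None):
    """Pick rules for an event; a missing matcher matches everything."""
    selected = []
    for rule in rules:
        if rule.get("event") != event:
            continue
        matcher = rule.get("matcher")
        if not matcher:
            selected.append(rule)
        elif subject is not None and re.fullmatch(matcher, subject):
            # e.g. matcher "manual|auto" against a PreCompact trigger
            selected.append(rule)
    return selected


def dispatch_and_emit_sketch(event, rules, emitted, subject=None):
    """Select first; with zero matches, run nothing and emit nothing."""
    matched = select_matching_rules_sketch(event, rules, subject)
    if not matched:
        return None
    record = {"hook_name": event, "matched_rule_count": len(matched)}
    emitted.append(record)  # stands in for the PLUGIN_HOOK_FIRED emit
    return record
```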

What this is not

This is a transport / replay event, not a host-public event.
The agentao.host.EventStream discriminated union currently covers
ToolLifecycleEvent | SubagentLifecycleEvent | PermissionDecisionEvent and does not include plugin-hook
events. Hosts that consume Agentao.events() will not see Stop /
PreCompact from this plan; only the transport / replay layer (and
tests reading the transport queue) will. Promoting plugin-hook
events into the host public model is tracked separately in
PUBLIC_EVENT_PROMOTION_PLAN.md.

Search-tool argv hardening

A user-supplied pattern beginning with - could be parsed as a
flag by the underlying engine. Two changes:

  • _git_grep now passes the pattern via -e <pattern>. (git
    grep's -- is the pathspec separator, not an option
    terminator — passing -- <pattern> would interpret <pattern>
    as the first pathspec.)
  • _ripgrep now places the pattern after --.

Coverage: --help, --pre=..., leading -e payload — all three
are now searched as literal strings on both engines. New
tests/test_search_argument_injection.py locks the contract.
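
The two argv shapes can be sketched as follows. The -e flag for git grep and the -- terminator for rg are as described above; the helper names, the -n / --line-number flags, and the optional paths argument are assumptions for the example rather than the actual _git_grep / _ripgrep code.

```python
from typing import List, Sequence


def git_grep_argv_sketch(pattern: str, paths: Sequence[str] = ()) -> List[str]:
    """Pass the pattern via -e so a leading '-' is never read as a flag."""
    argv = ["git", "grep", "-n", "-e", pattern]
    if paths:
        # For git grep, "--" separates pathspecs from everything before it.
        argv += ["--", *paths]
    return argv


def ripgrep_argv_sketch(pattern: str, paths: Sequence[str] = ()) -> List[str]:
    """Place the pattern after '--' so rg stops option parsing first."""
    return ["rg", "--line-number", "--", pattern, *paths]
```

Because these lists are handed to the process as argv (no shell involved), the only injection surface is option parsing inside the engine, which is what -e and -- close off.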

Search-tool rg source-level skip pruning

The --glob '!<dir>' flags are now passed to rg directly so
heavyweight directories (node_modules, build, target, …) are
excluded by rg itself rather than post-filtered out of its output.
This matters most in the non-git fallback path where there is no
.gitignore to lean on.

Negative globs are appended after the positive file_pattern
glob because rg gives later globs precedence — a regression test
locks in the ordering. _effective_skip_dirs opt-in semantics are
preserved: a caller who explicitly references node_modules in
their query still searches it.
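
The ordering rule can be illustrated with a short sketch. The --glob flag and the '!dir' form are as described above; the default directory list, the substring-based opt-in check, and both helper names are assumptions, and the real _effective_skip_dirs may decide opt-in differently.

```python
from typing import List, Optional, Sequence

DEFAULT_SKIP_DIRS = ("node_modules", "build", "target")  # assumed subset


def effective_skip_dirs_sketch(query: str, skip_dirs: Sequence[str] = DEFAULT_SKIP_DIRS) -> List[str]:
    """Drop a skip dir when the caller names it (crude substring check)."""
    return [d for d in skip_dirs if d not in query]


def rg_glob_args_sketch(file_pattern: Optional[str], query: str) -> List[str]:
    args: List[str] = []
    if file_pattern:
        args += ["--glob", file_pattern]  # positive glob first
    for skip_dir in effective_skip_dirs_sketch(query):
        # Negative globs go last so they take precedence over the positive one.
        args += ["--glob", f"!{skip_dir}"]
    return args
```

With the negative globs last, a file_pattern such as '*.js' cannot re-include files under node_modules.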

Edit unicode-fuzzy tier-3

EditTool matching now has three tiers:

  1. Byte-exact: identical bytes, identical line offsets.
  2. Whitespace-flex: leading/trailing whitespace and run-length
    normalized.
  3. Unicode-fuzzy (new): typographic codepoints normalized to
    ASCII before comparison: smart quotes (“ ” ‘ ’) → " and ',
    em-/en-dash (— –) → -, NBSP (U+00A0) → space, ideographic
    space (U+3000) → space.

Tiers 1 and 2 hit first; tier 3 only runs on a miss. The shared
_line_window_matches / _apply_match helpers preserve CRLF byte
offsets, and replace_all spans every normalized-equivalent
occurrence.

Mirrors git apply fuzzy behaviour. New
tests/test_edit_unicode_fuzzy.py covers tier-3 hits, replace_all
across mixed dash variants, CRLF preservation, and tier-precedence
(byte-exact and whitespace-flex must hit before tier 3).
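
The tier ordering can be sketched with a translation table. This is an illustration only, not EditTool's implementation: the table covers just the codepoints listed above, the whitespace-flex rule is simplified to ASCII spaces and tabs, and match_tier_sketch is a hypothetical name.

```python
import re

# One codepoint maps to one ASCII character, so string indices are preserved.
_TYPOGRAPHIC_TO_ASCII = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # double smart quotes
    "\u2018": "'", "\u2019": "'",   # single smart quotes
    "\u2014": "-", "\u2013": "-",   # em dash, en dash
    "\u00a0": " ", "\u3000": " ",   # NBSP, ideographic space
})


def normalize_typography(text: str) -> str:
    return text.translate(_TYPOGRAPHIC_TO_ASCII)


def _whitespace_flex(text: str) -> str:
    # ASCII-only on purpose, so NBSP handling stays in tier 3.
    lines = (re.sub(r"[ \t]+", " ", ln).strip(" \t") for ln in text.split("\n"))
    return "\n".join(lines)


def match_tier_sketch(haystack: str, needle: str):
    """Return 1, 2, or 3 for the first tier that matches, else None."""
    if needle in haystack:
        return 1
    if _whitespace_flex(needle) in _whitespace_flex(haystack):
        return 2
    if normalize_typography(needle) in normalize_typography(haystack):
        return 3
    return None
```

Keeping every mapping one-to-one means a match position found in the normalized text lines up with the same character position in the original, which keeps it simple to apply the edit to the original text.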

What did not change

  • No public API or wire-format change. agentao.host Pydantic
    models, the host.events.v1.json / host.acp.v1.json schemas,
    and the Agentao(...) constructor signature are unchanged from
    0.4.3.
  • No required code change to upgrade. pip install -U agentao
    is the only step.
  • No CLI command rename. Stop / PreCompact hooks ship as
    plugin-hook events; the CLI surface is unchanged.
  • The agentao.harness deprecated alias is still alive. Its
    removal stays scheduled for 0.5.0.

Tests

2549 passed, 2 skipped, 9 deselected under
AGENTAO_TEST_LIVE_MODELS=0 AGENTAO_TEST_LIVE_LLM=0 (CI's offline
mode). The strict typing gate (mypy --strict --package agentao.host)
and the schema drift gate (scripts/write_host_schema.py --check,
scripts/write_replay_schema.py --check) are both green.

The Stop / PreCompact pass adds 22 test files (12 from Phase A, 8
new + 5 modified from Phase B) routed through shared
tests/support/stop_precompact.py helpers
(write_json_emitting_hook, write_exit_code_hook,
make_runner_with_stub_llm, dispatch_stop_with_json_payload).
Parser-table cases call _parse_stop_command_output directly to
avoid platform-fragile chmod+shebang dependencies.

Upgrade

pip install -U agentao

Hosts that already register plugin-hook rules can add Stop and
PreCompact entries to their hooks.json — Claude Code's flat
snake_case schema is the wire shape, so an existing Claude Code
hook script for these two events runs unchanged.

Out of scope (deferred)

  • PreCompact gate. Claude Code supports blocking PreCompact via
    exit 2 / decision: "block". Agentao keeps PreCompact
    observe-only in 0.4.4 — the documented compatibility gap is
    tracked in STOP_PRECOMPACT_HOOKS_PLAN.md §B5 and is not
    labelled "deferred" in the plan because the design choice was
    intentional, not a slip. Reopen if a real workload needs it.
  • http-type Stop hooks. "http" is in
    KNOWN_UNSUPPORTED_HOOK_TYPES; the parser warns and skips. Hosts
    that need HTTP-callback Stop hooks must wait for an Agentao
    HTTP-hook runner.
  • Plugin-hook events in the host public model. Agentao.events()
    still does not surface PLUGIN_HOOK_FIRED. Tracked in
    PUBLIC_EVENT_PROMOTION_PLAN.md.
  • Hook attachment pipeline. _dispatch_lifecycle returns
    list[HookAttachmentRecord] but every existing call site discards
    it — there is no shared "attach to turn" wiring. Surfacing
    attachments to the conversation / replay layer is a cross-cutting
    follow-up tracked separately as
    PLUGIN_HOOK_ATTACHMENT_PIPELINE_PLAN.
  • bashlex-based supersedence of the workspace-write
    sensitive-write preset's regex tier.
    Carried over from 0.4.3.
  • agentao.harness alias removal — still scheduled for 0.5.0.
  • docs/releases/v0.4.0.md and v0.4.1.md — backfilling these
    remains deferred.