refactor(agents): remove YAML manifest agent support (#912) #914
Consolidate custom agents on a single definition format: Python `agent.py`. Deletes ~200 lines of dynamic `type()`-based class construction, Pydantic manifest validation, and per-agent MCP-config merging from the registry. The `agent.yaml` companion sidecar (used to declare a `models:` preference list next to a Python agent) is unchanged.

Behaviour
- `_load_from_dir` now loads only `agent.py`. A directory containing only `agent.yaml` (no sibling `agent.py`) emits a `DeprecationWarning` and is skipped: `~/.gaia/agents/<id>/agent.yaml` as the sole agent definition is no longer loaded.
- Companion-YAML guard: when both files exist, only `models:` is read. If the sidecar carries legacy manifest fingerprint keys (`manifest_version`, `tools`, `instructions`, `mcp_servers`, `id`), a `DeprecationWarning` enumerates the ignored keys instead of silently discarding them.
- Companion `models:` is now type-checked: a scalar value is rejected with a logger warning and never leaks into the registration as a string.
- `AgentRegistration.source` and `AgentInfo.source` narrowed to `Literal["builtin", "custom_python"]`.
- Importer rejects bundles whose entry directories ship YAML-only or no `agent.py` at all, with an actionable `ImportResult.errors` message instead of silently moving an unloadable directory into place. Exporter drops the YAML-only branch from `_is_custom_agent_dir`.

Removed
- `AgentManifest` Pydantic model and its `validate_tools` / `validate_id` validators.
- `_load_manifest_agent`, `_create_manifest_agent_class`, and `_write_merged_mcp_config` from `AgentRegistry`.
- `schemas/agent-manifest.schema.json` and `util/gen_manifest_schema.py` (and the corresponding lint check in `util/check_agent_conventions.py`).
- All `tests/unit/agents/test_registry.py` cases that exercised the manifest loader.

Tests
- New `TestYamlOnlyDeprecation`, `TestCompanionYamlGuard`, and `TestRemovedSymbols` classes in `tests/unit/agents/test_registry.py`, plus a `sys.modules` cleanup fixture so `importlib`-based loads cannot leak across tests.
- Two new bundle-import tests in `tests/unit/test_export_import.py` covering legacy YAML-only bundles and bundles missing `agent.py` entirely.

Docs
- `docs/guides/custom-agent.mdx`: added a "Migrating from a YAML Manifest" section with a field-by-field translation table and a worked example.
- `docs/spec/plugin-registry.mdx` and `docs/sdk/patterns.mdx`: rewritten to remove manifest-format coverage.
- `docs/releases/v0.17.5.mdx`: new release-notes entry flagging the breaking change.

Closes #912
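The two deprecation paths described under Behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in `registry.py`: the helper names `should_load` and `ignored_legacy_keys` are hypothetical, and the sidecar is assumed to be already parsed into a dict so the sketch does not depend on a YAML library.

```python
import tempfile
import warnings
from pathlib import Path

# Legacy manifest fingerprint keys, as listed in the PR description.
LEGACY_KEYS = ("manifest_version", "tools", "instructions", "mcp_servers", "id")


def should_load(agent_dir: Path) -> bool:
    """Return True only when the directory has an agent.py (hypothetical helper)."""
    if (agent_dir / "agent.py").is_file():
        return True
    if (agent_dir / "agent.yaml").is_file():
        # YAML-only directory: warn loudly and skip it.
        warnings.warn(
            f"{agent_dir}: YAML-only agents are no longer loaded; convert to agent.py",
            DeprecationWarning,
            stacklevel=2,
        )
    return False


def ignored_legacy_keys(sidecar: dict) -> list[str]:
    """Warn about legacy keys found in an already-parsed companion agent.yaml."""
    ignored = [key for key in LEGACY_KEYS if key in sidecar]
    if ignored:
        warnings.warn(
            "companion agent.yaml: ignoring legacy manifest keys: " + ", ".join(ignored),
            DeprecationWarning,
            stacklevel=2,
        )
    return ignored
```

Enumerating the ignored keys in the message gives the user something concrete to delete or migrate, which is the point of warning instead of discarding silently.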
The Verify external URLs check fails because the linked anchor lives in this PR's docs/guides/custom-agent.mdx and is not on the live amd-gaia.ai site yet. Use a Mintlify-internal relative link, which the internal cross-reference check (already passing) validates without hitting the live site.
Release notes for v0.17.5 will land in the release PR rather than this feature PR, to avoid merge conflicts with sibling milestone work.
Summary
Clean, well-scoped removal of YAML-manifest agent support. The registry change is small and surgical (~−276 lines), the migration path is well-documented in
Issues Found
🟡 Important: Stale guidance in onboarding docs still recommends the removed YAML format
These files are not in the diff but actively contradict the PR. Both repo contributors and the
This is the same kind of doc/code drift the PR's regression test (
🟢 Minor: Obsolete subsections in code-index docs
🟢 Minor: Stale docstring entry in convention linter
(Drop the
🟢 Minor: Export silently drops YAML-only dirs; users get no migration prompt from that path
Suggest emitting a single
🟢 Minor: Non-string entries in companion
@claude can you evaluate the pros/cons of removing yaml support? I do think there's space for both where you implement the programmatic portions of an agent in a python file while using yaml for simple configurations of the agent. analyze what is cleanest and architecturally sound.
@kovtcharov-amd, good question, and worth pulling apart. Two things are getting bundled in the term "YAML support" and they have different tradeoffs.
What this PR removed vs. what it kept
The deletion is specifically the YAML-as-class-definition path, the one that did dynamic class construction, MRO stitching,
Pros / cons of going further on the hybrid
Pros of expanding the sidecar (YAML-for-config):
Cons / what bit us last time:
Architectural recommendation
I'd argue the cleanest position is what we have right now, with a clear rule for the sidecar:
Fields that should not drift back into YAML:
Concrete next step if you want to push further on the hybrid: when
So: keep this PR as-is, treat the
…#918)

## Summary

Follow-up to [amd#914](amd#914). The github-actions review-bot left a batch of doc-cleanup suggestions that landed after merge: onboarding files (`CLAUDE.md`, `.github/workflows/claude.yml`, `.claude/agents/*.md`) and a couple of SDK doc subsections still told contributors to register agents via YAML manifests, directly contradicting the Python-only registry shipped in amd#914. This is the same drift class the [`TestRemovedSymbols`](https://github.com/amd/gaia/blob/main/tests/unit/agents/test_registry.py) regression guard exists to prevent; it just didn't extend to prose docs and `.claude/agents/`.

Also folds in the two silent-drop polish items the bot flagged: companion YAML `models:` filtering and `export_custom_agents` skipping legacy YAML-only directories.

## Threads

- **Onboarding/agent-doc cleanup**: `CLAUDE.md` (4 spots), `.github/workflows/claude.yml` (review-bot guidance), `.claude/agents/{gaia-agent-builder,sdk-architect,code-reviewer,architecture-reviewer,rag-specialist}.md`. Why: contributors and the `gaia-agent-builder` Claude agent read these as authoritative, so leaving them stale would steer new agents back into the deleted YAML path.
- **SDK doc subsections**: drops the dead "Opting in from a YAML-manifest agent" snippet in [docs/sdk/sdks/code-index.mdx](docs/sdk/sdks/code-index.mdx) and rewrites the equivalent line in [docs/guides/code-index.mdx](docs/guides/code-index.mdx) to describe Python-class composition. Why: the example showed a YAML format that no longer loads.
- **Linter docstring**: `util/check_agent_conventions.py` listed "Manifest JSON Schema is not stale" under hard checks, but the corresponding `_check_manifest_schema()` was deleted in amd#914. Drop the stale bullet.
- **Loud-failure polish**: [`registry.py`](src/gaia/agents/registry.py) now warns when companion YAML `models:` contains non-string entries instead of silently filtering them; [`export_import.py`](src/gaia/installer/export_import.py) emits one `log.warning` per skipped YAML-only directory during export so users get the same migration signal the discovery path already gives. Why: matches the "no silent fallback / actionable error" rule from CLAUDE.md. Two new test cases guard the warnings.

## Test plan

- [x] `python -m pytest tests/unit/agents/test_registry.py tests/unit/test_export_import.py -xvs`: 46 passed
- [x] `python util/check_agent_conventions.py`: 0 errors
- [x] `python util/lint.py --black --isort`: pass
- [x] `grep -rn "YAML-manifest agents can\|or YAML manifest\|registered via YAML manifest\|AgentManifest" CLAUDE.md .claude/ .github/ docs/sdk/ docs/guides/code-index.mdx util/`: zero hits (migration guide and release notes intentionally retained)

Refs amd#914.
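To make the companion `models:` handling concrete, here is a minimal sketch of the validation the text describes. The function name `normalize_models` and the logger name are hypothetical, and the sketch assumes non-string entries are still dropped after the warning is logged; the actual code in `registry.py` may differ.

```python
import logging

log = logging.getLogger("companion_yaml_sketch")  # hypothetical logger name


def normalize_models(value: object) -> list[str]:
    """Validate a parsed companion `models:` value (illustrative helper only)."""
    if value is None:
        return []
    if not isinstance(value, list):
        # A bare string would otherwise iterate character by character later on.
        log.warning("models: expected a list, got %s; ignoring", type(value).__name__)
        return []
    bad = [item for item in value if not isinstance(item, str)]
    if bad:
        # Warn instead of filtering silently, so the user sees what was dropped.
        log.warning("models: dropping non-string entries: %r", bad)
    return [item for item in value if isinstance(item, str)]
```

For example, `normalize_models("some-model")` returns an empty list with a warning, while `normalize_models(["a", 1, "b"])` keeps `["a", "b"]` and logs the dropped `1`.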
## Changes
This pull request introduces a new optional governance layer for GAIA
agents, providing action-level governance (ACGS-lite semantics) with
extension points for future workflow-level features. The governance
system is opt-in and does not affect existing agents unless explicitly
enabled. The changes include the addition of a new `gaia.governance`
package, a comprehensive example agent demonstrating governance
features, and detailed documentation to guide users. The governance
framework is modular, allowing developers to mix in governance
capabilities, tag tools with risk levels, and configure policy engines,
reviewers, and audit logging.
The most important changes are:
**New Governance Framework:**
* Added the `gaia.governance` package, introducing a modular governance
layer for GAIA agents. This includes the `GovernedAgentMixin`,
`GaiaGovernanceAdapter`, risk tagging decorators, and extension points
for policy engines, receipt services, and checkpoint runtimes.
* Implemented the `GaiaGovernanceAdapter` class, which composes policy
evaluation, checkpointing, receipt issuance, and policy version binding
into a single entry point. It ensures secure, auditable, and extensible
governance flows for agent tool calls.
* Provided an `action_mapper` utility to map GAIA tool calls into
governance action requests, standardizing how actions are represented
for policy evaluation.
**Documentation and Examples:**
* Added a comprehensive `README.md` for the `gaia.governance` package,
including quick start instructions, configuration options, security
properties, and extension points. This documentation enables developers
to quickly understand and adopt the governance system.
* Introduced a new example, `examples/governed_weather_agent.py`,
demonstrating how to wrap an agent with governance, define risk-tagged
tools, and handle governance decisions (ALLOW, BLOCK, REVIEW) with local
and MCP tools.
**Packaging:**
* Updated `setup.py` to include the new `gaia.governance` package in the
distribution, ensuring it is installed and available for import.
---
## Hardening & Polish (added in 4 follow-up commits)
Triggered by a PR-review pass that surfaced merge blockers and
architectural feedback. All concerns addressed without expanding feature
scope.
**Merge blockers fixed** — `f242e28 fix(governance): harden error
handling and align docs with additive tags`
* Tightened five `except Exception` sites that were silently swallowing
errors. The most important one (`_resolve_canonical_tool_name`) now logs
unexpected resolver errors with `exc_info=True` instead of falling
through silently. This closes the alias-bypass risk where governance
could check tags on the wrong key when the resolver had a bug. The other
four sites (`_lookup_tool_fn`, `_invoke_callback`, `_prompt_review`,
`JsonlReceiptService._read_all`) now use specific exception types and
log at WARNING.
* `_prompt_review` now returns `(approved, exception_or_None)` so
`_handle_review_checkpoint` can stamp the exception type and message
into the receipt's `metadata.evidence.resolution.reason` (`15bc40b`).
The audit log can now distinguish "reviewer chose no" from "reviewer
crashed" — previously both produced the same boilerplate `"reviewer
rejected"` reason.
* Documentation now matches the code: tag merge is **additive (union,
deduplicated)** — *not* "explicit dict wins". Updated README, the
`@govern` decorator's docstring, and the inline comment in
`mixin._build_action_request` to describe what the tests have always
asserted.
* `_canonical_hash` for BLOCK-receipt evidence now handles non-JSON tool
args, complex types, and cycles without falling back to `repr()`,
keeping receipts deterministically hashable across all inputs.
* `JsonlReceiptService.issue_receipt` now performs strict canonical JSON
validation at issue time, rejecting non-canonical metadata (NaN/Inf,
opaque objects) so tampered or unparseable receipts cannot land in the
audit log.
* Public docs registered: new `docs/sdk/sdks/governance.mdx` plus an
entry in `docs/docs.json` SDK navigation. Closes the missing-docs
blocker.
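
The additive tag-merge semantics noted above (union, deduplicated) can be illustrated with a small sketch. The function `merge_tag_tables` and the example tag names are hypothetical and are not taken from `gaia.governance`; the sketch only shows why an explicit empty dict cannot downgrade decorator tags.

```python
def merge_tag_tables(
    decorated: dict[str, list[str]], explicit: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Union the tags from both sources per tool, keeping first-seen order."""
    merged: dict[str, list[str]] = {}
    for table in (decorated, explicit):
        for tool, tags in table.items():
            slot = merged.setdefault(tool, [])
            for tag in tags:
                if tag not in slot:  # deduplicate without reordering
                    slot.append(tag)
    return merged
```

Because the merge only ever adds tags, passing `{}` as the explicit table returns the decorator tags unchanged, which matches the behaviour the renamed test asserts.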
**CI guard** — `2ed500d ci(test_api): cap job runtime at 30 minutes`
* The API Tests job had no `timeout-minutes` and was hanging for 4+
hours on the in-flight CI run for this PR. Added a 30-minute cap (covers
worst-case Lemonade boot + model pull + tests) so future runs fail fast
on hangs.
**Polish** — `ca941a9 refactor(governance): polish pass — drop dead
code, tighten lock, deep-copy tags`
Driven by a parallel three-agent review (code-reviewer +
architecture-reviewer + test-engineer):
* Deleted `workflow_mapper.py` and
`StaticPolicyBindingService.bind_receipt`. Both were "forward-compat
seams" with zero callers in src/, tests/, examples/, or docs/. They'll
come back in the PR that adds the real event surface, when the actual
signature is known. YAGNI.
* Tightened `JsonlReceiptService.get_receipt`: cache reads/writes were
unsynchronized while a concurrent `issue_receipt` was mutating the same
dict under `_lock`. Both paths are now under the lock.
* `GovernedAgentMixin.__init__` now deep-copies inner risk-tag lists so
a caller cannot mutate the agent's tag table after construction by
holding onto the original list reference.
* Added a comment on the `bool`-before-`int` ordering in
`_canonical_json_value` (subclass relationship — without the order,
`True` would canonicalize as `1`).
* Debug breadcrumb on receipt-log malformed-line skips, so an operator
chasing a missing receipt has something to grep.
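
The `bool`-before-`int` ordering mentioned above matters because `bool` is a subclass of `int` in Python. The sketch below is not the body of `_canonical_json_value`; `canonical_scalar` and `wrong_order_scalar` are hypothetical helpers that assume integers are normalized with `int(...)`, which is where the ordering starts to matter.

```python
import json


def canonical_scalar(value):
    """Canonicalize a bool or int, checking bool first."""
    if isinstance(value, bool):
        return value  # stays a JSON boolean
    if isinstance(value, int):
        return int(value)  # normalizes int subclasses to a plain int
    raise TypeError(f"unsupported type: {type(value).__name__}")


def wrong_order_scalar(value):
    """Same checks in the wrong order, to show the failure mode."""
    if isinstance(value, int):
        return int(value)  # True matches here because bool subclasses int
    if isinstance(value, bool):  # never reached for bool inputs
        return value
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

With the correct order, `json.dumps(canonical_scalar(True))` serializes as `true`; with the checks swapped, the same input serializes as `1`, so two receipts that differ only in `True` versus `1` would hash identically.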
**Test additions** — `5cdfee5 test(governance): cover hardened error
paths and fail-closed branches`
Added 6 new tests covering branches that had no regression guard:
* `test_resolver_unexpected_exception_logs_and_governs_raw_name` —
proves a buggy `_resolve_tool_name` raising RuntimeError still triggers
governance on the raw name AND emits an operator-visible warning. Future
regression where the warning is swapped for a silent fallback fails this
test.
* `test_resolver_lookup_error_is_silent_and_governs_raw_name` — proves
the expected "tool not in registry" case (`LookupError`) is absorbed
silently with no log noise.
* `test_unknown_transition_outcome_fails_closed` — proves a custom
`CheckpointRuntime` returning a status the mixin doesn't know is denied,
not let through.
* `test_handle_transition_rejects_unknown_decision_type` — same idea at
the adapter layer for an unknown `GovernanceDecision.decision`.
* `test_read_all_skips_malformed_lines` — proves a corrupt line in the
middle of an audit log doesn't block readers from finding subsequent
valid records.
* Existing callback-exception and reviewer-exception tests gained
`caplog` assertions so a future silent-swallow regression is caught.
Plus two readability fixes: renamed
`test_explicit_dict_overrides_decorated_tags` →
`test_explicit_empty_dict_does_not_downgrade_decorator_tags` (the body
asserted additive semantics, the old name said the opposite); replaced
hardcoded `"test_governance_adapter.SlotOnlyEvidence"` qualname strings
with `f"{Cls.__module__}.{Cls.__qualname__}"` so the tests survive a
file rename.
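
As a small illustration of the qualname change, the helper below builds the dotted name from the class object itself. `qualified_name` is a hypothetical name; only the f-string expression comes from the text above.

```python
from collections import OrderedDict


def qualified_name(cls: type) -> str:
    """Build 'module.QualName' from the class instead of hardcoding it."""
    return f"{cls.__module__}.{cls.__qualname__}"


class Outer:
    class Inner:
        """Nested class: __qualname__ includes the enclosing class."""
```

Since the module part is read from the class at runtime, renaming the test file changes the computed string and the expected string together, so the assertion keeps passing.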
**Verification (fresh evidence at HEAD `15bc40b`)**
* Governance test suite: **67 passed** (was 27 before the polish — added
5 from the in-flight strict-evidence work and 6 from the polish review).
* `python util/lint.py --black --isort`: PASS.
* No dead code residue: `git grep` of `workflow_mapper`,
`map_gaia_event_to_transition`, `bind_receipt` returns zero matches.
* Public-import smoke test: `GaiaGovernanceAdapter.default()` constructs
with the four expected components.
* Broader unit tests (excl. `tests/unit/chat/` which needs the optional
`[ui]` extra): **946 passed, 16 skipped** — no regressions introduced.
* Upstream merge of `amd:main` (10+ commits including the
YAML-manifest-removal refactor `amd#914`) is incorporated. `_TOOL_REGISTRY`
survived that refactor; governance imports remain green.
**Items intentionally not in this PR** (deferred for follow-up):
* `Agent.__init__` accepting `**kwargs` so multi-mixin composition
(`MCPAgent + GovernedAgentMixin + ApiAgent`) doesn't trip on closed
signatures — touches `agents/base/agent.py` and is a separate concern.
* Public accessor for `_TOOL_REGISTRY` to replace the
`gaia.agents.base.tools._TOOL_REGISTRY` private import in
`mixin._lookup_tool_fn`.
* Extracting `_canonical_hash` and `_canonical_json_value` to a public
`gaia.governance.canonical` module so any conforming
`ReceiptServiceProtocol` can verify or recompute hashes independently.
* `default()` accepting component overrides for `policy_engine`,
`receipt_service`, `checkpoint_runtime`, `policy_binding` so third
parties can swap engines without forgoing the factory.
These are good ideas that expand public API surface and belong in a
focused follow-up PR rather than bundled into this merge.
---
## Governance REVIEW + existing confirmation path
Follow-up for PR review 4197475871: this PR takes Path A. Governance
remains an opt-in policy layer, but REVIEW decisions now reuse GAIA
Agent UI confirmation when the active console advertises
`blocking_confirmation = True` (`SSEOutputHandler`). An explicit
`governance_reviewer` still takes precedence for non-UI or custom
approval flows, and default `AgentConsole` remains fail-closed because
its confirmation method auto-approves.
Regression coverage added:
* Blocking-console fallback: governance REVIEW delegates to
`console.confirm_tool_execution` only for consoles marked
`blocking_confirmation = True`.
* Agent UI path: a governance-tagged REVIEW tool with `SSEOutputHandler`
emits the existing `permission_request` event and runs only after
approval.
* Default-console safety: unmarked consoles are not treated as implicit
reviewers, preserving fail-closed behavior.
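
The precedence described in this section (explicit reviewer first, then a console marked as blocking, otherwise deny) can be sketched as below. This is an assumption-laden illustration: `resolve_review` and both console classes are hypothetical stand-ins, and the `(tool_name, tool_args)` signature for `confirm_tool_execution` is assumed rather than taken from GAIA.

```python
from typing import Callable, Optional

Reviewer = Callable[[str, dict], bool]


class AutoApproveConsole:
    """Stand-in for a default console: approves everything, not marked blocking."""

    def confirm_tool_execution(self, tool_name: str, tool_args: dict) -> bool:
        return True


class BlockingConsole:
    """Stand-in for a UI console that waits for a real user decision."""

    blocking_confirmation = True

    def __init__(self, answer: bool) -> None:
        self.answer = answer

    def confirm_tool_execution(self, tool_name: str, tool_args: dict) -> bool:
        return self.answer


def resolve_review(
    tool_name: str,
    tool_args: dict,
    console: object,
    governance_reviewer: Optional[Reviewer] = None,
) -> bool:
    """Decide a REVIEW outcome; anything unrecognized is denied."""
    if governance_reviewer is not None:
        # An explicit reviewer always takes precedence.
        return bool(governance_reviewer(tool_name, tool_args))
    if getattr(console, "blocking_confirmation", False) is True:
        # Only consoles that advertise blocking confirmation may approve.
        return bool(console.confirm_tool_execution(tool_name, tool_args))
    return False  # fail closed: an auto-approving console is not a reviewer
```

The `getattr` default of `False` is what keeps unmarked consoles out of the approval path, even though their confirmation method would return `True`.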
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: dislovelhl <dislovelhl@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary

Consolidates custom agents on a single definition format: Python `agent.py`. Deletes ~200 lines of dynamic `type()`-based class construction, Pydantic manifest validation, and per-agent MCP-config merging from `src/gaia/agents/registry.py`. Unblocks upcoming external-service work (OAuth-bound integrations) where capability declarations like `REQUIRED_CONNECTIONS` need a single class-attribute home the Agent UI can introspect.

The `agent.yaml` companion sidecar (declares `models:` next to a Python agent) is unchanged.

Closes #912

Why this matters

- `type()` builder, so every agent is a regular Python class readable by mypy, IDEs, and `git grep`.
- `registry.py`; the dynamic class builder, `_TOOL_REGISTRY.clear()` race risk, and the merged-MCP-config disk write are all gone.

Behaviour

- `_load_from_dir` loads only `agent.py`. A directory containing only `agent.yaml` (no sibling `agent.py`) emits a `DeprecationWarning` and is skipped.
- When both `agent.py` and `agent.yaml` exist, only `models:` is read. Legacy manifest fingerprint keys (`manifest_version`, `tools`, `instructions`, `mcp_servers`, `id`) trigger a `DeprecationWarning` enumerating the ignored keys instead of silently discarding them. A scalar `models:` value is rejected with a logger warning rather than leaking through as a string that would later iterate as characters.
- `AgentRegistration.source` (dataclass) and `AgentInfo.source` (Pydantic) are now `Literal["builtin", "custom_python"]`. Pydantic enforces at the API boundary.
- Bundles whose entry directories ship YAML-only or no `agent.py` at all surface an actionable `ImportResult.errors` message rather than silently moving an unloadable directory into `~/.gaia/agents/`. Exporter drops the YAML-only branch from `_is_custom_agent_dir`.

Removed

- `AgentManifest` Pydantic model and its `validate_tools` / `validate_id` validators.
- `_load_manifest_agent`, `_create_manifest_agent_class`, `_write_merged_mcp_config` from `AgentRegistry`.
- `schemas/agent-manifest.schema.json` and `util/gen_manifest_schema.py` (and the corresponding lint check in `util/check_agent_conventions.py`).
- `tests/unit/agents/test_registry.py` cases that exercised the manifest loader.

Pre-merge verification

- `_register_builtin_agents` are Python-only.
- `src/gaia/agents/builder/template.py` `generate_agent_source()` returns Python source for all three render paths; `tests/unit/agents/test_builder_agent.py` already asserts no `agent.yaml` is written. No BuilderAgent changes required.
- `~/.gaia/agents/*/agent.yaml`.

Test plan

- `python -m pytest tests/unit/agents/test_registry.py tests/unit/test_export_import.py -xvs`: 45 passed
- `python -m pytest tests/unit/ -x`: 1642 passed, 15 skipped (3 pre-existing `chat/ui/test_sse_confirmation.py` flakes verified independent of this PR)
- `python util/lint.py --all --fix`: 6/6 quality checks pass
- `grep -rn "AgentManifest\|_load_manifest_agent\|_create_manifest_agent_class\|_write_merged_mcp_config\|custom_manifest" src/ tests/ docs/`: no residual references outside the deprecation-warning string, the regression-guard test, and the release-notes entry
- `python -c "import json; json.load(open('docs/docs.json'))"`: Mintlify nav still valid JSON

Migration

Users with `~/.gaia/agents/<id>/agent.yaml` manifests need to convert to `agent.py`. The release notes link to a new "Migrating from a YAML Manifest" section in `docs/guides/custom-agent.mdx` with a field-by-field translation table and a worked example. The fastest path is the Gaia Builder Agent (`gaia chat --ui` → + → Build a Custom Agent), which scaffolds a working `agent.py`.
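
The `source` narrowing described under Behaviour is enforced by Pydantic on the API model; a plain dataclass does not check a `Literal` annotation at runtime. The sketch below shows one way a dataclass could enforce it with the standard library alone. The `Registration` class and `AgentSource` alias are hypothetical, and this is not how `AgentRegistration` is known to be implemented.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The two values named in the PR description.
AgentSource = Literal["builtin", "custom_python"]


@dataclass
class Registration:
    """Hypothetical stand-in for a registration record."""

    agent_id: str
    source: AgentSource

    def __post_init__(self) -> None:
        # Dataclasses do not validate annotations, so check explicitly.
        allowed = get_args(AgentSource)
        if self.source not in allowed:
            raise ValueError(f"source must be one of {allowed}, got {self.source!r}")
```

With this check, constructing a record with any other source string raises `ValueError` at creation time instead of surfacing later in the registry.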