feat: OpenAI and Anthropic tool-format adapters with middleware (#55, #50, #40) by dgenio · Pull Request #69 · dgenio/agent-kernel

dgenio · 2026-05-13T10:49:19Z

What changed

Adds agent_kernel.adapters with two drop-in middleware classes that translate Capability objects into vendor tool schemas, route tool calls through the full kernel pipeline (grant → invoke → firewall → trace), and return vendor-shaped tool-result objects.

File	Change
`src/agent_kernel/adapters/__init__.py`	New — public adapter exports.
`src/agent_kernel/adapters/_base.py`	New — `BaseToolMiddleware` (hook registration + dispatch, request/grant/invoke flow, error-as-result), `ToolCallEvent` / `ToolResultEvent` dataclasses, schema generation (`build_input_schema`, `normalize_for_openai_strict`, `validate_input`), namespace helpers, canonical `frame_to_payload` / `error_to_payload`.
`src/agent_kernel/adapters/openai.py`	New — `OpenAIMiddleware`, `capabilities_to_tools`, `tool_call_to_request`, `format_result`. Supports both Responses API and Chat Completions; auto-detects input shape; dotted IDs ↔ `namespace__function`.
`src/agent_kernel/adapters/anthropic.py`	New — `AnthropicMiddleware`, `capabilities_to_tools` with per-capability and middleware-default `cache_control`, `tool_use_to_request`, `format_result`.
`src/agent_kernel/models.py`	Adds `ToolHints` dataclass and three optional fields on `Capability`: `parameters_model` (pydantic), `parameters_schema` (raw JSON Schema escape hatch), `tool_hints` (vendor flags). All default to `None`.
`src/agent_kernel/kernel.py`	Adds `Kernel.list_capabilities()` accessor (used by adapters; generally useful).
`src/agent_kernel/__init__.py`	Re-exports `OpenAIMiddleware`, `AnthropicMiddleware`, `ToolHints`.
`tests/test_adapters.py`	New — 57 tests across schema conversion, round-trip, middleware flow, hook ordering (sync + async), abort, justification injection, validation errors, error-as-result paths.
`pyproject.toml`	Adds `pydantic>=2` runtime dependency.
`docs/integrations.md`	New "LLM tool-format adapters" section with OpenAI + Anthropic usage examples, namespace mapping, strict mode, cache control, hooks, error handling.
`docs/architecture.md`	New "Adapters" component bullet.
`AGENTS.md`	Updates dep list to `httpx` + `pydantic`.
`CHANGELOG.md`	`[Unreleased]` entries under `Added` and `Changed`.
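
The input auto-detection and `namespace__function` mapping noted for `openai.py` above could look roughly like the sketch below. It is a self-contained illustration, not the code in this PR: `parse_tool_call` and `NAMESPACE_SEP` are hypothetical names, and the two payload shapes are assumed to be Chat Completions `tool_calls` entries and Responses API `function_call` items.

```python
import json
from typing import Any

NAMESPACE_SEP = "__"  # assumed separator: the PR maps "." to a double underscore


def restore_namespace(safe_name: str) -> str:
    """Map ``billing__list_invoices`` back to ``billing.list_invoices``."""
    return safe_name.replace(NAMESPACE_SEP, ".")


def parse_tool_call(tool_call: dict[str, Any]) -> tuple[str, str, dict[str, Any]]:
    """Return (call_id, capability_id, arguments) for either OpenAI shape."""
    function = tool_call.get("function")
    if isinstance(function, dict):
        # Chat Completions: name and arguments are nested under "function".
        call_id = tool_call.get("id", "")
        name = function.get("name")
        raw = function.get("arguments")
    else:
        # Responses API: name and arguments sit at the top level.
        call_id = tool_call.get("call_id", "")
        name = tool_call.get("name")
        raw = tool_call.get("arguments")
    if not isinstance(name, str) or not name:
        raise ValueError("tool call is missing a function name")
    if isinstance(raw, str):
        # Arguments usually arrive as a JSON string; an empty string means no args.
        raw = json.loads(raw) if raw.strip() else {}
    if not isinstance(raw, dict):
        raise ValueError(f"arguments must decode to an object, got {type(raw).__name__}")
    return call_id, restore_namespace(name), raw
```

Note that in this sketch a segment that ends with an underscore, or that itself contains `__`, would not survive the round trip; whether the real helpers guard against that is not shown here.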

Why

Closes #55, #50, #40. Developers using OpenAI / Anthropic APIs previously had to hand-translate between Capability objects and each vendor's tool schema and stitch the call/result loop through grant_capability / invoke manually. The new middleware eliminates that boilerplate while preserving every kernel invariant — every call still goes through grant + token + firewall + trace, and per-call tokens prevent cross-principal reuse (I-06). PolicyDenied, CapabilityNotFound, DriverError, argument-validation failures, and hook abort signals all surface as tool-result errors so the surrounding agent loop never crashes.
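
To make the error-as-result idea concrete, here is a minimal sketch. The exception classes are local stand-ins for the kernel's own types, `call_as_tool_result` is a hypothetical helper, and the JSON payload layout is an assumption (the PR's `error_to_payload` defines the real one). The envelope uses the Responses API `function_call_output` item shape.

```python
import json
from typing import Any, Callable


class AgentKernelError(Exception):
    """Stand-in for the kernel's base error type."""


class PolicyDenied(AgentKernelError):
    """Stand-in: the policy engine refused the request."""


class DriverError(AgentKernelError):
    """Stand-in: the driver failed after any fallback."""


def call_as_tool_result(call_id: str, invoke: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run ``invoke`` and always return a ``function_call_output`` item."""
    try:
        payload: dict[str, Any] = {"ok": True, "result": invoke()}
    except AgentKernelError as exc:
        # Assumed error payload: the model sees the failure instead of the loop crashing.
        payload = {"ok": False, "error": type(exc).__name__, "message": str(exc)}
    return {"type": "function_call_output", "call_id": call_id, "output": json.dumps(payload)}
```

Only kernel errors are converted in this sketch; anything else still propagates, which keeps genuine programming bugs visible.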

Design decisions

Pydantic for schema source. Capability.parameters_model (pydantic) is the canonical input-schema source; parameters_schema (raw dict) is the escape hatch. allowed_fields is left alone — it remains an output redaction control consumed by the firewall and is deliberately not used to advertise input shape (which it was never designed to describe).
Both OpenAI shapes. Input shape is auto-detected; Chat Completions output is opt-in via format="chat_completions". The default output is the Responses API shape, since issue #55 (OpenAI tool-format adapter & middleware) names function_call_output.
Namespace mapping (OpenAI only). Dots → __ (double underscore) so capability IDs with underscores in segments round-trip unambiguously. Anthropic preserves dotted IDs.
Strict mode is opt-in per capability via ToolHints(strict=True). The adapter walks the pydantic-emitted schema and, on every object, marks all properties as required and sets additionalProperties: false. It falls back to non-strict with a warning if normalisation raises.
Hooks are sync or async callables (auto-detected). Pre-hooks may mutate args, inject justification (for WRITE/DESTRUCTIVE policy), or set aborted=True. Post-hooks observe (or replace) the resulting Frame. Hook exceptions during pre-invoke become tool errors; exceptions during post-invoke are logged but never crash the batch.
Justification flow. Three layers: (1) a handle_tool_calls(..., justification="") batch parameter, (2) per-call override via args["_justification"], (3) hook-injected via event.justification. The simplest path works for READ tools; WRITE tools can supply justification via any of the three.
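
The hook and justification decisions above can be sketched together. This is an illustration under stated assumptions, not the PR's implementation: the `ToolCallEvent` here is a simplified stand-in, `run_pre_hooks` is a hypothetical name, and two behaviours are assumed rather than confirmed by the description (a hook-set justification wins because hooks run last, and remaining hooks are skipped after an abort).

```python
import asyncio
import inspect
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Union


@dataclass
class ToolCallEvent:
    """Simplified stand-in for the PR's ToolCallEvent."""

    capability_id: str
    args: dict[str, Any] = field(default_factory=dict)
    justification: str = ""
    aborted: bool = False


PreHook = Callable[[ToolCallEvent], Union[None, Awaitable[None]]]


async def run_pre_hooks(
    event: ToolCallEvent, hooks: list[PreHook], batch_justification: str = ""
) -> ToolCallEvent:
    # Layers 1 and 2: start from the batch value; a per-call "_justification"
    # argument overrides it and is removed from the arguments.
    event.justification = event.args.pop("_justification", batch_justification)
    for hook in hooks:
        outcome = hook(event)
        if inspect.isawaitable(outcome):  # async hooks hand back an awaitable
            await outcome
        if event.aborted:
            break  # assumed: later hooks are skipped once one aborts
    # Layer 3: any hook that assigned event.justification has already won.
    return event
```

A sync hook and an async hook can then be mixed in one list, for example a plain function that fills in a missing justification for `billing.*` calls next to an `async def` that sets `aborted = True` for delete operations.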

How verified

ruff format --check src/ tests/ examples/  → 45 files already formatted
ruff check src/ tests/ examples/           → All checks passed
mypy src/                                  → Success: no issues found in 27 source files
pytest -q --cov=agent_kernel               → 364 passed in 5.52s, 96% total coverage
pytest --cov=agent_kernel.adapters         → 57 passed, 98% adapter coverage
PYTHONIOENCODING=utf-8 python examples/basic_cli.py        → ✓
PYTHONIOENCODING=utf-8 python examples/billing_demo.py     → ✓
PYTHONIOENCODING=utf-8 python examples/http_driver_demo.py → ✓

(Local pytest baseline before this PR: 287 tests; after: 364, +77. New tests live entirely in tests/test_adapters.py. Total project coverage improved from 95% → 96%.)

Scope notes (Mode B)

New dependency: pydantic>=2. Justified by adapters using model_json_schema() + model_validate(). Updated AGENTS.md so future readers see the new dep set is httpx + pydantic.
Capability model extensions: three new fields, all default None. No existing fixture or test required updates. Backward-compatible.
Module sizes: _base.py 438 lines, openai.py 351, anthropic.py 268. The first two exceed AGENTS.md's 300-line guideline. Three existing modules already exceed it as well (policy.py 520, policy_dsl.py 503, kernel.py 466; see #68, "[policy/kernel] Tech debt: decompose policy_dsl.py and broaden dry-run driver test coverage"). I chose not to split because the helpers and middleware are tightly coupled (every middleware method calls into the helpers); splitting would add import gymnastics without clarifying semantic boundaries. Happy to factor _base.py into _base.py + _helpers.py if you'd prefer.
Not in scope (offered as follow-ups, not bundled): OTel instrumentation of the adapter layer (belongs in #38, "OpenTelemetry integration: spans, metrics, and trace export"); a change to Capability.parameters_model that would require pydantic everywhere; SDK-typed return values (the openai / anthropic packages would only buy IDE autocomplete that consumers can get themselves).

Risks

Pydantic version drift. Pydantic v2's model_json_schema() may emit Draft 2020-12 features that OpenAI strict mode doesn't accept. The normalize_for_openai_strict walker handles the two common issues (additionalProperties, required). OpenAI strict mode accepts $ref / $defs, so those are left alone. If a user supplies a schema feature that breaks normalisation, the adapter falls back to non-strict with a warning rather than raising.
At-most-once delivery, not at-least-once. The middleware never retries: if kernel.invoke raises a DriverError (after the kernel's own driver fallback), the call becomes a tool error. This matches the rest of the codebase — drivers handle retry internally where appropriate.
Hook ordering & state. Hook lists are not protected by a lock. If a middleware is shared across concurrent batches and intercept_* is called from one thread while a batch is dispatching from another, ordering is unspecified. The expected pattern is per-principal middleware instances, hooks registered at setup time — matching the issue body's OpenAIMiddleware(kernel, principal) shape.
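
For the first risk above, the strict-mode walker and its non-strict fallback might be sketched like this. It is an approximation written for illustration, not `normalize_for_openai_strict` itself: `normalize_strict` and `tool_parameters` are hypothetical names, and the set of keywords the walker descends into is an assumption.

```python
import copy
import warnings
from typing import Any


def _walk(node: Any) -> None:
    if not isinstance(node, dict):
        raise TypeError(f"expected a schema object, got {type(node).__name__}")
    props = node.get("properties")
    if isinstance(props, dict):
        node["required"] = list(props)  # every declared property becomes required
        node["additionalProperties"] = False
    elif node.get("type") == "object":
        node["required"] = []
        node["additionalProperties"] = False
    children: list[Any] = []
    for key in ("properties", "$defs"):
        if isinstance(node.get(key), dict):
            children.extend(node[key].values())
    if "items" in node:
        children.append(node["items"])
    for key in ("anyOf", "allOf", "oneOf"):
        children.extend(node.get(key, []))
    for child in children:
        _walk(child)


def normalize_strict(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a normalised copy; the caller's schema is left untouched."""
    out = copy.deepcopy(schema)
    _walk(out)
    return out


def tool_parameters(schema: dict[str, Any], strict: bool) -> tuple[dict[str, Any], bool]:
    """Return (parameters, strict_flag), falling back to non-strict on failure."""
    if not strict:
        return schema, False
    try:
        return normalize_strict(schema), True
    except Exception as exc:  # any normalisation failure downgrades, never raises
        warnings.warn(f"strict normalisation failed, emitting non-strict schema: {exc}")
        return schema, False
```

Because the walker works on a deep copy, a failure halfway through cannot leave the original schema partly rewritten, so the fallback can safely return it as-is.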

AI agent instruction files reviewed

AGENTS.md — updated dep set; no convention changes.
docs/agent-context/invariants.md — no change needed; adapters consume Frame post-firewall and route through kernel.invoke, so I-01, I-02, I-06 remain enforced by existing code paths.
docs/agent-context/review-checklist.md, lessons-learned.md, workflows.md — no change needed.
.github/copilot-instructions.md, .claude/CLAUDE.md — no change needed.

Checklist

make ci passes (fmt → lint → mypy strict → pytest → examples)
Docstrings match the final implementation
No dead code (all new parameters, helpers, and types exercised by tests)
Naming consistent: capability, principal, grant, Frame throughout (see docs/integrations.md)
Backward-compat: new Capability fields default to None; no existing test required updates
CHANGELOG.md updated under [Unreleased]
Updated canonical docs (docs/architecture.md, docs/integrations.md, AGENTS.md) in the same PR

🤖 Generated with Claude Code

…50, #40)

Adds `agent_kernel.adapters` with two drop-in middleware classes that translate Capability objects into vendor tool schemas, route tool calls through the full kernel pipeline (grant → invoke → firewall → trace), and return vendor-shaped tool-result objects. Both share a `BaseToolMiddleware` that owns hook registration, error-as-result conversion, and the canonical Frame → JSON payload shape.

OpenAIMiddleware emits Responses-API tools by default (also supports Chat Completions via `format=chat_completions`), with dotted capability IDs mapped to `namespace__function` form and OpenAI `strict` mode opt-in via `Capability.tool_hints`.

AnthropicMiddleware emits Anthropic Messages tools with optional `cache_control` (per-capability or middleware default) and preserves dotted capability IDs.

Both auto-detect Chat/Responses shape on input regardless of configured output format.

Capability gains three optional fields: `parameters_model` (pydantic model used for JSON-Schema generation and input validation), `parameters_schema` (raw JSON Schema escape hatch), and `tool_hints` (ToolHints, vendor flags). All default to None, preserving backward compat. Kernel gains a small `list_capabilities()` accessor.

Adds `pydantic>=2` as a runtime dep (justified by the new adapters; only used inside the adapters package). No `openai` / `anthropic` SDK dependency: every adapter function is a pure dict transform.

PolicyDenied, CapabilityNotFound, DriverError, argument-validation failures, and hook abort signals all surface as tool-result errors rather than raised exceptions so the LLM can react.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR introduces a new agent_kernel.adapters package providing OpenAI and Anthropic “tool-format” adapters plus middleware that routes vendor tool calls through the kernel’s full pipeline (grant → invoke → firewall → trace), with schema generation/validation support via Pydantic.

Changes:

Added OpenAI + Anthropic adapter modules and a shared BaseToolMiddleware (hooks, dispatch, vendor-shape formatting, schema helpers).
Extended Capability with optional parameters_model, parameters_schema, and tool_hints (ToolHints) to drive tool schemas and optional strict/cache settings.
Added Kernel.list_capabilities() and updated docs/tests/changelog and runtime deps (pydantic>=2).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

File	Description
tests/test_adapters.py	New test suite covering schema conversion, middleware flow, hooks, aborts, and error-as-result behavior.
src/agent_kernel/models.py	Adds `ToolHints` and new optional `Capability` fields for adapter schema/validation/hints.
src/agent_kernel/kernel.py	Adds `Kernel.list_capabilities()` to enumerate registered capabilities.
src/agent_kernel/adapters/_base.py	New shared middleware base, hook/event types, schema helpers, payload helpers, namespace helpers.
src/agent_kernel/adapters/openai.py	New OpenAI tool schema conversion + middleware supporting Responses + Chat Completions formats.
src/agent_kernel/adapters/anthropic.py	New Anthropic tool schema conversion + middleware with optional `cache_control`.
src/agent_kernel/adapters/__init__.py	Public exports for adapter layer.
src/agent_kernel/__init__.py	Re-exports middlewares and `ToolHints` at top level.
pyproject.toml	Adds runtime dependency on `pydantic>=2`.
docs/integrations.md	Adds “LLM tool-format adapters” documentation and usage examples.
docs/architecture.md	Documents adapters as an architecture component.
AGENTS.md	Updates minimal dependency list to include `pydantic`.
CHANGELOG.md	Adds `[Unreleased]` entries describing the new adapter feature set and dependency change.

Comments suppressed due to low confidence (2)

src/agent_kernel/adapters/openai.py:197

Same as above: _parse_arguments raises ValueError for invalid argument types/JSON. For consistency with the repo’s error-contract rule in AGENTS.md, map these parse failures to a custom AgentKernelError subclass so callers can reliably catch agent-kernel errors (and so exception types are part of the contract).

    if not isinstance(raw, str):
        raise ValueError(
            f"OpenAI tool_call 'arguments' must be a JSON string or dict, got {type(raw).__name__}."
        )

src/agent_kernel/adapters/anthropic.py:128

Same issue here: raising ValueError for non-dict input violates the repo’s “no bare ValueError to callers” rule. If you add a custom adapter parse/validation exception, use it consistently for all adapter-facing shape errors.

    if raw_input is None:
        raw_input = {}
    if not isinstance(raw_input, dict):
        raise ValueError(
            f"Anthropic tool_use 'input' must be an object (got {type(raw_input).__name__})."
        )
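
One way to act on the two suppressed comments above is sketched below. `AgentKernelError` is defined locally as a stand-in for the repo's base class, and `AdapterInputError` and `coerce_tool_input` are hypothetical names chosen for illustration; the actual fix may look different.

```python
from typing import Any


class AgentKernelError(Exception):
    """Stand-in for the repo's base error class."""


class AdapterInputError(AgentKernelError):
    """Hypothetical error for malformed vendor tool-call payloads."""


def coerce_tool_input(raw_input: Any) -> dict[str, Any]:
    """Validate an Anthropic ``tool_use`` input, raising a kernel error on bad shape."""
    if raw_input is None:
        return {}
    if not isinstance(raw_input, dict):
        # Same message as before, but callers can now catch AgentKernelError.
        raise AdapterInputError(
            f"Anthropic tool_use 'input' must be an object (got {type(raw_input).__name__})."
        )
    return raw_input
```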

+    ``billing.list_invoices`` → ``billing__list_invoices``. The ``__`` separator
+    is reserved so the inverse mapping is unambiguous even when individual
+    segments contain underscores.
+    """
+    return capability_id.replace(".", _NAMESPACE_SEP)
+
+
+def restore_namespace(safe_name: str) -> str:
+    """Inverse of :func:`make_namespace_safe_name`.
+
+    ``billing__list_invoices`` → ``billing.list_invoices``.
+    """


+    if not isinstance(name, str) or not name:
+        raise ValueError(
+            "OpenAI tool_call is missing a function name. Expected either "
+            "'function.name' (Chat Completions) or 'name' (Responses API)."
+        )


+        Args:
+            tool_calls: Either ``response.output`` items from the Responses API
+                (filtered or unfiltered — non-function items are passed
+                through unchanged) or ``message.tool_calls`` items from the
+                Chat Completions API. Input shape is auto-detected per call.
+            justification: Justification applied to every call in the batch.
+                Individual calls may override by including
+                ``"_justification": "..."`` in their arguments.
+
+        Returns:
+            One vendor-shaped result envelope per *processed* tool call, in
+            input order. Non-tool-call items in the input are skipped so the
+            caller can stitch results back into the conversation as-is.
+        """


+    name = tool_use_block.get("name")
+    if not isinstance(name, str) or not name:
+        raise ValueError(
+            "Anthropic tool_use block is missing a 'name' field or it is not a string."
+        )


Copilot AI review requested due to automatic review settings May 13, 2026 10:49

Copilot started reviewing on behalf of dgenio May 13, 2026 10:49

Copilot AI reviewed May 13, 2026

dgenio wants to merge 1 commit into main from feat/llm-adapter-middleware
