
feat: OpenAI and Anthropic tool-format adapters with middleware (#55, #50, #40) #69

Open
dgenio wants to merge 1 commit into main from feat/llm-adapter-middleware

Conversation

dgenio (Owner) commented May 13, 2026

What changed

Adds agent_kernel.adapters with two drop-in middleware classes that translate Capability objects into vendor tool schemas, route tool calls through the full kernel pipeline (grant → invoke → firewall → trace), and return vendor-shaped tool-result objects.

| File | Change |
| --- | --- |
| src/agent_kernel/adapters/__init__.py | New: public adapter exports. |
| src/agent_kernel/adapters/_base.py | New: BaseToolMiddleware (hook registration + dispatch, request/grant/invoke flow, error-as-result), ToolCallEvent / ToolResultEvent dataclasses, schema generation (build_input_schema, normalize_for_openai_strict, validate_input), namespace helpers, canonical frame_to_payload / error_to_payload. |
| src/agent_kernel/adapters/openai.py | New: OpenAIMiddleware, capabilities_to_tools, tool_call_to_request, format_result. Supports both Responses API and Chat Completions; auto-detects input shape; dotted IDs ↔ namespace__function. |
| src/agent_kernel/adapters/anthropic.py | New: AnthropicMiddleware, capabilities_to_tools with per-capability and middleware-default cache_control, tool_use_to_request, format_result. |
| src/agent_kernel/models.py | Adds ToolHints dataclass and three optional fields on Capability: parameters_model (pydantic), parameters_schema (raw JSON Schema escape hatch), tool_hints (vendor flags). All default to None. |
| src/agent_kernel/kernel.py | Adds Kernel.list_capabilities() accessor (used by adapters; generally useful). |
| src/agent_kernel/__init__.py | Re-exports OpenAIMiddleware, AnthropicMiddleware, ToolHints. |
| tests/test_adapters.py | New: 57 tests across schema conversion, round-trip, middleware flow, hook ordering (sync + async), abort, justification injection, validation errors, error-as-result paths. |
| pyproject.toml | Adds pydantic>=2 runtime dependency. |
| docs/integrations.md | New "LLM tool-format adapters" section with OpenAI + Anthropic usage examples, namespace mapping, strict mode, cache control, hooks, error handling. |
| docs/architecture.md | New "Adapters" component bullet. |
| AGENTS.md | Updates dep list to httpx + pydantic. |
| CHANGELOG.md | [Unreleased] entries under Added and Changed. |

Why

Closes #55, #50, #40. Developers using OpenAI / Anthropic APIs previously had to hand-translate between Capability objects and each vendor's tool schema and stitch the call/result loop through grant_capability / invoke manually. The new middleware eliminates that boilerplate while preserving every kernel invariant — every call still goes through grant + token + firewall + trace, and per-call tokens prevent cross-principal reuse (I-06). PolicyDenied, CapabilityNotFound, DriverError, argument-validation failures, and hook abort signals all surface as tool-result errors so the surrounding agent loop never crashes.
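The error-as-result behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the exception classes are local stand-ins named after the kernel's, and `invoke_safely` and the envelope keys (`ok`, `output`, `error`) are hypothetical.

```python
from typing import Any, Callable


# Local stand-ins for the kernel's exception types (hierarchy is assumed).
class AgentKernelError(Exception): ...
class PolicyDenied(AgentKernelError): ...
class CapabilityNotFound(AgentKernelError): ...
class DriverError(AgentKernelError): ...


def invoke_safely(call_id: str, invoke: Callable[[], Any]) -> dict[str, Any]:
    """Run one tool call and always return a result envelope (hypothetical shape)."""
    try:
        return {"call_id": call_id, "ok": True, "output": invoke()}
    except AgentKernelError as exc:
        # Expected kernel failures become data the LLM can read and react to.
        return {
            "call_id": call_id,
            "ok": False,
            "error": {"type": type(exc).__name__, "message": str(exc)},
        }


def denied() -> Any:
    raise PolicyDenied("WRITE capability requires a justification")


# One failing call does not stop the batch: both calls yield an envelope.
results = [
    invoke_safely("call_1", lambda: {"invoices": 3}),
    invoke_safely("call_2", denied),
]
```

Only the kernel's own error family is caught here, so unexpected bugs would still propagate; whether the real middleware draws the line in the same place is not stated in this PR description.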

Design decisions

  • Pydantic for schema source. Capability.parameters_model (pydantic) is the canonical input-schema source; parameters_schema (raw dict) is the escape hatch. allowed_fields is left alone — it remains an output redaction control consumed by the firewall and is deliberately not used to advertise input shape (which it was never designed to describe).
  • Both OpenAI shapes. Input shape is auto-detected; Chat Completions output is opt-in (format="chat_completions"). The default output is the Responses API shape, since issue #55 (OpenAI tool-format adapter & middleware) names function_call_output.
  • Namespace mapping (OpenAI only). Dots → __ (double underscore) so capability IDs with underscores in segments round-trip unambiguously. Anthropic preserves dotted IDs.
  • Strict mode is opt-in per capability via ToolHints(strict=True). The adapter walks the pydantic-emitted schema and, for every object, lists all of its properties in required and sets additionalProperties: false. If normalisation raises, it falls back to non-strict with a warning.
  • Hooks are sync or async callables (auto-detected). Pre-hooks may mutate args, inject justification (for WRITE/DESTRUCTIVE policy), or set aborted=True. Post-hooks observe (or replace) the resulting Frame. Hook exceptions during pre-invoke become tool errors; exceptions during post-invoke are logged but never crash the batch.
  • Justification flow. Three layers: (1) a handle_tool_calls(..., justification="") batch parameter, (2) per-call override via args["_justification"], (3) hook-injected via event.justification. The simplest path works for READ tools; WRITE tools can supply justification via any of the three.
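The hook behaviour in the list above (sync or async callables, mutation, justification injection, abort) might look roughly like this. The `ToolCallEvent` here is a simplified stand-in whose fields are assumptions, and `run_pre_hooks` is a hypothetical helper, not the middleware's actual dispatch code.

```python
import asyncio
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolCallEvent:
    """Simplified stand-in for the PR's ToolCallEvent (fields are assumptions)."""

    capability_id: str
    args: dict[str, Any] = field(default_factory=dict)
    justification: str = ""
    aborted: bool = False


async def run_pre_hooks(
    hooks: list[Callable[[ToolCallEvent], Any]], event: ToolCallEvent
) -> ToolCallEvent:
    """Run hooks in registration order, awaiting any that return an awaitable."""
    for hook in hooks:
        outcome = hook(event)
        if inspect.isawaitable(outcome):
            await outcome  # async hook: detected from its return value
        if event.aborted:
            break  # hooks registered later never see an aborted call
    return event


def add_justification(event: ToolCallEvent) -> None:
    event.justification = "monthly billing reconciliation"


async def block_deletes(event: ToolCallEvent) -> None:
    if event.capability_id.endswith(".delete"):
        event.aborted = True


def never_reached(event: ToolCallEvent) -> None:
    event.args["touched"] = True


event = asyncio.run(
    run_pre_hooks(
        [add_justification, block_deletes, never_reached],
        ToolCallEvent("billing.invoices.delete"),
    )
)
```

Checking the hook's return value with `inspect.isawaitable` is one way to auto-detect async hooks; it also covers sync callables that return a coroutine.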

How verified

ruff format --check src/ tests/ examples/  → 45 files already formatted
ruff check src/ tests/ examples/           → All checks passed
mypy src/                                  → Success: no issues found in 27 source files
pytest -q --cov=agent_kernel               → 364 passed in 5.52s, 96% total coverage
pytest --cov=agent_kernel.adapters         → 57 passed, 98% adapter coverage
PYTHONIOENCODING=utf-8 python examples/basic_cli.py        → ✓
PYTHONIOENCODING=utf-8 python examples/billing_demo.py     → ✓
PYTHONIOENCODING=utf-8 python examples/http_driver_demo.py → ✓

(Local pytest baseline before this PR: 287 tests; after: 364, +77. New tests live entirely in tests/test_adapters.py. Total project coverage improved from 95% → 96%.)

Scope notes (Mode B)

  • New dependency: pydantic>=2. Justified by adapters using model_json_schema() + model_validate(). Updated AGENTS.md so future readers see the new dep set is httpx + pydantic.
  • Capability model extensions: three new fields, all default None. No existing fixture or test required updates. Backward-compatible.
  • Module sizes: _base.py 438 lines, openai.py 351, anthropic.py 268. The first two exceed AGENTS.md's 300-line guideline. The existing repo has three modules in the same boat (policy.py 520, policy_dsl.py 503, kernel.py 466; see #68, "[policy/kernel] Tech debt: decompose policy_dsl.py and broaden dry-run driver test coverage"). I chose not to split because the helpers and middleware are tightly coupled (every middleware method calls into the helpers); splitting introduces import gymnastics without clarifying semantic boundaries. Happy to factor _base.py into _base.py + _helpers.py if you'd prefer.
  • Not in scope (offered as follow-ups, not bundled): OTel instrumentation of the adapter layer (belongs in #38, "OpenTelemetry integration: spans, metrics, and trace export"); a Capability.parameters_model style change to require pydantic everywhere; SDK-typed return values (the openai / anthropic packages would only buy IDE autocomplete that consumers can get themselves).

Risks

  • Pydantic version drift. Pydantic v2's model_json_schema() may emit Draft 2020-12 features OpenAI strict mode doesn't accept. The normalize_for_openai_strict walker handles the two common issues (additionalProperties, required). For schemas with $ref / $defs, OpenAI strict accepts those — we leave them alone. If a user supplies a schema feature that breaks normalisation, the adapter falls back to non-strict with a warning rather than raising.
  • At-most-once delivery, not at-least-once. The middleware never retries: if kernel.invoke raises a DriverError (after the kernel's own driver fallback), the call becomes a tool error. This matches the rest of the codebase — drivers handle retry internally where appropriate.
  • Hook ordering & state. Hook lists are not protected by a lock. If a middleware is shared across concurrent batches and intercept_* is called from one thread while a batch is dispatching from another, ordering is unspecified. The expected pattern is per-principal middleware instances, hooks registered at setup time — matching the issue body's OpenAIMiddleware(kernel, principal) shape.
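To make the strict-mode risk above concrete, here is a minimal sketch of a schema walker with a non-strict fallback. It assumes nothing about the real normalize_for_openai_strict beyond what this description states; `normalize_strict` and `tool_schema` are hypothetical names, and the sketch does not rewrite optional fields as nullable, which strict mode may also require.

```python
import copy
import warnings
from typing import Any


def normalize_strict(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy in which every object schema closes its property set."""
    out = copy.deepcopy(schema)  # never mutate the caller's schema

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            if node.get("type") == "object" and isinstance(node.get("properties"), dict):
                node["required"] = list(node["properties"])
                node["additionalProperties"] = False
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(out)
    return out


def tool_schema(schema: dict[str, Any], strict: bool) -> tuple[dict[str, Any], bool]:
    """Return (schema, strict_enabled), falling back to non-strict on failure."""
    if not strict:
        return schema, False
    try:
        return normalize_strict(schema), True
    except Exception as exc:  # fallback path: warn instead of raising
        warnings.warn(f"strict normalisation failed, using non-strict: {exc}")
        return schema, False


schema = {
    "type": "object",
    "properties": {
        "customer": {"type": "object", "properties": {"id": {"type": "string"}}},
        "limit": {"type": "integer"},
    },
    "required": ["customer"],
}
normalized, strict_enabled = tool_schema(schema, strict=True)
```

`$ref` / `$defs` entries are left untouched by the walker apart from any object schemas found inside them, which matches the "leave them alone" stance described above.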

AI agent instruction files reviewed

  • AGENTS.md — updated dep set; no convention changes.
  • docs/agent-context/invariants.md — no change needed; adapters consume Frame post-firewall and route through kernel.invoke, so I-01, I-02, I-06 remain enforced by existing code paths.
  • docs/agent-context/review-checklist.md, lessons-learned.md, workflows.md — no change needed.
  • .github/copilot-instructions.md, .claude/CLAUDE.md — no change needed.

Checklist

  • make ci passes (fmt → lint → mypy strict → pytest → examples)
  • Docstrings match the final implementation
  • No dead code (all new parameters, helpers, and types exercised by tests)
  • Naming consistent: capability, principal, grant, Frame throughout (see docs/integrations.md)
  • Backward-compat: new Capability fields default to None; no existing test required updates
  • CHANGELOG.md updated under [Unreleased]
  • Updated canonical docs (docs/architecture.md, docs/integrations.md, AGENTS.md) in the same PR

🤖 Generated with Claude Code

…50, #40)

Adds `agent_kernel.adapters` with two drop-in middleware classes that
translate Capability objects into vendor tool schemas, route tool calls
through the full kernel pipeline (grant → invoke → firewall → trace), and
return vendor-shaped tool-result objects. Both share a `BaseToolMiddleware`
that owns hook registration, error-as-result conversion, and the canonical
Frame → JSON payload shape.

OpenAIMiddleware emits Responses-API tools by default (also supports Chat
Completions via `format=chat_completions`), with dotted capability IDs
mapped to `namespace__function` form and OpenAI `strict` mode opt-in via
`Capability.tool_hints`. AnthropicMiddleware emits Anthropic Messages tools
with optional `cache_control` (per-capability or middleware default) and
preserves dotted capability IDs. OpenAIMiddleware auto-detects Chat/Responses
shape on input regardless of configured output format.
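The dotted-ID to `namespace__function` mapping mentioned above can be sketched as a pair of string transforms. The names `_NAMESPACE_SEP`, `make_namespace_safe_name`, and `restore_namespace` appear in this PR's review excerpts; the body of `restore_namespace` below is an assumption, since the excerpt shows only its docstring.

```python
_NAMESPACE_SEP = "__"


def make_namespace_safe_name(capability_id: str) -> str:
    """``billing.list_invoices`` -> ``billing__list_invoices``."""
    return capability_id.replace(".", _NAMESPACE_SEP)


def restore_namespace(safe_name: str) -> str:
    """Assumed inverse: ``billing__list_invoices`` -> ``billing.list_invoices``."""
    return safe_name.replace(_NAMESPACE_SEP, ".")


safe = make_namespace_safe_name("billing.list_invoices")
restored = restore_namespace(safe)

# Single underscores inside a segment survive the round trip.
# A segment that ends in "_" right before a dot does not, which is why
# the separator has to be treated as reserved.
edge_case = restore_namespace(make_namespace_safe_name("a_.b"))
```

With this assumed inverse, `a_.b` becomes `a___b` and comes back as `a._b`, so IDs whose segments start or end with an underscore (or contain `__`) would need to be rejected or escaped.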

Capability gains three optional fields: `parameters_model` (pydantic model
used for JSON-Schema generation and input validation), `parameters_schema`
(raw JSON Schema escape hatch), and `tool_hints` (ToolHints — vendor flags).
All default to None, preserving backward compat. Kernel gains a small
`list_capabilities()` accessor.

Adds `pydantic>=2` as a runtime dep (justified by the new adapters; only
used inside the adapters package). No `openai` / `anthropic` SDK
dependency — every adapter function is a pure dict transform.
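As an illustration of what "pure dict transform" can mean here, the sketch below builds OpenAI tool definitions and tool results as plain dicts in both shapes, with no SDK import. `sketch_tool` and `sketch_result` are hypothetical helpers; the PR's actual capabilities_to_tools and format_result signatures are not shown in this description.

```python
import json
from typing import Any


def sketch_tool(
    capability_id: str,
    description: str,
    parameters: dict[str, Any],
    fmt: str = "responses",
) -> dict[str, Any]:
    """Build one OpenAI function-tool definition as a plain dict."""
    function = {
        "name": capability_id.replace(".", "__"),
        "description": description,
        "parameters": parameters,
    }
    if fmt == "chat_completions":
        # Chat Completions nests the definition under "function".
        return {"type": "function", "function": function}
    # Responses API keeps the fields at the top level.
    return {"type": "function", **function}


def sketch_result(call_id: str, payload: dict[str, Any], fmt: str = "responses") -> dict[str, Any]:
    """Wrap a JSON payload in the vendor-shaped tool-result envelope."""
    body = json.dumps(payload)
    if fmt == "chat_completions":
        return {"role": "tool", "tool_call_id": call_id, "content": body}
    return {"type": "function_call_output", "call_id": call_id, "output": body}


params = {"type": "object", "properties": {"limit": {"type": "integer"}}}
responses_tool = sketch_tool("billing.list_invoices", "List invoices.", params)
chat_tool = sketch_tool("billing.list_invoices", "List invoices.", params, "chat_completions")
responses_result = sketch_result("call_1", {"invoices": []})
chat_result = sketch_result("call_1", {"invoices": []}, "chat_completions")
```

Because inputs and outputs are ordinary dicts, callers can pass them to whichever client they already use.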

PolicyDenied, CapabilityNotFound, DriverError, argument-validation failures,
and hook abort signals all surface as tool-result errors rather than raised
exceptions so the LLM can react.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 13, 2026 10:49

Copilot AI left a comment


Pull request overview

This PR introduces a new agent_kernel.adapters package providing OpenAI and Anthropic “tool-format” adapters plus middleware that routes vendor tool calls through the kernel’s full pipeline (grant → invoke → firewall → trace), with schema generation/validation support via Pydantic.

Changes:

  • Added OpenAI + Anthropic adapter modules and a shared BaseToolMiddleware (hooks, dispatch, vendor-shape formatting, schema helpers).
  • Extended Capability with optional parameters_model, parameters_schema, and tool_hints (ToolHints) to drive tool schemas and optional strict/cache settings.
  • Added Kernel.list_capabilities() and updated docs/tests/changelog and runtime deps (pydantic>=2).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/test_adapters.py | New test suite covering schema conversion, middleware flow, hooks, aborts, and error-as-result behavior. |
| src/agent_kernel/models.py | Adds ToolHints and new optional Capability fields for adapter schema/validation/hints. |
| src/agent_kernel/kernel.py | Adds Kernel.list_capabilities() to enumerate registered capabilities. |
| src/agent_kernel/adapters/_base.py | New shared middleware base, hook/event types, schema helpers, payload helpers, namespace helpers. |
| src/agent_kernel/adapters/openai.py | New OpenAI tool schema conversion + middleware supporting Responses + Chat Completions formats. |
| src/agent_kernel/adapters/anthropic.py | New Anthropic tool schema conversion + middleware with optional cache_control. |
| src/agent_kernel/adapters/__init__.py | Public exports for adapter layer. |
| src/agent_kernel/__init__.py | Re-exports middlewares and ToolHints at top level. |
| pyproject.toml | Adds runtime dependency on pydantic>=2. |
| docs/integrations.md | Adds “LLM tool-format adapters” documentation and usage examples. |
| docs/architecture.md | Documents adapters as an architecture component. |
| AGENTS.md | Updates minimal dependency list to include pydantic. |
| CHANGELOG.md | Adds [Unreleased] entries describing the new adapter feature set and dependency change. |
Comments suppressed due to low confidence (2)

src/agent_kernel/adapters/openai.py:197

  • Same as above: _parse_arguments raises ValueError for invalid argument types/JSON. For consistency with the repo’s error-contract rule in AGENTS.md, map these parse failures to a custom AgentKernelError subclass so callers can reliably catch agent-kernel errors (and so exception types are part of the contract).
    if not isinstance(raw, str):
        raise ValueError(
            f"OpenAI tool_call 'arguments' must be a JSON string or dict, got {type(raw).__name__}."
        )

src/agent_kernel/adapters/anthropic.py:128

  • Same issue here: raising ValueError for non-dict input violates the repo’s “no bare ValueError to callers” rule. If you add a custom adapter parse/validation exception, use it consistently for all adapter-facing shape errors.
    if raw_input is None:
        raw_input = {}
    if not isinstance(raw_input, dict):
        raise ValueError(
            f"Anthropic tool_use 'input' must be an object (got {type(raw_input).__name__})."
        )

Comment on lines +185 to +196
    ``billing.list_invoices`` → ``billing__list_invoices``. The ``__`` separator
    is reserved so the inverse mapping is unambiguous even when individual
    segments contain underscores.
    """
    return capability_id.replace(".", _NAMESPACE_SEP)


def restore_namespace(safe_name: str) -> str:
    """Inverse of :func:`make_namespace_safe_name`.

    ``billing__list_invoices`` → ``billing.list_invoices``.
    """
Comment on lines +175 to +179
    if not isinstance(name, str) or not name:
        raise ValueError(
            "OpenAI tool_call is missing a function name. Expected either "
            "'function.name' (Chat Completions) or 'name' (Responses API)."
        )
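The two name locations in that error message correspond to the two OpenAI tool-call shapes. A rough sketch of per-item shape detection follows; `parse_tool_call` is a hypothetical helper, not the PR's tool_call_to_request, and it raises a plain ValueError only to keep the example short.

```python
import json
from typing import Any, Optional


def parse_tool_call(item: dict[str, Any]) -> Optional[tuple[Any, str, dict[str, Any]]]:
    """Return (call_id, name, args), or None for items that are not tool calls."""
    if item.get("type") == "function_call":
        # Responses API output item: name and arguments at the top level.
        call_id, name, raw = item.get("call_id"), item.get("name"), item.get("arguments")
    elif isinstance(item.get("function"), dict):
        # Chat Completions tool call: name and arguments nested under "function".
        fn = item["function"]
        call_id, name, raw = item.get("id"), fn.get("name"), fn.get("arguments")
    else:
        return None  # e.g. a message item mixed into response.output
    if not isinstance(name, str) or not name:
        raise ValueError("tool call is missing a function name")
    # Arguments normally arrive as a JSON string; accept a dict as well.
    args = json.loads(raw) if isinstance(raw, str) else dict(raw or {})
    return call_id, name, args


items = [
    {"type": "message", "content": []},
    {
        "type": "function_call",
        "call_id": "call_1",
        "name": "billing__list_invoices",
        "arguments": '{"limit": 5}',
    },
    {
        "id": "call_2",
        "type": "function",
        "function": {"name": "billing__get_invoice", "arguments": '{"id": "inv_9"}'},
    },
]
parsed = [p for p in map(parse_tool_call, items) if p is not None]
```

Detecting the shape per item, rather than per batch, is what lets a caller pass an unfiltered `response.output` list.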
Comment on lines +273 to +286
    Args:
        tool_calls: Either ``response.output`` items from the Responses API
            (filtered or unfiltered — non-function items are passed
            through unchanged) or ``message.tool_calls`` items from the
            Chat Completions API. Input shape is auto-detected per call.
        justification: Justification applied to every call in the batch.
            Individual calls may override by including
            ``"_justification": "..."`` in their arguments.

    Returns:
        One vendor-shaped result envelope per *processed* tool call, in
        input order. Non-tool-call items in the input are skipped so the
        caller can stitch results back into the conversation as-is.
    """
Comment on lines +117 to +121
    name = tool_use_block.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError(
            "Anthropic tool_use block is missing a 'name' field or it is not a string."
        )

Development

Successfully merging this pull request may close these issues.

OpenAI tool-format adapter & middleware

2 participants