
[backend] LLMBackend Protocol + canonical types + native reference implementations #87

@dep0we

Background

atomic_agents/_llm.py uses procedural dispatch (if/elif by model-id prefix) for LLM provider routing rather than the protocol-pattern abstraction that PR #57 (MemoryBackend) established for the framework. This is exactly the antipattern Principle #2 in CLAUDE.md warns against ("Don't bolt on Postgres support as if backend == 'postgres': ... Define the protocol; ship filesystem-default; let alternate impls register at import time.").

Three providers work today (Anthropic, OpenAI, Moonshot Kimi) — the normalization at the bottom (_RawLLMResponse) is real and agent.call() already sees a uniform shape. What's missing is the abstraction at the top: there's no register_llm_backend(), no Protocol contract, no way for an external package to add Gemini / Bedrock / Ollama / Vertex without forking _llm.py.
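For concreteness, here is a minimal sketch of the missing surface: a Protocol plus an import-time registry, mirroring what #57 did for MemoryBackend. Only register_llm_backend() is named by this issue; every other name and signature below is illustrative.

```python
from typing import Protocol


class SyncLLMBackend(Protocol):
    """The contract agent.call() would dispatch through (fields assumed)."""

    provider_id: str

    def supports_model(self, model_id: str) -> bool: ...

    def call(self, model_id: str, messages: list[dict], **kwargs) -> dict: ...


_BACKENDS: list[SyncLLMBackend] = []


def register_llm_backend(backend: SyncLLMBackend) -> None:
    """Core and third-party packages alike call this at import time."""
    _BACKENDS.append(backend)


def resolve_backend(model_id: str) -> SyncLLMBackend:
    """Would replace the if/elif prefix chain in _llm.py."""
    for backend in _BACKENDS:
        if backend.supports_model(model_id):
            return backend
    raise LookupError(f"no registered backend supports {model_id!r}")
```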

This issue is the missing protocol in the protocol-pattern series alongside #60 (Lock), #61 (Log), #62 (Persona), #63 (AgentProfile), #64 (ToolRegistry), #65 (Corpus).

Why this matters now

  • Pre-public-flip positioning. When Atomic Agents goes public, the spec needs to be teachable as "here are 28 numbered docs, here's the protocol surface — implement against it." Procedural LLM dispatch in _llm.py is operator-visible and obstructs the spec-portability story.
  • Conformance suite (ROADMAP; see #9, [infra] Import the Atomic Agents spec into docs/). Alternate spec implementations need a Protocol to conform against, not just for Memory.
  • Issue #81 (LLM client resilience layer — retries, timeout, 429/529 handling). Currently Anthropic-specific. After LLMBackend lands, resilience composes via a RetryingLLMBackend wrapper that works against any backend; see the sketch after this list.
  • Multi-provider demand. Operators ask about Gemini / Bedrock / Ollama. The right answer is "here's the LLMBackend Protocol; ship a 200-line third-party package" — not "add it to the framework's core."
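To make that composition concrete, a sketch of the wrapper shape under assumed interfaces; TransientLLMError is a hypothetical stand-in for the 429/529 failure class, not an existing exception in the codebase.

```python
import time


class TransientLLMError(Exception):
    """Stand-in for the 429/529-style failures issue #81 targets."""


class RetryingLLMBackend:
    """Wraps any SyncLLMBackend; retries transient failures with backoff."""

    def __init__(self, inner, max_attempts: int = 3, base_delay: float = 1.0):
        self.inner = inner
        self.max_attempts = max_attempts
        self.base_delay = base_delay

    def supports_model(self, model_id: str) -> bool:
        return self.inner.supports_model(model_id)

    def call(self, model_id: str, messages, **kwargs):
        for attempt in range(self.max_attempts):
            try:
                return self.inner.call(model_id, messages, **kwargs)
            except TransientLLMError:
                if attempt == self.max_attempts - 1:
                    raise
                time.sleep(self.base_delay * 2 ** attempt)  # exponential backoff
```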

Pre-merge review (codex)

Codex pressure-tested the architectural plan before this issue was filed (per the project's review-in-rounds methodology), returning 1 P1, 5 P2, and 4 P3 findings, all addressed in the linked plan. Highlights:

  • P1 (load-bearing): Provider coupling lives in agent.py:263, 279, 932 (tool dispatch + tool-result follow-up), not just _llm.py. Refactoring only the dispatcher would leave third-party backends unable to fully participate in tool loops. Fix: introduce canonical LLMToolDefinition / LLMToolUse / LLMToolResult types in atomic_agents/llm/types.py; backends translate at the boundary; agent.py only sees canonical types. A field-level sketch of these types follows this list.
  • P2: supported_capabilities() → set[str] is too coarse — capabilities are per-model, not per-backend. Fix: capabilities(model_id) returns a typed LLMCapabilities dataclass with explicit fields.
  • P2: Cost/token estimation must be pluggable when models become pluggable. Fix: Protocol gains pricing(model_id) → PricingInfo | None + count_tokens(...); _costs.PRICING becomes the fallback.
  • P2: cache_breakpoints: bool flattens what's currently a list (agent.py:727) and spec/04's multi-layer cache model. Fix: structured cache_directives: list[CacheDirective] preserves intent.
  • P2: Async/streaming as separate Protocols (SyncLLMBackend v1; AsyncLLMBackend + StreamingLLMBackend reserved for future).
  • P2: Registry conflict semantics — when two backends both claim a model id, raise AmbiguousBackendError; model.md can specify provider: <id> to disambiguate.
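A field-level sketch of the canonical types these findings call for; the class names come from the plan above, but every field choice here is an assumption, not the final design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMToolDefinition:
    name: str
    description: str
    input_schema: dict  # JSON Schema; each backend translates at the boundary


@dataclass(frozen=True)
class LLMToolUse:
    tool_use_id: str
    name: str
    input: dict


@dataclass(frozen=True)
class LLMToolResult:
    tool_use_id: str
    content: str
    is_error: bool = False


@dataclass(frozen=True)
class CacheDirective:
    # Structured replacement for a bare cache_breakpoints: bool (codex P2),
    # preserving the multi-layer cache model from spec/04.
    layer: str            # e.g. "system", "persona", "history"
    ttl: str | None = None


@dataclass(frozen=True)
class LLMCapabilities:
    # Per-model, not per-backend (codex P2).
    supports_tools: bool = False
    supports_caching: bool = False
    max_context_tokens: int | None = None


@dataclass(frozen=True)
class PricingInfo:
    input_per_mtok: float
    output_per_mtok: float
    currency: str = "USD"
```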

Why NOT LiteLLM in core (conformance-boundary framing)

An earlier review recommended LiteLLM as the default backend (one OSS dependency, 100+ providers). On reflection: LiteLLM in core is the wrong answer for this framework's positioning.

  • Atomic Agents only claims what its spec + conformance suite cover (per CLAUDE.md and #10, [infra] Add CI workflow — GitHub Actions for tests). Blessing LiteLLM means our spec implicitly endorses a third-party abstraction whose own correctness across 100+ providers is a moving target.
  • Anthropic-specific features like cache_control (load-bearing for cost optimization on long persona prompts) need first-class treatment; LiteLLM's normalization may not preserve them cleanly.
  • LangChain's experience is the cautionary tale: framework users get pulled into the abstraction's drift.

LiteLLM as a third-party atomic-agents-litellm adapter is fine and welcome — community-maintained, opt-in for operators who want broader provider coverage at the cost of an extra dependency. It just doesn't ship in core. The conformance boundary is the LLMBackend Protocol; anything that satisfies it can be a backend, in core or out.
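As a sketch of what that boundary permits, a hypothetical atomic-agents-litellm package could satisfy the Protocol and register at import time. litellm.completion() is LiteLLM's actual entry point; the wrapper class, the id-claiming rule, and the import path for register_llm_backend are all illustrative.

```python
import litellm  # real dependency of the hypothetical adapter package

from atomic_agents.llm import register_llm_backend  # path assumed, per deliverable 3


class LiteLLMBackend:
    provider_id = "litellm"

    def supports_model(self, model_id: str) -> bool:
        # Claim LiteLLM-style "provider/model" ids (e.g. "gemini/gemini-pro")
        # so core backends keep ownership of bare Anthropic/OpenAI ids.
        return "/" in model_id

    def call(self, model_id: str, messages, **kwargs):
        return litellm.completion(model=model_id, messages=messages, **kwargs)


register_llm_backend(LiteLLMBackend())  # import-time registration
```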

Deliverables (10 steps)

  1. Canonical types — atomic_agents/llm/types.py with LLMToolDefinition, LLMToolUse, LLMToolResult, CacheDirective, LLMCapabilities, PricingInfo
  2. SyncLLMBackend Protocol — atomic_agents/llm/backend.py with methods: provider_id, supports_model, capabilities(model_id), pricing(model_id), count_tokens(...), call(...), format_tool_results(...). Reserved namespaces for AsyncLLMBackend and StreamingLLMBackend (not implemented v1). Full surface sketched after this list.
  3. Module layout — atomic_agents/llm/{__init__.py, types.py, backend.py, anthropic.py, openai_compat.py, moonshot.py, _utils.py}
  4. Reference implementations:
    • AnthropicLLMBackend — wraps current _call_anthropic + tool translation
    • OpenAICompatibleLLMBackend — configurable class (provider_id, key_spec, model_namespace, model_transform, base_url, capability_hooks); covers OpenAI direct + future endpoints that fit the contract
    • MoonshotLLMBackend — thin factory subclass for readability
  5. Refactor agent.py tool dispatch — lines 263, 279, 932 use canonical types instead of model-prefix branching (codex P1)
  6. Registry conflict semantics — AmbiguousBackendError + optional provider: field in model.md; a possible shape follows the list
  7. Spec doc — docs/spec/28-llm-backend.md mirroring 20-memory-backend.md (numbered 28 since doctor took 27)
  8. Conformance test suite — ~30 conformance + ~30 impl-specific = ~60 total tests
  9. Doc updates + backward-compat — _llm.py becomes a thin shim that preserves _get_key() helpers for doctor.py:501 compatibility (codex P3); CHANGELOG, README "What's shipped", CLAUDE.md architecture diagram, spec/04 + spec/17 cross-refs
  10. Cross-reference issue #81 (LLM client resilience layer — retries, timeout, 429/529 handling) — implement as a RetryingLLMBackend wrapper composing over any SyncLLMBackend
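Expanding the minimal sketch from the Background section, here is the deliverable-2 surface with assumed signatures; only the method names come from this issue, and the canonical types are the ones sketched under the codex findings above.

```python
from __future__ import annotations

from typing import Protocol

# LLMCapabilities, PricingInfo, LLMToolDefinition, CacheDirective and
# LLMToolResult as in the dataclass sketch earlier in this issue.


class SyncLLMBackend(Protocol):
    @property
    def provider_id(self) -> str: ...

    def supports_model(self, model_id: str) -> bool: ...

    def capabilities(self, model_id: str) -> LLMCapabilities: ...

    def pricing(self, model_id: str) -> PricingInfo | None: ...

    def count_tokens(self, model_id: str, messages: list[dict]) -> int: ...

    def call(
        self,
        model_id: str,
        messages: list[dict],
        tools: list[LLMToolDefinition] | None = None,
        cache_directives: list[CacheDirective] | None = None,
    ) -> dict: ...  # the uniform shape agent.call() already consumes

    def format_tool_results(self, results: list[LLMToolResult]) -> list[dict]: ...
```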

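And a possible shape for deliverable 6's conflict semantics; AmbiguousBackendError and the provider: field are from this issue, while the resolution logic and registry list are illustrative.

```python
_BACKENDS: list = []  # populated by register_llm_backend(), per the first sketch


class AmbiguousBackendError(Exception):
    """Raised when two registered backends both claim a model id."""


def resolve_backend(model_id: str, provider: str | None = None):
    """`provider` maps to the optional provider: field in model.md."""
    matches = [b for b in _BACKENDS if b.supports_model(model_id)]
    if provider is not None:
        matches = [b for b in matches if b.provider_id == provider]
    if not matches:
        raise LookupError(f"no registered backend supports {model_id!r}")
    if len(matches) > 1:
        claimants = ", ".join(b.provider_id for b in matches)
        raise AmbiguousBackendError(
            f"{claimants} all claim {model_id!r}; set provider: in model.md"
        )
    return matches[0]
```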
Sequencing

  1. ✅ Plan drafted + codex-reviewed in ~/.claude/plans/i-built-multiple-versions-frolicking-liskov.md
  2. ✅ This issue filed
  3. ⏳ CHANGELOG [Unreleased] stub added (this commit or a tiny follow-up PR)
  4. ⏳ Implementation in a separate session per the standard methodology — codex rounds pre-merge, bisectable commits, /ship workflow, lands as v0.11.0 (additive Minor; no ### BREAKING callout — backward-compat strict)

Out of scope

  • Async/streaming Protocols (reserved for future)
  • Gemini / Bedrock / Vertex / Ollama backends (third-party packages can ship; framework stays small)
  • LiteLLM in core (third-party atomic-agents-litellm welcome; not in core per conformance-boundary framing above)
  • pyproject.toml dependency changes (current anthropic hard dep + [openai] extra cover the reference backends)
