v0.8.0
What's new in 0.8.0
Added
-
OpenAI-compatible chat-completions HTTP face (
amplifier-agent serve chat-completions) —/v1/models+/v1/chat/completionsover HTTP with bearer-token auth. Streams responses, multi-provider routing via served-models registry, workspace correlation throughX-Client-Session-Idheader. Enables direct integration with opencode via theamplifier-app-opencodewrapper and any OpenAI-compatible client. (#65) -
amplifier-agent authsubcommand — set/list/remove/status/clear actions over~/.amplifier-agent/credentials.json(mode 0600). Resolution chain is env-first: shell env vars (ANTHROPIC_API_KEY, …) win over the file so existing shell-rc workflows are unchanged. "Set once, works everywhere" UX matchingclaude login/gh auth login/aws configurewithout the OAuth ceremony. (#65) -
Host-tool delegation — tools declared by the host (
host_config.json:host_tools) are surfaced to the model with stub schemas; on invocation, server emits a signal tool_call back to the client (samechunk_id), client executes host-side, returns result for the model to continue. Lets the host own filesystem, shell, browser, or any custom tool without bundling into amplifier-agent. (#65) -
Model routing matrix integration — per-role provider/model preferences resolved per turn. (#64)
-
cost_usdin chat-completions usage envelope — non-standard field carrying the actual dollar cost provider modules computed, accumulated across sub-turns, serialized as a string for Decimal precision. Standard OpenAI clients ignore the field; cost-aware clients render the real per-turn $$. (#68) -
host_config.providers(plural) registry — declares which providers the server-mode lifespan loads and how to instantiate each. Schema:providers: {<provider_id>: {module?: str, config?: dict}}. Module defaults to provider_id when omitted; each provider's config is passed asextra_configintolist_provider_models(). (#69) -
amplifier-agent serve status / stop / restartsubcommands — operational lifecycle for the chat-completions HTTP server. Status reports running/stale/not-running, where it's reachable, and how many models from which providers it's serving (self-cleans stale state files when the PID is gone). Stop sends SIGTERM with a configurable graceful-exit window (--timeout), escalating to SIGKILL on expiry or on--force. Restart performs an identity-restart using the args stored at original launch. State is tracked in~/.amplifier-agent/state/serve.json(mode 0600, parent dir 0700; api_key is sensitive and never logged). (#69)
Changed
-
Breaking (server mode only):
amplifier-agent serve chat-completionsnow requireshost_config.providersto be a non-empty dict. Any provider declared there that cannot initialize (missing credentials, module not installed,list_models()raises, returns 0 models) causes the server to exit 2 with a structured error listing every problem. The previous behavior — iterating a hardcodedKNOWN_PROVIDERSlist, silently skipping unreachable providers, and falling back to an unusable placeholderamplifiermodel — is gone. Single-turn mode (amplifier-agent run) is unaffected. (#69) -
POST /v1/chat/completionsnow validatesmodelagainst the served registry. Requests with an unknown model return HTTP 400{"error": {"code": "unknown_model", ...}}immediately, instead of being silently routed to whichever provider loaded first and failing 4 seconds later with an upstreamnot_found_errorembedded indelta.content. (#69) -
stream: falseis now honored. Requests with that flag return a single JSON body; onlystream: true(or absent) uses SSE. (#69) -
Upstream errors raised before any content chunks are emitted now surface as HTTP 502 with a structured OpenAI-shape error envelope, instead of being embedded inside
delta.contentof a 200 SSE response. (Mid-stream errors after the first chunk remain embedded indelta.content— once SSE starts, the status line is committed.) (#69) -
/v1/modelsno longer falls back to a placeholder{"id": "amplifier", ...}entry. The lifespan guaranteesserved_models_registryis non-empty (or the server exits at boot), so the fallback was unreachable in practice. (#69) -
Lifespan provider iteration now reads from
host_config.providersinstead of the hardcodedKNOWN_PROVIDERScatalog. The CLI'sKNOWN_PROVIDERSconstant stays defined for single-turn mode and other CLI commands. (#69) -
/v1/modelsresponse surfaces a_providertag per model so OpenAI-compatible clients can see which provider serves each entry. (#65) -
Usage-counter telemetry in chat-completions responses correctly reflects the provider that actually served the turn. (#65)
Wire protocol
Unchanged at 0.3.0 — no wrapper bump. TypeScript wrapper stays at 0.7.0, Python wrapper stays at 0.3.0.
Migration
Existing server-mode users (anyone running amplifier-agent serve chat-completions on a 0.7.x or pre-#69 0.8.0 commit): Add a providers block to your host_config.json. Minimum to keep working with just Anthropic:
{
"providers": {
"anthropic": {}
}
}Multi-provider example:
{
"providers": {
"anthropic": {},
"openai": {"config": {"base_url": "https://api.openai.com/v1"}}
}
}Without host_config.providers, the server will exit at boot with a clear error message rather than running in a broken half-state.
No other breaking changes. Existing CLI (run, models list), JSON-RPC wire, single-turn mode, and HTTP clients reading standard OpenAI fields all continue to work.
See CHANGELOG.md [0.8.0] for full details.