Merged
2 changes: 2 additions & 0 deletions docs/_data/nav.yml
@@ -94,6 +94,8 @@
   url: /features/acp/
 - title: API Server
   url: /features/api-server/
+- title: Chat Server
+  url: /features/chat-server/
 - title: Evaluation
   url: /features/evaluation/
 - title: RAG
4 changes: 2 additions & 2 deletions docs/configuration/models/index.md
@@ -40,7 +40,7 @@ models:
| Property | Type | Required | Description |
| --------------------- | ---------- | -------- | ------------------------------------------------------------------------------------- |
| `provider` | string | ✓ | Provider: `openai`, `anthropic`, `google`, `amazon-bedrock`, `dmr`, `mistral`, `xai`, `nebius`, `minimax`, `requesty`, `azure`, `ollama`, `github-copilot`, or any [named provider]({{ '/providers/custom/' | relative_url }}). |
-| `model` | string | ✓ | Model name (e.g., `gpt-4o`, `claude-sonnet-4-0`, `gemini-2.5-flash`) |
+| `model` | string | ✓ | Model name (e.g., `gpt-4o`, `claude-sonnet-4-5`, `gemini-2.5-flash`) |
| `temperature` | float | ✗ | Sampling randomness. Range is provider-dependent — typically `0.0–2.0` (Anthropic caps at `1.0`). `0.0` is deterministic. |
| `max_tokens` | int | ✗ | Maximum response length in tokens |
| `top_p` | float | ✗ | Nucleus sampling threshold (`0.0–1.0`) |
@@ -232,7 +232,7 @@ models:
   # Anthropic
   claude:
     provider: anthropic
-    model: claude-sonnet-4-0
+    model: claude-sonnet-4-5
     max_tokens: 64000

   # Google Gemini
2 changes: 1 addition & 1 deletion docs/configuration/structured-output/index.md
@@ -192,7 +192,7 @@ agents:
 ```yaml
 agents:
   classifier:
-    model: anthropic/claude-sonnet-4-0
+    model: anthropic/claude-sonnet-4-5
     description: Classify support tickets
     instruction: |
       Classify the support ticket into the appropriate category
2 changes: 1 addition & 1 deletion docs/configuration/tools/index.md
@@ -395,7 +395,7 @@ toolsets:
 ```yaml
 agents:
   root:
-    model: anthropic/claude-sonnet-4-0
+    model: anthropic/claude-sonnet-4-5
     description: Full-featured developer assistant
     instruction: You are an expert developer.
     toolsets:
2 changes: 1 addition & 1 deletion docs/features/api-server/index.md
@@ -190,6 +190,6 @@ Toggle auto-approve with `POST /api/sessions/:id/tools/toggle` for automated wor
<div class="callout callout-info" markdown="1">
<div class="callout-title">ℹ️ See also
</div>
-<p>For interactive use, see the <a href="{{ '/features/tui/' | relative_url }}">Terminal UI</a>. For agent-to-agent communication, see <a href="{{ '/features/a2a/' | relative_url }}">A2A Protocol</a> and <a href="{{ '/features/acp/' | relative_url }}">ACP</a>. For MCP integration, see <a href="{{ '/features/mcp-mode/' | relative_url }}">MCP Mode</a>.</p>
+<p>For interactive use, see the <a href="{{ '/features/tui/' | relative_url }}">Terminal UI</a>. For agent-to-agent communication, see <a href="{{ '/features/a2a/' | relative_url }}">A2A Protocol</a> and <a href="{{ '/features/acp/' | relative_url }}">ACP</a>. For MCP integration, see <a href="{{ '/features/mcp-mode/' | relative_url }}">MCP Mode</a>. For an OpenAI-compatible chat-completions API, see the <a href="{{ '/features/chat-server/' | relative_url }}">Chat Server</a>.</p>

</div>
230 changes: 230 additions & 0 deletions docs/features/chat-server/index.md
@@ -0,0 +1,230 @@
---
title: "Chat Server"
description: "Expose your agents through an OpenAI-compatible Chat Completions API so any tool that already speaks OpenAI can drive a docker-agent agent."
permalink: /features/chat-server/
---

# Chat Server

_Expose your agents through an OpenAI-compatible Chat Completions API so any tool that already speaks OpenAI can drive a docker-agent agent._

## Overview

The `docker agent serve chat` command starts an HTTP server that exposes one or
more agents through an **OpenAI-compatible Chat Completions API** at
`/v1/chat/completions` and `/v1/models`. Any client that already speaks the
OpenAI protocol — for example
[Open WebUI](https://github.com/open-webui/open-webui), `curl`, the OpenAI
Python SDK, or LangChain — can drive a docker-agent agent without any custom
integration.

```bash
# Single agent — exposed as the model `root`
$ docker agent serve chat agent.yaml

# Multi-agent config — every agent in the team becomes a model
$ docker agent serve chat ./team.yaml

# Pick a specific agent from a multi-agent config
$ docker agent serve chat ./team.yaml --agent reviewer

# Run an agent straight from the registry
$ docker agent serve chat agentcatalog/pirate --listen 127.0.0.1:9090

# Require a Bearer token, sourced from an env var
$ docker agent serve chat agent.yaml --api-key-env CHAT_BEARER_TOKEN
```

<div class="callout callout-tip" markdown="1">
<div class="callout-title">💡 When to use chat server vs. API server
</div>
<p>Use the <strong>chat server</strong> when you want to plug docker-agent into existing OpenAI-compatible tooling (chat UIs, IDE integrations, OpenAI SDK clients). Use the <a href="{{ '/features/api-server/' | relative_url }}">API server</a> when you want full control over sessions, agent execution, tool-call confirmations, and streamed runtime events.</p>

</div>

## Endpoints

The OpenAI-compatible endpoints live under the `/v1` prefix to match the
OpenAI API surface. The OpenAPI specification is served at the top level so it
can be discovered without authentication.

| Method | Path | Description |
| ------ | ---------------------- | ---------------------------------------------------------------------- |
| `GET` | `/v1/models` | List the agents that this server exposes as models |
| `POST` | `/v1/chat/completions` | Send messages and receive a completion (non-streaming or streaming) |
| `GET` | `/openapi.json` | OpenAPI specification for the chat server |

The model identifier in `POST /v1/chat/completions` is the **agent name**.
For a single-agent config that's typically `root`; for a multi-agent config,
each named agent becomes its own selectable model.
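
As a minimal sketch, a client could discover the selectable agents by reading the `id` fields from the `GET /v1/models` response. The helper name `agent_names` and the two-agent sample payload are hypothetical; the payload shape is assumed to follow the standard OpenAI model-list layout. The example parses a literal string, so it runs without a server.

```python
import json


def agent_names(models_response: str) -> list[str]:
    """Return the model ids (agent names) from a /v1/models response body."""
    payload = json.loads(models_response)
    # Each entry in "data" describes one exposed agent; its "id" is the value
    # to pass as "model" in POST /v1/chat/completions.
    return [entry["id"] for entry in payload.get("data", [])]


# Hypothetical response for a two-agent team config.
sample = json.dumps({
    "object": "list",
    "data": [
        {"id": "root", "object": "model", "owned_by": "docker-agent"},
        {"id": "reviewer", "object": "model", "owned_by": "docker-agent"},
    ],
})

print(agent_names(sample))
```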

## Quick Start

```bash
# 1. Start the server
$ docker agent serve chat agent.yaml
Listening on 127.0.0.1:8083
OpenAI-compatible chat completions endpoint: http://127.0.0.1:8083/v1/chat/completions

# 2. List exposed agents (models)
$ curl http://127.0.0.1:8083/v1/models
{"object":"list","data":[{"id":"root","object":"model","owned_by":"docker-agent"}]}

# 3. Send a chat request
$ curl http://127.0.0.1:8083/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "root",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
```

### Streaming

Set `"stream": true` in the request body to receive a Server-Sent Events
(SSE) stream of OpenAI-format `chat.completion.chunk` deltas:

```bash
$ curl -N http://127.0.0.1:8083/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "root",
      "stream": true,
      "messages": [{"role": "user", "content": "Stream a poem"}]
    }'
```
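
For clients that do not use an OpenAI SDK, the stream can be reassembled by hand. The sketch below assumes the usual OpenAI SSE conventions: each event is a `data: <json>` line and the stream ends with a `data: [DONE]` sentinel. Whether this server emits the sentinel is an assumption, and both the function name `iter_content_deltas` and the sample stream are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by an OpenAI-style SSE stream."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; blank lines separate events.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        # Assumption: the stream ends with OpenAI's "[DONE]" sentinel.
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hypothetical stream, hand-written to mirror the chat.completion.chunk shape.
sample_stream = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Roses "}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"are red"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_content_deltas(sample_stream)))
```

In a real client, `lines` would be the decoded lines of the HTTP response body rather than a list.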

### Drive it from the OpenAI Python SDK

Because the wire format is OpenAI-compatible, point any OpenAI client at the
chat server's `base_url` and use the agent name as the model:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8083/v1",
    api_key="not-needed-when-no-api-key-flag",  # required by the SDK, ignored if no auth
)

resp = client.chat.completions.create(
    model="root",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```

## Server-side Conversation Caching

By default the server is **stateless**: every request must contain the full
message history, exactly as with OpenAI's Chat Completions API. To enable
server-side caching, set `--conversations-max` to a positive value and send a
stable `X-Conversation-Id` header on each request, so that follow-up requests
only need to carry the new messages:

```bash
$ docker agent serve chat agent.yaml --conversations-max 100 --conversation-ttl 30m
```

```bash
$ curl http://127.0.0.1:8083/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -H 'X-Conversation-Id: my-thread-1' \
    -d '{
      "model": "root",
      "messages": [{"role": "user", "content": "Remember my name is Alice"}]
    }'

$ curl http://127.0.0.1:8083/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -H 'X-Conversation-Id: my-thread-1' \
    -d '{
      "model": "root",
      "messages": [{"role": "user", "content": "What is my name?"}]
    }'
```

Cached conversations are evicted after `--conversation-ttl` of inactivity, or
when the cache reaches `--conversations-max` entries (oldest entries are
evicted first).
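
The two eviction rules can be pictured with a toy model. This is an illustration only, not the server's implementation: the class `ConversationCache` is hypothetical, and it assumes that "oldest" means least recently used and that a lookup resets the idle timer.

```python
from collections import OrderedDict
from typing import Optional


class ConversationCache:
    """Toy cache: idle-TTL expiry plus oldest-first eviction at capacity."""

    def __init__(self, max_items: int, ttl_seconds: float) -> None:
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        # Maps conversation id -> (last_used, messages), least recently used first.
        self._entries = OrderedDict()

    def get(self, conv_id: str, now: float) -> Optional[list]:
        entry = self._entries.get(conv_id)
        if entry is None:
            return None
        last_used, messages = entry
        if now - last_used > self.ttl_seconds:
            # Idle for longer than the TTL: treat the conversation as evicted.
            del self._entries[conv_id]
            return None
        self.put(conv_id, messages, now)  # refresh the idle timer
        return messages

    def put(self, conv_id: str, messages: list, now: float) -> None:
        self._entries[conv_id] = (now, messages)
        self._entries.move_to_end(conv_id)
        while len(self._entries) > self.max_items:
            self._entries.popitem(last=False)  # drop the oldest entry


# Roughly "--conversations-max 2 --conversation-ttl 30m", with time in seconds.
cache = ConversationCache(max_items=2, ttl_seconds=1800)
cache.put("a", [{"role": "user", "content": "hi"}], now=0)
cache.put("b", [], now=10)
cache.put("c", [], now=20)       # capacity exceeded, so "a" is dropped
print(cache.get("a", now=21))    # None
print(cache.get("b", now=5000))  # None, idle longer than the TTL
```

Either way, a client that loses its cached conversation has to resend the history, so long-lived threads may want a generous TTL.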

## Authentication

The chat server has **no authentication by default**. To require a Bearer
token, pass `--api-key` (literal value) or `--api-key-env` (name of an
environment variable that holds the value):

```bash
$ docker agent serve chat agent.yaml --api-key-env CHAT_BEARER_TOKEN
```

Clients must then send an `Authorization: Bearer <token>` header on every
request to `/v1/*`. Both `/v1/models` and `/v1/chat/completions` are
protected once a key is set.
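
A standard-library sketch of an authenticated request follows. It builds the request without sending it, so it runs with no server listening. The helper name `build_chat_request` is hypothetical, and reading the token from `CHAT_BEARER_TOKEN` on the client side simply mirrors the server example above.

```python
import json
import os
import urllib.request
from typing import Optional


def build_chat_request(base_url: str, model: str, messages: list,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when the server was started with --api-key or --api-key-env.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions", data=body, headers=headers, method="POST"
    )


req = build_chat_request(
    "http://127.0.0.1:8083/v1",
    "root",
    [{"role": "user", "content": "Hello!"}],
    token=os.environ.get("CHAT_BEARER_TOKEN"),
)
print(req.get_method(), req.full_url)
# To send it against a running server: urllib.request.urlopen(req)
```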

<div class="callout callout-warning" markdown="1">
<div class="callout-title">⚠️ Public exposure
</div>
<p>The default listen address is <code>127.0.0.1:8083</code>. If you bind to a non-loopback address, always set <code>--api-key</code> or <code>--api-key-env</code> — there is no other authentication layer.</p>

</div>

## CORS

CORS is **disabled by default**. To allow a browser-based client to call the
server, set `--cors-origin` to the exact origin (scheme + host + port) that
should be allowed:

```bash
$ docker agent serve chat agent.yaml --cors-origin https://my-ui.example.com
```
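
To work out which value to pass, it can help to derive the origin from the URL of the page that hosts the client. The sketch below follows the way browsers serialize the `Origin` header (scheme, host, and port, with default ports omitted). The helper `origin_of` is hypothetical, IPv6 hosts are not handled, and how the server compares origins is an assumption.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a page URL, omitting default ports."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


print(origin_of("https://my-ui.example.com/chat?tab=1"))
print(origin_of("http://localhost:3000/"))
```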

## CLI Flags

```bash
docker agent serve chat <agent-file>|<registry-ref> [flags]
```

| Flag | Default | Description |
| ----------------------------- | ------------------ | ----------------------------------------------------------------------------------------------------------------- |
| `-a, --agent <name>` | (all agents) | Name of the agent to expose. If omitted, every agent in the config is exposed as a separate model. |
| `-l, --listen <addr>` | `127.0.0.1:8083` | Address to listen on. |
| `--cors-origin <origin>` | (none) | Allowed CORS origin (e.g. `https://example.com`). Empty disables CORS. |
| `--api-key <token>` | (none) | Required Bearer token clients must present (`Authorization: Bearer <token>`). Empty disables auth. |
| `--api-key-env <name>` | (none) | Read the API key from this environment variable instead of the command line. |
| `--max-request-size <bytes>` | `1048576` (1 MiB) | Maximum request body size. |
| `--request-timeout <dur>` | `5m` | Per-request timeout (covers model + tool calls + streaming). |
| `--conversations-max <n>` | `0` | Cache up to N conversations server-side, keyed by `X-Conversation-Id`. `0` disables — clients must resend history. |
| `--conversation-ttl <dur>` | `30m` | Idle TTL after which a cached conversation is evicted. |
| `--max-idle-runtimes <n>` | `4` | Maximum number of idle runtimes pooled per agent. `0` disables pooling. |

All [runtime configuration flags]({{ '/features/cli/#runtime-configuration-flags' | relative_url }})
(`--working-dir`, `--env-from-file`, `--models-gateway`, `--hook-*`, …) are
also accepted.

## Open WebUI Integration

Open WebUI can talk to any OpenAI-compatible endpoint. To plug docker-agent
in:

1. Start the chat server, optionally with auth:

   ```bash
   $ docker agent serve chat agent.yaml \
       --listen 127.0.0.1:8083 \
       --cors-origin http://localhost:3000 \
       --api-key-env OPEN_WEBUI_TOKEN
   ```

2. In Open WebUI, add an OpenAI-compatible connection:

   - **API Base URL:** `http://127.0.0.1:8083/v1`
   - **API Key:** the value of `OPEN_WEBUI_TOKEN`

3. Each agent in your config appears as a selectable model.

<div class="callout callout-info" markdown="1">
<div class="callout-title">ℹ️ See also
</div>
<p>For the docker-agent–native HTTP API (sessions, tool-call confirmation, runtime events), see the <a href="{{ '/features/api-server/' | relative_url }}">API Server</a>. For full CLI flag documentation, see the <a href="{{ '/features/cli/#docker-agent-serve-chat' | relative_url }}">CLI Reference</a>.</p>

</div>
10 changes: 6 additions & 4 deletions docs/features/cli/index.md
@@ -65,8 +65,8 @@ $ docker agent run [config] [message...] [flags]
 $ docker agent run agent.yaml
 $ docker agent run agent.yaml "Fix the bug in auth.go"
 $ docker agent run agent.yaml -a developer --yolo
-$ docker agent run agent.yaml --model anthropic/claude-sonnet-4-0
-$ docker agent run agent.yaml --model "dev=openai/gpt-4o,reviewer=anthropic/claude-sonnet-4-0"
+$ docker agent run agent.yaml --model anthropic/claude-sonnet-4-5
+$ docker agent run agent.yaml --model "dev=openai/gpt-4o,reviewer=anthropic/claude-sonnet-4-5"
$ docker agent run agent.yaml --session -1 # resume last session
$ docker agent run agent.yaml --prompt-file ./context.md # include file as context

@@ -265,6 +265,8 @@ $ curl http://127.0.0.1:8083/v1/chat/completions \
-d '{"model": "root", "messages": [{"role": "user", "content": "hello"}]}'
```

+See [Chat Server]({{ '/features/chat-server/' | relative_url }}) for the full feature reference.
+
### `docker agent share push` / `docker agent share pull`

Share agents via OCI registries.
@@ -344,7 +346,7 @@ $ docker agent alias add other ociReference
 # Add an alias with runtime options
 $ docker agent alias add yolo-coder agentcatalog/coder --yolo
 $ docker agent alias add fast-coder agentcatalog/coder --model openai/gpt-4o-mini
-$ docker agent alias add turbo agentcatalog/coder --yolo --model anthropic/claude-sonnet-4-0
+$ docker agent alias add turbo agentcatalog/coder --yolo --model anthropic/claude-sonnet-4-5

# Use an alias
$ docker agent run pirate
@@ -364,7 +366,7 @@ $ docker agent alias ls
 Registered aliases (3):

 fast-coder → agentcatalog/coder [model=openai/gpt-4o-mini]
-turbo → agentcatalog/coder [yolo, model=anthropic/claude-sonnet-4-0]
+turbo → agentcatalog/coder [yolo, model=anthropic/claude-sonnet-4-5]
yolo-coder → agentcatalog/coder [yolo]

Run an alias with: docker agent run <alias>
4 changes: 2 additions & 2 deletions docs/features/mcp-mode/index.md
@@ -108,14 +108,14 @@ When you expose a multi-agent configuration via MCP, each agent becomes a separa
 ```yaml
 agents:
   root:
-    model: anthropic/claude-sonnet-4-0
+    model: anthropic/claude-sonnet-4-5
     description: Main coordinator
     sub_agents: [designer, engineer]
   designer:
     model: openai/gpt-5-mini
     description: UI/UX design specialist
   engineer:
-    model: anthropic/claude-sonnet-4-0
+    model: anthropic/claude-sonnet-4-5
     description: Software engineer
 ```

2 changes: 1 addition & 1 deletion docs/features/remote-mcp/index.md
@@ -188,7 +188,7 @@ Combine multiple remote MCP servers in a single agent:
 ```yaml
 agents:
   root:
-    model: anthropic/claude-sonnet-4-0
+    model: anthropic/claude-sonnet-4-5
     instruction: |
       You help manage projects and deployments.
     toolsets:
2 changes: 1 addition & 1 deletion docs/guides/go-sdk/index.md
@@ -294,7 +294,7 @@ openaiClient, _ := openai.NewClient(ctx, &latest.ModelConfig{
 // Anthropic
 anthropicClient, _ := anthropic.NewClient(ctx, &latest.ModelConfig{
     Provider: "anthropic",
-    Model: "claude-sonnet-4-0",
+    Model: "claude-sonnet-4-5",
 }, env)

 // Google Gemini