fix(agents): Agent Chat streaming fails on Ollama (400 'invalid message content type: <nil>')

## Summary

Interactive **Agent Chat** (Chat page → WebSocket `/agents/stream` → `AgentService.stream_chat` → `agent.run_stream`) fails on a local-Ollama agent. Both the primary (`ollama:qwen3:8b`) and fallback (`ollama:llama3.1:8b`) return the same Ollama-side 400, so `FallbackModel` raises a `FallbackExceptionGroup` and the UI shows *"Stream error: All models from FallbackModel failed (2 sub-exceptions)"*.

## Exact error (both sub-exceptions)

```
openai.BadRequestError: 400 - {'message': 'invalid message content type: <nil>', 'type': 'invalid_request_error'}
pydantic_ai.exceptions.ModelHTTPError: status_code: 400, model_name: qwen3:8b   (and llama3.1:8b)
```

Raised from `pydantic_ai/models/openai.py:request_stream → _completions_create` (`stream=True`) against Ollama's OpenAI-compatible `/v1/chat/completions`. Fast fail (~1s), first turn (`history_length=0`).

## Root cause

Streaming-path incompatibility between PydanticAI's `OpenAIChatModel.request_stream` (OpenAI client + `OllamaProvider`, base_url `…/v1`) and Ollama's OpenAI-compat endpoint: a message in the streamed request is serialized with `content: null`, which Ollama rejects (stricter than the real OpenAI API, which tolerates it).

## Key distinction (verified)

- The **non-streaming** path (`agent.run()` in `AgentService.chat`, used by the showcase `agent_hitl_flow` step) **works** on Ollama (ran ~15 s, HTTP 200, tokens returned).
- Only the **streaming** path (`agent.run_stream` in `AgentService.stream_chat`) hits the 400.
- **Cloud** providers stream fine (Gemini streamed earlier; it only failed on quota). So this is Ollama-OpenAI-compat + streaming specific — not a key/quota/model-name issue, and unrelated to the #336 (pending_action) / #340 (key-present) fixes.

## Proposed fix

Add a **non-streaming fallback for the `ollama` provider** in `AgentService.stream_chat`: when `agent_default_model` (and/or fallback) is an `ollama:` model, run the turn with `agent.run()` and emit the result through the existing event path — one `text_delta` with the full text, then the existing `deps.pending_action` → `approval_required` handling, then `complete`. This sidesteps the broken streamed request while preserving the WS contract and the #336 HITL approval flow.

- Cloud providers keep the true streaming path (must remain unaffected).
- Alternative considered: sanitize the null-content message before the streamed request — rejected as brittle (it lives in PydanticAI/openai-client serialization). A PydanticAI version bump is out of scope (stop-and-ask per AGENTS.md).

## Tests

- `stream_chat` with an `ollama:*` `agent_default_model` emits `text_delta` + `complete` (and `approval_required` when a gated tool fires) without calling `run_stream`.
- `stream_chat` with a cloud `agent_default_model` still uses `run_stream` (regression guard).

## Notes

Found 2026-06-01 testing the #336 HITL approval card via Chat on a local-Ollama stack. Compounding local limitation: even on the working non-streaming path, qwen3:8b (8B) often doesn't emit tool calls (see related work).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agents): Agent Chat streaming fails on Ollama (400 'invalid message content type: <nil>') #342

Summary

Exact error (both sub-exceptions)

Root cause

Key distinction (verified)

Proposed fix

Tests

Notes

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

fix(agents): Agent Chat streaming fails on Ollama (400 'invalid message content type: <nil>') #342

Description

Summary

Exact error (both sub-exceptions)

Root cause

Key distinction (verified)

Proposed fix

Tests

Notes

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions