Description
Problem
When a client sends `stream: true` in a `/v1/chat/completions` request, the response either fails with a 500 error or is silently dropped, depending on the provider.
Root cause
`chat_handler` does not support streaming: it always calls `g_app.chat_completion()` and returns the result via `web.json_response()`. However, `process_chat()` only sets `stream=false` when the field is absent from the request:

```python
if "stream" not in chat:
    chat["stream"] = False
```

When a client explicitly sends `stream=true`:
- Provider-side failure: the upstream provider (e.g. Ollama) returns SSE (`text/event-stream`). `response_json()` tries to parse the SSE body as JSON, fails with "Expecting value: line 1 column 1 (char 0)", and the exception surfaces as an HTTP 500.
- Client-side failure: even if the provider call succeeds (e.g. with providers that ignore the stream flag), `chat_handler` returns plain JSON. Streaming clients like the Vercel AI SDK (`@ai-sdk/openai-compatible`) expect `text/event-stream` with `chat.completion.chunk` objects containing a `delta` field (see the example after this list). They silently discard the unexpected JSON and report no response.
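For reference, this is roughly what such a client expects on the wire; the id, timestamp, and model values below are illustrative:

```text
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1700000000,"model":"<any-model>","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1700000000,"model":"<any-model>","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```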
Impact
Any client that sends `stream: true` by default (most OpenAI-compatible SDKs do, including OpenClaw's embedded agent) gets either 500 errors or empty responses.
Reproduction
```bash
# From any client pointing at llmspy:
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"<any-model>","messages":[{"role":"user","content":"hi"}],"stream":true}'
# Returns 500 or provider-dependent error
```

Proposed fix
- Force `stream=false` unconditionally in `process_chat()` so providers always return parseable JSON.
- In `chat_handler`, detect the client's original `stream` preference and, when `true`, convert the JSON response to SSE chunks in `chat.completion.chunk` format before sending (a sketch follows this list).
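A minimal sketch of the conversion step, assuming the project uses aiohttp (as `web.json_response()` suggests) and that the upstream call already returned a standard `chat.completion` object. The names `stream_chat_response` and `client_wants_stream` are illustrative, not the names used in the PR:

```python
import json
import time

from aiohttp import web


async def stream_chat_response(request: web.Request, completion: dict) -> web.StreamResponse:
    """Replay a non-streaming chat.completion object as SSE chunks.

    Called only when the client originally sent stream=true; process_chat()
    would first remember that preference (e.g. client_wants_stream) and then
    force chat["stream"] = False before contacting the provider.
    """
    resp = web.StreamResponse(
        status=200,
        headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
    )
    await resp.prepare(request)

    base = {
        "id": completion.get("id", "chatcmpl-0"),
        "object": "chat.completion.chunk",
        "created": completion.get("created", int(time.time())),
        "model": completion.get("model", ""),
    }
    for choice in completion.get("choices", []):
        index = choice.get("index", 0)
        content = choice.get("message", {}).get("content", "")
        # One chunk carrying the full message as a single delta, then a
        # terminating chunk with an empty delta and the finish_reason.
        chunks = [
            {**base, "choices": [{"index": index,
                                  "delta": {"role": "assistant", "content": content},
                                  "finish_reason": None}]},
            {**base, "choices": [{"index": index, "delta": {},
                                  "finish_reason": choice.get("finish_reason", "stop")}]},
        ]
        for chunk in chunks:
            await resp.write(f"data: {json.dumps(chunk)}\n\n".encode("utf-8"))

    await resp.write(b"data: [DONE]\n\n")
    await resp.write_eof()
    return resp
```

`chat_handler` would then branch on the remembered preference: `web.json_response()` for non-streaming clients, the SSE path above otherwise.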
See PR for implementation.