/v1/chat/completions streams thinking into content instead of reasoning_content when no tools are requested #13

@luisi-tech

Description

This may not be an issue for most people, and I'm not savvy enough to know whether other chat hosts behave the same way, but I ran into it while using ds4-server with openai and asked Opus to summarize what it found:

Streaming chat requests without a tools field bypass the structured OpenAI streamer and fall through to the plain SSE path, which emits every token (including the literal </think> closer) as content deltas. Reasoning never lands in reasoning_content, and clients like Jan that look for balanced <think>...</think> tags see only the closer.

Repro:

curl -N http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"deepseek-v4-flash","messages":[{"role":"user","content":"What is 17 * 23?"}],"stream":true}'

Observed: every delta is {"content":"..."}, including one {"content":"</think>"} mid-stream.
Expected: thinking tokens in reasoning_content deltas, no literal </think> on the wire.
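To make the expected behavior concrete, here is a minimal sketch of the routing a streamer would need. It is not code from ds4_server.c: the names think_router, think_router_route, and delta_kind are hypothetical. It assumes the closer is the literal </think> tag, that it arrives as a single whole token (as in the observed stream), and that generation starts inside the thinking section because the opener is never seen on the wire.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not ds4_server.c code. */
typedef enum { DELTA_REASONING, DELTA_CONTENT, DELTA_NONE } delta_kind;

typedef struct {
    bool in_thinking; /* assumed true at start: the opener is not streamed */
} think_router;

static void think_router_init(think_router *r) {
    assert(r != NULL);
    r->in_thinking = true;
}

/* Decide which delta field a decoded token belongs in.
 * Assumption: the closer shows up as one whole token. */
static delta_kind think_router_route(think_router *r, const char *token) {
    assert(r != NULL && token != NULL);
    if (r->in_thinking) {
        if (strcmp(token, "</think>") == 0) {
            r->in_thinking = false;
            return DELTA_NONE;      /* swallow the closer, emit nothing */
        }
        return DELTA_REASONING;     /* goes into reasoning_content */
    }
    return DELTA_CONTENT;           /* goes into content */
}
```

With this routing, the tokens before the closer would be sent as reasoning_content deltas, the closer itself would produce no delta, and everything after it would be sent as content. A closer split across several tokens would need buffering, which this sketch leaves out.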
Cause: in ds4_server.c around line 4964, structured_stream and openai_live_tools both require j->req.has_tools, so non-tool chats skip openai_sse_stream_update (which already handles thinking correctly) and use the plain sse_chunk path at line 5083.
Fix: dropping the has_tools requirement, so that the OpenAI structured streamer runs for all OpenAI streaming chats, fixes it; verified locally against Jan.
