fix: yield thinking tag boundaries in content='all' mode by cpsievert · Pull Request #296 · posit-dev/chatlas

cpsievert · 2026-05-07T00:00:19Z

Summary

Removes the if content_mode == "text": guards around tag boundary yield statements in both the sync and async _submit_turns paths
<thinking>\n and \n</thinking>\n\n are now yielded to consumers in all modes, not just content="text"
Updates TestStreamThinkingAll and test_content_all_async to assert that tag boundaries ARE present alongside ContentThinking objects
Adds a test_order_of_chunks test that verifies the exact sequence: open-tag string → ContentThinking → close-tag string → response text

Motivation

shinychat PR posit-dev/shinychat#210 removes server-side thinking detection and relies entirely on the client's tag parser seeing <thinking> markers in the stream. Tool-use apps must use content="all" mode. Without this fix those apps see thinking content rendered as regular assistant text because the <thinking> / </thinking> boundary strings are never passed downstream.

Test plan

All 11 tests in tests/test_stream_thinking.py pass
uv run pyright chatlas/_chat.py → 0 errors, 0 warnings

The streaming loop now emits `<thinking>\n` before the first thinking chunk and `\n</thinking>\n\n` on transition to non-thinking content (or at end of stream), giving consumers well-formed output. For `content="text"` mode, tags are yielded as string chunks so concatenated output is properly delimited. For `content="all"` mode, behavior is unchanged — typed ContentThinking objects are yielded. Also removes the synthetic "\n\n" separator from the OpenAI provider's reasoning_summary_text.done event since the thinking→text transition now provides the visual break. Companion to tidyverse/ellmer#975.

Streaming chunks are fragments, not complete thoughts. Adding a _complete PrivateAttr (default True) lets __str__() skip tag wrapping for chunks emitted during streaming, preventing repeated <thinking>...</thinking> around each fragment in content="all" mode. Providers now use ContentThinking._as_chunk() for streaming fragments.

Previously `<thinking>\n` and `\n</thinking>\n\n` boundary strings were only yielded to consumers when `content="text"`. Remove that guard so they are emitted in all modes, including `content="all"`. This is required by shinychat PR posit-dev/shinychat#210, which removes server-side thinking detection and relies on the client tag parser seeing `<thinking>` markers in the stream. Without this fix, tool-use apps (which require `content="all"`) with thinking-capable models render thinking content as regular assistant text. Both the sync and async `_submit_turns` paths are updated. Tests in `TestStreamThinkingAll` and the async `test_content_all_async` are updated to assert that tag boundary strings ARE present in the output.

cpsievert · 2026-05-07T00:02:49Z

Superseded by a rebased PR on a clean branch.

cpsievert added 4 commits May 6, 2026 17:15

docs: add changelog entry for streaming thinking tag boundaries

13b5b5f

cpsievert closed this May 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: yield thinking tag boundaries in content='all' mode#296

fix: yield thinking tag boundaries in content='all' mode#296
cpsievert wants to merge 4 commits intomainfrom
fix/streaming-thinking-tags

cpsievert commented May 7, 2026

Uh oh!

cpsievert commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

cpsievert commented May 7, 2026

Summary

Motivation

Test plan

Uh oh!

cpsievert commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant