feat(client): add streaming robustness and extended thinking support by hakula139 · Pull Request #5 · hakula139/oxide-code

hakula139 · 2026-04-04T16:28:29Z

Summary

Add proper support for extended thinking, redacted thinking, server tool use, and unknown content block types in the SSE streaming pipeline. Enable adaptive thinking by default.

- Add typed `Thinking`, `RedactedThinking`, and `ServerToolUse` variants to `ContentBlockInfo`, `Delta`, `BlockAccumulator`, and `ContentBlock` for proper streaming, accumulation, and API round-tripping
- Add `ThinkingDelta` and `SignatureDelta` to `Delta` (the signature overwrites, it does not append)
- Add `ThinkingConfig` (adaptive-only) in the config module, wired into `CreateMessageRequest`
- Enable adaptive thinking by default (`ThinkingConfig::Adaptive`)
- Add `strip_trailing_thinking`, which strips trailing thinking blocks from the last assistant message (the API rejects messages ending with thinking blocks) and inserts a placeholder for thinking-only responses to preserve user / assistant alternation
- Keep the `#[serde(other)]` `Unknown` catch-all on `StreamEvent`, `ContentBlockInfo`, and `Delta` for unrecognized future types
- Extract `init_accumulator`, `apply_delta`, and `parse_tool_json` helpers from `stream_response`
- Add an optional dimmed thinking display via the `OX_SHOW_THINKING` env var: streams thinking deltas to stdout with ANSI dim styling, off by default
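The delta semantics listed above (thinking text appends, the signature overwrites) can be sketched as follows. This is a minimal stand-alone illustration, not the PR's code: the `ThinkingAccumulator` struct and the tuple-style payloads are assumptions made for the example, and the real `Delta` and `BlockAccumulator` types carry more variants.

```rust
/// Simplified stand-in for the streaming `Delta` type (payload shapes are assumed).
#[derive(Debug, Clone, PartialEq)]
enum Delta {
    ThinkingDelta(String),
    SignatureDelta(String),
    /// Stand-in for the catch-all used for unrecognized delta types.
    Unknown,
}

/// Hypothetical accumulator for a single thinking block.
#[derive(Debug, Default, PartialEq)]
struct ThinkingAccumulator {
    thinking: String,
    signature: String,
}

impl ThinkingAccumulator {
    fn apply_delta(&mut self, delta: Delta) {
        match delta {
            // Thinking text arrives incrementally, so fragments are appended.
            Delta::ThinkingDelta(text) => self.thinking.push_str(&text),
            // The signature is a complete value, so the latest one replaces the old one.
            Delta::SignatureDelta(signature) => self.signature = signature,
            // Unrecognized deltas are ignored rather than treated as errors.
            Delta::Unknown => {}
        }
    }
}
```

Feeding two `ThinkingDelta` fragments followed by two `SignatureDelta` values leaves the concatenated thinking text and only the last signature in the accumulator.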

Design Decisions

- Thinking enabled by default: Adaptive mode lets the model decide the budget. No user configuration is needed; thinking just works.
- Round-trip preservation: `thinking` and `redacted_thinking` blocks are stored in `ContentBlock` and sent back in subsequent API requests, preserving conversation continuity.
- Signature handling: `SignatureDelta` overwrites rather than appends, because the signature is a full cryptographic value, not incremental text. Credential rotation stripping is deferred to the Keychain OAuth PR.
- `ThinkingConfig` in config module: Avoids an inverted dependency where `config.rs` would import from `client::anthropic`. The type conceptually belongs with configuration.
- Adaptive-only: `ThinkingConfig::Enabled` (fixed budget) was removed because no production or planned code path needs it. The variant can be trivially re-added when a fixed-budget mode is actually required (e.g., for older models on 3P providers).
- Thinking-only placeholder: `strip_trailing_thinking` inserts a `[No message content]` placeholder when stripping removes all content, instead of deleting the message. Deletion would break user / assistant alternation, causing consecutive user messages that the API rejects. Matches Claude Code's `filterTrailingThinkingFromLastAssistant` behavior.
- Thinking display opt-in: Off by default since the bare REPL lacks collapsible sections. When enabled (`OX_SHOW_THINKING=1`), the ANSI dim codes are applied to each thinking delta only as it is written to stdout, so the accumulator stores clean text for API round-tripping.
- Variant ordering: `ContentBlock`, `ContentBlockInfo`, and `BlockAccumulator` follow the same variant order, with tool-use variants grouped together (`ToolUse`, `ServerToolUse`), then `ToolResult`, then thinking variants.
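The placeholder decision above can be sketched as below. This is a hedged, stand-alone approximation: the `Role`, `Message`, and `ContentBlock` shapes and their field names are assumptions for illustration, and only the behavior described in this PR (target the last assistant message, pop trailing thinking blocks, insert `[No message content]` if nothing is left) is taken from the source.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

/// Simplified content block; field names are assumed for the example.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    Thinking { thinking: String, signature: String },
    RedactedThinking { data: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: Vec<ContentBlock>,
}

const PLACEHOLDER: &str = "[No message content]";

fn strip_trailing_thinking(messages: &mut [Message]) {
    // Only the last assistant message is inspected.
    let Some(last) = messages.iter_mut().rfind(|m| m.role == Role::Assistant) else {
        return;
    };

    // Pop thinking and redacted-thinking blocks from the end, one at a time.
    let mut stripped = false;
    while matches!(
        last.content.last(),
        Some(ContentBlock::Thinking { .. } | ContentBlock::RedactedThinking { .. })
    ) {
        last.content.pop();
        stripped = true;
    }

    // Keep the message (and the user / assistant alternation) by inserting a
    // placeholder instead of leaving an empty content list or deleting it.
    if stripped && last.content.is_empty() {
        last.content.push(ContentBlock::Text(PLACEHOLDER.to_string()));
    }
}
```

A thinking-only assistant message therefore stays in the history with a single placeholder text block, while non-trailing thinking blocks are left untouched.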

Changes

| File | Description |
| --- | --- |
| `client/anthropic.rs` | `Thinking`, `RedactedThinking`, `ServerToolUse` on `ContentBlockInfo`; `ThinkingDelta`, `SignatureDelta` on `Delta`; `Unknown` catch-alls; `thinking` field on `CreateMessageRequest`; 9 new tests |
| `config.rs` | `ThinkingConfig` enum (adaptive-only); `env_bool` helper; `show_thinking` field from `OX_SHOW_THINKING` env var; 1 new test |
| `main.rs` | `BlockAccumulator` variants for all block types; `init_accumulator`, `apply_delta`, `parse_tool_json` helpers; `strip_trailing_thinking` call; dimmed thinking display gated on `show_thinking`; `ContentBlockStop` handling for thinking newline |
| `message.rs` | `ServerToolUse`, `Thinking`, `RedactedThinking` on `ContentBlock`; variant ordering aligned with `ContentBlockInfo`; `strip_trailing_thinking` targeting last assistant message with placeholder insertion; 9 new tests |
| `CLAUDE.md` | Code Review criteria (DRY, cross-file consistency, idiomatic Rust); test conciseness convention |
| `docs/roadmap.md` | Restructure "Working Today" into subsections; update thinking description; add TOML config file to Next Phase |
| `docs/research/extended-thinking.md` | Research notes on thinking blocks, signatures, round-tripping, content block type taxonomy, and normalization pipeline |

Test plan

- `cargo fmt --all --check`: clean
- `cargo build`: compiles cleanly
- `cargo clippy --all-targets -- -D warnings`: zero warnings
- `cargo test`: 162 tests pass (19 new)
- `cargo llvm-cov --ignore-filename-regex 'main\.rs'`: 86% line coverage
- Manual test with `OX_SHOW_THINKING=1 ox`: thinking text streams dimmed before the response; adaptive mode skips thinking for trivial queries

Add #[serde(other)] catch-all Unknown variants to StreamEvent, ContentBlockInfo, and Delta so that unrecognized types (e.g., thinking, redacted_thinking, signature_delta) deserialize without crashing. Add a Skipped variant to BlockAccumulator that absorbs deltas silently and produces no ContentBlock, keeping the agent loop stable when the API introduces new block types.
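A minimal sketch of the `Skipped` idea described in this commit, assuming a much-reduced `BlockAccumulator` with only a text variant. The `start`, `apply_text`, and `finish` method names are hypothetical and chosen for the example; the point is only that a skipped block absorbs deltas and yields no `ContentBlock`.

```rust
#[derive(Debug, PartialEq)]
enum ContentBlock {
    Text(String),
}

#[derive(Debug)]
enum BlockAccumulator {
    Text(String),
    /// Placeholder for block types this client does not understand.
    Skipped,
}

impl BlockAccumulator {
    /// Chooses an accumulator from the block type reported at block start.
    fn start(block_type: &str) -> Self {
        match block_type {
            "text" => BlockAccumulator::Text(String::new()),
            _ => BlockAccumulator::Skipped,
        }
    }

    /// Applies a text fragment; skipped blocks swallow it silently.
    fn apply_text(&mut self, fragment: &str) {
        match self {
            BlockAccumulator::Text(buf) => buf.push_str(fragment),
            BlockAccumulator::Skipped => {}
        }
    }

    /// Finalizes the block; skipped blocks produce nothing.
    fn finish(self) -> Option<ContentBlock> {
        match self {
            BlockAccumulator::Text(buf) => Some(ContentBlock::Text(buf)),
            BlockAccumulator::Skipped => None,
        }
    }
}
```

Returning `Option<ContentBlock>` from the finalizer lets the caller collect results with `flatten` or `filter_map`, so an unknown block type never reaches the stored conversation.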

Document how Claude Code handles thinking, redacted_thinking, server_tool_use, and signature blocks. Covers streaming lifecycle, round-tripping requirements, credential rotation constraints, and implementation implications for oxide-code.

…_use support

Replace the Unknown catch-all with proper typed variants for thinking, redacted_thinking, and server_tool_use content blocks. Add ThinkingDelta and SignatureDelta to the Delta enum. Add block accumulators that preserve thinking text and signatures for API round-tripping. Enable adaptive thinking by default in Config. Add strip_trailing_thinking to remove thinking blocks from the end of assistant messages before sending (API constraint). Extract init_accumulator and apply_delta helpers from stream_response to keep it under the line limit. Add ThinkingConfig (adaptive / enabled) to CreateMessageRequest, driven by Config.thinking. The Unknown catch-all remains for truly unrecognized future types.

…e status

Replace the stale "Implementation Implications" planning section with a brief inline status note, matching the factual reference style of anthropic-api.md.

ThinkingConfig conceptually belongs with configuration, not the HTTP client. Moving it to config.rs fixes the inverted dependency where config imported from client::anthropic. Also fixes the #[expect(dead_code)] reason string to describe current state per convention, and adds a comment explaining the hardcoded adaptive thinking default.

- Add Debug derive to BlockAccumulator for consistency with parallel enums (ContentBlockInfo, ContentBlock) and diagnostic traceability.
- Log unhandled block/delta combinations at debug level instead of silently dropping them, aiding protocol issue diagnosis.
- Guard trailing newline emission against empty text blocks to prevent spurious output.

- Assert the surviving block type in removes_redacted_at_end (was only checking length, which would pass even if the wrong block survived).
- Add test for multiple consecutive trailing thinking blocks to exercise the while loop.
- Add test for all-thinking assistant message to document the empty content vec edge case.

- Fix roadmap streaming robustness bullet to reflect that thinking, redacted_thinking, and server_tool_use are now fully handled, not just silently skipped.
- Add DRY, cross-file consistency, and idiomatic Rust to the Code Review checklist in CLAUDE.md.

Drop removes_redacted_at_end — it was subsumed by removes_multiple_consecutive, which already exercises both Thinking and RedactedThinking removal through the while loop. Strengthen preserves_non_trailing to assert block identity and order, not just count. Add test conciseness convention to CLAUDE.md: prefer fewer thorough tests over many minimal ones; drop tests subsumed by more comprehensive ones.

strip_trailing_thinking can leave an assistant message with empty content if the response contained only thinking blocks. The API rejects empty content arrays, so filter these out before sending. Also include the block type in the delta mismatch debug trace for better diagnostic context.

Move ContentBlock::ServerToolUse tests before ContentBlock::Thinking to mirror the enum definition order (ToolResult, ServerToolUse, Thinking, RedactedThinking).

The docstring still listed only Text and ToolUse for assistant messages, missing ServerToolUse, Thinking, and RedactedThinking added in this PR.

When OX_SHOW_THINKING=1, stream thinking deltas to stdout with ANSI dim styling (\x1b[2m). Off by default: thinking blocks are accumulated silently for API round-tripping as before.

- Add `show_thinking` field to Config, loaded from OX_SHOW_THINKING env var
- Thread the flag through repl → agent_turn → stream_response → helpers
- Write dim text in init_accumulator (initial thinking) and apply_delta (thinking deltas)
- Handle ContentBlockStop for thinking blocks to emit a trailing newline separating thinking from text output
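The dimmed display can be approximated with the sketch below. The function name `write_thinking_delta` is hypothetical, and the reset sequence `\x1b[0m` is an assumption (the commit message only names the dim code `\x1b[2m`). Writing through a generic `Write` keeps the styling out of whatever buffer accumulates the raw thinking text.

```rust
use std::io::{self, Write};

/// ANSI "dim" styling, as named in the commit message.
const DIM: &str = "\x1b[2m";
/// ANSI reset; assumed here, the source does not say which reset code is used.
const RESET: &str = "\x1b[0m";

/// Writes one thinking delta wrapped in dim styling when the flag is on.
/// The caller keeps accumulating the unstyled `text` separately.
fn write_thinking_delta<W: Write>(
    out: &mut W,
    text: &str,
    show_thinking: bool,
) -> io::Result<()> {
    if !show_thinking || text.is_empty() {
        return Ok(());
    }
    out.write_all(DIM.as_bytes())?;
    out.write_all(text.as_bytes())?;
    out.write_all(RESET.as_bytes())?;
    // Flush so streamed fragments appear immediately on a line-buffered stdout.
    out.flush()
}
```

In a REPL the writer would typically be a locked `io::stdout()`; in tests a `Vec<u8>` works because it also implements `Write`.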

- Update extended thinking bullet to mention OX_SHOW_THINKING.
- Add Configuration File section under Next Phase: TOML config with layered loading (global → user → project → env var overrides).

…iling_thinking

- Move ServerToolUse before ToolResult to align variant order with ContentBlockInfo and BlockAccumulator (tool-use variants grouped).
- Narrow strip_trailing_thinking to target only the last assistant message via rfind (earlier messages were already processed).
- Clarify comment on empty-message removal after thinking stripping.
- Reorder test sections to mirror new variant order.
- Add strip_trailing_thinking_targets_only_last_assistant test.

Use consistent phrasing ("Silently skipped during stream processing") across all three #[serde(other)] Unknown variants: StreamEvent, ContentBlockInfo, and Delta.

Extract the truthiness check (`"1"` / `"true"`) into a reusable `env_bool` function, pairing with `non_empty_env` for string-valued env vars. Simplifies the `show_thinking` assignment and provides a consistent pattern for future `OX_*` boolean flags.
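A possible shape for such a helper is sketched below. The split into `is_truthy` plus `env_bool` is an assumption made so the check can be exercised without touching the process environment, and the exact, case-sensitive match on `"1"` and `"true"` is likewise assumed; the commit only states which two values count as true.

```rust
use std::env;

/// Returns true only for the literal values "1" and "true".
/// Exact, case-sensitive matching is an assumption of this sketch.
fn is_truthy(value: &str) -> bool {
    matches!(value, "1" | "true")
}

/// Reads a boolean flag from the environment.
/// Unset, non-Unicode, or unrecognized values all count as false.
fn env_bool(name: &str) -> bool {
    env::var(name).map(|value| is_truthy(&value)).unwrap_or(false)
}
```

With this shape, a config loader could set `show_thinking` with a single call such as `env_bool("OX_SHOW_THINKING")`.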

Only Adaptive is used — no production or planned code path constructs Enabled. Adding it back is trivial when a fixed-budget thinking mode is actually needed.

Replace write!(stdout, "{text}") with stdout.write_all(text.as_bytes()) where no format interpolation is needed.

…of deleting

Deleting an empty-after-stripping assistant message breaks user/assistant alternation, causing consecutive user messages that the API rejects. Insert a "[No message content]" placeholder instead, matching Claude Code's filterTrailingThinkingFromLastAssistant behavior. Also update research notes with the full normalization pipeline and ordering constraints discovered in Claude Code's source.

Align with the editorial bracket convention and the existing [N chars] marker in truncate_line.

PR #64 (modal infrastructure) shipped Option C: bare /model opens the combined picker, bare /effort errors with a usage hint pointing at /model. The user guide, design notes, and roadmap still described the older "both bare forms open the picker with different initial focus" shape. Updated:

- docs/guide/slash-commands.md: table description, mid-turn classification paragraph, and the "Switching the Effort" / "Switching the Model" sections.
- docs/design/slash/commands.md: design decision #5, /effort and /model per-command notes, source list (`agent_loop_task` → `agent_turn`).
- docs/design/slash/modals.md: design decisions #4 (`SessionInfo` → `LiveSessionInfo`) and #7 (typed-arg-only contract).
- docs/roadmap.md: moved the combined picker out of "Current Focus" (shipped in PR #64) into Working Today; replaced with the deferred /effort slider.
- CLAUDE.md: `slash/effort.rs` description updated to match the typed-arg contract.

hakula139 added the bug label (Something isn't working) Apr 4, 2026

hakula139 self-assigned this Apr 4, 2026

docs(roadmap): move streaming robustness to shipped, mark PR 2.1 done

3417e2f

hakula139 force-pushed the feat/streaming-robustness branch from a962497 to 3417e2f April 4, 2026 16:32

hakula139 force-pushed the feat/streaming-robustness branch from cf08e20 to da8c365 April 4, 2026 16:37

hakula139 changed the title ~~fix(client): handle unknown SSE content block and delta types gracefully~~ feat(client): add streaming robustness and extended thinking support Apr 4, 2026

hakula139 added the enhancement label (New feature or request) and removed the bug label (Something isn't working) Apr 4, 2026

docs(research): update extended thinking notes with current oxide-cod…

62d1dba

hakula139 force-pushed the feat/streaming-robustness branch from 04eb661 to 62d1dba April 4, 2026 16:57

hakula139 added 10 commits April 5, 2026 01:19

style(message): reorder test sections to match enum variant order

5491ba0

docs(message): update ContentBlock docstring for new variants

03ad4a8

docs(roadmap): add thinking display and TOML config file

2ab3fd0

hakula139 force-pushed the feat/streaming-robustness branch from e8d1577 to 2ab3fd0 April 4, 2026 18:43

hakula139 added 6 commits April 5, 2026 19:06

style(client): unify catch-all doc comments on Unknown variants

d643cba

refactor(config): remove unused ThinkingConfig::Enabled variant

e9113aa

perf(main): use write_all for plain text output

8483d89

style(bash): use brackets for truncation marker

343c489

hakula139 merged commit 432e594 into main Apr 5, 2026
1 check passed

hakula139 deleted the feat/streaming-robustness branch April 5, 2026 11:54

hakula139 merged 22 commits into main from feat/streaming-robustness
