fix: handle reasoning_content for Kimi/thinking models#7252

Merged
michaelneale merged 7 commits into block:main from clayarnoldg2m:fix/reasoning-content-kimi-thinking-models
Feb 16, 2026

Conversation

@clayarnoldg2m
Contributor

Summary

  • Preserves reasoning_content when splitting parallel tool calls in agent.rs — providers like Kimi require it on all assistant messages with tool_calls when thinking mode is enabled
  • Omits reasoning_content field entirely when empty in format_messages instead of sending "" — Kimi rejects empty reasoning_content
  • Properly accumulates reasoning_content chunks during streaming and emits as MessageContent::reasoning()
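The empty-field rule from the second bullet can be sketched as follows. This is a minimal std-only sketch of the serialization logic, not the actual `format_messages` code; the helper name and map-of-strings shape are illustrative:

```rust
use std::collections::BTreeMap;

/// Build the key/value pairs for an assistant message, adding
/// `reasoning_content` only when it is non-empty. Kimi rejects an
/// empty `reasoning_content: ""`, so the key must be omitted entirely.
fn assistant_fields(content: &str, reasoning: &str) -> BTreeMap<String, String> {
    let mut msg = BTreeMap::new();
    msg.insert("role".to_string(), "assistant".to_string());
    msg.insert("content".to_string(), content.to_string());
    if !reasoning.is_empty() {
        // Only present when there is actual reasoning text.
        msg.insert("reasoning_content".to_string(), reasoning.to_string());
    }
    msg
}

fn main() {
    let without = assistant_fields("hi", "");
    // Key absent, rather than mapped to "".
    assert!(without.get("reasoning_content").is_none());
    println!("fields: {:?}", without);
}
```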

Context

Fixes #6902. Kimi (Moonshot) and other thinking models that use reasoning_content (similar to DeepSeek) were failing because:

  1. When parallel tool calls are split into separate messages, the reasoning content was lost
  2. An empty reasoning_content: "" was sent even when no reasoning occurred, which Kimi rejects
  3. Streaming reasoning_content chunks were not accumulated, so reasoning was lost in streamed responses
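The third failure mode — streaming deltas never being accumulated — comes down to concatenating the `reasoning_content` fragments across chunks and emitting them once at the end. A minimal sketch, with the function name and the slice-of-options input being illustrative stand-ins for the real streaming loop:

```rust
/// Accumulate `reasoning_content` deltas from a stream of chunks.
/// Each streamed delta may or may not carry a reasoning fragment; the
/// fragments must be concatenated and emitted as one reasoning block.
fn accumulate_reasoning(deltas: &[Option<&str>]) -> Option<String> {
    let mut acc = String::new();
    for fragment in deltas.iter().flatten() {
        acc.push_str(fragment);
    }
    // No reasoning at all: return None so no empty block is emitted.
    if acc.is_empty() { None } else { Some(acc) }
}

fn main() {
    let chunks = [Some("Let me "), None, Some("think.")];
    println!("{:?}", accumulate_reasoning(&chunks));
}
```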

Reference: PR #6962 identified these issues. This PR implements the three core fixes cleanly.

Test plan

  • cargo test -p goose --lib — 666 tests pass
  • cargo clippy -p goose --all-targets -- -D warnings — clean
  • Manual verification with DeepSeek reasoning model (reasoning still works)
  • Manual verification with Kimi thinking model (tool calls + reasoning work)

🤖 Generated with Claude Code

clayarnoldg2m and others added 7 commits February 16, 2026 22:39
…oning_content

xAI (and other providers like DeepSeek) can return reasoning_content alongside
text content, producing 2+ content items. The test now accepts >= 1 items and
verifies at least one is Text, instead of requiring exactly 1 item.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
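The relaxed assertion described above can be sketched like this — `Content` and `response_is_valid` are hypothetical stand-ins for goose's actual message-content types, not the real test code:

```rust
/// Hypothetical content item, standing in for goose's message content enum.
#[derive(Debug)]
enum Content {
    Text(String),
    Reasoning(String),
}

/// Relaxed check: a response may carry reasoning_content alongside text,
/// producing 2+ items. Require at least one item, with at least one Text,
/// instead of requiring exactly one item.
fn response_is_valid(items: &[Content]) -> bool {
    !items.is_empty() && items.iter().any(|c| matches!(c, Content::Text(_)))
}

fn main() {
    let items = [Content::Reasoning("hmm".into()), Content::Text("hi".into())];
    println!("valid: {}", response_is_valid(&items));
}
```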
The submit filter in ProviderConfiguationModal only included fields where
the user had typed a new value (entry.value), causing untouched fields like
API Host to be silently dropped. Now the filter also includes fields with
existing non-masked server values (entry.serverValue), preventing config
reversion when only some fields are edited.

Fixes block#7245

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
Replace provider.complete() with provider.complete_with_model() using
explicit max_tokens of 16384 in both generate_new_app_content() and
generate_updated_app_content(). This prevents app HTML from being
truncated when the provider's default max_tokens is too low.

Fixes block#7239

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
Add a Some("thinking") arm to parse_stream_json_response() that
extracts thinking content from Gemini CLI stream events and creates
MessageContent::Thinking entries. Without this, thinking blocks were
silently dropped, causing truncated responses.

Includes tests for thinking block parsing and no-thinking fallback.

Fixes block#7203

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
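The new `Some("thinking")` arm can be sketched as below. The `Parsed` enum and `parse_event` are illustrative, not the actual `parse_stream_json_response` signature; the point is that an unmatched event type previously fell through and the thinking text was dropped:

```rust
/// Hypothetical parsed result for a single stream-json event.
#[derive(Debug, PartialEq)]
enum Parsed {
    Text(String),
    Thinking(String),
    Ignored,
}

/// Dispatch on the event's "type" field from a Gemini CLI stream event.
fn parse_event(event_type: Option<&str>, text: &str) -> Parsed {
    match event_type {
        Some("text") => Parsed::Text(text.to_string()),
        // New arm: without it, thinking content fell through to Ignored
        // and was silently dropped, truncating the response.
        Some("thinking") => Parsed::Thinking(text.to_string()),
        _ => Parsed::Ignored,
    }
}

fn main() {
    println!("{:?}", parse_event(Some("thinking"), "planning the edit"));
}
```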
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
Three fixes for reasoning_content handling:

1. agent.rs: Preserve reasoning_content when splitting parallel tool calls.
   Providers like Kimi require reasoning_content on all assistant messages
   with tool_calls when thinking mode is enabled.

2. openai.rs format_messages: Omit reasoning_content field entirely when
   empty instead of sending empty string. Kimi rejects empty
   reasoning_content ("").

3. openai.rs streaming: Properly accumulate reasoning_content chunks
   across streaming deltas and emit as MessageContent::reasoning().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: clayarnoldg2m <carnold@g2m.ai>
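Fix 1 above can be sketched as follows — `AssistantMsg` and `split_tool_calls` are illustrative stand-ins for the agent.rs types, showing only the preservation rule:

```rust
/// Hypothetical assistant message produced when splitting parallel tool calls.
#[derive(Debug, Clone, PartialEq)]
struct AssistantMsg {
    tool_call: String,
    reasoning_content: Option<String>,
}

/// Split parallel tool calls into one assistant message per call, copying
/// reasoning_content onto every resulting message. Kimi requires it on all
/// assistant messages with tool_calls when thinking mode is enabled.
fn split_tool_calls(calls: &[&str], reasoning: Option<&str>) -> Vec<AssistantMsg> {
    calls
        .iter()
        .map(|call| AssistantMsg {
            tool_call: call.to_string(),
            // Preserved on each split message, not just the first.
            reasoning_content: reasoning.map(|r| r.to_string()),
        })
        .collect()
}

fn main() {
    let msgs = split_tool_calls(&["shell", "read_file"], Some("plan"));
    println!("{:?}", msgs);
}
```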
@michaelneale
Collaborator

Full CI passed on branch PR: #7262

@michaelneale left a comment


nice - tested it in a branch

@michaelneale added this pull request to the merge queue Feb 16, 2026
Merged via the queue into block:main with commit 05d2af4 Feb 16, 2026
36 checks passed
michaelneale added a commit that referenced this pull request Feb 17, 2026
* main:
  fix: handle reasoning_content for Kimi/thinking models (#7252)
  feat: sandboxing for macos (#7197)
  fix(otel): use monotonic_counter prefix and support temporality env var (#7234)
  Streaming markdown (#7233)
  Improve compaction messages to enable better post-compaction agent behavior (#7259)
  fix: avoid shell-escaping special characters except quotes (#7242)
tlongwell-block added a commit that referenced this pull request Feb 17, 2026
* origin/main:
  docs: playwright CLI skill tutorial (#7261)
  install node in goose dir (#7220)
  fix: relax test_basic_response assertion for providers returning reasoning_content (#7249)
  fix: handle reasoning_content for Kimi/thinking models (#7252)
  feat: sandboxing for macos (#7197)
  fix(otel): use monotonic_counter prefix and support temporality env var (#7234)
  Streaming markdown (#7233)
  Improve compaction messages to enable better post-compaction agent behavior (#7259)
  fix: avoid shell-escaping special characters except quotes (#7242)
  fix: use dynamic port for Tetrate auth callback server (#7228)
  docs: removing LLM Usage admonitions (#7227)
  feat(otel): respect standard OTel env vars for exporter selection (#7144)
  fix: fork session (#7219)
  Bump version numbers for 1.24.0 release (#7214)
  Move platform extensions into their own folder (#7210)
  fix: ignore deprecated skills extension (#7139)

# Conflicts:
#	Cargo.lock
#	Cargo.toml
zanesq added a commit that referenced this pull request Feb 17, 2026
…led-extensions-cmd

* 'main' of github.com:block/goose: (24 commits)
  Set up direnv and update flake inputs (#6526)
  fix: restore subagent tool call notifications after summon refactor (#7243)
  fix(ui): preserve server config values on partial provider config save (#7248)
  fix(claude-code): allow goose to run inside a Claude Code session (#7232)
  fix(openai): route gpt-5 codex via responses and map base paths (#7254)
  feat: add GoosePlatform to AgentConfig and MCP initialization (#6931)
  Fix copied over (#7270)
  feat(gemini-cli): add streaming support via stream-json events (#7244)
  fix: filter models without tool support from recommended list (#7198)
  fix(google): handle more thoughtSignature vagaries during streaming (#7204)
  docs: playwright CLI skill tutorial (#7261)
  install node in goose dir (#7220)
  fix: relax test_basic_response assertion for providers returning reasoning_content (#7249)
  fix: handle reasoning_content for Kimi/thinking models (#7252)
  feat: sandboxing for macos (#7197)
  fix(otel): use monotonic_counter prefix and support temporality env var (#7234)
  Streaming markdown (#7233)
  Improve compaction messages to enable better post-compaction agent behavior (#7259)
  fix: avoid shell-escaping special characters except quotes (#7242)
  fix: use dynamic port for Tetrate auth callback server (#7228)
  ...
katzdave added a commit to YusukeShimizu/goose that referenced this pull request Feb 17, 2026
* origin/main: (263 commits)
  working_dir usage more clear in add_extension (block#6958)
  Use Canonical Models to set context window sizes (block#6723)
  Set up direnv and update flake inputs (block#6526)
  fix: restore subagent tool call notifications after summon refactor (block#7243)
  fix(ui): preserve server config values on partial provider config save (block#7248)
  fix(claude-code): allow goose to run inside a Claude Code session (block#7232)
  fix(openai): route gpt-5 codex via responses and map base paths (block#7254)
  feat: add GoosePlatform to AgentConfig and MCP initialization (block#6931)
  Fix copied over (block#7270)
  feat(gemini-cli): add streaming support via stream-json events (block#7244)
  fix: filter models without tool support from recommended list (block#7198)
  fix(google): handle more thoughtSignature vagaries during streaming (block#7204)
  docs: playwright CLI skill tutorial (block#7261)
  install node in goose dir (block#7220)
  fix: relax test_basic_response assertion for providers returning reasoning_content (block#7249)
  fix: handle reasoning_content for Kimi/thinking models (block#7252)
  feat: sandboxing for macos (block#7197)
  fix(otel): use monotonic_counter prefix and support temporality env var (block#7234)
  Streaming markdown (block#7233)
  Improve compaction messages to enable better post-compaction agent behavior (block#7259)
  ...

# Conflicts:
#	crates/goose/src/providers/openai.rs
zanesq added a commit that referenced this pull request Feb 17, 2026
…ions-fallback

* 'main' of github.com:block/goose: (43 commits)
  Added cmd to validate bundled extensions json (#7217)
  working_dir usage more clear in add_extension (#6958)
  Use Canonical Models to set context window sizes (#6723)
  Set up direnv and update flake inputs (#6526)
  fix: restore subagent tool call notifications after summon refactor (#7243)
  fix(ui): preserve server config values on partial provider config save (#7248)
  fix(claude-code): allow goose to run inside a Claude Code session (#7232)
  fix(openai): route gpt-5 codex via responses and map base paths (#7254)
  feat: add GoosePlatform to AgentConfig and MCP initialization (#6931)
  Fix copied over (#7270)
  feat(gemini-cli): add streaming support via stream-json events (#7244)
  fix: filter models without tool support from recommended list (#7198)
  fix(google): handle more thoughtSignature vagaries during streaming (#7204)
  docs: playwright CLI skill tutorial (#7261)
  install node in goose dir (#7220)
  fix: relax test_basic_response assertion for providers returning reasoning_content (#7249)
  fix: handle reasoning_content for Kimi/thinking models (#7252)
  feat: sandboxing for macos (#7197)
  fix(otel): use monotonic_counter prefix and support temporality env var (#7234)
  Streaming markdown (#7233)
  ...

# Conflicts:
#	crates/goose/src/config/extensions.rs

Development

Successfully merging this pull request may close these issues.

[Bug] Kimi k2.5 API Error: "thinking is enabled but reasoning_content is missing" in tool calls
