Skip to content

refactor(anthropic): fix N+1 thinking message storage issue#7958

Merged
DOsinga merged 1 commit intoblock:mainfrom
rabi:anthropic_thinking
Mar 20, 2026
Merged

refactor(anthropic): fix N+1 thinking message storage issue#7958
DOsinga merged 1 commit intoblock:mainfrom
rabi:anthropic_thinking

Conversation

@rabi
Copy link
Contributor

@rabi rabi commented Mar 17, 2026

Summary

Thinking delta handling was initially proposed in this PR, but #7626 merged afterwards seems to have added it now. This adds certain optimizations proposed earlier in this PR.

As thinking_delta is yielded and stored individually, then the complete block re-emitted at content_block_stop, N+1 entries per thinking block end up in both database and session memory though they're filtered in format_messages when replaying.

Partial thinking deltas serves no purpose without a signature and the UI renders thinking as a complete block rather than incrementally. Emit once at content_block_stop.

Replace JSON delta comparisons with typed ContentDelta enum. Add streaming tests.

Testing

Locally tested with anthropic provider.

@rabi rabi force-pushed the anthropic_thinking branch from ae84c8a to c95de17 Compare March 18, 2026 04:12
@rabi rabi changed the title fix(anthropic): handle thinking/signature content in Anthropic streaming refactor(anthropic): fix N+1 thinking message storage bug Mar 18, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c95de17474

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

The thinking delta handling was proposed in this PR initially,
but block#7626 merged first. This adds certain optimizations proposed
earlier in this PR.

As thinking_delta is yielded and stored individually, then the
complete block re-emitted at content_block_stop — N+1 entries per
thinking block end up in both database and session memory.

Streaming partial thinking deltas serves no purpose: without a
signature they cannot be sent back to the API (format_messages
drops them), and the UI renders thinking as a complete block
rather than incrementally.

Emit once at content_block_stop. Replace JSON delta comparisons
with typed ContentDelta enum. Add streaming tests.

Change-Id: Id42262e30f1eea8d24f1ddb87381395714a91821
Signed-off-by: Rabi Mishra <mishra.rabi@gmail.com>
@rabi rabi force-pushed the anthropic_thinking branch from c95de17 to fc16a5e Compare March 18, 2026 04:28
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc16a5e473

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@rabi rabi changed the title refactor(anthropic): fix N+1 thinking message storage bug refactor(anthropic): fix N+1 thinking message storage issue Mar 18, 2026
Copy link
Collaborator

@DOsinga DOsinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is correct — buffering thinking deltas and emitting a single complete message at content_block_stop is the right approach. The N+1 storage issue is real and this resolves it cleanly. The ContentDelta enum is a nice improvement over stringly-typed JSON checks. Tests follow the existing run_streaming_test pattern and test real behaviour. Good work!

@DOsinga DOsinga added this pull request to the merge queue Mar 20, 2026
Merged via the queue into block:main with commit d257600 Mar 20, 2026
21 checks passed
elijahsgh pushed a commit to elijahsgh/goose that referenced this pull request Mar 21, 2026
Signed-off-by: Rabi Mishra <mishra.rabi@gmail.com>
Signed-off-by: esnyder <elijah.snyder1@gmail.com>
elijahsgh pushed a commit to elijahsgh/goose that referenced this pull request Mar 21, 2026
Signed-off-by: Rabi Mishra <mishra.rabi@gmail.com>
Signed-off-by: esnyder <elijah.snyder1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants