refactor(llm-access-kiro): split anthropic/stream.rs into submodules by acking-you · Pull Request #10 · acking-you/static_flow

acking-you · 2026-05-30T22:49:15Z

What

Split the 2546-line llm-access-kiro/src/anthropic/stream.rs into a
stream/mod.rs facade plus 6 concern-focused submodules. Pure structural move —
no logic changes. Next step in the incremental llm-access* refactor (follows
#7 cache_sim, #4 request.rs).

Module layout

 Kiro Event
    -> [usage]           input-token resolution + usage JSON
    -> [sse_event]       SSE wire value
    -> [signature]       synthetic thinking signatures
    -> [inline_thinking] <thinking> tag extraction
    -> [state]           SSE block state machine
    -> [context]         StreamContext / BufferedStreamContext drivers

File	Holds
`usage.rs`	`KiroInputTokenSource`, `KIRO_CONTEXT_USAGE_MIN_REQUEST_TOKENS`, `anthropic_usage_json`, `resolve_input_tokens[_with_threshold]`
`sse_event.rs`	`SseEvent` + impl
`signature.rs`	`synthetic_thinking_signature` + proto encoders / derive helpers / wire-format consts
`inline_thinking.rs`	`build_inline_thinking_content_blocks`, `strip`/`split`, quote-aware tag scanners, `InlineThinkingBlock`
`state.rs`	`SseStateManager`, `BlockState`
`context.rs`	`StreamContext`, `BufferedStreamContext` + context-only helpers (`estimate_tokens`, `find_char_boundary`, `canonicalize_structured_output_json`, `SYNTHETIC_THINKING_PLACEHOLDER`)

Module style (per the agreed standard)

mod.rs convention; no use super::* in submodules — explicit origin imports.
No pub use X::* globs: mod.rs re-exports only the 9 items used outside
the module, by name, so the stream::X public surface (consumed by
llm-access provider.rs and the anthropic/websearch.rs sibling) is
unchanged.
StreamContext.state_manager is a pub field of type SseStateManager, so
SseStateManager must stay pub-reachable (E0446) — re-exported by name.
Minimal visibility: cross-module items get pub + /// doc (state machine's
is_block_open_of_type/has_non_thinking_blocks, the inline-thinking tag
scanners + strip, synthetic_thinking_signature, and the 6 signature
wire-format consts the shape test asserts against). pub(super) items now used
only within stream were narrowed to private (InlineThinkingBlock,
split_inline_thinking_content). No pub(crate).
Tests inlined: 7 usage tests, the SSE-format test, and the inline-thinking
split test move into their submodules; the StreamContext integration cluster
- shared proto-shape verification stay in context.rs (it's the only place
  that can reach the private process_assistant_response 3 tests exercise).

Verification

Item inventory diff vs original: 0 dropped, 0 added; #[test] count
31 = 31.
cargo clippy -p llm-access-kiro -- -D warnings (lib and --tests): clean.
cargo test -p llm-access-kiro: 163 passed.
Reverse-dep cargo build -p llm-access: compiles against the unchanged public
surface.
rustfmt on the 7 new files only; deps/lance & deps/lancedb stayed clean.

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request refactors the Anthropic-compatible SSE stream adapter by splitting the monolithic context.rs file into several focused submodules: inline_thinking, signature, sse_event, state, and usage, coordinated by a new mod.rs. This significantly improves code modularity and maintainability. Feedback on the changes suggests optimizing strip_inline_thinking_content in inline_thinking.rs by collecting the filtered iterator directly into a String to avoid a temporary Vec<String> allocation.

gemini-code-assist · 2026-05-30T22:50:14Z

+pub fn strip_inline_thinking_content(content: &str) -> String {
+    split_inline_thinking_content(content, true)
+        .into_iter()
+        .filter_map(|block| match block {
+            InlineThinkingBlock::Text(text) => Some(text),
+            InlineThinkingBlock::Thinking(_) => None,
+        })
+        .collect::<Vec<_>>()
+        .join("")
+}


Avoid allocating a temporary Vec<String> and then joining them. Instead, you can collect the iterator directly into a String using .collect::<String>(), which is more efficient and idiomatic in Rust.

pub fn strip_inline_thinking_content(content: &str) -> String { split_inline_thinking_content(content, true) .into_iter() .filter_map(|block| match block { InlineThinkingBlock::Text(text) => Some(text), InlineThinkingBlock::Thinking(_) => None, }) .collect::<String>() }

References

Avoid allocating temporary collections (such as Vec<String>) and cloning elements solely for hashing. Instead, hash borrowed segments directly by chaining iterators or updating the hasher incrementally, especially on performance-critical hot paths.

Fixed in 961575d. Collect the kept Text segments directly into a String via collect::<String>(), dropping the temporary Vec<String> + join(""). Empty-separator join is plain concatenation, so output is byte-identical — hidden_thinking_strips_inline_thinking_without_signature still passes.

gemini-code-assist

Code Review

This pull request refactors the Anthropic-compatible SSE stream adapter by modularizing the monolithic context.rs file into several focused submodules: inline_thinking, signature, sse_event, state, and usage. This significantly improves the code's maintainability and readability. Feedback on the changes suggests optimizing strip_inline_thinking_content in inline_thinking.rs to collect directly into a String instead of allocating a temporary Vec first, which improves performance and idiomaticity.

gemini-code-assist · 2026-05-30T22:51:10Z

+pub fn strip_inline_thinking_content(content: &str) -> String {
+    split_inline_thinking_content(content, true)
+        .into_iter()
+        .filter_map(|block| match block {
+            InlineThinkingBlock::Text(text) => Some(text),
+            InlineThinkingBlock::Thinking(_) => None,
+        })
+        .collect::<Vec<_>>()
+        .join("")
+}


In strip_inline_thinking_content, collecting the filtered text blocks into a temporary Vec and then calling join("") causes unnecessary allocations and overhead. Since String implements FromIterator<String>, you can collect the iterator directly into a String to improve performance and idiomaticity.

pub fn strip_inline_thinking_content(content: &str) -> String { split_inline_thinking_content(content, true) .into_iter() .filter_map(|block| match block { InlineThinkingBlock::Text(text) => Some(text), InlineThinkingBlock::Thinking(_) => None, }) .collect::<String>() }

Fixed in 961575d. Collect the kept Text segments directly into a String via collect::<String>(), dropping the temporary Vec<String> + join(""). Empty-separator join is plain concatenation, so output is byte-identical — hidden_thinking_strips_inline_thinking_without_signature still passes.

…tring strip_inline_thinking_content collected the kept text segments into a temporary Vec<String> and then join("")'d them. String implements FromIterator<String>, so collect into a String directly: one allocation that appends each segment, instead of a Vec plus a second joined String. Empty-separator join is plain concatenation, so the output is byte-identical; hidden_thinking_strips_inline_thinking_without_signature still passes. Addresses a PR #10 review finding from gemini-code-assist. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…modules Break the 2546-line anthropic/stream.rs into a stream/mod.rs facade plus 6 concern-focused submodules, following the agreed module style: - usage.rs: input-token resolution + Anthropic usage JSON (KiroInputTokenSource, KIRO_CONTEXT_USAGE_MIN_REQUEST_TOKENS, anthropic_usage_json, resolve_input_tokens[_with_threshold]). - sse_event.rs: the SseEvent wire value + impl. - signature.rs: synthetic thinking-signature synthesis (synthetic_thinking_ signature) with its proto encoders, derive helpers, and wire-format consts. - inline_thinking.rs: inline <thinking> extraction (build_inline_thinking_ content_blocks, strip/split, quote-aware tag scanners). - state.rs: the SSE block state machine (SseStateManager, BlockState). - context.rs: the StreamContext / BufferedStreamContext drivers plus their context-only helpers (estimate_tokens, find_char_boundary, canonicalize_ structured_output_json, SYNTHETIC_THINKING_PLACEHOLDER). Module style: - mod.rs convention (matches the rest of the workspace). - No `use super::*` in submodules: each imports its deps explicitly from their origin. No `pub use X::*` globs: mod.rs re-exports only the 9 items used outside the module, by name, so the stream::X public surface (consumed by llm-access provider.rs and the anthropic/websearch.rs sibling) is unchanged. - StreamContext.state_manager is a pub field of type SseStateManager, so SseStateManager must stay pub-reachable; re-exported by name from mod.rs. - Minimal visibility: cross-module items get `pub` + a `///` doc (the state machine's is_block_open_of_type / has_non_thinking_blocks, the inline-thinking tag scanners + strip, synthetic_thinking_signature, and the 6 signature wire-format consts the shape test asserts against). pub(super) items now used only within stream were narrowed to private (InlineThinkingBlock, split_inline_thinking_content). No pub(crate). - Tests inlined: the 7 usage tests, the SSE-format test, and the inline-thinking split test move into their submodules (so the private items they exercise need no extra exposure); the StreamContext integration cluster + shared proto-shape verification stay in context.rs against process_assistant_response (private) and the public driver API. No behavior change: item inventory diff vs the original shows zero dropped and zero added items, #[test] count 31 = 31; clippy -D warnings clean (lib + tests), 163 kiro tests pass, and the reverse-dep llm-access binary compiles against the unchanged public surface. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…tring strip_inline_thinking_content collected the kept text segments into a temporary Vec<String> and then join("")'d them. String implements FromIterator<String>, so collect into a String directly: one allocation that appends each segment, instead of a Vec plus a second joined String. Empty-separator join is plain concatenation, so the output is byte-identical; hidden_thinking_strips_inline_thinking_without_signature still passes. Addresses a PR #10 review finding from gemini-code-assist. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

acking-you · 2026-05-30T23:10:26Z

Rebased onto latest master (resolved the modify/delete conflict from #9 — its ContentLengthExceededException stop-reason fix and its test now live in stream/context.rs; verified 164 tests pass, clippy clean, llm-access builds).

/gemini review

acking-you · 2026-05-30T23:10:28Z

@codex review

gemini-code-assist

Code Review

This pull request refactors the Anthropic-compatible SSE stream adapter by modularizing the monolithic context.rs file into focused submodules: inline_thinking, signature, sse_event, state, and usage. This significantly improves code organization and maintainability. Feedback on the changes suggests optimizing the strip_inline_thinking_content function in inline_thinking.rs to avoid unnecessary string allocations and clones by scanning tags and building the stripped string directly from borrowed slices.

gemini-code-assist · 2026-05-30T23:11:35Z

+pub fn strip_inline_thinking_content(content: &str) -> String {
+    split_inline_thinking_content(content, true)
+        .into_iter()
+        .filter_map(|block| match block {
+            InlineThinkingBlock::Text(text) => Some(text),
+            InlineThinkingBlock::Thinking(_) => None,
+        })
+        .collect::<String>()
+}


The current implementation of strip_inline_thinking_content delegates to split_inline_thinking_content, which allocates a temporary Vec<InlineThinkingBlock> and clones multiple substrings into newly allocated Strings, only to immediately discard the thinking blocks and collect the text blocks.

Since this function is called on every streaming response/chunk, we can optimize it to avoid these temporary allocations and string clones by scanning the tags and building the stripped string directly from borrowed slices.

pub fn strip_inline_thinking_content(content: &str) -> String { let Some(start_pos) = find_real_thinking_start_tag(content) else { return content.to_string(); }; let mut result = String::with_capacity(content.len()); let before = &content[..start_pos]; if !before.trim().is_empty() { result.push_str(before); } let mut remaining = &content[start_pos + "<thinking>".len()..]; if remaining.starts_with('\n') { remaining = &remaining[1..]; } let end_pos = if let Some(end_pos) = find_real_thinking_end_tag(remaining) { end_pos } else if let Some(end_pos) = find_real_thinking_end_tag_at_buffer_end(remaining) { end_pos } else { return content.to_string(); }; let after_tag = &remaining[end_pos + "</thinking>".len()..]; let after_thinking = after_tag.strip_prefix("\n\n").unwrap_or(after_tag); if !after_thinking.is_empty() { result.push_str(after_thinking); } result }

Holding off on this one. The rewrite is correct and byte-equivalent (I checked all four branches: empty content, no start tag, no end tag, normal extraction), but the stated premise doesn't hold and the cost outweighs the gain here:

Not a hot path. strip_inline_thinking_content has exactly one call site workspace-wide — final_assistant_text, and only when hidden_thinking_enabled. It runs once at response finalization (via final_assistant_message/final_content_blocks, called after the stream loop in provider.rs), never inside the per-chunk process_kiro_event. So it saves one small Vec + a few String clones once per response, not per chunk.

Duplicates the scan control flow. The suggestion re-implements split_inline_thinking_content's tag scanning (start tag → skip leading \n → the two end-tag probes → strip \n\n). Keeping strip as a thin filter over split is a single source of truth; two parsers risk silently diverging if the thinking-tag semantics ever change.

Out of scope. This PR is a behavior-preserving structural split (no logic changes mixed in). Re-implementing the strip internals is a logic change, not a module move.

The previous collect::<String>() fix was a clean, local, no-duplication win — happy to take that class of change. If the micro-allocation on this cold path is worth pursuing, it'd be better as a separate perf PR that factors the scan into a shared visitor so neither copy duplicates the other.

chatgpt-codex-connector · 2026-05-30T23:13:47Z

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

gemini-code-assist Bot reviewed May 30, 2026

View reviewed changes

acking-you and others added 2 commits May 31, 2026 07:07

acking-you force-pushed the refactor/split-kiro-anthropic-stream branch from 961575d to 2124e9f Compare May 30, 2026 23:09

gemini-code-assist Bot reviewed May 30, 2026

View reviewed changes

acking-you merged commit 6b0cf15 into master May 30, 2026
3 checks passed

This was referenced May 31, 2026

refactor(llm-access-kiro): split anthropic/converter.rs into submodules #12

Merged

refactor(llm-access-core): split store.rs into submodules #13

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(llm-access-kiro): split anthropic/stream.rs into submodules#10

refactor(llm-access-kiro): split anthropic/stream.rs into submodules#10
acking-you merged 2 commits into
masterfrom
refactor/split-kiro-anthropic-stream

acking-you commented May 30, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

acking-you May 30, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

acking-you May 30, 2026

Uh oh!

acking-you commented May 30, 2026

Uh oh!

acking-you commented May 30, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

acking-you May 30, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

acking-you commented May 30, 2026

What

Module layout

Module style (per the agreed standard)

Verification

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

acking-you May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

acking-you May 30, 2026

Choose a reason for hiding this comment

Uh oh!

acking-you commented May 30, 2026

Uh oh!

acking-you commented May 30, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

acking-you May 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot commented May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant