refactor(llm-access-kiro): split anthropic/stream.rs into submodules#10
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the Anthropic-compatible SSE stream adapter by splitting the monolithic context.rs file into several focused submodules: inline_thinking, signature, sse_event, state, and usage, coordinated by a new mod.rs. This significantly improves code modularity and maintainability. Feedback on the changes suggests optimizing strip_inline_thinking_content in inline_thinking.rs by collecting the filtered iterator directly into a String to avoid a temporary Vec<String> allocation.
| pub fn strip_inline_thinking_content(content: &str) -> String { | ||
| split_inline_thinking_content(content, true) | ||
| .into_iter() | ||
| .filter_map(|block| match block { | ||
| InlineThinkingBlock::Text(text) => Some(text), | ||
| InlineThinkingBlock::Thinking(_) => None, | ||
| }) | ||
| .collect::<Vec<_>>() | ||
| .join("") | ||
| } |
There was a problem hiding this comment.
Avoid allocating a temporary Vec<String> and then joining them. Instead, you can collect the iterator directly into a String using .collect::<String>(), which is more efficient and idiomatic in Rust.
pub fn strip_inline_thinking_content(content: &str) -> String {
split_inline_thinking_content(content, true)
.into_iter()
.filter_map(|block| match block {
InlineThinkingBlock::Text(text) => Some(text),
InlineThinkingBlock::Thinking(_) => None,
})
.collect::<String>()
}References
- Avoid allocating temporary collections (such as
Vec<String>) and cloning elements solely for hashing. Instead, hash borrowed segments directly by chaining iterators or updating the hasher incrementally, especially on performance-critical hot paths.
There was a problem hiding this comment.
Fixed in 961575d. Collect the kept Text segments directly into a String via collect::<String>(), dropping the temporary Vec<String> + join(""). Empty-separator join is plain concatenation, so output is byte-identical — hidden_thinking_strips_inline_thinking_without_signature still passes.
There was a problem hiding this comment.
Code Review
This pull request refactors the Anthropic-compatible SSE stream adapter by modularizing the monolithic context.rs file into several focused submodules: inline_thinking, signature, sse_event, state, and usage. This significantly improves the code's maintainability and readability. Feedback on the changes suggests optimizing strip_inline_thinking_content in inline_thinking.rs to collect directly into a String instead of allocating a temporary Vec first, which improves performance and idiomaticity.
| pub fn strip_inline_thinking_content(content: &str) -> String { | ||
| split_inline_thinking_content(content, true) | ||
| .into_iter() | ||
| .filter_map(|block| match block { | ||
| InlineThinkingBlock::Text(text) => Some(text), | ||
| InlineThinkingBlock::Thinking(_) => None, | ||
| }) | ||
| .collect::<Vec<_>>() | ||
| .join("") | ||
| } |
There was a problem hiding this comment.
In strip_inline_thinking_content, collecting the filtered text blocks into a temporary Vec and then calling join("") causes unnecessary allocations and overhead. Since String implements FromIterator<String>, you can collect the iterator directly into a String to improve performance and idiomaticity.
pub fn strip_inline_thinking_content(content: &str) -> String {
split_inline_thinking_content(content, true)
.into_iter()
.filter_map(|block| match block {
InlineThinkingBlock::Text(text) => Some(text),
InlineThinkingBlock::Thinking(_) => None,
})
.collect::<String>()
}There was a problem hiding this comment.
Fixed in 961575d. Collect the kept Text segments directly into a String via collect::<String>(), dropping the temporary Vec<String> + join(""). Empty-separator join is plain concatenation, so output is byte-identical — hidden_thinking_strips_inline_thinking_without_signature still passes.
…tring
strip_inline_thinking_content collected the kept text segments into a
temporary Vec<String> and then join("")'d them. String implements
FromIterator<String>, so collect into a String directly: one allocation that
appends each segment, instead of a Vec plus a second joined String.
Empty-separator join is plain concatenation, so the output is byte-identical;
hidden_thinking_strips_inline_thinking_without_signature still passes.
Addresses a PR #10 review finding from gemini-code-assist.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…modules Break the 2546-line anthropic/stream.rs into a stream/mod.rs facade plus 6 concern-focused submodules, following the agreed module style: - usage.rs: input-token resolution + Anthropic usage JSON (KiroInputTokenSource, KIRO_CONTEXT_USAGE_MIN_REQUEST_TOKENS, anthropic_usage_json, resolve_input_tokens[_with_threshold]). - sse_event.rs: the SseEvent wire value + impl. - signature.rs: synthetic thinking-signature synthesis (synthetic_thinking_ signature) with its proto encoders, derive helpers, and wire-format consts. - inline_thinking.rs: inline <thinking> extraction (build_inline_thinking_ content_blocks, strip/split, quote-aware tag scanners). - state.rs: the SSE block state machine (SseStateManager, BlockState). - context.rs: the StreamContext / BufferedStreamContext drivers plus their context-only helpers (estimate_tokens, find_char_boundary, canonicalize_ structured_output_json, SYNTHETIC_THINKING_PLACEHOLDER). Module style: - mod.rs convention (matches the rest of the workspace). - No `use super::*` in submodules: each imports its deps explicitly from their origin. No `pub use X::*` globs: mod.rs re-exports only the 9 items used outside the module, by name, so the stream::X public surface (consumed by llm-access provider.rs and the anthropic/websearch.rs sibling) is unchanged. - StreamContext.state_manager is a pub field of type SseStateManager, so SseStateManager must stay pub-reachable; re-exported by name from mod.rs. - Minimal visibility: cross-module items get `pub` + a `///` doc (the state machine's is_block_open_of_type / has_non_thinking_blocks, the inline-thinking tag scanners + strip, synthetic_thinking_signature, and the 6 signature wire-format consts the shape test asserts against). pub(super) items now used only within stream were narrowed to private (InlineThinkingBlock, split_inline_thinking_content). No pub(crate). - Tests inlined: the 7 usage tests, the SSE-format test, and the inline-thinking split test move into their submodules (so the private items they exercise need no extra exposure); the StreamContext integration cluster + shared proto-shape verification stay in context.rs against process_assistant_response (private) and the public driver API. No behavior change: item inventory diff vs the original shows zero dropped and zero added items, #[test] count 31 = 31; clippy -D warnings clean (lib + tests), 163 kiro tests pass, and the reverse-dep llm-access binary compiles against the unchanged public surface. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tring
strip_inline_thinking_content collected the kept text segments into a
temporary Vec<String> and then join("")'d them. String implements
FromIterator<String>, so collect into a String directly: one allocation that
appends each segment, instead of a Vec plus a second joined String.
Empty-separator join is plain concatenation, so the output is byte-identical;
hidden_thinking_strips_inline_thinking_without_signature still passes.
Addresses a PR #10 review finding from gemini-code-assist.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
961575d to
2124e9f
Compare
|
Rebased onto latest master (resolved the modify/delete conflict from #9 — its /gemini review |
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request refactors the Anthropic-compatible SSE stream adapter by modularizing the monolithic context.rs file into focused submodules: inline_thinking, signature, sse_event, state, and usage. This significantly improves code organization and maintainability. Feedback on the changes suggests optimizing the strip_inline_thinking_content function in inline_thinking.rs to avoid unnecessary string allocations and clones by scanning tags and building the stripped string directly from borrowed slices.
| pub fn strip_inline_thinking_content(content: &str) -> String { | ||
| split_inline_thinking_content(content, true) | ||
| .into_iter() | ||
| .filter_map(|block| match block { | ||
| InlineThinkingBlock::Text(text) => Some(text), | ||
| InlineThinkingBlock::Thinking(_) => None, | ||
| }) | ||
| .collect::<String>() | ||
| } |
There was a problem hiding this comment.
The current implementation of strip_inline_thinking_content delegates to split_inline_thinking_content, which allocates a temporary Vec<InlineThinkingBlock> and clones multiple substrings into newly allocated Strings, only to immediately discard the thinking blocks and collect the text blocks.
Since this function is called on every streaming response/chunk, we can optimize it to avoid these temporary allocations and string clones by scanning the tags and building the stripped string directly from borrowed slices.
pub fn strip_inline_thinking_content(content: &str) -> String {
let Some(start_pos) = find_real_thinking_start_tag(content) else {
return content.to_string();
};
let mut result = String::with_capacity(content.len());
let before = &content[..start_pos];
if !before.trim().is_empty() {
result.push_str(before);
}
let mut remaining = &content[start_pos + "<thinking>".len()..];
if remaining.starts_with('\n') {
remaining = &remaining[1..];
}
let end_pos = if let Some(end_pos) = find_real_thinking_end_tag(remaining) {
end_pos
} else if let Some(end_pos) = find_real_thinking_end_tag_at_buffer_end(remaining) {
end_pos
} else {
return content.to_string();
};
let after_tag = &remaining[end_pos + "</thinking>".len()..];
let after_thinking = after_tag.strip_prefix("\n\n").unwrap_or(after_tag);
if !after_thinking.is_empty() {
result.push_str(after_thinking);
}
result
}There was a problem hiding this comment.
Holding off on this one. The rewrite is correct and byte-equivalent (I checked all four branches: empty content, no start tag, no end tag, normal extraction), but the stated premise doesn't hold and the cost outweighs the gain here:
- Not a hot path.
strip_inline_thinking_contenthas exactly one call site workspace-wide —final_assistant_text, and only whenhidden_thinking_enabled. It runs once at response finalization (viafinal_assistant_message/final_content_blocks, called after the stream loop in provider.rs), never inside the per-chunkprocess_kiro_event. So it saves one smallVec+ a fewStringclones once per response, not per chunk. - Duplicates the scan control flow. The suggestion re-implements
split_inline_thinking_content's tag scanning (start tag → skip leading\n→ the two end-tag probes → strip\n\n). Keepingstripas a thin filter oversplitis a single source of truth; two parsers risk silently diverging if the thinking-tag semantics ever change. - Out of scope. This PR is a behavior-preserving structural split (no logic changes mixed in). Re-implementing the strip internals is a logic change, not a module move.
The previous collect::<String>() fix was a clean, local, no-duplication win — happy to take that class of change. If the micro-allocation on this cold path is worth pursuing, it'd be better as a separate perf PR that factors the scan into a shared visitor so neither copy duplicates the other.
|
Codex Review: Didn't find any major issues. Already looking forward to the next diff. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
What
Split the 2546-line
llm-access-kiro/src/anthropic/stream.rsinto astream/mod.rsfacade plus 6 concern-focused submodules. Pure structural move —no logic changes. Next step in the incremental
llm-access*refactor (follows#7 cache_sim, #4 request.rs).
Module layout
usage.rsKiroInputTokenSource,KIRO_CONTEXT_USAGE_MIN_REQUEST_TOKENS,anthropic_usage_json,resolve_input_tokens[_with_threshold]sse_event.rsSseEvent+ implsignature.rssynthetic_thinking_signature+ proto encoders / derive helpers / wire-format constsinline_thinking.rsbuild_inline_thinking_content_blocks,strip/split, quote-aware tag scanners,InlineThinkingBlockstate.rsSseStateManager,BlockStatecontext.rsStreamContext,BufferedStreamContext+ context-only helpers (estimate_tokens,find_char_boundary,canonicalize_structured_output_json,SYNTHETIC_THINKING_PLACEHOLDER)Module style (per the agreed standard)
mod.rsconvention; nouse super::*in submodules — explicit origin imports.pub use X::*globs:mod.rsre-exports only the 9 items used outsidethe module, by name, so the
stream::Xpublic surface (consumed byllm-accessprovider.rsand theanthropic/websearch.rssibling) isunchanged.
StreamContext.state_manageris apubfield of typeSseStateManager, soSseStateManagermust stay pub-reachable (E0446) — re-exported by name.pub+///doc (state machine'sis_block_open_of_type/has_non_thinking_blocks, the inline-thinking tagscanners +
strip,synthetic_thinking_signature, and the 6 signaturewire-format consts the shape test asserts against).
pub(super)items now usedonly within
streamwere narrowed to private (InlineThinkingBlock,split_inline_thinking_content). Nopub(crate).split test move into their submodules; the
StreamContextintegration clustercontext.rs(it's the only placethat can reach the private
process_assistant_response3 tests exercise).Verification
#[test]count31 = 31.
cargo clippy -p llm-access-kiro -- -D warnings(lib and--tests): clean.cargo test -p llm-access-kiro: 163 passed.cargo build -p llm-access: compiles against the unchanged publicsurface.
rustfmton the 7 new files only;deps/lance&deps/lancedbstayed clean.🤖 Generated with Claude Code