refactor(core): ♻️ Decompose agent runtime modules #136
…implify workbench UI

Break down the monolithic agent run manager and session logic into smaller, purpose-built modules and extract workbench UI helpers to improve maintainability and readability.

Core Rust changes:
- Move runtime event handling into `agent_run_event_handler.rs`
- Extract run summary and title generation into `agent_run_summary.rs` and `agent_run_title.rs`
- Split agent session logic into `agent_session_compression.rs`, `agent_session_events.rs`, `agent_session_history.rs`, `agent_session_tools.rs`, and `agent_session_types.rs`
- Make `ActiveRun` and `AgentRunManager` fields public for better modularity
- Re-export new submodules from `agent_run_manager.rs` and update `mod.rs`
- Remove obsolete plan approval and runtime event loop code from `agent_run_manager.rs`
- Update `AGENTS.md` to reflect the new module organization

Extensions and UI changes:
- Add extension host modules: `config_io.rs`, `marketplace.rs`, `mcp.rs`, `plugins.rs`, and `skills.rs`
- Refactor extensions facade in `extensions/mod.rs`
- Extract workbench UI helpers into dedicated files: `dashboard-terminal-orchestrator.tsx`, `long-message-body.tsx`, `runtime-thread-surface-diff.tsx`, and `runtime-thread-surface-helpers.ts`
- Simplify `dashboard-workbench.tsx` by moving logic into orchestrators
- Minor cleanup in `task_item_repo.rs`
AI Code Review Summary
PR: #136 (refactor(core): ♻️ Decompose agent runtime modules)
Overall Assessment
Detected 22 actionable findings, prioritize CRITICAL/HIGH before merge.
Major Findings by Severity
Actionable Suggestions
Potential Risks
Test Suggestions
File-Level Coverage Notes
Inline Downgraded Items (processed but not inline)
Coverage Status
Uncovered list:
No-patch covered list:
Runtime/Budget
|
| @@ -0,0 +1,593 @@ | |||
| use tauri::{AppHandle, Emitter}; | |||
This comment was marked as outdated.
| @@ -0,0 +1,840 @@ | |||
| use tiycore::agent::AgentMessage; | |||
This comment was marked as outdated.
| let _ = event_tx.send(ThreadStreamEvent::ReasoningUpdated { | ||
| run_id: run_id.to_string(), | ||
| message_id, | ||
| reasoning: buffer.clone(), |
This comment was marked as outdated.
| // message in the same run whose created_at >= started_at, minus a | ||
| // small delta so the tool call lands just before it. If no match | ||
| // is found, place it at the end. | ||
| let insert_pos = msg_positions |
This comment was marked as outdated.
| if rendered.len() > MAX_TOOL_RESULT_SIZE { | ||
| rendered.truncate(MAX_TOOL_RESULT_SIZE); | ||
| // Ensure we don't cut in the middle of a multi-byte UTF-8 char | ||
| while !rendered.is_char_boundary(rendered.len()) { |
This comment was marked as outdated.
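As far as the visible lines of this hunk show, the boundary check runs after `truncate`. Two facts about `String` matter here: `String::truncate` panics when the new length is not on a char boundary, and `is_char_boundary(len())` is always true. A minimal sketch that finds the boundary first is shown below; the helper name `truncate_at_char_boundary` is hypothetical and not taken from the PR.

```rust
/// Hypothetical helper (not from the PR): shorten `rendered` to at most
/// `max_bytes` bytes without splitting a multi-byte UTF-8 character.
fn truncate_at_char_boundary(rendered: &mut String, max_bytes: usize) {
    if rendered.len() <= max_bytes {
        return;
    }
    // Walk backwards from the byte limit until we land on a char boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut cut = max_bytes;
    while !rendered.is_char_boundary(cut) {
        cut -= 1;
    }
    // `cut` is now a valid boundary, so `truncate` cannot panic.
    rendered.truncate(cut);
}
```

The result may be a few bytes shorter than `max_bytes`, which is the price of never cutting inside a character.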
| /// scope is read from the `SkillRecordDto.scope` field, which `load_skills` | ||
| /// now derives from the discovered source label rather than the query | ||
| /// parameter. | ||
| pub(super) async fn lookup_skill_actual_scope( |
This comment was marked as outdated.
| } | ||
| } | ||
|
|
||
| pub(super) async fn skill_exists( |
This comment was marked as outdated.
| return Number.isFinite(parsed) ? parsed : 0; | ||
| } | ||
|
|
||
| export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> { |
This comment was marked as outdated.
| totalToolCalls: number; | ||
| }; | ||
|
|
||
| export function mapSnapshotHelperStatus( |
This comment was marked as outdated.
| setRegenerating(false); | ||
| refocusAfterRegenerate(); | ||
| }); | ||
| }, [threadId, modelPlan, isRegenerating, refocusAfterRegenerate, t]); |
This comment was marked as outdated.
…d extensions

Extract agent run compaction, session execution, and related tests into dedicated modules (agent_run_compaction.rs, agent_session_execution.rs, agent_run_manager_tests.rs, agent_session_tests.rs) to reduce complexity in agent_run_manager.rs and agent_session.rs. Simplify the extensions facade by removing mod.rs and introducing focused modules for config_io, plugins, mcp, skills, and marketplace. Add dashboard sidebar and overlays components to the workbench shell UI. Update documentation and module references to reflect the new structure.
| description: read_string_keys(tool, &["description"]) | ||
| .unwrap_or_else(|| name.clone()), | ||
| command, | ||
| args: read_string_array_keys(tool, &["args"]), |
This comment was marked as outdated.
| ) | ||
| } | ||
|
|
||
| #[cfg(test)] |
This comment was marked as outdated.
| .collect() | ||
| } | ||
|
|
||
| pub(crate) fn convert_history_messages( |
This comment was marked as outdated.
| } | ||
| } | ||
|
|
||
| pub(crate) fn validate_clarify_input(value: &serde_json::Value) -> Result<(), String> { |
This comment was marked as outdated.
| /// this leaves headroom for protocol overhead and JSON escaping. | ||
| const MAX_TOOL_RESULT_SIZE: usize = 8_000_000; | ||
|
|
||
| pub(crate) fn agent_tool_result_from_output( |
This comment was marked as outdated.
| "path": "/compact", | ||
| "description": "Clear history but keep a summary in context.", | ||
| "argumentHint": "[instructions=...]", | ||
| "argumentsText": instructions.clone().unwrap_or_default(), |
This comment was marked as outdated.
| format!( | ||
| "Create a short thread title for this conversation.\n\ | ||
| Rules:\n\ | ||
| - {language_rule}\n\ |
This comment was marked as outdated.
| } | ||
| } | ||
|
|
||
| pub(super) async fn skill_exists( |
This comment was marked as outdated.
| import { ThreadRenameInput } from "@/modules/workbench-shell/ui/thread-rename-input"; | ||
| import { ThreadStatusIndicator } from "@/modules/workbench-shell/ui/thread-status-indicator"; | ||
|
|
||
| const WORKSPACE_THREAD_PAGE_SIZE = 10; |
This comment was marked as outdated.
| state: string; | ||
| }; | ||
|
|
||
| function asToolDataRecord(value: unknown) { |
This comment was marked as outdated.
Move ResolvedTool, ToolProviderContext, and related tool resolution, execution, and hook methods from extensions/mod.rs into a new runtime_tools.rs module to improve separation of concerns and reduce the size of the facade file.

refactor(workbench): ♻️ extract dashboard logic and thread surface state
- Move dashboard orchestration helpers and constants from dashboard-workbench.tsx into a dedicated dashboard-workbench-logic.ts module for better reuse and readability
- Extract runtime thread surface state, timeline mapping, and related types into a new runtime-thread-surface-state.ts module to slim down the main surface component
- Update imports and references across workbench-shell UI modules
- Update AGENTS.md to reflect the new module structure
| self.handle_runtime_event(run_id, terminal_event).await | ||
| } | ||
|
|
||
| pub(crate) async fn handle_runtime_event( |
This comment was marked as outdated.
| use crate::ipc::frontend_channels::ThreadStreamEvent; | ||
| use crate::model::thread::RunUsageDto; | ||
|
|
||
| pub(crate) fn handle_agent_event( |
This comment was marked as outdated.
| use super::agent_session::{standard_tool_timeout, AgentSession, CLARIFY_TOOL_NAME}; | ||
|
|
||
| impl AgentSession { | ||
| pub(crate) async fn execute_tool_call( |
This comment was marked as outdated.
|
|
||
| pub async fn marketplace_install_item(&self, id: &str) -> Result<PluginDetailDto, AppError> { | ||
| let item = self | ||
| .marketplace_list_items() |
This comment was marked as outdated.
| .arg("clone") | ||
| .arg("--depth") | ||
| .arg("1") | ||
| .arg(&source.url) |
This comment was marked as outdated.
| /// audit. It always emits a terminal event (RunCompleted / RunFailed / | ||
| /// RunCancelled) and always clears the `ActiveRun`, even on panic-like | ||
| /// early returns, so the thread can't get stuck in Running state. | ||
| async fn run_compact_background( |
This comment was marked as outdated.
| /// directly rather than via the Agent runtime. | ||
| /// | ||
| /// [`generate_discard_summary`]: crate::core::context_compression::generate_discard_summary | ||
| pub(crate) async fn run_auto_compression( |
This comment was marked as outdated.
| &cache_dir.join(".claude-plugin/marketplace.json"), | ||
| )?; | ||
| let source_name = source_manifest.name.unwrap_or_else(|| source.name.clone()); | ||
| let installed = self |
This comment was marked as outdated.
| } | ||
|
|
||
| pub(super) fn marketplace_source_id(url: &str) -> String { | ||
| let mut hasher = std::collections::hash_map::DefaultHasher::new(); |
This comment was marked as outdated.
| } | ||
|
|
||
| pub(super) fn is_builtin_marketplace_source_id(id: &str) -> bool { | ||
| builtin_marketplace_sources() |
This comment was marked as outdated.
…ar persistence
- Extract `persist_clear_context_reset_to_pool` to isolate context reset persistence logic
- Refactor `resolve_helper_tool_task` into a reusable function with clear error handling
- Add comprehensive unit tests for context clear persistence and helper tool task resolution
- Improve test coverage for terminal events and reasoning completion policies
…y cleanup
- Always attach the payload hook so the DeepSeek thinking normalizer runs even when no provider_options are present
- Add `is_deepseek_provider` detection by provider type and base URL
- Add `normalize_deepseek_thinking_payload` to sanitize assistant messages by filling missing reasoning_content, preventing null content, and stripping reasoning when thinking is disabled
- Pass `thinking_level` through helper agent orchestrator requests and enable thinking only when the model supports it
- Skip empty reasoning records in `convert_history_messages` to avoid serialization issues with providers like DeepSeek
- Clear pending thinking at tool-result boundaries to prevent orphan reasoning from leaking across assistant messages
- Make `merge_payload` public and reuse it from `agent_session`
- Add comprehensive tests for DeepSeek detection and payload normalization
| .arg("clone") | ||
| .arg("--depth") | ||
| .arg("1") | ||
| .arg(&source.url) |
This comment was marked as outdated.
| event: &str, | ||
| payload: serde_json::Value, | ||
| ) -> Result<HookOutput, AppError> { | ||
| let command_path = plugin.path.join(handler); |
This comment was marked as outdated.
|
|
||
| let output = self | ||
| .execute_command_json( | ||
| OsStr::new(&tool.command), |
This comment was marked as outdated.
| // mutates) will saturate the IPC queue and block thread list rendering. | ||
| export const SIDEBAR_SYNC_MIN_GAP_MS = 300; | ||
|
|
||
| export function buildInitialWorkspaceThreadDisplayCounts() { |
This comment was marked as outdated.
| return Number.isFinite(parsed) ? parsed : 0; | ||
| } | ||
|
|
||
| export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> { |
This comment was marked as outdated.
| * a tool's state when a stale snapshot resolves after live stream events | ||
| * have already advanced the tool. | ||
| */ | ||
| function isMoreAdvancedToolState( |
This comment was marked as outdated.
| showOutputLabel?: boolean; | ||
| }; | ||
|
|
||
| export function getReadToolPresentation(tool: RuntimeSurfaceToolEntry): ReadToolPresentation | null { |
This comment was marked as outdated.
| ) | ||
| })?; | ||
|
|
||
| if !discovered && !plugin_dir.exists() { |
This comment was marked as outdated.
| </div> | ||
| </div> | ||
| ) : ( | ||
| sortWorkspacesWithWorktrees(workspaces as WorkspaceItem[]).map((workspace) => { |
This comment was marked as outdated.
| }; | ||
| } | ||
|
|
||
|
|
This comment was marked as outdated.
Introduce turn_index propagation across agent session events, stream events, and persistence to enable proper response boundary tracking and reasoning lifecycle management.

Key changes:
- Add turn_index field to ThreadStreamEvent::MessageCompleted and ReasoningUpdated, and propagate it from AgentEvent::MessageUpdate and MessageEnd through the event handling pipeline
- Persist turn_index into message metadata when messages complete
- Improve reasoning message termination logic to discard invalid or empty reasoning blocks without valid thinking signatures
- Expand DeepSeek payload normalizer to backfill reasoning_content on text-only assistant messages, not just those with tool_calls
- Add discard_dangling_reasoning cleanup for interrupted runs on startup recovery
- Update all related tests and frontend integration tests to cover turn_index propagation and new DeepSeek normalization scenarios

Closes: Phase 4 reasoning mis-allocation, DeepSeek 400 errors
|
|
||
| use super::agent_run_manager::AgentRunManager; | ||
|
|
||
| pub(crate) fn build_orphaned_run_terminal_event( |
This comment was marked as outdated.
| self.handle_runtime_event(run_id, terminal_event).await | ||
| } | ||
|
|
||
| pub(crate) async fn handle_runtime_event( |
This comment was marked as outdated.
| }); | ||
| } | ||
|
|
||
| match outcome.result { |
This comment was marked as outdated.
| } | ||
|
|
||
| impl AgentSession { | ||
| pub(crate) async fn execute_tool_call( |
This comment was marked as outdated.
| .collect() | ||
| } | ||
|
|
||
| pub(crate) fn convert_history_messages( |
This comment was marked as outdated.
| import { ThreadRenameInput } from "@/modules/workbench-shell/ui/thread-rename-input"; | ||
| import { ThreadStatusIndicator } from "@/modules/workbench-shell/ui/thread-status-indicator"; | ||
|
|
||
| const WORKSPACE_THREAD_PAGE_SIZE = 10; |
This comment was marked as outdated.
| // mutates) will saturate the IPC queue and block thread list rendering. | ||
| export const SIDEBAR_SYNC_MIN_GAP_MS = 300; | ||
|
|
||
| export function buildInitialWorkspaceThreadDisplayCounts() { |
This comment was marked as outdated.
| return Number.isFinite(parsed) ? parsed : 0; | ||
| } | ||
|
|
||
| export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> { |
This comment was marked as outdated.
| showOutputLabel?: boolean; | ||
| }; | ||
|
|
||
| export function getReadToolPresentation(tool: RuntimeSurfaceToolEntry): ReadToolPresentation | null { |
This comment was marked as outdated.
| input?: unknown; | ||
| name: string; | ||
| result?: unknown; | ||
| state: string; |
This comment was marked as outdated.
| self.handle_runtime_event(run_id, terminal_event).await | ||
| } | ||
|
|
||
| pub(crate) async fn handle_runtime_event( |
This comment was marked as outdated.
| use crate::ipc::frontend_channels::ThreadStreamEvent; | ||
| use crate::model::thread::RunUsageDto; | ||
|
|
||
| pub(crate) fn handle_agent_event( |
This comment was marked as outdated.
| } | ||
| } | ||
|
|
||
| pub(crate) fn validate_clarify_input(value: &serde_json::Value) -> Result<(), String> { |
This comment was marked as outdated.
| /// this leaves headroom for protocol overhead and JSON escaping. | ||
| const MAX_TOOL_RESULT_SIZE: usize = 8_000_000; | ||
|
|
||
| pub(crate) fn agent_tool_result_from_output( |
This comment was marked as outdated.
| }) | ||
| } | ||
|
|
||
| pub(super) fn read_string_array_keys( |
This comment was marked as outdated.
| }) | ||
| } | ||
|
|
||
| pub(super) fn mask_sensitive_value(value: &str) -> String { |
This comment was marked as outdated.
| .then_with(|| left.id.cmp(&right.id)) | ||
| } | ||
|
|
||
| pub(super) fn apply_skill_state(record: &mut SkillRecordDto, state: &SkillStateStore) { |
This comment was marked as outdated.
| run_id: "run-1".into(), | ||
| message_id: "msg-1".into(), | ||
| content: "Full response".into(), | ||
| turn_index: None, |
This comment was marked as outdated.
| .await; | ||
| } | ||
|
|
||
| let tool_call_storage_id = uuid::Uuid::now_v7().to_string(); |
This comment was marked as outdated.
| ) | ||
| })?; | ||
|
|
||
| if !discovered && !plugin_dir.exists() { |
This comment was marked as outdated.
Add a new `supports_reasoning` field across the model configuration stack to properly track and propagate reasoning capabilities:
- Introduce `supports_reasoning` to `RuntimeModelRole`, `RunModelPlanRoleDto`, and related types
- Infer reasoning support automatically from model IDs with keyword patterns (e.g. o1, r1, gpt-5)
- Allow manual overrides via `capabilityOverrides` to disable/enable reasoning per model
- Extract `inferModelCapabilities` and `getEffectiveModelCapabilities` into a shared module
- Update `resolve_runtime_model_role` to respect declared reasoning capability instead of forcing it
- Refactor `resolve_model_plan` to apply thinking level only to reasoning-capable roles via `apply_thinking_level_to_model_role`
- Adjust `configure_agent` to check both thinking level and `model.reasoning` flag
- Add comprehensive tests for reasoning capability resolution, thinking level application, and helper role behavior

BREAKING CHANGE: model reasoning is no longer forced on when thinking level is enabled; it now depends on the model's declared capability
| /// audit. It always emits a terminal event (RunCompleted / RunFailed / | ||
| /// RunCancelled) and always clears the `ActiveRun`, even on panic-like | ||
| /// early returns, so the thread can't get stuck in Running state. | ||
| async fn run_compact_background( |
This comment was marked as outdated.
| /// (errors, final output) of large tool results. When the omitted middle | ||
| /// section is very small (< 50 chars), a simple head truncation is used | ||
| /// instead to avoid a gap marker that hides barely any content. | ||
| pub(crate) fn truncate_tool_result_head_tail(text: &str, max_chars: usize) -> String { |
This comment was marked as outdated.
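A minimal sketch of the head-and-tail strategy that the doc comment above describes, for readers who only see this hunk. It is not the PR's implementation: the function name, the even head/tail split, and the gap marker text are assumptions; only the 50-character threshold comes from the doc comment.

```rust
/// Sketch only: keep the start and end of `text`, measured in characters.
/// The marker wording and the 50/50 split are assumptions, not PR code.
fn truncate_head_tail_sketch(text: &str, max_chars: usize) -> String {
    const MIN_OMITTED: usize = 50; // threshold quoted in the doc comment
    let total = text.chars().count();
    if total <= max_chars {
        return text.to_string();
    }
    let omitted = total - max_chars;
    if omitted < MIN_OMITTED {
        // The gap would hide very little, so fall back to head truncation.
        return text.chars().take(max_chars).collect();
    }
    let head_len = max_chars / 2;
    let tail_len = max_chars - head_len;
    let head: String = text.chars().take(head_len).collect();
    let tail: String = text.chars().skip(total - tail_len).collect();
    // Note: the marker itself pushes the result slightly above `max_chars`.
    format!("{head}\n[... {omitted} chars omitted ...]\n{tail}")
}
```

Counting characters rather than bytes keeps the cut points on valid UTF-8 boundaries at the cost of an extra pass over the input.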
| .collect() | ||
| } | ||
|
|
||
| pub(crate) fn convert_history_messages( |
This comment was marked as outdated.
| Ok(items) | ||
| } | ||
|
|
||
| pub async fn marketplace_install_item(&self, id: &str) -> Result<PluginDetailDto, AppError> { |
This comment was marked as outdated.
| event: &str, | ||
| payload: serde_json::Value, | ||
| ) -> Result<HookOutput, AppError> { | ||
| let command_path = plugin.path.join(handler); |
This comment was marked as outdated.
| return Number.isFinite(parsed) ? parsed : 0; | ||
| } | ||
|
|
||
| export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> { |
This comment was marked as outdated.
| .map(([toolName, count]) => `${toolName} ${count}`); | ||
| } | ||
|
|
||
| export function getHelperElapsedSeconds( |
This comment was marked as outdated.
| input?: unknown; | ||
| name: string; | ||
| result?: unknown; | ||
| state: string; |
This comment was marked as outdated.
| git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] } | ||
| similar = "2" | ||
| tiycore = "0.1.19" | ||
| tiycore = "0.1.21-rc.26042620" |
This comment was marked as outdated.
| let spawn_model_role = preview_spec.model_plan.primary.clone(); | ||
| let spawn_response_language = response_language.map(str::to_owned); | ||
| let spawn_frontend_tx = frontend_tx.clone(); | ||
| tokio::spawn(async move { |
This comment was marked as outdated.
…to tiycore

Move DeepSeek reasoning_content normalization from application layer into tiycore, eliminating duplicate handling across agent_session and subagent orchestrator.
- Remove is_deepseek_provider and normalize_deepseek_thinking_payload from agent_session
- Remove redundant normalize_deepseek_thinking_payload tests from agent_session_tests
- Update subagent orchestrator payload hook to rely on tiycore built-in handling
- Add reasoning_content_constrained field in settings_manager tests
- Update agent_run integration tests with supportsReasoning flag
- Bump tiycore to 0.2.1-rc.26042720 and refresh Cargo.lock
| let _ = event_tx.send(ThreadStreamEvent::ReasoningUpdated { | ||
| run_id: run_id.to_string(), | ||
| message_id, | ||
| reasoning: buffer.clone(), |
This comment was marked as outdated.
| use crate::ipc::frontend_channels::ThreadStreamEvent; | ||
| use crate::model::thread::RunUsageDto; | ||
|
|
||
| pub(crate) fn handle_agent_event( |
This comment was marked as outdated.
| ); | ||
| } | ||
|
|
||
| self.checkpoint_requested.store(true, Ordering::SeqCst); |
This comment was marked as outdated.
| .collect() | ||
| } | ||
|
|
||
| pub(crate) fn convert_history_messages( |
This comment was marked as outdated.
| // candidate text message and insert_pos — a reasoning message in | ||
| // between indicates the tool call came from a different model | ||
| // response and must remain a separate assistant message. | ||
| let merge_target_pos = msg_positions[..insert_pos] |
This comment was marked as outdated.
| return Number.isFinite(parsed) ? parsed : 0; | ||
| } | ||
|
|
||
| export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> { |
This comment was marked as outdated.
| input?: unknown; | ||
| name: string; | ||
| result?: unknown; | ||
| state: string; |
This comment was marked as outdated.
| model_name: model_id.to_string(), | ||
| provider_type: "openai".to_string(), | ||
| provider_name: "OpenAI".to_string(), | ||
| api_key: Some("test-key".to_string()), |
This comment was marked as outdated.
| // Merge per-model provider_options into every outgoing LLM request payload. | ||
| { | ||
| let provider_options = request.model_role.provider_options.clone(); | ||
| if provider_options.is_some() { |
This comment was marked as outdated.
| tool_calls = tc_count, | ||
| helpers = helper_count, | ||
| reasoning = reasoning_count, | ||
| "interrupted dangling runs/tool_calls/helpers on startup" |
This comment was marked as outdated.
Add path traversal prevention by canonicalizing the hook path and checking it stays within the plugin directory. Previously, a malicious plugin could escape its directory via crafted hook paths.

Additionally:
- Update log message in thread_manager to include "reasoning" in startup dangling runs/tool_calls/helpers/reasoning count.
- Import WORKSPACE_THREAD_PAGE_SIZE from shared logic instead of local constant in dashboard-sidebar.
- Change state property type from string to SurfaceToolState in RuntimeSurfaceToolEntry for better type safety.
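The canonicalize-then-check approach described in this commit can be sketched roughly as follows. This is an illustration under assumptions rather than the code that was merged: `resolve_hook_path` is a hypothetical name, and it returns `std::io::Error` instead of the project's `AppError`.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve `handler` relative to `plugin_dir` and
/// reject it if the canonical result escapes the plugin directory.
fn resolve_hook_path(plugin_dir: &Path, handler: &str) -> io::Result<PathBuf> {
    // canonicalize() resolves `..` segments and symlinks; it returns an
    // error if the path does not exist.
    let root = plugin_dir.canonicalize()?;
    let resolved = plugin_dir.join(handler).canonicalize()?;
    // Path::starts_with compares whole components, so `/plugins/a-evil`
    // is not treated as being inside `/plugins/a`.
    if !resolved.starts_with(&root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "hook path escapes the plugin directory",
        ));
    }
    Ok(resolved)
}
```

Canonicalizing both sides matters: comparing a canonical hook path against a non-canonical plugin directory can produce false rejections when the plugin directory itself sits behind a symlink.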
| description: read_string_keys(tool, &["description"]) | ||
| .unwrap_or_else(|| name.clone()), | ||
| command, | ||
| args: read_string_array_keys(tool, &["args"]), |
This comment was marked as outdated.
| .collect::<Vec<_>>() | ||
| }); | ||
|
|
||
| let output = self |
This comment was marked as outdated.
|
|
||
| let setup = async { | ||
| message_repo::insert(&self.pool, &user_message).await?; | ||
| message_repo::insert(&self.pool, &reset_message).await?; |
This comment was marked as outdated.
| } | ||
|
|
||
| pub(super) fn marketplace_source_id(url: &str) -> String { | ||
| let mut hasher = std::collections::hash_map::DefaultHasher::new(); |
This comment was marked as outdated.
| .arg("clone") | ||
| .arg("--depth") | ||
| .arg("1") | ||
| .arg(&source.url) |
This comment was marked as outdated.
| let installed = self.load_installed_plugin_records().await?; | ||
| let mut items = Vec::with_capacity(installed.len()); | ||
| for record in installed { | ||
| let mut runtime = self.load_plugin_from_dir(Path::new(&record.path), false)?; |
This comment was marked as outdated.
| args: Some(read_string_array_keys(spec, &["args"])), | ||
| env: read_string_map_keys(spec, &["env"]), | ||
| cwd: read_string_keys(spec, &["cwd"]), | ||
| url: read_string_keys(spec, &["url"]), |
This comment was marked as outdated.
| maxOutputTokens: model.maxOutputTokens ?? null, | ||
| supportsImageInput: model.capabilityOverrides.vision ?? null, | ||
| supportsImageInput: capabilities.vision, | ||
| supportsReasoning: capabilities.reasoning, |
This comment was marked as outdated.
| if did_update_sources { | ||
| self.save_marketplace_sources(&store)?; | ||
| } | ||
| let installed = self.load_installed_plugin_records().await?; |
This comment was marked as outdated.
| ) | ||
| })?; | ||
|
|
||
| if !discovered && !plugin_dir.exists() { |
This comment was marked as outdated.
Update the Homebrew cask URL to remove duplicate 'v' prefix in version path. Add additional application data directories to the zap uninstall cleanup list (HTTPStorages, Logs, and WebKit) to ensure complete removal of local application data.
| self.handle_runtime_event(run_id, terminal_event).await | ||
| } | ||
|
|
||
| pub(crate) async fn handle_runtime_event( |
[CRITICAL] agent_run_event_handler.rs has zero test coverage for the main event handler
The central runtime event handler that drives all agent run state transitions has no automated tests. Race conditions in status transitions (e.g. MessageDelta → MessageCompleted, or duplicate terminal events) are entirely untested.
Suggestion: Add unit tests for each match arm of handle_runtime_event using a mock AgentRunManager or by extracting pure decision functions (terminal_event_status, should_complete_reasoning_for_event, is_terminal_runtime_event) and testing those with synthetic ThreadStreamEvent variants. The free functions at R16-R59 are ideal candidates for immediate unit testing.
Risk: State machine regressions (e.g. wrong thread status after cancellation, reasoning messages left in 'streaming' state) will only be caught manually.
Confidence: 0.95
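To make the suggestion concrete, here is a rough sketch of what testing the pure decision functions might look like. `StreamEventSketch` is a stand-in for the real `ThreadStreamEvent` (only the variant names come from this thread), and the function bodies, status labels, and the `TerminalGuard` type are assumptions made for illustration.

```rust
// Stand-in for the real `ThreadStreamEvent`; payloads are omitted and the
// logic below is assumed, not copied from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StreamEventSketch {
    MessageDelta,
    MessageCompleted,
    RunCompleted,
    RunFailed,
    RunCancelled,
}

/// Assumed rule: only the three run-ending variants count as terminal.
fn is_terminal_event_sketch(event: StreamEventSketch) -> bool {
    matches!(
        event,
        StreamEventSketch::RunCompleted
            | StreamEventSketch::RunFailed
            | StreamEventSketch::RunCancelled
    )
}

/// Assumed mapping from a terminal event to a status label.
fn terminal_status_sketch(event: StreamEventSketch) -> Option<&'static str> {
    match event {
        StreamEventSketch::RunCompleted => Some("completed"),
        StreamEventSketch::RunFailed => Some("failed"),
        StreamEventSketch::RunCancelled => Some("cancelled"),
        StreamEventSketch::MessageDelta | StreamEventSketch::MessageCompleted => None,
    }
}

/// Hypothetical guard that lets only the first terminal event through,
/// one way to make the duplicate-terminal-event case testable.
#[derive(Default)]
struct TerminalGuard {
    finished: bool,
}

impl TerminalGuard {
    /// Returns true if the event should be processed.
    fn accept(&mut self, event: StreamEventSketch) -> bool {
        if self.finished {
            return false; // anything after a terminal event is dropped
        }
        if is_terminal_event_sketch(event) {
            self.finished = true;
        }
        true
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn duplicate_terminal_event_is_ignored() {
        let mut guard = TerminalGuard::default();
        assert!(guard.accept(StreamEventSketch::MessageDelta));
        assert!(guard.accept(StreamEventSketch::RunCancelled));
        assert!(!guard.accept(StreamEventSketch::RunCancelled));
    }
}
```

Because none of these functions touch the database or the channel, they can be exercised without a mock `AgentRunManager`.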
| /// | ||
| /// Returns `(run_id, event_rx)` so the caller can forward events over a | ||
| /// Tauri `Channel` identical to `start_run`. | ||
| pub async fn compact_thread_context( |
[HIGH] compact_thread_context and run_compact_background have no tests
The /compact command entry point — including ActiveRun registration, message persistence order, broadcast channel setup, and the spawned background task — has no test coverage. The existing test only covers the simpler /clear path.
Suggestion: Add a test that: (1) calls compact_thread_context on a seeded thread; (2) verifies the user message, reset marker, and run row are persisted; (3) verifies the frontend channel receives RunStarted and ContextCompressing events. The LLM-dependent part (run_compact_background) can be tested with a mock or by verifying the final state after a short timeout.
Risk: Compaction lifecycle bugs (stuck Running state, missing reset markers, missing events) will only be caught in manual testing.
Confidence: 0.88
| message_repo::update_status(&self.pool, &message_id, finalized_message_status).await?; | ||
| } | ||
| // Reasoning termination: classify by content/signature validity | ||
| if let Some(message_id) = reasoning_message_id { |
[HIGH] finish_run reasoning discard logic has no test coverage
The reasoning message classification in finish_run (discard if empty or missing signature for non-completed runs, keep otherwise) directly controls what thinking content users see after interrupted/cancelled runs. No tests validate any of the four combinations (empty vs non-empty × signature present vs absent).
Suggestion: Extract the discard decision into a pure function and add parameterized tests covering: (1) completed run → never discard; (2) non-completed + empty content → discard; (3) non-completed + no signature → discard; (4) non-completed + content + signature → keep as completed.
Risk: Users may lose valid thinking content on interrupted runs, or see garbage thinking content that should have been discarded.
Confidence: 0.87
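One way to make the four cases above testable is to pull the decision into a pure function. The sketch below is an assumption-laden illustration: `ReasoningOutcome`, `classify_reasoning`, and its parameters are hypothetical names, and treating whitespace-only content or an empty signature string as "missing" is a choice made here, not something confirmed by finish_run.

```rust
// Hypothetical outcome type: what finish_run should do with the reasoning message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReasoningOutcome {
    KeepCompleted,
    Discard,
}

/// Decides whether a reasoning message survives the end of a run.
/// Rules taken from the review text:
///   - a completed run never discards its reasoning;
///   - otherwise the message is kept only if it has content AND a signature.
pub fn classify_reasoning(
    run_completed: bool,
    content: &str,
    signature: Option<&str>,
) -> ReasoningOutcome {
    if run_completed {
        return ReasoningOutcome::KeepCompleted;
    }
    // Assumption: whitespace-only content counts as empty.
    let has_content = !content.trim().is_empty();
    // Assumption: an empty signature string counts as missing.
    let has_signature = signature.map_or(false, |s| !s.is_empty());
    if has_content && has_signature {
        ReasoningOutcome::KeepCompleted
    } else {
        ReasoningOutcome::Discard
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn covers_the_four_cases_from_the_review() {
        let cases = [
            (true, "", None, ReasoningOutcome::KeepCompleted),
            (false, "", Some("sig"), ReasoningOutcome::Discard),
            (false, "thinking", None, ReasoningOutcome::Discard),
            (false, "thinking", Some("sig"), ReasoningOutcome::KeepCompleted),
        ];
        for (completed, content, signature, expected) in cases {
            assert_eq!(classify_reasoning(completed, content, signature), expected);
        }
    }
}
```

With the decision isolated like this, finish_run would only need to act on the returned outcome, and the table of cases can grow without touching the database layer.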
| } | ||
|
|
||
| impl AgentSession { | ||
| pub(crate) async fn execute_tool_call( |
[HIGH] execute_tool_call dispatch has no test coverage for any branch
The primary tool execution dispatcher has no integration tests. Branch-specific logic like approval flow event emission (ApprovalRequired → ApprovalResolved), timeout result persistence, and clarify request lifecycle are all untested.
Suggestion: At minimum, add unit tests for the event emission patterns: (1) tool approval required then approved emits both events; (2) tool timeout persists the error to tool_call_repo; (3) clarify request emits ClarifyRequired then ClarifyResolved. These can be tested with mock event_tx and tool_gateway.
Risk: Event emission ordering bugs (e.g. missing ToolRunning event, double ToolFailed) would cause frontend state desynchronization.
Confidence: 0.85
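The event-ordering checks suggested above can be written against a plain channel. The following is only a sketch of the testing pattern: `ToolEvent`, `emit_approval_flow`, and `drain` are hypothetical stand-ins, and a std `mpsc` channel is used here in place of whatever `event_tx` type the real dispatcher holds.

```rust
use std::sync::mpsc::{Receiver, SendError, Sender};

// Hypothetical subset of the events named in the review.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToolEvent {
    ApprovalRequired { tool_call_id: String },
    ApprovalResolved { tool_call_id: String, approved: bool },
    ToolRunning { tool_call_id: String },
}

/// Stand-in for the approval branch of a dispatcher: it emits
/// ApprovalRequired, then ApprovalResolved, and ToolRunning only on approval.
pub fn emit_approval_flow(
    tx: &Sender<ToolEvent>,
    tool_call_id: &str,
    approved: bool,
) -> Result<(), SendError<ToolEvent>> {
    let id = tool_call_id.to_string();
    tx.send(ToolEvent::ApprovalRequired { tool_call_id: id.clone() })?;
    tx.send(ToolEvent::ApprovalResolved { tool_call_id: id.clone(), approved })?;
    if approved {
        tx.send(ToolEvent::ToolRunning { tool_call_id: id })?;
    }
    Ok(())
}

/// Collects every event currently queued, preserving emission order.
pub fn drain(rx: &Receiver<ToolEvent>) -> Vec<ToolEvent> {
    rx.try_iter().collect()
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::sync::mpsc::channel;

    #[test]
    fn approved_call_emits_required_resolved_running_in_order() {
        let (tx, rx) = channel();
        emit_approval_flow(&tx, "call-1", true).unwrap();
        let id = "call-1".to_string();
        assert_eq!(
            drain(&rx),
            vec![
                ToolEvent::ApprovalRequired { tool_call_id: id.clone() },
                ToolEvent::ApprovalResolved { tool_call_id: id.clone(), approved: true },
                ToolEvent::ToolRunning { tool_call_id: id },
            ]
        );
    }
}
```

Asserting on the full ordered vector, rather than on the presence of individual events, is what catches a missing ToolRunning or a doubled terminal event.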
| ) -> Result<HookOutput, AppError> { | ||
| let command_path = plugin.path.join(handler); | ||
| // Prevent path traversal: ensure the resolved command stays within the plugin directory. | ||
| let command_path = std::fs::canonicalize(&command_path).map_err(|error| { |
[HIGH] execute_hook path traversal prevention is untested
The path traversal guard in execute_hook is security-critical but untested. A regression in the starts_with comparison or canonicalize behavior could silently allow execution of arbitrary binaries.
Suggestion: Add tests that: (1) a handler '../../../etc/passwd' is rejected with the hook_escape error; (2) a symlink inside the plugin dir that points outside is rejected; (3) a valid handler within the plugin dir succeeds.
Risk: A path traversal bypass would allow plugin manifests to execute arbitrary commands outside their directory.
Confidence: 0.92
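A guard of the kind described above, together with a test for the rejection case, might look like the sketch below. It assumes the check is a canonicalize followed by a component-wise `starts_with`; `resolve_hook_command` and its error message are hypothetical names, and the real execute_hook returns an AppError rather than an `io::Error`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `handler` relative to `plugin_dir` and rejects any result that
/// lands outside the plugin directory. Both paths are canonicalized first,
/// so `..` segments and symlinks are resolved before the comparison.
pub fn resolve_hook_command(plugin_dir: &Path, handler: &str) -> io::Result<PathBuf> {
    let root = fs::canonicalize(plugin_dir)?;
    let resolved = fs::canonicalize(root.join(handler))?;
    // Path::starts_with compares whole components, so "/plugins/a-evil"
    // does not count as being inside "/plugins/a".
    if !resolved.starts_with(&root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "hook handler escapes the plugin directory",
        ));
    }
    Ok(resolved)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn rejects_traversal_and_accepts_local_handler() {
        // Unique scratch directory so parallel test runs do not collide.
        let base = std::env::temp_dir().join(format!("hook_guard_test_{}", std::process::id()));
        let plugin = base.join("plugin");
        std::fs::create_dir_all(&plugin).unwrap();
        std::fs::write(plugin.join("hook.sh"), "").unwrap();
        std::fs::write(base.join("outside.sh"), "").unwrap();

        assert!(resolve_hook_command(&plugin, "hook.sh").is_ok());

        let err = resolve_hook_command(&plugin, "../outside.sh").unwrap_err();
        assert_eq!(err.kind(), io::ErrorKind::PermissionDenied);

        std::fs::remove_dir_all(&base).unwrap();
    }
}
```

The symlink case from the suggestion would follow the same shape, creating the link with a platform-specific API such as `std::os::unix::fs::symlink` behind a `#[cfg(unix)]` gate.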
| }); | ||
| } | ||
|
|
||
| export function mapRunStateToWorkbenchThreadStatus( |
[MEDIUM] No unit tests for dashboard-workbench-logic.ts pure functions
The extracted workbench logic module contains status mapping and data computation functions with no test coverage. These functions drive sidebar status indicators and context badge display.
Suggestion: Add tests for: each RunState value in mapRunStateToWorkbenchThreadStatus; each status in mapRunFinishedStatusToThreadStatus; buildThreadContextBadgeData with zero/positive context window; mergeLocalFallbackThreads with overlapping and disjoint thread sets; formatCompactTokenCount boundary values.
Risk: Sidebar thread status badges or the context usage display may be wrong after refactoring.
Confidence: 0.88
| git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] } | ||
| similar = "2" | ||
| tiycore = "0.1.19" | ||
| tiycore = { version = "0.2.1-rc.26042720" } |
[LOW] RC version of tiycore used as a production dependency
The tiycore dependency was bumped from a stable version (0.1.19) to a release candidate (0.2.1-rc.26042720). RC versions may have unstable APIs or bugs and shouldn't ship to production without explicit sign-off.
Suggestion: Confirm that this RC is intentional and temporary. Add a comment or tracking-issue reference in Cargo.toml, and plan to pin to stable 0.2.1 once it is released, before merging to main.
Risk: An RC dependency could introduce regressions, breaking changes, or instability in agent runtime behavior.
Confidence: 0.90
| git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] } | ||
| similar = "2" | ||
| tiycore = "0.1.19" | ||
| tiycore = { version = "0.2.1-rc.26042720" } |
[LOW] Pre-release tiycore dependency introduces supply-chain risk
The tiycore dependency is updated from a stable version (0.1.19) to a pre-release RC version (0.2.1-rc.26042720), which may carry additional supply-chain and stability risk.
Suggestion: Pin to a stable release before shipping to production. If an RC is required for feature access, verify the upstream tag and commit signature.
Risk: RC builds may contain unreviewed changes, security regressions, or unstable behavior not present in the stable release.
Confidence: 0.90
|
|
||
| if let Some(mut stdin) = child.stdin.take() { | ||
| tokio::spawn(async move { | ||
| let _ = stdin.write_all(&payload).await; |
[LOW] Stdin write error silently ignored in execute_command_
The stdin write to the child process in execute_command_ silently ignores errors. If writing the JSON payload fails, the child may hang waiting for input or process empty input, producing confusing results.
Suggestion: Log the error at minimum: if let Err(e) = stdin.write_all(&payload).await { tracing::warn!(...); }. Consider propagating the error or killing the child process if the write fails.
Risk: Silent stdin write failures can cause plugins/hooks to hang or produce incorrect output without any diagnostic signal.
Confidence: 0.85
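A helper that surfaces the write error instead of dropping it could be sketched as below. This is a synchronous illustration using `std::io::Write`, not the tokio `AsyncWriteExt` call the real code uses; `send_payload` and `BrokenPipeWriter` are hypothetical names introduced for the example.

```rust
use std::io::{self, Write};

/// Writes the whole payload and flushes, returning any I/O error to the
/// caller instead of discarding it. The writer is dropped on return, which
/// for a child's stdin closes the pipe so the child sees EOF.
pub fn send_payload<W: Write>(mut stdin: W, payload: &[u8]) -> io::Result<()> {
    stdin.write_all(payload)?;
    stdin.flush()
}

/// Test double that fails every write, simulating a child that has exited.
pub struct BrokenPipeWriter;

impl Write for BrokenPipeWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "child closed stdin"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn write_failure_is_reported_to_the_caller() {
        let err = send_payload(BrokenPipeWriter, b"{}").unwrap_err();
        assert_eq!(err.kind(), io::ErrorKind::BrokenPipe);
    }

    #[test]
    fn successful_write_returns_ok() {
        assert!(send_payload(Vec::<u8>::new(), b"{}").is_ok());
    }
}
```

With the error returned, the caller can decide whether to log it, kill the child, or fail the hook, rather than waiting on a process that never received its input.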
| status: message.status, | ||
| }; | ||
| } | ||
|
|
[LOW] Double blank lines in runtime-thread-surface-state.ts
Inconsistent spacing with double blank lines in the new state module.
Suggestion: Remove extra blank lines to maintain single-blank-line convention.
Risk: Minor style inconsistency; no functional impact.
Confidence: 0.90
Summary
runtime-thread-surface.tsx and dashboard-workbench.tsx.
AGENTS.md.
Test Plan
cargo fmt --manifest-path src-tauri/Cargo.toml --check
cargo test --manifest-path src-tauri/Cargo.toml: 499 unit + 197 integration tests passed, matching the Phase 0 baseline.
npm run typecheck
npm run test:unit: 20 files / 213 tests passed.
npm run build:web: passed with existing Vite chunk-size warnings.
🤖 Generated with TiyCode