refactor(core): ♻️ Decompose agent runtime modules by HayWolf · Pull Request #136 · tiylabs/tiycode

HayWolf · 2026-04-25T17:24:03Z

Summary

Split large Rust runtime and extension modules into focused companion files while preserving facade-style public APIs.
Extract reusable workbench UI helpers/components from runtime-thread-surface.tsx and dashboard-workbench.tsx.
Update agent-facing project structure guidance in AGENTS.md.

Test Plan

cargo fmt --manifest-path src-tauri/Cargo.toml --check
cargo test --manifest-path src-tauri/Cargo.toml — 499 unit + 197 integration tests passed, matching the Phase 0 baseline.
npm run typecheck
npm run test:unit — 20 files / 213 tests passed.
npm run build:web — passed with existing Vite chunk-size warnings.

🤖 Generated with TiyCode

…implify workbench UI Break down the monolithic agent run manager and session logic into smaller, purpose-built modules and extract workbench UI helpers to improve maintainability and readability. Core Rust changes: - Move runtime event handling into `agent_run_event_handler.rs` - Extract run summary and title generation into `agent_run_summary.rs` and `agent_run_title.rs` - Split agent session logic into `agent_session_compression.rs`, `agent_session_events.rs`, `agent_session_history.rs`, `agent_session_tools.rs`, and `agent_session_types.rs` - Make `ActiveRun` and `AgentRunManager` fields public for better modularity - Re-export new submodules from `agent_run_manager.rs` and update `mod.rs` - Remove obsolete plan approval and runtime event loop code from `agent_run_manager.rs` - Update `AGENTS.md` to reflect the new module organization Extensions and UI changes: - Add extension host modules: `config_io.rs`, `marketplace.rs`, `mcp.rs`, `plugins.rs`, and `skills.rs` - Refactor extensions facade in `extensions/mod.rs` - Extract workbench UI helpers into dedicated files: `dashboard-terminal-orchestrator.tsx`, `long-message-body.tsx`, `runtime-thread-surface-diff.tsx`, and `runtime-thread-surface-helpers.ts` - Simplify `dashboard-workbench.tsx` by moving logic into orchestrators - Minor cleanup in `task_item_repo.rs`

github-actions · 2026-04-25T17:30:51Z

AI Code Review Summary

PR: #136 (refactor(core): ♻️ Decompose agent runtime modules)
Preferred language: English

Overall Assessment

Detected 22 actionable findings, prioritize CRITICAL/HIGH before merge.

Major Findings by Severity

CRITICAL (1)
- src-tauri/src/core/agent_run_event_handler.rs:102 - agent_run_event_handler.rs has zero test coverage for the main event handler
HIGH (8)
- src-tauri/src/core/agent_run_compaction.rs:118 - compact_thread_context and run_compact_background have no tests
- src-tauri/src/core/agent_run_event_handler.rs:501 - finish_run reasoning discard logic has no test coverage
- src-tauri/src/core/agent_session_execution.rs:62 - execute_tool_call dispatch has no test coverage for any branch
- src-tauri/src/extensions/plugins.rs:868 - execute_hook path traversal prevention is untested
- src-tauri/src/extensions/plugins.rs:920 - execute_plugin_tool and execute_command_ have no test coverage
- src-tauri/src/extensions/plugins.rs:953 - Plugin tool commands execute arbitrary executables without path confinement
- src/modules/workbench-shell/ui/runtime-thread-surface-metadata.ts:96 - No unit tests for metadata parsing functions in runtime-thread-surface-metadata.ts
- src/modules/workbench-shell/ui/runtime-thread-surface-state.ts:484 - No unit tests added for newly-extracted pure functions in runtime-thread-surface-state.ts
MEDIUM (9)
- src-tauri/src/extensions/marketplace.rs:275 - SSRF via unvalidated git clone URL in sync_marketplace_source_repo
- src-tauri/src/extensions/marketplace.rs:311 - N+1 redundant DB query in marketplace_list_items via marketplace_items_for_source
- src-tauri/src/extensions/marketplace.rs:554 - DefaultHasher used for persistent marketplace source IDs is not cross-version stable
- src-tauri/src/extensions/marketplace.rs:204 - marketplace_install_item does full list scan to find a single item
- src-tauri/src/extensions/plugins.rs:1480 - substitute_variables has no test coverage
- src-tauri/src/extensions/plugins.rs:1488 - mask_sensitive_value and mask_url have no test coverage
- src/modules/workbench-shell/ui/dashboard-overlays.tsx:234 - Same indentation issue in DashboardOverlays
- src/modules/workbench-shell/ui/dashboard-sidebar.tsx:120 - Extracted JSX retains parent-level indentation, hurting readability
LOW (4)
- src-tauri/Cargo.toml:57 - RC version of tiycore used as a production dependency
- src-tauri/Cargo.toml:57 - Pre-release tiycore dependency introduces supply-chain risk
- src-tauri/src/extensions/plugins.rs:1030 - Stdin write error silently ignored in execute_command_
- src/modules/workbench-shell/ui/runtime-thread-surface-state.ts:123 - Double blank lines in runtime-thread-surface-state.ts

Actionable Suggestions

Add tool_call_repo::update_result calls in all result branches of execute_tool_call (Denied, EscalationRequired, Cancelled, Err, Executed) to prevent stale 'requested' records
Replace let _ = stdin.write_all(...) with error logging or propagation in execute_command_
Use dunce::canonicalize consistently in execute_hook to match the plugin path resolution in load_plugin_from_dir
Add a tracking issue or TODO comment for the tiycore RC version and plan to upgrade to stable before release
Consider documenting the plugin trust model (trusted vs untrusted sources) and adding permission-level validation for tool.command values
Add path confinement for plugin tool commands similar to the hook path-traversal check already in execute_hook
Validate MCP server URLs against private/internal IP ranges before auto-starting plugin-managed MCP servers
Canonicalize and validate plugin tool CWD to ensure it stays within the workspace or plugin directory
Add an environment variable denylist (PATH, LD_PRELOAD, DYLD_INSERT_LIBRARIES, etc.) for plugin tool env maps
Pin tiycore to a stable release before production deployment
Consider requiring explicit user approval before installing plugins or enabling plugin-managed MCP servers
Add unit tests for the 4 free functions in agent_run_event_handler.rs (build_orphaned_run_terminal_event, terminal_event_status, is_terminal_runtime_event, should_complete_reasoning_for_event) — these require no mocking

Potential Risks

Stale tool_call records accumulating in 'requested' status could degrade DB query performance over time and mislead the UI
RC tiycore dependency could introduce breaking changes or runtime bugs in agent sessions
Plugin tool commands with no sandboxing could be exploited if plugins are loaded from untrusted marketplace sources
Silent stdin write failures could cause intermittent plugin/hook failures that are hard to diagnose
A malicious plugin can execute arbitrary system commands via tool.command
A malicious plugin can probe internal network services via MCP URL SSRF
A malicious plugin can subvert spawned process execution via env var injection (LD_PRELOAD, PATH override)
A malicious plugin can set arbitrary working directories for spawned commands
Pre-release tiycore dependency may introduce unreviewed security changes
State machine regressions in agent_run_event_handler.rs will only be caught in manual end-to-end testing
Path traversal bypass in execute_hook could allow arbitrary command execution
Compaction lifecycle bugs could leave threads stuck in Running state

Test Suggestions

Add integration test verifying tool_call_repo status is updated to 'failed' for all error paths in execute_tool_call
Add Windows-specific test for hook path traversal check with dunce vs std::fs::canonicalize
Add test for execute_command_ with stdin write failure (mock broken pipe)
Add test for compact_thread_context concurrent with an active run (race condition)
Verify all new modules compile and pass cargo test with the RC tiycore version
Write security tests verifying hook path-traversal prevention works with symlinks and ../ sequences
Add tests that plugin tool commands cannot specify executables outside expected directories
Add tests that MCP URLs pointing to 127.0.0.1, 169.254.169.254, 10.x.x.x, and 192.168.x.x are rejected or require approval
Add tests that PATH, LD_PRELOAD, and other sensitive env var keys are denied in plugin tool env maps
Add tests that plugin CWD containing ../ or absolute paths outside workspace are rejected
Fuzz test parse_plugin_manifest and parse_plugin_command_markdown with malformed inputs
Unit test all pure/free functions first (lowest effort, highest coverage density)

File-Level Coverage Notes

src-tauri/src/core/agent_run_manager.rs: Patch content unavailable (binary/large/renamed), reviewed as file-level risk only. (cannot_inline_without_patch)
src-tauri/src/core/agent_session.rs: Patch content unavailable (binary/large/renamed), reviewed as file-level risk only. (cannot_inline_without_patch)
src-tauri/src/core/agent_session_tests.rs: Patch content unavailable (binary/large/renamed), reviewed as file-level risk only. (cannot_inline_without_patch)
src-tauri/src/extensions/mcp.rs: Patch content unavailable (binary/large/renamed), reviewed as file-level risk only. (cannot_inline_without_patch)
src-tauri/src/extensions/mod.rs: Patch content unavailable (binary/large/renamed), reviewed as file-level risk only. (cannot_inline_without_patch)
src-tauri/Cargo.toml: low_risk (Simple dependency version change; testing risk is in downstream modules consuming the new API.)
src-tauri/src/extensions/plugins.rs: high_risk (Good test coverage for manifest parsing, DTO builders, and free-function helpers. Critical gap is the I/O-heavy async methods (execute_hook, execute_plugin_tool, install/uninstall/sync) which have zero coverage.)
src-tauri/src/core/agent_run_event_handler.rs: critical_risk (This is the most critical testing gap in the batch. The entire event-driven state machine that governs run lifecycle has no automated verification.)
src-tauri/src/core/agent_session_execution.rs: high_risk (resolve_helper_tool_task has good test coverage. The main execution paths and all other sub-handlers are untested.)
src-tauri/src/core/subagent/orchestrator.rs: medium_risk (The refactoring of merge_payload and addition of thinking_level propagation are behaviorally significant but lightly tested.)
src-tauri/src/core/agent_run_compaction.rs: high_risk (persist_clear_context_reset_to_pool has a good test. The main orchestration methods are completely untested.)
src-tauri/src/core/agent_session_compression.rs: medium_risk (Best test coverage in the batch. persist_compression_markers_to_pool and run_auto_compression (non-LLM paths) have solid tests. The calibration and LLM-dependent paths remain gaps.)
src-tauri/src/extensions/marketplace.rs: Has several performance issues: N+1 DB queries, full-list scan for single-item install, sequential git sync, and per-call allocations in builtin source helpers. (The N+1 pattern in marketplace_list_items is the most impactful issue to address first.)
src-tauri/src/extensions/skills.rs: Lacks any caching layer; every existence check or scope lookup does a full filesystem walk and parse. Linear search in state application. (The skill_exists / lookup_skill_actual_scope full-scan pattern is the highest-impact issue here.)
src-tauri/src/extensions/config_io.rs: Minor I/O concern with display_config_path canonicalize. Mutex-based diagnostic recording is acceptable for small lists. Overall low risk. (Low priority; canonicalize caching is a nice-to-have.)
src-tauri/src/extensions/runtime_tools.rs: resolve_tool does linear scan across all plugins and MCP servers. Hook execution is sequential with interleaved DB writes. Acceptable for typical counts but no indexing. (No critical issues; consider a tool-name index if plugin count grows.)
src-tauri/src/core/settings_manager.rs: Only a test data addition (reasoning_content_constrained field). No performance impact.
src-tauri/src/core/thread_manager.rs: Adds a single DB call (discard_dangling_reasoning) to startup recovery. Low overhead; the query is a simple indexed UPDATE.
src-tauri/src/commands/git.rs: Test-only change adding a field. No performance impact.
src-tauri/src/commands/thread.rs: Test-only change adding a field. No performance impact.
... and 31 more file-level entries.

Inline Downgraded Items (processed but not inline)

None

Coverage Status

Target files: 51
Covered files: 35
Uncovered files: 16
No-patch/binary covered as file-level: 5
Findings with unknown confidence (N/A): 0

Uncovered list:

src-tauri/src/core/agent_run_manager_tests.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_run_summary.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_run_title.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_session_events.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_session_history.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_session_tools.rs: planner_signaled_done_before_coverage
src-tauri/src/core/agent_session_types.rs: planner_signaled_done_before_coverage
src-tauri/src/core/mod.rs: planner_signaled_done_before_coverage
src/modules/settings-center/model/run-model-plan.test.ts: planner_signaled_done_before_coverage
src/modules/settings-center/model/run-model-plan.ts: planner_signaled_done_before_coverage
src/modules/settings-center/ui/settings-center-overlay.tsx: planner_signaled_done_before_coverage
src/modules/workbench-shell/ui/dashboard-terminal-orchestrator.tsx: planner_signaled_done_before_coverage
src/modules/workbench-shell/ui/runtime-thread-surface-diff.tsx: planner_signaled_done_before_coverage
src/modules/workbench-shell/ui/runtime-thread-surface-helpers.ts: planner_signaled_done_before_coverage
src/modules/workbench-shell/ui/runtime-thread-surface-tools.tsx: planner_signaled_done_before_coverage
src/modules/workbench-shell/ui/thread-rename-input.tsx: planner_signaled_done_before_coverage

No-patch covered list:

src-tauri/src/core/agent_run_manager.rs: patch_unavailable_or_binary
src-tauri/src/core/agent_session.rs: patch_unavailable_or_binary
src-tauri/src/core/agent_session_tests.rs: patch_unavailable_or_binary
src-tauri/src/extensions/mcp.rs: patch_unavailable_or_binary
src-tauri/src/extensions/mod.rs: patch_unavailable_or_binary

Runtime/Budget

Rounds used: 1/4
Planned batches: 3
Executed batches: 3
Sub-agent runs: 8
Planner calls: 1
Reviewer calls: 8
Model calls: 9/64
Structured-output summary-only degradation: NO

github-actions

Automated PR review completed.

Findings kept: 40
Findings with unknown confidence: 0
Inline comments attempted: 32
Target files: 27
Covered files: 27
Uncovered files: 0
See the summary comment for detailed analysis and coverage details.

Sign in to view

@@ -0,0 +1,593 @@
+use tauri::{AppHandle, Emitter};


Sign in to view

@@ -0,0 +1,840 @@
+use tiycore::agent::AgentMessage;


Sign in to view

+                        let _ = event_tx.send(ThreadStreamEvent::ReasoningUpdated {
+                            run_id: run_id.to_string(),
+                            message_id,
+                            reasoning: buffer.clone(),


Sign in to view

+            // message in the same run whose created_at >= started_at, minus a
+            // small delta so the tool call lands just before it.  If no match
+            // is found, place it at the end.
+            let insert_pos = msg_positions


Sign in to view

+    if rendered.len() > MAX_TOOL_RESULT_SIZE {
+        rendered.truncate(MAX_TOOL_RESULT_SIZE);
+        // Ensure we don't cut in the middle of a multi-byte UTF-8 char
+        while !rendered.is_char_boundary(rendered.len()) {


Sign in to view

+    /// scope is read from the `SkillRecordDto.scope` field, which `load_skills`
+    /// now derives from the discovered source label rather than the query
+    /// parameter.
+    pub(super) async fn lookup_skill_actual_scope(


Sign in to view

+        }
+    }
+
+    pub(super) async fn skill_exists(


Sign in to view

+  return Number.isFinite(parsed) ? parsed : 0;
+}
+
+export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> {


Sign in to view

+  totalToolCalls: number;
+};
+
+export function mapSnapshotHelperStatus(


Sign in to view

+        setRegenerating(false);
+        refocusAfterRegenerate();
+      });
+  }, [threadId, modelPlan, isRegenerating, refocusAfterRegenerate, t]);


…d extensions Extract agent run compaction, session execution, and related tests into dedicated modules (agent_run_compaction.rs, agent_session_execution.rs, agent_run_manager_tests.rs, agent_session_tests.rs) to reduce complexity in agent_run_manager.rs and agent_session.rs. Simplify the extensions facade by removing mod.rs and introducing focused modules for config_io, plugins, mcp, skills, and marketplace. Add dashboard sidebar and overlays components to the workbench shell UI. Update documentation and module references to reflect the new structure.

github-actions

Automated PR review completed.

Findings kept: 26
Findings with unknown confidence: 0
Inline comments attempted: 26
Target files: 33
Covered files: 33
Uncovered files: 0
See the summary comment for detailed analysis and coverage details.

Sign in to view

+                        description: read_string_keys(tool, &["description"])
+                            .unwrap_or_else(|| name.clone()),
+                        command,
+                        args: read_string_array_keys(tool, &["args"]),


Sign in to view

+    )
+}
+
+#[cfg(test)]


Sign in to view

+        .collect()
+}
+
+pub(crate) fn convert_history_messages(


Sign in to view

+    }
+}
+
+pub(crate) fn validate_clarify_input(value: &serde_json::Value) -> Result<(), String> {


Sign in to view

+/// this leaves headroom for protocol overhead and JSON escaping.
+const MAX_TOOL_RESULT_SIZE: usize = 8_000_000;
+
+pub(crate) fn agent_tool_result_from_output(


Sign in to view

+                    "path": "/compact",
+                    "description": "Clear history but keep a summary in context.",
+                    "argumentHint": "[instructions=...]",
+                    "argumentsText": instructions.clone().unwrap_or_default(),


Sign in to view

+    format!(
+        "Create a short thread title for this conversation.\n\
+Rules:\n\
+- {language_rule}\n\


Sign in to view

+        }
+    }
+
+    pub(super) async fn skill_exists(


Sign in to view

+import { ThreadRenameInput } from "@/modules/workbench-shell/ui/thread-rename-input";
+import { ThreadStatusIndicator } from "@/modules/workbench-shell/ui/thread-status-indicator";
+
+const WORKSPACE_THREAD_PAGE_SIZE = 10;


Sign in to view

+  state: string;
+};
+
+function asToolDataRecord(value: unknown) {


Move ResolvedTool, ToolProviderContext, and related tool resolution, execution, and hook methods from extensions/mod.rs into a new runtime_tools.rs module to improve separation of concerns and reduce the size of the facade file. refactor(workbench): ♻️ extract dashboard logic and thread surface state - Move dashboard orchestration helpers and constants from dashboard-workbench.tsx into a dedicated dashboard-workbench-logic.ts module for better reuse and readability - Extract runtime thread surface state, timeline mapping, and related types into a new runtime-thread-surface-state.ts module to slim down the main surface component - Update imports and references across workbench-shell UI modules - Update AGENTS.md to reflect the new module structure

github-actions

Automated PR review completed.

Findings kept: 11
Findings with unknown confidence: 0
Inline comments attempted: 11
Target files: 36
Covered files: 19
Uncovered files: 17
See the summary comment for detailed analysis and coverage details.

Sign in to view

+        self.handle_runtime_event(run_id, terminal_event).await
+    }
+
+    pub(crate) async fn handle_runtime_event(


Sign in to view

+use crate::ipc::frontend_channels::ThreadStreamEvent;
+use crate::model::thread::RunUsageDto;
+
+pub(crate) fn handle_agent_event(


Sign in to view

+use super::agent_session::{standard_tool_timeout, AgentSession, CLARIFY_TOOL_NAME};
+
+impl AgentSession {
+    pub(crate) async fn execute_tool_call(


Sign in to view

+
+    pub async fn marketplace_install_item(&self, id: &str) -> Result<PluginDetailDto, AppError> {
+        let item = self
+            .marketplace_list_items()


Sign in to view

+                .arg("clone")
+                .arg("--depth")
+                .arg("1")
+                .arg(&source.url)


Sign in to view

+    /// audit. It always emits a terminal event (RunCompleted / RunFailed /
+    /// RunCancelled) and always clears the `ActiveRun`, even on panic-like
+    /// early returns, so the thread can't get stuck in Running state.
+    async fn run_compact_background(


Sign in to view

+/// directly rather than via the Agent runtime.
+///
+/// [`generate_discard_summary`]: crate::core::context_compression::generate_discard_summary
+pub(crate) async fn run_auto_compression(


Sign in to view

+            &cache_dir.join(".claude-plugin/marketplace.json"),
+        )?;
+        let source_name = source_manifest.name.unwrap_or_else(|| source.name.clone());
+        let installed = self


Sign in to view

+}
+
+pub(super) fn marketplace_source_id(url: &str) -> String {
+    let mut hasher = std::collections::hash_map::DefaultHasher::new();


Sign in to view

+}
+
+pub(super) fn is_builtin_marketplace_source_id(id: &str) -> bool {
+    builtin_marketplace_sources()


…ar persistence - Extract `persist_clear_context_reset_to_pool` to isolate context reset persistence logic - Refactor `resolve_helper_tool_task` into a reusable function with clear error handling - Add comprehensive unit tests for context clear persistence and helper tool task resolution - Improve test coverage for terminal events and reasoning completion policies

…y cleanup - Always attach the payload hook so the DeepSeek thinking normalizer runs even when no provider_options are present - Add `is_deepseek_provider` detection by provider type and base URL - Add `normalize_deepseek_thinking_payload` to sanitize assistant messages by filling missing reasoning_content, preventing null content, and stripping reasoning when thinking is disabled - Pass `thinking_level` through helper agent orchestrator requests and enable thinking only when the model supports it - Skip empty reasoning records in `convert_history_messages` to avoid serialization issues with providers like DeepSeek - Clear pending thinking at tool-result boundaries to prevent orphan reasoning from leaking across assistant messages - Make `merge_payload` public and reuse it from `agent_session` - Add comprehensive tests for DeepSeek detection and payload normalization

github-actions

Automated PR review completed.

Findings kept: 30
Findings with unknown confidence: 0
Inline comments attempted: 24
Target files: 37
Covered files: 36
Uncovered files: 1
See the summary comment for detailed analysis and coverage details.

Sign in to view

+                .arg("clone")
+                .arg("--depth")
+                .arg("1")
+                .arg(&source.url)


Sign in to view

+        event: &str,
+        payload: serde_json::Value,
+    ) -> Result<HookOutput, AppError> {
+        let command_path = plugin.path.join(handler);


Sign in to view

+
+        let output = self
+            .execute_command_json(
+                OsStr::new(&tool.command),


Sign in to view

+// mutates) will saturate the IPC queue and block thread list rendering.
+export const SIDEBAR_SYNC_MIN_GAP_MS = 300;
+
+export function buildInitialWorkspaceThreadDisplayCounts() {


Sign in to view

+  return Number.isFinite(parsed) ? parsed : 0;
+}
+
+export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> {


Sign in to view

+ * a tool's state when a stale snapshot resolves after live stream events
+ * have already advanced the tool.
+ */
+function isMoreAdvancedToolState(


Sign in to view

+  showOutputLabel?: boolean;
+};
+
+export function getReadToolPresentation(tool: RuntimeSurfaceToolEntry): ReadToolPresentation | null {


Sign in to view

+            )
+        })?;
+
+        if !discovered && !plugin_dir.exists() {


Sign in to view

+                        </div>
+                      </div>
+                    ) : (
+                    sortWorkspacesWithWorktrees(workspaces as WorkspaceItem[]).map((workspace) => {


Sign in to view

+  };
+}
+
+


Introduce turn_index propagation across agent session events, stream events, and persistence to enable proper response boundary tracking and reasoning lifecycle management. Key changes: - Add turn_index field to ThreadStreamEvent::MessageCompleted and ReasoningUpdated, and propagate it from AgentEvent::MessageUpdate and MessageEnd through the event handling pipeline - Persist turn_index into message metadata when messages complete - Improve reasoning message termination logic to discard invalid or empty reasoning blocks without valid thinking signatures - Expand DeepSeek payload normalizer to backfill reasoning_content on text-only assistant messages, not just those with tool_calls - Add discard_dangling_reasoning cleanup for interrupted runs on startup recovery - Update all related tests and frontend integration tests to cover turn_index propagation and new DeepSeek normalization scenarios Closes: Phase 4 reasoning mis-allocation, DeepSeek 400 errors

github-actions

Automated PR review completed.

Findings kept: 36
Findings with unknown confidence: 0
Inline comments attempted: 32
Target files: 42
Covered files: 42
Uncovered files: 0
See the summary comment for detailed analysis and coverage details.

Sign in to view

+
+use super::agent_run_manager::AgentRunManager;
+
+pub(crate) fn build_orphaned_run_terminal_event(


Sign in to view

+        self.handle_runtime_event(run_id, terminal_event).await
+    }
+
+    pub(crate) async fn handle_runtime_event(


Sign in to view

+                    });
+                }
+
+                match outcome.result {


Sign in to view

+}
+
+impl AgentSession {
+    pub(crate) async fn execute_tool_call(


Sign in to view

+        .collect()
+}
+
+pub(crate) fn convert_history_messages(


Sign in to view

+import { ThreadRenameInput } from "@/modules/workbench-shell/ui/thread-rename-input";
+import { ThreadStatusIndicator } from "@/modules/workbench-shell/ui/thread-status-indicator";
+
+const WORKSPACE_THREAD_PAGE_SIZE = 10;


Sign in to view

+// mutates) will saturate the IPC queue and block thread list rendering.
+export const SIDEBAR_SYNC_MIN_GAP_MS = 300;
+
+export function buildInitialWorkspaceThreadDisplayCounts() {


Sign in to view

+  return Number.isFinite(parsed) ? parsed : 0;
+}
+
+export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> {


Sign in to view

+  showOutputLabel?: boolean;
+};
+
+export function getReadToolPresentation(tool: RuntimeSurfaceToolEntry): ReadToolPresentation | null {


Sign in to view

+  input?: unknown;
+  name: string;
+  result?: unknown;
+  state: string;


github-actions

Automated PR review completed.

Findings kept: 24
Findings with unknown confidence: 0
Inline comments attempted: 22
Target files: 42
Covered files: 29
Uncovered files: 13
See the summary comment for detailed analysis and coverage details.

Sign in to view

+        self.handle_runtime_event(run_id, terminal_event).await
+    }
+
+    pub(crate) async fn handle_runtime_event(


Sign in to view

+use crate::ipc::frontend_channels::ThreadStreamEvent;
+use crate::model::thread::RunUsageDto;
+
+pub(crate) fn handle_agent_event(


Sign in to view

+    }
+}
+
+pub(crate) fn validate_clarify_input(value: &serde_json::Value) -> Result<(), String> {


Sign in to view

+/// this leaves headroom for protocol overhead and JSON escaping.
+const MAX_TOOL_RESULT_SIZE: usize = 8_000_000;
+
+pub(crate) fn agent_tool_result_from_output(


Sign in to view

+    })
+}
+
+pub(super) fn read_string_array_keys(


Sign in to view

+        })
+}
+
+pub(super) fn mask_sensitive_value(value: &str) -> String {


Sign in to view

+        .then_with(|| left.id.cmp(&right.id))
+}
+
+pub(super) fn apply_skill_state(record: &mut SkillRecordDto, state: &SkillStateStore) {


Sign in to view

        run_id: "run-1".into(),
        message_id: "msg-1".into(),
        content: "Full response".into(),
+        turn_index: None,


Sign in to view

+                .await;
+        }
+
+        let tool_call_storage_id = uuid::Uuid::now_v7().to_string();


Sign in to view

+            )
+        })?;
+
+        if !discovered && !plugin_dir.exists() {


Add a new `supports_reasoning` field across the model configuration stack to properly track and propagate reasoning capabilities: - Introduce `supports_reasoning` to `RuntimeModelRole`, `RunModelPlanRoleDto`, and related types - Infer reasoning support automatically from model IDs with keyword patterns (e.g. o1, r1, gpt-5) - Allow manual overrides via `capabilityOverrides` to disable/enable reasoning per model - Extract `inferModelCapabilities` and `getEffectiveModelCapabilities` into a shared module - Update `resolve_runtime_model_role` to respect declared reasoning capability instead of forcing it - Refactor `resolve_model_plan` to apply thinking level only to reasoning-capable roles via `apply_thinking_level_to_model_role` - Adjust `configure_agent` to check both thinking level and `model.reasoning` flag - Add comprehensive tests for reasoning capability resolution, thinking level application, and helper role behavior BREAKING CHANGE: model reasoning is no longer forced on when thinking level is enabled; it now depends on the model's declared capability

github-actions

Automated PR review completed.

Findings kept: 39
Findings with unknown confidence: 0
Inline comments attempted: 32
Target files: 49
Covered files: 49
Uncovered files: 0
See the summary comment for detailed analysis and coverage details.

Sign in to view

+    /// audit. It always emits a terminal event (RunCompleted / RunFailed /
+    /// RunCancelled) and always clears the `ActiveRun`, even on panic-like
+    /// early returns, so the thread can't get stuck in Running state.
+    async fn run_compact_background(


Sign in to view

+/// (errors, final output) of large tool results. When the omitted middle
+/// section is very small (< 50 chars), a simple head truncation is used
+/// instead to avoid a gap marker that hides barely any content.
+pub(crate) fn truncate_tool_result_head_tail(text: &str, max_chars: usize) -> String {


Sign in to view

+        .collect()
+}
+
+pub(crate) fn convert_history_messages(


Sign in to view

+        Ok(items)
+    }
+
+    pub async fn marketplace_install_item(&self, id: &str) -> Result<PluginDetailDto, AppError> {


Sign in to view

+        event: &str,
+        payload: serde_json::Value,
+    ) -> Result<HookOutput, AppError> {
+        let command_path = plugin.path.join(handler);


Sign in to view

+  return Number.isFinite(parsed) ? parsed : 0;
+}
+
+export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> {


Sign in to view

+    .map(([toolName, count]) => `${toolName} ${count}`);
+}
+
+export function getHelperElapsedSeconds(


Sign in to view

+  input?: unknown;
+  name: string;
+  result?: unknown;
+  state: string;


Sign in to view

 git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] }
 similar = "2"
-tiycore = "0.1.19"
+tiycore = "0.1.21-rc.26042620"


Sign in to view

+        let spawn_model_role = preview_spec.model_plan.primary.clone();
+        let spawn_response_language = response_language.map(str::to_owned);
+        let spawn_frontend_tx = frontend_tx.clone();
+        tokio::spawn(async move {


…to tiycore Move DeepSeek reasoning_content normalization from application layer into tiycore, eliminating duplicate handling across agent_session and subagent orchestrator. - Remove is_deepseek_provider and normalize_deepseek_thinking_payload from agent_session - Remove redundant normalize_deepseek_thinking_payload tests from agent_session_tests - Update subagent orchestrator payload hook to rely on tiycore built-in handling - Add reasoning_content_constrained field in settings_manager tests - Update agent_run integration tests with supportsReasoning flag - Bump tiycore to 0.2.1-rc.26042720 and refresh Cargo.lock

github-actions

Automated PR review completed.

Findings kept: 35
Findings with unknown confidence: 0
Inline comments attempted: 32
Target files: 51
Covered files: 51
Uncovered files: 0
See the summary comment for detailed analysis and coverage details.

Sign in to view

+                        let _ = event_tx.send(ThreadStreamEvent::ReasoningUpdated {
+                            run_id: run_id.to_string(),
+                            message_id,
+                            reasoning: buffer.clone(),


Sign in to view

+use crate::ipc::frontend_channels::ThreadStreamEvent;
+use crate::model::thread::RunUsageDto;
+
+pub(crate) fn handle_agent_event(


Sign in to view

+            );
+        }
+
+        self.checkpoint_requested.store(true, Ordering::SeqCst);


Sign in to view

+        .collect()
+}
+
+pub(crate) fn convert_history_messages(


Sign in to view

+            // candidate text message and insert_pos — a reasoning message in
+            // between indicates the tool call came from a different model
+            // response and must remain a separate assistant message.
+            let merge_target_pos = msg_positions[..insert_pos]


Sign in to view

+  return Number.isFinite(parsed) ? parsed : 0;
+}
+
+export function buildDiffPreviewRows(diff: string): Array<DiffPreviewRow> {


Sign in to view

+  input?: unknown;
+  name: string;
+  result?: unknown;
+  state: string;


Sign in to view

+            model_name: model_id.to_string(),
+            provider_type: "openai".to_string(),
+            provider_name: "OpenAI".to_string(),
+            api_key: Some("test-key".to_string()),


Sign in to view

+        // Merge per-model provider_options into every outgoing LLM request payload.
+        {
+            let provider_options = request.model_role.provider_options.clone();
+            if provider_options.is_some() {


Sign in to view

                tool_calls = tc_count,
                helpers = helper_count,
+                reasoning = reasoning_count,
                "interrupted dangling runs/tool_calls/helpers on startup"


Add path traversal prevention by canonicalizing the hook path and checking it stays within the plugin directory. Previously, a malicious plugin could escape its directory via crafted hook paths. Additionally: - Update log message in thread_manager to include "reasoning" in startup dangling runs/tool_calls/helpers/reasoning count. - Import WORKSPACE_THREAD_PAGE_SIZE from shared logic instead of local constant in dashboard-sidebar. - Change state property type from string to SurfaceToolState in RuntimeSurfaceToolEntry for better type safety.

github-actions

Automated PR review completed.

Findings kept: 10
Findings with unknown confidence: 0
Inline comments attempted: 10
Target files: 51
Covered files: 25
Uncovered files: 26
See the summary comment for detailed analysis and coverage details.

Sign in to view

+                        description: read_string_keys(tool, &["description"])
+                            .unwrap_or_else(|| name.clone()),
+                        command,
+                        args: read_string_array_keys(tool, &["args"]),


Sign in to view

+                .collect::<Vec<_>>()
+        });
+
+        let output = self


Sign in to view

+
+        let setup = async {
+            message_repo::insert(&self.pool, &user_message).await?;
+            message_repo::insert(&self.pool, &reset_message).await?;


Sign in to view

+}
+
+pub(super) fn marketplace_source_id(url: &str) -> String {
+    let mut hasher = std::collections::hash_map::DefaultHasher::new();


Sign in to view

+                .arg("clone")
+                .arg("--depth")
+                .arg("1")
+                .arg(&source.url)


Sign in to view

+        let installed = self.load_installed_plugin_records().await?;
+        let mut items = Vec::with_capacity(installed.len());
+        for record in installed {
+            let mut runtime = self.load_plugin_from_dir(Path::new(&record.path), false)?;


Sign in to view

+        args: Some(read_string_array_keys(spec, &["args"])),
+        env: read_string_map_keys(spec, &["env"]),
+        cwd: read_string_keys(spec, &["cwd"]),
+        url: read_string_keys(spec, &["url"]),


Sign in to view

    maxOutputTokens: model.maxOutputTokens ?? null,
-    supportsImageInput: model.capabilityOverrides.vision ?? null,
+    supportsImageInput: capabilities.vision,
+    supportsReasoning: capabilities.reasoning,


Sign in to view

+        if did_update_sources {
+            self.save_marketplace_sources(&store)?;
+        }
+        let installed = self.load_installed_plugin_records().await?;


Sign in to view

+            )
+        })?;
+
+        if !discovered && !plugin_dir.exists() {


Update the Homebrew cask URL to remove duplicate 'v' prefix in version path. Add additional application data directories to the zap uninstall cleanup list (HTTPStorages, Logs, and WebKit) to ensure complete removal of local application data.

github-actions

Automated PR review completed.

Findings kept: 22
Findings with unknown confidence: 0
Inline comments attempted: 22
Target files: 51
Covered files: 35
Uncovered files: 16
See the summary comment for detailed analysis and coverage details.

github-actions · 2026-04-28T00:44:21Z

+        self.handle_runtime_event(run_id, terminal_event).await
+    }
+
+    pub(crate) async fn handle_runtime_event(


[CRITICAL] agent_run_event_handler.rs has zero test coverage for the main event handler

The central runtime event handler that drives all agent run state transitions has no automated tests. Race conditions in status transitions (e.g. MessageDelta → MessageCompleted, or duplicate terminal events) are entirely untested.

Suggestion: Add unit tests for each match arm of handle_runtime_event using a mock AgentRunManager or by extracting pure decision functions (terminal_event_status, should_complete_reasoning_for_event, is_terminal_runtime_event) and testing those with synthetic ThreadStreamEvent variants. The free functions at R16-R59 are ideal candidates for immediate unit testing.

Risk: State machine regressions (e.g. wrong thread status after cancellation, reasoning messages left in 'streaming' state) will only be caught manually.

Confidence: 0.95

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

+    ///
+    /// Returns `(run_id, event_rx)` so the caller can forward events over a
+    /// Tauri `Channel` identical to `start_run`.
+    pub async fn compact_thread_context(


[HIGH] compact_thread_context and run_compact_background have no tests

The /compact command entry point — including ActiveRun registration, message persistence order, broadcast channel setup, and the spawned background task — has no test coverage. The existing test only covers the simpler /clear path.

Suggestion: Add a test that: (1) calls compact_thread_context on a seeded thread; (2) verifies the user message, reset marker, and run row are persisted; (3) verifies the frontend channel receives RunStarted and ContextCompressing events. The LLM-dependent part (run_compact_background) can be tested with a mock or by verifying the final state after a short timeout.

Risk: Compaction lifecycle bugs (stuck Running state, missing reset markers, missing events) will only be caught in manual testing.

Confidence: 0.88

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

+            message_repo::update_status(&self.pool, &message_id, finalized_message_status).await?;
+        }
+        // Reasoning termination: classify by content/signature validity
+        if let Some(message_id) = reasoning_message_id {


[HIGH] finish_run reasoning discard logic has no test coverage

The reasoning message classification in finish_run (discard if empty or missing signature for non-completed runs, keep otherwise) directly controls what thinking content users see after interrupted/cancelled runs. No tests validate any of the four combinations (empty vs non-empty × signature present vs absent).

Suggestion: Extract the discard decision into a pure function and add parameterized tests covering: (1) completed run → never discard; (2) non-completed + empty content → discard; (3) non-completed + no signature → discard; (4) non-completed + content + signature → keep as completed.

Risk: Users may lose valid thinking content on interrupted runs, or see garbage thinking content that should have been discarded.

Confidence: 0.87

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

+}
+
+impl AgentSession {
+    pub(crate) async fn execute_tool_call(


[HIGH] execute_tool_call dispatch has no test coverage for any branch

The primary tool execution dispatcher has no integration tests. Branch-specific logic like approval flow event emission (ApprovalRequired → ApprovalResolved), timeout result persistence, and clarify request lifecycle are all untested.

Suggestion: At minimum, add unit tests for the event emission patterns: (1) tool approval required then approved emits both events; (2) tool timeout persists the error to tool_call_repo; (3) clarify request emits ClarifyRequired then ClarifyResolved. These can be tested with mock event_tx and tool_gateway.

Risk: Event emission ordering bugs (e.g. missing ToolRunning event, double ToolFailed) would cause frontend state desynchronization.

Confidence: 0.85

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

+    ) -> Result<HookOutput, AppError> {
+        let command_path = plugin.path.join(handler);
+        // Prevent path traversal: ensure the resolved command stays within the plugin directory.
+        let command_path = std::fs::canonicalize(&command_path).map_err(|error| {


[HIGH] execute_hook path traversal prevention is untested

The path traversal guard in execute_hook is security-critical but untested. A regression in the starts_with comparison or canonicalize behavior could silently allow execution of arbitrary binaries.

Suggestion: Add tests that: (1) a handler '../../../etc/passwd' is rejected with the hook_escape error; (2) a symlink inside the plugin dir that points outside is rejected; (3) a valid handler within the plugin dir succeeds.

Risk: A path traversal bypass would allow plugin manifests to execute arbitrary commands outside their directory.

Confidence: 0.92

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

+  });
+}
+
+export function mapRunStateToWorkbenchThreadStatus(


[MEDIUM] No unit tests for dashboard-workbench-logic.ts pure functions

The extracted workbench logic module contains status mapping and data computation functions with no test coverage. These functions drive sidebar status indicators and context badge display.

Suggestion: Add tests for: each RunState value in mapRunStateToWorkbenchThreadStatus; each status in mapRunFinishedStatusToThreadStatus; buildThreadContextBadgeData with zero/positive context window; mergeLocalFallbackThreads with overlapping and disjoint thread sets; formatCompactTokenCount boundary values.

Risk: Incorrect sidebar thread status badges or context usage display after refactoring

Confidence: 0.88

[From SubAgent: testing]

github-actions · 2026-04-28T00:44:21Z

 git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] }
 similar = "2"
-tiycore = "0.1.19"
+tiycore = { version = "0.2.1-rc.26042720" }


[LOW] RC version of tiycore used as a production dependency

The tiycore dependency was bumped from a stable version (0.1.19) to a release candidate (0.2.1-rc.26042720). RC versions may have unstable APIs or bugs and shouldn't ship to production without explicit sign-off.

Suggestion: Ensure this RC is intentional and temporary. Add a comment or tracking issue reference in Cargo.toml, and plan to pin to the stable 0.2.1 once it's released before merging to main.

Risk: An RC dependency could introduce regressions, breaking changes, or instability in agent runtime behavior.

Confidence: 0.90

[From SubAgent: general]

github-actions · 2026-04-28T00:44:21Z

 git2 = { version = "0.20", features = ["vendored-libgit2", "vendored-openssl"] }
 similar = "2"
-tiycore = "0.1.19"
+tiycore = { version = "0.2.1-rc.26042720" }


[LOW] Pre-release tiycore dependency introduces supply-chain risk

The tiycore dependency is updated from a stable version (0.1.19) to a pre-release RC version (0.2.1-rc.26042720), which may carry additional supply-chain and stability risk.

Suggestion: Pin to a stable release before shipping to production. If an RC is required for feature access, verify the upstream tag and commit signature.

Risk: RC builds may contain unreviewed changes, security regressions, or unstable behavior not present in the stable release.

Confidence: 0.90

[From SubAgent: security]

github-actions · 2026-04-28T00:44:21Z

+
+        if let Some(mut stdin) = child.stdin.take() {
+            tokio::spawn(async move {
+                let _ = stdin.write_all(&payload).await;


[LOW] Stdin write error silently ignored in execute_command_

The stdin write to the child process in execute_command_ silently ignores errors. If writing the JSON payload fails, the child may hang waiting for input or process empty input, producing confusing results.

Suggestion: Log the error at minimum: if let Err(e) = stdin.write_all(&payload).await { tracing::warn!(...); }. Consider propagating the error or killing the child process if the write fails.

Risk: Silent stdin write failures can cause plugins/hooks to hang or produce incorrect output without any diagnostic signal.

Confidence: 0.85

[From SubAgent: general]

github-actions · 2026-04-28T00:44:21Z

+    status: message.status,
+  };
+}
+


[LOW] Double blank lines in runtime-thread-surface-state.ts

Inconsistent spacing with double blank lines in the new state module.

Suggestion: Remove extra blank lines to maintain single-blank-line convention.

Risk: Minor style inconsistency; no functional impact.

Confidence: 0.90

[From SubAgent: general]

github-actions Bot reviewed Apr 25, 2026

View reviewed changes

github-actions Bot reviewed Apr 26, 2026

View reviewed changes

jorben added 3 commits April 26, 2026 08:25

fix(subagent): 🐛 only enable thinking when model supports reasoning

19b7b03

github-actions Bot reviewed Apr 26, 2026

View reviewed changes

chore(deps): 🔧 update windows-sys and windows-core dependencies

d6e14ce

github-actions Bot reviewed Apr 27, 2026

View reviewed changes

github-actions Bot reviewed Apr 28, 2026

View reviewed changes

jorben merged commit dd55d5c into master Apr 28, 2026
4 checks passed

jorben deleted the refact/arch-optimization-2604 branch April 28, 2026 00:39

github-actions Bot reviewed Apr 28, 2026

View reviewed changes


		use super::agent_run_manager::AgentRunManager;

		pub(crate) fn build_orphaned_run_terminal_event(

Conversation

HayWolf commented Apr 25, 2026

Summary

Test Plan

Uh oh!

github-actions Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Code Review Summary

Overall Assessment

Major Findings by Severity

Actionable Suggestions

Potential Risks

Test Suggestions

File-Level Coverage Notes

Inline Downgraded Items (processed but not inline)

Coverage Status

Runtime/Budget

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

github-actions Bot commented Apr 25, 2026 •

edited

Loading