Merged
fewtarius added a commit that referenced this pull request on Jan 4, 2026
…ixes

Complete redesign of agent orchestrator workflow engine to fix critical bugs in todo tracking, message alternation, and continuation guidance. Evolved from rigid "force tools" approach to intelligent context-aware orchestration that follows the orchestrator.txt flow diagram correctly.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
SUMMARY OF FIXES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

**Fix #1: Todo Workflow Infinite Loops & Update-Before-Create**

Agent tried to update todos before creating list, causing crashes and loops. Graduated interventions gave contradictory instructions.

Solution:
- Added CRITICAL ERROR TO AVOID section in todo_operations tool
- Visual step-by-step workflow added to tool description
- Rewrote all 3 graduated intervention levels with consistent guidance
- Implemented proper Continue Flag pattern matching orchestrator.txt
- Added shouldContinueAfterChecks flag for correct workflow flow

Result: No more update-before-create crashes, workflow matches design exactly

**Fix #2: Response Loops (Agent Repeating Same Text)**

Agent stuck repeating same response infinitely when todos incomplete. pendingAutoContinueMessage was set but never injected.

Solution:
- Inject pendingAutoContinueMessage at iteration start
- Call injectAutoContinueIfTodosIncomplete() when no tools + active todos
- Graduated intervention (Level 1 → 2 → 3) now works as designed
- Remove last assistant message to prevent loops

Result: No more infinite response loops, graduated intervention working

**Fix #3: Tool Result Infinite Loop (read_tool_result Stuck)**

Agent stuck seeing SAME tool result chunk repeatedly for 15+ iterations. TOOL_RESULT_CHUNK messages preserved across iterations incorrectly.

Solution:
- Removed preservation logic for TOOL_RESULT_CHUNK messages
- Chunks now appear once when tool executes
- Agent must call read_tool_result to get more chunks
- Proper pagination flow restored

Result: No more chunk re-injection loops, clean pagination behavior

**Fix #4: Message Alternation Violations**

Multiple consecutive assistant messages broke Claude API compatibility. Evolved through 3 iterations: Rigid → Binary → Flexible → Context-Aware

Final Solution - Todo-Aware Continuation Guidance:
- 4 guidance variants: (has todos YES/NO) × (tools used YES/NO)
- With todos + tools: "MANDATORY TODO WORKFLOW: mark → work → complete"
- With todos + no tools: "You have incomplete todos - MUST follow workflow"
- Without todos + tools: "Need more data? → tools. Have enough? → respond"
- Without todos + no tools: "Already answered? Use tools for follow-up"

Result: Fixes consecutive messages, allows flexibility, enforces discipline

**Fix #5: Planning Loop False Positives**

Planning loop detector flagged normal workflow (mark todo → work → complete) as infinite loop because it saw consecutive todo_operations calls.

Solution:
- Added isTodoCompletionCall() helper function
- Marking todos complete now counts as progress
- Only flag as loop if NO work tools AND NO todo completions

Result: Normal workflow allowed, actual loops still detected

**Fix #6: Stale Todo List (Workflow Stopped Early)**

Workflow stopped when todos incomplete because orchestrator saw stale "all complete" state. currentTodoList only updated after tool execution.

Solution:
- Read fresh todo list from MCP BEFORE every workflow check
- Only if currentTodoList.count > 0 (known active list exists)
- TodoReminderInjector: Clarified workflow guidance wording
- Makes explicit: mark in-progress → DO THE WORK → mark completed

Result: Workflows continue correctly, no premature stops, fresh state accurate

**Fix #7: Duplicate Tool Cards in Streaming Mode**

Tool cards appeared twice in UI - once from streaming, once from main loop.

Solution:
- Skip tool message creation in main loop when streaming active
- Check streamContinuation - if present, streaming created messages
- Non-streaming unchanged

Result: Clean UI, no duplicate cards

**Fix #8: Web Research Error Card Clutter**

Red error cards for expected situations (empty pages, no results). VectorRAGError helpful messages wrapped with confusing text.

Solution:
- Pass through VectorRAGError messages without wrapping
- Handle partial failures gracefully (some sources succeed = green card)
- Only show error if ALL sources fail

Result: Clean UI, helpful guidance preserved

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TESTING RESULTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✅ Simple workflow (3 stories): All tracked, completed, brief summary
✅ Complex workflow (3 research tasks): All tracked, completed, brief summary
✅ Fresh todo reads detect incomplete todos correctly
✅ Agent doesn't repeat work in final summary
✅ No more response loops or chunk re-injection
✅ Planning loop detector allows normal workflow
✅ Context-aware continuation guidance works
✅ Streaming mode: no duplicate tool cards
✅ Web research: clean error handling
✅ Build: PASS (all commits)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
DOCUMENTATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Added comprehensive documentation:
- project-docs/AGENT_ORCHESTRATOR.md (complete architecture)
  * Complete workflow flow diagram (Mermaid)
  * Detailed 8-step decision tree
  * Fresh todo state read documentation
  * Continuation priority table
  * All fixes documented with root causes
  * Known Issues updated with resolutions
- ai-assisted/2026-01-04/workflow-alternation-fix/
  * CONTINUATION_PROMPT.md (session handoff)
  * AGENT_PLAN.md (remaining work breakdown)
- .github/copilot-instructions.md
  * Added isBackground=false requirement (CRITICAL)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FILES MODIFIED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Primary:
- Sources/APIFramework/AgentOrchestrator.swift
  * Fresh todo reads before workflow checks
  * Context-aware continuation guidance system
  * Graduated intervention injection fixed
  * Planning loop detection improvements
  * Duplicate tool card fix for streaming
- Sources/MCPFramework/TodoReminderInjector.swift
  * Todo-aware workflow guidance (4 variants)
  * Clear final message guidance when all tasks complete
- Sources/MCPFramework/Tools/TodoOperationsTool.swift
  * CREATE FIRST requirement documentation
  * Visual step-by-step workflow added
- Sources/ConfigurationSystem/SimpleSystemPromptManager.swift
  * Todo workflow discipline documentation

Supporting:
- Sources/MCPFramework/Tools/WebResearchTool.swift
  * VectorRAGError pass-through without wrapping
- Sources/ConversationEngine/WebResearchService.swift
  * Partial failure handling (some sources succeed)

Documentation:
- project-docs/AGENT_ORCHESTRATOR.md (new + updated)
- ai-assisted/2026-01-04/workflow-alternation-fix/* (new)
- .github/copilot-instructions.md (updated)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ARCHITECTURAL IMPROVEMENTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Context-Aware Guidance System:
- 4 guidance variants adapt to workflow state
- Based on: (has incomplete todos?) × (tools called?)
- Each variant provides specific, actionable instructions
- Prevents workflow violations through clear communication

Fresh State Reads:
- Todo list read fresh from MCP before every workflow check
- Prevents stale cache bugs that caused premature stops
- Only reads when active todo list exists (performance optimization)

Graduated Intervention Pressure:
- Level 1: Polite reminder about incomplete todos
- Level 2: Warning about loop behavior
- Level 3: Final warning before failure
- Escalates pressure if agent keeps ignoring todos

Proper Continue Flag Pattern:
- Matches orchestrator.txt flow diagram exactly
- Can be set by: tool execution, incomplete todos, workflow mode
- Priority ordering ensures correct continuation behavior

Unified Todo Workflow Discipline:
- All guidance sources enforce same workflow
- CREATE list first, THEN mark in-progress
- Mark in-progress → DO THE WORK → mark completed
- Never skip status updates

Code Quality:
- Eliminated ~100 lines of redundant code
- Cleaner separation: streaming vs non-streaming paths
- Better error handling and logging

Result: Intelligent orchestrator that adapts guidance based on workflow context while enforcing todo discipline and preventing loops.
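The guidance matrix from Fix #4 and the loop rule from Fix #5 can be sketched roughly as below. This is a minimal illustration, not the code in AgentOrchestrator.swift: the `ToolCall` type, the `continuationGuidance` and `isPlanningLoop` functions, and the `"status"` argument key are all hypothetical. Only the guidance strings, the `todo_operations` tool name, and the `isTodoCompletionCall` helper name come from the commit message.

```swift
/// Picks one of the four guidance variants: (has incomplete todos) × (tools used).
/// Hypothetical function; the strings are quoted from the commit message.
func continuationGuidance(hasIncompleteTodos: Bool, toolsUsed: Bool) -> String {
    switch (hasIncompleteTodos, toolsUsed) {
    case (true, true):
        return "MANDATORY TODO WORKFLOW: mark → work → complete"
    case (true, false):
        return "You have incomplete todos - MUST follow workflow"
    case (false, true):
        return "Need more data? → tools. Have enough? → respond"
    case (false, false):
        return "Already answered? Use tools for follow-up"
    }
}

/// Simplified, hypothetical stand-in for a tool call made during one iteration.
struct ToolCall {
    let name: String
    let arguments: [String: String]
}

/// Assumed shape: a todo_operations call whose "status" argument is "completed".
func isTodoCompletionCall(_ call: ToolCall) -> Bool {
    call.name == "todo_operations" && call.arguments["status"] == "completed"
}

/// Flags a planning loop only when the iteration used NO work tools
/// AND completed NO todos, so mark → work → complete is not a false positive.
func isPlanningLoop(_ calls: [ToolCall]) -> Bool {
    let usedWorkTool = calls.contains { $0.name != "todo_operations" }
    let completedTodo = calls.contains(where: isTodoCompletionCall)
    return !usedWorkTool && !completedTodo
}
```

Switching over the tuple makes the four cases exhaustive at compile time, so adding a third dimension later would force every variant to be revisited.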
fewtarius added a commit that referenced this pull request on Jan 7, 2026
**Problem:** State and ER diagrams have rendering issues and cause confusion.

**Solution:**
1. Removed from system prompt (SystemPromptConfiguration.swift)
   - Deleted 'stateDiagram' and 'erDiagram' from Mermaid types list
   - LLMs will no longer be instructed to generate these types
2. Added parser guards (MermaidParser.swift)
   - Return .unsupported for state/ER diagrams
   - Log warning when encountered
   - Prevents attempted rendering

**Impact:**
- Test file mermaid-test.json will now show diagrams #3 (class) then #5 (gantt) with #4 (state) hidden
- No more diagram numbering confusion
- No more cross-diagram contamination

**To Re-Enable:**
1. Fix rendering issues in StateDiagramRenderer.swift and ERDiagramRenderer.swift
2. Remove parser guards
3. Re-add to system prompt

**Files Modified:**
- Sources/ConfigurationSystem/SystemPromptConfiguration.swift (line 564)
- Sources/UserInterface/Chat/Mermaid/MermaidParser.swift (lines 22-26)

**Testing:**
✅ Build: PASS
⏳ User should no longer see State diagram (#4)
⏳ Class diagram should work correctly
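A parser guard of the kind described above might look like the following sketch. The `DiagramKind` enum and `detectDiagramKind` function are hypothetical names chosen for illustration. The commit only states that MermaidParser.swift returns `.unsupported` and logs a warning for state and ER diagrams, so the header matching here is an assumption.

```swift
/// Hypothetical result type; only the `.unsupported` case name comes from the commit.
enum DiagramKind: Equatable {
    case flowchart, classDiagram, gantt
    case unsupported(String)
    case unknown
}

/// Classifies a Mermaid source by its leading keyword.
/// Disabled types are rejected up front so no renderer is ever invoked for them.
func detectDiagramKind(_ source: String) -> DiagramKind {
    let trimmed = source.drop(while: { $0.isWhitespace })

    // Guard: recognized but disabled diagram types.
    for disabled in ["stateDiagram", "erDiagram"] where trimmed.hasPrefix(disabled) {
        print("warning: \(disabled) is disabled, skipping render")
        return .unsupported(disabled)
    }

    if trimmed.hasPrefix("classDiagram") { return .classDiagram }
    if trimmed.hasPrefix("gantt") { return .gantt }
    if trimmed.hasPrefix("flowchart") || trimmed.hasPrefix("graph") { return .flowchart }
    return .unknown
}
```

Keeping the disabled list in one place means re-enabling a type is a one-line change once its renderer is fixed, which matches the "Remove parser guards" step above.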
fewtarius added a commit that referenced this pull request on Jan 15, 2026
…tance

**Problem:**
1. Personality/SystemPrompt editors showed empty fields on first open (race condition)
2. Changing default personality in preferences didn't affect new conversations
3. Mermaid diagrams render poorly for complex flows (hardcoded spacing)

**Solution:**
1. Replaced .onAppear pattern with custom init() in editors:
   - PersonalityEditor: Initialize @State values directly from personality parameter
   - SystemPromptConfigurationEditor: Same pattern for consistency
   - Data available immediately when view appears, no async delay
2. ConversationSettings.init() now reads defaultPersonalityId from UserDefaults:
   - Falls back to Assistant UUID if not set
   - New conversations inherit user's default personality preference
3. Issue #3 investigated but not fixed in this commit:
   - All Mermaid renderers use hardcoded sizing (nodeWidth=180, spacing=100, etc.)
   - Recommendation: Switch to mermaid…

…e view displays
✅ Default personality now inherited by new conversations
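The first two solutions can be illustrated with a short sketch. It assumes a `"defaultPersonalityId"` key stored as a UUID string, a placeholder Assistant UUID, and a one-field `Personality` model. None of these details are confirmed by the commit, which only names `PersonalityEditor`, `ConversationSettings.init()`, and `defaultPersonalityId`.

```swift
import Foundation

struct ConversationSettings {
    // Placeholder value standing in for the built-in Assistant personality UUID.
    static let assistantPersonalityId = UUID(uuidString: "00000000-0000-0000-0000-000000000001")!

    var personalityId: UUID

    /// Reads the user's default personality, falling back to Assistant if unset or invalid.
    init(defaults: UserDefaults = .standard) {
        if let stored = defaults.string(forKey: "defaultPersonalityId"),
           let id = UUID(uuidString: stored) {
            personalityId = id
        } else {
            personalityId = Self.assistantPersonalityId
        }
    }
}

#if canImport(SwiftUI)
import SwiftUI

struct Personality { var name: String }  // hypothetical model

struct PersonalityEditor: View {
    @State private var name: String

    init(personality: Personality) {
        // Seed @State in init so the field is populated on first render,
        // instead of being filled in afterwards from .onAppear.
        _name = State(initialValue: personality.name)
    }

    var body: some View {
        TextField("Name", text: $name)
    }
}
#endif
```

Injecting `UserDefaults` through the initializer keeps the fallback logic easy to exercise against a throwaway suite rather than the app's real preferences.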