Agent Loop

CortexPrism edited this page Jun 17, 2026 · 1 revision

The agent loop is the core of CortexPrism. It processes one complete user→agent exchange through memory injection, LLM calls, tool parsing/execution, and post-turn storage.

agentTurn() Flow

agentTurn(opts)
  1. injectMemory(systemPrompt, hits)   ← prepend relevant memory
  2. persistMessage(userMessage)
  3. [TOOL LOOP — up to MAX_TOOL_ROUNDS=8]
     a. LLM call (stream or complete)
     b. parseToolCalls(response)        ← extract <tool_call>{...}</tool_call>
     c. for each call:
        - validateToolCall()            ← Parallax policy check
        - tool.execute()
        - logEvent(tool_call)
     d. formatToolResults() → re-prompt
  4. persistMessage(agentResponse)
  5. incrementTurn(sessionId)
  6. writeEpisodic(summary)             ← fire-and-forget
  7. reflectOnTurn() [if enabled]       ← fire-and-forget
  8. logEvent(llm_call)
  return AgentTurnResult
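Step 3b above can be sketched as a small parser. This is a minimal illustration, not the CortexPrism implementation: the `ToolCall` shape (`name` plus `args`), the `isToolCall` helper, and the choice to skip malformed blocks silently are all assumptions; the page only states that calls arrive as `<tool_call>{...}</tool_call>`.

```typescript
// Hypothetical shape of a parsed tool call; the real CortexPrism type may differ.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Type guard: accept only objects with a string name and an object args field.
function isToolCall(value: unknown): value is ToolCall {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["name"] === "string" && typeof v["args"] === "object" && v["args"] !== null;
}

// Extract every <tool_call>{...}</tool_call> block and parse its JSON body.
function parseToolCalls(response: string): ToolCall[] {
  // Non-greedy match so two adjacent blocks are not merged; [\s\S] spans newlines.
  const re = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  const calls: ToolCall[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(response)) !== null) {
    try {
      const parsed: unknown = JSON.parse(match[1] ?? "");
      if (isToolCall(parsed)) calls.push(parsed);
    } catch {
      // Assumption: a block with malformed JSON is skipped rather than
      // aborting the whole turn.
    }
  }
  return calls;
}
```

A response with no `<tool_call>` blocks yields an empty array, which is the natural signal that the tool loop can end.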

Options

Option            Type               Purpose
userMessage       string             User input
provider          LLMProvider        Active LLM provider
model             string             Model name
sessionDb         Db                 Per-session SQLite instance
sessionId         string             Session identifier
systemPrompt      string             System prompt (with injected memory)
stream            boolean            Stream output chunks
onChunk           function           Chunk callback for streaming
registry          ToolRegistry       Registered tools
toolContext       ToolContext        Working dir, approval gate
embedder          EmbeddingProvider  For memory retrieval
enableReflection  boolean            Post-turn reflection
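
As a rough sketch, the options listed in this section could be written as a TypeScript interface. The interface name `AgentTurnOptions`, the `onChunk` signature, and the placeholder aliases are assumptions made for illustration; this page does not say which fields are optional, so none are marked optional here.

```typescript
// Placeholder aliases: the real LLMProvider, Db, ToolRegistry, ToolContext and
// EmbeddingProvider types live in the CortexPrism codebase and are not shown here.
type LLMProvider = unknown;
type Db = unknown;
type ToolRegistry = unknown;
type ToolContext = unknown;
type EmbeddingProvider = unknown;

// Hypothetical name; mirrors the options one field per row.
interface AgentTurnOptions {
  userMessage: string;              // User input
  provider: LLMProvider;            // Active LLM provider
  model: string;                    // Model name
  sessionDb: Db;                    // Per-session SQLite instance
  sessionId: string;                // Session identifier
  systemPrompt: string;             // System prompt (with injected memory)
  stream: boolean;                  // Stream output chunks
  onChunk: (chunk: string) => void; // Assumed signature for the chunk callback
  registry: ToolRegistry;           // Registered tools
  toolContext: ToolContext;         // Working dir, approval gate
  embedder: EmbeddingProvider;      // For memory retrieval
  enableReflection: boolean;        // Post-turn reflection
}
```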

Tool Follow-up Loop

After each tool execution round, the agent receives tool results and is re-prompted:

  1. Tool results are formatted as <tool_result> XML blocks
  2. Results are appended as a user message in the conversation
  3. The LLM is called again with updated message history
  4. This repeats for up to 8 rounds (MAX_TOOL_ROUNDS), which is the ceiling
  5. When the ceiling is hit, the agent summarizes progress and remaining work

When at most one round remains, a hard instruction is added telling the LLM to stop calling tools and deliver a final response.
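
The follow-up loop can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: the real LLM and tool calls are presumably asynchronous, the `name` attribute on `<tool_result>`, the wording of `FINAL_ROUND_NOTE`, and the `runToolLoop` signature are invented for the example, and the ceiling-time summary of progress is not modeled.

```typescript
const MAX_TOOL_ROUNDS = 8;

interface Message { role: "system" | "user" | "assistant"; content: string; }
interface ToolCall { name: string; args: Record<string, unknown>; }

// Hypothetical wording of the hard stop instruction.
const FINAL_ROUND_NOTE = "Do not call any more tools. Deliver your final response now.";

// Wrap each tool output in a <tool_result> block (the name attribute is an assumption).
function formatToolResults(results: { name: string; output: string }[]): string {
  return results
    .map((r) => `<tool_result name="${r.name}">\n${r.output}\n</tool_result>`)
    .join("\n");
}

// Runs the LLM/tool cycle, mutating `messages` in place, and returns the last response.
function runToolLoop(
  messages: Message[],
  callLLM: (history: Message[]) => string,
  parse: (response: string) => ToolCall[],
  execute: (call: ToolCall) => string,
): string {
  let response = "";
  for (let round = 1; round <= MAX_TOOL_ROUNDS; round++) {
    response = callLLM(messages);
    messages.push({ role: "assistant", content: response });

    const calls = parse(response);
    if (calls.length === 0) return response; // no tools requested: final answer
    if (round === MAX_TOOL_ROUNDS) break;    // ceiling hit: stop executing tools

    const results = calls.map((c) => ({ name: c.name, output: execute(c) }));
    let followUp = formatToolResults(results);

    // With at most one round left, tell the model to stop calling tools.
    const remaining = MAX_TOOL_ROUNDS - round;
    if (remaining <= 1) followUp += "\n\n" + FINAL_ROUND_NOTE;

    // Tool results go back to the model as a user message.
    messages.push({ role: "user", content: followUp });
  }
  return response;
}
```

In this sketch the stop instruction is attached only to the follow-up that precedes the last permitted LLM call, so a model that obeys it finishes within the ceiling.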

Buffered Streaming

When tools are registered, the LLM response is streamed into a buffer internally (not directly to the client). After the full response is received:

  • Tool calls are extracted and parsed
  • Only clean prose (no <tool_call> XML, no bare JSON) is forwarded via onChunk
  • As a safety net, the client-side WebSocket handler strips any remaining XML/JSON
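
A minimal sketch of the filtering step, assuming the whole response is already in a buffer. The function names `forwardCleanProse` and `isBareJson` are hypothetical, treating a "bare JSON" line as one that parses as a standalone JSON object is an assumption, and the real agent may forward several chunks rather than one.

```typescript
// True if the line, on its own, is a JSON object (assumed meaning of "bare JSON").
function isBareJson(line: string): boolean {
  const t = line.trim();
  if (t.charAt(0) !== "{" || t.charAt(t.length - 1) !== "}") return false;
  try {
    JSON.parse(t);
    return true;
  } catch {
    return false;
  }
}

// Strip tool-call markup and bare JSON lines, then forward what is left.
function forwardCleanProse(buffered: string, onChunk: (chunk: string) => void): void {
  const withoutToolCalls = buffered.replace(/<tool_call>[\s\S]*?<\/tool_call>/g, "");
  const prose = withoutToolCalls
    .split("\n")
    .filter((line) => !isBareJson(line))
    .join("\n")
    .trim();
  // Nothing is forwarded when the response contained only tool calls.
  if (prose.length > 0) onChunk(prose);
}
```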

Post-Turn Processing (Fire-and-Forget)

These tasks run asynchronously after the response is sent — they never block the agent:

Operation                            Destination                     Purpose
writeEpisodic()                      episodic_memory                 Store turn summary
extractAndStoreEntities()            semantic_memory + entity_graph  Build knowledge graph
detectAndPersistPreference()         semantic_memory + MEMORY.md     Capture user preferences
reflectOnTurn() → storeReflection()  reflection_memory               Extract behavioral patterns
logEvent()                           lens.db                         Audit log entry
extractSkillFromSession()            skills table                    Procedural knowledge (≥2 tool calls)
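
The fire-and-forget pattern can be sketched as follows. The helper name `fireAndForget` and the `onError` callback are assumptions for illustration; the page does not say how CortexPrism handles failures in these background tasks.

```typescript
type BackgroundTask = () => Promise<void>;

// Start every task without awaiting it. Errors are routed to onError so a
// failing task can neither block the caller nor cause an unhandled rejection.
function fireAndForget(tasks: BackgroundTask[], onError: (err: unknown) => void): void {
  for (const task of tasks) {
    try {
      // `void` marks the promise as intentionally not awaited.
      void task().catch(onError);
    } catch (err) {
      // Covers a task that throws synchronously before returning a promise.
      onError(err);
    }
  }
}
```

In the agent this would be called after the response has been sent, with tasks such as `() => writeEpisodic(summary)`; the caller returns immediately while the work continues in the background.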

MetaCognition

src/agent/metacog.ts analyzes user messages before the LLM call to determine delegation strategy:

  • Complex code + exploration → delegate with suggested types [explore, code]
  • Research + independent subtasks → parallelize with [research]
  • Pure exploration → delegate with explore
  • Destructive multi-step → plan_with_rollback with plan

The suggestedSubAgents field guides the LLM in choosing sub-agent types.
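
The four rules can be expressed as a small decision function. This is a sketch, not the contents of src/agent/metacog.ts: the `MessageSignals` input, the `decideStrategy` name, the rule order, and the `direct` fallback are assumptions, and how the signals are detected from the user message is not covered.

```typescript
type Strategy = "direct" | "delegate" | "parallelize" | "plan_with_rollback";
type SubAgentType = "explore" | "code" | "research" | "plan";

// Hypothetical pre-computed signals about the user message.
interface MessageSignals {
  complexCode: boolean;
  exploration: boolean;
  research: boolean;
  independentSubtasks: boolean;
  destructiveMultiStep: boolean;
}

interface MetaCogDecision {
  strategy: Strategy;
  suggestedSubAgents: SubAgentType[];
}

function decideStrategy(s: MessageSignals): MetaCogDecision {
  // Assumption: destructive work is checked first so it always gets a rollback plan.
  if (s.destructiveMultiStep) {
    return { strategy: "plan_with_rollback", suggestedSubAgents: ["plan"] };
  }
  if (s.complexCode && s.exploration) {
    return { strategy: "delegate", suggestedSubAgents: ["explore", "code"] };
  }
  if (s.research && s.independentSubtasks) {
    return { strategy: "parallelize", suggestedSubAgents: ["research"] };
  }
  if (s.exploration) {
    return { strategy: "delegate", suggestedSubAgents: ["explore"] };
  }
  // Assumption: with no signal set, the agent handles the message itself.
  return { strategy: "direct", suggestedSubAgents: [] };
}
```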

See Also

  • Sub-Agents — Child agent spawning and lifecycle
  • Request Flow — Full request lifecycle diagrams
  • Security — Tool validation and policy enforcement
