
Request Flow

CortexPrism edited this page Jun 17, 2026 · 1 revision


Visual maps of the full lifecycle of a user request through the agent system.

Full Request Flow

User Message
  │
  ▼
Pipeline: pre-assess hook → can mutate input or abort
  │
  ▼
Persist user message → session_messages
  │
  ▼
loadHybridHistory() — recency window + FTS-scored older messages
  │
  ▼
MetaCognition: assessTask() → picks one of: delegate | plan_with_rollback | direct | ask_first
  │
  ▼
Pipeline: post-assess hook
  │
  ▼
Build system prompt:
  1. Base soul (soul.ts DEFAULT_SOUL or ~/.cortex/SOUL.md)
  2. injectMemory() — relevant memories appended
  3. findMatchingSkills() — skill definitions appended
  4. applyMetaCogPrefix() — meta-cognition guidance
  5. injectToolsIntoPrompt() — tool schemas (if registered)
  6. injectNodeContext() — distributed node info
  │
  ▼
Pipeline: pre-llm hook (MQM model selection)
  │
  ▼
Agent Loop — up to maxToolRounds (default 8):
  │
  ├── Pipeline: pre-reason hook
  ├── LLM call: effectiveProvider.stream()
  ├── Pipeline: post-reason hook
  ├── Parse tool calls in response?
  │   ├── No tools → final clean response → exit loop
  │   └── Tools found:
  │       ├── Emit prose portion to client (strip tool XML)
  │       ├── For each tool call:
  │       │   ├── Pipeline: pre-tool hook
  │       │   ├── Policy validation (allow/deny)
  │       │   ├── tool.execute(args, context)
  │       │   ├── Pipeline: post-tool hook
  │       │   └── logEvent() → lens.db
  │       ├── Format <tool_result> XML
  │       └── Append to messages → next round
  │
  ▼
Pipeline: post-llm hook
  │
  ▼
Pipeline: pre-output hook → can mutate final response or abort
  │
  ▼
Persist assistant response → session_messages
  │
  ▼
Post-Turn Storage (fire-and-forget):
  ├── writeEpisodic() → episodic_memory
  ├── extractAndStoreEntities() → semantic_memory + entity_graph
  ├── detectAndPersistPreference() → MEMORY.md
  ├── reflectOnTurn() → storeReflection() → reflection_memory
  ├── logEvent() → lens.db audit log
  └── extractSkillFromSession() (if ≥2 tool calls)
  │
  ▼
Pipeline: post-output hook
  │
  ▼
Final Response to User
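
The hook stages in the flow above (pre-assess, pre-output, and so on) can be pictured as a small pipeline in which each hook either passes a possibly mutated value forward or aborts the turn. The sketch below is illustrative only: `HookPipeline`, `HookResult`, and the hook signature are hypothetical names chosen for this example, not the project's actual hook API.

```typescript
// Hypothetical types: the real hook API in the codebase may differ.
type Stage =
  | "pre-assess" | "post-assess" | "pre-llm" | "pre-reason" | "post-reason"
  | "pre-tool" | "post-tool" | "post-llm" | "pre-output" | "post-output";

type HookResult =
  | { action: "continue"; value: string }
  | { action: "abort"; reason: string };

type Hook = (value: string) => HookResult;

class HookPipeline {
  private hooks: Partial<Record<Stage, Hook[]>> = {};

  register(stage: Stage, hook: Hook): void {
    const list = this.hooks[stage] ?? [];
    list.push(hook);
    this.hooks[stage] = list;
  }

  // Runs hooks in registration order; each hook sees the previous hook's
  // output. The first abort short-circuits the rest of the stage.
  run(stage: Stage, value: string): HookResult {
    let current = value;
    for (const hook of this.hooks[stage] ?? []) {
      const result = hook(current);
      if (result.action === "abort") return result;
      current = result.value;
    }
    return { action: "continue", value: current };
  }
}

const pipeline = new HookPipeline();
// First hook mutates the input, second hook may abort on empty input.
pipeline.register("pre-assess", (v) => ({ action: "continue", value: v.trim() }));
pipeline.register("pre-assess", (v) =>
  v.length === 0
    ? { action: "abort", reason: "empty input" }
    : { action: "continue", value: v },
);

const mutated = pipeline.run("pre-assess", "  hello  ");
const aborted = pipeline.run("pre-assess", "   ");
```

Modelling the result as a discriminated union keeps the "mutate or abort" contract explicit: the caller has to check `action` before it can read `value`.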

Path A — No Tool Calls

User → Loop: userMessage
  → Pipeline: pre-assess
  → persistMessage(user)
  → loadHistory() ← session_messages
  → assessTask() [MetaCognition]
  → Pipeline: post-assess
  → Memory: retrieve() — keyword + vector + graph search
  → Build system prompt [soul → +memory → +skills → +metacog]
  → Pipeline: pre-llm (MQM)
  → Pipeline: pre-reason
  → LLM: provider.stream(messages, systemPrompt) → onChunk()
  → Pipeline: post-reason
  (no tools → exit loop)
  → Pipeline: post-llm
  → Pipeline: pre-output
  → persistMessage(assistant)
  → writeEpisodic() [fire & forget]
  → extractAndStoreEntities() [fire & forget]
  → reflectOnTurn() [fire & forget]
  → Pipeline: post-output
  → AgentTurnResult

Path B — With Tool Calls

User → Loop: userMessage
  → Pipeline: pre-assess
  → persistMessage(user) + loadHistory()
  → assessTask() [MetaCognition]
  → Pipeline: post-assess
  → Memory: retrieve()
  → Build system prompt [+ tool schemas]
  → Pipeline: pre-llm (MQM)

  Loop (up to maxToolRounds, default 8):
    → Pipeline: pre-reason
    → LLM: provider.stream(), buffered rather than streamed straight to the client
    → Parse response for <tool_call> JSON
    → Pipeline: post-reason
    → Emit prose via onChunk()

    For each tool call:
      → Pipeline: pre-tool
      → executeTool(tc, registry, toolCtx)
        → Policy: validateToolCall() → allowed / denied
        → tool.execute(args, context)
      → logEvent() → lens.db
      → Pipeline: post-tool

    → formatToolResults() → <tool_result> XML
    → Append assistant + tool results to messages
    → Next round

  (no tools in final round → exit)
  → Pipeline: post-llm
  → Pipeline: pre-output
  → persistMessage(assistant)
  → writeEpisodic() [fire & forget]
  → extractAndStoreEntities() [fire & forget]
  → reflectOnTurn() [fire & forget]
  → extractSkillFromSession() if ≥2 tool calls
  → Pipeline: post-output
  → AgentTurnResult
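
As a rough illustration of Path B, the sketch below runs a bounded tool loop: parse `<tool_call>` JSON out of the buffered response, emit the remaining prose, check a policy, execute the tool, and feed `<tool_result>` XML back for the next round. Every name here (`runAgentLoop`, `parseResponse`, the `Model` and `Tool` types) is a stand-in invented for the example, and the model is a synchronous stub, whereas a real provider call would be asynchronous.

```typescript
// All names below are illustrative stand-ins, not the project's actual API.
interface ToolCall { name: string; args: Record<string, unknown>; }
interface Message { role: "user" | "assistant" | "tool"; content: string; }
type Tool = (args: Record<string, unknown>) => string;
type Model = (messages: Message[]) => string;

const TOOL_CALL_RE = /<tool_call>([\s\S]*?)<\/tool_call>/g;

// Splits a buffered response into prose and parsed <tool_call> JSON payloads.
function parseResponse(text: string): { prose: string; calls: ToolCall[] } {
  const calls: ToolCall[] = [];
  const prose = text.replace(TOOL_CALL_RE, (_match: string, json: string) => {
    try { calls.push(JSON.parse(json) as ToolCall); } catch { /* malformed payloads are dropped in this sketch */ }
    return "";
  }).trim();
  return { prose, calls };
}

function runAgentLoop(
  model: Model,
  tools: Record<string, Tool>,
  isAllowed: (call: ToolCall) => boolean,
  messages: Message[],
  onChunk: (text: string) => void,
  maxToolRounds = 8,
): string {
  let finalText = "";
  for (let round = 0; round < maxToolRounds; round++) {
    const { prose, calls } = parseResponse(model(messages));
    finalText = prose;
    if (prose !== "") onChunk(prose);   // emit prose with the tool markup stripped
    if (calls.length === 0) break;      // no tools: this is the final response

    const results = calls.map((call) => {
      const tool = tools[call.name];
      let output: string;
      if (!isAllowed(call)) output = "denied by policy";
      else if (!tool) output = "unknown tool";
      else output = tool(call.args);
      return `<tool_result name="${call.name}">${output}</tool_result>`;
    });

    // Append the assistant turn and the tool results, then go around again.
    messages.push({ role: "assistant", content: prose });
    messages.push({ role: "tool", content: results.join("\n") });
  }
  return finalText;
}

// Stub model: requests a tool first, then answers once a tool result is present.
const stubModel: Model = (msgs) => {
  const last = msgs[msgs.length - 1];
  return last.role === "tool"
    ? "The sum is 5."
    : 'Checking. <tool_call>{"name":"add","args":{"a":2,"b":3}}</tool_call>';
};

const history: Message[] = [{ role: "user", content: "What is 2 + 3?" }];
const chunks: string[] = [];
const answer = runAgentLoop(
  stubModel,
  { add: (args) => String(Number(args["a"]) + Number(args["b"])) },
  () => true,
  history,
  (text) => { chunks.push(text); },
);
```

If the round limit is reached while the model is still requesting tools, this sketch simply returns the prose from the last round; how the real loop handles that case is not stated on this page.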

Memory: Read & Write Paths

Read (per turn, before LLM call)

User query
  ├── Keyword search (FTS5): episodic_fts + semantic_fts
  ├── Vector search: cosine similarity on embeddings
  ├── Graph traversal: searchEntities() + traverseGraph()
  └── Merge + deduplicate → time-decay scoring → top-N hits injected
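
The merge step above can be sketched as follows. The page only says "time-decay scoring", so the exponential half-life formula, the 30-day default, and the `Hit` shape are all assumptions made for illustration, not the project's actual scoring.

```typescript
// Hypothetical shape for a retrieval hit; field names are assumptions.
interface Hit {
  id: string;
  text: string;
  score: number;     // relevance from the originating search, assumed in [0, 1]
  createdAt: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed decay model: a hit loses half its weight every `halfLifeDays`.
function decayedScore(hit: Hit, now: number, halfLifeDays: number): number {
  const ageDays = Math.max(0, now - hit.createdAt) / DAY_MS;
  return hit.score * Math.pow(0.5, ageDays / halfLifeDays);
}

function mergeHits(sources: Hit[][], now: number, topN: number, halfLifeDays = 30): Hit[] {
  const best: Record<string, Hit> = {};
  for (const source of sources) {
    for (const hit of source) {
      const seen = best[hit.id];
      // Deduplicate by id, keeping the highest raw score across sources.
      if (!seen || hit.score > seen.score) best[hit.id] = hit;
    }
  }
  return Object.keys(best)
    .map((id) => ({ ...best[id], score: decayedScore(best[id], now, halfLifeDays) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}

const now = 100 * DAY_MS;
const keywordHits: Hit[] = [
  { id: "a", text: "recent note", score: 0.8, createdAt: now },
  { id: "b", text: "old note", score: 0.9, createdAt: now - 60 * DAY_MS },
];
const vectorHits: Hit[] = [
  { id: "a", text: "recent note", score: 0.6, createdAt: now },
  { id: "c", text: "month-old note", score: 0.5, createdAt: now - 30 * DAY_MS },
];

// "a" is deduplicated; "b" decays to 0.225 and drops below "c" at 0.25.
const top = mergeHits([keywordHits, vectorHits], now, 2);
```

Decay is applied after deduplication so that a memory found by several searches is aged once rather than once per source.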

Write (after turn, fire-and-forget)

Turn summary
  ├── writeEpisodic() → episodic_memory + episodic_fts
  ├── extractAndStoreEntities() → semantic_memory + entity_graph
  ├── User preference detected → writeSemantic() + MEMORY.md
  └── reflectOnTurn() via LLM → reflection_memory
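
"Fire-and-forget" here means the writes are started but not awaited, so the response is not held up by storage or reflection. One possible shape for this is sketched below; `firePostTurnTasks` and `PostTurnTask` are hypothetical names, and the task bodies are placeholders rather than the real writers.

```typescript
// Hypothetical helper: not the project's actual implementation.
type PostTurnTask = { name: string; run: () => Promise<void> };

// Starts every task without awaiting it. Rejections are caught and reported,
// so one failing writer cannot delay the response or surface as an
// unhandled rejection.
function firePostTurnTasks(
  tasks: PostTurnTask[],
  onError: (name: string, err: unknown) => void,
): void {
  for (const task of tasks) {
    // `void` marks the promise as intentionally un-awaited. Starting from
    // Promise.resolve() also routes synchronous throws into .catch().
    void Promise.resolve()
      .then(() => task.run())
      .catch((err: unknown) => onError(task.name, err));
  }
}

const written: string[] = [];
const failures: string[] = [];

firePostTurnTasks(
  [
    { name: "writeEpisodic", run: async () => { written.push("episodic_memory"); } },
    { name: "reflectOnTurn", run: async () => { throw new Error("LLM unavailable"); } },
  ],
  (name) => { failures.push(name); },
);

// The call returns before any task body has run, so the caller is not blocked.
const writtenAtReturn = written.length;
```

The trade-off is that the caller never learns whether a write succeeded, which is why the sketch routes failures to an `onError` callback instead of discarding them.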

System Prompt Layers

Layer 1 — Soul: src/agent/soul.ts / ~/.cortex/SOUL.md
  Identity · Behavior · Tool Usage · Safety
Layer 2 — User Context: ~/.cortex/USER.md (optional)
Layer 3 — Persistent Memory: ~/.cortex/MEMORY.md
Layer 4 — Retrieved Memory: memory.db hits via injectMemory()
Layer 5 — Skill Hints: findMatchingSkills()
Layer 6 — Meta-cognition Guidance: applyMetaCogPrefix()
Layer 7 — Tool Schemas (conditional): injectToolsIntoPrompt()
Layer 8 — Node Context (conditional): injectNodeContext()
  │
  ▼
Sent to LLM as system prompt
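
The layering above, including the optional and conditional layers, could be assembled along these lines. This is a sketch under assumptions: `PromptLayer` and `buildSystemPrompt` are invented for the example, the layer texts are placeholders, and the real code composes the prompt through the functions named in the list (`injectMemory()`, `injectToolsIntoPrompt()`, and so on).

```typescript
// Hypothetical builder: a layer contributes text, or null to be skipped.
interface PromptLayer {
  name: string;
  render: () => string | null;
}

// Concatenates layers in order, dropping any that are absent or empty.
function buildSystemPrompt(layers: PromptLayer[]): string {
  const parts: string[] = [];
  for (const layer of layers) {
    const text = layer.render();
    if (text !== null && text.trim() !== "") parts.push(text.trim());
  }
  return parts.join("\n\n");
}

const registeredTools: string[] = []; // no tools registered in this example

const prompt = buildSystemPrompt([
  { name: "soul", render: () => "SOUL placeholder text." },
  { name: "user-context", render: () => null }, // USER.md is optional and absent here
  { name: "persistent-memory", render: () => "MEMORY.md placeholder text." },
  {
    name: "tool-schemas", // conditional: only included when tools are registered
    render: () => (registeredTools.length > 0 ? registeredTools.join("\n") : null),
  },
]);
```

Keeping each layer as an independent function means a conditional layer is just one that returns `null`, and the order of the array is the order in the final prompt.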

