Skip to content

feat(observability): trajectory_events SQLite table for subagent debugging #28

@justrach

Description

@justrach

Motivation

Today, every conversation is persisted to SQLite as a single opaque context JSON blob in the conversations table. When a subagent run goes wrong (a tool call fails, the model loops, a child agent silently swallows an error), there's no way to:

  • See per-event timestamps — only conversation-level created_at / updated_at.
  • Query by tool name, agent, or error (SELECT … WHERE tool_name = … is impossible against an opaque blob).
  • Inspect subagent trajectories independently — child agents are nested inside the parent's context blob, not addressable on their own.
  • See events that didn't make it into the final context (errors that were retried, partial deltas, dropped tool results).

Goal: when running a subagent with a real API key, get a queryable trajectory trace of what tool calls happened in what order, with what inputs/outputs, and where it went wrong.

Proposal

Add a sibling trajectory_events table that captures one row per tool call / tool result / error / model turn boundary, scoped by (conversation_id, agent_id):

CREATE TABLE trajectory_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id TEXT NOT NULL,
    agent_id TEXT NOT NULL,
    parent_agent_id TEXT,                -- null for the root agent
    seq INTEGER NOT NULL,                -- monotonically increasing per agent
    ts_ms INTEGER NOT NULL,              -- unix epoch milliseconds
    kind TEXT NOT NULL,                  -- 'tool_call' | 'tool_result' | 'error' | 'model_turn'
    payload TEXT NOT NULL,               -- JSON blob with kind-specific fields
    FOREIGN KEY (conversation_id) REFERENCES conversations(conversation_id)
);
CREATE INDEX idx_trajectory_lookup
    ON trajectory_events(conversation_id, agent_id, seq);

Scope

  • Diesel migration (up.sql / down.sql) under crates/forge_repo/src/database/migrations/
  • Update crates/forge_repo/src/database/schema.rs with the new table! block
  • forge_domain::TrajectoryEvent (kind enum + payload struct)
  • forge_app::TrajectoryRepo trait — record(event) and list(conversation_id, agent_id)
  • forge_repo::DieselTrajectoryRepo impl
  • forge_app::TrajectoryRecorder that writes events when wired into the orchestrator
  • Hook into Orchestrator::execute_tool_calls behind opt-in env var CODEGRAFF_TRACE=1 (off by default — local debugging only, not always-on cost)
  • Unit tests: round-trip recordlist on in-memory SQLite
  • Integration test: a tool call produces ordered tool_call + tool_result rows scoped to the right agent

Non-goals (for v1)

  • No JSONL mirror / file-based trace dir — SQLite is the single source of truth.
  • No log rotation or retention policy — purely additive table; user can DELETE FROM trajectory_events WHERE … if it grows.
  • No UI / CLI command to render a trace — sqlite3 codegraff.db 'SELECT …' is enough for v1.

Tradeoff

SQLite writes are ~10× slower than appending to JSONL. For local debugging the latency is irrelevant; for production we'd want it off (hence the env-var gate). The win is queryability — "show me every failed tool call from agent X in the last hour" becomes a one-liner.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions