Motivation
Today, every conversation is persisted to SQLite as a single opaque context JSON blob in the conversations table. When a subagent run goes wrong (a tool call fails, the model loops, a child agent silently swallows an error), there's no way to:
- See per-event timestamps — only conversation-level
created_at / updated_at.
- Query by tool name, agent, or error (
SELECT … WHERE tool_name = … is impossible against an opaque blob).
- Inspect subagent trajectories independently — child agents are nested inside the parent's context blob, not addressable on their own.
- See events that didn't make it into the final context (errors that were retried, partial deltas, dropped tool results).
Goal: when running a subagent with a real API key, get a queryable trajectory trace of what tool calls happened in what order, with what inputs/outputs, and where it went wrong.
Proposal
Add a sibling trajectory_events table that captures one row per tool call / tool result / error / model turn boundary, scoped by (conversation_id, agent_id):
CREATE TABLE trajectory_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
conversation_id TEXT NOT NULL,
agent_id TEXT NOT NULL,
parent_agent_id TEXT, -- null for the root agent
seq INTEGER NOT NULL, -- monotonically increasing per agent
ts_ms INTEGER NOT NULL, -- unix epoch milliseconds
kind TEXT NOT NULL, -- 'tool_call' | 'tool_result' | 'error' | 'model_turn'
payload TEXT NOT NULL, -- JSON blob with kind-specific fields
FOREIGN KEY (conversation_id) REFERENCES conversations(conversation_id)
);
CREATE INDEX idx_trajectory_lookup
ON trajectory_events(conversation_id, agent_id, seq);
Scope
Non-goals (for v1)
- No JSONL mirror / file-based trace dir — SQLite is the single source of truth.
- No log rotation or retention policy — purely additive table; user can
DELETE FROM trajectory_events WHERE … if it grows.
- No UI / CLI command to render a trace —
sqlite3 codegraff.db 'SELECT …' is enough for v1.
Tradeoff
SQLite writes are ~10× slower than appending to JSONL. For local debugging the latency is irrelevant; for production we'd want it off (hence the env-var gate). The win is queryability — "show me every failed tool call from agent X in the last hour" becomes a one-liner.
Motivation
Today, every conversation is persisted to SQLite as a single opaque
contextJSON blob in theconversationstable. When a subagent run goes wrong (a tool call fails, the model loops, a child agent silently swallows an error), there's no way to:created_at/updated_at.SELECT … WHERE tool_name = …is impossible against an opaque blob).Goal: when running a subagent with a real API key, get a queryable trajectory trace of what tool calls happened in what order, with what inputs/outputs, and where it went wrong.
Proposal
Add a sibling
trajectory_eventstable that captures one row per tool call / tool result / error / model turn boundary, scoped by(conversation_id, agent_id):Scope
up.sql/down.sql) undercrates/forge_repo/src/database/migrations/crates/forge_repo/src/database/schema.rswith the newtable!blockforge_domain::TrajectoryEvent(kind enum + payload struct)forge_app::TrajectoryRepotrait —record(event)andlist(conversation_id, agent_id)forge_repo::DieselTrajectoryRepoimplforge_app::TrajectoryRecorderthat writes events when wired into the orchestratorOrchestrator::execute_tool_callsbehind opt-in env varCODEGRAFF_TRACE=1(off by default — local debugging only, not always-on cost)record→liston in-memory SQLitetool_call+tool_resultrows scoped to the right agentNon-goals (for v1)
DELETE FROM trajectory_events WHERE …if it grows.sqlite3 codegraff.db 'SELECT …'is enough for v1.Tradeoff
SQLite writes are ~10× slower than appending to JSONL. For local debugging the latency is irrelevant; for production we'd want it off (hence the env-var gate). The win is queryability — "show me every failed tool call from agent X in the last hour" becomes a one-liner.