Skip to content

DC-MEM-002: Session storage [MVP-1] #91

@radoxtech

Description

@radoxtech

Epic: #11 Memory and Project State Backbone [MVP-1]
Iteration: MVP-1
Package: @diricode/memory
Spec: docs/mvp/epic-memory.md → DC-MEM-002

Goal

Persist session and message history as first-class entities for replay, analysis, and pipeline continuity. Sessions are the primary unit of interaction tracking — every user conversation creates a session containing an ordered sequence of messages with token/cost telemetry.

Scope

  • sessions table:
    • id, created_at, updated_at, status (enum: created/active/completed/archived), metadata (JSON)
  • messages table:
    • id, session_id, role (user/assistant/system/tool), content, tokens, cost, agent_id, timestamp
    • FK: messages.session_id → sessions.id with CASCADE delete
    • Indexed lookups by session and time
  • Repository/API methods:
    • Session CRUD: create, read, update status/metadata, list with filters
    • Message CRUD: append + retrieval per session
    • Soft-close semantics (MVP preference — mark as completed, don't delete)
  • Pagination and ordering:
    • Chronological retrieval (ASC default for replay)
    • Cursor-based pagination for large sessions
    • Bounded page size

Acceptance Criteria

  • Sessions can be created, updated, listed, and fetched by ID
  • Messages can be appended and retrieved per session in deterministic chronological order
  • Deleting/closing sessions uses soft-close (status update, not row deletion)
  • Session/message queries are indexed and performant for typical MVP workloads
  • API returns typed, schema-validated entities (Zod validation at boundaries)
  • Message schema tracks tokens and cost per message for telemetry integration
  • Session status transitions are validated (no invalid state changes)

Schema Sketch

CREATE TABLE sessions (
  id TEXT PRIMARY KEY,
  status TEXT NOT NULL DEFAULT 'created' CHECK(status IN ('created','active','completed','archived')),
  metadata TEXT DEFAULT '{}',  -- JSON
  created_at TEXT NOT NULL DEFAULT (datetime('now')),
  updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE messages (
  id TEXT PRIMARY KEY,
  session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
  role TEXT NOT NULL CHECK(role IN ('user','assistant','system','tool')),
  content TEXT NOT NULL,
  tokens INTEGER DEFAULT 0,
  cost REAL DEFAULT 0.0,
  agent_id TEXT,
  timestamp TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_messages_session_time ON messages(session_id, timestamp);
CREATE INDEX idx_sessions_status ON sessions(status);

Implementation Notes

Mandatory Web Research Before Implementation

Research the following before starting:

  1. AI agent session storage patterns — Study how OpenAI Agents SDK stores sessions in SQLite (AdvancedSQLiteSession with agent_sessions + agent_messages + message_structure tables). See OpenAI Agents Advanced SQLite docs
  2. Cursor-based pagination in SQLite — Study keyset pagination (WHERE timestamp > ? ORDER BY timestamp ASC LIMIT ?) vs offset pagination for message retrieval
  3. Session state machines — Study how frameworks like Semantic Kernel and Microsoft Agent Framework handle per-turn state persistence (load at turn start, persist at turn end)
  4. Deduplication patterns — Study the Hermes Agent bug (NousResearch/hermes-agent#860) where multiple write paths caused 3-4x message duplication in SQLite — implement single-writer pattern to avoid this

Reference Implementations

  • OpenAI Agents SDK: AdvancedSQLiteSession — stores sessions with branching support, per-turn token usage tracking (turn_usage table), and message structure metadata. Production-quality SQLite schema with proper FK constraints and indexes
  • Hermes Agent (NousResearch/hermes-agent): Demonstrates pitfalls of multiple write paths to same SQLite session DB — critical lesson on deduplication and single-writer queues
  • Microsoft Agent Framework: Instantiation-per-request model where state loads at turn start and persists at turn end. Turn timeout defaults. State keyed by conversation ID — validates our session-scoped approach
  • Yodoca (VitalyOborin/yodoca): Single-writer async queue pattern for SQLite writes avoids "database is locked" under concurrent access. Episodic nodes as immutable audit trail

Existing Code Context

  • SQLite client at packages/memory/src/db/client.ts with better-sqlite3 + WAL mode
  • Existing migrations in packages/memory/src/db/migrations/ — this adds a new migration
  • Message tokens/cost columns align with DC-MEM-005 (token usage tracking)
  • Follow existing repository class pattern in packages/memory/src/

Dependencies

  • Upstream: DC-MEM-001 (SQLite setup + migrations) — ✅ already implemented
  • Downstream: DC-PIPE-001 (turn lifecycle stores turn metadata in sessions)
  • Cross-epic: epic-observability (session data powers metrics)

References

  • docs/mvp/epic-memory.md (canonical spec)
  • spec-mvp-diricode.md (session lifecycle and memory ownership)
  • analiza-context-management.md (timeline and context persistence patterns)

Metadata

Metadata

Assignees

No one assigned

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions