Skip to content

perf: SQLite-backed session loading for Copilot CLI (~15x faster startup)#93

Merged
InbarR merged 2 commits into
InbarR:mainfrom
meni-braun:feature/sqlite-session-loading
May 8, 2026
Merged

perf: SQLite-backed session loading for Copilot CLI (~15x faster startup)#93
InbarR merged 2 commits into
InbarR:mainfrom
meni-braun:feature/sqlite-session-loading

Conversation

@meni-braun
Copy link
Copy Markdown
Contributor

Use Copilot CLI's local SQLite DB (session-store.db) for session loading

Problem

tmax currently loads Copilot CLI sessions by scanning the filesystem:

  1. readdirSync on ~/.copilot/session-state/ (6,000+ directories on heavy-use machines)
  2. stat each dir's events.jsonl for mtime sorting
  3. Parse workspace.yaml per session for metadata (cwd, branch, repo, summary)
  4. Parse events.jsonl incrementally for activity data (messageCount, latestPrompt, etc.)

This was optimized in v1.7.1 (cap to 314 sessions, cached candidate list, incremental parsing), but the initial scan still takes ~300ms for the directory enumeration alone, plus ~3 seconds to parse 314 sessions. On machines with thousands of sessions, startup remains noticeably slow.

Opportunity

Copilot CLI maintains a local SQLite database at ~/.copilot/session-store.db that already contains most of the data tmax needs. This DB is updated by the CLI after each session and includes:

Table Rows (example) Content
sessions 6,931 id, cwd, repository, branch, summary, created_at, updated_at
turns 12,519 session_id, turn_index, user_message, assistant_response, timestamp
search_index 13,254 FTS5 full-text search over all session content
session_files 2,451 Files edited per session
session_refs 306 Commits, PRs linked to sessions
checkpoints 84 Session checkpoints with titles and overviews

Benchmarks

Tested on a machine with 6,931 sessions (224 MB database):

Operation Current (filesystem) SQLite Speedup
Session list (top 314, sorted) ~300ms (readdirSync + stat) 34ms ~9x
Session list + turn counts + latest prompt ~3s (parse yaml + jsonl) 196ms ~15x
Total eligible count Part of dir scan 10ms -
Search (all sessions) In-memory, loaded sessions only 97ms (FTS5, ALL 6,617 sessions) Searches everything
Batch turn stats (50 sessions) Already parsed 2ms -

Proposed hybrid architecture

SQLite for cold data (session list, metadata, search), filesystem watcher for hot data (live status of active sessions).

STARTUP (SQLite - fast)
├─ Query session-store.db for top N sessions by updated_at   (~34ms)
├─ Batch-query turn counts + latest prompts                  (~196ms)
├─ Get total eligible count                                  (~10ms)
└─ Done. Session list is populated.                          (~240ms total)

LIVE MONITORING (filesystem - unchanged)
├─ chokidar watches session-state/ for events.jsonl changes
├─ Only parse events.jsonl for ACTIVE sessions (status != idle)
└─ Real-time status updates: thinking, executingTool, etc.

SEARCH (SQLite FTS5 - better)
├─ Query search_index table using FTS5 MATCH
├─ Searches ALL sessions, not just loaded ones
└─ Returns results in ~100ms

Field mapping

tmax CopilotSessionSummary field SQLite source Available?
id sessions.id
cwd sessions.cwd
repository sessions.repository
branch sessions.branch
summary sessions.summary
lastActivityTime sessions.updated_at
messageCount COUNT(turns)
latestPrompt Last turns.user_message
latestPromptTime Last turns.timestamp
toolCallCount Not in SQLite ❌ (only matters for active sessions)
totalTokens Not in SQLite ❌ (only matters for active sessions)
status (live) Not in SQLite ❌ (keep events.jsonl watcher for this)
pendingToolCalls Not in SQLite ❌ (keep events.jsonl watcher for this)

The missing fields (toolCallCount, totalTokens, status, pendingToolCalls) are only meaningful for active sessions - exactly the ones the chokidar watcher already monitors. For the session list sidebar, all key display fields are available from SQLite.

SQLite schema

-- sessions table
CREATE TABLE sessions (
  id TEXT PRIMARY KEY,
  cwd TEXT,
  repository TEXT,
  branch TEXT,
  summary TEXT,
  created_at TEXT,   -- ISO 8601
  updated_at TEXT,   -- ISO 8601
  host_type TEXT
);
CREATE INDEX idx_sessions_repo ON sessions(repository);
CREATE INDEX idx_sessions_cwd ON sessions(cwd);

-- turns table
CREATE TABLE turns (
  id INTEGER PRIMARY KEY,
  session_id TEXT,
  turn_index INTEGER,
  user_message TEXT,
  assistant_response TEXT,
  timestamp TEXT
);
CREATE INDEX idx_turns_session ON turns(session_id);

-- FTS5 search index (already built by the CLI)
CREATE VIRTUAL TABLE search_index USING fts5(content, session_id, source_type, source_id);

Example queries

-- Replace readdirSync + stat + sort (34ms vs ~300ms)
SELECT id, cwd, repository, branch, summary, updated_at
FROM sessions
WHERE updated_at > datetime('now', '-30 days')
ORDER BY updated_at DESC
LIMIT 314;

-- Batch turn stats for loaded sessions (2ms for 50 sessions)
SELECT session_id, COUNT(*) as message_count, MAX(timestamp) as last_prompt_time
FROM turns
WHERE session_id IN (?, ?, ...)
GROUP BY session_id;

-- FTS5 search across ALL sessions (97ms, not just loaded ones)
SELECT DISTINCT s.id, s.summary, s.updated_at
FROM search_index si
JOIN sessions s ON s.id = si.session_id
WHERE search_index MATCH ?
ORDER BY s.updated_at DESC
LIMIT 50;

Considerations

  1. Copilot CLI only - Claude Code doesn't use this SQLite DB (different session format). The filesystem approach must remain for Claude Code sessions. This optimization is Copilot CLI-specific.

  2. SQLite locking - The CLI writes to this DB while tmax reads. SQLite handles concurrent readers well, but tmax should open with SQLITE_OPEN_READONLY and use WAL mode awareness. Since Electron already has better-sqlite3 available (or can use the built-in sql.js), this should be straightforward.

  3. DB availability - The session-store.db file may not exist on fresh installs or if the user has a very old CLI version. Fall back to the filesystem approach gracefully.

  4. Data freshness - The SQLite DB is updated at session boundaries (not in real-time during a session). For the session list this is fine - the live status comes from the events.jsonl watcher. For search, SQLite data may lag by one session but FTS5 coverage of all historical sessions far outweighs this.

  5. No additional dependency - The SQLite file is already maintained by the CLI. tmax just needs to read it.

How these were measured

SQLite benchmarks were measured by running timed Python/sqlite3 queries against the actual ~/.copilot/session-store.db on a machine with 6,931 sessions (224 MB database). Each query was timed with time.time() around the execute + fetchall.

Filesystem benchmarks come from two sources:

  • The tmax source code comment in copilot-session-monitor.ts line 44: "Uses sync stat - fast (~300ms) and only runs once (cached after)"
  • The parse phase timing is estimated from the yield-every-10 loop across 314 sessions, each requiring readFileSync on workspace.yaml + incremental readSync on events.jsonl + JSON parsing per line. On a cold start with 314 sessions to parse, this takes ~3 seconds based on observed startup behavior (the motivation for the v1.7.1 perf fix).

Impact

  • Startup: ~240ms (SQLite) vs ~3-4s (filesystem) for 314 sessions
  • Search: Covers ALL sessions (6,000+) vs only loaded ones (314)
  • Memory: Session metadata comes from SQLite query, no need to parse/cache workspace.yaml
  • "Load more": Instant SQL query vs re-statting directories
  • Scales: Performance independent of total session count (indexed queries)

…tup)

- Add copilot-session-db.ts: reads ~/.copilot/session-store.db (read-only, lazy-loaded)
- Session list loads from SQLite in ~200ms vs ~3-4s filesystem scan
- FTS5 full-text search across ALL sessions (not just loaded 314)
- AND/OR operator support in session and prompt search
- Prompt search (Ctrl+Shift+Y) searches all historical sessions via SQLite
- Graceful fallback to filesystem when DB unavailable (Claude Code, fresh installs)
- WSL: derives DB path from basePath, no cross-contamination with host DB
- Search skips lifecycle tab filter (shows all matching results)
- Load-more hidden when SQLite active (all sessions queryable instantly)
- Fixed archived tooltip to match actual auto-archive criteria
- Search placeholder shows AND/OR hints when SQLite active
- Debounce increased to 500ms for IPC-backed search
- Version bump to 1.7.3

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rtion in E2E test

Two fixes for workspaces-multi-select E2E test failures:

1. createWorkspace() now clears selectedTerminalIds when switching to
   the new workspace, matching setActiveWorkspace behavior (TASK-72).
   Without this, selection from workspace A leaked across because
   createWorkspace already sets activeWorkspaceId to the new ws,
   making the subsequent setActiveWorkspace(wsB) a no-op.

2. The 'showSelectedPanes is a no-op' test asserted viewMode !== 'grid',
   but the default viewMode IS 'grid'. Since showSelectedPanes is a
   no-op (< 2 selected), viewMode stays unchanged. Fixed the assertion
   to check viewMode is unchanged rather than not-grid.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@InbarR
Copy link
Copy Markdown
Owner

InbarR commented May 8, 2026

Thanks @meni-braun - this is a thoughtful proposal with real benchmarks. The hybrid (SQLite cold + chokidar hot) shape and the graceful filesystem fallback are exactly the right design.

Review notes before merging:

Native module compat. better-sqlite3 is a native module with prebuilt binaries that need to match Electron's Node ABI. tmax ships on Windows / macOS / Linux; before this lands I want to confirm the binaries cleanly install on all three (and via the npm-package postinstall path). A CI check that runs npm install on each platform would be ideal.

Concurrency with the CLI writer. The DB is opened read-only, but Copilot CLI writes to it during sessions. SQLite WAL mode + a busy timeout would smooth this; please confirm the open path uses SQLITE_OPEN_READONLY + PRAGMA busy_timeout = ... to avoid SQLITE_BUSY errors during heavy CLI activity.

Schema-drift defense. If a future Copilot CLI release adds/renames columns, the queries should fall back gracefully rather than throw. The code path that sets dbAvailable = false on first failure is the right idea; just make sure it covers query-time errors too, not only open-time.

WSL. The constructor passes a path under path.dirname(this.basePath) for WSL distros, but accessing the WSL filesystem from the host has different semantics (case sensitivity, lock files, the \\wsl.localhost\ prefix on Windows). Worth a smoke test against a WSL-installed Copilot CLI to confirm the SQLite open succeeds.

Conflicts with main. The PromptSearchDialog touches conflicting territory after a recent change (AND-syntax tokenization + match-count badge). Both the SQLite query path and the client-side AND filter want to handle the user's typed query - they need to compose: when SQLite-active, use the DB query for the haystack; when SQLite-inactive (Claude Code, fallback), use the client-side AND filter. The merge will need a hand-resolved hunk in that file.

I'd like to defer the merge until there's confidence on the cross-platform native install and the conflict resolution. Happy to walk through either with you - or you can rebase on current main and confirm the PromptSearchDialog merge yourself, and I'll re-review.

@InbarR InbarR merged commit 158849a into InbarR:main May 8, 2026
@InbarR
Copy link
Copy Markdown
Owner

InbarR commented May 8, 2026

Thanks @meni-braun - merged in 158849a. The SQLite path + filesystem fallback + native install all worked cleanly on this Windows machine. The PromptSearchDialog merge composed your SQLite query path with the AND-syntax client-side filter that landed in the meantime - SQLite-active queries use the DB directly, Claude Code entries continue to flow through the client-side AND filter, and the match-count badge stays.

Will pick up real-world CLI write-during-read concurrency feedback once it's in users' hands. Closing alongside the merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants