
perf: async analytics/leaderboard + SQLite FTS5 + live partial results#148

Closed
apstenku123 wants to merge 14 commits into vakovalskii:main from apstenku123:perf/async-jobs-sqlite-fts5

Conversation

@apstenku123

Summary

  • Non-blocking loadSessions() cold path + async background warmer for parse/cost caches
  • Async /api/analytics/cost and /api/leaderboard jobs with live partial results (UI shows $0 → $5562 → $11344 → ... → final as sessions aggregate)
  • Persistent SQLite + FTS5 index (~/.codedash/cache/index.sqlite) for sessions, messages, daily stats, and aggregate result cache
  • Shared groupSessionsByConversation helper used by Timeline / All Sessions / Projects / Cloud Sync / Activity — collapses codex exec retries of the same prompt into one representative card with +N more badge
  • Fixes getActiveSessions O(N) lsof blocking (80 s → 350 ms), fixes a broken regex that missed codex-up / codex-up-exec, and adds the lsof -a flag so the pid filter actually applies
  • Fixes massive user_messages overcount: Claude type=user entries include tool_result blocks (28x inflation measured on a real session)
  • Detects scripted sub-agent runs (originator=codex_exec + 9 first-prompt regex patterns) and hides them from default counts; ?include_helpers=1 opts in

Why

On a corpus of 4873 sessions / 1.1 GB of JSONL the GUI hung:

  • loadSessions() cold: 109 seconds (re-parsed every Claude file synchronously, plus 112 s of synchronous git rev-parse × 56 projects, plus a synchronous parseClaudeSessionFile on a 199 MB file)
  • /api/active: 80 seconds (matched 34 processes named codex-up* and ran per-pid lsof + loadSessions inside the loop)
  • /api/analytics/cost + /api/leaderboard: blocking, client got ERR_TIMED_OUT
  • Spammed [ACTIVE] pid=... codex/waiting on every /api/active poll

Chrome tabs accumulated hung keep-alive connections and the server eventually looked dead.

After

| metric | v6.15.10 cold | this branch |
|---|---|---|
| loadSessions() | 109 s | 72 ms (cacheOnly) |
| /api/active | 80 s (34 procs) | 350 ms cold, 0 ms warm |
| searchFullText | 3–10 s (in-mem rebuild) | 5 ms (FTS5 MATCH) |
| /api/analytics/cost first byte | ~30 s blocking | ~20 ms (partial) |
| Full analytics, cold run | 30+ s blocking | ~15 s (with live UI) |
| Full analytics, repeat visit | 30+ s (recomputed) | instant (SQLite aggregate_cache hit) |

On the test corpus leaderboard counts went from 2869 sessions / 867k prompts (inflated by retries + tool_results + scripted helpers) to a realistic 612 unique conversations / 8458 real user prompts / $53,637 real spend / 9-day streak.

Architectural notes

  • WAL-mode SQLite via _execAsync(sql) that spawns sqlite3 -cmd '.timeout 30000' <db> and streams SQL into stdin, so concurrent writers wait instead of erroring with database is locked. Reads don't block writers.
  • Fingerprint for aggregate_cache is quantized to 5-minute buckets so live codex processes appending to their rollout files don't invalidate the cache on every request.
  • Persistent cache dir at ~/.codedash/cache/ (was os.tmpdir() — macOS tmpdir cleanup wipes hours of parse work). Legacy tmpdir paths are migrated once on load.
  • Helper detection — two signals:
    1. session_meta.payload.originator === 'codex_exec' (standard codex exec)
    2. Regex match on first user prompt: ^You are in /, ^Read-only task\., ^Work (only )?in /, ^Pair-local .* lane, ^## X Y Agent, ^Read $OMX_, etc.
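The two-signal detection described above could be sketched roughly like this (a hypothetical reconstruction from this PR description — the field names `session_meta.payload.originator` and the prompt patterns are quoted from it, but `first_user_prompt` and the function shape are assumptions):

```javascript
// Hypothetical sketch of the two-signal helper-session detection.
// A subset of the scripted first-prompt patterns quoted in this PR:
const HELPER_PROMPT_PATTERNS = [
  /^You are in \//,
  /^Read-only task\./,
  /^Work (only )?in \//,
  /^Pair-local .* lane/,
];

function isHelperSession(session) {
  // Signal 1: standard `codex exec` originator recorded in session metadata
  if (session.session_meta?.payload?.originator === 'codex_exec') return true;
  // Signal 2: first user prompt matches a known scripted shape
  const first = (session.first_user_prompt || '').trim();
  return HELPER_PROMPT_PATTERNS.some(re => re.test(first));
}
```

Sessions flagged this way are hidden from default counts unless the caller opts in with `?include_helpers=1`.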

New endpoints

  • GET /api/warming → {running, done, total, phase} for background parse progress
  • GET /api/sqlite-status → {backfill, index: {sessions, messages, files, db_bytes}}
  • GET /api/analytics/cost now returns {status: 'running'|'done'|'error', progress, partialResult, result} (backwards-compat: done spreads result fields at the top level so existing UI code still works)
  • GET /api/leaderboard — same async job shape
  • GET /api/sessions?include_helpers=1 — opt in to scripted helper sessions (default hidden; X-Helper-Count header reports the filtered count)
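A client-side poll loop matching this job shape could look like the following sketch (hedged: `fetchJson` and `render` are placeholder injection points, not names from this PR; the real frontend would pass `() => fetch('/api/analytics/cost').then(r => r.json())` and a 500 ms interval):

```javascript
// Poll an async job endpoint returning {status, progress, partialResult, result}
// until it leaves the 'running' state, rendering live partials along the way.
async function pollJob(fetchJson, render, intervalMs = 500) {
  for (;;) {
    const job = await fetchJson();
    if (job.status !== 'running') {
      // backwards-compat: 'done' spreads result fields at the top level
      render(job.result ?? job, 1);
      return job;
    }
    render(job.partialResult, job.progress); // live numbers climb in the UI
    await new Promise(r => setTimeout(r, intervalMs));
  }
}
```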

Upgrade note

After installing, users should hard-reload once (Cmd+Shift+R) — the poll loops in the split frontend modules are new code, and stale cached JS will show a stuck Loading… spinner.

Test plan

  • 4873-session corpus: cold loadSessions 109 s → 72 ms
  • Live partial result visible in Analytics poll loop
  • Cache hit path: after first computation, subsequent /api/analytics/cost returns status: 'done' with the cached result
  • SQLite FTS5 search: analytics query returns 3 highlighted snippets in 58 ms on a 1.2M-message corpus
  • Helper filter: 4873 → 2869 sessions by default, ?include_helpers=1 restores full set
  • getActiveSessions correctly picks up codex-up processes with cwd /Volumes/external/sources/nanochat (not kernel addresses)
  • Server survives SIGTERM with periodic cache flush (no lost parse work)
  • Persistent aggregate_cache survives process restart — verified via sqlite3 ~/.codedash/cache/index.sqlite "SELECT kind, length(result_json) FROM aggregate_cache"

Stops the GUI from hanging on users with large session histories
(4800+ codex rollouts / 1 GB+ JSONL). Tested on a 4873-session corpus
where cold Analytics previously took 100+ seconds to respond.

## Core changes

- **Non-blocking loadSessions** — sync path reads metadata only, uses
  the parse/cost disk caches, and queues uncached files for a
  background warmer. Cold `/api/sessions` now returns in ~300ms
  instead of blocking for 100+ seconds.
- **Async background jobs** for `/api/analytics/cost` and
  `/api/leaderboard`. HTTP returns a `{status, progress, partialResult}`
  snapshot immediately; the client polls at 500ms and the job publishes
  a live partial aggregate on each chunk so users see real numbers
  climb ($0 → $5562 → $11344 → ... → final).
- **Incremental cost aggregator** (`createCostAggregator`) — extracted
  from `getCostAnalytics` so the job can merge sessions one chunk at a
  time and finalize a snapshot per yield.
- **SQLite + FTS5 index** (`src/sqlite-index.js`) at
  `~/.codedash/cache/index.sqlite` — persistent sessions/messages/
  messages_fts (FTS5 porter+unicode61)/daily_stats/files_seen/
  aggregate_cache tables. Full-text search runs via `MATCH` in ~5 ms
  on 1.2M messages. Uses async `spawn` with `-cmd .timeout 30000` so
  concurrent writers don't deadlock.
- **Aggregate result cache** persisted in `aggregate_cache` with a
  5-min quantized fingerprint (`count|max_ts_bucket|filters|helpers`).
  Repeat visits within the bucket are instant; active codex writes
  don't invalidate the cache on every request.
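The 5-minute-bucket fingerprint idea can be sketched as follows (the `count|max_ts_bucket|filters|helpers` layout is quoted from this PR; the exact field names such as `mtime_ms` are assumptions):

```javascript
// Quantize the newest session mtime to 5-minute buckets so live codex
// processes appending to rollout files don't change the fingerprint —
// and therefore don't invalidate aggregate_cache — on every request.
const BUCKET_MS = 5 * 60 * 1000;

function aggregateFingerprint(sessions, filters, helpersFlag) {
  const maxTs = sessions.reduce((m, s) => Math.max(m, s.mtime_ms || 0), 0);
  const bucket = Math.floor(maxTs / BUCKET_MS);
  return `${sessions.length}|${bucket}|${filters}|${helpersFlag}`;
}
```

Two requests within the same bucket hit the cached aggregate; adding or removing a session (count change) or crossing a bucket boundary recomputes.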

## Hot-path fixes found along the way

- `getActiveSessions` ran `lsof` per matching process with a 2s
  timeout and called `loadSessions()` inside the loop → blocked the
  event loop for minutes when 30+ codex-up wrappers were running.
  Now: single `ps` call, tight regex (catches `codex`/`codex-up`/
  `codex-up-exec` as binary names), batched `lsof -a -d cwd -Fpn -p`
  (the `-a` flag is critical; without it lsof ORs conditions and
  returns cwds for unrelated processes), pid→cwd cache, 3s result
  cache, no inner loadSessions. 80s → 350ms cold, 0ms cached.
- `resolveGitRoot` called `git rev-parse` synchronously per unique
  project path — 56 projects × 2s = 112s of blocked event loop. Now
  queued to a background resolver with its own disk cache in
  `~/.codedash/cache/git-root-cache.json`.
- `scanCodexSessions` was O(n²) via `.find()` on an array, and
  re-read every rollout file on each call. Now: `Map<sid, session>`
  + `cacheOnly` parse mode + background warmer drains uncached files.
- `parseClaudeSessionFileAsync` — streaming read via `readline` for
  files >5 MB (user had a 199 MB session file). Yields to the event
  loop every 2000 lines via `await new Promise(r => setImmediate(r))`
  so HTTP requests aren't starved during parse.
- Persistent cache paths moved from `os.tmpdir()` to
  `~/.codedash/cache/` so macOS tmpdir cleanup doesn't wipe a
  morning's worth of parse work.
- Flush handlers on SIGINT/SIGTERM + periodic flush every 50 entries
  — killing the server mid-warm no longer loses hours of progress.

## Accuracy fixes

- `parseClaudeSessionFile` counted every `type=user` entry as a user
  prompt, but Claude Code stores tool_results as `type=user` with
  `content: [{type:'tool_result', ...}]`. One measured session had
  480 type=user entries but only 17 real user prompts — a 28x
  overcount. Now checks `content` for a real `text` block.
- `isSystemMessage` extended to skip Codex runtime injections that
  were counted as user prompts: `<cwd>`, `<turn_aborted>`,
  `<ide_selection>`, `<command_output>`, `# CLAUDE.md`,
  `Warning: The maximum number of unified exec`, `AUTOSTEERING:`,
  `[Sub-agent results]`.
- **Helper session detection** — codex rollouts with
  `session_meta.payload.originator === 'codex_exec'` (or scripted
  first-message patterns like `You are in /...`, `Read-only task.`,
  `Work in /...`, `Pair-local ... lane`, `## X Y Agent`, etc) are
  flagged `is_helper: true`. `/api/sessions` filters them by default;
  `?include_helpers=1` opts in. On the test corpus this removes
  2166/4873 scripted sub-agent runs.
- Leaderboard now counts **unique conversations** (via
  `group_key`), not retries. Real cost is summed over all rollouts
  (actual money spent); session count uses deduped groups. On the
  test corpus: 2869 → 612 unique conversations, 867k → 8458
  real prompts, cost stays at the real $53,637.
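The tool_result overcount fix can be illustrated with a hedged sketch (the entry shape `content: [{type:'tool_result', ...}]` is from this PR; the exact accessor paths are assumptions):

```javascript
// Only count a type=user entry as a real prompt if its content carries
// an actual text block — Claude Code stores tool_results as type=user
// entries too, which inflated prompt counts up to 28x.
function isRealUserPrompt(entry) {
  if (entry.type !== 'user') return false;
  const content = entry.message?.content ?? entry.content;
  if (typeof content === 'string') return content.trim().length > 0;
  if (Array.isArray(content)) {
    return content.some(b => b.type === 'text' && b.text?.trim());
  }
  return false;
}
```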

## Shared grouping helper

- `computeSessionGroupKey(s)` in data.js: `tool::project::firstMsg[0..200]`
  (or `helper::project` for helpers) — computed once per session on
  load, exposed as `s.group_key`.
- `groupSessionsByConversation(sessions)` in frontend/app.js — shared
  by **Timeline**, **All Sessions**, **Projects view**,
  **Cloud Sync**, and **Activity/Heatmap**. One helper, one
  representative per group, `+N more` badge on cards.
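A rough sketch of the grouping scheme (the key layout `tool::project::firstMsg[0..200]` and `helper::project` are quoted from this PR; field names like `first_user_prompt` and the returned shape are assumptions):

```javascript
// Group key: retries/resumes of the same prompt in the same project by
// the same tool collapse into one conversation group.
function computeSessionGroupKey(s) {
  if (s.is_helper) return `helper::${s.project || ''}`;
  const firstMsg = (s.first_user_prompt || '').slice(0, 200);
  return `${s.tool || ''}::${s.project || ''}::${firstMsg}`;
}

// One representative per group plus a count for the "+N more" badge.
function groupSessionsByConversation(sessions) {
  const groups = new Map();
  for (const s of sessions) {
    const key = s.group_key || computeSessionGroupKey(s);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(s);
  }
  return [...groups.values()].map(g => ({ representative: g[0], more: g.length - 1 }));
}
```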

## Logging

- `CODEDASH_LOG=0` (default) silences stdout spam (previous
  `[ACTIVE] pid=... codex/waiting cpu=0%` lines were emitted on
  every /api/active poll). `ERROR`/`WARN`/`JOB` still go to stdout.
- All logs (including verbose tags) always go to
  `~/.codedash/logs/server.log` with timestamps. Set
  `CODEDASH_LOG=1` to also mirror to stdout.

## Exports / new endpoints

- `loadSessionsAsync(progressCb)` — async variant with incremental
  mtime-based change detection.
- `getWarmingStatus()` + `/api/warming` — background parse progress.
- `getSqliteBackfillStatus()` + `/api/sqlite-status` — FTS5 ingest
  progress + index size.
- `createCostAggregator()`, `computeSessionCostForAnalytics(session,
  opencodeCache)`, `buildOpencodeCostCache(sessions)` — so the async
  jobs can stream-aggregate without re-wiring `getCostAnalytics`.
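The aggregator's shape might look roughly like this (a hypothetical sketch of the interface described above — the real `createCostAggregator` tracks more fields; `cost` and `user_prompts` are assumed names):

```javascript
// Incremental aggregator: the async job calls add() per session chunk
// and snapshot() per yield to publish a live partial result.
function createCostAggregator() {
  let totalCost = 0, prompts = 0, sessions = 0;
  return {
    add(session) {
      sessions += 1;
      totalCost += session.cost || 0;
      prompts += session.user_prompts || 0;
    },
    snapshot() {
      return { sessions, totalCost, prompts };
    },
  };
}
```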

## Benchmark (user's machine, 4873 sessions / 1.1 GB JSONL)

```
               before (v6.15.10)    after
loadSessions   109 s   (cold)      72 ms   (cacheOnly path)
/api/active    80 s    (34 procs)  350 ms  cold, 0 ms warm
search         3–10 s  (rebuild)   5 ms    (FTS5 MATCH)
analytics      30 s    (blocking)  first partial in ~500 ms,
                                   full in ~15 s, instant on cache hit
leaderboard    35 s    (blocking)  first partial in ~500 ms,
                                   full in ~15 s, instant on cache hit
```

Browser cache note: after upgrading, users should hard-reload
(Cmd+Shift+R) once so the split frontend modules re-load — the poll
loops are new code, and stale cached JS will show the initial
"Loading..." spinner without progressing.
Copilot AI review requested due to automatic review settings April 9, 2026 00:10

Copilot AI left a comment


Pull request overview

This PR refactors codedash’s analytics/leaderboard/search/session-loading paths to avoid long synchronous stalls by introducing background warming, async “job” endpoints with live partial results, and a persistent SQLite (FTS5) index/cache under ~/.codedash/cache/.

Changes:

  • Add a persistent SQLite + FTS5 index and aggregate-result cache for fast full-text search and repeat analytics/leaderboard visits.
  • Convert /api/analytics/cost and /api/leaderboard into async background jobs that stream progress + partial snapshots to the UI.
  • Add session de-duplication by conversation group (retry/resume collapsing) across multiple frontend views, plus default filtering of helper/sub-agent sessions.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.

Summary per file:

| File | Description |
|---|---|
| src/sqlite-index.js | New sqlite3 CLI wrapper, schema, FTS search, and persistent aggregate cache helpers. |
| src/server.js | Adds async analytics/leaderboard job endpoints, new status endpoints, helper filtering, and routes search via SQLite. |
| src/data.js | Implements cache-only session loading + background warmer, SQLite ingest/backfill, helper detection, grouping keys, and streaming analytics aggregation. |
| src/frontend/analytics.js | Adds polling loop and live "partial results" rendering for analytics. |
| src/frontend/leaderboard.js | Adds polling loop and progress UI for leaderboard. |
| src/frontend/app.js | Adds shared grouping helper + UI badges for collapsed conversation groups. |
| src/frontend/heatmap.js | Uses shared grouping helper to dedupe sessions before counting. |
| src/frontend/cloud.js | Uses shared grouping helper and shows "+N more" aggregate info. |
| src/frontend/styles.css | Adds styles for group badges and analytics/leaderboard progress UI. |
Comments suppressed due to low confidence (1)

src/data.js:671

  • parseClaudeSessionFile() compares entry.timestamp directly to numeric firstTs/lastTs. Elsewhere (e.g. _computeSessionDailyBreakdown) timestamps are handled as either numbers or ISO strings. If entry.timestamp is a string here, the comparisons can produce incorrect first/last timestamps. Normalize entry.timestamp to an epoch-ms number before comparing/assigning.
      if (entry.timestamp) {
        if (entry.timestamp < firstTs) firstTs = entry.timestamp;
        if (entry.timestamp > lastTs) lastTs = entry.timestamp;
      }


Comment on lines +712 to +716:

```javascript
let served = false;
try {
  const sessions = loadSessions();
  const filtered = includeHelpers ? sessions : sessions.filter(s => !s.is_helper);
  const fingerprint = _analyticsFingerprint(filtered, '', '', includeHelpers ? 'h1' : 'h0');
```
Copilot AI, Apr 9, 2026:

Same concurrency issue as analytics: the leaderboard endpoint starts _runLeaderboardJob() only after an async cache lookup resolves. Concurrent requests for the same key can start multiple jobs and race to overwrite _jobs.leaderboard. Consider deduplicating in-flight work per key (store a promise/state immediately) so only one job runs per key at a time.
Comment on lines +411 to +420:

```javascript
function _ensureSqliteBackfillRunning() {
  if (_sqliteBackfillRunning) return;
  let sqliteIndex;
  try { sqliteIndex = require('./sqlite-index'); } catch { return; }

  _sqliteBackfillRunning = true;
  _sqliteBackfillStatus.running = true;
  _sqliteBackfillStatus.startedAt = Date.now();
  _sqliteBackfillStatus.phase = 'scanning';
  _sqliteBackfillStatus.done = 0;
```
Copilot AI, Apr 9, 2026:

_ensureSqliteBackfillRunning() is described as a "one-shot" task, but there is no guard for the completed state. After the backfill finishes (_sqliteBackfillRunning resets to false), subsequent loadSessions() calls will start another full scan/backfill. Consider adding a persistent "completed" flag (or checking _sqliteBackfillStatus.phase === 'done') to avoid repeated full rescans.
Comment on lines +34 to +38:

```javascript
function _exec(sql, opts) {
  opts = opts || {};
  const args = ['-cmd', _CMD_BUSY, DB_FILE];
  if (opts.json) args.push('-json');
  args.push(sql);
```
Copilot AI, Apr 9, 2026:

_exec()/_execAsync() append -json after DB_FILE in the sqlite3 argv. The sqlite3 CLI stops parsing options once it sees the database filename, so -json in that position is treated as part of the SQL and can break all JSON queries (e.g. _execJson, loadAllFilesSeen, getIndexStatus, aggregate cache reads). Build args with options first (e.g. ['-cmd', _CMD_BUSY, '-json', DB_FILE]).
apstenku123 and others added 13 commits April 9, 2026 12:42
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…lskii#5, vakovalskii#10)

- vakovalskii#10: Move `-json` flag BEFORE DB_FILE in both `_exec()` and
  `_execAsync()`. sqlite3 CLI stops option parsing at the database
  filename, so `-json` after it was silently ignored (fell back to
  default text output on strict-POSIX systems).

- vakovalskii#3: Add synchronous placeholder job in the leaderboard endpoint
  before the async cache lookup, same pattern as the analytics
  endpoint fix from earlier commits. Prevents concurrent requests
  from each starting their own `_runLeaderboardJob()`.

- vakovalskii#5: Guard `_ensureSqliteBackfillRunning()` with a completed-phase
  check (`_sqliteBackfillStatus.phase === 'done'`). Without this,
  every `loadSessions()` call after the initial backfill finished
  would trigger another full directory scan.
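The corrected argv ordering can be sketched as follows (a hedged illustration of the fix, not the PR's exact `_exec` code; `dbFile` and the options object are placeholder names):

```javascript
// Build a sqlite3 CLI argv with all options BEFORE the database filename.
// The sqlite3 CLI stops option parsing at the db file, so a -json placed
// after it would be treated as SQL text instead of a flag.
function sqliteArgs(dbFile, sql, { json = false } = {}) {
  const args = ['-cmd', '.timeout 30000'];
  if (json) args.push('-json');
  args.push(dbFile, sql);
  return args;
}
```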
- select s (session) inside tab for correct iTerm2 focus
- normalize tty: strip /dev/ prefix for reliable matching
- Add `src/embeddings.js` — optional vector search using
  @huggingface/transformers (pure JS ONNX, no Python/torch needed).
  Model: Xenova/all-MiniLM-L6-v2, 384-dim embeddings, ~23 MB ONNX.
  Falls back gracefully when npm package isn't installed.

- `/api/search?q=X&mode=text|semantic|hybrid` — three search modes:
  - `text`: FTS5 keyword MATCH (existing behavior, default fallback)
  - `semantic`: pure cosine similarity against pre-computed session
    embeddings
  - `hybrid`: FTS5 for recall (top 200) → vector re-rank for
    precision, combined score = 0.3×fts_matches + 0.7×similarity

- SQLite `session_embeddings` table stores pre-computed embeddings
  per session. Populated during SQLite backfill phase 2 (after FTS5
  ingest). Batched 32 at a time with setImmediate yields.

- `/api/embeddings/status` — reports model availability, dim, count.

- Frontend: Text / Hybrid / Semantic toggle buttons next to search
  bar. Default: hybrid. Re-runs search on mode change.

- Cherry-pick upstream v6.15.11: iTerm2 focus fix (select session in
  tab, normalize tty /dev/ prefix).
Rewrites embeddings.js to match the codex-git retrieval architecture
(kb_hybrid_search.rs + kb_embedding_store.rs):

6-stage pipeline (per Memento paper, arXiv 2603.18743, Figure 8):
  Stage 1: FTS5 sparse recall (top-20)
  Stage 2: Dense embedding recall (top-20)
  Stage 3: Reciprocal Rank Fusion (k=60, Cormack et al. 2009)
  Stage 4: Utility reranking (per-entry success/failure rate)
  Stage 5: Threshold filter
  Stage 6: Top-k

Weights from Memento paper results:
  BM25=0.3 (Recall@1=0.32), Embedding=0.7 (Recall@1=0.54)
  Utility: final = rrf * (0.7 + 0.3 * utility_rate)
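Stage 3 (Reciprocal Rank Fusion, k=60) with the quoted 0.3/0.7 weights can be sketched as (a minimal illustration; the real pipeline fuses FTS5 and embedding result lists with more metadata per hit):

```javascript
// Weighted RRF: score(id) = Σ weight_list / (k + rank + 1) over the lists
// containing id, then sort descending. k=60 per Cormack et al. 2009.
function rrfFuse(sparseIds, denseIds, k = 60, wSparse = 0.3, wDense = 0.7) {
  const score = new Map();
  const add = (ids, w) => ids.forEach((id, rank) => {
    score.set(id, (score.get(id) || 0) + w / (k + rank + 1));
  });
  add(sparseIds, wSparse); // FTS5/BM25 recall list
  add(denseIds, wDense);   // embedding recall list
  return [...score.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```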

3-level provider chain (matches codex-git priorities):
  1. Local: MiniLM-L6-v2 (default, 384d, 23MB) or Qwen3-Embedding-0.6B
     (1024d, configurable via ~/.codedash/embedding-config.json)
  2. API: OpenAI-compatible /embeddings endpoint (GitHub Models at
     models.github.ai, Copilot proxy, or any provider)
  3. TF-IDF fallback: 256-dim bag-of-words hash (always available)

New: utility tracker (SQLite search_utility table) — records click/
expand/ignore per session×query, feeds Stage 4 reranking. POST
/api/search/utility endpoint for frontend to report outcomes.

GET /api/embeddings/status now reports: model, dim, count, available
models, config, and pipeline parameters.

Tested: 36 results in 313ms for "megatron training" with full
6-stage pipeline on 500 pre-computed embeddings.
## copilot-client.js (NEW)
- Auto-discovers GitHub Copilot OAuth tokens from:
  ~/.config/github-copilot/apps.json (preferred, VS Code refresh)
  ~/.copilot/auth/credential.json (Copilot CLI fallback)
- Tries ALL available tokens (not just first) — handles stale tokens
- Token exchange via GET api.github.com/copilot_internal/v2/token
  Returns dynamic endpoint (enterprise vs individual)
- chatCompletion(messages, {model, max_tokens, reasoning_effort})
  Default: gpt-4.1 (free for Pro). Also: gpt-5-mini with xhigh
- summarizeSession(messages) — truncated first+last 10, GPT summary
- Session token cached until expiry, auto-refresh 60s before

## Progressive message loading
- GET /api/session/:id?offset=0&limit=50 — paginated
- GET /api/session/:id/stream — SSE chunks of 50 messages
- Frontend: loads first 50 immediately, "Load more" button
- Message role filters: All | User | Assistant | Tools
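The pagination shape for `?offset=0&limit=50` might reduce to something like this (a hypothetical sketch of the response payload; the real handler also resolves the session by id):

```javascript
// Slice a message array into one page and report whether more remain,
// so the frontend can show the "Load more" button.
function paginateMessages(messages, offset = 0, limit = 50) {
  const slice = messages.slice(offset, offset + limit);
  return {
    messages: slice,
    offset,
    limit,
    total: messages.length,
    hasMore: offset + slice.length < messages.length,
  };
}
```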

## Summarize
- POST /api/summarize/:id — calls Copilot gpt-4.1
- GET /api/copilot/status — auth state, model, api_base
- Frontend: "Summarize" button in detail header
- Summary rendered in styled box above messages

## Other
- Disabled update nag banner (updateAvailable always false)
- Merged upstream changes
- Tests: copilot-client (7 tests), embeddings (8 tests)
  Both pass with real Copilot API calls

Tested: gpt-4.1 "Hello!" 1.6s, gpt-5-mini xhigh "4" 687ms
@vakovalskii
Owner

Account suspended. Also conflicts with current codebase and introduces unnecessary complexity (SQLite FTS5 dependency).

vakovalskii added a commit that referenced this pull request Apr 10, 2026
Merged: #156 (star sync), #155 (clipboard fallback), #159 (bind URL fix),
#157 (session name vs first prompt), #160 (MCP badges toggle), #100 (Warp launch API)
Closed: #128 (dup), #148 (banned), #161 (bad diff)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
