
Session indexer blocks event loop for seconds when scanning thousands of session files #237

@danshapiro

Description

Problem

The coding_cli_refresh operation in server/coding-cli/session-indexer.ts processes session files synchronously, blocking the Node.js event loop for extended periods. This causes cascading performance degradation including terminal input lag, WebSocket disconnections, and UI freezing.

Evidence from Production Logs

{"event":"coding_cli_refresh","durationMs":4381.55}  // 4.4 second refresh
{"event":"perf_system","eventLoopMax":6895.44}       // 6.9 second event loop block
{"event":"ws_backpressure_close"}                    // connection killed
{"event":"terminal_input_lag","lagMs":2516}          // 2.5 second input delay

Scale of the Problem

  • 8,206 session files being scanned (5,573 Claude + 2,633 Codex)
  • 6.6GB of session data in ~/.claude/projects and ~/.codex
  • Refresh times: 500ms to 4,400ms
  • Event loop blocking: up to 6,895ms (nearly 7 seconds)
  • Server uptime before restart: 20+ hours

Root Cause

The session indexer at server/coding-cli/session-indexer.ts:630-800 uses synchronous file I/O:

  1. Synchronous file reads - fs.readFileSync or similar blocks the event loop
  2. Synchronous JSON parsing - Large JSONL files parsed synchronously
  3. Insufficient yielding - yieldToEventLoop() every 200 files is not enough with 8,000+ files
  4. No parallelization - Single-threaded processing of all files

The current yieldToEventLoop() approach helps but doesn't prevent multi-second blocks because each individual file operation can still take substantial time.
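A minimal sketch of this pattern illustrates the limitation. The implementation of yieldToEventLoop() is assumed here (a setImmediate wrapper is the common idiom); scanSessions is an illustrative stand-in for the indexer's scan loop, not its actual API. Yielding every 200 files only releases the event loop between files, so a single large synchronous read still blocks for as long as that read takes:

```typescript
// Assumed shape of the yield-every-N-files pattern. Yielding happens only
// *between* files, so one slow readFileSync still blocks the event loop
// for the full duration of that read.
import { readFileSync } from "node:fs";

const yieldToEventLoop = (): Promise<void> =>
  new Promise((resolve) => setImmediate(resolve));

async function scanSessions(paths: string[]): Promise<number> {
  let lineCount = 0;
  for (const [i, p] of paths.entries()) {
    try {
      // Blocks until the entire file is in memory; a large JSONL session
      // file can hold the loop for hundreds of milliseconds by itself.
      const raw = readFileSync(p, "utf8");
      lineCount += raw.split("\n").filter(Boolean).length;
    } catch {
      // Skip unreadable files.
    }
    if (i % 200 === 0) await yieldToEventLoop();
  }
  return lineCount;
}
```

With 8,000+ files, even perfect yielding between batches leaves the per-file synchronous cost untouched, which is why the multi-second blocks persist.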

Timeline of Degradation

Time         Event Loop Max   Issue
21:35:26     4,836ms          Major block during refresh
23:47:59     6,895ms          Worst block; heap jumped 220MB → 608MB
21:58-22:04  300-700ms        Sustained high latency

Proposed Solution

  1. Use async file operations - Replace sync reads with fs.promises.readFile
  2. Worker threads - Offload file parsing to worker threads to avoid blocking main thread
  3. Batched processing - Process files in smaller batches with proper async gaps
  4. Incremental indexing - Only re-index changed files using mtime comparison
  5. Lazy loading - Don't load full session content upfront, just metadata
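A hypothetical sketch of how points 1, 3, and 4 could combine: async reads, small batches with awaits between them, and an mtime check to skip unchanged files. The names here (refreshIndex, mtimeCache, BATCH_SIZE) are illustrative, not the actual session-indexer API:

```typescript
// Illustrative sketch: async file I/O, batched concurrency, and
// mtime-based incremental indexing. Not the real indexer API.
import { readFile, stat } from "node:fs/promises";

const BATCH_SIZE = 25;
const mtimeCache = new Map<string, number>(); // path -> mtimeMs at last index

async function refreshIndex(paths: string[]): Promise<string[]> {
  const reindexed: string[] = [];
  for (let i = 0; i < paths.length; i += BATCH_SIZE) {
    const batch = paths.slice(i, i + BATCH_SIZE);
    // Within a batch, stat/read run concurrently; the await between
    // batches keeps the event loop free for terminal input and WebSockets.
    await Promise.all(
      batch.map(async (p) => {
        const { mtimeMs } = await stat(p);
        if (mtimeCache.get(p) === mtimeMs) return; // unchanged: skip
        const raw = await readFile(p, "utf8");
        void raw; // extract JSONL metadata here (lazy: not full content)
        mtimeCache.set(p, mtimeMs);
        reindexed.push(p);
      })
    );
  }
  return reindexed;
}
```

On a steady-state refresh where no session files have changed, this reduces the work to one stat() per file, which is cheap and fully asynchronous. Worker threads (point 2) would layer on top of this by moving the JSONL parsing itself off the main thread.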

Impact

  • High - This is the primary cause of Freshell becoming unusably slow after extended uptime
  • Affects all users with large numbers of Claude/Codex sessions
  • Gets progressively worse as more sessions accumulate

Related

  • Contributes to WebSocket backpressure issues
  • Related to memory leak in session state
