Compact inflated AI session transcripts #48

Merged
jsgrrchg merged 3 commits into main from fix/compact-ai-session-transcripts
May 10, 2026

@jsgrrchg jsgrrchg commented May 9, 2026

Summary

This PR fixes the root cause of inflated AI session transcripts in .neverwrite/sessions/*/transcript.jsonl.

NeverWrite already used session-meta.json and index.json as the logical source of truth, but the physical JSONL transcript could keep accumulating obsolete versions of updated tool/status messages. Long ACP sessions, especially Codex sessions with many tool/status updates, could therefore leave hundreds of MB of duplicated transcript rows even when the indexed conversation was much smaller.
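The growth pattern described above can be illustrated with a minimal in-memory sketch. It assumes an append-only transcript in which every update to a message writes a new JSONL row and the index is repointed at the newest row. The names `IndexEntry`, `append_update`, and `obsolete_bytes` are hypothetical and are not taken from the NeverWrite code; the real index.json layout may differ.

```rust
use std::collections::HashMap;

/// Hypothetical index entry: where the live version of a message sits
/// in the transcript buffer (assumed layout, not the real index.json schema).
#[derive(Debug, Clone, PartialEq)]
pub struct IndexEntry {
    pub offset: u64,
    pub length: u64,
}

/// Append one JSONL row and point the index at it. Earlier rows for the
/// same message id stay in the buffer but are no longer referenced.
pub fn append_update(
    transcript: &mut Vec<u8>,
    index: &mut HashMap<String, IndexEntry>,
    message_id: &str,
    row_json: &str,
) {
    let offset = transcript.len() as u64;
    transcript.extend_from_slice(row_json.as_bytes());
    transcript.push(b'\n');
    // In this sketch the recorded length includes the trailing newline.
    let length = row_json.len() as u64 + 1;
    index.insert(message_id.to_string(), IndexEntry { offset, length });
}

/// Bytes in the transcript that no index entry references.
pub fn obsolete_bytes(transcript: &[u8], index: &HashMap<String, IndexEntry>) -> u64 {
    let live: u64 = index.values().map(|e| e.length).sum();
    (transcript.len() as u64).saturating_sub(live)
}
```

In this model, a tool call that reports its status many times contributes one live row and many obsolete rows, which is how the physical file can grow far beyond the indexed conversation.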

Changes

  • Adds shared transcript compaction in crates/ai/src/persistence.rs for all AI runtimes, not just Codex.
  • Rewrites inflated transcript.jsonl files using only the messages currently referenced by index.json.
  • Rebuilds transcript offsets, lengths, and hashes after compaction.
  • Uses temporary files, backups, and compact-state.json so interrupted compactions can be rolled back safely on the next load/save.
  • Compacts on save when the amount of obsolete transcript bytes is significant. The transcript now uses 6 times less space.
  • Repairs old inflated sessions on load/restore/page fetch before reading transcript pages.
  • Skips compaction for summary-only history listings, so opening Chat History metadata does not trigger unnecessary heavy work.
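As a rough sketch of the compaction step listed above, the following copies only the rows referenced by the index into a new buffer and rebuilds each entry's offset, length, and hash. This is not the code in crates/ai/src/persistence.rs: `IndexedRow`, `row_hash`, `compact`, `should_compact`, and the ratio threshold are all hypothetical, and `DefaultHasher` is only a standard-library stand-in for whatever content hash the real implementation uses. The temporary-file, backup, and compact-state.json handling is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical index row: one live message and its place in the transcript.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexedRow {
    pub message_id: String,
    pub offset: usize,
    pub length: usize,
    pub hash: u64,
}

/// Stand-in hash for the sketch. DefaultHasher output is not guaranteed to be
/// stable across Rust releases, so a persisted index would need a real content hash.
pub fn row_hash(row: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    row.hash(&mut hasher);
    hasher.finish()
}

/// Copy only the rows referenced by `index` out of `old`, in index order,
/// and rebuild offset, length, and hash for each one.
/// Returns None if any entry points outside the old transcript.
pub fn compact(old: &[u8], index: &[IndexedRow]) -> Option<(Vec<u8>, Vec<IndexedRow>)> {
    let mut out = Vec::new();
    let mut rebuilt = Vec::with_capacity(index.len());
    for entry in index {
        let end = entry.offset.checked_add(entry.length)?;
        let row = old.get(entry.offset..end)?;
        rebuilt.push(IndexedRow {
            message_id: entry.message_id.clone(),
            offset: out.len(),
            length: row.len(),
            hash: row_hash(row),
        });
        out.extend_from_slice(row);
    }
    Some((out, rebuilt))
}

/// Hypothetical save-time trigger: compact once the obsolete share of the
/// file reaches `min_obsolete_ratio` (the actual threshold is not stated in this PR).
pub fn should_compact(total_bytes: u64, live_bytes: u64, min_obsolete_ratio: f64) -> bool {
    if total_bytes == 0 {
        return false;
    }
    let obsolete = total_bytes.saturating_sub(live_bytes);
    obsolete as f64 / total_bytes as f64 >= min_obsolete_ratio
}
```

Returning `None` on an out-of-range entry lets a caller abandon the compaction and keep the original file untouched, which fits the rollback-oriented design described in the list above.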

Impact

The user-visible chat history is preserved. This only removes obsolete physical JSONL rows that represented intermediate versions of messages already superseded by the transcript index.

This should reduce disk growth, I/O, memory pressure, and crash/freeze risk from long AI conversations while keeping restore/history behavior unchanged.

Validation

  • cargo test -p neverwrite-ai
  • cargo build -p neverwrite-native-backend
  • git diff --check

@jsgrrchg jsgrrchg changed the title [codex] Compact inflated AI session transcripts Compact inflated AI session transcripts May 9, 2026
@jsgrrchg jsgrrchg marked this pull request as ready for review May 9, 2026 06:49
@jsgrrchg jsgrrchg merged commit 3f4c9b7 into main May 10, 2026
3 checks passed
@jsgrrchg jsgrrchg deleted the fix/compact-ai-session-transcripts branch May 10, 2026 01:08

Development

Successfully merging this pull request may close these issues.

Fix transcript history growth causing crashes and orphaned Codex processes