strategy: guard against empty-session metadata stubs #1022
Two defensive guards in CondenseSession to prevent writing metadata.json checkpoints with no transcript and no real session attribution behind them:
1. Post-redaction skip gate (skipIfPostRedactionEmpty). The existing pre-redaction gate at line 165 catches sessions with neither a transcript nor tracked files. It cannot catch the case where the transcript was non-empty pre-redaction but redaction silently dropped it (malformed JSONL from an external agent), AND the session's tracked files don't overlap with the committed file set. The new gate runs after redaction and after filterFilesTouched so it sees the post-filter view; if both transcript and FilesTouched are now empty, return Skipped instead of writing a stub.
2. Narrow the filterFilesTouched fallback (sessionHasEvidenceOfWork). The old fallback assigned every committed file (minus .entire/) to any session with empty FilesTouched. That was designed for mid-turn commits before SaveStep ran, but it also fired for sessions registered at SessionStart that never produced anything, e.g., a Codex companion session whose hooks ran with a null transcript_path and never reached SaveStep. The fallback now requires evidence the session is a real participant: either a non-empty transcript was extracted, or a prior SaveStep recorded a checkpoint (StepCount > 0).
Also extracts redactOrDrop helper so CondenseSession stays under the maintidx threshold after the new skip gate is added.
Tests cover the post-redaction skip in the no-overlap + redaction-fail case, plus four unit tests for the narrowed filterFilesTouched contract: applies fallback with StepCount evidence, applies fallback with transcript evidence, skips fallback when neither is present, and intersects normally when FilesTouched is already populated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: e733ba0e7dff
Pull request overview
This PR strengthens ManualCommitStrategy.CondenseSession to avoid persisting “metadata-only” checkpoints when a session ends up with neither a usable transcript nor attributable file activity after redaction and file filtering, reducing false/empty session attribution.
Changes:
- Adds a post-redaction skip gate to prevent writing checkpoints when redaction drops the transcript and `FilesTouched` is empty after filtering.
- Narrows the `filterFilesTouched` fallback so it only assigns committed files when the session has evidence of real participation (non-empty transcript or `StepCount > 0`).
- Refactors redaction handling into a helper and adds targeted unit tests for the new behaviors.
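The narrowed fallback in the second change can be sketched roughly as follows. This is an illustration built only from the PR description, not the code in manual_commit_condensation.go: the `sessionState` type and both function signatures are assumptions, and only the names `filterFilesTouched`, `sessionHasEvidenceOfWork`, `FilesTouched`, and `StepCount` come from the PR itself.

```go
package main

import (
	"fmt"
	"strings"
)

// sessionState is a hypothetical stand-in for the real session type.
// Only the two fields named in the PR description are modeled.
type sessionState struct {
	FilesTouched []string
	StepCount    int
}

// sessionHasEvidenceOfWork reports whether the session looks like a real
// participant: a non-empty transcript was extracted, or a prior SaveStep
// recorded a checkpoint (StepCount > 0). Signature is assumed.
func sessionHasEvidenceOfWork(s sessionState, transcript []byte) bool {
	return len(transcript) > 0 || s.StepCount > 0
}

// filterFilesTouched (signature assumed) returns the files to attribute
// to the session for this commit.
func filterFilesTouched(s sessionState, transcript []byte, committed []string) []string {
	if len(s.FilesTouched) == 0 {
		// Fallback path: only sessions with evidence of work may claim
		// the committed files.
		if !sessionHasEvidenceOfWork(s, transcript) {
			return nil
		}
		var out []string
		for _, f := range committed {
			if !strings.HasPrefix(f, ".entire/") {
				out = append(out, f)
			}
		}
		return out
	}
	// Normal path: intersect the tracked files with the committed set.
	inCommit := make(map[string]struct{}, len(committed))
	for _, f := range committed {
		inCommit[f] = struct{}{}
	}
	var out []string
	for _, f := range s.FilesTouched {
		if _, ok := inCommit[f]; ok {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	committed := []string{"main.go", ".entire/metadata.json"}
	// Empty companion session: no transcript, no steps, so no attribution.
	fmt.Println(filterFilesTouched(sessionState{}, nil, committed))
	// Mid-turn commit after a SaveStep: the fallback still applies.
	fmt.Println(filterFilesTouched(sessionState{StepCount: 1}, nil, committed))
}
```

The four branches exercised here correspond to the four unit-test cases listed in the PR description: fallback with StepCount evidence, fallback with transcript evidence, no fallback when neither is present, and plain intersection when FilesTouched is already populated.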
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_condensation.go | Adds redactOrDrop, a post-redaction “empty session” skip gate, and tightens filterFilesTouched fallback via sessionHasEvidenceOfWork. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds regression coverage for the post-redaction skip case and unit tests validating the new filterFilesTouched fallback contract. |
dipree left a comment
Looks good. The gating order (pre-redaction gate → filterFilesTouched → redactOrDrop → post-redaction gate) is correct, and sessionHasEvidenceOfWork cleanly separates legitimate mid-turn commits from empty companion sessions. Tests cover both paths — the redaction-failure test is serial (t.Chdir) so the global redactSessionJSONLBytes override doesn't race with the parallel filterFilesTouched unit tests. Lint and full strategy suite pass locally.
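The gating order named in this review can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the actual CondenseSession: the `outcome` type, the `redactFunc` shape, the `condense` wrapper, and the two sample redactors are hypothetical, and the filterFilesTouched result is passed in as a plain slice to keep the example short. Only the names `redactOrDrop`, `skipIfPostRedactionEmpty`, and the `Skipped` result come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// outcome is a hypothetical result type; the PR only says the gate
// returns Skipped instead of writing a stub.
type outcome string

const (
	Skipped outcome = "skipped"
	Written outcome = "written"
)

// redactFunc is an assumed shape for the redaction step.
type redactFunc func(transcript []byte) ([]byte, error)

// Sample redactors used for illustration only.
func passthroughRedact(b []byte) ([]byte, error) { return b, nil }
func failingRedact([]byte) ([]byte, error)       { return nil, errors.New("malformed JSONL") }

// redactOrDrop returns the redacted transcript, or nil when redaction
// fails, so a transcript that cannot be redacted is never persisted.
func redactOrDrop(transcript []byte, redact redactFunc) []byte {
	if len(transcript) == 0 {
		return nil
	}
	redacted, err := redact(transcript)
	if err != nil {
		return nil
	}
	return redacted
}

// skipIfPostRedactionEmpty reports whether nothing is left to persist.
func skipIfPostRedactionEmpty(transcript []byte, filesTouched []string) bool {
	return len(transcript) == 0 && len(filesTouched) == 0
}

// condense walks the four stages in the order given in the review.
// filtered stands in for the output of filterFilesTouched.
func condense(transcript []byte, tracked, filtered []string, redact redactFunc) outcome {
	// 1. Pre-redaction gate: neither a transcript nor tracked files.
	if len(transcript) == 0 && len(tracked) == 0 {
		return Skipped
	}
	// 2. filterFilesTouched (result supplied by the caller here).
	filesTouched := filtered
	// 3. Redact, dropping the transcript on failure.
	redacted := redactOrDrop(transcript, redact)
	// 4. Post-redaction gate: sees the post-filter, post-redaction view.
	if skipIfPostRedactionEmpty(redacted, filesTouched) {
		return Skipped
	}
	return Written
}

func main() {
	raw := []byte("not valid jsonl")
	tracked := []string{"a.go"} // tracked, but not part of the commit
	fmt.Println(condense(raw, tracked, nil, failingRedact))
	fmt.Println(condense(raw, tracked, nil, passthroughRedact))
}
```

The first call models the regression case from the PR: the session passes the pre-redaction gate because it has a transcript and tracked files, but redaction fails and no tracked file overlaps the commit, so only the post-redaction gate can catch it.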
https://entire.io/gh/entireio/cli/trails/a7fd3b96a21d
Note
Medium Risk
Changes condensation skip/attribution behavior and could affect which sessions get persisted or credited for a commit, especially around mid-turn commits and redaction failures.
Overview
Prevents empty-session checkpoint stubs during manual-commit condensation. `CondenseSession` now performs an additional post-redaction skip check, so if redaction drops the transcript and `FilesTouched` filters to empty, the session is skipped instead of persisting a metadata-only checkpoint.
Tightens file attribution fallback. `filterFilesTouched` only falls back to “all committed files (excluding `.entire/`)” when the session has evidence of work (non-empty transcript or `StepCount > 0`), avoiding false attribution to sessions that never produced output.
Adds focused regression/unit tests covering the new post-redaction skip and the updated `filterFilesTouched` fallback/intersection behavior.
Reviewed by Cursor Bugbot for commit 323d8fb.