
strategy: guard against empty-session metadata stubs #1022

Merged: Soph merged 1 commit into main from soph/prevent-empty-sessions on Apr 24, 2026
Conversation


@Soph Soph commented Apr 24, 2026

https://entire.io/gh/entireio/cli/trails/a7fd3b96a21d

Two defensive guards in CondenseSession to prevent writing metadata.json checkpoints with no transcript and no real session attribution behind them:

  1. Post-redaction skip gate (skipIfPostRedactionEmpty). The existing pre-redaction gate at line 165 catches sessions with neither a transcript nor tracked files. It cannot catch the case where the transcript was non-empty before redaction but redaction silently dropped it (for example, malformed JSONL from an external agent) and the session's tracked files also don't overlap with the committed file set. The new gate runs after both redaction and filterFilesTouched, so it sees the post-filter view: if the transcript and FilesTouched are both empty at that point, CondenseSession returns Skipped instead of writing a stub.

  2. Narrow the filterFilesTouched fallback (sessionHasEvidenceOfWork). The old fallback assigned every committed file (minus .entire/) to any session with empty FilesTouched. That was designed for mid-turn commits made before SaveStep ran, but it also fired for sessions registered at SessionStart that never produced anything, e.g., a Codex companion session whose hooks ran with a null transcript_path and never reached SaveStep. The fallback now requires evidence that the session is a real participant: either a non-empty transcript was extracted, or a prior SaveStep recorded a checkpoint (StepCount > 0).
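
The narrowed fallback in item 2 can be illustrated with a minimal sketch. The names filterFilesTouched, sessionHasEvidenceOfWork, FilesTouched and StepCount come from the description above; the SessionState type, the function signatures and the prefix check for .entire/ are assumptions made for illustration and may not match the real code.

```go
package main

import (
	"fmt"
	"strings"
)

// SessionState is a hypothetical, trimmed-down stand-in for the real session record.
type SessionState struct {
	FilesTouched []string
	StepCount    int
}

// sessionHasEvidenceOfWork reports whether the session looks like a real
// participant: a non-empty transcript was extracted, or a prior SaveStep
// recorded a checkpoint (StepCount > 0). The signature is an assumption.
func sessionHasEvidenceOfWork(s SessionState, transcript []byte) bool {
	return len(transcript) > 0 || s.StepCount > 0
}

// filterFilesTouched intersects the session's tracked files with the committed
// set. When nothing is tracked, it falls back to all committed files outside
// .entire/, but only if the session has evidence of work.
func filterFilesTouched(s SessionState, transcript []byte, committed []string) []string {
	if len(s.FilesTouched) == 0 {
		if !sessionHasEvidenceOfWork(s, transcript) {
			return nil // no fallback for sessions that never produced anything
		}
		var out []string
		for _, f := range committed {
			if !strings.HasPrefix(f, ".entire/") {
				out = append(out, f)
			}
		}
		return out
	}
	inCommit := make(map[string]struct{}, len(committed))
	for _, f := range committed {
		inCommit[f] = struct{}{}
	}
	var out []string
	for _, f := range s.FilesTouched {
		if _, ok := inCommit[f]; ok {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	committed := []string{"main.go", ".entire/metadata.json"}
	cases := []struct {
		name       string
		state      SessionState
		transcript []byte
	}{
		{"StepCount evidence", SessionState{StepCount: 1}, nil},
		{"transcript evidence", SessionState{}, []byte("{}\n")},
		{"no evidence", SessionState{}, nil},
		{"already populated", SessionState{FilesTouched: []string{"main.go", "other.go"}}, nil},
	}
	for _, c := range cases {
		fmt.Printf("%s: %v\n", c.name, filterFilesTouched(c.state, c.transcript, committed))
	}
}
```

The four cases in main mirror the four unit-test scenarios named in the description: the fallback applies with either kind of evidence, is skipped with none, and a populated FilesTouched is intersected with the commit as before.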

Also extracts redactOrDrop helper so CondenseSession stays under the maintidx threshold after the new skip gate is added.
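
A rough sketch of how a redactOrDrop-style helper could feed the post-redaction gate is shown here. Only the names redactOrDrop and skipIfPostRedactionEmpty come from this PR; their signatures are assumptions, and validateJSONL is a hypothetical stand-in for the real redactor that merely checks each line is valid JSON.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// validateJSONL is a hypothetical stand-in for the real redactor. It returns
// the input unchanged when every non-empty line is valid JSON and an error
// otherwise. Lines longer than bufio.Scanner's default limit also error out.
func validateJSONL(raw []byte) ([]byte, error) {
	sc := bufio.NewScanner(bytes.NewReader(raw))
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) > 0 && !json.Valid(line) {
			return nil, errors.New("malformed JSONL line")
		}
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return raw, nil
}

// redactOrDrop runs the redactor and drops the transcript (returns nil)
// when redaction fails. The signature is an assumption.
func redactOrDrop(raw []byte, redact func([]byte) ([]byte, error)) []byte {
	if len(raw) == 0 {
		return nil
	}
	out, err := redact(raw)
	if err != nil {
		return nil
	}
	return out
}

// skipIfPostRedactionEmpty reports whether the session should be skipped:
// nothing is left of the transcript and no files are attributed to it.
func skipIfPostRedactionEmpty(redacted []byte, files []string) bool {
	return len(redacted) == 0 && len(files) == 0
}

func main() {
	good := []byte("{\"role\":\"user\"}\n")
	bad := []byte("{\"role\": \n")

	fmt.Println(skipIfPostRedactionEmpty(redactOrDrop(good, validateJSONL), nil))
	fmt.Println(skipIfPostRedactionEmpty(redactOrDrop(bad, validateJSONL), nil))
	fmt.Println(skipIfPostRedactionEmpty(redactOrDrop(bad, validateJSONL), []string{"main.go"}))
}
```

Keeping the drop-on-failure decision in one small helper leaves the caller with a single value to test, which is consistent with the stated goal of keeping CondenseSession short.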

Tests cover the post-redaction skip in the no-overlap + redaction-fail case, plus four unit tests for the narrowed filterFilesTouched contract: applies fallback with StepCount evidence, applies fallback with transcript evidence, skips fallback when neither is present, and intersects normally when FilesTouched is already populated.


Note

Medium Risk
Changes condensation skip/attribution behavior and could affect which sessions get persisted or credited for a commit, especially around mid-turn commits and redaction failures.

Overview
Prevents empty-session checkpoint stubs during manual-commit condensation. CondenseSession now performs an additional post-redaction skip check, so if redaction drops the transcript and FilesTouched filters to empty, the session is skipped instead of persisting a metadata-only checkpoint.

Tightens file attribution fallback. filterFilesTouched only falls back to “all committed files (excluding .entire/)” when the session has evidence of work (non-empty transcript or StepCount > 0), avoiding false attribution to sessions that never produced output.

Adds focused regression/unit tests covering the new post-redaction skip and the updated filterFilesTouched fallback/intersection behavior.

Reviewed by Cursor Bugbot for commit 323d8fb.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: e733ba0e7dff
Copilot AI review requested due to automatic review settings April 24, 2026 12:34
@Soph Soph requested a review from a team as a code owner April 24, 2026 12:34

Copilot AI left a comment


Pull request overview

This PR strengthens ManualCommitStrategy.CondenseSession to avoid persisting “metadata-only” checkpoints when a session ends up with neither a usable transcript nor attributable file activity after redaction and file filtering, reducing false/empty session attribution.

Changes:

  • Adds a post-redaction skip gate to prevent writing checkpoints when redaction drops the transcript and FilesTouched is empty after filtering.
  • Narrows the filterFilesTouched fallback so it only assigns committed files when the session has evidence of real participation (non-empty transcript or StepCount > 0).
  • Refactors redaction handling into a helper and adds targeted unit tests for the new behaviors.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Adds redactOrDrop, a post-redaction “empty session” skip gate, and tightens filterFilesTouched fallback via sessionHasEvidenceOfWork. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds regression coverage for the post-redaction skip case and unit tests validating the new filterFilesTouched fallback contract. |


@dipree dipree left a comment


Looks good. The gating order (pre-redaction gate → filterFilesTouched → redactOrDrop → post-redaction gate) is correct, and sessionHasEvidenceOfWork cleanly separates legitimate mid-turn commits from empty companion sessions. Tests cover both paths — the redaction-failure test is serial (t.Chdir) so the global redactSessionJSONLBytes override doesn't race with the parallel filterFilesTouched unit tests. Lint and full strategy suite pass locally.
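
The gating order described in this review can be sketched as a small pipeline. This is an illustration under assumptions, not the merged code: the Session type, the condense and filterFiles functions and the string results are hypothetical, and only the name redactSessionJSONLBytes is taken from the review, with an assumed signature.

```go
package main

import (
	"errors"
	"fmt"
)

// Session is a hypothetical, trimmed-down session record.
type Session struct {
	Transcript   []byte
	FilesTouched []string
	StepCount    int
}

// redactSessionJSONLBytes is named after the override mentioned in the review;
// the signature is an assumption. Because it is a package-level variable, a
// test that swaps it mutates shared state and should not run in parallel.
var redactSessionJSONLBytes = func(b []byte) ([]byte, error) { return b, nil }

// filterFiles is a simplified stand-in for filterFilesTouched.
func filterFiles(s Session, committed []string) []string {
	if len(s.FilesTouched) == 0 {
		if len(s.Transcript) > 0 || s.StepCount > 0 {
			return committed // fallback only with evidence of work
		}
		return nil
	}
	inCommit := make(map[string]bool, len(committed))
	for _, f := range committed {
		inCommit[f] = true
	}
	var out []string
	for _, f := range s.FilesTouched {
		if inCommit[f] {
			out = append(out, f)
		}
	}
	return out
}

// condense applies the four steps in the order given in the review.
func condense(s Session, committed []string) string {
	// 1. Pre-redaction gate: neither transcript nor tracked files.
	if len(s.Transcript) == 0 && len(s.FilesTouched) == 0 {
		return "skipped"
	}
	// 2. Filter tracked files against the commit.
	files := filterFiles(s, committed)
	// 3. Redact, dropping the transcript on failure.
	redacted, err := redactSessionJSONLBytes(s.Transcript)
	if err != nil {
		redacted = nil
	}
	// 4. Post-redaction gate: nothing left to persist.
	if len(redacted) == 0 && len(files) == 0 {
		return "skipped"
	}
	return "written"
}

func main() {
	s := Session{Transcript: []byte("not jsonl"), FilesTouched: []string{"a.go"}}
	committed := []string{"b.go"}

	fmt.Println(condense(s, committed)) // default redactor keeps the transcript

	// Swap the global redactor for a failing one and restore it afterwards.
	orig := redactSessionJSONLBytes
	redactSessionJSONLBytes = func([]byte) ([]byte, error) {
		return nil, errors.New("malformed JSONL")
	}
	defer func() { redactSessionJSONLBytes = orig }()

	fmt.Println(condense(s, committed)) // transcript dropped, no file overlap
}
```

The second call reproduces the no-overlap plus redaction-failure case: the pre-redaction gate passes because the transcript is initially non-empty, and only the gate after redaction sees that nothing remains.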

@Soph Soph merged commit 0487a56 into main Apr 24, 2026
14 checks passed
@Soph Soph deleted the soph/prevent-empty-sessions branch April 24, 2026 14:30

3 participants