fix: skip empty sessions and prevent phantom checkpoint paths #958
CondenseSession now attempts to resolve transcripts from the agent's native storage (via GetSessionDir/ResolveSessionFile) when the normal extraction paths find nothing. This covers agents that don't include transcript_path in hook payloads (e.g., Codex running as a subagent). For agents that require an export command (e.g., OpenCode), PrepareTranscript is called before reading the resolved file. If no transcript is found AND no files were touched, condensation is skipped entirely — no metadata-only stubs are written to the checkpoint branch. The session state remains intact so future commits can retry.
writeTranscript now returns a bool indicating whether files were actually written. writeSessionToSubdirectory only records Transcript and ContentHash paths in the checkpoint summary when the transcript was written, preventing phantom paths that point to non-existent files.
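A minimal sketch of the "only record what was written" idea, not the PR's actual code. The in-memory `tree` map, the file names `full.jsonl` and `content_hash.txt`, and the `sessionFilePaths` struct are assumptions made for illustration; the real functions take different parameters.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sessionFilePaths is a hypothetical stand-in for the paths recorded in the
// checkpoint summary.
type sessionFilePaths struct {
	Transcript  string
	ContentHash string
}

// writeTranscript stores the transcript and its hash in tree (an assumed
// in-memory stand-in for the checkpoint tree) and reports whether it wrote
// anything. An empty transcript writes nothing.
func writeTranscript(tree map[string][]byte, dir string, transcript []byte) (bool, error) {
	if len(transcript) == 0 {
		return false, nil
	}
	sum := sha256.Sum256(transcript)
	tree[dir+"/full.jsonl"] = transcript
	tree[dir+"/content_hash.txt"] = []byte("sha256:" + hex.EncodeToString(sum[:]))
	return true, nil
}

// writeSessionToSubdirectory records the two paths only when the files exist,
// so the summary never points at something that was not written.
func writeSessionToSubdirectory(tree map[string][]byte, dir string, transcript []byte) (sessionFilePaths, error) {
	var paths sessionFilePaths
	wrote, err := writeTranscript(tree, dir, transcript)
	if err != nil {
		return paths, fmt.Errorf("write transcript: %w", err)
	}
	if wrote {
		paths.Transcript = dir + "/full.jsonl"
		paths.ContentHash = dir + "/content_hash.txt"
	}
	return paths, nil
}

func main() {
	tree := map[string][]byte{}
	empty, _ := writeSessionToSubdirectory(tree, "0", nil)
	full, _ := writeSessionToSubdirectory(tree, "1", []byte("{}\n"))
	fmt.Printf("empty=%+v\nfull=%+v\n", empty, full)
}
```

Returning the boolean from the writer keeps the decision next to the code that knows whether a file exists, so the caller does not have to re-derive it from the transcript length.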
Pull request overview
This PR improves manual-commit session condensation to avoid writing “metadata-only” checkpoint stubs when there’s nothing meaningful to persist (no transcript and no files), while still capturing transcripts in best-effort fashion for research-only and subagent scenarios (e.g., Codex via plugin).
Changes:
- Adds best-effort transcript resolution from the agent’s native session storage when normal extraction yields no transcript.
- Introduces a skip gate that returns CondenseResult{Skipped: true} when both transcript and touched files are empty, preventing phantom/stub checkpoints and avoiding state mutation in the caller.
- Fixes phantom transcript/content-hash paths in committed checkpoint summaries by only recording those paths when transcript files were actually written, with targeted unit tests.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_types.go | Extends CondenseResult with Skipped to support no-op condensations. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Treats CondenseResult.Skipped as a no-op (no state changes, no cleanup). |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Implements extract refactor, transcript fallback resolution, and skip gate behavior. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds unit tests for skip behavior and transcript fallback (including TranscriptPreparer). |
| cmd/entire/cli/checkpoint/committed.go | Updates transcript writing to return “wrote files” boolean; prevents recording phantom paths. |
| cmd/entire/cli/checkpoint/committed_phantom_paths_test.go | Adds tests ensuring summary paths are only populated when transcript/hash files exist. |
Document why t.Parallel() is not used in condense_skip_test.go (t.Chdir/t.Setenv modify process-global state). Clarify that the condensed field on postCommitActionHandler intentionally returns false for both failures and skips.
extractOrCreateSessionData now accepts plumbing.Hash instead of *plumbing.Reference. The caller dereferences ref.Hash() only when hasShadowBranch is true, preventing a potential nil pointer dereference if the function's switch cases are reordered.
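The nil-safety argument can be illustrated with stand-in types. The sketch below does not use go-git: `hash` and `reference` are hypothetical simplifications of plumbing.Hash and plumbing.Reference, and `extract` and `describe` are made-up names for the callee and the caller.

```go
package main

import "fmt"

// hash and reference are simplified stand-ins, not the go-git types.
type hash [20]byte

type reference struct{ target hash }

// Hash reads a field through the receiver, so calling it on a nil
// *reference would panic.
func (r *reference) Hash() hash { return r.target }

// extract takes the hash by value. Whatever order its cases are in, there is
// no pointer left to dereference inside the function.
func extract(shadow hash, hasShadowBranch bool) string {
	switch {
	case hasShadowBranch:
		return fmt.Sprintf("extract from shadow commit %x", shadow[:4])
	default:
		return "no shadow branch: return empty session data"
	}
}

// describe is the caller: it dereferences ref only after checking for nil.
func describe(ref *reference) string {
	hasShadowBranch := ref != nil
	var shadow hash // stays the zero hash when there is no shadow branch
	if hasShadowBranch {
		shadow = ref.Hash()
	}
	return extract(shadow, hasShadowBranch)
}

func main() {
	fmt.Println(describe(nil))
	fmt.Println(describe(&reference{target: hash{0xde, 0xad, 0xbe, 0xef}}))
}
```

Passing a value type moves the nil check to a single place in the caller, which is the property the commit message relies on.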
CondenseSessionByID and CondenseAndMarkFullyCondensed now check result.Skipped before updating state. CondenseSessionByID preserves session state intact (no StepCount reset, no shadow branch cleanup). CondenseAndMarkFullyCondensed marks the session as FullyCondensed since there is genuinely nothing to condense. Also adds a WARN log in extractOrCreateSessionData's default case to make the no-shadow-branch/no-transcript-path fallback visible in logs, preventing silent transcript loss from going undetected. Entire-Checkpoint: bc26a77f7fda
The prepare-commit-msg fast path (no-TTY agent commits) now skips ACTIVE sessions that have no transcript path, no tracked files, and no shadow branch data (StepCount == 0). These sessions would produce a Skipped result in CondenseSession, leaving the Entire-Checkpoint trailer pointing to nothing on the metadata branch. This prevents the commit-to-checkpoint invariant from being broken when the only active session is an empty subagent (e.g., Codex running inside Claude Code without producing a transcript).
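The guard described above could look roughly like the following. This is a sketch under assumptions: the `sessionState` struct, the `"active"` phase string, and the helper names `hasCondensableContent` and `pickSessionForTrailer` are invented for illustration and are not taken from the PR.

```go
package main

import "fmt"

// sessionState is a hypothetical, trimmed-down view of a session.
type sessionState struct {
	ID             string
	Phase          string
	TranscriptPath string
	FilesTouched   []string
	StepCount      int
}

// hasCondensableContent predicts whether condensation would produce anything:
// a transcript path, tracked files, or shadow-branch steps.
func hasCondensableContent(s sessionState) bool {
	return s.TranscriptPath != "" || len(s.FilesTouched) > 0 || s.StepCount > 0
}

// pickSessionForTrailer returns the first ACTIVE session worth linking to the
// commit. Empty sessions are passed over so no trailer points at a checkpoint
// that will never be written.
func pickSessionForTrailer(sessions []sessionState) (sessionState, bool) {
	for _, s := range sessions {
		if s.Phase != "active" || !hasCondensableContent(s) {
			continue
		}
		return s, true
	}
	return sessionState{}, false
}

func main() {
	sessions := []sessionState{
		{ID: "codex", Phase: "active"}, // empty companion session
		{ID: "claude", Phase: "active", TranscriptPath: "/tmp/t.jsonl"},
	}
	if s, ok := pickSessionForTrailer(sessions); ok {
		fmt.Println("trailer for session:", s.ID)
	}
}
```

Because this predicate duplicates the decision made later by the skip gate, the two conditions need to change together.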
…lution Replace inlined AsTranscriptPreparer type assertion and PrepareTranscript call with the existing prepareTranscriptIfNeeded helper from common.go. Identical behavior — both swallow errors and let callers handle missing files gracefully. Entire-Checkpoint: 7f527ed2d7e7
Return ([]byte, string) instead of []byte so the caller explicitly sets state.TranscriptPath. This makes the data flow visible at the call site instead of hiding a state mutation inside a function whose name and return type suggest read-only resolution. Entire-Checkpoint: c530dfff239d
…p gate The empty-session check in tryAgentCommitFastPath predicts what CondenseSession's skip gate would decide. Add a note so future editors know to update both locations together. Entire-Checkpoint: c7ff813d8c4c
The agent commit fast path now skips sessions with no transcript path, no files, and no steps. Update the test to use a session with a transcript path so it exercises the intended fast-path behavior rather than the new skip guard. Entire-Checkpoint: 21edf89c5352
The codex-plugin-cc companion does create Entire sessions: entire enable installs .codex/hooks.json and enables the codex_hooks feature flag, so Codex itself fires hooks (session-start, user-prompt-submit, stop) when running via the app-server. Update comments to say "Codex hooks may send transcript_path as null" instead of "Codex subagent hooks don't include transcript_path" — more precise about the actual payload. Entire-Checkpoint: fe3d47e6bbbe
The resolveTranscriptFromAgentStorage fallback looked up transcripts from the agent's native storage directory. This doesn't work for the primary use case (Codex companion plugin via app-server) because the app-server stores data in a SQLite database, not rollout files. Remove the fallback function, its call site in CondenseSession, and all related tests. The skip gate and defensive changes (phantom path fix, dangling trailer prevention) remain — they correctly handle sessions with no transcript by skipping condensation. Entire-Checkpoint: 36483ffb07fc
CondenseSessionByID previously returned nil on Skipped without marking the session, creating an infinite-retry loop for entire doctor. Now marks FullyCondensed=true, consistent with CondenseAndMarkFullyCondensed. Also fixes a stale comment referencing removed fallback resolution and clarifies the sync comment between the fast-path guard and skip gate. Entire-Checkpoint: e5244903031b
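How the callers treat a skipped result can be sketched as two small handlers. The types and the function names `postCommitUpdate` and `doctorUpdate` are hypothetical, and the non-skip branches (resetting StepCount) are an assumption based on the earlier commit messages, not a copy of the real code.

```go
package main

import "fmt"

// condenseResult and sessionState are simplified stand-ins for the PR's types.
type condenseResult struct {
	Skipped bool
}

type sessionState struct {
	StepCount      int
	FullyCondensed bool
}

// postCommitUpdate models the PostCommit path: a skip is a no-op that reports
// "not condensed" and leaves the session state untouched.
func postCommitUpdate(s *sessionState, res condenseResult, err error) bool {
	if err != nil || res.Skipped {
		return false
	}
	s.StepCount = 0 // assumed normal-path bookkeeping
	return true
}

// doctorUpdate models the doctor path: a skip means there is nothing to
// condense, so the session is marked resolved instead of being retried.
func doctorUpdate(s *sessionState, res condenseResult, err error) error {
	if err != nil {
		return fmt.Errorf("condense: %w", err)
	}
	if res.Skipped {
		s.FullyCondensed = true
		return nil
	}
	s.StepCount = 0 // assumed normal-path bookkeeping
	return nil
}

func main() {
	s := sessionState{StepCount: 3}
	fmt.Println("post-commit condensed:", postCommitUpdate(&s, condenseResult{Skipped: true}, nil))
	_ = doctorUpdate(&s, condenseResult{Skipped: true}, nil)
	fmt.Printf("after doctor: %+v\n", s)
}
```

Marking the session on the doctor path is what breaks the retry loop: without it, every run would see the same unresolved empty session.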
…SessionData Entire-Checkpoint: 12a9a7e2ffbe
Pull request overview
This PR prevents “empty” agent sessions (no transcript + no file changes) from generating metadata-only committed checkpoints and from producing dangling commit-message trailers that point to non-existent checkpoint artifacts.
Changes:
- Add a skip gate to ManualCommitStrategy.CondenseSession and propagate CondenseResult.Skipped to PostCommit/doctor/eager-condense paths.
- Prevent dangling Entire-Checkpoint trailers by skipping clearly-empty ACTIVE sessions in tryAgentCommitFastPath.
- Fix phantom checkpoint paths by only populating transcript/content-hash paths when transcript files are actually written, and add targeted unit/integration tests.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_types.go | Adds CondenseResult.Skipped to represent no-op condensations. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Handles Skipped results in PostCommit; updates fast-path trailer gating to skip empty ACTIVE sessions. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Refactors extraction into extractOrCreateSessionData and adds the CondenseSession skip gate; marks skipped sessions fully condensed in doctor/eager paths. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds unit coverage for skip gating, doctor/eager behavior, and fast-path trailer skipping. |
| cmd/entire/cli/integration_test/mid_session_commit_test.go | Updates integration test to reflect trailer behavior only when a session has content. |
| cmd/entire/cli/checkpoint/committed_phantom_paths_test.go | Adds tests ensuring no phantom transcript/content-hash paths are recorded when no transcript is written. |
| cmd/entire/cli/checkpoint/committed.go | Changes transcript writing to return whether anything was written and only records transcript/content-hash paths when present. |
…n condensation filterFilesTouched's fallback assigns all committed files to sessions with empty FilesTouched (designed for mid-turn commits). This defeated the skip gate for genuinely empty sessions (no transcript, no shadow branch, no tracked files) because by the time the gate checked, the fallback had inflated FilesTouched. Move the skip gate before filterFilesTouched so it checks the session's own tracked files rather than the post-fallback set. Entire-Checkpoint: 036a550c2836
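The ordering problem is easier to see in a small model. In the sketch below, `sessionData`, `filterFilesTouched`, and `condense` are simplified stand-ins; the intersection step inside `filterFilesTouched` is an assumption, and only the fallback for an empty tracked list comes from the commit message.

```go
package main

import "fmt"

// sessionData is a hypothetical, trimmed-down version of the extracted data.
type sessionData struct {
	Transcript   []byte
	FilesTouched []string
}

// filterFilesTouched keeps tracked files that are part of the commit. When the
// session tracked nothing, it falls back to all committed files (the mid-turn
// commit case).
func filterFilesTouched(tracked, committed []string) []string {
	if len(tracked) == 0 {
		return append([]string(nil), committed...)
	}
	inCommit := make(map[string]bool, len(committed))
	for _, f := range committed {
		inCommit[f] = true
	}
	var out []string
	for _, f := range tracked {
		if inCommit[f] {
			out = append(out, f)
		}
	}
	return out
}

// condense runs the skip gate on the session's own tracked files, before the
// fallback has a chance to inflate the list.
func condense(data sessionData, committed []string) (skipped bool, files []string) {
	if len(data.Transcript) == 0 && len(data.FilesTouched) == 0 {
		return true, nil
	}
	return false, filterFilesTouched(data.FilesTouched, committed)
}

func main() {
	committed := []string{"a.go", "b.go"}
	skipped, files := condense(sessionData{}, committed)
	fmt.Println(skipped, files)
	skipped, files = condense(sessionData{Transcript: []byte("hi")}, committed)
	fmt.Println(skipped, files)
}
```

With the gate placed after the filter, the empty session in the first call would have been handed both committed files and condensed anyway; the second call shows that the fallback still applies to a session that has a transcript but no tracked files.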
Entire-Checkpoint: c178329f6c0a
Entire-Checkpoint: 1edd77edfe92
Entire-Checkpoint: 029cd5b536c1
Entire-Checkpoint: 0bc529cd302d
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 5a51675.
Problem
When an agent session has no file changes and no transcript (e.g., Codex running inside Claude Code via the codex-plugin-cc), the condensation logic writes metadata-only checkpoint stubs. This creates noise in checkpoint history and records phantom file paths pointing to non-existent transcript files.
How Codex companion creates empty sessions
The codex-plugin-cc runs Codex via an app-server protocol inside Claude Code. It does not call Entire hooks directly, but entire enable installs .codex/hooks.json and enables the codex_hooks feature flag in .codex/config.toml. Codex itself reads these hooks and fires SessionStart, UserPromptSubmit, and Stop via entire hooks codex *, creating a separate SessionState in .git/entire-sessions/ alongside Claude Code's session.
The Codex hook payloads have a nullable transcript_path field that is often null, and the app-server does not write rollout files to ~/.codex/sessions/ (it uses a SQLite database instead). This means:
- SaveStep is called → no transcript on the shadow branch
- TranscriptPath is empty → no live transcript to read
- CondenseSession proceeds anyway → metadata-only stub with phantom paths
Validated via logs for checkpoint 5b6978164aea: the Codex session fired SessionStart and TurnStart hooks with session_ref: "", was condensed into 4 consecutive commits with transcript_bytes: 0, and never fired a Stop hook.
Prior behavior
- CondenseSession returned an error when there was no shadow branch and no TranscriptPath, but for sessions that had a shadow branch (from another session's commits on the same branch) it silently proceeded with empty transcript data
- writeSessionToSubdirectory unconditionally recorded Transcript and ContentHash paths in the checkpoint summary even when writeTranscript wrote nothing, creating phantom paths to non-existent files
- tryAgentCommitFastPath added Entire-Checkpoint trailers for any ACTIVE session, even empty ones, creating dangling trailers pointing to nothing on the metadata branch
- CondenseSessionByID (used by entire doctor) would retry empty sessions indefinitely, never marking them as resolved
Solution
Skip condensation for sessions with no meaningful content, and prevent phantom artifacts at every layer.
Skip gate in CondenseSession
After extraction and file filtering, if both sessionData.Transcript and sessionData.FilesTouched are empty, return CondenseResult{Skipped: true} instead of writing a metadata-only stub. All three callers handle Skipped:
- condenseAndUpdateState (PostCommit): returns false, preserves shadow branches
- CondenseSessionByID (doctor): marks FullyCondensed=true so doctor doesn't retry
- CondenseAndMarkFullyCondensed (eager condense): marks FullyCondensed=true
Extraction refactor
Extracted extractOrCreateSessionData from CondenseSession to handle three cases cleanly: shadow branch extraction, live transcript extraction, and a new default case that returns empty session data (letting the skip gate handle it) instead of erroring.
Phantom path fix in writeSessionToSubdirectory
writeTranscript now returns (bool, error). The caller only records Transcript and ContentHash paths in SessionFilePaths when files were actually written.
Dangling trailer prevention in tryAgentCommitFastPath
The prepare-commit-msg fast path now skips ACTIVE sessions with no TranscriptPath, no FilesTouched, and StepCount == 0. This prevents adding Entire-Checkpoint trailers that would point to nothing on the metadata branch.
Test plan
- TestCondenseSession_SkipsWhenNoTranscriptAndNoFiles: empty session → Skipped == true
- TestCondenseSession_DoesNotSkipWhenFilesTouchedButNoTranscript: files but no transcript → condensed normally
- TestCondenseSessionByID_SkippedPreservesState: doctor path marks FullyCondensed
- TestCondenseAndMarkFullyCondensed_SkippedMarksFullyCondensed: eager path marks FullyCondensed
- TestTryAgentCommitFastPath_SkipsEmptySession: no trailer for empty session
- TestTryAgentCommitFastPath_AcceptsSessionWithContent: trailer for session with content
- TestTryAgentCommitFastPath_SkipsEmptyButAcceptsContentSession: multi-session, skips empty Codex, uses Claude Code
- TestWriteCommitted_EmptyTranscript_NoPhantomPaths: no phantom paths for empty transcript
- TestWriteCommitted_WithTranscript_PathsPopulated: paths set when transcript exists
- TestShadowStrategy_AgentCommit_GetsTrailerWhenSessionHasContent: integration test for fast path with content
- mise run fmt && mise run lint && mise run test:ci)
Note
Medium Risk
Touches core checkpoint condensation and commit-hook trailer logic; while behavior is guarded by tests, it can change when/which sessions get linked to commits and marked condensed.
Overview
Prevents empty/companion agent sessions (no transcript and no files touched) from being condensed into checkpoints, avoiding metadata-only stubs and dangling Entire-Checkpoint trailers.
CondenseSession now returns a CondenseResult.Skipped early via a new skip gate (and a small extraction refactor), and all callers handle skips by not updating state/cleaning up branches or by marking sessions FullyCondensed to stop entire doctor retry loops. The commit-message fast path (tryAgentCommitFastPath) now ignores ACTIVE sessions with no condensable content.
Fixes phantom checkpoint paths by having writeTranscript return whether it actually wrote files, and only populating SessionFilePaths.Transcript/ContentHash when present; adds targeted unit and integration regression coverage for these scenarios.