fix: skip empty sessions and prevent phantom checkpoint paths #958

Merged
pfleidi merged 22 commits into main from
feat/best-effort-transcript-capture
Apr 15, 2026

Conversation

@pfleidi pfleidi commented Apr 15, 2026

Problem

When an agent session has no file changes and no transcript (e.g., Codex running inside Claude Code via the codex-plugin-cc), the condensation logic writes metadata-only checkpoint stubs. This creates noise in checkpoint history and records phantom file paths pointing to non-existent transcript files.

How Codex companion creates empty sessions

The codex-plugin-cc runs Codex via an app-server protocol inside Claude Code. It does not call Entire hooks directly — but entire enable installs .codex/hooks.json and enables the codex_hooks feature flag in .codex/config.toml. Codex itself reads these hooks and fires SessionStart, UserPromptSubmit, and Stop via entire hooks codex *, creating a separate SessionState in .git/entire-sessions/ alongside Claude Code's session.

The Codex hook payloads have a nullable transcript_path field that is often null, and the app-server does not write rollout files to ~/.codex/sessions/ (it uses a SQLite database instead). This means:

  • No SaveStep is called → no transcript on the shadow branch
  • TranscriptPath is empty → no live transcript to read
  • No rollout file on disk → no way to resolve the transcript after the fact
  • CondenseSession proceeds anyway → metadata-only stub with phantom paths

Validated via logs for checkpoint 5b6978164aea: the Codex session fired SessionStart and TurnStart hooks with session_ref: "", was condensed into 4 consecutive commits with transcript_bytes: 0, and never fired a Stop hook.
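The nullable field described above can be sketched as follows. This is a minimal illustration, not the project's actual decoder: the `hookPayload` type, its `session_id` field, and the `transcriptPath` helper are hypothetical, and only the `transcript_path` key comes from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hookPayload is a hypothetical stand-in for a Codex hook payload.
// Only transcript_path is named in the PR; session_id is an assumption.
type hookPayload struct {
	SessionID      string  `json:"session_id"`
	TranscriptPath *string `json:"transcript_path"` // stays nil for JSON null or a missing key
}

// transcriptPath decodes a payload and returns "" when no path was sent.
func transcriptPath(raw []byte) (string, error) {
	var p hookPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return "", fmt.Errorf("decode hook payload: %w", err)
	}
	if p.TranscriptPath == nil {
		return "", nil
	}
	return *p.TranscriptPath, nil
}

func main() {
	path, err := transcriptPath([]byte(`{"session_id":"abc","transcript_path":null}`))
	fmt.Printf("%q %v\n", path, err)
}
```

Decoding into a pointer keeps "null" distinguishable from an empty string at the boundary, even though both end up as an empty TranscriptPath in this sketch.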

Prior behavior

  1. CondenseSession returned an error when there was no shadow branch and no TranscriptPath — but for sessions that had a shadow branch (from another session's commits on the same branch), it silently proceeded with empty transcript data
  2. writeSessionToSubdirectory unconditionally recorded Transcript and ContentHash paths in the checkpoint summary even when writeTranscript wrote nothing, creating phantom paths to non-existent files
  3. tryAgentCommitFastPath added Entire-Checkpoint trailers for any ACTIVE session, even empty ones — creating dangling trailers pointing to nothing on the metadata branch
  4. CondenseSessionByID (used by entire doctor) would retry empty sessions indefinitely, never marking them as resolved

Solution

Skip condensation for sessions with no meaningful content, and prevent phantom artifacts at every layer.

Skip gate in CondenseSession

After extraction and before filterFilesTouched applies its fallback, if both sessionData.Transcript and sessionData.FilesTouched are empty, return CondenseResult{Skipped: true} instead of writing a metadata-only stub. All three callers handle Skipped:

  • condenseAndUpdateState (PostCommit) — returns false, preserves shadow branches
  • CondenseSessionByID (doctor) — marks FullyCondensed=true so doctor doesn't retry
  • CondenseAndMarkFullyCondensed (eager condense) — marks FullyCondensed=true
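A simplified sketch of the skip gate and two of its callers, assuming stand-in types. The lower-case types and the `condense`, `condenseAfterCommit`, and `condenseForDoctor` functions are hypothetical; only the field names Skipped, Transcript, FilesTouched, and FullyCondensed come from the description.

```go
package main

import "fmt"

// Stand-in types; the real structs carry more fields.
type sessionData struct {
	Transcript   []byte
	FilesTouched []string
}

type condenseResult struct {
	Skipped bool
}

type sessionState struct {
	FullyCondensed bool
}

// condense returns Skipped when there is nothing meaningful to persist.
func condense(d sessionData) condenseResult {
	if len(d.Transcript) == 0 && len(d.FilesTouched) == 0 {
		return condenseResult{Skipped: true}
	}
	// A real implementation would write the checkpoint here.
	return condenseResult{}
}

// condenseAfterCommit mirrors the PostCommit caller: a skip reports false,
// so the caller leaves session state and shadow branches alone.
func condenseAfterCommit(d sessionData) bool {
	return !condense(d).Skipped
}

// condenseForDoctor mirrors the doctor caller: a skipped session is marked
// fully condensed so that it is not retried.
func condenseForDoctor(st *sessionState, d sessionData) {
	if condense(d).Skipped {
		st.FullyCondensed = true
	}
}

func main() {
	st := &sessionState{}
	condenseForDoctor(st, sessionData{})
	fmt.Println(condenseAfterCommit(sessionData{}), st.FullyCondensed)
}
```

Returning an explicit Skipped flag, rather than an error, lets each caller choose its own reaction to "nothing to do".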

Extraction refactor

Extracted extractOrCreateSessionData from CondenseSession to handle three cases cleanly: shadow branch extraction, live transcript extraction, and a new default case that returns empty session data (letting the skip gate handle it) instead of erroring.
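The three cases can be pictured as a single switch. The sketch below is an assumption about the shape, not the real signature: `extractOrCreate`, its injected reader functions, and the `Source` field are hypothetical and exist only so the example runs without git or disk access.

```go
package main

import "fmt"

// sessionData is a stand-in; Source is a hypothetical field added for illustration.
type sessionData struct {
	Transcript []byte
	Source     string
}

// extractOrCreate picks the extraction source. The readers are injected so
// the sketch stays self-contained.
func extractOrCreate(
	hasShadowBranch bool,
	transcriptPath string,
	fromShadow func() []byte,
	fromLive func(path string) []byte,
) sessionData {
	switch {
	case hasShadowBranch:
		return sessionData{Transcript: fromShadow(), Source: "shadow"}
	case transcriptPath != "":
		return sessionData{Transcript: fromLive(transcriptPath), Source: "live"}
	default:
		// Neither source exists: return empty data instead of an error
		// and let the skip gate decide what to do.
		return sessionData{Source: "empty"}
	}
}

func main() {
	shadow := func() []byte { return []byte("from shadow branch") }
	live := func(path string) []byte { return []byte("from " + path) }
	fmt.Println(extractOrCreate(false, "", shadow, live).Source)
}
```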

Phantom path fix in writeSessionToSubdirectory

writeTranscript now returns (bool, error). The caller only records Transcript and ContentHash paths in SessionFilePaths when files were actually written.
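A minimal sketch of that contract, using an in-memory map in place of the checkpoint tree. The file names `full.jsonl` and `content_hash.txt`, the `writeSession` helper, and the map-based store are assumptions for illustration; only the (bool, error) return and the Transcript/ContentHash fields come from the description.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// sessionFilePaths is a stand-in for the checkpoint summary's path record.
type sessionFilePaths struct {
	Transcript  string
	ContentHash string
}

// writeTranscript stores the transcript and its hash, and reports whether
// anything was actually written.
func writeTranscript(files map[string][]byte, dir string, transcript []byte) (bool, error) {
	if len(transcript) == 0 {
		return false, nil // nothing to write, so no paths should be recorded
	}
	sum := sha256.Sum256(transcript)
	files[dir+"/full.jsonl"] = transcript
	files[dir+"/content_hash.txt"] = []byte(fmt.Sprintf("sha256:%x", sum))
	return true, nil
}

// writeSession records paths only when writeTranscript wrote files.
func writeSession(files map[string][]byte, dir string, transcript []byte) (sessionFilePaths, error) {
	var paths sessionFilePaths
	wrote, err := writeTranscript(files, dir, transcript)
	if err != nil {
		return paths, err
	}
	if wrote {
		paths.Transcript = dir + "/full.jsonl"
		paths.ContentHash = dir + "/content_hash.txt"
	}
	return paths, nil
}

func main() {
	files := map[string][]byte{}
	paths, _ := writeSession(files, "0", nil)
	fmt.Printf("%+v %d\n", paths, len(files))
}
```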

Dangling trailer prevention in tryAgentCommitFastPath

The prepare-commit-msg fast path now skips ACTIVE sessions with no TranscriptPath, no FilesTouched, and StepCount == 0. This prevents adding Entire-Checkpoint trailers that would point to nothing on the metadata branch.
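The guard can be sketched as a small predicate over session state. The `session` type, its `ID` and `Phase` fields, and the `isEmpty` and `pickForTrailer` helpers are hypothetical; the three conditions (TranscriptPath, FilesTouched, StepCount) are the ones named above.

```go
package main

import "fmt"

// session is a stand-in for the per-session state consulted by the fast path.
type session struct {
	ID             string
	Phase          string
	TranscriptPath string
	FilesTouched   []string
	StepCount      int
}

// isEmpty predicts whether condensation would skip this session.
func isEmpty(s session) bool {
	return s.TranscriptPath == "" && len(s.FilesTouched) == 0 && s.StepCount == 0
}

// pickForTrailer returns the first ACTIVE session that has content.
// It reports false when no session justifies a trailer.
func pickForTrailer(sessions []session) (string, bool) {
	for _, s := range sessions {
		if s.Phase == "ACTIVE" && !isEmpty(s) {
			return s.ID, true
		}
	}
	return "", false
}

func main() {
	sessions := []session{
		{ID: "codex", Phase: "ACTIVE"},
		{ID: "claude", Phase: "ACTIVE", TranscriptPath: "/tmp/claude.jsonl"},
	}
	fmt.Println(pickForTrailer(sessions))
}
```

Because this predicate duplicates the decision made later by the skip gate, the two checks need to change together.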

Test plan

  • TestCondenseSession_SkipsWhenNoTranscriptAndNoFiles — empty session → Skipped == true
  • TestCondenseSession_DoesNotSkipWhenFilesTouchedButNoTranscript — files but no transcript → condensed normally
  • TestCondenseSessionByID_SkippedPreservesState — doctor path marks FullyCondensed
  • TestCondenseAndMarkFullyCondensed_SkippedMarksFullyCondensed — eager path marks FullyCondensed
  • TestTryAgentCommitFastPath_SkipsEmptySession — no trailer for empty session
  • TestTryAgentCommitFastPath_AcceptsSessionWithContent — trailer for session with content
  • TestTryAgentCommitFastPath_SkipsEmptyButAcceptsContentSession — multi-session: skips empty Codex, uses Claude Code
  • TestWriteCommitted_EmptyTranscript_NoPhantomPaths — no phantom paths for empty transcript
  • TestWriteCommitted_WithTranscript_PathsPopulated — paths set when transcript exists
  • TestShadowStrategy_AgentCommit_GetsTrailerWhenSessionHasContent — integration test for fast path with content
  • Full CI suite passes (mise run fmt && mise run lint && mise run test:ci)

Note

Medium Risk
Touches core checkpoint condensation and commit-hook trailer logic; while behavior is guarded by tests, it can change when/which sessions get linked to commits and marked condensed.

Overview
Prevents empty/companion agent sessions (no transcript and no files touched) from being condensed into checkpoints, avoiding metadata-only stubs and dangling Entire-Checkpoint trailers.

CondenseSession now returns a CondenseResult.Skipped early via a new skip gate (and a small extraction refactor), and all callers handle skips by not updating state/cleaning up branches or by marking sessions FullyCondensed to stop entire doctor retry loops. The commit-message fast path (tryAgentCommitFastPath) now ignores ACTIVE sessions with no condensable content.

Fixes phantom checkpoint paths by having writeTranscript return whether it actually wrote files, and only populating SessionFilePaths.Transcript/ContentHash when present; adds targeted unit and integration regression coverage for these scenarios.

Reviewed by Cursor Bugbot for commit 5a51675.

pfleidi added 2 commits April 14, 2026 17:37
CondenseSession now attempts to resolve transcripts from the agent's
native storage (via GetSessionDir/ResolveSessionFile) when the normal
extraction paths find nothing. This covers agents that don't include
transcript_path in hook payloads (e.g., Codex running as a subagent).
For agents that require an export command (e.g., OpenCode),
PrepareTranscript is called before reading the resolved file.

If no transcript is found AND no files were touched, condensation is
skipped entirely — no metadata-only stubs are written to the checkpoint
branch. The session state remains intact so future commits can retry.

writeTranscript now returns a bool indicating whether files were actually
written. writeSessionToSubdirectory only records Transcript and ContentHash
paths in the checkpoint summary when the transcript was written, preventing
phantom paths that point to non-existent files.
Copilot AI review requested due to automatic review settings April 15, 2026 00:45
Comment thread cmd/entire/cli/strategy/manual_commit_condensation.go

Copilot AI left a comment

Pull request overview

This PR improves manual-commit session condensation to avoid writing “metadata-only” checkpoint stubs when there’s nothing meaningful to persist (no transcript and no files), while still capturing transcripts in best-effort fashion for research-only and subagent scenarios (e.g., Codex via plugin).

Changes:

  • Adds best-effort transcript resolution from the agent’s native session storage when normal extraction yields no transcript.
  • Introduces a skip gate that returns CondenseResult{Skipped: true} when both transcript and touched files are empty, preventing phantom/stub checkpoints and avoiding state mutation in the caller.
  • Fixes phantom transcript/content-hash paths in committed checkpoint summaries by only recording those paths when transcript files were actually written, with targeted unit tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| cmd/entire/cli/strategy/manual_commit_types.go | Extends CondenseResult with Skipped to support no-op condensations. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Treats CondenseResult.Skipped as a no-op (no state changes, no cleanup). |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Implements extract refactor, transcript fallback resolution, and skip gate behavior. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds unit tests for skip behavior and transcript fallback (including TranscriptPreparer). |
| cmd/entire/cli/checkpoint/committed.go | Updates transcript writing to return “wrote files” boolean; prevents recording phantom paths. |
| cmd/entire/cli/checkpoint/committed_phantom_paths_test.go | Adds tests ensuring summary paths are only populated when transcript/hash files exist. |

pfleidi added 12 commits April 15, 2026 10:13
Document why t.Parallel() is not used in condense_skip_test.go
(t.Chdir/t.Setenv modify process-global state). Clarify that
the condensed field on postCommitActionHandler intentionally
returns false for both failures and skips.

extractOrCreateSessionData now accepts plumbing.Hash instead of
*plumbing.Reference. The caller dereferences ref.Hash() only when
hasShadowBranch is true, preventing a potential nil pointer
dereference if the function's switch cases are reordered.

CondenseSessionByID and CondenseAndMarkFullyCondensed now check
result.Skipped before updating state. CondenseSessionByID preserves
session state intact (no StepCount reset, no shadow branch cleanup).
CondenseAndMarkFullyCondensed marks the session as FullyCondensed
since there is genuinely nothing to condense.

Also adds a WARN log in extractOrCreateSessionData's default case
to make the no-shadow-branch/no-transcript-path fallback visible
in logs, preventing silent transcript loss from going undetected.

Entire-Checkpoint: bc26a77f7fda

The prepare-commit-msg fast path (no-TTY agent commits) now skips
ACTIVE sessions that have no transcript path, no tracked files, and
no shadow branch data (StepCount == 0). These sessions would produce
a Skipped result in CondenseSession, leaving the Entire-Checkpoint
trailer pointing to nothing on the metadata branch.

This prevents the commit-to-checkpoint invariant from being broken
when the only active session is an empty subagent (e.g., Codex
running inside Claude Code without producing a transcript).

…lution

Replace inlined AsTranscriptPreparer type assertion and PrepareTranscript
call with the existing prepareTranscriptIfNeeded helper from common.go.
Identical behavior — both swallow errors and let callers handle missing
files gracefully.

Entire-Checkpoint: 7f527ed2d7e7

Return ([]byte, string) instead of []byte so the caller explicitly
sets state.TranscriptPath. This makes the data flow visible at the
call site instead of hiding a state mutation inside a function whose
name and return type suggest read-only resolution.

Entire-Checkpoint: c530dfff239d

…p gate

The empty-session check in tryAgentCommitFastPath predicts what
CondenseSession's skip gate would decide. Add a note so future
editors know to update both locations together.

Entire-Checkpoint: c7ff813d8c4c

The agent commit fast path now skips sessions with no transcript path,
no files, and no steps. Update the test to use a session with a
transcript path so it exercises the intended fast-path behavior rather
than the new skip guard.

Entire-Checkpoint: 21edf89c5352

The codex-plugin-cc companion does create Entire sessions: entire enable
installs .codex/hooks.json and enables the codex_hooks feature flag,
so Codex itself fires hooks (session-start, user-prompt-submit, stop)
when running via the app-server. Update comments to say "Codex hooks
may send transcript_path as null" instead of "Codex subagent hooks
don't include transcript_path" — more precise about the actual payload.

Entire-Checkpoint: fe3d47e6bbbe

The resolveTranscriptFromAgentStorage fallback looked up transcripts
from the agent's native storage directory. This doesn't work for
the primary use case (Codex companion plugin via app-server) because
the app-server stores data in a SQLite database, not rollout files.

Remove the fallback function, its call site in CondenseSession, and
all related tests. The skip gate and defensive changes (phantom path
fix, dangling trailer prevention) remain — they correctly handle
sessions with no transcript by skipping condensation.

Entire-Checkpoint: 36483ffb07fc

CondenseSessionByID previously returned nil on Skipped without marking
the session, creating an infinite-retry loop for entire doctor. Now
marks FullyCondensed=true, consistent with CondenseAndMarkFullyCondensed.

Also fixes a stale comment referencing removed fallback resolution and
clarifies the sync comment between the fast-path guard and skip gate.

Entire-Checkpoint: e5244903031b
@pfleidi pfleidi changed the title from "feat: best-effort transcript capture for research and subagent sessions" to "fix: skip empty sessions and prevent phantom checkpoint paths" Apr 15, 2026

pfleidi commented Apr 15, 2026

Bugbot run

Comment thread cmd/entire/cli/checkpoint/committed_phantom_paths_test.go Outdated
Comment thread cmd/entire/cli/strategy/manual_commit_hooks.go

Copilot AI left a comment

Pull request overview

This PR prevents “empty” agent sessions (no transcript + no file changes) from generating metadata-only committed checkpoints and from producing dangling commit-message trailers that point to non-existent checkpoint artifacts.

Changes:

  • Add a skip gate to ManualCommitStrategy.CondenseSession and propagate CondenseResult.Skipped to PostCommit/doctor/eager-condense paths.
  • Prevent dangling Entire-Checkpoint trailers by skipping clearly-empty ACTIVE sessions in tryAgentCommitFastPath.
  • Fix phantom checkpoint paths by only populating transcript/content-hash paths when transcript files are actually written, and add targeted unit/integration tests.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cmd/entire/cli/strategy/manual_commit_types.go | Adds CondenseResult.Skipped to represent no-op condensations. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Handles Skipped results in PostCommit; updates fast-path trailer gating to skip empty ACTIVE sessions. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Refactors extraction into extractOrCreateSessionData and adds the CondenseSession skip gate; marks skipped sessions fully condensed in doctor/eager paths. |
| cmd/entire/cli/strategy/condense_skip_test.go | Adds unit coverage for skip gating, doctor/eager behavior, and fast-path trailer skipping. |
| cmd/entire/cli/integration_test/mid_session_commit_test.go | Updates integration test to reflect trailer behavior only when a session has content. |
| cmd/entire/cli/checkpoint/committed_phantom_paths_test.go | Adds tests ensuring no phantom transcript/content-hash paths are recorded when no transcript is written. |
| cmd/entire/cli/checkpoint/committed.go | Changes transcript writing to return whether anything was written and only records transcript/content-hash paths when present. |

Comment thread cmd/entire/cli/strategy/manual_commit_condensation.go Outdated
Comment thread cmd/entire/cli/strategy/condense_skip_test.go Outdated
pfleidi added 5 commits April 15, 2026 15:38
…n condensation

filterFilesTouched's fallback assigns all committed files to sessions
with empty FilesTouched (designed for mid-turn commits). This defeated
the skip gate for genuinely empty sessions (no transcript, no shadow
branch, no tracked files) because by the time the gate checked, the
fallback had inflated FilesTouched.

Move the skip gate before filterFilesTouched so it checks the session's
own tracked files rather than the post-fallback set.

Entire-Checkpoint: 036a550c2836
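
The ordering problem in this commit can be reproduced with a small sketch. The fallback behavior follows the commit message; the intersection branch of `filterFilesTouched`, and the `skipGate`, `gateAfterFilter`, and `gateBeforeFilter` helpers, are assumptions made for illustration.

```go
package main

import "fmt"

// filterFilesTouched is a stand-in. Per the commit message, a session with
// no tracked files falls back to all committed files. The intersection used
// otherwise is an assumption.
func filterFilesTouched(tracked, committed []string) []string {
	if len(tracked) == 0 {
		return append([]string(nil), committed...)
	}
	inCommit := make(map[string]bool, len(committed))
	for _, f := range committed {
		inCommit[f] = true
	}
	var out []string
	for _, f := range tracked {
		if inCommit[f] {
			out = append(out, f)
		}
	}
	return out
}

// skipGate reports whether there is nothing to condense.
func skipGate(transcript []byte, files []string) bool {
	return len(transcript) == 0 && len(files) == 0
}

// gateAfterFilter is the old order: the fallback inflates the file list
// before the gate looks at it.
func gateAfterFilter(transcript []byte, tracked, committed []string) bool {
	return skipGate(transcript, filterFilesTouched(tracked, committed))
}

// gateBeforeFilter is the new order: the gate sees the session's own files.
func gateBeforeFilter(transcript []byte, tracked []string) bool {
	return skipGate(transcript, tracked)
}

func main() {
	committed := []string{"main.go"}
	fmt.Println(gateAfterFilter(nil, nil, committed), gateBeforeFilter(nil, nil))
}
```

With an empty session and one committed file, the old order does not skip while the new order does.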

pfleidi commented Apr 15, 2026

Bugbot run

@pfleidi pfleidi marked this pull request as ready for review April 15, 2026 23:01
@pfleidi pfleidi requested a review from a team as a code owner April 15, 2026 23:01

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!


@pfleidi pfleidi merged commit 5bc8615 into main Apr 15, 2026
9 checks passed
@pfleidi pfleidi deleted the feat/best-effort-transcript-capture branch April 15, 2026 23:47