Checkpoints V2: Support checkpoint_transcript_start for compact transcript.jsonl files#877
Draft
computermode wants to merge 39 commits into main from
Entire-Checkpoint: 079c1c0e0eeb
Pre-session dirty files (CLI config files from `entire enable`, leftover changes from previous sessions) were incorrectly counted as human contributions, deflating agent percentage.

Root cause: PA1 (first prompt attribution) captures worktree state at session start. This data was used to correct agent line counts (correct) but also added to human contributions (wrong).

Fix:
- Split prompt attributions into baseline (PA1) and session (PA2+)
- PA1 data still subtracted from agent work (correct agent calc)
- PA1 contributions excluded from relevantAccumulatedUser
- PA1 removals excluded from totalUserRemoved
- Include PendingPromptAttribution during condensation for agents that skip SaveStep (e.g., Codex mid-turn commits)
- Add .entire/ filter to attribution calc (matches existing PA filter)
- Fix wrapcheck lint errors in updateCombinedAttributionForCheckpoint

Verified end-to-end: 100% agent with config files committed alongside.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b0cb4216f6bc
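The baseline/session split described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `promptAttribution` type, its fields, and the `splitAttributions` and `humanTotals` helpers are hypothetical names, and the sketch assumes PA1 is identified by a checkpoint number of 1.

```go
package main

import "fmt"

// promptAttribution is a hypothetical, simplified record of user line changes
// observed when a prompt starts. CheckpointNumber 1 stands for PA1.
type promptAttribution struct {
	CheckpointNumber int
	UserAdded        int
	UserRemoved      int
}

// splitAttributions separates the PA1 baseline (pre-session worktree state)
// from attributions recorded during the session (PA2+).
func splitAttributions(pas []promptAttribution) (baseline, session []promptAttribution) {
	for _, pa := range pas {
		if pa.CheckpointNumber <= 1 {
			baseline = append(baseline, pa)
		} else {
			session = append(session, pa)
		}
	}
	return baseline, session
}

// humanTotals sums only the in-session attributions, so pre-session dirty
// files captured by PA1 are not credited as human work.
func humanTotals(pas []promptAttribution) (added, removed int) {
	_, session := splitAttributions(pas)
	for _, pa := range session {
		added += pa.UserAdded
		removed += pa.UserRemoved
	}
	return added, removed
}

func main() {
	pas := []promptAttribution{
		{CheckpointNumber: 1, UserAdded: 40, UserRemoved: 2}, // config files present before the session
		{CheckpointNumber: 2, UserAdded: 3, UserRemoved: 1},  // edits made by the user mid-session
	}
	added, removed := humanTotals(pas)
	fmt.Println(added, removed)
}
```

In this sketch the baseline slice is still returned so a caller could subtract it from the agent's line counts, which is the part of the old behavior the commit message says was correct.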
…ibution

Checkpoint package changes required by the attribution baseline fix:
- PromptAttributionsJSON field on WriteCommittedOptions and CommittedMetadata
- UpdateCheckpointSummary method on GitStore for multi-session aggregation
- CombinedAttribution field on CheckpointSummary
- Preserve existing CombinedAttribution during summary rewrites

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b8963737336c
…arentCommitHash

Fixes all 4 issues from Copilot and Cursor Bugbot review:
1. Precompute parentCommitHash on postCommitActionHandler struct using ParentHashes[0] (avoids extra object read, no silent error)
2. Remove duplicated 6-line parentCommitHash computation from HandleCondense and HandleCondenseIfFilesTouched
3. Thread parentTree through condenseOpts/attributionOpts and use it for non-agent file line counting; this ensures diffLines uses parent→HEAD (consistent with parentCommitHash file scoping) instead of sessionBase→HEAD, which over-counted intermediate commit changes
4. Add ParentTreeForNonAgentLines test proving the fix (TDD verified: HumanAdded=8 without fix → HumanAdded=3 with fix)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 12f5c4373467
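Why the diff base matters for item 3 can be shown with a toy example. The sketch below is an assumption-laden illustration rather than the PR's `diffLines`: `addedLines` is a hypothetical multiset line diff, and the snapshots are made up so that an intermediate commit adds five lines and the final commit adds three.

```go
package main

import "fmt"

// addedLines counts lines present in newer but not in older (a multiset
// difference). It is a crude stand-in for a real line diff.
func addedLines(older, newer []string) int {
	remaining := make(map[string]int, len(older))
	for _, line := range older {
		remaining[line]++
	}
	added := 0
	for _, line := range newer {
		if remaining[line] > 0 {
			remaining[line]--
			continue
		}
		added++
	}
	return added
}

func main() {
	// Snapshots of one non-agent file at three points in history.
	sessionBase := []string{"a"}
	parent := []string{"a", "b", "c", "d", "e", "f"}             // intermediate commit added 5 lines
	head := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i"} // this commit added 3 lines

	// Diffing from the session base also counts the intermediate commit.
	fmt.Println("sessionBase→HEAD:", addedLines(sessionBase, head))
	// Diffing from the parent counts only what this commit introduced.
	fmt.Println("parent→HEAD:", addedLines(parent, head))
}
```

The inputs were chosen so the two results differ in the same way the commit message reports for its test; they are not taken from that test.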
Three fixes for multi-session attribution:
1. Cross-session file exclusion: Thread allAgentFiles (union of all sessions' FilesTouched) through the attribution pipeline. Files created by other agent sessions are no longer counted as human work.
2. Exclude .entire/ from commit session fallback: When the commit session has no FilesTouched and falls back to all committed files, filter out .entire/ metadata created by `entire enable`.
3. PA1 baseline uses base tree for new sessions: New sessions (StepCount == 0) always diff against the base commit tree, not the shared shadow branch which may contain other sessions' state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
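A rough sketch of the first two fixes, under stated assumptions: all function names here (`unionFilesTouched`, `excludeEntireMetadata`, `sessionFilesOrFallback`, `nonAgentFiles`) are hypothetical and only mirror the behavior described in the commit message, with files modeled as plain path strings.

```go
package main

import (
	"fmt"
	"strings"
)

// unionFilesTouched merges every session's touched files into one set,
// playing the role the commit message assigns to allAgentFiles.
func unionFilesTouched(sessions ...[]string) map[string]struct{} {
	all := make(map[string]struct{})
	for _, files := range sessions {
		for _, f := range files {
			all[f] = struct{}{}
		}
	}
	return all
}

// excludeEntireMetadata drops paths under .entire/.
func excludeEntireMetadata(files []string) []string {
	var kept []string
	for _, f := range files {
		if !strings.HasPrefix(f, ".entire/") {
			kept = append(kept, f)
		}
	}
	return kept
}

// sessionFilesOrFallback returns the session's own files, or all committed
// files minus .entire/ metadata when the session recorded none.
func sessionFilesOrFallback(filesTouched, committed []string) []string {
	if len(filesTouched) > 0 {
		return filesTouched
	}
	return excludeEntireMetadata(committed)
}

// nonAgentFiles lists committed files that no agent session touched.
func nonAgentFiles(committed []string, allAgentFiles map[string]struct{}) []string {
	var human []string
	for _, f := range excludeEntireMetadata(committed) {
		if _, ok := allAgentFiles[f]; !ok {
			human = append(human, f)
		}
	}
	return human
}

func main() {
	committed := []string{".entire/settings.json", "a.go", "b.go", "notes.md"}
	all := unionFilesTouched([]string{"a.go"}, []string{"b.go"})
	fmt.Println(nonAgentFiles(committed, all))
	fmt.Println(sessionFilesOrFallback(nil, committed))
}
```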
Entire-Checkpoint: 209a37190167
Entire-Checkpoint: 573a97ec8d2c
Entire-Checkpoint: 3790cba265e6
Entire-Checkpoint: c9595c52ab4a
Entire-Checkpoint: 9f07aeebbf93
Entire-Checkpoint: f1c37c8efc47
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tering

- Test AllAgentFiles cross-session exclusion in CalculateAttributionWithAccumulated
- Test committedFilesExcludingMetadata filters .entire/ paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The combined_attribution field now diffs parent→HEAD once and classifies files as agent vs human based on the union of sessions with real checkpoints (SaveStep ran). Filters .entire/ and .claude/ config paths. Also adds ReadSessionMetadata for lightweight per-session metadata reads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mmit-inflation Fix attribution inflation from intermediate commits
don't show multiple spaces for codex single line start message rendering
Entire-Checkpoint: 36db97269a69
Entire-Checkpoint: 93066e1dac3c
Entire-Checkpoint: 4fdb72622b7f
Entire-Checkpoint: 51d95c3209d7
Entire-Checkpoint: e5883f33cb01
Initialize compact transcript offsets from existing checkpoint offsets during state normalization and add tests to preserve migration behavior.

Made-with: Cursor
Entire-Checkpoint: 4678bd55995f
Checkpoints V2: add migration option
…t-start-at-metadata
Entire-Checkpoint: 15712c7a9051
Contributor
Pull request overview
Adds support for correctly tracking checkpoint_transcript_start for v2 compact transcript.jsonl artifacts by introducing a compact-transcript-specific offset and propagating it through condensation and migration.
Changes:
- Introduces `CompactTranscriptStart`/`CompactTranscriptLines` plumbing to track compact transcript offsets separately from `full.jsonl` offsets.
- Updates v2 `/main` committed metadata writing to use the compact transcript start offset for `checkpoint_transcript_start`.
- Enhances `migrate` to compute and persist compact transcript offsets when generating `transcript.jsonl`.
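The idea of keeping two offsets and writing only the compact one into v2 `/main` metadata might look like the sketch below. It is illustrative only: `writeOptions`, `mainMetadata`, and `buildMainMetadata` are hypothetical simplifications, and the only names taken from the PR are `CompactTranscriptStart` and the `checkpoint_transcript_start` JSON key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// writeOptions is a hypothetical, trimmed-down stand-in for the options
// passed when writing a committed checkpoint.
type writeOptions struct {
	FullTranscriptStart    int // offset in the full.jsonl line domain (assumed name)
	CompactTranscriptStart int // offset in the compact transcript.jsonl line domain
}

// mainMetadata models only the field of interest in metadata.json.
type mainMetadata struct {
	CheckpointTranscriptStart int `json:"checkpoint_transcript_start"`
}

// buildMainMetadata uses the compact offset, because the artifact stored
// next to this metadata is the compact transcript.jsonl, not full.jsonl.
func buildMainMetadata(opts writeOptions) mainMetadata {
	return mainMetadata{CheckpointTranscriptStart: opts.CompactTranscriptStart}
}

func main() {
	opts := writeOptions{FullTranscriptStart: 120, CompactTranscriptStart: 14}
	encoded, err := json.Marshal(buildMainMetadata(opts))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(encoded))
}
```

Keeping the two offsets as separate fields avoids using a `full.jsonl` line number to index into the compact file, where the same content may sit on a different line.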
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_types.go | Extends condensation result to report compact transcript line counts. |
| cmd/entire/cli/strategy/manual_commit_test.go | Adds coverage ensuring v2 /main writes checkpoint_transcript_start correctly for compact transcripts. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Updates session state to advance/reset compact transcript offsets across condensations/carry-forward. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Plumbs compact transcript start into write options and computes compact transcript line deltas. |
| cmd/entire/cli/session/state.go | Adds persisted compact_transcript_start to session state with legacy backfill behavior. |
| cmd/entire/cli/session/state_test.go | Adds tests for CompactTranscriptStart normalization/backfill and JSON round-trip. |
| cmd/entire/cli/migrate.go | Computes compact transcript offsets during migration and stores them in v2 write options. |
| cmd/entire/cli/checkpoint/v2_store_test.go | Tests that v2 /main metadata uses CompactTranscriptStart for checkpoint_transcript_start. |
| cmd/entire/cli/checkpoint/v2_committed.go | Switches v2 /main metadata field to use compact transcript start offset. |
| cmd/entire/cli/checkpoint/checkpoint.go | Adds CompactTranscriptStart to WriteCommittedOptions for v2 metadata writing. |
Entire-Checkpoint: 92926498c799
Contributor
Author
bugbot review
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit c12965c.
Entire-Checkpoint: 3e7abfc2d4a5
Entire-Checkpoint: 1b4ddd35692a

Ensures we have `checkpoint_transcript_start` in the metadata.json for the compact `transcript.jsonl` files.

Also updates the `migrate` command to respect the start lines (previously, if I migrated, then added v1 checkpoints and migrated again, it wouldn't calculate `checkpoint_transcript_start`).

I tested this by creating multiple commits and checkpoints with a Claude session and inspecting the v2 refs to ensure that the `checkpoint_transcript_start` lines were added to the `metadata.json` file as expected.

Testing with Codex next...
Note

Medium Risk

Adjusts how `checkpoint_transcript_start` is computed and stored for v2 `/main` and during migration, which can affect transcript scoping and downstream consumers if offsets are wrong. Changes are localized and covered by new tests, but touch persistence/state and migration behavior.

Overview

Ensures v2 `/main` metadata writes `checkpoint_transcript_start` in the compact transcript.jsonl line domain by adding `CompactTranscriptStart` to `WriteCommittedOptions` and using it in `v2_committed.go`.

Updates manual-commit condensation and session state to track `compact_transcript_start`, compute/fallback the compact offset when missing, and advance it after each condensation so subsequent checkpoints are correctly scoped.

Improves `migrate --checkpoints v2` to compute and persist compact transcript start offsets when generating `transcript.jsonl`, with new unit tests validating v2 metadata uses the compact offset (not full.jsonl offsets) across write, condense, and migration paths.

Reviewed by Cursor Bugbot for commit c12965c.
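The "advance after each condensation" behavior summarized above could be sketched like this. Everything here is an assumption made for illustration: `sessionState`, `countLines`, `linesFrom`, and `condense` are hypothetical, the transcript is modeled as an in-memory JSONL string, and the offset is advanced to the transcript's total line count.

```go
package main

import (
	"fmt"
	"strings"
)

// sessionState is a hypothetical, minimal stand-in for persisted session state.
type sessionState struct {
	CompactTranscriptStart int `json:"compact_transcript_start"`
}

// countLines returns the number of JSONL records, ignoring trailing newlines.
func countLines(jsonl string) int {
	trimmed := strings.TrimRight(jsonl, "\n")
	if trimmed == "" {
		return 0
	}
	return strings.Count(trimmed, "\n") + 1
}

// linesFrom returns the records at or after the given 0-based line offset.
func linesFrom(jsonl string, start int) []string {
	trimmed := strings.TrimRight(jsonl, "\n")
	if trimmed == "" {
		return nil
	}
	lines := strings.Split(trimmed, "\n")
	if start < 0 {
		start = 0
	}
	if start >= len(lines) {
		return nil
	}
	return lines[start:]
}

// condense reports the start offset and records scoped to this checkpoint,
// then advances the stored offset so the next checkpoint begins after them.
func (s *sessionState) condense(compact string) (start int, scoped []string) {
	start = s.CompactTranscriptStart
	scoped = linesFrom(compact, start)
	s.CompactTranscriptStart = countLines(compact)
	return start, scoped
}

func main() {
	state := &sessionState{}
	start, scoped := state.condense("{\"n\":1}\n{\"n\":2}\n")
	fmt.Println(start, len(scoped), state.CompactTranscriptStart)
	start, scoped = state.condense("{\"n\":1}\n{\"n\":2}\n{\"n\":3}\n")
	fmt.Println(start, len(scoped), state.CompactTranscriptStart)
}
```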