Per-checkpoint git-ref checkpoint store (git-refs backend)#1566

Merged
Soph merged 18 commits into main from feat/checkpoint-git-refs on Jul 2, 2026
Conversation

@Soph Soph commented Jun 29, 2026

Collaborator

https://entire.io/gh/entireio/cli/trails/693

Implements the per-checkpoint git-ref checkpoint store (#1471): each checkpoint is stored under its own ref refs/entire/checkpoints/<shard>/<id> (pointing at a commit whose tree root is that checkpoint's contents) instead of the single append-only entire/checkpoints/v1 branch. Selectable via config, off by default — the git-branch store remains the default and is unchanged.

Targets main. #1576 (path.Join cleanup) is stacked on top of this branch.

What's in it

  1. Ref resolver: CheckpointID.ShardFor shards on the last two characters for both formats (legacy 12-hex and ULID). This is one positional rule with no format detection, so callers cannot compute the shard inconsistently, and the random suffix spreads evenly while IDs stay sortable. RefName builds the ref and rejects invalid/KindUnknown IDs (returns an error rather than a malformed ref); ParseRef is the inverse. This applies to the git-refs namespace only; the v1 branch tree keeps its independent first-two Path() layout.
  2. git-refs store: gitRefsStore implements the full api/checkpoint contract over per-checkpoint refs, sharing the treeWriter core (so dupl stays quiet), and is registered as a git-backed git-refs backend. Per-checkpoint history is orphan-then-parented, and the store stamps checkpoint_version = refs-v1. Also included is the flock-protected JSONL push-discovery queue. Read paths distinguish a genuinely absent checkpoint (→ not-found) from real IO/fetch errors, which now propagate instead of being masked as "not found."
  3. Config-aware pre-push — when the primary is git-refs, PrePush drains the queue and pushes the per-checkpoint refs fast-forward-only (never force). There is no server-side ref protection, so a force push could let a buggy/racing client clobber good remote history; instead a diverged ref is recovered by fetch + replay of the local-only commit onto the remote tip, then retried. It also honors the checkpoint policy (skips the push on a diverged/unsupported-format policy, leaving refs queued) exactly like the v1 path.
  4. On-demand ref fetch — a missing checkpoint ref is fetched on read (resume/explain/attribution/tokens), so checkpoints written on another machine resolve after clone.
  5. refs-v1 read/write policy + explain-on-clone fetches the specific ref by full ID.
  6. e2e/CI — backend-aware test harness, E2E_CHECKPOINT_STORE knob → ENTIRE_CHECKPOINTS_PRIMARY env override, the git-refs leg added to the PR-CI canary matrix, and an e2e-checkpoint-store workflow to run the full suite against either backend.
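The naming rule in item 1 can be sketched in Go as follows. This is a minimal illustration, not the PR's code: refName, parseRef, isHex12, and isULID are hypothetical names, and the validation (lowercase 12-hex, uppercase 26-character Crockford base32) is an assumption about what the production Validate accepts.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const refPrefix = "refs/entire/checkpoints/"

// onlyChars reports whether every rune of s appears in alphabet.
func onlyChars(s, alphabet string) bool {
	for _, c := range s {
		if !strings.ContainsRune(alphabet, c) {
			return false
		}
	}
	return true
}

// isHex12 is an assumed check for a legacy 12-character hex ID.
func isHex12(id string) bool {
	return len(id) == 12 && onlyChars(id, "0123456789abcdef")
}

// isULID is an assumed check for a 26-character Crockford base32 ULID.
func isULID(id string) bool {
	return len(id) == 26 && onlyChars(id, "0123456789ABCDEFGHJKMNPQRSTVWXYZ")
}

// refName builds refs/entire/checkpoints/<shard>/<id>. The shard is the last
// two characters for both formats; unrecognized IDs are refused.
func refName(id string) (string, error) {
	if !isHex12(id) && !isULID(id) {
		return "", fmt.Errorf("invalid checkpoint ID %q", id)
	}
	return refPrefix + id[len(id)-2:] + "/" + id, nil
}

// parseRef inverts refName, rejecting extra path segments or a shard that
// does not match the ID.
func parseRef(ref string) (string, error) {
	if !strings.HasPrefix(ref, refPrefix) {
		return "", errors.New("not a checkpoint ref")
	}
	parts := strings.Split(strings.TrimPrefix(ref, refPrefix), "/")
	if len(parts) != 2 {
		return "", errors.New("unexpected path segments")
	}
	want, err := refName(parts[1])
	if err != nil {
		return "", err
	}
	if want != ref {
		return "", errors.New("shard does not match ID")
	}
	return parts[1], nil
}

func main() {
	ref, err := refName("d1f4f04600b5")
	fmt.Println(ref, err)
	id, err := parseRef(ref)
	fmt.Println(id, err)
}
```

Under these assumptions, d1f4f04600b5 maps to refs/entire/checkpoints/b5/d1f4f04600b5, and a ref whose shard segment disagrees with the ID's last two characters is rejected by parseRef rather than resolved to the wrong bucket.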

Rollout

Selected via the checkpoints.{primary, mirrors} taxonomy (or the ENTIRE_CHECKPOINTS_PRIMARY env override). primary: git-refs + mirror: git-branch is the parallel-rollout shape (refs authoritative; v1 still written locally). ID generation stays 12-hex for now — emitting ULIDs is a separate, deliberately deferred change; the store shards/reads both formats today.

Explicitly out of scope (follow-ups)

  • ULID emission (generation switch).
  • Backfill/migration of existing v1 checkpoints into per-checkpoint refs.
  • v1-mirror push for full downgrade safety (mirrors are write-only locally for now).
  • OPF for the git-refs push path.

Testing

mise run lint: 0 issues (dupl quiet; treeWriter core is shared). Full unit suite passes. mise run test:integration: 379 pass. go vet -tags e2e ./e2e/...: clean. Canary green in both modes: git-branch 59/59 + roger 4/4 (unchanged), and E2E_CHECKPOINT_STORE=git-refs 58/59 + roger 4/4 (the one skip is TestAlternates, a v1-branch-specific object-alternate rebase-sync test with no analog for the git-refs store's independent per-checkpoint refs).

Refs: #1471

🤖 Generated with Claude Code


Note

High Risk
Large change to checkpoint persistence, git push/sync, and remote fetch paths; wrong behavior could lose or fail to sync checkpoint metadata across machines, though git-branch remains the default.

Overview
Adds an optional git-refs checkpoint backend: each checkpoint lives on its own ref refs/entire/checkpoints/<shard>/<id> (commit tree = checkpoint contents) instead of only the shared entire/checkpoints/v1 branch. Default remains git-branch; selection is via checkpoints.primary or ENTIRE_CHECKPOINTS_PRIMARY / ENTIRE_CHECKPOINTS_MIRRORS.

The new gitRefsStore implements the same read/write contract as the branch store (shared tree helpers), stamps refs-v1, and enqueues touched refs in a flock-protected JSONL push queue. Pre-push drains that queue and batch force-pushes those refs when the primary is git-refs (no v1 OPF path yet). On-demand ref fetch (RefFetcher / FetchCheckpointRef) lets resume, explain, attribution, and tokens resolve checkpoints written on another machine after clone.

refs-v1 is now read/write-supported in checkpoint policy; tests that mocked “unsupported refs” move to refs-v2. E2E gains backend-aware helpers, E2E_CHECKPOINT_STORE, and an e2e-checkpoint-store workflow to run the suite against either backend.

Reviewed by Cursor Bugbot for commit 220c6b1. Configure here.

Copilot AI review requested due to automatic review settings June 29, 2026 18:59
@Soph Soph requested a review from a team as a code owner June 29, 2026 18:59

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread cmd/entire/cli/strategy/manual_commit_push.go

Copilot AI left a comment

Contributor

Pull request overview

Adds an optional per-checkpoint git-ref persistent checkpoint store (git-refs) where each checkpoint is stored under refs/entire/checkpoints/<shard>/<id> (commit tree root = checkpoint contents), alongside backend-aware push/fetch behavior and E2E coverage. This fits into the checkpoint persistence layer as an alternative to the existing append-only entire/checkpoints/v1 branch backend, selectable via settings/env overrides.

Changes:

  • Introduces git-refs as a new git-backed persistent checkpoint backend, with ref naming/parsing and a flock-protected JSONL push-discovery queue.
  • Updates CLI read paths (resume/explain/attribution/tokens) to support on-demand fetching of missing checkpoint refs.
  • Extends E2E harness to run against either backend and adds a workflow to run E2E by checkpoint store.

Reviewed changes

Copilot reviewed 55 out of 55 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| e2e/testutil/repo.go | Switches repo setup + checkpoint artifact capture helpers to be backend-aware. |
| e2e/testutil/backend.go | Adds backend selection + checkpoint state digest and ref/path helpers for git-branch vs git-refs. |
| e2e/testutil/assertions.go | Updates checkpoint advance/existence assertions to work under both topologies. |
| e2e/testutil/artifacts.go | Captures per-backend checkpoint trees/metadata for artifacts. |
| e2e/tests/main_test.go | Adds E2E_CHECKPOINT_STORE → ENTIRE_CHECKPOINTS_PRIMARY suite-wide wiring. |
| e2e/tests/session_lifecycle_test.go | Uses backend-aware checkpoint state for “no checkpoint advance” assertions. |
| e2e/tests/rewind_test.go | Uses backend-aware checkpoint state for multi-checkpoint rewind tests. |
| e2e/tests/resume_remote_test.go | Replaces v1-branch existence checks with backend-aware “checkpoints present” checks. |
| e2e/tests/explain_test.go | Replaces v1-branch existence checks with backend-aware “checkpoints present” checks. |
| e2e/tests/edge_cases_test.go | Uses backend-aware checkpoint state in edge-case tests. |
| e2e/tests/attach_test.go | Uses backend-aware “checkpoints present” guard before reading checkpoint baseline. |
| e2e/tests/alternates_test.go | Skips alternates-specific test under git-refs (branch-topology-specific). |
| e2e/README.md | Documents E2E_CHECKPOINT_STORE. |
| cmd/entire/cli/tokens_profile.go | Wires RefFetcher into checkpoint.Open for tokens profiling reads. |
| cmd/entire/cli/resume.go | Wires RefFetcher into checkpoint.Open for resume reads. |
| cmd/entire/cli/explain.go | Wires RefFetcher into checkpoint.Open for explain reads. |
| cmd/entire/cli/explain_export.go | Adds git-refs-aware remote fallback fetch when exporting by full checkpoint ID. |
| cmd/entire/cli/attribution.go | Wires RefFetcher into checkpoint.Open for attribution reads. |
| cmd/entire/cli/git_operations.go | Adds FetchCheckpointRef (single-ref fetch) used by git-refs read paths. |
| cmd/entire/cli/settings/checkpoints.go | Adds env overrides ENTIRE_CHECKPOINTS_PRIMARY / ENTIRE_CHECKPOINTS_MIRRORS. |
| cmd/entire/cli/settings/checkpoints_test.go | Tests env override precedence + mirror parsing. |
| cmd/entire/cli/strategy/push_common.go | Adds helpers for batching force-push of per-checkpoint refs. |
| cmd/entire/cli/strategy/manual_commit_push.go | Adds git-refs primary pre-push path draining queue and batch force-pushing refs. |
| cmd/entire/cli/strategy/refs_push_test.go | Tests force-push batching and stale-ref partitioning behavior. |
| cmd/entire/cli/strategy/checkpoint_policy_test.go | Updates “unsupported refs” stand-in version to refs-v2. |
| cmd/entire/cli/checkpointpolicy/format.go | Marks refs-v1 as read/write supported in policy format table. |
| cmd/entire/cli/checkpointpolicy/format_test.go | Updates tests to expect refs-v1 read/write supported. |
| cmd/entire/cli/checkpointpolicy/*.go (tests) | Adjusts tests to use refs-v2 as the unsupported version. |
| cmd/entire/cli/checkpoint/registry.go | Registers git-refs backend type and adds RefFetcher to open env. |
| cmd/entire/cli/checkpoint/open.go | Adds OpenOptions.RefFetcher + PrimaryIsRefs helper and plumbs env. |
| cmd/entire/cli/checkpoint/fetching_tree.go | Defines RefFetchFunc used by git-refs backend. |
| cmd/entire/cli/checkpoint/aliases.go | Exposes CheckpointVersionRefsV1 alias. |
| cmd/entire/cli/checkpoint/persistent.go | Extracts shared tree-reading helpers usable by both git backends. |
| cmd/entire/cli/checkpoint/id/id.go | Adds Kind() and ShardFor() (ULID last-2 shard vs legacy first-2). |
| cmd/entire/cli/checkpoint/id/id_test.go | Adds ShardFor() coverage. |
| cmd/entire/cli/checkpoint/refs_naming.go | Implements RefName + ParseRef for per-checkpoint refs. |
| cmd/entire/cli/checkpoint/refs_naming_test.go | Tests ref naming/parsing and shard validation. |
| cmd/entire/cli/checkpoint/refs_store.go | Implements gitRefsStore (persistent store over per-checkpoint refs) incl. on-demand ref fetch + list. |
| cmd/entire/cli/checkpoint/refs_store_test.go | Unit tests for write/read variants, sharding, history, on-demand fetch, queue enqueue. |
| cmd/entire/cli/checkpoint/refs_store_seam_test.go | Seam test for git-refs primary + git-branch mirror topology. |
| cmd/entire/cli/checkpoint/pushqueue.go | Adds flock-protected JSONL push queue stored in git common dir. |
| cmd/entire/cli/checkpoint/pushqueue_test.go | Tests queue enqueue/drain/remove semantics and malformed line skipping. |
| api/checkpoint/errors.go | Adds CheckpointVersionRefsV1 constant (refs-v1). |
| .github/workflows/e2e-checkpoint-store.yml | Adds workflow to run E2E against either checkpoint backend. |

Comment thread e2e/testutil/backend.go
Comment thread cmd/entire/cli/checkpoint/refs_store.go
Comment thread cmd/entire/cli/checkpoint/refs_store.go
Comment thread cmd/entire/cli/checkpoint/pushqueue.go
Comment thread cmd/entire/cli/strategy/push_common.go
Base automatically changed from feat/checkpoint-treewriter to main June 30, 2026 09:28
huangyingting pushed a commit to repomesh/cli that referenced this pull request Jun 30, 2026
A workflow_dispatch workflow to run the e2e suite against a chosen checkpoint
backend (git-branch or git-refs) for a chosen agent. Split into its own PR so it
lands on the default branch and becomes dispatchable: workflow_dispatch only
shows the "Run workflow" button once the file is on the default branch.

Dispatch-only — merging triggers nothing automatically, so it cannot affect CI
or other workflows. Selecting git-refs is meaningful once the git-refs backend
lands (entireio#1566); until then dispatch it with ref set to that branch to exercise the
backend pre-merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d1f4f04600b5
@Soph Soph force-pushed the feat/checkpoint-git-refs branch from 6eadee1 to a135b5d Compare June 30, 2026 10:39
@Soph Soph force-pushed the feat/checkpoint-git-refs branch from e6934f4 to 0d500bc Compare June 30, 2026 17:54
Soph and others added 16 commits July 1, 2026 15:56
First slice of the per-checkpoint git-ref checkpoint store (#1471), on top of the
merged understanding layer. Pure naming/resolution; nothing constructs a ref
store yet.

- id: re-add CheckpointID.ShardFor (first-2 chars for legacy hex, last-2 for a
  ULID so the random suffix spreads evenly while the ID stays sortable) and the
  CheckpointID.Kind() method. These were deferred out of the understanding-layer
  PR (#1546) because sharding is a storage concern; they land here with their
  consumer.
- checkpoint: RefName builds refs/entire/checkpoints/<shard>/<id>; ParseRef
  inverts it, rejecting a mismatched shard or extra path segments so a malformed
  ref never resolves to the wrong bucket.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 53f11b080581
Second slice of #1471: a git-backed store that keeps one commit per
refs/entire/checkpoints/<shard>/<id> (tree root = checkpoint contents), sharing
the treeWriter subtree core with the git-branch store.

- refs_store.go: gitRefsStore implements PersistentStore over per-checkpoint refs
  (orphan-then-parented history, stamps refs-v1, no Vercel merge, List enumerates
  local refs, optional AuthorReader). Reads resolve a ref → commit tree and use
  shared read helpers; writes build the subtree via the embedded *treeWriter.
- Share the tree-read helpers: the git-branch reader's Read/ReadSession* now
  delegate to readSummaryFromCheckpointTree / readSession*FromTree free functions
  (no behavior change) so both backends read the same way, just navigating to the
  tree differently.
- registry: BackendTypeGitRefs ("git-refs") registered built-in with gitBacked:true
  + gitRefsBackendFactory; OpenEnv/OpenOptions gain RefFetcher; PrimaryIsRefs helper.
- pushqueue.go: flock-protected JSONL push-discovery queue in the git common dir
  (Enqueue/Drain/Remove); gitRefsStore.setRef enqueues best-effort. (Pre-push
  consumption lands in the next commit.)
- CheckpointVersionRefsV1 = "refs-v1".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: a4baf5553823
When the configured primary is git-refs, PrePush drains the push-discovery queue
and batch force-pushes those per-checkpoint refs (one git push, +ref:ref force
refspecs — independent histories, no fetch+rebase recovery) instead of pushing
the v1 branch. Default config (git-branch) is unchanged.

- partitionLocalRefs drops stale queue entries (refs deleted locally); surviving
  refs are removed from the queue only on a confirmed push, so a failed/transient
  push leaves them for the next pre-push and never blocks the user's git push.
- Shared post-push shadow cleanup extracted to cleanupPushedShadowBranches.

A configured git-branch mirror's v1 ref is not pushed here yet (downgrade-safety
mirror push is later), and OPF stays descoped for git-refs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7ca5586da108
A git-refs primary previously read only checkpoint refs already present locally,
so checkpoints written on another machine (or not yet fetched after clone) read
as not-found. FetchCheckpointRef fetches one ref via the checkpoint remote into
the local ref of the same name; it's wired into the read commands that already
inject FetchBlobsByHash (resume, explain, attribution, tokens profile) via the
new OpenOptions.RefFetcher. gitRefsStore.resolveRefMaybeFetch invokes it once on
a local miss and retries; a failed fetch resolves to ErrCheckpointNotFound. The
git-branch backend ignores it; strategy-side Opens (condensation/rewind) read
just-written local refs and pass no fetcher.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 696096934b50
- checkpointpolicy: refs-v1 is now read- AND write-supported (refsV1Format added
  to both readFormats and writeFormats) so explain/resume/attach accept git-refs
  checkpoints and the git-refs store may write them. Tests that used "refs-v1" as
  a stand-in *unsupported* version now use "refs-v2" (the next, genuinely
  unsupported refs major); format_test asserts refs-v1 is supported.
- explain: on a fresh clone the prefix→ID remote fallback only fetched the v1
  metadata branch (empty under git-refs). When the primary is git-refs and the
  prefix is a full checkpoint ID (the Entire-Checkpoint trailer always is), fetch
  that one ref via FetchCheckpointRef and re-list.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 291fe479c0a8
- settings: LoadCheckpointsConfig honors ENTIRE_CHECKPOINTS_PRIMARY (+ comma-
  separated ENTIRE_CHECKPOINTS_MIRRORS), env-wins-over-file, matching the other
  ENTIRE_* overrides. Backend selection only.
- e2e: make the harness backend-aware (e2e/testutil/backend.go: checkpointStoreMode,
  CheckpointState advance digest, checkpointBlobSpec/checkpointRefName/checkpointShard).
  Advance detection, metadata reads, CheckpointIDs, push, and artifact capture route
  through it; the v1-branch-specific alternate-object sync test skips under git-refs.
  AssertCheckpointIDFormat keeps the production Validate (hex or ULID).
- e2e TestMain maps E2E_CHECKPOINT_STORE (git-branch default, or git-refs) to the
  ENTIRE_CHECKPOINTS_PRIMARY override so every spawned binary/hook uses it.
- CI: e2e-checkpoint-store workflow with a checkpoint_store input; e2e/README doc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d9646529ed6e
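
The env-wins-over-file precedence described in the settings bullet above can be sketched roughly as below. The checkpointsConfig type and applyEnvOverrides function are hypothetical stand-ins (the real LoadCheckpointsConfig is not shown here), and the whitespace trimming and the handling of empty values are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// checkpointsConfig is a hypothetical stand-in for the loaded settings.
type checkpointsConfig struct {
	Primary string
	Mirrors []string
}

// applyEnvOverrides returns cfg with environment values taking precedence
// over file values. getenv is injected so the sketch can be exercised
// without touching the process environment.
func applyEnvOverrides(cfg checkpointsConfig, getenv func(string) string) checkpointsConfig {
	if p := strings.TrimSpace(getenv("ENTIRE_CHECKPOINTS_PRIMARY")); p != "" {
		cfg.Primary = p
	}
	if raw := strings.TrimSpace(getenv("ENTIRE_CHECKPOINTS_MIRRORS")); raw != "" {
		var mirrors []string
		// Mirrors are comma-separated; empty items are dropped (assumption).
		for _, m := range strings.Split(raw, ",") {
			if m = strings.TrimSpace(m); m != "" {
				mirrors = append(mirrors, m)
			}
		}
		cfg.Mirrors = mirrors
	}
	return cfg
}

func main() {
	env := map[string]string{
		"ENTIRE_CHECKPOINTS_PRIMARY": "git-refs",
		"ENTIRE_CHECKPOINTS_MIRRORS": "git-branch",
	}
	file := checkpointsConfig{Primary: "git-branch"}
	got := applyEnvOverrides(file, func(k string) string { return env[k] })
	fmt.Printf("%+v\n", got)
}
```

With both variables set as in main, the result is the parallel-rollout shape (git-refs primary, git-branch mirror) regardless of what the settings file says.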
refs-v1 is now write-supported (refsV1Format in writeFormats), so the strategy
condensation/pre-push policy tests that used "refs-v1" to mean a write-unsupported
version must use "refs-v2" (the next, genuinely unsupported refs major). Same
flip already applied to the cli + checkpointpolicy test suites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 3bf9d245dbca
ci.yml's test-canary job now runs as a matrix over checkpoint_store
[git-branch, git-refs], setting E2E_CHECKPOINT_STORE. The Vogon canary makes no
API calls, so this guards the git-refs store on every PR at no cost alongside the
default git-branch run. fail-fast: false so one backend's failure doesn't mask
the other.

(The e2e-checkpoint-store dispatch workflow already landed separately on main in
#1567; this branch leaves it untouched.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 917c38caf636
There is no server-side ref protection, so force-pushing per-checkpoint refs by
default risks a buggy or racing client silently clobbering good remote history.
Switch batchForcePushRefs -> batchPushRefs using a plain ref:ref refspec
(fast-forward-only): per-checkpoint refs normally advance by fast-forward so the
common case still succeeds, while a genuine non-fast-forward divergence (e.g. the
same checkpoint written differently elsewhere) is REJECTED rather than
overwritten. On rejection the pre-push logs and leaves the refs queued (not
overwritten); reconciling a diverged ref is deferred, and a future rewrite path
(e.g. OPF) can use --force-with-lease where it must replace a ref.

Tests: replace the force-overwrite assertion with AllowsFastForward (a descendant
update pushes without force) and RejectsNonFastForward (an orphan/divergent
update errors and leaves the remote ref unchanged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d3e6c662ead3
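
The difference between the old and new push behavior comes down to the refspec prefix. A minimal sketch, assuming a hypothetical refspecs helper (the real batchPushRefs signature is not shown here): a leading "+" tells git to allow a non-fast-forward update, while a plain ref:ref refspec is rejected when the update is not a fast-forward.

```go
package main

import "fmt"

// refspecs returns one push refspec per ref. With force=false the refspec is
// plain "ref:ref", so git refuses a non-fast-forward update; with force=true
// the "+" prefix lets git overwrite the remote ref.
func refspecs(refs []string, force bool) []string {
	out := make([]string, 0, len(refs))
	for _, r := range refs {
		spec := r + ":" + r
		if force {
			spec = "+" + spec
		}
		out = append(out, spec)
	}
	return out
}

func main() {
	refs := []string{"refs/entire/checkpoints/b5/d1f4f04600b5"}
	fmt.Println(refspecs(refs, false)) // fast-forward-only form
	fmt.Println(refspecs(refs, true))  // force form, no longer used by default
}
```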
RefName previously returned refs/entire/checkpoints// for an empty/invalid
checkpoint ID, and callers built on the convention that the ID was already
validated. Make RefName return (plumbing.ReferenceName, error), erroring when the
ID is empty or an unrecognized format, so a bad ID can't silently become a
malformed ref that gets pushed/fetched/looked up. Store call sites
(refBase/setRef/resolveRefMaybeFetch/GetCheckpointAuthor) and the explain-on-clone
fetch propagate the error; tests use a mustRefName helper for known-valid IDs and
add a RefName_RejectsInvalidID case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 2bd3564e62eb
…it queued

Build on the non-force push: when a per-checkpoint ref is rejected (it diverged
on the remote — the same checkpoint re-written elsewhere), instead of just
leaving it queued, fetch the remote ref and replay the local-only commits on top
via the existing fetchAndRebaseRefCommon, then retry. The retry stays non-force
(after the replay the local ref is a fast-forward over the remote), so the
remote's commit is preserved as an ancestor, never overwritten. The cherry-pick
is delta-based, so non-overlapping changes merge; a genuine overlap (both sides
rewrote the same file) surfaces as a rebase error and the ref is left for a later
pre-push — degrading to the previous safe behavior, never to a force overwrite.

pre-push keeps the batch push as the fast path (one round-trip when everything
fast-forwards) and falls back to this per-ref recovery only for rejected refs,
removing from the queue just those that land. Adds a test proving a diverged ref
ends up with both the remote-only and local-only changes after recovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 5ac4c261b57a
…icy)

Altitude follow-up to the non-force checkpoint push: tryPushRefCommon (the v1-era
per-ref pusher behind doPushRef) still force-pushed non-branch refs (+ref:ref),
which contradicts the new fast-forward-only policy for per-checkpoint refs and was
a latent footgun — a future caller routing a checkpoint ref through doPushRef
would silently overwrite a divergence. The non-branch path has no production
caller today (the v1 PrePush loop only pushes the v1 branch), so dropping the
force is safe and unifies the policy: every checkpoint ref, branch or
per-checkpoint, is fast-forward-only with doPushRef's fetch+rebase recovery on
divergence.

Also clarify (comment) that explain_export's RefName-error branch is defensive —
cid is already validated by NewCheckpointID, so it can't fire — rather than a
swallowed live error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 89a2a991d773
The skip message said the git-refs store force-pushes per-checkpoint refs, but
git-refs switched to non-force fast-forward pushes with fetch+replay recovery.
The test is still git-branch-specific (it exercises the v1 branch's rebase-sync
path over an object alternate, which git-refs has no equivalent of); only the
reason text needed correcting.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 84e64cc670b3
The git-refs pre-push path was switched to fast-forward-only pushes with
fetch+replay recovery; the doc comment still said "batch force-pushes". Wording
only — no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f758baeca44c
…y on pre-push

Addresses the open review threads on the git-refs store.

Error handling (was: any ref-resolution error treated as "missing"):
- refBase: only a plumbing.ErrReferenceNotFound starts a new orphan checkpoint;
  a real lookup error (IO/corruption) is surfaced instead of silently
  overwriting the ref's history.
- resolveRefMaybeFetch: a failed on-demand fetch (offline, network, ctx
  cancellation) now returns the real error; only a genuinely absent ref (or a
  successful fetch that finds nothing) resolves to not-found.
- checkpointTree / Read: distinguish ErrCheckpointNotFound (→ nil summary) from
  real commit/tree/fetch errors, which now propagate instead of reading as
  "checkpoint doesn't exist" (which risked silent data loss).
- partitionLocalRefs: a transient/IO error looking up a ref keeps it as pushable
  (retried next pre-push) instead of dropping it from the queue as stale.

Pre-push policy (was: git-refs skipped the check the v1 path runs):
- prePushCheckpointRefs now calls syncCheckpointPolicyForPrePush first; a
  diverged or unsupported-format checkpoint policy skips the ref push (leaving
  refs queued), matching the v1 branch path. Policy governs checkpoint format
  compatibility, which is independent of the storage backend.

Updates the fetch-failure test to assert the corrected contract (error
propagates; genuine absence still reads as not-found).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7c18df0a2b25
…or both formats)

ShardFor previously branched on the ID Kind — legacy hex on the first two chars,
ULID on the last two — which meant the ref shard depended on format detection and
could be computed inconsistently by an independent reimplementation (the e2e
helper sharded any 26-char string as a ULID, diverging from production's strict
check for a 26-char non-ULID).

Collapse it to a single positional rule: the last two characters, for both legacy
12-hex IDs and 26-char ULIDs. Trailing chars are uniformly random in both formats
(a ULID's leading chars are its timestamp), so distribution stays even, and there
is no Kind branch to get wrong. RefName still rejects KindUnknown IDs, so both
supported formats map to a proper ref while invalid ones are refused.

This only affects the git-refs ref namespace (no production data yet); the
entire/checkpoints/v1 branch tree keeps its independent first-two Path() layout.
The e2e checkpointShard helper is simplified to the same last-two rule so it
can't diverge from production. Updates the affected shard expectations in the id,
refs-naming, and refs-store tests.

Resolves the git-refs sharding review thread (e2e/testutil/backend.go).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d17074281105
pjbgf
pjbgf previously approved these changes Jul 1, 2026
Two integration points shifted on main during the rebase:

- syncCheckpointPolicyForPrePush was refactored to a void sync (taking the repo)
  plus a separate checkpointPolicyAllowsGitHook(ctx, repo) decision. Adapt the
  git-refs pre-push path to the new two-call shape (open repo → sync → allow-gate)
  instead of the old bool-returning signature.

- main added several tests that use "refs-v1" as the canonical unsupported /
  future checkpoint_version sentinel. This branch makes refs-v1 a *supported*
  read/write format, so those sentinels must move to the next unsupported
  version, refs-v2: the CLI policy-print, explain-reject, agent-hook-skip, and
  the checkpointpolicy CanSatisfy / UnsupportedPolicyMessage tests. Also renamed
  the export test helper rewriteExportCheckpointVersionToRefsV1 →
  ...ToRefsV2 to match the version it already writes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 602319d6087b
@Soph Soph force-pushed the feat/checkpoint-git-refs branch from 02e2f08 to 7816aae Compare July 1, 2026 15:56
Enqueue only appends, so a long-lived session that re-enqueues the same
checkpoint ref across many writes without pushing would grow the queue file
unboundedly — Remove was the only point that rewrote it. Drain already
de-duplicates in memory; now it also rewrites the file to that de-duplicated set
when the on-disk queue held redundant lines (duplicate refs or malformed/blank
records), bounding the file to one line per distinct queued ref.

readLocked reports the raw non-empty line count so Drain compacts only when
rawLines > len(refs) (i.e. there was actually something redundant). The rewrite
is factored into rewriteLocked, shared with Remove, and stays atomic (temp +
rename). Drain still returns the refs and does not clear them — they survive
until a confirmed Remove.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 5608d53754d2
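
The compaction rule above (rewrite only when rawLines > len(refs)) can be sketched on in-memory content. The queueRecord shape with a single "ref" field is a hypothetical schema, and readQueue and needsCompaction are illustrative names; locking and the atomic temp + rename rewrite are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// queueRecord is a hypothetical JSONL record shape; the real schema is not
// shown in this PR.
type queueRecord struct {
	Ref string `json:"ref"`
}

// readQueue parses JSONL content, skipping blank and malformed lines and
// de-duplicating refs in first-seen order. rawLines counts non-empty lines,
// including malformed ones.
func readQueue(content string) (refs []string, rawLines int) {
	seen := make(map[string]bool)
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		rawLines++
		var rec queueRecord
		if err := json.Unmarshal([]byte(line), &rec); err != nil || rec.Ref == "" {
			continue // malformed record: skipped, but still counted as raw
		}
		if !seen[rec.Ref] {
			seen[rec.Ref] = true
			refs = append(refs, rec.Ref)
		}
	}
	return refs, rawLines
}

// needsCompaction reports whether the file held something redundant
// (duplicate or malformed lines) and is worth rewriting.
func needsCompaction(rawLines int, refs []string) bool {
	return rawLines > len(refs)
}

func main() {
	content := "{\"ref\":\"a\"}\n{\"ref\":\"a\"}\nnot json\n\n{\"ref\":\"b\"}\n"
	refs, raw := readQueue(content)
	fmt.Println(refs, raw, needsCompaction(raw, refs))
}
```

A queue that already holds one well-formed line per distinct ref yields rawLines == len(refs), so Drain would leave the file untouched in that case.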
@Soph Soph merged commit 23dbc4c into main Jul 2, 2026
12 checks passed
@Soph Soph deleted the feat/checkpoint-git-refs branch July 2, 2026 11:36