
feat: bare-clone the data repo, eliminate the working tree (closes #85) #86

Open
themightychris wants to merge 9 commits into main from feat/bare-data-repo



@themightychris
Member

Summary

The pod was carrying a working tree it never read. gitsheets reads/writes via hologit's tree-object interface, so the checked-out files were a redundant on-disk copy and an active CPU cost — hologit's getWorkspace() hashes the entire working tree on every workspace construction when one is set (node_modules/hologit/lib/Repo.js:86-97).

Switching to a bare clone removes both. The pod re-clones from origin on every restart (objects-only, ~50 MB), which is cheap, and the emptyDir volume sidesteps the PVC Multi-Attach errors we hit during failover.

What landed

  • Spec: specs/behaviors/storage.md now declares the bare-clone invariant; three commitments enumerated (openRepo with no workTree, reconcile via plumbing, emptyDir-backed volume).
  • openPublicStore: opens with { gitDir: repoPath } (no workTree); fails loudly on non-bare paths with a remediation message pointing at the spec.
  • Reconcile rewrite: apps/api/src/store/reconcile.ts swaps three working-tree-touching commands for plumbing equivalents:
    • git merge --ff-only → git update-ref refs/heads/<branch> <new> <old> (CAS-protected)
    • git rebase → per-commit replay loop: git merge-tree --write-tree --merge-base=<C^> <newTip> <C> then git commit-tree, preserving author/message metadata and resetting committer to the API's pseudonymous identity (same as today's git rebase behavior)
    • git reset --hard (escape hatch) → git update-ref
  • Reconcile tests: all 7 cases (in-sync, fast-forwarded, pushed-ahead, rebased, conflict-escaped, fetch-failed, refspec regression) pass against bare clones. The conflict-escape's "no half-rebase left behind" assertion holds naturally with plumbing.
  • Entrypoint: git clone --bare --branch $CFP_DATA_BRANCH; bare marker is objects/ at the path root; refuses to operate against a stray non-bare clone with a clear remediation message.
  • Kustomize: data volume swapped from PVC to emptyDir{}; pvc-data.yaml deleted; kustomization.yaml drops the reference.
  • Test helpers: createTestRepo / createFullDataRepo build bare gitdirs + use a transient working-tree clone to seed initial commits via push. New seedRawToml helper (in seed-fixtures.ts) centralizes the transient-clone-push pattern for ad-hoc fixture seeds; it replaces identical-shape blocks in the account-claim, auth, github-oauth, and saml tests.
  • Importer + scrub-data: both updated for bare semantics (importer's ensureBranchCheckedOut uses update-ref + symbolic-ref; scrub-data copies .gitsheets/ configs via git ls-tree + git show instead of filesystem readdir).
  • Docs: docs/operations/deploy.md boot sequence + env table, docs/operations/runbook.md recovery procedure (drop the delete-PVC step), .claude/CLAUDE.md Local setup section.
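
For reviewers less familiar with the compare-and-swap form of update-ref mentioned in the reconcile item, here is a minimal shell sketch of a working-tree-free fast-forward. The helper name ff_branch and its arguments are hypothetical; this is an illustration of the technique, not the code in reconcile.ts.

```shell
# Sketch: fast-forward a branch in a bare repo without touching a working tree.
# ff_branch is a hypothetical name used only for this illustration.
ff_branch() {
  git_dir=$1 branch=$2 new=$3
  old=$(git --git-dir="$git_dir" rev-parse --verify "refs/heads/$branch") || return 1
  # Only a true fast-forward is allowed: the current tip must be an ancestor of <new>.
  git --git-dir="$git_dir" merge-base --is-ancestor "$old" "$new" || return 1
  # Passing <old> as the third argument makes the update compare-and-swap:
  # git refuses to move the ref if it no longer points at <old>.
  git --git-dir="$git_dir" update-ref "refs/heads/$branch" "$new" "$old"
}
```

Called as ff_branch /path/to/data.git main <remote-commit>, it either moves the ref or returns non-zero, so a concurrent writer shows up as an error instead of a clobbered ref.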

All 241 API tests pass. type-check and lint clean.

Test plan

  • npm run -w packages/shared build && npm run type-check && npm run lint
  • npm run -w apps/api test — 241/241 pass
  • npm run -w packages/shared test — 53/53
  • npm run -w apps/web test — 30/30
  • kubectl kustomize deploy/kustomize/overlays/sandbox renders cleanly; data volume is emptyDir: {} and no codeforphilly-data PVC is emitted
  • Sandbox smoke (deferred to deploy) — apply manifests; pod boots; /api/health/ready reaches 200; git-pod-uploadpack.sh operator helper still fetches
  • Hot-reload webhook (deferred to deploy) — push to published; pod logs the short-circuit or rebuild line; site reflects the change
  • Boot-time measurement (deferred to deploy) — capture first-boot git clone --bare duration on the live remote; should be <30s for our current data size

Closes #85.

🤖 Generated with Claude Code

themightychris and others added 9 commits May 20, 2026 10:42
Eliminates the working-tree the pod is carrying for nothing. Bare
clone everywhere on the server side. The non-trivial bit is the
reconcile rebase rewrite — per-commit replay via git merge-tree
--write-tree + git commit-tree, preserving today's linear-history
"local on top of remote" semantics.

Closes #85.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per discussion: gitsheets struggles with mixed-mode (API advancing
HEAD while a co-located working tree drifts) — exactly the staleness
class we documented in #47. Going bare on prod alone keeps the
footgun pointed at dev machines, so make bare the only mode.

Local dev workflow: contributors clone --bare (matches prod), then
optionally clone-from-the-bare for a working-tree to browse/edit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
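
The local dev workflow described in this commit can be sketched roughly as below. The function name, directory layout, and default branch are placeholders for illustration; the actual bootstrap snippet lives in the spec and CLAUDE.md and may differ.

```shell
# Sketch of the two-clone dev setup. setup_dev_clones and its arguments
# are hypothetical; substitute the real remote URL and paths.
setup_dev_clones() {
  remote=$1 bare_dir=$2 browse_dir=$3 branch=${4:-main}
  # 1. Bare clone: this is the path the API is pointed at (matches prod).
  git clone --quiet --bare --branch "$branch" "$remote" "$bare_dir" || return 1
  # 2. Optional working-tree clone of the bare repo, for browsing and editing.
  #    Changes made here reach the API's repo only via an explicit push.
  git clone --quiet "$bare_dir" "$browse_dir"
}
```

Because the browsing clone is a separate repository, nothing the API commits can leave its files stale in place; a contributor runs git pull there to see new data.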
The API always operates against a bare clone — no working tree on the
server side, locally or in production. gitsheets reads/writes via
hologit's tree-object interface; a checked-out working tree is
redundant data on disk and an active drift hazard (hologit hashes the
working tree on every getWorkspace call when one is present).

Section calls out the three plumbing-side commitments: openRepo with
no workTree, reconcile via update-ref/merge-tree/commit-tree, and
emptyDir-backed data volume.

Dev bootstrap snippet now uses git clone --bare; contributors browse
data through a second clone-from-the-bare working-tree clone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
repoPath is the bare gitdir now (no .git subdirectory). workTree is
omitted from the openRepo call so hologit takes the
createWorkspaceFromRef(HEAD) path instead of hashing a working tree
on every workspace construction.

A non-bare repoPath fails loudly at boot with a remediation message
pointing at the new spec section, so misconfiguration shows up at
startup rather than as runtime drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three working-tree-touching commands in the reconcile state machine
swapped for plumbing equivalents:

  git merge --ff-only <remote>   →  git update-ref refs/heads/<branch>
                                     <remote-commit> <old-commit>
  git rebase <remote>            →  per-commit replay via
                                     git merge-tree --write-tree +
                                     git commit-tree, then update-ref
  git reset --hard <remote>      →  git update-ref refs/heads/<branch>
                                     <remote-commit>

The replay path (replayLocalOntoRemote) preserves the linear-history
"local on top of remote" semantics of today's git rebase: original
author/message metadata carries forward; committer is rewritten to the
API's pseudonymous identity (matching rebase's preserve-author /
reset-committer behavior); a merge-tree conflict throws a typed
RebaseReplayConflictError which the diverged-branch handler catches and
routes to the escape hatch (preserve pre-replay HEAD on
conflicts/<UTC>, fast-forward local to remote).

CAS-protected update-refs throughout — racing in-process transacts
surface as errors rather than silent ref clobbers.

Tests: bare-clone the local rig instead of working-tree clone; replace
direct commitFile() with a commitToBare() helper that mints commits via
a transient working-tree clone (mirroring the existing advanceRemote
helper, now a thin wrapper alongside the new advanceLocal). The
single-branch-clone regression test generalizes to "bare-clone leaves
refs/remotes/origin/main unpopulated" — same underlying defense
(explicit refspec on fetch) verified.

All 7 reconcile state-machine tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
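
As a rough illustration of the replay path described in this commit, the shell sketch below rebases local commits onto a new base inside a bare repo. It is not replayLocalOntoRemote itself: replay_onto is a hypothetical name, it assumes linear local history (no merge commits), it takes the committer identity from the caller's environment or config, and it needs a Git whose merge-tree supports --write-tree and --merge-base (2.40 or later, to my knowledge).

```shell
# Sketch: replay local commits onto <new_base> without a working tree.
# Runs in a subshell so the exported variables do not leak to the caller.
replay_onto() (
  export GIT_DIR="$1"
  branch=$2 new_base=$3
  old_tip=$(git rev-parse --verify "refs/heads/$branch") || exit 1
  base=$(git merge-base "$old_tip" "$new_base") || exit 1
  tip=$new_base
  for c in $(git rev-list --reverse "$base..$old_tip"); do
    # Three-way merge of commit C onto the running tip, with C's parent as base.
    # A non-zero exit means a conflict (or an error); nothing has been written
    # to any ref at this point, so there is no half-finished state to clean up.
    out=$(git merge-tree --write-tree --merge-base="$c^" "$tip" "$c") || exit 2
    tree=$(printf '%s\n' "$out" | head -n 1)
    # Carry the original author forward; the committer is whoever runs this.
    GIT_AUTHOR_NAME=$(git show -s --format=%an "$c")
    GIT_AUTHOR_EMAIL=$(git show -s --format=%ae "$c")
    GIT_AUTHOR_DATE=$(git show -s --format=%aI "$c")
    export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
    # Reuse the raw commit message: everything after the first blank line.
    tip=$(git cat-file commit "$c" | sed '1,/^$/d' |
          git commit-tree "$tree" -p "$tip") || exit 1
  done
  # Compare-and-swap: only move the branch if it still points at old_tip.
  git update-ref "refs/heads/$branch" "$tip" "$old_tip"
)
```

In this sketch an exit status of 2 plays the role that RebaseReplayConflictError plays in the commit message: the caller can treat it as the signal to take the escape hatch.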
entrypoint.sh:
  - First-boot clones with `git clone --bare --branch $CFP_DATA_BRANCH`
  - Marker for "is a bare repo" is the presence of `objects/` at the path
    root (a non-bare clone has `.git/objects/` instead). Refuses to boot
    against a non-bare clone left over from earlier builds with a clear
    remediation message.
  - Git operations against the bare repo all use --git-dir explicitly;
    no working tree to `cd` into anymore.
  - Comments updated for the bare-repo invariant per
    specs/behaviors/storage.md.
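
A minimal sketch of the first-boot decision described above, assuming the marker rule as stated (objects/ at the path root means bare, .git/ means a leftover non-bare clone). The function name, arguments, and message wording are hypothetical and not copied from entrypoint.sh.

```shell
# Sketch: reuse an existing bare clone, refuse a non-bare one, else clone bare.
# ensure_bare_clone is a hypothetical name for this illustration.
ensure_bare_clone() {
  repo_path=$1 remote=$2 branch=$3
  if [ -d "$repo_path/objects" ]; then
    return 0   # bare marker present: nothing to do
  fi
  if [ -d "$repo_path/.git" ]; then
    echo "refusing to boot: $repo_path is a non-bare clone; remove it and restart" >&2
    return 1
  fi
  # An empty directory (such as a fresh emptyDir mount) is a valid clone target.
  git clone --quiet --bare --branch "$branch" "$remote" "$repo_path"
}
```

Later commands then address the repo with git --git-dir="$repo_path" instead of changing into a checkout.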

deploy/kustomize/base/:
  - deployment.yaml volume `data` switches from PVC to emptyDir{}. Bare
    re-clone on every pod start is cheap (objects-only); persisting
    across restarts buys nothing because the bare clone is fully
    recoverable from the git remote.
  - pvc-data.yaml deleted; kustomization.yaml drops the reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test helpers + scripts updated to match the bare-repo invariant from
openPublicStore. createTestRepo / createFullDataRepo now build the
bare gitdir + use a transient working-tree clone to seed initial
commits via push, then discard the working tree. seedFixtures opens
the bare directly via openRepo({ gitDir }).

New seedRawToml helper (apps/api/tests/helpers/seed-fixtures.ts) wraps
the transient-clone-push dance for ad-hoc TOML seeds — replaces the
identical-shape git-add-commit blocks that lived in account-claim,
auth, github-oauth, and saml tests.
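
The transient-clone-push pattern can be illustrated with a shell analogue. This is not seedRawToml (which is TypeScript); seed_file and its parameters are hypothetical, and the sketch assumes a committer identity is available from the environment or git config.

```shell
# Sketch: add one file to a bare repo via a throwaway working-tree clone.
# seed_file is a hypothetical name for this illustration.
seed_file() {
  bare=$1 branch=$2 path=$3 content=$4 message=$5
  tmp=$(mktemp -d) || return 1
  (
    git clone --quiet "$bare" "$tmp/wt" &&
    cd "$tmp/wt" &&
    mkdir -p "$(dirname "$path")" &&
    printf '%s\n' "$content" > "$path" &&
    git add -- "$path" &&
    git commit --quiet -m "$message" &&
    # Push to an explicit ref so the result does not depend on the
    # clone's local branch name (relevant when the bare repo is empty).
    git push --quiet origin "HEAD:refs/heads/$branch"
  )
  status=$?
  rm -rf "$tmp"   # the working tree is discarded either way
  return "$status"
}
```

For example, seed_file "$repo" main people/jane.toml 'name = "Jane"' 'seed jane' leaves the bare repo with one new commit and no checked-out files.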

internal-reload + import-laddr test rigs: local clone is --bare;
import-laddr's ensureBranchCheckedOut uses update-ref + symbolic-ref
instead of git checkout. internal-reload's post-reload sanity check
reads the file via `git show HEAD:<path>` instead of filesystem.
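
One way to picture the update-ref + symbolic-ref replacement for git checkout is the sketch below. The real ensureBranchCheckedOut may behave differently (for example in how it treats an existing branch); point_head_at_branch is a hypothetical name and the create-if-missing rule is an assumption.

```shell
# Sketch: make <branch> the current branch of a bare repo.
# point_head_at_branch is a hypothetical name for this illustration.
point_head_at_branch() {
  git_dir=$1 branch=$2 start=$3
  # Create the branch at <start> only if it does not exist yet (assumed rule).
  if ! git --git-dir="$git_dir" show-ref --verify --quiet "refs/heads/$branch"; then
    git --git-dir="$git_dir" update-ref "refs/heads/$branch" "$start" || return 1
  fi
  # Re-point HEAD. With no working tree there are no files to update.
  git --git-dir="$git_dir" symbolic-ref HEAD "refs/heads/$branch"
}
```

File contents are then read straight from the object store, for example with git --git-dir="$git_dir" show HEAD:<path>.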

scrub-data: source repo opens bare; .gitsheets configs copied to the
target via `git ls-tree` + `git show` instead of filesystem readdir.
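
The ls-tree + show approach can be sketched in shell as follows. This is an illustration, not the scrub-data implementation: export_tree_path and its arguments are hypothetical, and the sketch assumes ordinary file names (no newlines or characters that git would quote).

```shell
# Sketch: copy every file under <prefix> at <rev> out of a bare repo.
# export_tree_path is a hypothetical name for this illustration.
export_tree_path() {
  git_dir=$1 rev=$2 prefix=$3 dest=$4
  # List blob paths from the tree object; no checkout is involved.
  paths=$(git --git-dir="$git_dir" ls-tree -r --name-only "$rev" "$prefix") || return 1
  printf '%s\n' "$paths" |
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    mkdir -p "$dest/$(dirname "$path")" || exit 1
    # <rev>:<path> addresses the blob directly in the object store.
    git --git-dir="$git_dir" show "$rev:$path" > "$dest/$path" || exit 1
  done
}
```

For example, export_tree_path /path/to/source.git HEAD .gitsheets /tmp/out writes the config files under /tmp/out/.gitsheets/.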

All 241 API tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docs/operations/deploy.md — boot sequence describes git clone --bare,
  reconcile state machine now references update-ref / merge-tree /
  commit-tree plumbing. Env table note for CFP_DATA_REPO_PATH calls
  out emptyDir + re-clone-on-boot.
- docs/operations/runbook.md — recovery for "API won't boot" drops the
  delete-PVC step (no PVC for data). The "Drop into the pod" snippet
  inspects the bare repo via git --git-dir / `git show HEAD:.gitsheets`.
- .claude/CLAUDE.md — Local setup section: contributors clone --bare;
  optional second clone for a working-tree browser; Pod boot bullet
  reflects bare clone + emptyDir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closeout: flip status to done, tick the validations that completed
locally, mark the four sandbox-side validations as deferred to deploy,
fill Notes (eight-commit shape + the hologit hashWorkTree surprise +
the test-scaffolding lift) and Follow-ups (deploy smoke, boot-time
measurement, onboarding polish, optional spec note about the
hashWorkTree cost we side-stepped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

infra: eliminate working tree — switch data-repo clone to bare

1 participant