
fix(push-script): fetch PR HEAD sha before git worktree add (closes #966) #972

Open
joelteply wants to merge 1 commit into main from fix/rebuild-stale-worktree-fetch

@joelteply
Contributor

Summary

  • rebuild-stale-arm64 (and -amd64 whenever it actually fires) on PR builds was hitting fatal: invalid reference: <sha> at git worktree add time, despite STARTUP_SHA_FULL being correctly resolved from .pull_request.head.sha.
  • Root cause: actions/checkout@v4 shallow-clones refs/pull/<N>/merge, so the PR head commit is known as a remote ref but NOT as a local object. git worktree add --detach <DIR> <SHA> requires the commit to be locally available.
  • Fix: gate worktree add on git cat-file -e; if the sha isn't local, fetch it from origin (shallow first, full-history fallback). One small new block in scripts/push-current-arch.sh:209-220.

Empirical hit

PR #950 run 24927483127:

→ STARTUP_SHA_FULL resolved via GITHUB_EVENT_PATH .pull_request.head.sha: d98777e3d54a62f01c86d64d54f993d75d84a8e3
→ Creating frozen worktree at /tmp/continuum-build-d98777e3d54a (pinned at d98777e3d54a62f01c86d64d54f993d75d84a8e3)
fatal: invalid reference: d98777e3d54a62f01c86d64d54f993d75d84a8e3
##[error]Process completed with exit code 128.

rebuild-stale-amd64 was SKIPPED on the same run because amd64 already matched HEAD (a prior native push from bigmama-wsl had aligned the revision label) — so the bug was latent for amd64 and would have fired the next time CI actually had to rebuild it.

Fix

if ! git -C "$REPO_ROOT" cat-file -e "$STARTUP_SHA_FULL^{commit}" 2>/dev/null; then
  echo "→ SHA $STARTUP_SHA_FULL not present as a local object — fetching from origin"
  git -C "$REPO_ROOT" fetch --depth 1 origin "$STARTUP_SHA_FULL" 2>/dev/null \
    || git -C "$REPO_ROOT" fetch origin "$STARTUP_SHA_FULL" 2>/dev/null \
    || { echo "ERROR: cannot fetch sha $STARTUP_SHA_FULL from origin (not a real commit, or network/auth issue)" >&2; exit 1; }
fi

The dev-machine path is unaffected: cat-file -e always succeeds on the local HEAD that git rev-parse HEAD would resolve to. The fetch only fires in CI, when actions/checkout@v4's default fetch missed the sha.
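For readers who want the guard in reusable form, here is a minimal sketch of the same check-then-fetch logic wrapped in a function. The name `ensure_commit_local` and its two positional parameters are hypothetical and do not appear in scripts/push-current-arch.sh. Unlike the block above, the sketch leaves stderr visible on the final fetch attempt; that is a choice made for illustration, not something this PR does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): ensure <sha> exists as a local
# commit object in <repo_root>, fetching it from origin if it does not.
# Usage sketch: ensure_commit_local "$REPO_ROOT" "$STARTUP_SHA_FULL" || exit 1
ensure_commit_local() {
  local repo_root="$1" sha="$2"

  # cat-file -e exits 0 only when the object exists; the ^{commit} suffix
  # additionally requires that it peels to a commit.
  if git -C "$repo_root" cat-file -e "${sha}^{commit}" 2>/dev/null; then
    return 0
  fi

  echo "→ SHA $sha not present as a local object, fetching from origin" >&2
  # Shallow fetch first (quiet), then a full-history fetch whose stderr is kept.
  git -C "$repo_root" fetch --depth 1 origin "$sha" 2>/dev/null \
    || git -C "$repo_root" fetch origin "$sha" \
    || { echo "ERROR: cannot fetch sha $sha from origin" >&2; return 1; }
}
```

Returning non-zero instead of calling `exit` lets the caller decide how to fail, which may or may not fit how the real script is structured.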

Test plan

  • bash -n scripts/push-current-arch.sh — clean (no syntax error from the new block).
  • Empirical: next CI run that has to rebuild a stale image at PR HEAD should now succeed at git worktree add (validates the fetch path). Will confirm by watching rebuild-stale-amd64 / rebuild-stale-arm64 on the next push that triggers them.
  • Local sanity: not exercised. CI's shallow checkout state cannot easily be simulated on a dev box without contortions, and the dev path stays in the cat-file-succeeds branch, so no fetch runs there.
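One way to approximate the CI state locally is sketched below. It is an assumption-laden stand-in, not part of this PR: a throwaway repository plays the role of origin, a depth-1 clone plays the role of the actions/checkout@v4 checkout, and the older of two empty commits plays the role of the PR head. The `uploadpack.allowAnySHA1InWant` setting is enabled on the stand-in origin so that fetching by sha works regardless of protocol version; all paths and variable names are illustrative.

```shell
#!/usr/bin/env bash
set -eu

work="$(mktemp -d)"
commit() {
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "$1"
}

# Stand-in origin: the first commit plays the PR head, the second the merge commit.
git init -q "$work/origin"
commit "stand-in for the PR head"
head_sha="$(git -C "$work/origin" rev-parse HEAD)"
commit "stand-in for the merge commit"
git -C "$work/origin" config uploadpack.allowAnySHA1InWant true

# Depth-1 clone; file:// is used because plain local-path clones ignore --depth.
git clone -q --depth 1 "file://$work/origin" "$work/clone"

has_commit() { git -C "$work/clone" cat-file -e "$1^{commit}" 2>/dev/null; }

if has_commit "$head_sha"; then before=present; else before=missing; fi
git -C "$work/clone" fetch -q --depth 1 origin "$head_sha"
if has_commit "$head_sha"; then after=present; else after=missing; fi

# With the object now local, a detached worktree at that sha can be created.
git -C "$work/clone" worktree add --detach "$work/wt" "$head_sha" >/dev/null 2>&1
echo "before fetch: $before; after fetch: $after"
rm -rf "$work"
```

If the simulation matches the CI behaviour described above, the commit should be reported as missing before the fetch and present after it.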

Closes

#966

Empirical hit on PR #950 run 24927483127: rebuild-stale-arm64 failed
immediately with "fatal: invalid reference: <sha>" after correctly
resolving STARTUP_SHA_FULL via .pull_request.head.sha.

Root cause: actions/checkout@v4 with default settings on a pull_request
event fetches refs/pull/<N>/merge as a shallow clone. The PR head sha
is a known remote ref but is NOT in the local object store, so
`git worktree add --detach <DIR> <SHA>` fails before it can build.

Fix: gate `git worktree add` on `git cat-file -e` and fetch the
missing commit if needed. Falls back to a full-history fetch when the
shallow fetch is rejected. Dev-machine path unchanged: cat-file -e
always succeeds on local HEAD.

Why amd64 didn't trip the bug on the same run: rebuild-stale-amd64 was
SKIPPED because amd64 already matched HEAD (a prior native push aligned
the revision label). The bug was latent for amd64 — would have fired
on any push where amd64 actually drifted and CI had to rebuild.

Closes #966.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 25, 2026 18:20
Contributor

Copilot AI left a comment

Pull request overview

Fixes CI failures in rebuild-stale-arm64/rebuild-stale-amd64 where git worktree add --detach <dir> <sha> can fail on PR builds because the PR head SHA may not exist as a local object in the checkout.

Changes:

  • Add a guard before git worktree add that checks whether STARTUP_SHA_FULL exists locally via git cat-file -e.
  • If missing, fetch the SHA from origin (shallow first, then full fetch fallback) before creating the detached worktree.

Comment on lines +220 to +222
  git -C "$REPO_ROOT" fetch --depth 1 origin "$STARTUP_SHA_FULL" 2>/dev/null \
    || git -C "$REPO_ROOT" fetch origin "$STARTUP_SHA_FULL" 2>/dev/null \
    || { echo "ERROR: cannot fetch sha $STARTUP_SHA_FULL from origin (not a real commit, or network/auth issue)" >&2; exit 1; }

Copilot AI Apr 25, 2026

The git fetch ... 2>/dev/null redirects will hide the underlying failure reason in CI logs (auth vs ref not found vs network). Consider keeping stderr for at least the final fetch attempt, or capturing and re-printing it when both fetch attempts fail, to make failures diagnosable.

Suggested change
-   git -C "$REPO_ROOT" fetch --depth 1 origin "$STARTUP_SHA_FULL" 2>/dev/null \
-     || git -C "$REPO_ROOT" fetch origin "$STARTUP_SHA_FULL" 2>/dev/null \
-     || { echo "ERROR: cannot fetch sha $STARTUP_SHA_FULL from origin (not a real commit, or network/auth issue)" >&2; exit 1; }
+   FETCH_ERR_LOG="$(mktemp)"
+   if ! git -C "$REPO_ROOT" fetch --depth 1 origin "$STARTUP_SHA_FULL" 2>"$FETCH_ERR_LOG" \
+     && ! git -C "$REPO_ROOT" fetch origin "$STARTUP_SHA_FULL" 2>>"$FETCH_ERR_LOG"; then
+     cat "$FETCH_ERR_LOG" >&2
+     rm -f "$FETCH_ERR_LOG"
+     echo "ERROR: cannot fetch sha $STARTUP_SHA_FULL from origin (not a real commit, or network/auth issue)" >&2
+     exit 1
+   fi
+   rm -f "$FETCH_ERR_LOG"
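
A possible alternative to the temp file, sketched on the assumption that the fetch error output is small enough to hold in a shell variable: capture stderr from each attempt and print both only when both fail. The function name `fetch_sha_or_report` and its parameters are hypothetical and are not part of the PR or of the suggestion above.

```shell
#!/usr/bin/env bash
# Hypothetical variant: capture stderr from each fetch attempt in a variable
# and re-print it only if both attempts fail.
fetch_sha_or_report() {
  local repo_root="$1" sha="$2" err_shallow err_full

  # "2>&1 >/dev/null" sends stderr into the command substitution and
  # discards stdout, so only the error text is captured.
  if err_shallow="$(git -C "$repo_root" fetch --depth 1 origin "$sha" 2>&1 >/dev/null)"; then
    return 0
  fi
  if err_full="$(git -C "$repo_root" fetch origin "$sha" 2>&1 >/dev/null)"; then
    return 0
  fi

  printf '%s\n' "shallow fetch: $err_shallow" "full fetch: $err_full" >&2
  echo "ERROR: cannot fetch sha $sha from origin (not a real commit, or network/auth issue)" >&2
  return 1
}
```

This avoids having to clean up a temp file on every exit path, at the cost of discarding git's stderr output when either attempt succeeds.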
