
🤖 fix(ssh): harden bare base repo against receive-pack thin-pack failures #3356

Merged: ammario merged 1 commit into main from fix/ssh-thin-pack-receive-pack on May 21, 2026

@ammar-agent
Collaborator

Summary

Hardens the SSH runtime's shared bare base repo (.mux-base.git) against the receive-pack failure "unresolved deltas left after unpacking" / "unpack-objects abnormal exit", which aborts workspace creation. The fix layers four independent defenses so that any one of them is enough to keep the slow-path push healthy on both new and legacy remote repos.

Background

A user hit this during SSH workspace init:

Pushing to remote...
remote: fatal: unresolved deltas left after unpacking
error: remote unpack failed: unpack-objects abnormal exit
 ! [remote rejected]     <branch> -> refs/mux-bundle/<branch> (unpacker error)
error: failed to push some refs to 'ssh://.../.mux-base.git'
Initialization failed: Failed to sync project: Failed to push to remote: ...

Two interacting root causes:

  1. Receive-pack's small-push fast path. When the incoming pack contains fewer than transfer.unpackLimit objects (default 100; receive.unpackLimit overrides it for pushes when set), receive-pack routes objects through unpack-objects instead of index-pack. unpack-objects cannot resolve thin-pack delta bases that live in other on-disk packs, so a small thin push to a populated bare repo can fail with "unresolved deltas left after unpacking" even when every base object is technically present.
  2. Orphan packs on disk. A git gc that gets SIGKILLed between writing pack-<sha>.pack and pack-<sha>.idx leaves the pack with no index. Git's negotiation surface ignores indexless packs, so the server doesn't advertise objects it actually has on disk. The client then builds a thin pack assuming those bases are missing, and receive-pack fails to resolve them.
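
To make root cause 1 concrete, here is a minimal TypeScript sketch of the routing rule as described above. It is a simplified model rather than git's actual code; chooseUnpacker and DEFAULT_UNPACK_LIMIT are hypothetical names introduced only for illustration.

```typescript
// Simplified model of how receive-pack picks an unpacker for an incoming pack.
// Names are illustrative and do not come from git's source or SSHRuntime.ts.
type Unpacker = "unpack-objects" | "index-pack";

// Documented default for transfer.unpackLimit / receive.unpackLimit.
const DEFAULT_UNPACK_LIMIT = 100;

function chooseUnpacker(
  objectCount: number,
  unpackLimit: number = DEFAULT_UNPACK_LIMIT,
): Unpacker {
  // Packs below the limit are exploded into loose objects via unpack-objects;
  // packs at or above the limit are stored as a pack and indexed by index-pack.
  return objectCount < unpackLimit ? "unpack-objects" : "index-pack";
}

// A small push under the default limit takes the unpack-objects path.
console.log(chooseUnpacker(5));
// With the limit set to 1, even a one-object pack goes through index-pack.
console.log(chooseUnpacker(1, 1));
```

With a limit of 1, every non-empty pack satisfies objectCount >= 1, so this model always selects index-pack, which is the behavior the first defense in the Implementation section relies on.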

The pre-fetch from origin (which would have populated the bases) had also failed, exposing the underlying weakness.

The existing retry classifier only matched generic transport errors (pack-objects died, Connection reset, …), so a thin-pack failure was never retried and surfaced immediately as a fatal "Failed to sync project".

Implementation

Four layered defenses in src/node/runtime/SSHRuntime.ts:

  1. Prevent the failure mode: ensureBaseRepo's batched setup script now also runs git config --local receive.unpackLimit 1. Receive-pack then always routes incoming pushes through index-pack (which runs with --fix-thin and resolves missing delta bases from local objects) instead of unpack-objects, regardless of object count. The command is idempotent and runs on every sync, so legacy repos heal on the next workspace creation. Trade-off: more small packs, but the existing fragmented-pack maintenance (ensureHealthyBaseRepoForSync) already coalesces them on a threshold.
  2. Heal pre-existing on-disk damage: repairBaseRepoForSync gets a new step before gc that scans objects/pack/pack-*.pack for orphans missing .idx and runs git index-pack to rebuild the index from the pack. The existing code comment already noted this exact SIGKILL failure mode but didn't act on it.
  3. Retry the failure: added unresolved deltas, unpacker error, unpack-objects abnormal exit, remote unpack failed to PROJECT_SYNC_RETRYABLE_ERRORS so the existing retry loop kicks in (which runs the maintenance from steps 1 and 2 before re-attempting).
  4. Defang the retry push: when a retryable failure matches the unresolved-deltas pattern, the retry loop latches a forceNoThinNextAttempt flag and threads it into syncProjectSnapshotViaGitPush as { forceNoThin: true }. The next push then includes --no-thin on both the branch and tag pushes, sending a self-contained pack the receiver can unpack without any delta-base lookup. Unrelated retryable failures (connection reset, killed by signal, …) leave the flag off, so there is no bandwidth penalty for problems --no-thin wouldn't fix.
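
The interplay between defenses 3 and 4 can be sketched roughly as follows. This is a simplified, synchronous model and not the code in SSHRuntime.ts: pushWithRetry, PushAttempt, and the two pattern arrays are hypothetical, the pattern strings are paraphrased from the list above, and the real push is asynchronous and runs maintenance between attempts.

```typescript
// Assumption: these literals approximate the patterns named in the PR text.
const UNRESOLVED_DELTA_PATTERNS: readonly string[] = [
  "unresolved deltas",
  "unpacker error",
  "unpack-objects abnormal exit",
  "remote unpack failed",
];
const OTHER_RETRYABLE_PATTERNS: readonly string[] = [
  "pack-objects died",
  "Connection reset",
];

interface PushOptions {
  forceNoThin: boolean;
}

// Hypothetical attempt callback: returns null on success, or the error text on failure.
type PushAttempt = (options: PushOptions) => string | null;

function includesAny(message: string, patterns: readonly string[]): boolean {
  return patterns.some((pattern) => message.includes(pattern));
}

// Returns true if some attempt succeeded within maxAttempts.
function pushWithRetry(attempt: PushAttempt, maxAttempts = 3): boolean {
  let forceNoThinNextAttempt = false;
  for (let n = 1; n <= maxAttempts; n++) {
    const error = attempt({ forceNoThin: forceNoThinNextAttempt });
    if (error === null) return true;

    const thinPackFailure = includesAny(error, UNRESOLVED_DELTA_PATTERNS);
    const retryable =
      thinPackFailure || includesAny(error, OTHER_RETRYABLE_PATTERNS);
    if (!retryable) return false; // non-retryable errors surface immediately

    // Latch: once a thin-pack failure is seen, every later attempt uses --no-thin.
    if (thinPackFailure) forceNoThinNextAttempt = true;
    // The real loop would run base-repo maintenance here before retrying.
  }
  return false;
}
```

The flag is latched rather than recomputed per attempt, so a thin-pack failure followed by an unrelated transport error still keeps --no-thin on for the remaining attempts.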

Also surfaces the cause behind "Pre-fetch from origin skipped (fetch failed)" via log.debug so we can diagnose recurrent prefetch misses without spamming the init logger on the happy path.
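
For the orphan-pack reindex in defense 2, the detection step might look roughly like the sketch below. It works on a plain list of file names from objects/pack so it stays self-contained; findOrphanPacks and reindexCommands are hypothetical helpers, and the actual PR performs this scan inside a remote shell script rather than in TypeScript.

```typescript
// Given the file names found in objects/pack, return every pack-*.pack
// that has no sibling pack-*.idx. Hypothetical helper, for illustration only.
function findOrphanPacks(fileNames: readonly string[]): string[] {
  const present = new Set(fileNames);
  return fileNames
    .filter((name) => /^pack-[0-9a-f]+\.pack$/.test(name))
    .filter((name) => !present.has(name.replace(/\.pack$/, ".idx")))
    .sort();
}

// Build one `git index-pack <pack-file>` invocation per orphan. Running it
// writes the missing .idx next to the pack. Paths are relative to the bare repo.
function reindexCommands(orphans: readonly string[]): string[][] {
  return orphans.map((name) => ["git", "index-pack", `objects/pack/${name}`]);
}

const listing = [
  "pack-1a2b3c.pack",
  "pack-1a2b3c.idx",
  "pack-4d5e6f.pack", // index missing: gc was interrupted before writing it
];
console.log(reindexCommands(findOrphanPacks(listing)));
```

Each command is best-effort in the PR's design: a corrupt pack makes index-pack fail, and the sync then falls through to gc and the --no-thin retry.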

Validation

  • make static-check passes (typecheck + lint + format + docs links).
  • bun test src/node/runtime/ passes (all SSH runtime / sync contract / retry orchestration suites).
  • New retry tests cover: thin-pack failure → next attempt flips --no-thin; connection-reset → flag stays off.
  • New contract tests cover: --no-thin appears on both branch and tag pushes when forceNoThin: true; default sync path omits --no-thin.
  • Updated existing maintenance-command expectation tests for the new orphan-pack reindex step.
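
The contract checked by the new tests can be pictured with a small argv builder. buildPushArgs, its option names, the example URL, and the tag refspec are hypothetical; only the --no-thin flag and the refs/mux-bundle/<branch> destination come from the PR text.

```typescript
interface PushArgsOptions {
  remoteUrl: string;
  refspec: string;
  forceNoThin?: boolean;
}

// Build the argument list for `git push`, adding --no-thin only when requested.
function buildPushArgs(options: PushArgsOptions): string[] {
  const args = ["push"];
  if (options.forceNoThin) args.push("--no-thin");
  args.push(options.remoteUrl, options.refspec);
  return args;
}

// Placeholder remote; the real URL is elided in the PR.
const remoteUrl = "ssh://example.invalid/srv/.mux-base.git";

// Retry after a thin-pack failure: both the branch and tag pushes carry --no-thin.
const branchPush = buildPushArgs({
  remoteUrl,
  refspec: "main:refs/mux-bundle/main",
  forceNoThin: true,
});
const tagPush = buildPushArgs({
  remoteUrl,
  refspec: "refs/tags/*:refs/tags/*",
  forceNoThin: true,
});

// Default sync path: the flag is omitted entirely.
const defaultPush = buildPushArgs({ remoteUrl, refspec: "main:refs/mux-bundle/main" });

console.log(branchPush, tagPush, defaultPush);
```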

Risks

Touches the SSH workspace init slow path, which runs on every cold SSH workspace creation and every snapshot-drift sync.

  • receive.unpackLimit=1 is unconditionally applied to every shared bare repo via ensureBaseRepo. The trade-off is more small packs vs. fewer thin-pack-unpack failures; existing maintenance already collapses fragmented packs so the steady-state is unchanged.
  • The orphan-pack reindex runs git index-pack against any .pack lacking an .idx. A corrupt .pack will fail index-pack (best-effort, logged), then fall through to gc and the --no-thin retry push.
  • --no-thin is only applied on retry after a confirmed thin-pack failure, so the happy path is unchanged.
  • Retry classification is purely additive — existing retryable patterns and orchestration are untouched.

Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: high • Cost: $6.77

@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

@ammario merged commit 53cc947 into main May 21, 2026
24 checks passed
@ammario deleted the fix/ssh-thin-pack-receive-pack branch May 21, 2026 17:19
mux-bot Bot added a commit that referenced this pull request May 21, 2026
…RYABLE_ERRORS

#3356 introduced two adjacent constants in SSHRuntime.ts that listed the
same four receive-pack thin-pack failure patterns inline:
PROJECT_SYNC_RETRYABLE_ERRORS (used by the broad retry classifier) and
UNRESOLVED_DELTA_PUSH_PATTERNS (used by isUnresolvedDeltaPushFailure to
decide whether the next retry attempt should add --no-thin).

Define the four-string sub-class once on UNRESOLVED_DELTA_PUSH_PATTERNS
and spread it into PROJECT_SYNC_RETRYABLE_ERRORS so the lists can never
drift. .some(pattern => errorMsg.includes(pattern)) sees the same
patterns in the same order, so the retry classification is
byte-equivalent — pure DRY, no behavior change.
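
A minimal sketch of the deduplication this commit message describes, assuming the constant and function names quoted in the message. The pattern literals, their order, and the isRetryableProjectSyncError helper are assumptions made for illustration, not the contents of SSHRuntime.ts.

```typescript
// Single source of truth for the thin-pack failure sub-class.
// Assumption: literals are paraphrased from the PR description.
const UNRESOLVED_DELTA_PUSH_PATTERNS: readonly string[] = [
  "unresolved deltas",
  "unpacker error",
  "unpack-objects abnormal exit",
  "remote unpack failed",
];

// The broad retry list spreads the sub-class in, so the two cannot drift apart.
const PROJECT_SYNC_RETRYABLE_ERRORS: readonly string[] = [
  "pack-objects died",
  "Connection reset",
  ...UNRESOLVED_DELTA_PUSH_PATTERNS,
];

// Decides whether the next retry attempt should add --no-thin.
function isUnresolvedDeltaPushFailure(errorMsg: string): boolean {
  return UNRESOLVED_DELTA_PUSH_PATTERNS.some((pattern) => errorMsg.includes(pattern));
}

// Hypothetical name for the broad retry classifier.
function isRetryableProjectSyncError(errorMsg: string): boolean {
  return PROJECT_SYNC_RETRYABLE_ERRORS.some((pattern) => errorMsg.includes(pattern));
}
```

Because the narrow list is a subset of the broad one by construction, any message that triggers --no-thin is also guaranteed to be retryable.
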
mux-bot Bot added a commit that referenced this pull request May 22, 2026
…RYABLE_ERRORS
