fix(gitclone): refresh GitHub App token mid-subprocess via file-based credential helper#321
jrobotham-square wants to merge 7 commits into
… credential helper

GitHub App installation tokens are valid for one hour. The previous credential.helper embedded the token as a literal in a shell-function closure at exec time, so any git subprocess that ran longer than the TTL, most notably 'git lfs fetch' for large LFS repositories, would keep presenting the original token to GitHub long after it had expired and would loop on 'Bad credentials' until git-lfs's internal retry budget gave up (observed in production: jobs running 20+ hours before failing).

Switch to a file-based credential helper: the token is written to a 0600 temp file and exposed as 'credential.helper=!cat <file>', which git re-reads on every credential query. A background goroutine re-fetches the token on a 30s ticker for the lifetime of the returned cleanup and rewrites the file atomically when the value changes, so long-running subprocesses transparently pick up rotated tokens on the next git-lfs retry. The TokenManager's own caching means the ticker is a cheap map lookup for most ticks; only the pre-expiry refresh hits the GitHub API.

GitCommand now returns a cleanup function alongside the cmd that callers must defer to stop the goroutine and remove the credentials file. All in-tree callers (clone, fetch, ls-remote, lfs snapshot) are updated accordingly.

Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b56403ad6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Drop doc comments that restate the code; retain only the why-not-what explanations (helper file rotation contract, token-TTL rationale, atomic-write invariant, etc.). Remove internal repository name from the snapshot.go comment.

Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df1479d4fe
Git appends the operation (`get`/`store`/`erase`) as a positional argument to `!`-prefixed credential helpers. The previous `!cat <credfile>` form therefore became `cat <credfile> get`, which would also cat a file named `get` from the worktree (or `store`/`erase`). Lines in that file are parsed by git as credential entries and override the real token.

Wrap the helper in a shell function that only outputs on `get` and absorbs the action argument. Add a regression test that drops hostile `get`/`store`/`erase` files into a worktree and asserts `git credential fill` returns the real token.

Reported by Codex on PR #321.

Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
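A minimal sketch of the gated helper described in this commit, not the PR's actual code. The names `shellSingleQuote` and `credentialHelper` and the example path are assumptions for illustration; the shell-function shape follows the pattern shown in git's own credential-helper documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// shellSingleQuote wraps s in single quotes for POSIX sh. An embedded
// single quote becomes '\'' (close quote, escaped quote, reopen quote).
func shellSingleQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// credentialHelper (hypothetical name) builds a credential.helper value.
// Git appends the action as the last word, so the shell sees "f get",
// "f store" or "f erase". The function prints the file only for "get"
// and never forwards the action to cat as a file name.
func credentialHelper(credFile string) string {
	return fmt.Sprintf(`!f() { test "$1" = get && cat %s; }; f`,
		shellSingleQuote(credFile))
}

func main() {
	// The path is an illustrative placeholder.
	fmt.Println(credentialHelper("/tmp/cachew-cred-123"))
}
```

Because the action is consumed as `$1` inside the function, a file named `get`, `store`, or `erase` in the working directory is never read.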
Good catch: confirmed locally that with the original […] Fixed in cc097e7 by wrapping the helper in a shell function that both gates on […] Added […]
…ne on cleanup

Two issues reported by Codex on PR #321.

(P1) writeCredentialFile previously wrote the rotated token to `path + ".new"` and renamed it over the destination. Because path lives in /tmp on a shared host, a hostile local user could pre-create `<path>.new` as a symlink, and os.WriteFile would follow it, leaking the next-rotated GitHub App token to attacker-readable storage. Switch to os.CreateTemp (O_EXCL + random suffix) for the intermediate file so the write target cannot be pre-positioned by anyone else.

(P2) cleanup previously cancelled the refresh context and immediately removed the credential file. A refresh tick that began before the cancel could finish its rename AFTER the cleanup-side os.Remove, leaving a fresh token sitting in /tmp once the caller believed it had been wiped. Track the refresh goroutine in a sync.WaitGroup so cleanup blocks until it has fully exited before removing the file.

Adds two regression tests:
- TestWriteCredentialFile_IgnoresHostileSiblingSymlink plants the `<path>.new` symlink the old code would have followed and asserts the rotated write lands at path, not at the symlink target.
- TestCleanup_WaitsForInFlightRefresh shrinks the refresh interval to 1ms, drives the production goroutine into a blocking provider, and asserts cleanup does not return until the goroutine has unwound.

credentialFileRefreshInterval becomes a var (with a nolint comment) purely so the second test can shrink it to ms granularity.

Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
Both valid. Fixed in dda656c. P1 (symlink attack on […]) […] P2 (cleanup races refresh tick): […]
    		return nil, cleanup, errors.Wrap(err, "start token credential file")
    	}
    	cleanup = fileCleanup
    	// `!cmd` runs cmd via the shell on every credential query, so a
Can you get rid of these incredibly verbose comments please?
The deferred cleanup kept the credential helper file and its background refresh goroutine alive through snapshot.CreatePaths archive upload, which only touches the local cache. Scope the credential lifetime to the subprocess that actually needs it.

Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019e805f-48b7-7594-9d46-415a44d2e1c5
Co-authored-by: Amp <amp@ampcode.com>
Closed in favour of #322.
Problem
`git` subprocesses spawned by cachew (clone, fetch, ls-remote, and especially `git lfs fetch`) are configured with a `credential.helper` that embeds the GitHub App installation token as a literal inside a shell-function closure at exec time:

GitHub App installation tokens are valid for one hour. So any subprocess that runs longer than that, which routinely happens for `git lfs fetch` on large LFS-heavy repositories, keeps presenting its now-expired token to GitHub on every retry and loops on `Bad credentials` until git-lfs's own retry budget gives up, typically many hours later. The TokenManager refreshes its in-process cache fine, but the running subprocess has no way to see the new value.

This first surfaced as flapping shadow tests in staging but was verified to affect production too. Failure logs look like:

Self-recovery happens only because the next periodic invocation gets a fresh token and (if the LFS delta has shrunk enough) finishes inside the 1 h window. No human action is involved.
Fix
Switch to a file-based credential helper. The token is written to a `0600` temporary file and exposed to git as:

Git re-invokes the helper on every credential query, so it reads the file fresh each time. A background goroutine bound to the returned `cleanup` re-fetches the token on a 30 s ticker and rewrites the file atomically (write-temp + rename) whenever the value changes. Most ticks are a cheap cached lookup in `TokenManager`; only the pre-expiry refresh actually hits the GitHub API. The atomic rename means a concurrent `cat` sees either the old complete content or the new complete content, never a partial token.
`GitCommand` now returns a `cleanup func()` alongside the `*exec.Cmd` that callers must defer to stop the goroutine and remove the credentials file. All in-tree callers (`executeClone`, `fetchInternal`, `GetUpstreamRefs`, and the LFS snapshot fetch) are updated. `cleanup` is always non-nil and safe to call multiple times, including when `GitCommand` itself returned an error.

Tests

- `TestGitCommandWithCredentialProvider` is tightened to assert that the token is not embedded in the helper string and that the on-disk file contains the full credential response with `0600` perms.
- `TestGitCommand_CleanupRemovesCredentialFile` verifies the file is removed by `cleanup` and that `cleanup` is idempotent.
- `TestGitCommand_RefreshGoroutineUpdatesFile` drives one tick of the refresh loop and asserts the file picks up a rotated token, then verifies a same-token tick is a no-op (no file churn).
- `TestWriteCredentialFile_Atomic` runs concurrent reader/writer goroutines for 200 rewrites and asserts the reader never observes a partial write.
- `TestShellSingleQuote` covers the path-quoting helper.

Full `just test` and `just lint` pass locally.