fix(deps): fetch missing SHA-pinned commits into shallow bare clones#1259
Conversation
When a mono-repo package declares a transitive dependency pinned to a SHA that is not HEAD, the depth=1 bare clone lacks that commit object. Add a Tier-0 path in SharedCloneCache that reuses an existing bare for the same (host, owner, repo) and fetches the missing SHA into it before falling back to a fresh clone. Three-step strategy inside fetch_sha_into_bare(): 1. rev-parse -- already present? done. 2. git fetch --depth=1 <url> <sha> (full 40-char SHAs only). 3. git fetch <url> to broaden the shallow boundary. Also fix materialize_from_bare() to check out known_sha when provided instead of always checking out HEAD. Closes #1258 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR improves apm install reliability for mono-repos where transitive dependencies are pinned to a non-HEAD commit SHA, by reusing an existing shallow bare clone for the same repo and fetching the missing commit into it on demand. It also fixes materialize_from_bare() to check out known_sha when provided so consumers get the correct content when reusing a bare.
Changes:
- Add a
fetch_sha_into_bare()helper to hydrate an existing shallow bare with a missing SHA and update materialization to check outknown_shainstead of alwaysHEAD. - Extend
SharedCloneCachewith a repo-level reverse index and an optional Tier-0fetch_fnpath to reuse existing bares across different refs/SHA pins. - Wire the fetch path through
GitHubPackageDownloaderand add targeted unit tests plus a changelog entry.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/deps/bare_cache.py |
Adds on-demand SHA fetch logic for existing shallow bares; updates checkout target selection to respect known_sha. |
src/apm_cli/deps/shared_clone_cache.py |
Introduces repo-level bare reuse and a Tier-0 fetch hook before cloning a new bare. |
src/apm_cli/deps/github_downloader.py |
Plumbs a SHA-only fetch delegate into SharedCloneCache.get_or_clone() calls. |
tests/unit/deps/test_shared_clone_cache.py |
Adds unit coverage for SHA hydration, repo-level bare reuse, and checkout target behavior. |
CHANGELOG.md |
Documents the fix under Unreleased / Fixed. |
| # Step 3: broaden shallow -- fetch all refs without a SHA argument. | ||
| _log.debug("fetch_sha_into_bare: broadening shallow in %s to find %s", bare_path, sha[:12]) | ||
|
|
||
| def _fetch_action_broad(url: str, env: dict[str, str], target: Path) -> None: | ||
| subprocess.run( | ||
| [git_exe, "-C", str(bare_path), "fetch", url], | ||
| env=env, | ||
| check=True, | ||
| capture_output=True, | ||
| timeout=300, | ||
| ) |
There was a problem hiding this comment.
Fixed in a2b78f5. The broad fetch now uses --depth=50 (configurable via APM_BROAD_FETCH_DEPTH env var) instead of unbounded fetch. This brings in enough history to resolve most ancestor SHAs without downloading the entire repo.
| # Tier-0: try fetching the SHA into an existing bare for the | ||
| # same repo (different ref). This avoids a fresh network clone | ||
| # when a transitive dep pins a SHA that is missing only because | ||
| # the initial shallow bare did not include that commit. | ||
| if ref and fetch_fn: | ||
| existing_bare = self._find_repo_bare(host, owner, repo) | ||
| if existing_bare is not None: | ||
| try: | ||
| if fetch_fn(existing_bare, ref): | ||
| entry.path = existing_bare | ||
| return existing_bare |
There was a problem hiding this comment.
The API is safe-by-default: fetch_fn is only supplied by GitHubPackageDownloader._shared_bare_fetch_fn which is wired only for 40-char hex SHA refs (gated by is_commit_sha()). The get_or_clone parameter is intentionally named fetch_fn (not sha_fetch_fn) to keep the API general, but the only caller guards it. Added test test_fetch_fn_none_skips_tier0 to verify Tier-0 is bypassed when fetch_fn=None.
| # Verify "origin" is NOT used as a shorthand | ||
| assert "origin" not in call_args or call_args.index("origin") < call_args.index( | ||
| explicit_url | ||
| ) |
There was a problem hiding this comment.
Fixed in a2b78f5. The test now loops over all mock_run.call_args_list and asserts "origin" never appears in any fetch argv.
| # Tier-0: try fetching the SHA into an existing bare for the | ||
| # same repo (different ref). This avoids a fresh network clone | ||
| # when a transitive dep pins a SHA that is missing only because | ||
| # the initial shallow bare did not include that commit. | ||
| if ref and fetch_fn: | ||
| existing_bare = self._find_repo_bare(host, owner, repo) | ||
| if existing_bare is not None: | ||
| try: | ||
| if fetch_fn(existing_bare, ref): | ||
| entry.path = existing_bare | ||
| return existing_bare |
There was a problem hiding this comment.
Fixed in a2b78f5. Added _bare_fetch_locks: dict[Path, threading.Lock] to SharedCloneCache. Before invoking fetch_fn, the code acquires a per-bare-path lock (created under self._lock), serialising concurrent fetches into the same bare while still allowing parallel fetches to different bares. Locks are cleared in cleanup().
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 2 | 1 | Tier-0 fetch path is well-structured; one unguarded concurrent-fetch race on shared bare paths warrants a per-bare lock before this pattern is relied on at scale. |
| CLI Logging Expert | 0 | 3 | 2 | Debug logging is coherent and well-structured; two issues worth fixing: developer guidance embedded in a log message, and exc_info on a debug fallthrough that dumps tracebacks in verbose mode. |
| DevX UX Expert | 0 | 1 | 2 | Hard git exit-128 failure on SHA-pinned transitive deps becomes a working install; two minor rough edges: silent broad fetch can appear as a hang, short SHAs get no user hint. |
| Supply Chain Security Expert | 0 | 2 | 2 | SHA-pin integrity is sound (git content-addressing + post-fetch rev-parse verify); two recommended concerns: broad-fetch DoS surface and network I/O under lock. |
| OSS Growth Hacker | 0 | 2 | 1 | Fix unblocks mono-repo adopters (real churn stopper); CHANGELOG copy is too impl-focused to earn social share -- one sentence rewrite recommended. |
| Auth Expert | 0 | 1 | 1 | Auth routing is correct end-to-end; one defense-in-depth gap: FETCH_HEAD is not scrubbed after the new fetch actions, leaving the authenticated URL on disk. |
| Doc Writer | 0 | 1 | 2 | CHANGELOG entry is accurate and style-consistent; one recommended doc update in guides/dependencies.md to cover the on-demand SHA fetch behavior for mono-repo transitive deps. |
| Test Coverage Expert | 0 | 3 | 1 | 13 new unit tests cover all three changed modules well; install-pipeline floor tier requires integration-with-fixtures but the sha-https e2e variant cannot be certified without GITHUB_APM_PAT. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Auth Expert] Truncate FETCH_HEAD after
_fetch_action_shaand_fetch_action_broadby calling_scrub_bare_remote_url(or(bare_path / 'FETCH_HEAD').write_text('')) after each successful subprocess.run -- Contradicts the established scrub contract documented at bare_cache.py line 186. Token-embedded URL persists in the shared bare for its lifetime. Strongest non-blocking finding in the panel. - [Python Architect] Add a per-bare-path RLock (or release
entry.lockbeforefetch_fnand re-acquire for atomic path assignment) to serialise concurrent Tier-0 fetches into the same bare -- Two parallel installs of SHA-pinned deps from the same repo can race on concurrent git fetch writes to the pack-file store, risking a corrupt shallow boundary. Also resolves the 300-second lock-contention window flagged by Supply Chain. - [Supply Chain Security Expert] Add
--depth=50(orAPM_BROAD_FETCH_DEPTHenv var) to_fetch_action_broad; if SHA still not found after bounded fetch, fail closed and fall through to fresh clone -- Unbounded broad fetch on a large or adversarially-large repo can exhaust disk and bandwidth. One-line fix with high defensive value. - [Test Coverage Expert] Add unit test asserting
GitHubPackageDownloaderpasses non-Nonefetch_fntoget_or_clonefor 40-char SHA refs andfetch_fn=Nonefor branch refs -- The_shared_bare_fetch_fnclosure wiring is the critical new behavior path; zero unit-level assertions exist for it. Missing test on a governed-by-policy surface. - [OSS Growth Hacker] Rewrite CHANGELOG entry to user-behavior framing: lead with the crash being fixed and the '1 clone + 1 fetch instead of 2 clones' performance win -- Current copy ('shallow bare clones') is maintainer vocabulary. Adopters who hit this crash will not recognize their pain. The performance side-effect is an underused positioning signal.
Architecture
classDiagram
direction LR
class SharedCloneCache {
<<ThreadSafeCache>>
-_lock: Lock
-_entries: dict
-_repo_bares: dict
-_temp_dirs: list
+get_or_clone(host, owner, repo, ref, clone_fn, fetch_fn) Path
+_find_repo_bare(host, owner, repo) Path
+_get_or_create_entry(key) _CacheEntry
+cleanup()
}
class _CacheEntry {
<<ValueObject>>
+lock: Lock
+path: Path
+error: Exception
}
class GitHubPackageDownloader {
<<Facade>>
+_bare_clone_with_fallback(...)
+_fetch_sha_into_bare(bare_path, sha, dep_ref) bool
+_materialize_from_bare(...)
+_execute_transport_plan(...)
}
class fetch_sha_into_bare {
<<Strategy>>
+fetch_sha_into_bare(execute_transport_plan, repo_url_base, bare_path, sha, dep_ref) bool
}
class materialize_from_bare {
<<Pure>>
+materialize_from_bare(bare_path, consumer_dir, ref, env, known_sha) str
}
class DependencyReference {
<<ValueObject>>
+repo_url: str
+ref: str
}
SharedCloneCache *-- _CacheEntry : per-key entries
GitHubPackageDownloader ..> SharedCloneCache : get_or_clone()
GitHubPackageDownloader ..> fetch_sha_into_bare : delegates via _fetch_sha_into_bare
GitHubPackageDownloader ..> materialize_from_bare : _materialize_from_bare
fetch_sha_into_bare ..> DependencyReference : reads
class SharedCloneCache:::touched
class GitHubPackageDownloader:::touched
class fetch_sha_into_bare:::touched
class materialize_from_bare:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A(["apm install: SHA-pinned transitive dep"]) --> B["github_downloader.py\nis_commit_sha = re.match SHA pattern"]
B -->|true| C["build _shared_bare_fetch_fn closure"]
C --> D["shared_clone_cache.py:get_or_clone()\nfetch_fn=_shared_bare_fetch_fn"]
D --> E{"entry.path already set?"}
E -->|yes| DONE(["return cached bare path"])
E -->|no| F{"ref and fetch_fn provided?"}
F -->|no| CLONE
F -->|yes| G["_find_repo_bare()\n[LOCK] lookup _repo_bares"]
G -->|None found| CLONE
G -->|bare exists| H["bare_cache.py:fetch_sha_into_bare()\nTier-0 path"]
H --> I["git rev-parse --verify sha"]
I -->|present| J["entry.path = existing_bare\nreturn"]
I -->|"missing, len==40"| K["git fetch --depth=1 url sha"]
K -->|success| J
K -->|fail| L["git fetch url -- broaden shallow"]
L -->|success| J
L -->|fail| CLONE
CLONE["tempfile.mkdtemp\nclone_fn(bare_target)"]
CLONE --> M["[LOCK] _repo_bares register"]
M --> N["entry.path = clone_path"]
N --> O["materialize_from_bare\ncheckout_target = known_sha or HEAD"]
J --> O
O --> DONE
style H fill:#fff3b0,stroke:#d47600
style G fill:#fff3b0,stroke:#d47600
style M fill:#fff3b0,stroke:#d47600
style O fill:#fff3b0,stroke:#d47600
sequenceDiagram
participant I as apm install
participant GD as GitHubPackageDownloader
participant SCC as SharedCloneCache
participant BC as fetch_sha_into_bare
participant Git as git subprocess
I->>GD: download dep (owner/repo@SHA_PIN)
GD->>SCC: get_or_clone(host,owner,repo,sha, clone_fn, fetch_fn)
SCC->>SCC: _find_repo_bare(host,owner,repo)
Note over SCC: Returns existing bare from prior ref clone
SCC->>GD: fetch_fn(existing_bare, sha)
GD->>BC: fetch_sha_into_bare(execute_transport_plan, url, bare, sha)
BC->>Git: rev-parse --verify sha
Git-->>BC: returncode != 0 (missing)
BC->>Git: git fetch --depth=1 url sha
Git-->>BC: success
BC->>Git: rev-parse --verify sha
Git-->>BC: returncode == 0
BC-->>GD: True
GD-->>SCC: True
SCC-->>GD: existing_bare path
GD->>Git: materialize_from_bare (git clone --local)
Git-->>GD: consumer working tree at known_sha
GD-->>I: resolved dep
Recommendation
Ship this PR. The core fix is correct, the SHA-pin integrity story is sound, and leaving mono-repo adopters on an exit-128 crash is worse than any of the recommended gaps. However, three issues should be resolved in a fast-follow (or in this PR if the author can turn them around quickly): the FETCH_HEAD scrub gap (broken security invariant), the per-bare-path fetch lock (concurrent corruption risk), and the broad-fetch depth guard (DoS surface). The test-coverage gaps and CHANGELOG rewrite are clean follow-ups that can land in the next patch cycle without holding this fix.
Full per-persona findings
Python Architect
-
[recommended] Concurrent Tier-0 fetches into the same bare path are unguarded at
src/apm_cli/deps/shared_clone_cache.py:97
Two parallel threads with different SHA-pinned transitive deps on the same repo will both pass_find_repo_bare()-> samebare_path, then callfetch_fn(same_bare, sha1)andfetch_fn(same_bare, sha2)concurrently. Git's concurrent writes to SHALLOW, pack-files, and refs/FETCH_HEAD are not serialised and can produce a corrupt or inconsistent shallow boundary.
Suggested: Add a per-bare-path RLock in SharedCloneCache (e.g.self._bare_fetch_locks: dict[Path, threading.Lock]). Acquire it aroundfetch_fn(existing_bare, ref)inget_or_clone. -
[recommended]
_logassigned as local variable insidefetch_sha_into_bareon every call atsrc/apm_cli/deps/bare_cache.py:346
_log = logging.getLogger(__name__)is inside the function body, not at module level. Inconsistent withshared_clone_cache.pypattern; minor overhead at scale.
Suggested: Hoist_log = logging.getLogger(__name__)to module level. -
[nit]
_shared_bare_fetch_fnparameter namedshabut receivesreffromget_or_cloneatsrc/apm_cli/deps/github_downloader.py:1111
fetch_fnsignature isCallable[[Path, str], bool]andget_or_clonecalls it asfetch_fn(existing_bare, ref). The mismatch between outerrefand innershais a minor readability trap.
Suggested: Rename the closure parameter fromshatoref, or add an inline comment.
CLI Logging Expert
-
[recommended] Caller guidance embedded inside log message text at
src/apm_cli/deps/bare_cache.py:446
The final exhausted-attempts message appends'; caller should fall back to a fresh clone'-- this is a code comment masquerading as a log message. Log messages describe what happened, not what the caller should do next.
Suggested:_log.debug('fetch_sha_into_bare: all fetch attempts exhausted for %s in %s', sha[:12], bare_path) -
[recommended]
exc_info=Trueon a debug-level fallthrough dumps full tracebacks in verbose mode atsrc/apm_cli/deps/shared_clone_cache.py:119
In DEBUG/verbose mode, every SHA-cache miss that falls through to a fresh clone emits a full traceback. High noise for an expected fallback path.
Suggested: Dropexc_info=Trueand promote to info:_log.info('Bare fetch miss for %s/%s/%s ref=%s, falling back to fresh clone', host, owner, repo, ref) -
[recommended] No user-facing signal when a network fetch fires during install at
src/apm_cli/deps/bare_cache.py:374
Steps 2 and 3 may each trigger a network fetch with a 300-second timeout. All messages are at DEBUG -- invisible in default mode. A slow broad fetch is indistinguishable from a hang.
Suggested: Add_log.info('Hydrating missing commit %s into shared bare for %s', sha[:12], repo_url_base)before step 2. -
[nit]
CalledProcessErrorbranches swallow subprocess stderr silently atsrc/apm_cli/deps/bare_cache.py:399
CalledProcessError.stderrcontains the git error output invaluable for debugging auth or protocol failures in CI.
Suggested: Catch asexcept subprocess.CalledProcessError as excand appendexc.stderr.decode(errors='replace').strip()to the debug message. -
[nit] Function-name prefix on every message is redundant with
%(funcName)sin log formatters atsrc/apm_cli/deps/bare_cache.py:369
All 11 debug lines are prefixed with'fetch_sha_into_bare: '. Minor inconsistency with the rest of the deps module.
DevX UX Expert
-
[recommended] Broad fetch (step 3) is fully silent and can take 30-60s on large repos -- indistinguishable from a hang at
src/apm_cli/deps/bare_cache.py
Step 3 runsgit fetch <url>withcapture_output=True. All output is at DEBUG level. A user whose install triggers this path sees zero terminal activity for potentially a minute.
Suggested: Emit a single info-level line before_fetch_action_broadexecutes (e.g., via_rich_infohelper), suppressed on the fast step-1 hit path. -
[nit] Short SHAs silently fall through to broad fetch with no user hint that the pin is suboptimal at
src/apm_cli/deps/bare_cache.py
When a dep is pinned to a 7-char SHA, step 2 is skipped and step 3 runs a full broad fetch. No feedback that switching to full 40-char SHA would make future installs faster.
Suggested: Log a debug/info message whenlen(sha) != 40noting that full SHAs enable targeted fetches. -
[nit] When Tier-0 fetch succeeds,
_repo_baresis not updated for the reused bare atsrc/apm_cli/deps/shared_clone_cache.py
After a successful Tier-0 fetch, the reverse index is stale (no user-visible breakage, but asymmetric with the fresh-clone path).
Suggested: Afterentry.path = existing_bare, append(ref, existing_bare)to_repo_bares.
Supply Chain Security Expert
-
[recommended] Broad fetch (step 3) has no depth/size guard -- full repo history download possible at
src/apm_cli/deps/bare_cache.py
_fetch_action_broadrunsgit fetch <url>with no--depthand no refspec. A slow or adversarially-large remote can exhaust disk and bandwidth within the 300-second timeout.
Suggested: Add--depth=50(orAPM_BROAD_FETCH_DEPTHenv var). If SHA still not found, fail closed and fall through to fresh clone. -
[recommended] Network I/O (
fetch_fn) is invoked whileentry.lockis held, risking 300-second lock contention atsrc/apm_cli/deps/shared_clone_cache.py
Any other thread requesting the same(host, owner, repo, ref)key blocks for the entire fetch duration. Under concurrent installs this creates a starvation condition.
Suggested: Releaseentry.lockbefore callingfetch_fn, or use a separate per-fetch lock. Re-acquire for atomicentry.pathassignment after success. -
[nit]
is_commit_sharegex accepts 7-39 char short SHAs;rev-parseverification of short SHAs is theoretically ambiguous atsrc/apm_cli/deps/github_downloader.py
Short SHAs can collide in large repos (git disambiguation is best-effort). If the lockfile spec mandates full 40-char SHAs, tightenis_commit_shafor the fetch optimization path. -
[nit] Tier-0 success path does not register the new
(ref, bare_path)in_repo_baresatsrc/apm_cli/deps/shared_clone_cache.py
Subtly asymmetric between Tier-0 and fresh-clone paths; no security impact.
OSS Growth Hacker
-
[recommended] CHANGELOG entry speaks to implementation, not user pain or outcome at
CHANGELOG.md:40
The phrase 'shallow bare clones' is maintainer vocabulary. A mono-repo adopter who hit this crash recognizes their pain only if they already know the internals.
Suggested: Rewrite to lead with the user symptom:'Fixed: apm install hard-failure for mono-repo packages whose transitive dependencies reference a sibling commit other than HEAD. Now resolves with a single in-place fetch instead of crashing -- and two SHA-pinned deps from the same repo cost one clone + one fetch, not two clones. (#1258)' -
[recommended] Performance win (1 clone + 1 fetch vs 2 clones) is buried and unnamed
The clone-reuse side-effect reinforces APM's 'reproducible everywhere, fast' story vs. pip/cargo. Not mentioned in CHANGELOG at all.
Suggested: Add sub-bullet:'As a side-effect, multiple SHA-pinned references to the same repository now share a single bare clone -- reducing redundant network fetches during install.' -
[nit] No docs/quickstart mention of mono-repo transitive dep support with SHA pinning
Users who hit Shallow clone breaks SHA-pinned intra-repo dependencies in mono-repos #1258 found it through pain, not documentation. A short 'Mono-repo transitive dependencies' example in the quickstart would surface this fix proactively.
Suggested: File a follow-up doc task for the quickstart.
Auth Expert
-
[recommended] FETCH_HEAD not truncated after
_fetch_action_sha/_fetch_action_broadatsrc/apm_cli/deps/bare_cache.py:380
git fetch <authenticated-url> <sha>writes the token-embedded URL into FETCH_HEAD. The established scrub contract (_scrub_bare_remote_url) explicitly truncates FETCH_HEAD for this reason. The new fetch closures skip this step, leaving the live token on disk for the lifetime of the shared bare.
Suggested: Call_scrub_bare_remote_url(bare_path, git_exe, env)after each successful fetch action, or at minimum(bare_path / 'FETCH_HEAD').write_text(''). -
[nit] Cross-ref bare reuse auth context is correct but deserves an inline comment at
src/apm_cli/deps/shared_clone_cache.py:113
When_find_repo_barereturns a bare cloned by a differentdep_ref, the_shared_bare_fetch_fnclosure uses the currentdep_reffor auth. Correct (same repo = same auth scope), but a future reader may not see why.
Doc Writer
-
[nit] CHANGELOG entry uses git-internals term 'shallow bare clones' at
CHANGELOG.md
Every other CHANGELOG entry is written from a user-behavior perspective. 'Cached bare clones' or just 'cached clones' conveys the same information without requiring git internals knowledge. -
[recommended]
guides/dependencies.mdparallel-download section should note on-demand SHA fetch for mono-repo transitive deps atdocs/src/content/docs/guides/dependencies.md:789
Line 789 describes bare-clone sharing but says nothing about what happens when a transitive dep pins a SHA that isn't the HEAD of the initial clone. Before this fix that scenario hard-failed. A single sentence addition closes the gap.
Suggested:'When a transitive dependency pins a commit SHA that differs from the ref used for the initial clone, APM fetches that specific commit into the existing bare clone on demand (git fetch --depth=1 <url> <sha>) rather than re-cloning.' -
[nit] CHANGELOG entry issue reference (Shallow clone breaks SHA-pinned intra-repo dependencies in mono-repos #1258) is confirmed correct -- no action needed.
Test Coverage Expert
-
[recommended] Install-pipeline floor tier not certifiable: sha-https integration variant requires live credential to run
test_install_subdir_dedup_e2e.py::test_two_subdirs_same_repo_parallel[sha-https]exercises the fullGithubDownloader -> SharedCloneCache (Tier-0) -> fetch_sha_into_bare()path. Skips withoutGITHUB_APM_PATso certification cannot be issued in this review.
Suggested: Confirm CI runs the parametrized sha-https variant with the credential present. No new test needed if CI is green.
Proof (unknown):tests/integration/test_install_subdir_dedup_e2e.py::test_two_subdirs_same_repo_parallel[sha-https]-- proves: apm install with a SHA-pinned subdir dep reuses an existing bare clone instead of fetching from scratch [devx,governed-by-policy] -
[recommended]
_shared_bare_fetch_fnclosure wiring fromGithubDownloadertofetch_sha_into_barehas no unit-level integration test
Theis_commit_shaguard that conditionally passesfetch_fn=Nonefor non-SHA refs has zero unit-level assertions. Grep confirmed: no hits for_shared_bare_fetch_fnor_fetch_sha_into_barein tests outside bare_cache imports.
Suggested: Add a test intests/unit/deps/test_github_downloader_validation.pyasserting thatget_or_clone()is called with non-Nonefetch_fnfor a 40-char SHA ref andfetch_fn=Nonefor a branch ref.
Proof (missing):tests/unit/deps/test_github_downloader_validation.py::test_sha_ref_wires_fetch_fn_into_get_or_clone-- proves: apm install of a SHA-pinned dep passes the fetch-into-bare closure to SharedCloneCache; a branch ref passes fetch_fn=None [devx,governed-by-policy] -
[recommended]
checkout_targetbug-fix regression trapped at unit tier only; integration-tier gap
TestMaterializeCheckoutTarget(2 tests) traps the regression at unit tier. Floor tier formaterialize_from_baresurface is integration-with-fixtures. No test inTestMaterializeFromBare(real subprocess) asserts the materialized HEAD equalsknown_sha. Greppedtests/integration/for 'known_sha' and 'checkout_target': zero hits.
Suggested: ExtendTestMaterializeFromBarewith a real-bare test assertinggit -C consumer_dir rev-parse HEAD == known_sha.
Proof (missing):tests/unit/deps/test_shared_clone_cache.py::test_materialize_from_bare_known_sha_checked_out-- proves: apm install of a SHA-pinned dep materializes the correct commit, not the branch HEAD [devx,governed-by-policy] -
[nit] No concurrent-access test specifically for the new
_find_repo_bare()reverse-index lookup
_repo_baresis populated inside a different critical section than the main lock, so a race between_find_repo_bareand population for a just-completed first clone is theoretically possible. Existing concurrency tests do not cover this interleaving.
Suggested: Optional follow-up: add a thread-interleaving test.
Proof (missing):tests/unit/deps/test_shared_clone_cache.py::test_find_repo_bare_concurrent_population_race-- proves: two threads installing SHA-pinned deps from the same repo do not race on the _repo_bares index [devx]
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Generated by PR Review Panel for issue #1259 · ● 3.5M · ◷
- Scrub FETCH_HEAD after each fetch in fetch_sha_into_bare to prevent token-embedded URL persisting on disk (security finding) - Add per-bare-path lock (_bare_fetch_locks) in SharedCloneCache to serialise concurrent Tier-0 fetches into the same bare (correctness) - Cap broad fetch with --depth=50 (APM_BROAD_FETCH_DEPTH override) to prevent unbounded history download on adversarial repos (security) - Hoist _log to module level in bare_cache.py (was re-created per call) - Include exc.stderr in CalledProcessError debug messages; drop exc_info=True on expected fallback paths to avoid spurious tracebacks in verbose mode - Drop caller-guidance suffix from exhausted-attempts debug message - Add info-level log before broad fetch (visible without -v) - Update _repo_bares after Tier-0 success so the index stays current - Change Tier-0 exception handler to info level; drop exc_info=True - Rename _shared_bare_fetch_fn param sha->ref_or_sha for clarity - Rewrite CHANGELOG entry for #1258 with concrete symptom wording - Add docs sentence about on-demand SHA hydration in Parallel Downloads - Strengthen test_fetch_action_uses_explicit_url_not_origin: assert no 'origin' shorthand in any fetch argv - Add test_fetch_fn_none_skips_tier0: verifies Tier-0 is bypassed when fetch_fn=None even with populated repo_bares - Add test_materialize_known_sha_checks_out_correct_commit: real-git integration test verifying known_sha checks out the right commit Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…1267) (#1268) * fix(deps): use --git-dir for bare repos + pin fetched SHAs as refs (#1267) Two bugs discovered during E2E validation of #1259: 1. git -C <bare> fails on git 2.53.0 (safe.bareRepository=explicit default). Switch all 8 bare-repo subprocess calls to --git-dir. 2. fetch_sha_into_bare inserts SHAs into the object store without a ref, so git clone --local --shared (which uses upload-pack) silently omits them. Pin each fetched SHA as refs/heads/apm-pin-<sha12>. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(deps): address review feedback on bare-cache fixes (#1268) - Guard _pin_sha_as_head_ref log behind returncode check (C1/P1) - Use target param instead of bare_path in fetch closures (C2/C3) - Reword docstring to use sha-prefix (C4) - Add env= omission comments for local-only git plumbing (P2/P5) - Add hex validation guard for SHA in pin ref creation (P3) - Reframe CHANGELOG to lead with user outcome (P4/P6) - Add integration test for fetch_sha_into_bare with real git (P7) - Add pin_argv assertions to shallow/broad fetch unit tests (P8) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(tests): address review feedback on integration test + CHANGELOG Test file: - Replace em dash with ASCII -- in comment (encoding contract) - Use Path.as_uri() instead of f-string for cross-platform file:// URL - Add type hints to _make_execute, inner execute_transport_plan, and counting_execute (Callable[..., None] / DependencyReference / Any) - Reword test docstring from <sha12> to <sha-prefix> to match the production docstring contract CHANGELOG: - Remove duplicate entry from already-released [0.13.0] section; the fix lands under [Unreleased] which is the only valid target per changelog.instructions.md - Adopt the user-outcome-first wording in the [Unreleased] entry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Sergio Sisternes <sergio.sisternes@epam.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
fix(deps): fetch missing SHA-pinned commits into shallow bare clones
TL;DR
When a mono-repo package declares a transitive dependency pinned to a SHA that is not HEAD,
apm installfails because the depth=1 bare clone lacks that commit. This PR adds a Tier-0 fetch path inSharedCloneCachethat reuses an existing bare for the same repository and fetches the missing SHA into it on demand — turning a hard failure into a single lightweightgit fetch --depth=1 <url> <sha>.Note
Closes #1258. No breaking changes — the new path is additive and only activates when
rev-parse --verifyfails for a SHA ref.Problem (WHY)
apm installcreates depth=1 bare clones (--depth=1). Only HEAD exists in the object store; any SHA-pinned transitive dependency referencing a commit other than HEAD fails withrev-parse --verify <sha>^{commit}exit code 128.materialize_from_bare()always checks outHEADregardless of theknown_shaparameter, so even if the correct commit were fetched, the wrong files would be served when reusing a bare whose HEAD points elsewhere.(host, owner, repo, ref)cache key) instead of reusing a single bare — wasteful in both disk and network I/O.Why these matter: the depth=1 strategy is correct for the common case (direct deps resolve to HEAD via the lockfile), but SHA-pinned transitive deps are a valid and documented use case. GitHub's
uploadpack.allowReachableSHA1InWantenablesgit fetch --depth=1 <url> <sha>— APM just never attempted it.Approach (WHAT)
fetch_sha_into_bare()— a 3-step strategy (rev-parse check → shallow SHA fetch → broaden shallow) that hydrates an existing bare with a missing commit._repo_bares) toSharedCloneCacheso it can find an existing bare for the same(host, owner, repo)regardless of ref.fetch_fnparameter throughget_or_clone()→ tried as Tier-0 before falling through to a fresh clone.materialize_from_bare()to check outknown_shawhen provided instead of hardcodedHEAD.Implementation (HOW)
src/apm_cli/deps/bare_cache.py— Newfetch_sha_into_bare()free function (160 lines). Uses explicit URLs from the transport plan, nevergit fetch origin(remote URL isredacted://after_scrub_bare_remote_url). Short SHAs (len < 40) skip step 2 (Git protocol requires full SHA forfetch --depth=1). Does NOT callgit update-ref HEAD— consumer handles resolution viaknown_sha. Also fixesmaterialize_from_bare()checkout target:checkout_target = known_sha or \"HEAD\"(permalink).src/apm_cli/deps/shared_clone_cache.py— New_repo_baresdict mapping(host, owner, repo)→list[(ref, Path)]. New_find_repo_bare()method.get_or_clone()accepts optionalfetch_fnand runs Tier-0 before the fresh-clone path. Reverse index populated after successful clone, cleared oncleanup().src/apm_cli/deps/github_downloader.py— New_fetch_sha_into_bare()delegate method (thin wrapper). Builds_shared_bare_fetch_fnclosure and passes it toget_or_clone()only whenis_commit_shais truthy — non-SHA refs (branches, tags) bypass Tier-0 entirely.tests/unit/deps/test_shared_clone_cache.py— 13 new tests across 3 test classes (see Scenario Evidence below).CHANGELOG.md— Entry under## [Unreleased]/### Fixed.Diagrams
Legend: the Tier-0 fetch path inserted into
SharedCloneCache.get_or_clone()— shows how a second SHA-pinned dep for the same repo reuses the existing bare instead of cloning fresh.flowchart TD A[\"get_or_clone(host, owner, repo, sha2, clone_fn, fetch_fn)\"] --> B{Cache hit?} B -->|Yes| C[Return cached path] B -->|No| D{fetch_fn provided?} D -->|No| G[\"Fresh clone (existing path)\"] D -->|Yes| E[\"_find_repo_bare(host, owner, repo)\"] E -->|None| G E -->|bare_path| F[\"fetch_fn(bare_path, sha2)\"] F -->|True| H[Cache bare_path, return] F -->|False| G G --> I[Clone into new temp dir] I --> J[Register in _repo_bares] J --> K[Return clone_path] classDef new stroke-dasharray: 5 5; class D,E,F,H new;Trade-offs
clone.depth. This is orthogonal to the bug and adds configuration surface without clear demand. Deferred to a separate issue if needed.fetch --depth=1 <url> <sha>. Short SHAs fall through to step 3 (broad fetch), which is slower but correct.fetch_fnexceptions are swallowed. If the fetch raises, the cache falls through to a fresh clone rather than propagating. This keeps the happy path (no SHA pins) completely unaffected and avoids breaking installs due to transient fetch failures.update-ref HEADafter fetch. The bare's HEAD remains unchanged;materialize_from_bareusesknown_shadirectly. This avoids race conditions when multiple threads share the same bare.Benefits
apm installsucceeds for mono-repo packages with SHA-pinned transitive dependencies that reference commits other than HEAD — previously a hard failure.fetch_fnis only passed whenis_commit_shais truthy; non-SHA deps follow the existing clone path unchanged.materialize_from_barenow checks out the correct commit whenknown_shais provided, fixing a latent bug where reusing a bare with a different HEAD would serve wrong files.Validation
Lint verification (ruff check + format)
Full pytest output (38 tests)
Scenario Evidence
apm installon a mono-repo with SHA-pinned transitive deps succeeds when the pinned SHA is not HEADtests/unit/deps/test_shared_clone_cache.py::TestFetchShaIntoBare::test_shallow_fetch_full_sha_succeeds(regression-trap for #1258)tests/unit/deps/test_shared_clone_cache.py::TestFetchShaIntoBare::test_sha_already_present_returns_true_without_fetchtests/unit/deps/test_shared_clone_cache.py::TestFetchShaIntoBare::test_short_sha_skips_shallow_fetch_goes_to_broadtests/unit/deps/test_shared_clone_cache.py::TestSharedCloneCacheRepoReuse::test_fetch_fn_failure_falls_through_to_fresh_clonetests/unit/deps/test_shared_clone_cache.py::TestSharedCloneCacheRepoReuse::test_fetch_fn_exception_falls_through_to_fresh_clonetests/unit/deps/test_shared_clone_cache.py::TestSharedCloneCacheRepoReuse::test_fetch_fn_reuses_existing_bare_for_different_shamaterialize_from_barechecks outknown_shawhen provided, not HEADtests/unit/deps/test_shared_clone_cache.py::TestMaterializeCheckoutTarget::test_checkout_uses_known_sha_when_providedorigin(which is redacted)tests/unit/deps/test_shared_clone_cache.py::TestFetchShaIntoBare::test_fetch_action_uses_explicit_url_not_originHow to test
uv run --extra dev pytest tests/unit/deps/test_shared_clone_cache.py -x -v— all 38 tests pass (25 existing + 13 new).uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/— lint clean, no diagnostics.fetch_sha_into_bare()— verify that step 2 is guarded bylen(sha) == 40and that all fetch actions use the URL from the transport plan.materialize_from_bare()usescheckout_target = known_sha or \"HEAD\"at bare_cache.py L551.Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com