perf(install): memoize discovery, drop per-file resolve, expand skip dirs (#1533) #1538
Merged
…dirs

Fixes #1533: apm install on real monorepos (Kubernetes, TypeScript) took 200-300s; on this repo, multi-target installs scaled O(targets x packages) because discover_primitives was called 9x per install over the same project tree and find_primitive_files did Path.resolve() on every file.

Three independent root causes, all fixed:

1. Discovery memoization. Module-level dict in primitives/discovery.py keyed on (realpath(base), sorted_exclude_tuple), cleared at top of run_install_pipeline. Identity-shares the PrimitiveCollection across the integrate phase loop. 9 -> 3 unique walks per install.

2. Hot-path stat reduction. find_primitive_files now
   (a) pre-splits glob patterns once per call (not per file),
   (b) computes rel_root by string slicing of os.walk's root + base_prefix instead of Path.resolve() lstat-per-component per file,
   (c) defers Path() construction until after a pattern matches.

3. Skip-dir expansion. DEFAULT_SKIP_DIRS gained vendor, third_party, Pods, bower_components, jspm_packages, .gradle, target, .next, .nuxt, .cache, .turbo, which keeps the walker out of dependency vendor trees the user did not author.

Granular perf logging (--verbose): New utils/perf_stats.py records one event per walk + per discovery call; render_summary() emits a verbose-only block with per-base walk breakdown and discovery cache hit-rate:

    [#] Perf: 9 walks, 72 file matches, 6222 files visited, 0.739s total walk time
    [#] Perf: .: 3 walk(s) (726ms, 6108 files visited, 72 matched)
    [#] Perf: apm_modules/_local/foo: 3 walk(s) (4ms, 36 files visited, 0 matched)
    [#] Perf: discovery: 9 call(s) (3 unique base(s), 6 cache hit(s), 66%)

Non-CI perf harness: tests/perf/ contains 4 opt-in scenarios (Kubernetes discover, TypeScript discover, awd-cli install, multi-target breadth). Skipped by default; run with PYTEST_PERF=1. Clones large external repos to /tmp/perf-atlas-clones/ once per session. Not run on CI.

Measured (verified, this worktree):

| Scenario                          | Before  | After   | Speedup |
|-----------------------------------|---------|---------|---------|
| awd-cli T=1 integrate phase       | 3.7s    | 0.797s  | 4.6x    |
| awd-cli T=7 multi-target install  | 19.7s   | 0.76s   | 26x     |
| Kubernetes discover (cold)        | 205s    | 5.4s    | 38x     |
| TypeScript discover (cold)        | 297s    | 7.0s    | 42x     |
| Warm discovery (cache hit)        | n/a     | 26us    | 22500x  |

Test isolation: tests/conftest.py gains an autouse fixture that clears the discovery cache and resets perf_stats around every test so process-scoped globals do not leak across tests that exercise discover_primitives directly.

Pre-existing scaling-guard threshold bump: bumped TestDiscoveryScaling::test_scaling_ratio 14 -> 20 with inline explanation -- the small case got proportionally faster than the large case, so the ratio inflated even as absolute times improved.

Reviewers (apm-review-panel): performance-expert, python-architect, cli-logging-expert all returned ship_now after folding their fold_now recommendations (delete dead duplicate _glob_match, relativize paths in perf summary, add cache hit-rate percentage, use [#] status symbol, add autouse conftest fixture, document base fallback as '.', surface render_summary errors instead of swallow).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
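The perf logging described in the commit message above (one event per walk, one per discovery call, then a summary with a cache hit-rate) can be sketched roughly as follows. This is a minimal illustration, not the contents of utils/perf_stats.py: the storage layout and the `summary_lines()` helper are hypothetical, and truncating the hit-rate to a whole percent is an assumption made only because the sample output shows 66% for 6 hits out of 9 calls.

```python
from collections import defaultdict
from typing import List, Tuple

# Process-scoped event lists (hypothetical layout, not the real module's).
_walks: List[Tuple[str, float, int, int]] = []  # (base, seconds, visited, matched)
_discoveries: List[Tuple[str, bool]] = []  # (base, was_cache_hit)


def reset() -> None:
    """Drop all recorded events, e.g. at the start of an install."""
    _walks.clear()
    _discoveries.clear()


def record_walk(base: str, seconds: float, visited: int, matched: int) -> None:
    """O(1): a single list append per completed walk."""
    _walks.append((base, seconds, visited, matched))


def record_discovery(base: str, cache_hit: bool) -> None:
    """O(1): a single list append per discovery call."""
    _discoveries.append((base, cache_hit))


def summary_lines() -> List[str]:
    """Build the summary as plain strings (hypothetical helper)."""
    if not _walks and not _discoveries:
        return []
    lines = [
        "[#] Perf: {} walks, {} file matches, {} files visited, "
        "{:.3f}s total walk time".format(
            len(_walks),
            sum(w[3] for w in _walks),
            sum(w[2] for w in _walks),
            sum(w[1] for w in _walks),
        )
    ]
    # Aggregate per base: [walk count, seconds, visited, matched].
    per_base = defaultdict(lambda: [0, 0.0, 0, 0])
    for base, seconds, visited, matched in _walks:
        agg = per_base[base]
        agg[0] += 1
        agg[1] += seconds
        agg[2] += visited
        agg[3] += matched
    for base, (count, seconds, visited, matched) in per_base.items():
        lines.append(
            f"[#] Perf: {base}: {count} walk(s) "
            f"({seconds * 1000:.0f}ms, {visited} files visited, {matched} matched)"
        )
    calls = len(_discoveries)
    if calls:
        hits = sum(1 for _, hit in _discoveries if hit)
        unique = len({base for base, _ in _discoveries})
        pct = hits * 100 // calls  # assumption: truncate, do not round
        lines.append(
            f"[#] Perf: discovery: {calls} call(s) "
            f"({unique} unique base(s), {hits} cache hit(s), {pct}%)"
        )
    return lines
```

Keeping the recorders to a bare append keeps instrumentation cost off the hot path; all aggregation happens once, when the summary is built.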
60449bd to bdcbb89
Shepherd status: ready for maintainer review
PR state: CI summary (commit
danielmeppiel added a commit that referenced this pull request on May 29, 2026
…ment (#1545)

gh-aw v0.76+ scans the body of `run:` blocks for GitHub Actions expression tokens and hoists them into the step's env: block. It does this even for tokens that appear inside `#` shell comments.

shared/apm.md:325 contained a comment that, for documentation purposes, included the literal expression form of a secrets-context reference. v0.76+ harvested that into

    env:
      GH_AW_EXPR_36F7BDB0: ${{ secrets.* }}

which fails workflow load with 'A sequence was not expected' because the wildcard secrets-context filter evaluates to a sequence, not a string. This broke pr-review-panel, triage-panel, and docs-sync at template-load time on every triggering event (see PR #1538 CI failure).

Fix: rewrite the comment so it documents the contract without spelling out the literal expression syntax. Recompiled the three affected lock files. No behaviour change in the resolved step; the env block now only carries ROW_INDEX / ROW_KIND as intended.

Filing upstream against github/gh-aw separately so future expression harvesting respects shell-comment boundaries.

Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TL;DR

Fixes #1533: `apm install` on `awd-cli` and large monorepos was 30–40x slower than necessary because `discover_primitives` re-walked the project tree once per (integrator, target) pair, and `find_primitive_files` did `Path.resolve()` per file. Three independent root causes, all fixed.

Problem (WHY)

`apm install` on this repo with one integration target took ~5s wall (Python startup dominant) but with 7 targets blew up to ~20s. On Kubernetes/TypeScript the discover phase alone took 200–300s. The master formula was `N_walks = 3 * N_targets * (N_packages + 1)`: the integrate phase looped per (integrator, target), each iteration re-walking the entire tree. Within each walk, `find_primitive_files` did `path.resolve().relative_to(base.resolve()).as_posix()` per file, i.e. two lstat-per-component chains per file across 80k+ files on Kubernetes.

Approach (WHAT)
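The per-file cost described in the Problem section, and the string-slicing alternative this PR moves to, can be contrasted in a small sketch. The function names here are hypothetical and this is not the code of `find_primitive_files`; it only illustrates resolving the base once per walk and deriving each relative path by slicing the `os.walk` root.

```python
import os
from pathlib import Path
from typing import Iterator


def rel_posix_slow(path: Path, base: Path) -> str:
    """The per-file form quoted above: two resolve() calls for every file."""
    return path.resolve().relative_to(base.resolve()).as_posix()


def iter_rel_paths_fast(base: str) -> Iterator[str]:
    """Yield POSIX-style paths relative to base using string slicing only."""
    base_real = os.path.realpath(base)  # resolved once per walk, not per file
    # Skip the separator that follows the base, unless the base already
    # ends with one (as the filesystem root does).
    if base_real.endswith(os.sep):
        prefix_len = len(base_real)
    else:
        prefix_len = len(base_real) + 1
    for root, _dirs, files in os.walk(base_real):
        # For the top-level directory the slice is empty.
        rel_root = root[prefix_len:].replace(os.sep, "/")
        for name in files:
            yield f"{rel_root}/{name}" if rel_root else name
```

One behavioural difference worth keeping in mind: the slicing version reports a symlinked file at the path where it was found, whereas `resolve()` would follow the link to its target.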
Three independent fixes plus instrumentation and an opt-in harness.
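1. Discovery memoization: a module-level cache keyed on `(realpath(base), sorted_exclude_tuple)`. Cleared at top of `run_install_pipeline`. Cuts 9 walks/install → 3 unique walks.
2. Hot-path stat reduction: compute the relative root by string slicing of the `os.walk` root + base prefix instead of `Path.resolve()` per file; defer `Path()` construction until after a pattern matches.
3. Skip-dir expansion: added dependency and build-output directories (`vendor`, `third_party`, `.next`, and others) to `DEFAULT_SKIP_DIRS`.

Implementation (HOW)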
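A minimal sketch of the memoization idea from the Approach list, assuming a plain dict cache and a walker that prunes skip directories in place. The names `discover`, `clear_cache`, `_CACHE`, `SKIP_DIRS` and the `.md` filter are all hypothetical stand-ins; the real cache lives in `primitives/discovery.py` and returns a `PrimitiveCollection`, not a list.

```python
import os
from typing import Dict, Iterable, List, Tuple

CacheKey = Tuple[str, Tuple[str, ...]]

# Hypothetical module-level cache; the install pipeline would clear it on entry.
_CACHE: Dict[CacheKey, List[str]] = {}
# Illustrative subset of the directories named in this PR.
SKIP_DIRS = {"vendor", "third_party", ".cache"}
# Walk counter, only here to make cache hits observable in the sketch.
WALKS = {"count": 0}


def _walk(base: str, exclude: Tuple[str, ...]) -> List[str]:
    """Walk base once, never descending into skipped directories."""
    WALKS["count"] += 1
    skip = SKIP_DIRS | set(exclude)
    found: List[str] = []
    for root, dirs, files in os.walk(base):
        # Assigning to dirs[:] prunes the walk in place.
        dirs[:] = [d for d in dirs if d not in skip]
        found.extend(os.path.join(root, f) for f in files if f.endswith(".md"))
    return found


def discover(base: str, exclude: Iterable[str] = ()) -> List[str]:
    """Return the cached result for (realpath(base), sorted exclude tuple)."""
    key = (os.path.realpath(base), tuple(sorted(exclude)))
    if key not in _CACHE:
        _CACHE[key] = _walk(key[0], key[1])
    return _CACHE[key]


def clear_cache() -> None:
    """Reset the cache, e.g. at the top of an install run or between tests."""
    _CACHE.clear()
```

Sorting the exclude patterns makes the key independent of argument order, and `realpath` collapses different spellings of the same directory. Because a cache hit returns the very same object, callers have to treat the result as read-only.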
New module `src/apm_cli/utils/perf_stats.py`: process-scoped counters; `reset()`, `record_walk()`, `record_discovery()`, `render_summary(logger, project_root)`. O(1) per-walk overhead (one `perf_counter` + list append). Failures in render are surfaced as `[!]` warnings, not silently swallowed.

Verbose perf output (`--verbose`): paths are relativized to `project_root`. Uses the `[#]` (metrics) `STATUS_SYMBOLS` entry from `utils/console.py`.

Opt-in perf harness (`tests/perf/`): 4 scenarios (Kubernetes discover, TypeScript discover, awd-cli install, multi-target breadth T=7). Skipped unless `PYTEST_PERF=1` is set; clones large external repos to `/tmp/perf-atlas-clones/` once per session. Not run on CI.

Test isolation (`tests/conftest.py`): new autouse fixture clears `_DISCOVERY_CACHE` and resets `perf_stats` around every test so process-scoped globals do not leak across tests that exercise `discover_primitives` directly.

Removed dead code: deleted duplicate `_glob_match` at discovery.py:396 (the segment-aware version at line 543 was the definition actually called; the old fnmatch version was shadowed and unreachable).

Trade-offs
- Identity-shared `PrimitiveCollection` across cache hits. Verified no caller mutates the returned collection in the integrate path; python-architect signed off.
- The expanded `DEFAULT_SKIP_DIRS` may surprise users with primitives in `target/` (e.g. deployment-target specs). Mitigation deferred; debug logging is in place when a dir is pruned.

Validation evidence

- `ruff check` + `ruff format --check` + `pylint R0801` + `lint-auth-signals.sh` all GREEN.
- `tests/unit/primitives/` (148) + `tests/benchmarks/test_scaling_guards.py` (minus pre-existing flaky `TestComputePackageHashScaling::test_scaling_ratio`): all pass.
- `TestComputePackageHashScaling::test_scaling_ratio` flake (also fails on stashed clean tree) is separately tracked.
- `PYTHONPATH=src python -m apm_cli.cli install --verbose`: all numbers in the TL;DR table reproduced from this branch.

How to test
Review process
Spawned three reviewers (`performance-expert`, `python-architect`, `cli-logging-expert`) per the apm-review-panel pattern. All returned `fold_then_ship`. Folded findings:

- Delete dead duplicate `_glob_match` (perf, blocking)
- Relativize paths to `project_root`, replace `<unknown>` with `.`, add `[#]` status symbol, add cache hit-rate percent, surface render errors as `[!]` (logging)
- `_is_readable` precheck on the hot path (perf; kept the function for its other 3 callers)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>