feat(topology): recover codebase statistics per sub-package off the single scc walk #121

Merged: maudlin merged 1 commit into main from 78-perpackage-stats on Jun 24, 2026

@maudlin (Owner) commented on Jun 24, 2026

What

On an undeclared fan-out, `codebase-stats` ran once over the whole tree, so every sub-package's totals were merged into a single unlabelled record. This PR recovers the statistics per assessment root: it slices the first-party keep-set to each root's subtree (`scc_keep_for_root`) and re-aggregates the one cached `scc --by-file` walk. The walk is reused, never repeated, so there is no extra scc cost. Records are namespaced via `SLUG_NS` (`api/codebase-stats`), and the console table is labelled per package (📦 api/).
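
The slicing step can be pictured with a minimal sketch. This is not the project's `scc_keep_for_root`; the function name `slice_keep_set` is hypothetical, and it assumes the keep-set is a newline-delimited file of TARGET-relative paths, which this PR does not show.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the project's scc_keep_for_root.
# Assumption: the keep-set is a newline-delimited file of TARGET-relative paths.
slice_keep_set() {
  local keep_file=$1 root=$2
  root=${root#./}   # normalise a leading "./"
  root=${root%/}    # drop a trailing "/"
  if [ -z "$root" ] || [ "$root" = "." ]; then
    cat "$keep_file"   # root "." is the identity: the whole keep-set
    return
  fi
  # Keep only paths under "<root>/". index() is a literal prefix test,
  # so regex metacharacters in the root name are harmless, and the
  # trailing slash stops "api" from matching "apiary/...".
  awk -v prefix="$root/" 'index($0, prefix) == 1' "$keep_file"
}
```

Called as `slice_keep_set keep.txt ./api`, the sketch prints only the lines of `keep.txt` that begin with `api/`; called with `.`, it prints the file unchanged.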

This is #78 Phase 2b increment 2, first slice: the scc-measured arm. The engine-routed `complexity` and `duplication` arms still run whole-tree. Recovering them requires re-routing the engine (eslint / lizard / jscpd) per sub-package, which is a larger, separable change and is tracked as a follow-up (see issue comment).

Why this is safe

`scc_keep_for_root "."` returns the full keep-set path unchanged, and the loop runs once with `SLUG_NS` empty, so a single package or declared workspace is byte-identical to before (the acceptance gate). Verified end-to-end: every parsed record on a single-package fixture is identical between main and this branch, modulo the absolute `CHECKUP_OUT_DIR` path embedded in one message string.
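
A minimal sketch of why an empty namespace leaves slugs untouched. The helpers `ns_for_root`, `ns_slug` and `emit_slugs` are hypothetical, and the sketch assumes `SLUG_NS` holds the package name without a trailing slash; the variable's real format is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not the project's code.
# Assumption: SLUG_NS is the package name without a trailing slash.

# Derive a namespace from an assessment root; "." maps to the empty namespace.
ns_for_root() {
  local root=${1#./}
  root=${root%/}
  [ "$root" = "." ] && root=""
  printf '%s\n' "$root"
}

# Prefix a slug with "$SLUG_NS/" only when SLUG_NS is non-empty.
ns_slug() {
  printf '%s\n' "${SLUG_NS:+$SLUG_NS/}$1"
}

# Print the codebase-stats slug for each assessment root given as an argument.
emit_slugs() {
  local root SLUG_NS
  for root in "$@"; do
    SLUG_NS=$(ns_for_root "$root")
    ns_slug codebase-stats
  done
}
```

With the single root `.`, `emit_slugs` prints the bare `codebase-stats` slug, which is the unchanged single-package case; with roots `api` and `web` it prints `api/codebase-stats` and `web/codebase-stats`.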

Verification

  • `test/scc-inventory.test.sh`: added `scc_keep_for_root` unit cases (slice correctness, `.` identity, `./`-prefix normalisation, nested-root filename flattening, breakdown round-trip); 39 passed.
  • Full CI suite green locally (12 suites).
  • End-to-end smoke on a Go/JS… (node) undeclared fan-out: undeclared-fan-out → `api/codebase-stats` + `web/codebase-stats`, labelled.
  • End-to-end byte-identical diff on a single-package fixture.

Refs #78

🤖 Generated with Claude Code

…ingle scc walk (#78)

On an undeclared fan-out, `codebase-stats` ran once over the whole tree, so each
sub-package's totals were merged into one unlabelled record. Recover it per
assessment root by slicing the first-party keep-set to each root's subtree
(`scc_keep_for_root`) and re-aggregating the ONE cached `scc --by-file` walk:
reuse, never re-walk, so no extra scc cost. Records are namespaced via `SLUG_NS`
(`api/codebase-stats`) and the console table is labelled per package.

`scc_keep_for_root "."` returns the full keep-set path unchanged, and the loop
runs once with `SLUG_NS` empty, so a single package / declared workspace is
byte-identical to before (verified end-to-end: all parsed records unchanged).

This is #78 Phase 2b increment 2, first slice. The engine-routed `complexity`
and `duplication` arms still run whole-tree — recovering them needs the engine
(eslint / lizard / jscpd) re-routed per sub-package, tracked as a follow-up.

Refs #78

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw
@maudlin merged commit 0e56115 into main on Jun 24, 2026
6 checks passed
@maudlin deleted the 78-perpackage-stats branch on June 24, 2026, 21:51
maudlin added a commit that referenced this pull request Jun 25, 2026
…ine routing (#78) (#123)

The last whole-tree measurement arm. On an undeclared fan-out, complexity ran
once over the whole tree, routed by whole-tree engine selection. Recover it per
assessment root with the FULL engine ladder re-routed per child
(route_complexity_child, a scoped mirror of the whole-tree routing): ESLint on
the JS/TS slice using the child's OWN flat config + local bin, lizard on the
non-JS slice (inventory sliced to the subtree), scc fallback (keep-set sliced).
Findings are namespaced (backend/complexity) and labelled per package.

The single git-hotspots CSV is accumulated across packages — truncated once
before the loop, appended per arm, always in TARGET-relative (namespaced) paths
(scc/lizard/eslint findings re-prefixed; the standalone-lizard CSV's file column
namespaced via awk) — so the churn × complexity join stays whole-tree. If nothing
is measured the (empty) CSV is dropped, matching the pre-#78 absent state.

Single package / declared workspace → one iteration at "." reusing the whole-tree
routing, no cd, SLUG_NS empty, $PWD == $TARGET → byte-identical to before, CSV
included (verified end-to-end on the lizard, scc and merged arms + full
parsed-set diff). A fan-out routes each child to its own engine and accumulates a
namespaced CSV.

With stats (#121), duplication (#122) and now complexity, every measurement arm
recovers per sub-package — #78 is complete.

Closes #78


Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw

Co-authored-by: Mark Ridley <210189+maudlin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>