feat(topology): recover codebase statistics per sub-package off the single scc walk by maudlin · Pull Request #121 · maudlin/checkup

maudlin · 2026-06-24T21:43:06Z

What

On an undeclared fan-out, codebase-stats ran once over the whole tree, so every sub-package's totals were merged into a single unlabelled record. This PR recovers the stats per assessment root: it slices the first-party keep-set to each root's subtree (scc_keep_for_root) and re-aggregates the one cached scc --by-file walk. The walk is reused, never repeated, so there is no extra scc cost. Records are namespaced via SLUG_NS (api/codebase-stats), and the console table is labelled per package (📦 api/).
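A minimal sketch of the slice-and-re-aggregate idea, not the code in this branch. It assumes the keep-set is a newline-delimited file of TARGET-relative paths and that the cached walk has been reduced to `path<TAB>code-lines` rows; both layouts, and the helper names `slice_keep_set` and `stats_for_root`, are hypothetical.

```shell
# Hypothetical sketch: file layouts and helper names are assumptions.

# Print the keep-set entries (file $1, one TARGET-relative path per line)
# that live under root $2. A leading "./" or trailing "/" on the root is dropped.
slice_keep_set() {
  local keep_file="$1"
  local root="${2#./}"
  root="${root%/}"
  if [ -z "$root" ] || [ "$root" = "." ]; then
    cat "$keep_file"                       # whole tree: nothing to slice
  else
    # Match on "root/" so that "api" does not also capture "api-v2/...".
    awk -v r="$root/" 'index($0, r) == 1' "$keep_file"
  fi
}

# Re-aggregate a cached per-file table ($2, "path<TAB>code-lines") for one
# root ($3), counting only files in that root's slice of the keep-set ($1).
# The table is only read, so the tree is never walked again.
stats_for_root() {
  local keep_file="$1"
  local byfile="$2"
  local root="$3"
  slice_keep_set "$keep_file" "$root" |
    awk -v byfile="$byfile" '
      { keep[$0] = 1 }
      END {
        while ((getline line < byfile) > 0) {
          split(line, f, "\t")
          if (f[1] in keep) { files++; code += f[2] }
        }
        printf "%d files, %d code lines\n", files, code
      }'
}
```

Calling `stats_for_root keep.txt byfile.tsv api` and then `stats_for_root keep.txt byfile.tsv web` reads the same cached table twice but only counts the rows inside each root's slice; files outside the keep-set (for example vendored code) are ignored in both.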

This is #78 Phase 2b increment 2, first slice: the scc-measured arm. The engine-routed complexity and duplication arms still run whole-tree; recovering them requires re-routing the engine (eslint / lizard / jscpd) per sub-package, which is a larger, separable change tracked as a follow-up (see issue comment).

Why this is safe

scc_keep_for_root "." returns the full keep-set path unchanged, and the loop runs once with SLUG_NS empty, so a single package or declared workspace is byte-identical to before (the acceptance gate). Verified end-to-end: every parsed record on a single-package fixture is identical between main and this branch, apart from the absolute CHECKUP_OUT_DIR path embedded in one message string.
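The identity case can be illustrated with a small sketch of how a record slug might be built per root. `SLUG_NS` is the variable named above, but how it is populated is not shown in this PR text; the `slug_for_root` helper and the "root name becomes the namespace" rule below are assumptions for illustration only.

```shell
# Hypothetical sketch: slug_for_root and its namespace rule are assumptions.

# Build the record slug for assessment root $1 and base slug $2.
# For root "." the namespace stays empty, so the slug is unchanged.
slug_for_root() {
  local root="${1#./}"
  local base="$2"
  local ns=""
  root="${root%/}"
  if [ -n "$root" ] && [ "$root" != "." ]; then
    ns="$root"
  fi
  if [ -z "$ns" ]; then
    printf '%s\n' "$base"             # single package: same slug as before
  else
    printf '%s/%s\n' "$ns" "$base"    # fan-out child: namespaced slug
  fi
}
```

With this rule, root `.` yields `codebase-stats` exactly as on main, while roots `api` and `web` yield `api/codebase-stats` and `web/codebase-stats`.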

Verification

test/scc-inventory.test.sh — added scc_keep_for_root unit cases (slice correctness, . identity, ./-prefix normalisation, nested-root filename flattening, breakdown round-trip): 39 passed.
Full CI suite green locally (12 suites).
End-to-end smoke on a Go/JS… (node) undeclared fan-out: undeclared-fan-out → api/codebase-stats + web/codebase-stats, labelled.
End-to-end byte-identical diff on a single-package fixture.

Refs #78

🤖 Generated with Claude Code

…ingle scc walk (#78)

On an undeclared fan-out, `codebase-stats` ran once over the whole tree, so each sub-package's totals were mushed into one record and unlabelled. Recover it per assessment root by slicing the first-party keep-set to each root's subtree (`scc_keep_for_root`) and re-aggregating the ONE cached `scc --by-file` walk: reuse, never re-walk, so no extra scc cost. Records are namespaced via `SLUG_NS` (`api/codebase-stats`) and the console table is labelled per package.

`scc_keep_for_root "."` returns the full keep-set path unchanged, and the loop runs once with `SLUG_NS` empty, so a single package / declared workspace is byte-identical to before (verified end-to-end: all parsed records unchanged).

This is #78 Phase 2b increment 2, first slice. The engine-routed `complexity` and `duplication` arms still run whole-tree; recovering them needs the engine (eslint / lizard / jscpd) re-routed per sub-package, tracked as a follow-up.

Refs #78

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw

…ine routing (#78) (#123)

The last whole-tree measurement arm. On an undeclared fan-out, complexity ran once over the whole tree, routed by whole-tree engine selection. Recover it per assessment root with the FULL engine ladder re-routed per child (route_complexity_child, a scoped mirror of the whole-tree routing): ESLint on the JS/TS slice using the child's OWN flat config + local bin, lizard on the non-JS slice (inventory sliced to the subtree), scc fallback (keep-set sliced). Findings are namespaced (backend/complexity) and labelled per package.

The single git-hotspots CSV is accumulated across packages: truncated once before the loop, appended per arm, always in TARGET-relative (namespaced) paths (scc/lizard/eslint findings re-prefixed; the standalone-lizard CSV's file column namespaced via awk), so the churn × complexity join stays whole-tree. If nothing is measured the (empty) CSV is dropped, matching the pre-#78 absent state.

Single package / declared workspace → one iteration at "." reusing the whole-tree routing, no cd, SLUG_NS empty, $PWD == $TARGET → byte-identical to before, CSV included (verified end-to-end on the lizard, scc and merged arms + full parsed-set diff). A fan-out routes each child to its own engine and accumulates a namespaced CSV.

With stats (#121), duplication (#122) and now complexity, every measurement arm recovers per sub-package; #78 is complete.

Closes #78

Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw
Co-authored-by: Mark Ridley <210189+maudlin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
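The accumulate-and-namespace pattern described in that commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #123: the CSV is assumed to be headerless `file,score` rows with no quoted commas, and `namespace_csv`, `accumulate_hotspots` and the `namespace:csv-path` argument format are hypothetical.

```shell
# Hypothetical sketch: CSV layout, helper names and argument format are assumptions.

# Prefix the first (file) column of a headerless CSV on stdin with namespace $1.
# An empty namespace leaves the rows untouched (the single-package case).
namespace_csv() {
  awk -F',' -v OFS=',' -v ns="$1" '{ if (ns != "") $1 = ns "/" $1; print }'
}

# Accumulate per-package CSVs into one whole-tree CSV at $1.
# Remaining arguments are "namespace:csv-path" pairs.
accumulate_hotspots() {
  local out="$1"
  shift
  : > "$out"                          # truncate once, before the loop
  local pair ns csv
  for pair in "$@"; do
    ns="${pair%%:*}"
    csv="${pair#*:}"
    namespace_csv "$ns" < "$csv" >> "$out"   # append per package
  done
  [ -s "$out" ] || rm -f "$out"       # nothing measured: drop the empty CSV
}
```

Because every row ends up with a TARGET-relative path, a later join against whole-tree churn data can match on the file column without knowing which package produced the row.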

maudlin mentioned this pull request Jun 24, 2026

Scan root/scope is a hypothesis: detect topology, judge it, recover (assess where first-party code actually lives) #78

Closed

maudlin merged commit 0e56115 into main Jun 24, 2026
6 checks passed

maudlin deleted the 78-perpackage-stats branch June 24, 2026 21:51

maudlin mentioned this pull request Jun 25, 2026

feat(topology): recover complexity per sub-package with per-child engine routing #123

Merged

maudlin merged 1 commit into main from 78-perpackage-stats
