chore(review): codebase review findings + ESLint/typedoc guardrails#75
Conversation
Consolidates a multi-agent review of the 2026-04-24 codebase into `docs/plans/2026-04-24-codebase-review-findings.md` and lands Track A (guardrails) live so the patterns the project relied on as convention are now enforced by CI. Track A — lands in this PR - ESLint: architectural bans (default exports, enums, cross-layer peer-dep imports in core) via `no-restricted-syntax` + `no-restricted-imports`. - ESLint: LOC / complexity caps (`max-lines`, `max-lines-per-function`, `complexity`, `max-depth`, `max-params`, `max-nested-callbacks`). Thresholds chosen so current code passes clean at error-level. - ESLint: quality defaults for agentic contributions (`no-console`, `eqeqeq`, `no-duplicate-imports`, `switch-exhaustiveness-check`, `no-explicit-any`, etc.). - Collapse three `import type` + `import` pairs into single inline type imports (Agent.ts x2, ExcaliburAnimationBridge.ts x1) — required by the new `no-duplicate-imports` rule. - Typedoc: add missing adapter entry points (mistreevous / js-son / tfjs), add docs build as a parallel CI job, add `npm run docs` to the `verify` script so local gate mirrors CI. Output continues to live under `docs/api/` (already gitignored; sits alongside how-to/plans/specs). Lint baseline after this PR: 0 errors, 11 warnings (the warnings are the ratchet targets tracked under Track C of the plan). Tracks B–E (stale docs, complexity ratchet, src micro-findings, tooling gaps) are left as a punch list — each a separate topic branch per CLAUDE.md one-PR-one-branch rule.
Follow-up to the review-findings PR. Three asks from the PR review: (a) tighten max-lines from 1000 to 350; (b) actually improve Agent.ts instead of overriding the cap; (c) remove stale-prone metrics from docs. ESLint - max-lines: 350 (was 1000) at error level. - New rules: require-array-sort-compare (determinism win — `[].sort()` without a comparator uses locale-dependent UTF-16 ordering), prefer-nullish-coalescing, prefer-optional-chain, consistent-type-exports, prefer-readonly (warn), prefer-template, no-useless-concat, no-useless-rename, radix, default-case-last, no-lonely-if. - A handful of `&& X.y` patterns in src/ auto-rewritten to `?.y` by prefer-optional-chain — same behaviour, less noise. - Per-file override: Agent.ts capped at 500 (down from the previous 700) until tick() itself is split. The cap is tighter than a global lift would be and surfaces the legacy outlier in CI. Agent.ts split (-110 effective LOC, 562 → 484) - Extracted Agent.restore() → src/agent/internal/RestoreCoordinator.ts (`runRestore`). - Extracted Agent.die() → src/agent/internal/DeathCoordinator.ts (`runDeath`). - Extracted Agent.snapshot() → src/agent/internal/SnapshotAssembler.ts (`assembleSnapshot`). - Added a small `applyLifecycleSnapshot()` helper on Agent so the coordinator never reaches into protected fields. - Moved the `isInMemoryMemoryAdapter` typeguard from Agent.ts to its natural home next to the class it guards (src/memory/InMemoryMemoryAdapter.ts). Express skill consolidation (D1) - Three near-identical 24-line skills (ExpressMeowSkill, ExpressSadSkill, ExpressSleepySkill) collapsed onto a single createExpressionSkill() factory in src/skills/defaults/ExpressionSkill.ts. The three exports become one-line declarations sharing the same execution path. No caller-visible change. Test sort comparators - Four test files had `.sort()` without a comparator. Each now passes an explicit code-point comparator — locks in the same determinism discipline the library code follows. Stale-prone metrics scrubbed from prose - "~300 tests" (CLAUDE.md), "10 default skills" (README, CLAUDE.md, docs/specs/vision.md), "~80 KB unminified ESM" (README) replaced with descriptive phrasing or pointers to the source of truth (the `size-limit` field in package.json, CI logs). The test-count claim was already off by ~50%; these go stale every PR. Verify still green: 0 lint errors / 11 ratchet warnings, all tests passing, build under size-budget, typedoc 0 errors.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1c9971d8d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const ts = agent.getTimeScale(); | ||
| const elapsedSec = (elapsedMs / 1000) * ts; |
There was a problem hiding this comment.
Recompute catch-up dt from current timeScale each chunk
runRestore now snapshots timeScale once (const ts = agent.getTimeScale()) and reuses it for every catch-up step. During runCatchUp, each step calls agent.tick(...), and reactive handlers/modules can legally call setTimeScale() between chunks; with a cached ts, later chunks advance the wrong virtual time (over/under-shooting elapsed offline time). Before this change, the divider read this.timeScale per chunk, so setTimeScale() applied on the next tick as documented. Please read the current scale inside the step callback instead of caching it once.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — fixed in 3473b9a. The pre-refactor code read this.timeScale live both at restore entry (for the total budget) and inside the per-chunk callback (for the realDt divisor), so a reactive handler that called setTimeScale() between chunks applied on the next chunk per the documented "takes effect on the NEXT tick" semantics. My refactor cached it once and used it for both, which broke the per-chunk live-read.
Now getTimeScale() is re-read inside the step callback. The total elapsedSec budget is still computed once at restore entry — that's the correct snapshot of "how much virtual time to catch up to" and is bounded so catch-up terminates regardless of intra-chunk scale tweaks.
Generated by Claude Code
Codex review on PR #75 caught a behaviour regression in the restore() extraction: my refactor cached `agent.getTimeScale()` once and reused it for every catch-up chunk's `realDt = chunk / ts` divisor. The original code in `Agent.ts` read `this.timeScale` live per chunk, so a reactive handler that called `setTimeScale()` between chunks applied on the next chunk — matching the documented "takes effect on the NEXT tick" semantics. Cached-once meant later chunks would over- or under-shoot offline elapsed time when timeScale changed mid-catch-up. Fix: re-read `agent.getTimeScale()` inside the step callback. The total virtual budget (`elapsedSec`) is still computed once at restore entry — that's the correct snapshot of "how much virtual time to catch up", and is bounded so the catch-up still terminates regardless of intra-chunk scale tweaks.
Summary
Multi-agent review of the 2026-04-24 codebase consolidated into an actionable plan at
docs/plans/2026-04-24-codebase-review-findings.md, with Track A (guardrails + the meaningful refactors that fall out of those guardrails) landed live. The codebase is exceptionally disciplined — determinism is fully enforced, no default exports or enums insrc/, noany, all skills returnResult<T, E>. The findings are mostly stale docs, convention-only guardrails, and ratchet opportunities.What ships in this PR
ESLint — architecture, size, and quality
Previously enforced by convention only, now enforced by CI:
Architecture
ExportDefaultDeclarationbanned (with carve-out for config files)TSEnumDeclarationbanned (STYLE_GUIDE.md:as constunions)no-restricted-imports: core (src/**, excluding adapter/port/integration folders) cannot importexcalibur,@tensorflow/*,js-son-agent,mistreevous,@anthropic-ai/sdk,openai, orsim-ecsComplexity & size (agent navigability)
max-linesmax-lines-per-functioncomplexitymax-depthmax-paramsmax-nested-callbacksQuality
no-console(allowwarn/error),no-debugger,no-var,prefer-const,eqeqeq,no-throw-literal,no-duplicate-imports,no-unneeded-ternary,object-shorthand,@typescript-eslint/no-explicit-any,@typescript-eslint/switch-exhaustiveness-check,prefer-template,no-useless-concat,no-useless-rename,radix,default-case-last,no-lonely-if— allerror@typescript-eslint/require-array-sort-compare—[].sort()without a comparator uses locale-dependent UTF-16 ordering, which silently breaks byte-equal trace replay. Locked in.@typescript-eslint/prefer-nullish-coalescing,prefer-optional-chain,consistent-type-exports— error@typescript-eslint/prefer-readonly— warn@typescript-eslint/no-non-null-assertion— warn (idiomatic undernoUncheckedIndexedAccess: true)Agent.ts split (–110 effective LOC: 562 → 484)
Reviewer flagged that overriding
max-linesfor Agent.ts is a smell. Three self-contained methods extracted into focused helpers undersrc/agent/internal/:Agent.restore()→RestoreCoordinator.ts(runRestore)Agent.die()→DeathCoordinator.ts(runDeath)Agent.snapshot()→SnapshotAssembler.ts(assembleSnapshot)Plus a small
applyLifecycleSnapshot()method onAgentso the coordinator never reaches into protected fields, and the duplicatedisInMemoryMemoryAdaptertypeguard moved next to the class it guards (src/memory/InMemoryMemoryAdapter.ts).The remaining hotspot is
tick()itself (~150 effective LOC of stage orchestration). Splitting it cleanly is its own focused PR (Track C2 of the plan). Until then, Agent.ts has a per-filemax-lines: 500override (down from the previous 700) so the cap is tighter than a global lift would be and the legacy outlier stays visible in CI.Express skill consolidation (D1)
Three near-identical 24-line skills (
ExpressMeowSkill,ExpressSadSkill,ExpressSleepySkill) collapsed onto a singlecreateExpressionSkill(id, label, expression, fxHint)factory insrc/skills/defaults/ExpressionSkill.ts. The three exports become one-line declarations sharing the same execution path. No caller-visible change.Typedoc wired into CI
mistreevous,js-son,tfjs) — they were inpackage.json#exportsbut nottypedoc.json.docsjob in.github/workflows/ci.yml, parallel with format/lint/typecheck/actionlint. Test job depends on it.npm run docsadded to theverifyscript so local gate matches CI.docs/api/(already gitignored; sits alongsidehow-to/,plans/,specs/). Uploaded as artifact (api-docs, 14-day retention).Test sort comparators
Four test files had
.sort()without a comparator — now passes explicit code-point comparators. Locks in the same determinism discipline the library code follows.Stale-prone metrics scrubbed from docs
Counts like "~300 tests" (CLAUDE.md — already 50% off), "10 default skills" (README, CLAUDE.md,
docs/specs/vision.md), and "~80 KB unminified ESM" (README) replaced with descriptive phrasing or pointers to the source of truth (size-limitconfig, CI logs). These go stale every PR.Verify gate after this PR
format:check— cleanlint— 0 errors, 11 ratchet warnings (Track C targets, all visible)typecheck— cleantest— all passingbuild— every entry under itssize-limitbudgetdocs— typedoc emits 5 entry points, 0 errorsPunch list for follow-up PRs (in the plan, not in this PR)
Each = its own topic branch from
developper CLAUDE.md.docs/plans/…polish-and-harden.mditems already merged but marked "Not started",MEMORY.mdreference,.changeset/config.json:baseBranch: "main"vs CLAUDE.md "PRs target develop", stalegray-matterexternal invite.config.ts.tick()into stage helpers (drops the Agent.ts override),createAgentcomplexity 59 (extract per-subsystem composition),MockLlmProvider.completeSynccomplexity 25.result.tshelpers missing JSDoc,IntentionCandidate.discriminantsemantic docs,SkillRegistry.invoke()JSDoc clarify throws-on-unregistered.*to real ranges.Full detail with file:line refs in
docs/plans/2026-04-24-codebase-review-findings.md.Test plan
npm run verifygreen locally (format/lint/typecheck/test/build/docs)eslintexits 0 errors / 11 ratchet warnings (was 9 errors before refactor; the extractions resolved 5 of those)docsjobhttps://claude.ai/code/session_01HMuHymBXZ8JqVEnpdcmnnU