Skip to content

chore(review): codebase review findings + ESLint/typedoc guardrails#75

Merged
Luis85 merged 3 commits into
developfrom
claude/codebase-review-Ba2iv
Apr 24, 2026
Merged

chore(review): codebase review findings + ESLint/typedoc guardrails#75
Luis85 merged 3 commits into
developfrom
claude/codebase-review-Ba2iv

Conversation

@Luis85
Copy link
Copy Markdown
Owner

@Luis85 Luis85 commented Apr 24, 2026

Summary

Multi-agent review of the 2026-04-24 codebase consolidated into an actionable plan at docs/plans/2026-04-24-codebase-review-findings.md, with Track A (guardrails + the meaningful refactors that fall out of those guardrails) landed live. The codebase is exceptionally disciplined — determinism is fully enforced, no default exports or enums in src/, no any, all skills return Result<T, E>. The findings are mostly stale docs, convention-only guardrails, and ratchet opportunities.

What ships in this PR

ESLint — architecture, size, and quality

Previously enforced by convention only, now enforced by CI:

Architecture

  • ExportDefaultDeclaration banned (with carve-out for config files)
  • TSEnumDeclaration banned (STYLE_GUIDE.md: as const unions)
  • no-restricted-imports: core (src/**, excluding adapter/port/integration folders) cannot import excalibur, @tensorflow/*, js-son-agent, mistreevous, @anthropic-ai/sdk, openai, or sim-ecs

Complexity & size (agent navigability)

Rule Level Threshold
max-lines error 350 (skip blank/comments)
max-lines-per-function warn 150
complexity warn 15
max-depth warn 4
max-params warn 5
max-nested-callbacks warn 3

Quality

  • no-console (allow warn/error), no-debugger, no-var, prefer-const, eqeqeq, no-throw-literal, no-duplicate-imports, no-unneeded-ternary, object-shorthand, @typescript-eslint/no-explicit-any, @typescript-eslint/switch-exhaustiveness-check, prefer-template, no-useless-concat, no-useless-rename, radix, default-case-last, no-lonely-if — all error
  • Determinism win: @typescript-eslint/require-array-sort-compare[].sort() without a comparator uses locale-dependent UTF-16 ordering, which silently breaks byte-equal trace replay. Locked in.
  • @typescript-eslint/prefer-nullish-coalescing, prefer-optional-chain, consistent-type-exports — error
  • @typescript-eslint/prefer-readonly — warn
  • @typescript-eslint/no-non-null-assertion — warn (idiomatic under noUncheckedIndexedAccess: true)

Agent.ts split (–110 effective LOC: 562 → 484)

Reviewer flagged that overriding max-lines for Agent.ts is a smell. Three self-contained methods extracted into focused helpers under src/agent/internal/:

  • Agent.restore()RestoreCoordinator.ts (runRestore)
  • Agent.die()DeathCoordinator.ts (runDeath)
  • Agent.snapshot()SnapshotAssembler.ts (assembleSnapshot)

Plus a small applyLifecycleSnapshot() method on Agent so the coordinator never reaches into protected fields, and the duplicated isInMemoryMemoryAdapter typeguard moved next to the class it guards (src/memory/InMemoryMemoryAdapter.ts).

The remaining hotspot is tick() itself (~150 effective LOC of stage orchestration). Splitting it cleanly is its own focused PR (Track C2 of the plan). Until then, Agent.ts has a per-file max-lines: 500 override (down from the previous 700) so the cap is tighter than a global lift would be and the legacy outlier stays visible in CI.

Express skill consolidation (D1)

Three near-identical 24-line skills (ExpressMeowSkill, ExpressSadSkill, ExpressSleepySkill) collapsed onto a single createExpressionSkill(id, label, expression, fxHint) factory in src/skills/defaults/ExpressionSkill.ts. The three exports become one-line declarations sharing the same execution path. No caller-visible change.

Typedoc wired into CI

  • Added missing adapter entry points (mistreevous, js-son, tfjs) — they were in package.json#exports but not typedoc.json.
  • New docs job in .github/workflows/ci.yml, parallel with format/lint/typecheck/actionlint. Test job depends on it.
  • npm run docs added to the verify script so local gate matches CI.
  • Output continues to go to docs/api/ (already gitignored; sits alongside how-to/, plans/, specs/). Uploaded as artifact (api-docs, 14-day retention).

Test sort comparators

Four test files had .sort() without a comparator — now passes explicit code-point comparators. Locks in the same determinism discipline the library code follows.

Stale-prone metrics scrubbed from docs

Counts like "~300 tests" (CLAUDE.md — already 50% off), "10 default skills" (README, CLAUDE.md, docs/specs/vision.md), and "~80 KB unminified ESM" (README) replaced with descriptive phrasing or pointers to the source of truth (size-limit config, CI logs). These go stale every PR.

Verify gate after this PR

  • format:check — clean
  • lint0 errors, 11 ratchet warnings (Track C targets, all visible)
  • typecheck — clean
  • test — all passing
  • build — every entry under its size-limit budget
  • docs — typedoc emits 5 entry points, 0 errors

Punch list for follow-up PRs (in the plan, not in this PR)

Each = its own topic branch from develop per CLAUDE.md.

  • Track B: README pre-release version mismatch, docs/plans/…polish-and-harden.md items already merged but marked "Not started", MEMORY.md reference, .changeset/config.json:baseBranch: "main" vs CLAUDE.md "PRs target develop", stale gray-matter external in vite.config.ts.
  • Track C: Split tick() into stage helpers (drops the Agent.ts override), createAgent complexity 59 (extract per-subsystem composition), MockLlmProvider.completeSync complexity 25.
  • Track D: result.ts helpers missing JSDoc, IntentionCandidate.discriminant semantic docs, SkillRegistry.invoke() JSDoc clarify throws-on-unregistered.
  • Track E: Vitest coverage thresholds; tighten peer deps from * to real ranges.

Full detail with file:line refs in docs/plans/2026-04-24-codebase-review-findings.md.

Test plan

  • npm run verify green locally (format/lint/typecheck/test/build/docs)
  • eslint exits 0 errors / 11 ratchet warnings (was 9 errors before refactor; the extractions resolved 5 of those)
  • Typedoc generates all 5 entry points with 0 errors
  • Size budget unchanged
  • All test suites still green
  • CI green on the new docs job

https://claude.ai/code/session_01HMuHymBXZ8JqVEnpdcmnnU

claude added 2 commits April 24, 2026 23:12
Consolidates a multi-agent review of the 2026-04-24 codebase into
`docs/plans/2026-04-24-codebase-review-findings.md` and lands Track A
(guardrails) live so the patterns the project relied on as convention
are now enforced by CI.

Track A — lands in this PR
- ESLint: architectural bans (default exports, enums, cross-layer
  peer-dep imports in core) via `no-restricted-syntax` +
  `no-restricted-imports`.
- ESLint: LOC / complexity caps (`max-lines`, `max-lines-per-function`,
  `complexity`, `max-depth`, `max-params`, `max-nested-callbacks`).
  Thresholds chosen so current code passes clean at error-level.
- ESLint: quality defaults for agentic contributions (`no-console`,
  `eqeqeq`, `no-duplicate-imports`, `switch-exhaustiveness-check`,
  `no-explicit-any`, etc.).
- Collapse three `import type` + `import` pairs into single inline
  type imports (Agent.ts x2, ExcaliburAnimationBridge.ts x1) — required
  by the new `no-duplicate-imports` rule.
- Typedoc: add missing adapter entry points (mistreevous / js-son /
  tfjs), add docs build as a parallel CI job, add `npm run docs` to
  the `verify` script so local gate mirrors CI. Output continues to
  live under `docs/api/` (already gitignored; sits alongside
  how-to/plans/specs).

Lint baseline after this PR: 0 errors, 11 warnings (the warnings are
the ratchet targets tracked under Track C of the plan).

Tracks B–E (stale docs, complexity ratchet, src micro-findings,
tooling gaps) are left as a punch list — each a separate topic branch
per CLAUDE.md one-PR-one-branch rule.
Follow-up to the review-findings PR. Three asks from the PR review:
(a) tighten max-lines from 1000 to 350; (b) actually improve Agent.ts
instead of overriding the cap; (c) remove stale-prone metrics from
docs.

ESLint
- max-lines: 350 (was 1000) at error level.
- New rules: require-array-sort-compare (determinism win — `[].sort()`
  without a comparator uses locale-dependent UTF-16 ordering),
  prefer-nullish-coalescing, prefer-optional-chain,
  consistent-type-exports, prefer-readonly (warn), prefer-template,
  no-useless-concat, no-useless-rename, radix, default-case-last,
  no-lonely-if.
- A handful of `&& X.y` patterns in src/ auto-rewritten to `?.y` by
  prefer-optional-chain — same behaviour, less noise.
- Per-file override: Agent.ts capped at 500 (down from the previous
  700) until tick() itself is split. The cap is tighter than a global
  lift would be and surfaces the legacy outlier in CI.

Agent.ts split (-110 effective LOC, 562 → 484)
- Extracted Agent.restore() → src/agent/internal/RestoreCoordinator.ts
  (`runRestore`).
- Extracted Agent.die() → src/agent/internal/DeathCoordinator.ts
  (`runDeath`).
- Extracted Agent.snapshot() → src/agent/internal/SnapshotAssembler.ts
  (`assembleSnapshot`).
- Added a small `applyLifecycleSnapshot()` helper on Agent so the
  coordinator never reaches into protected fields.
- Moved the `isInMemoryMemoryAdapter` typeguard from Agent.ts to its
  natural home next to the class it guards
  (src/memory/InMemoryMemoryAdapter.ts).

Express skill consolidation (D1)
- Three near-identical 24-line skills (ExpressMeowSkill,
  ExpressSadSkill, ExpressSleepySkill) collapsed onto a single
  createExpressionSkill() factory in
  src/skills/defaults/ExpressionSkill.ts. The three exports become
  one-line declarations sharing the same execution path. No
  caller-visible change.

Test sort comparators
- Four test files had `.sort()` without a comparator. Each now passes
  an explicit code-point comparator — locks in the same determinism
  discipline the library code follows.

Stale-prone metrics scrubbed from prose
- "~300 tests" (CLAUDE.md), "10 default skills" (README, CLAUDE.md,
  docs/specs/vision.md), "~80 KB unminified ESM" (README) replaced
  with descriptive phrasing or pointers to the source of truth (the
  `size-limit` field in package.json, CI logs). The test-count claim
  was already off by ~50%; these go stale every PR.

Verify still green: 0 lint errors / 11 ratchet warnings, all tests
passing, build under size-budget, typedoc 0 errors.
@Luis85 Luis85 marked this pull request as ready for review April 24, 2026 23:46
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b1c9971d8d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +100 to +101
const ts = agent.getTimeScale();
const elapsedSec = (elapsedMs / 1000) * ts;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Recompute catch-up dt from current timeScale each chunk

runRestore now snapshots timeScale once (const ts = agent.getTimeScale()) and reuses it for every catch-up step. During runCatchUp, each step calls agent.tick(...), and reactive handlers/modules can legally call setTimeScale() between chunks; with a cached ts, later chunks advance the wrong virtual time (over/under-shooting elapsed offline time). Before this change, the divider read this.timeScale per chunk, so setTimeScale() applied on the next tick as documented. Please read the current scale inside the step callback instead of caching it once.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 3473b9a. The pre-refactor code read this.timeScale live both at restore entry (for the total budget) and inside the per-chunk callback (for the realDt divisor), so a reactive handler that called setTimeScale() between chunks applied on the next chunk per the documented "takes effect on the NEXT tick" semantics. My refactor cached it once and used it for both, which broke the per-chunk live-read.

Now getTimeScale() is re-read inside the step callback. The total elapsedSec budget is still computed once at restore entry — that's the correct snapshot of "how much virtual time to catch up to" and is bounded so catch-up terminates regardless of intra-chunk scale tweaks.


Generated by Claude Code

Codex review on PR #75 caught a behaviour regression in the restore()
extraction: my refactor cached `agent.getTimeScale()` once and reused
it for every catch-up chunk's `realDt = chunk / ts` divisor. The
original code in `Agent.ts` read `this.timeScale` live per chunk, so a
reactive handler that called `setTimeScale()` between chunks applied
on the next chunk — matching the documented "takes effect on the NEXT
tick" semantics.

Cached-once meant later chunks would over- or under-shoot offline
elapsed time when timeScale changed mid-catch-up.

Fix: re-read `agent.getTimeScale()` inside the step callback. The
total virtual budget (`elapsedSec`) is still computed once at restore
entry — that's the correct snapshot of "how much virtual time to
catch up", and is bounded so the catch-up still terminates regardless
of intra-chunk scale tweaks.
@Luis85 Luis85 merged commit a633eb6 into develop Apr 24, 2026
9 checks passed
@Luis85 Luis85 deleted the claude/codebase-review-Ba2iv branch April 24, 2026 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants