feat(cli): add convention audit command#1306
Conversation
jackwener
left a comment
There was a problem hiding this comment.
Round 1 review (treat as request-changes — gh blocks --request-changes on owner PRs).
Scanner architecture is solid (pure function → reusable for #213 gate, --site/site/name filter, --strict opt-in, skipUserDiscovery for clean output, 3 unit tests on happy path). But there are blockers in the adapter changes shipped alongside the new command.
🔴 Blockers
1. PR introduces 3 silent-column-drop regressions via "adapter fixes"
These are out of scope for "add convention audit command" and they push exactly the wrong direction the rule is meant to prevent. Verified locally by running this branch's own convention-audit:
clis/youtube/feed.js
- Renames source
video_id(snake) →videoId(camel) AND dropsvideo_idfrom columns. - Net effect: rows emit
videoIdwhich is NOT in columns.node dist/src/main.js convention-audit --site youtubeflags this on this very branch:youtube/feed row emits key(s) not present in columns: videoId emitted_keys: [title, channel, views, duration, published, videoId] columns: [rank, title, channel, views, duration, published, url] - Plus the rename moves a key from snake_case (project convention) to camelCase. Right direction is the opposite: keep snake_case, fix columns to include
video_id.
clis/xiaoe/catalog.js
- Drops
urlfrom columns, but source still emitsurlin bothresult.push({...url...})(lines 99 & 143) AND pipelinemap: { url: '\${{ item.url }}' }(line 158). - Audit on this branch:
xiaoe/catalog row emits key(s) not present in columns: url
clis/xiaohongshu/feed.js
- Drops
idfrom columns, but pipelinemap: { id: '\${{ item.id }}' }(line 24) still emits it. - Audit doesn't flag this because the silent-column-drop rule misses pipeline
map:blocks (see #3 below) — but runtime still dropsidfrom table view, exactly the pattern this PR is supposed to guard against.
Recommendation: revert these three adapter changes from this PR. If they need actual fixes, do them in a separate PR and use the audit as evidence. The default direction is add the missing key to columns, not remove from columns.
2. clis/youtube/feed.test.js deleted (−131 lines) with no replacement
The test verified continuation pagination (chains appendContinuationItemsAction to reach the limit). It would currently fail because it asserts video_id but the source now emits videoId — so it was deleted instead of updated. Restore + update field names, or explain why this coverage is no longer needed.
🟡 Important
3. silent-column-drop rule misses pipeline map: blocks
xiaohongshu/feed in this very PR proves the gap: id is dropped from columns but emitted by pipeline map, and the audit doesn't see it. With ~half of the registry using pipeline-style adapters, this is a meaningful coverage hole. Consider extracting pipeline[].map keys from the registered command (or via cli-manifest.json — the manifest already serializes pipeline) and diffing against columns the same way.
4. silent-empty-fallback fires outside catch blocks
Regex \breturn\s+\[\s*\]\s*;? runs on every line, so it flags idiomatic guards like if (!store) return []; (xiaoe/catalog L19, xiaoe/detail L18). The spec was specifically "empty array fallback in catch blocks". Either scope to catch blocks (parse catch (...) {...} or use a small AST), or document the false-positive baseline so #217 does not make these CI-blocking.
🟢 Nits (non-blocking)
auditTypedErrorPatternsscans every line including comments andevaluate:JS string literals.// note: returns []would firesilent-empty-fallback. Consider stripping//line comments at the top of the scanner.argv[0] === 'convention-audit'skip in main.ts works but is brittle; a smallSKIP_DISCOVERY_COMMANDSset is more honest.renderOutput(report, { fmt })for json/yaml does not passtitle/elapsed/sourcelike other built-ins. Minor consistency thing, defer.
✅ Good
- Pure-function scanner —
runConventionAudit(opts)→ConventionAuditReport. #213 gate can drop in cleanly. --site/site/name/--strictshape is right.- skipUserDiscovery for clean reports is a nice touch (audits should not depend on whatever user adapters happen to be installed).
- Categories shape (rule + count + violations) is agent-friendly.
Whole-registry numbers (from this branch)
commands: 660, sites: 108, violations: 562
silent-column-drop: 100
camelCase-in-columns: 189
missing-access-metadata: 0 ← good, #205 paid off
silent-clamp: 109
silent-empty-fallback: 71
silent-sentinel: 84
write-without-delete-pair: 9
Useful baseline for #213/#217. But before #213 freezes the baseline, please get the 3 adapter regressions out of #1306 — otherwise the silent-column-drop baseline starts off contaminated.
Action items for round 2
- Revert the youtube/feed, xiaoe/catalog, xiaohongshu/feed source/columns/test changes from this PR.
- Decide pipeline
map:scanning: add now, or open follow-up issue. - Decide silent-empty-fallback scope: narrow to catch blocks, or accept and let #217 mark some as known-false-positives.
I will re-verify after round 2.
jackwener
left a comment
There was a problem hiding this comment.
Round 1 retraction — blockers 1 & 2 are invalid.
I diff'd git diff main..pr-1306 against current origin/main (which has #1305 merged at f482a6b), but #1306 is based on 0d37f48 (#1304 tip, before #1305). So the youtube/xiaohongshu/xiaoe "regressions" I flagged are actually #1305's content NOT YET on this branch — the diff direction made them look like deletions/reverts, but #1306 itself touches none of those adapter files. Same goes for feed.test.js: it exists in main (added by #1305, 131 lines) and shows up as deleted in main..pr-1306 only because the branch hasn't been rebased.
So:
- ❌
Blocker 1 (3 adapter regressions)— false alarm. After rebase onto current main these disappear. - ❌
Blocker 2 (feed.test.js deletion)— false alarm. After rebase the file is preserved. - ✅ #3 silent-column-drop rule misses pipeline
map:blocks — still valid. - ✅ #4 silent-empty-fallback fires outside catch blocks — still valid.
- ✅ Nits #5-#7 — still valid as nits.
My fault for diffing against current main instead of the branch's merge-base. Going forward I'll use git diff $(git merge-base origin/main HEAD)..HEAD for a true PR-author diff.
7959d8f to
c01652b
Compare
jackwener
left a comment
There was a problem hiding this comment.
Round 2 LGTM ✅ (gh blocks --approve on owner PRs, comment-only)
Rebased onto current main (f482a6b, post-#1305). True PR diff (vs merge-base) is audit-only — no adapter touches:
docs/.vitepress/config.mts | 2 +
docs/conventions/convention-audit.md | 39 +
src/cli.ts | 23 +
src/convention-audit.test.ts | 173 +
src/convention-audit.ts | 567 +
src/main.ts | 19 +-/-
Both Round 1 valid concerns addressed:
#3 Pipeline map: scanning — added /\bmap\s*:\s*{/g trigger to extractPotentialRowObjects, reuses the same balanced-block reader. Verified end-to-end:
- Whole-registry silent-column-drop: 100 → 104 (+4 from pipeline-style adapters previously invisible)
- New unit test
scans pipeline map blocks for silent column dropscovers it.
#4 silent-empty-fallback catch-only — added findCatchBlockRanges (matches \bcatch\s*(?:\([^)]*\))?\s*{ + balanced findBalancedBlockEnd) and isInsideAnyRange check. Verified:
- silent-empty-fallback: 71 → 5 (-66 idiomatic guards eliminated)
- Spot-checked the 5 remaining (douyin/user-videos:31, jike/post:50, jike/topic:38, jike/user:37, weread/search:97) — all genuinely inside
catch { ... }blocks. - New unit test
only reports empty array fallbacks inside catch blockscovers both directions.
Whole-registry baseline (this branch, ready for #213/#217 to lock):
commands: 660, sites: 108, files_scanned: 647, violations: 500
silent-column-drop: 104
camelCase-in-columns: 189
missing-access-metadata: 0
silent-clamp: 109
silent-empty-fallback: 5
silent-sentinel: 84
write-without-delete-pair: 9
Tests: 120/120 pass on cli.test.ts + convention-audit.test.ts. tsc --noEmit clean. npm run build green (660 manifest entries).
Minor latent (non-blocking, defer):
auditTypedErrorPatternsoffset tracking usesoffset += line.length + 1butsource.split(/\r?\n/)strips both\r\n— off-by-one on CRLF files. Repo is LF-only, no impact today; flag for future portability.\bmap\s*:\s*{trigger also matches plain JS object literals like{ map: { ... } }outside pipelines — currently safe because the next layer requires column overlap to flag, but worth narrowing ifcli({ ... })ever grows othermap:fields.auditTypedErrorPatternsstill scans line content including//comments — only matters if someone ships// TODO return []style; defer.
These are tail-end nits, not merge-blockers. Up to @Jakevin (WAWQAQ) to call merge — this is a new agent-facing built-in command, would want explicit go-ahead before #213/#217 freeze its baseline.
Summary
opencli convention-auditwith text/yaml/json output plus--site,site/name, and--strictVerification
npm run typechecknpx vitest run src/convention-audit.test.ts src/cli.test.ts --project unitnpm run docs:buildnpm run buildnode dist/src/main.js convention-audit --site pixiv -f jsonnode dist/src/main.js convention-audit --site pixiv --strictexits 1 with the expected pixiv/illusts violationNote
npm testhad 281 files / 2396 tests pass / 2 skipped, but exited 1 because local port 19825 was already occupied andsrc/daemon.test.tshit EADDRINUSE. This is local environment noise, not caused by this PR.