Skip to content

feat(cli): add convention audit command#1306

Merged
jackwener merged 1 commit intomainfrom
feat/convention-audit
May 4, 2026
Merged

feat(cli): add convention audit command#1306
jackwener merged 1 commit intomainfrom
feat/convention-audit

Conversation

@jackwener
Copy link
Copy Markdown
Owner

Summary

  • add opencli convention-audit with text/yaml/json output plus --site, site/name, and --strict
  • scan manifest + source for silent-column-drop, camelCase columns, missing access metadata, silent clamp/empty/sentinel patterns, and write-without-delete-pair
  • skip user/plugin discovery for convention-audit so local stale adapters do not pollute official reports
  • document the audit command under conventions

Verification

  • npm run typecheck
  • npx vitest run src/convention-audit.test.ts src/cli.test.ts --project unit
  • npm run docs:build
  • npm run build
  • node dist/src/main.js convention-audit --site pixiv -f json
  • node dist/src/main.js convention-audit --site pixiv --strict exits 1 with the expected pixiv/illusts violation

Note

  • npm test had 281 files / 2396 tests pass / 2 skipped, but exited 1 because local port 19825 was already occupied and src/daemon.test.ts hit EADDRINUSE. This is local environment noise, not caused by this PR.

Copy link
Copy Markdown
Owner Author

@jackwener jackwener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 review (treat as request-changes — gh blocks --request-changes on owner PRs).

Scanner architecture is solid (pure function → reusable for #213 gate, --site/site/name filter, --strict opt-in, skipUserDiscovery for clean output, 3 unit tests on happy path). But there are blockers in the adapter changes shipped alongside the new command.

🔴 Blockers

1. PR introduces 3 silent-column-drop regressions via "adapter fixes"

These are out of scope for "add convention audit command" and they push exactly the wrong direction the rule is meant to prevent. Verified locally by running this branch's own convention-audit:

clis/youtube/feed.js

  • Renames source video_id (snake) → videoId (camel) AND drops video_id from columns.
  • Net effect: rows emit videoId which is NOT in columns. node dist/src/main.js convention-audit --site youtube flags this on this very branch:
    youtube/feed row emits key(s) not present in columns: videoId
    emitted_keys: [title, channel, views, duration, published, videoId]
    columns:      [rank, title, channel, views, duration, published, url]
    
  • Plus the rename moves a key from snake_case (project convention) to camelCase. Right direction is the opposite: keep snake_case, fix columns to include video_id.

clis/xiaoe/catalog.js

  • Drops url from columns, but source still emits url in both result.push({...url...}) (lines 99 & 143) AND pipeline map: { url: '\${{ item.url }}' } (line 158).
  • Audit on this branch:
    xiaoe/catalog row emits key(s) not present in columns: url
    

clis/xiaohongshu/feed.js

  • Drops id from columns, but pipeline map: { id: '\${{ item.id }}' } (line 24) still emits it.
  • Audit doesn't flag this because the silent-column-drop rule misses pipeline map: blocks (see #3 below) — but runtime still drops id from table view, exactly the pattern this PR is supposed to guard against.

Recommendation: revert these three adapter changes from this PR. If they need actual fixes, do them in a separate PR and use the audit as evidence. The default direction is add the missing key to columns, not remove from columns.

2. clis/youtube/feed.test.js deleted (−131 lines) with no replacement

The test verified continuation pagination (chains appendContinuationItemsAction to reach the limit). It would currently fail because it asserts video_id but the source now emits videoId — so it was deleted instead of updated. Restore + update field names, or explain why this coverage is no longer needed.

🟡 Important

3. silent-column-drop rule misses pipeline map: blocks

xiaohongshu/feed in this very PR proves the gap: id is dropped from columns but emitted by pipeline map, and the audit doesn't see it. With ~half of the registry using pipeline-style adapters, this is a meaningful coverage hole. Consider extracting pipeline[].map keys from the registered command (or via cli-manifest.json — the manifest already serializes pipeline) and diffing against columns the same way.

4. silent-empty-fallback fires outside catch blocks

Regex \breturn\s+\[\s*\]\s*;? runs on every line, so it flags idiomatic guards like if (!store) return []; (xiaoe/catalog L19, xiaoe/detail L18). The spec was specifically "empty array fallback in catch blocks". Either scope to catch blocks (parse catch (...) {...} or use a small AST), or document the false-positive baseline so #217 does not make these CI-blocking.

🟢 Nits (non-blocking)

  1. auditTypedErrorPatterns scans every line including comments and evaluate: JS string literals. // note: returns [] would fire silent-empty-fallback. Consider stripping // line comments at the top of the scanner.
  2. argv[0] === 'convention-audit' skip in main.ts works but is brittle; a small SKIP_DISCOVERY_COMMANDS set is more honest.
  3. renderOutput(report, { fmt }) for json/yaml does not pass title/elapsed/source like other built-ins. Minor consistency thing, defer.

✅ Good

  • Pure-function scanner — runConventionAudit(opts)ConventionAuditReport. #213 gate can drop in cleanly.
  • --site / site/name / --strict shape is right.
  • skipUserDiscovery for clean reports is a nice touch (audits should not depend on whatever user adapters happen to be installed).
  • Categories shape (rule + count + violations) is agent-friendly.

Whole-registry numbers (from this branch)

commands: 660, sites: 108, violations: 562
  silent-column-drop:        100
  camelCase-in-columns:      189
  missing-access-metadata:     0   ← good, #205 paid off
  silent-clamp:              109
  silent-empty-fallback:      71
  silent-sentinel:            84
  write-without-delete-pair:   9

Useful baseline for #213/#217. But before #213 freezes the baseline, please get the 3 adapter regressions out of #1306 — otherwise the silent-column-drop baseline starts off contaminated.

Action items for round 2

  1. Revert the youtube/feed, xiaoe/catalog, xiaohongshu/feed source/columns/test changes from this PR.
  2. Decide pipeline map: scanning: add now, or open follow-up issue.
  3. Decide silent-empty-fallback scope: narrow to catch blocks, or accept and let #217 mark some as known-false-positives.

I will re-verify after round 2.

Copy link
Copy Markdown
Owner Author

@jackwener jackwener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 retraction — blockers 1 & 2 are invalid.

I diff'd git diff main..pr-1306 against current origin/main (which has #1305 merged at f482a6b), but #1306 is based on 0d37f48 (#1304 tip, before #1305). So the youtube/xiaohongshu/xiaoe "regressions" I flagged are actually #1305's content NOT YET on this branch — the diff direction made them look like deletions/reverts, but #1306 itself touches none of those adapter files. Same goes for feed.test.js: it exists in main (added by #1305, 131 lines) and shows up as deleted in main..pr-1306 only because the branch hasn't been rebased.

So:

  • Blocker 1 (3 adapter regressions) — false alarm. After rebase onto current main these disappear.
  • Blocker 2 (feed.test.js deletion) — false alarm. After rebase the file is preserved.
  • #3 silent-column-drop rule misses pipeline map: blocks — still valid.
  • #4 silent-empty-fallback fires outside catch blocks — still valid.
  • ✅ Nits #5-#7 — still valid as nits.

My fault for diffing against current main instead of the branch's merge-base. Going forward I'll use git diff $(git merge-base origin/main HEAD)..HEAD for a true PR-author diff.

After rebase + #3/#4 fixes, I'll do round 2 fresh.

@jackwener jackwener force-pushed the feat/convention-audit branch from 7959d8f to c01652b Compare May 4, 2026 14:35
Copy link
Copy Markdown
Owner Author

@jackwener jackwener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 2 LGTM ✅ (gh blocks --approve on owner PRs, comment-only)

Rebased onto current main (f482a6b, post-#1305). True PR diff (vs merge-base) is audit-only — no adapter touches:

docs/.vitepress/config.mts           |   2 +
docs/conventions/convention-audit.md |  39 +
src/cli.ts                           |  23 +
src/convention-audit.test.ts         | 173 +
src/convention-audit.ts              | 567 +
src/main.ts                          |  19 +-/-

Both Round 1 valid concerns addressed:

#3 Pipeline map: scanning — added /\bmap\s*:\s*{/g trigger to extractPotentialRowObjects, reuses the same balanced-block reader. Verified end-to-end:

  • Whole-registry silent-column-drop: 100 → 104 (+4 from pipeline-style adapters previously invisible)
  • New unit test scans pipeline map blocks for silent column drops covers it.

#4 silent-empty-fallback catch-only — added findCatchBlockRanges (matches \bcatch\s*(?:\([^)]*\))?\s*{ + balanced findBalancedBlockEnd) and isInsideAnyRange check. Verified:

  • silent-empty-fallback: 71 → 5 (-66 idiomatic guards eliminated)
  • Spot-checked the 5 remaining (douyin/user-videos:31, jike/post:50, jike/topic:38, jike/user:37, weread/search:97) — all genuinely inside catch { ... } blocks.
  • New unit test only reports empty array fallbacks inside catch blocks covers both directions.

Whole-registry baseline (this branch, ready for #213/#217 to lock):

commands: 660, sites: 108, files_scanned: 647, violations: 500
  silent-column-drop:        104
  camelCase-in-columns:      189
  missing-access-metadata:     0
  silent-clamp:              109
  silent-empty-fallback:       5
  silent-sentinel:            84
  write-without-delete-pair:   9

Tests: 120/120 pass on cli.test.ts + convention-audit.test.ts. tsc --noEmit clean. npm run build green (660 manifest entries).

Minor latent (non-blocking, defer):

  • auditTypedErrorPatterns offset tracking uses offset += line.length + 1 but source.split(/\r?\n/) strips both \r\n — off-by-one on CRLF files. Repo is LF-only, no impact today; flag for future portability.
  • \bmap\s*:\s*{ trigger also matches plain JS object literals like { map: { ... } } outside pipelines — currently safe because the next layer requires column overlap to flag, but worth narrowing if cli({ ... }) ever grows other map: fields.
  • auditTypedErrorPatterns still scans line content including // comments — only matters if someone ships // TODO return [] style; defer.

These are tail-end nits, not merge-blockers. Up to @Jakevin (WAWQAQ) to call merge — this is a new agent-facing built-in command, would want explicit go-ahead before #213/#217 freeze its baseline.

@jackwener jackwener merged commit 73dc129 into main May 4, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant