feat(spec+define): lintSpecQuality diagnostic helper + DEFINE warning surface + prd-taskmaster comparison#25
Conversation
…fficiency heuristic
…narrow NAMED_CONTROL_RE R1 review closure for M-SPEC1 prd-taskmaster borrow: - Add CODEX_BRIEFING.md, CODEX_RESPONSE.md, COMPARISON.md (companion docs missing from prior commits) - Narrow NAMED_CONTROL_RE to the contract-pinned allow-list (drop generic uppercase acronym arm that over-suppressed warnings on bullets like 'REST API should be fast') - Add QH2 healthy-arm test: 2 bullets + 15+ words does NOT warn - Add AND-arm test: generic uppercase acronym does NOT suppress vague-language - Pin allow-list verbatim in spec-contract.md (OAuth, OIDC, SAML, JWT, MFA, TOTP, RBAC, ABAC, ACL, TLS, mTLS, HMAC, AES, RSA, ECDSA, HKDF, PBKDF2, bcrypt, argon2, scrypt, SOC2, HIPAA, GDPR, PCI, CSP, CORS, CSRF) - Update SYNTHESIS.md closure block with R1 commit + file list - README index update deferred to a separate cleanup commit on main (parallel-session safety)
…taskmaster-borrow
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a new diagnostic-only quality linter for SPEC artifacts, borrowing mechanics from the prd-taskmaster skill. It adds a vague-language detector and a goals-sufficiency check, implemented as a non-blocking helper in src/artifacts/spec.ts and surfaced during the DEFINE phase. Feedback focuses on improving the efficiency and robustness of the regex-based suppression logic, specifically regarding case-sensitivity and pre-compilation of patterns.
| // or a named-control identifier from the contract-pinned allow-list. | ||
| // The intent is the source comparison's "explicit metric or named control" rule. | ||
| const METRIC_RE = /\d+\s*(?:ms|s|m|h|%|MB|GB|KB|kb|req|rps|qps|hz|min|sec|seconds?|minutes?|hours?|days?|users?|requests?|rows?|bytes?|tokens?)\b/i | ||
| const NAMED_CONTROL_RE = /\b(?:OAuth|OIDC|SAML|JWT|MFA|TOTP|RBAC|ABAC|ACL|TLS|mTLS|HMAC|AES|RSA|ECDSA|HKDF|PBKDF2|bcrypt|argon2|scrypt|SOC2|HIPAA|GDPR|PCI|CSP|CORS|CSRF)\b/ |
There was a problem hiding this comment.
The NAMED_CONTROL_RE is currently case-sensitive, which differs from the METRIC_RE above and the vague terms check itself. This may lead to false positive warnings if a user specifies a named control using lowercase (e.g., "jwt" or "mtls"). Adding the i flag would make the suppression logic more robust and consistent.
| const NAMED_CONTROL_RE = /\b(?:OAuth|OIDC|SAML|JWT|MFA|TOTP|RBAC|ABAC|ACL|TLS|mTLS|HMAC|AES|RSA|ECDSA|HKDF|PBKDF2|bcrypt|argon2|scrypt|SOC2|HIPAA|GDPR|PCI|CSP|CORS|CSRF)\b/ | |
| const NAMED_CONTROL_RE = /\b(?:OAuth|OIDC|SAML|JWT|MFA|TOTP|RBAC|ABAC|ACL|TLS|mTLS|HMAC|AES|RSA|ECDSA|HKDF|PBKDF2|bcrypt|argon2|scrypt|SOC2|HIPAA|GDPR|PCI|CSP|CORS|CSRF)\b/i |
| function findVagueTerms(bullet: string): readonly string[] { | ||
| // Match the prd-taskmaster regex: optional `should be|must be|need(s) to be` lead-in, | ||
| // then any of the 15 terms as a whole word, case-insensitive. | ||
| const pattern = new RegExp( | ||
| String.raw`\b(?:should\s+be\s+|must\s+be\s+|needs?\s+to\s+be\s+)?(` + | ||
| VAGUE_TERMS.join('|') + | ||
| String.raw`)\b`, | ||
| 'gi', | ||
| ) | ||
| if (METRIC_RE.test(bullet) || NAMED_CONTROL_RE.test(bullet)) return [] | ||
| const found: string[] = [] | ||
| let m: RegExpExecArray | null | ||
| while ((m = pattern.exec(bullet)) !== null) { | ||
| found.push(m[1]!.toLowerCase()) | ||
| } | ||
| return found | ||
| } |
There was a problem hiding this comment.
The vague terms regex is being re-compiled on every call to findVagueTerms. Since this function is executed for every bullet in every section of the SPEC, it is more efficient to pre-compile the pattern as a constant. Additionally, using matchAll provides a cleaner way to iterate over global matches compared to a while loop with exec.
const VAGUE_PATTERN = new RegExp(\n String.raw`\\b(?:should\\s+be\\s+|must\\s+be\\s+|needs?\\s+to\\s+be\\s+)?(` +\n VAGUE_TERMS.join('|') +\n String.raw`)\\b`,\n 'gi',\n)\n\nfunction findVagueTerms(bullet: string): readonly string[] {\n // Match the prd-taskmaster regex: optional `should be|must be|need(s) to be` lead-in,\n // then any of the 15 terms as a whole word, case-insensitive.\n if (METRIC_RE.test(bullet) || NAMED_CONTROL_RE.test(bullet)) return []\n return Array.from(bullet.matchAll(VAGUE_PATTERN), (m) => m[1]!.toLowerCase())\n}There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4b37f6c6d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // or a named-control identifier from the contract-pinned allow-list. | ||
| // The intent is the source comparison's "explicit metric or named control" rule. | ||
| const METRIC_RE = /\d+\s*(?:ms|s|m|h|%|MB|GB|KB|kb|req|rps|qps|hz|min|sec|seconds?|minutes?|hours?|days?|users?|requests?|rows?|bytes?|tokens?)\b/i | ||
| const NAMED_CONTROL_RE = /\b(?:OAuth|OIDC|SAML|JWT|MFA|TOTP|RBAC|ABAC|ACL|TLS|mTLS|HMAC|AES|RSA|ECDSA|HKDF|PBKDF2|bcrypt|argon2|scrypt|SOC2|HIPAA|GDPR|PCI|CSP|CORS|CSRF)\b/ |
There was a problem hiding this comment.
Match named-control exemptions case-insensitively
NAMED_CONTROL_RE is case-sensitive, so QH1 suppression fails for common lowercase/mixed-case spellings of the same controls (for example, oauth, oidc, jwt). In practice this causes false spec_vague_language warnings for bullets like “login must be secure with oauth,” even though the rule is supposed to suppress warnings when a named control is present. This will add noisy DEFINE diagnostics and reduce trust in the linter output; normalize casing or use a case-insensitive match for the allow-list check.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds a diagnostic-only SPEC quality linter (lintSpecQuality) and surfaces its warnings in DEFINE completion output, with the heuristics contract-pinned in documentation and accompanied by prd-taskmaster comparison writeups.
Changes:
- Implement
lintSpecQualityto emit warning issues for vague-language terms (QH1) and underspecified Goals (QH2), without impactingparseSpec/SpecLoadError. - Surface lint warnings in DEFINE’s completion
userMessageafter writingSPEC.mdand emittinggate_required. - Document the heuristics in
docs/references/spec-contract.mdand add comparison/codex debate artifacts underdocs/comparison/08-prd-taskmaster/.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/artifacts-spec.test.ts | Adds unit coverage for QH1/QH2 behaviors, output shape, and immutability of returned issue arrays. |
| src/phases/define.ts | Calls lintSpecQuality on successful DEFINE and appends diagnostic warnings to the CLI completion message. |
| src/artifacts/spec.ts | Introduces the SpecLintIssue/SpecLintCode types and lintSpecQuality implementation. |
| docs/references/spec-contract.md | Pins the vague-term vocabulary, suppression rules, and Goals sufficiency heuristic as diagnostic-only contract text. |
| docs/comparison/08-prd-taskmaster/SYNTHESIS.md | Adds synthesis/closure record for the prd-taskmaster comparison and M-SPEC1 borrow implementation. |
| docs/comparison/08-prd-taskmaster/COMPARISON.md | Adds the full comparison narrative and borrow decision record. |
| docs/comparison/08-prd-taskmaster/CODEX_BRIEFING.md | Adds the structured debate input sent to Codex for review. |
| docs/comparison/08-prd-taskmaster/CODEX_RESPONSE.md | Adds Codex’s verdict and recommended modifications for the borrow set. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Quality heuristics: diagnostic-only checks surfaced AFTER GATE_DEFINE_PASSED is | ||
| // written. These do NOT participate in parseSpec / SpecLoadError / approval flow | ||
| // (rule 20: zero new gate authority). Pinned in docs/references/spec-contract.md | ||
| // under "Quality heuristics (diagnostic-only)". |
| // or a named-control identifier from the contract-pinned allow-list. | ||
| // The intent is the source comparison's "explicit metric or named control" rule. | ||
| const METRIC_RE = /\d+\s*(?:ms|s|m|h|%|MB|GB|KB|kb|req|rps|qps|hz|min|sec|seconds?|minutes?|hours?|days?|users?|requests?|rows?|bytes?|tokens?)\b/i | ||
| const NAMED_CONTROL_RE = /\b(?:OAuth|OIDC|SAML|JWT|MFA|TOTP|RBAC|ABAC|ACL|TLS|mTLS|HMAC|AES|RSA|ECDSA|HKDF|PBKDF2|bcrypt|argon2|scrypt|SOC2|HIPAA|GDPR|PCI|CSP|CORS|CSRF)\b/ |
| The matching pattern is: | ||
|
|
||
| ```regex | ||
| \b(?:should\s+be\s+|must\s+be\s+|needs?\s+to\s+be\s+)?(fast|quick|slow|good|bad|poor|user-friendly|easy|simple|secure|safe|scalable|flexible|performant|efficient)\b | ||
| ``` | ||
|
|
||
| The optional lead-in is `should be`, `must be`, or `need(s) to be`. Warnings are suppressed for any bullet containing an explicit metric (digits plus a unit-like token) or a named-control reference from this allow-list: | ||
|
|
|
|
||
| **Decision (pre-debate): YES — code-oz exceeds, with two narrow prompt-adjacent borrows.** | ||
|
|
||
| `prd-taskmaster` is a single-purpose Claude Code skill that turns a feature idea into a Product Requirements Document, then hands the PRD to TaskMaster (an external task generator) for breakdown and execution. It is ~3.7k LOC concentrated in one Python `script.py` plus a `SKILL.md` workflow. Its authority surface ends at *PRD draft + 13-check validation + USER-TEST checkpoint insertion*. code-oz, by contrast, is a multi-phase runtime with file-based gates, sha256-bound artifacts, cross-family review at REVIEW, run-level budgets, worktree isolation, and a debate-scheduler. |
| --- | ||
| name: codex-briefing-prd-taskmaster | ||
| companion-docs: COMPARISON.md (analysis), CODEX_RESPONSE.md (verdicts), SYNTHESIS.md (locks) | ||
| target: structured Codex debate input for the prd-taskmaster comparison | ||
| status: dispatched-pending | ||
| date: 2026-05-10 | ||
| codex-model: gpt-5.5 | ||
| codex-effort: xhigh | ||
| codex-sandbox: read-only | ||
| --- |
|
|
||
| 1. Update `docs/references/spec-contract.md` with the pinned 15-term vocabulary list and the Goals-sufficiency heuristic. Doc-only commit. | ||
| 2. Add `SpecLintIssue` and `lintSpecQuality` to `src/artifacts/spec.ts`. Helper is pure: `(spec: ParsedSpec) → readonly SpecLintIssue[]`. | ||
| 3. Wire the linter into the DEFINE completion message so users see warnings *after* `GATE_DEFINE_PASSED.json` is written. Approval semantics untouched. |
| - `a1fc14ff47a213104d1cc480b150905cde807c94` test(spec): add lintSpecQuality coverage (M-SPEC1) | ||
| - `50575bfe8035a738508de9c280d1c062b9eefc2e` docs(comparison): close M-SPEC1 implementation in prd-taskmaster synthesis | ||
| - `ea5593bd0a867d75fba1e32e6eda948b2fa80152` fix(spec): address Opus review F1+F2+F3 (regex clarity, AND-arm test, header comment) | ||
| - `PENDING_R1_FIX_SHA` fix(spec): close R1 findings — closure freshness, companion docs, QH2 healthy-arm test, narrow NAMED_CONTROL_RE |
Summary
Adds the M-SPEC1 prd-taskmaster borrow: a
lintSpecQualitydiagnostic helper for SPEC artifacts plus DEFINE-phase surfacing, with the underlying vague-language vocabulary and Goals sufficiency heuristic pinned in the SPEC contract.lintSpecQuality(src/artifacts/spec.ts): pure diagnostic helper, returns warnings only; never throws, never blocks gate writes, never entersSpecLoadErrorCode.docs/references/spec-contract.mdpins the 15-term vague-language vocabulary, the AND-arm exemption rules (explicit metric OR named-control allow-list), and the Goals sufficiency heuristic (QH2).NAMED_CONTROL_REis the contract-pinned allow-list verbatim (OAuth, OIDC, SAML, JWT, MFA, TOTP, RBAC, ABAC, ACL, TLS, mTLS, HMAC, AES, RSA, ECDSA, HKDF, PBKDF2, bcrypt, argon2, scrypt, SOC2, HIPAA, GDPR, PCI, CSP, CORS, CSRF); the prior generic-uppercase arm was dropped after the R1 round flagged it as over-permissive on bullets like "REST API should be fast".docs/comparison/08-prd-taskmaster/{COMPARISON,CODEX_BRIEFING,CODEX_RESPONSE,SYNTHESIS}.mdcapture the borrow analysis + Codex accept-with-modifications verdict.Authority footprint (Rule 20)
Zero new authority.
lintSpecQualityis a diagnostic outsideSpecLoadError; it does not write gate files, does not emit new events, does not enter approval-blocking paths, and is not config-driven. Vague-vocabulary is contract-pinned (not a config knob) so the deterministic validation surface cannot drift across runs.Review history
NAMED_CONTROL_REnarrowed to allow-list,SYNTHESIS.mdclosure block updated.Merge with main
Merged origin/main (11 PRs landed today). No conflicts.
lintSpecQualitysurface, spec-contract additions, and 08-prd-taskmaster comparison docs all sit alongside the new content from those PRs without touching the same files.Test plan
bun test— 3189 pass, 2 skip (xai live), 0 failbun run typecheck— clean