fix(audit): align schema pattern detection #50

Merged
rubenmarcus merged 2 commits into multivmlabs:main from KimHyeongRae0:fix/audit-schema-detection on May 14, 2026
Conversation

@KimHyeongRae0
Contributor

@KimHyeongRae0 KimHyeongRae0 commented May 12, 2026

Fixes audit/schema FAQ and HowTo detection drift. Validation: npm run lint, npm run test -- --run, npm run build.

@vercel

vercel Bot commented May 12, 2026

@KimHyeongRae0 is attempting to deploy a commit to the Cytonic Team on Vercel.

A member of the Team first needs to authorize it.

KimHyeongRae0 force-pushed the fix/audit-schema-detection branch from fb7823c to 0663375 on May 12, 2026 16:48
KimHyeongRae0 changed the title from "[codex] fix audit schema detection" to "fix(audit): align schema pattern detection" on May 12, 2026
@rubenmarcus
Member

Strong fix @KimHyeongRae0 — the audit was using regex shortcuts to detect FAQ/HowTo while the schema generator used proper detection logic, so the audit could report "FAQPage schema eligible" while the schema generator would silently produce nothing. Extracting detectFaqPatterns and detectHowToSteps into schema-patterns.ts and having both call sites consume the same source is exactly the right shape.

Three new tests covering the three cases (FAQ without answer, single-step "HowTo", real FAQ) make the regression boundary clear.

Approving. Will merge once the open queue clears.

@KimHyeongRae0 KimHyeongRae0 marked this pull request as ready for review May 14, 2026 07:35
@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@KimHyeongRae0
Contributor Author

Thanks. Marked this ready for review.

@greptile-apps

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR consolidates duplicate FAQ and HowTo pattern-detection logic into a new shared schema-patterns.ts module, then wires both schema.ts and audit.ts to use it. The audit checks now accurately reflect what schema markup will actually be generated: FAQ headings must begin with a recognised English question word, and HowTo content must contain at least two numbered step headings.

  • schema-patterns.ts (new): Exports detectFaqPatterns and detectHowToSteps, extracted verbatim from the private helpers that previously lived only in schema.ts.
  • audit.ts: Replaces two ad-hoc regexes with calls to the shared functions; auditCitability also switches from testing all pages concatenated to per-page iteration, eliminating a subtle false-positive risk.
  • audit.test.ts: Four new tests exercise every branch — non-question-word heading, single step, valid FAQ, and two-step HowTo.
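
The two-step HowTo rule described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in schema-patterns.ts: the `HowToStep` shape, the `STEP_HEADING` regex, and the accepted heading forms ("Step N" and "N.") are assumptions based only on the description in this thread.

```typescript
// Hypothetical step shape; the real type in schema-patterns.ts is not shown in this PR.
interface HowToStep {
  name: string;
  text: string;
}

// Assumed heading forms: "## Step 1: Install" or "## 1. Install".
const STEP_HEADING = /^#{1,6}\s+(?:step\s+\d+\s*[:.)-]?|\d+[.)])\s*(.*)$/i;

function detectHowToSteps(markdown: string): HowToStep[] {
  const steps: HowToStep[] = [];
  let current: HowToStep | null = null;

  for (const line of markdown.split("\n")) {
    const match = STEP_HEADING.exec(line);
    if (match) {
      // A numbered heading opens a new step.
      current = { name: (match[1] ?? "").trim(), text: "" };
      steps.push(current);
    } else if (/^#{1,6}\s/.test(line)) {
      // Any other heading ends the current step body.
      current = null;
    } else if (current !== null && line.trim().length > 0) {
      // Non-empty lines under a step heading become its body text.
      current.text = `${current.text} ${line.trim()}`.trim();
    }
  }

  // A single numbered heading is not treated as a HowTo.
  return steps.length >= 2 ? steps : [];
}
```

Under these assumptions, content with only one `## Step 1:` heading yields an empty array, so an audit check built on the same function would not report HowTo eligibility that the schema generator cannot fulfil.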

Confidence Score: 5/5

Safe to merge — the change is a clean extraction of identical logic into a shared module with no observable behavioral regression on valid inputs.

All four branches introduced by the new detection logic are covered by the new tests. The audit-side behavioral tightening (question-word requirement, per-page FAQ scan, ≥2-step HowTo guard) is intentional and matches what schema generation already enforced, so no downstream drift is introduced.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/core/schema-patterns.ts | New shared module extracting detectFaqPatterns and detectHowToSteps from schema.ts; logic is identical to the originals, now exported for both schema generation and audit use. |
| src/core/audit.ts | Replaces two ad-hoc inline regexes with calls to the shared detectFaqPatterns/detectHowToSteps; per-page iteration replaces the allContent concatenation approach in auditCitability, which is strictly more correct. |
| src/core/schema.ts | Private detectFaqPatterns and detectHowToSteps helpers removed; functionality unchanged, now delegated to the shared schema-patterns module. |
| src/core/audit.test.ts | Adds four new schema-presence tests: non-question-word heading (false), single HowTo step (false), valid FAQ (true), and two-step HowTo (true), covering all four branches introduced by the new detection logic. |
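
One way the concatenation approach could produce a false positive is sketched below. The thread does not spell out the exact scenario, so this is an assumed example: the `Page` type, the question-word list, the simplified `detectFaqPatterns`, and the helper names `anyPageHasFaq` and `concatenatedHasFaq` are all hypothetical.

```typescript
// Hypothetical shapes; the real types in the repository are not shown in this PR.
interface FaqItem {
  question: string;
  answer: string;
}
interface Page {
  path: string;
  content: string;
}

// Assumed list of English question words; the actual list may differ.
const QUESTION_WORDS = /^(what|why|how|when|where|who|which|can|does|do|is|are|should)\b/i;

// Simplified stand-in for the shared detector: a question heading plus a non-empty answer.
function detectFaqPatterns(markdown: string): FaqItem[] {
  const items: FaqItem[] = [];
  let question: string | null = null;
  let answerLines: string[] = [];

  // Keep the pending question only if it collected some answer text.
  const flush = (): void => {
    const answer = answerLines.join(" ").trim();
    if (question !== null && answer.length > 0) {
      items.push({ question, answer });
    }
    question = null;
    answerLines = [];
  };

  for (const line of markdown.split("\n")) {
    const heading = /^#{1,6}\s+(.+?)\s*$/.exec(line);
    if (heading) {
      flush();
      const text = heading[1] ?? "";
      if (QUESTION_WORDS.test(text)) question = text;
    } else if (question !== null && line.trim().length > 0) {
      answerLines.push(line.trim());
    }
  }
  flush();
  return items;
}

// Per-page check: a page must contain a complete FAQ item on its own.
function anyPageHasFaq(pages: Page[]): boolean {
  return pages.some((page) => detectFaqPatterns(page.content).length > 0);
}

// Concatenated check, shown only to illustrate the cross-page leak.
function concatenatedHasFaq(pages: Page[]): boolean {
  const allContent = pages.map((page) => page.content).join("\n");
  return detectFaqPatterns(allContent).length > 0;
}

const pages: Page[] = [
  { path: "/a", content: "# Guide\n\nSome text.\n\n## What is next?" },
  { path: "/b", content: "This page starts with a plain paragraph." },
];
```

With these two pages, the unanswered question at the end of `/a` picks up the opening paragraph of `/b` as its answer once the content is joined, so the concatenated check reports a FAQ while the per-page check does not.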

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[audit.ts] -->|import| SP[schema-patterns.ts]
    B[schema.ts] -->|import| SP
    SP --> F{detectFaqPatterns}
    F --> F1[Heading starts with question word?]
    F1 -->|yes| F2[Collect answer lines]
    F2 --> F3[Return items]
    F1 -->|no| F4[Skip]
    SP --> H{detectHowToSteps}
    H --> H1[Heading matches Step N or N.?]
    H1 -->|yes| H2[Collect body lines]
    H2 --> H3{steps >= 2?}
    H3 -->|yes| H4[Return steps]
    H3 -->|no| H5[Return empty]

Reviews (3): Last reviewed commit: "test(audit): add positive HowTo test cas..."

Comment thread src/core/audit.test.ts
Greptile flagged that the schema-presence test block covered three
cases (a non-question heading, a single-step heading, and a FAQ that
should pass) but no positive HowTo case. Without one, a regression in
the `>= 2 steps` guard or the step regex would go undetected.

Added a test asserting that content with two ## Step N: headings
passes the 'FAQPage or HowTo schema' check.

12 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rubenmarcus
Member

Added the missing positive HowTo test case (0775299) — asserts that content with two ## Step N: headings passes the 'FAQPage or HowTo schema' check. Resolves the last open thread. 12 audit tests pass.

@greptileai re-review please.

@rubenmarcus rubenmarcus merged commit 3720f22 into multivmlabs:main May 14, 2026
2 of 3 checks passed

2 participants