Added — five new checks (full agent-optimization coverage)
html-rendering (weight 9%): detects whether the static HTML response actually contains content, since most AI crawlers (GPTBot, ClaudeBot, CCBot, …) do not execute JavaScript. Heuristics: text length, word count, text-to-markup ratio, empty SPA mount points (#root, #__next, #__nuxt, #app, #svelte, #gatsby), semantic landmarks (<main>, <article>, <header>, <footer>, <nav>), single <h1>, <noscript> fallback, and <img alt> coverage.
sitemap (weight 4%): locates the sitemap via the robots.txt Sitemap: directive or /sitemap.xml, validates XML shape, parses <urlset> and <sitemapindex>, samples child sitemaps from indexes, scores <lastmod> coverage and freshness (>365d → stale), and enforces the 50k-URL / 50MB limits.
tls-https (weight 5%): checks that the site is served over HTTPS, that HTTP redirects to HTTPS, and that the HSTS header sets max-age >= 6 months (1 year for preload), includeSubDomains, and the preload directive, with preload eligibility judged per https://hstspreload.org.
well-known-ai (weight 3%): emerging AI-specific discovery files — /.well-known/ai.txt (Spawning), /.well-known/genai.txt, /ai-plugin.json (legacy ChatGPT plugin), /agents.json (Wildcard / OpenAgents), /.well-known/nlweb.json (Microsoft NLWeb). Each present file scores; coverage is bonus rather than baseline.
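The HSTS part of the tls-https check above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `evaluateHsts` and `HstsReport` are hypothetical names, "6 months" is assumed to mean 180 days, and only the header-level preload requirements (max-age of at least one year, includeSubDomains, preload) are modeled.

```typescript
// Assumed thresholds: "6 months" taken as 180 days, "1 year" as 365 days.
const SIX_MONTHS_SECONDS = 180 * 24 * 60 * 60; // 15552000
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31536000

interface HstsReport {
  maxAge: number | null; // parsed max-age in seconds, or null if absent
  includeSubDomains: boolean;
  preload: boolean;
  meetsMinimum: boolean; // max-age >= 6 months
  preloadEligible: boolean; // header-level preload requirements only
}

// Hypothetical helper (not the project's API): inspects the value of a
// Strict-Transport-Security response header.
function evaluateHsts(header: string | null): HstsReport {
  // Directives are separated by ";" and are case-insensitive.
  const directives = (header ?? "")
    .split(";")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);

  let maxAge: number | null = null;
  for (const directive of directives) {
    const match = /^max-age\s*=\s*"?(\d+)"?$/.exec(directive);
    if (match) {
      maxAge = Number.parseInt(match[1] ?? "0", 10);
    }
  }

  const includeSubDomains = directives.includes("includesubdomains");
  const preload = directives.includes("preload");
  const meetsMinimum = maxAge !== null && maxAge >= SIX_MONTHS_SECONDS;
  const preloadEligible =
    maxAge !== null && maxAge >= ONE_YEAR_SECONDS && includeSubDomains && preload;

  return { maxAge, includeSubDomains, preload, meetsMinimum, preloadEligible };
}
```

A caller would pass the raw header value from the HTTPS response, for example `evaluateHsts(response.headers.get("strict-transport-security"))` when using the Fetch API.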
Improved — existing checks
meta-tags: now validates Open Graph completeness (og:title, og:description, og:url, og:type, og:image, og:site_name) and Twitter Card completeness (twitter:card, twitter:title, twitter:description, twitter:image). Reuses shared HTML utilities for tag matching.
agent-json: validates the url field is absolute and matches the audited origin, and that every skills[] entry has both id and description.
llms-txt / agent-json / mcp / openapi: validate Content-Type of the fetched resource (text/plain / text/markdown for llms.txt; application/json for the JSON manifests). Penalty: −5 per mismatch.
New shared module src/checks/html-utils.ts with regex-based primitives for HTML inspection (getMetaContent, findLinkTags, findMetaTagsByPrefix, extractVisibleText, countExecutableScripts, getTagAttribute, …). Eliminates duplicated regex code across meta-tags, seo-basics, html-rendering, and structured-data.
New shared utility checkContentType in src/checks/utils.ts for consistent Content-Type validation.
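To make the Content-Type validation concrete, here is a small sketch of how a mismatch could translate into the −5 penalty. The actual signature of `checkContentType` is not shown in these notes, so `contentTypeMatches`, `contentTypePenalty`, and `FetchedResource` below are hypothetical stand-ins; only the accepted media types and the penalty size come from the notes.

```typescript
// Penalty per mismatching resource, as stated in the release notes.
const CONTENT_TYPE_PENALTY = 5;

// Accepted media types per resource kind, as listed in the notes.
const LLMS_TXT_TYPES: readonly string[] = ["text/plain", "text/markdown"];
const JSON_MANIFEST_TYPES: readonly string[] = ["application/json"];

// Hypothetical shape for a fetched resource; not the project's type.
interface FetchedResource {
  contentType: string | null; // raw Content-Type header value
  allowed: readonly string[]; // media types accepted for this resource
}

// Hypothetical helper: compares only the media type, ignoring parameters
// such as "; charset=utf-8" and letter case.
function contentTypeMatches(
  contentType: string | null,
  allowed: readonly string[],
): boolean {
  if (!contentType) return false;
  const mediaType = contentType.split(";")[0]?.trim().toLowerCase() ?? "";
  return allowed.includes(mediaType);
}

// Sums the penalty over every resource whose Content-Type does not match.
function contentTypePenalty(resources: readonly FetchedResource[]): number {
  const mismatches = resources.filter(
    (r) => !contentTypeMatches(r.contentType, r.allowed),
  );
  return mismatches.length * CONTENT_TYPE_PENALTY;
}
```

Under these assumptions, an llms.txt served as `text/html` and a JSON manifest served as `application/json; charset=utf-8` would together incur a single 5-point penalty.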
Scoring
Weights are redistributed across 14 checks; the total still sums to 100. The new highest-weight signals are llms-txt and robots-txt (11% each), followed by html-rendering, structured-data, and http-headers (9% each).
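As a quick arithmetic check on the weights, the sketch below tallies only the values stated in these notes, reading "(9%)" as 9% for each of the three checks named. The variable names are illustrative, and the weights of the remaining checks are not listed here, so they are left as a remainder rather than guessed.

```typescript
// Weights explicitly stated in these release notes (percent).
const statedWeights: ReadonlyArray<readonly [string, number]> = [
  ["llms-txt", 11],
  ["robots-txt", 11],
  ["html-rendering", 9],
  ["structured-data", 9],
  ["http-headers", 9],
  ["tls-https", 5],
  ["sitemap", 4],
  ["well-known-ai", 3],
];

const TOTAL_CHECKS = 14; // from the notes
const TOTAL_WEIGHT = 100; // from the notes

// Sum of the weights that the notes spell out.
const statedTotal = statedWeights.reduce((sum, [, weight]) => sum + weight, 0);

// Whatever is left must be shared by the checks whose weights are not listed.
const remainingChecks = TOTAL_CHECKS - statedWeights.length;
const remainingWeight = TOTAL_WEIGHT - statedTotal;

console.log(`stated: ${statedTotal}%, remaining: ${remainingWeight}% across ${remainingChecks} checks`);
```

The eight stated weights account for 61 points, which leaves 39 points for the other six checks.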
Tests
198 tests total (77 new). New suites: html-rendering (14), sitemap (12), seo-basics (19), tls-https (11), well-known-ai (8). Plus expanded meta-tags / agent-json / mcp / openapi / llms-txt suites for the new validations.
Breaking
Score deltas vs v2.x are expected on the same site because (a) weights were redistributed across 14 checks instead of 9, and (b) Content-Type validation on /llms.txt and the .well-known JSON manifests now applies a −5 penalty per mismatch. Sites previously scoring 100 may drop a few points until the new signals are addressed. Use --baseline to track regressions explicitly.