Releases: lucioduran/ax-audit

v3.6.0 — AI licensing, cloaking detection, crawl efficiency & full documentation

09 Jun 16:12

This release consolidates everything since v3.1.0 (v3.2.0 → v3.6.0): three new checks plus a hardened content-negotiation check (first shipped in 3.1), Content Signals support, parallel batch auditing, fetch retries, a Markdown reporter, and a complete documentation set. ax-audit goes from 15 to 18 checks, and from 229 to 301 tests.

All four checks described below are informational in 3.x: they run and report full findings but carry weight 0, so your existing scores and baselines are unchanged. They gain weight in v4.0.


✨ New checks

content-negotiation — Markdown for Agents (v3.2 surface, shipped 3.1; hardened since)

Probes the homepage with Accept: text/markdown — the pattern served by Cloudflare and Vercel and requested by Claude Code, Cursor, and OpenCode (~80% fewer tokens than HTML). Validates the negotiated Content-Type, that the body is real Markdown (not relabeled HTML), Vary: Accept for cache correctness, and reports the size reduction vs HTML. Partial credit for a <link rel="alternate" type="text/markdown"> fallback.

rsl — Really Simple Licensing (v3.3)

Validates RSL 1.0, the machine-readable content-licensing standard endorsed by 1,500+ publishers (Reddit, Yahoo, Medium, O'Reilly). Discovery via all three spec mechanisms — robots.txt License: directive, Link: rel="license" header, and <link rel="license" type="application/rsl+xml">. Document validation: namespace, <content url> requirement, <license> presence, permits/prohibits vocabulary (usage/user/geo), and payment types. Flags pre-1.0 draft tokens with migration hints.
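
As a rough illustration of the first discovery mechanism, the robots.txt License: directive can be collected with a small pure function. This is a sketch, not ax-audit's implementation: findLicenseDirectives is a hypothetical name, and it assumes one directive per line with # starting a comment, as for other robots.txt fields.

```typescript
// Hypothetical helper (not part of ax-audit's API): collect the values of
// `License:` directives found in a robots.txt body.
// Assumption: one directive per line, and `#` starts a comment.
function findLicenseDirectives(robotsTxt: string): string[] {
  const urls: string[] = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip trailing comments
    const match = /^license\s*:\s*(\S+)/i.exec(line); // field name is case-insensitive
    if (match) urls.push(match[1]);
  }
  return urls;
}
```

The other two mechanisms (the Link header and the link element) would feed the same list of candidate URLs before document validation starts.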

agent-access — Cloaking detection (v3.4)

Probes the homepage with realistic user-agents for the 8 core AI crawlers and compares status and visible-text volume against the baseline. Catches the failure mode invisible to operators: robots.txt allows GPTBot while your WAF returns it a 403 (Cloudflare's "Block AI Crawlers" toggle produces exactly this). Blocks consistent with an explicit robots.txt Disallow are treated as intentional. Includes a verified-bots caveat for WAFs using Web Bot Auth.
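
The comparison against the baseline could look roughly like the sketch below. The type names, the verdict labels, and the 50% visible-text threshold are all assumptions made for illustration; the release notes do not state the thresholds ax-audit uses.

```typescript
// Illustrative only: names and thresholds are assumptions, not ax-audit internals.
type Probe = { status: number; visibleTextLength: number };
type Verdict = "ok" | "blocked" | "intentional-block" | "cloaked";

// Assumption: under half of the baseline's visible text counts as cloaking.
const MIN_TEXT_RATIO = 0.5;

const isSuccess = (status: number): boolean => status >= 200 && status < 300;

function classifyProbe(baseline: Probe, crawler: Probe, disallowedByRobots: boolean): Verdict {
  // The baseline request succeeds but the crawler user-agent is refused.
  if (isSuccess(baseline.status) && !isSuccess(crawler.status)) {
    // A block that matches an explicit robots.txt Disallow is treated as intentional.
    return disallowedByRobots ? "intentional-block" : "blocked";
  }
  // Both succeed, but the crawler sees far less visible text.
  if (
    isSuccess(crawler.status) &&
    baseline.visibleTextLength > 0 &&
    crawler.visibleTextLength / baseline.visibleTextLength < MIN_TEXT_RATIO
  ) {
    return "cloaked";
  }
  return "ok";
}
```

For example, a baseline of 200 with 5,000 characters of visible text against a GPTBot probe that gets a 403 yields "blocked" unless robots.txt disallows GPTBot.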

crawl-efficiency (v3.5)

Measures the cost of crawling your pages: compression (rewards Brotli, accepts gzip/deflate/zstd), conditional GET (verifies an ETag/Last-Modified validator and that the server answers If-None-Match/If-Modified-Since with a real 304), and response size.
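
The conditional GET part can be pictured as two steps: replay the validators from the first response, then check that the server answers with a 304. The sketch below assumes header names are already lowercased; the function names and result labels are hypothetical.

```typescript
// Sketch under assumptions: header names are lowercased; names are illustrative.
type ResponseHeaders = Record<string, string>;
type ConditionalGetResult = "no-validator" | "ignored" | "supported";

// Build the revalidation request headers from the validators on the first response.
function conditionalHeaders(first: ResponseHeaders): Record<string, string> {
  const headers: Record<string, string> = {};
  if (first["etag"]) headers["If-None-Match"] = first["etag"];
  if (first["last-modified"]) headers["If-Modified-Since"] = first["last-modified"];
  return headers;
}

// Judge the outcome once the revalidation request has been sent.
function evaluateConditionalGet(first: ResponseHeaders, revalidationStatus: number): ConditionalGetResult {
  if (Object.keys(conditionalHeaders(first)).length === 0) return "no-validator";
  // A full 200 here means the validator was advertised but not honored.
  return revalidationStatus === 304 ? "supported" : "ignored";
}
```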


🔧 Improvements

Content Signals Policy in robots-txt (v3.2)

The robots-txt check now parses Content-Signal: directives (contentsignals.org, CC0) — the search / ai-input / ai-train preferences Cloudflare serves by default on 3.8M+ managed domains. Declared signals are reported per User-agent group; malformed segments, unknown names, and out-of-group placement produce warnings. Informational — no score impact.
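
A minimal parser for a single Content-Signal: value might look like this. It assumes the comma-separated name=yes|no syntax and the three signal names mentioned above; parseContentSignal and its warning texts are hypothetical, and the out-of-group placement check is left out.

```typescript
// Hedged sketch: assumes `name=yes|no` segments separated by commas.
const KNOWN_SIGNALS = ["search", "ai-input", "ai-train"];

type ParsedSignals = { signals: Record<string, boolean>; warnings: string[] };

function parseContentSignal(value: string): ParsedSignals {
  const signals: Record<string, boolean> = {};
  const warnings: string[] = [];
  for (const segment of value.split(",")) {
    const trimmed = segment.trim();
    if (trimmed === "") continue; // tolerate a trailing comma
    const match = /^([a-z-]+)\s*=\s*(yes|no)$/i.exec(trimmed);
    if (!match) {
      warnings.push(`malformed segment: "${trimmed}"`);
      continue;
    }
    const name = match[1].toLowerCase();
    if (!KNOWN_SIGNALS.includes(name)) {
      warnings.push(`unknown signal: "${name}"`);
      continue;
    }
    signals[name] = match[2].toLowerCase() === "yes";
  }
  return { signals, warnings };
}
```

Collecting warnings instead of throwing matches the informational nature of the check: a malformed segment should not hide the signals that did parse.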

Infrastructure (v3.6)

  • Fetch retries with exponential backoff for transient failures (network errors, timeouts, 408/425/429/5xx). --retries <n> (default 2). Previously a single transient timeout scored a check 0.
  • Parallel batch auditing via --concurrency <n> and the new BatchOptions type, with order-preserving output. Default remains sequential.
  • Markdown reporter: --output markdown for CI logs and PR comments (single + batch). New exports: renderMarkdown, renderBatchMarkdown.
  • Added Google's official signed AI-agent user-agent Google-Agent (agent.bot.goog) to the known-crawler list.
  • CLI now validates --retries, --concurrency, and --output.
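
The retry behavior described in the first bullet can be sketched as follows. The retryable statuses and the default of 2 retries come from the notes above; the 250 ms base delay, the doubling schedule, and every function name are assumptions for illustration.

```typescript
// Illustrative sketch, not ax-audit's fetcher. Base delay and names are assumed.
const RETRYABLE_STATUSES = new Set([408, 425, 429]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUSES.has(status) || (status >= 500 && status <= 599);
}

// Assumed schedule: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelayMs(attempt: number, baseMs = 250): number {
  return baseMs * 2 ** attempt;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// `request` resolves with an HTTP status, or rejects on a network error or timeout.
async function withRetries(
  request: () => Promise<{ status: number }>,
  retries = 2,
  baseMs = 250,
): Promise<{ status: number }> {
  for (let attempt = 0; ; attempt++) {
    try {
      const response = await request();
      // Return on success, on a non-retryable status, or when retries are exhausted.
      if (!isRetryableStatus(response.status) || attempt >= retries) return response;
    } catch (error) {
      if (attempt >= retries) throw error; // out of retries: surface the failure
    }
    await sleep(backoffDelayMs(attempt, baseMs));
  }
}
```

With retries = 2 the request is attempted at most three times, so a single transient timeout no longer decides the result of a check.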

📚 Documentation

A complete documentation set under docs/, shipped in the npm package and mirrored at lucioduran.com/projects/ax-audit/docs:

  • getting-started — first audit, reading the report, impact-ordered remediation, baselines
  • concepts — the AX standards landscape (discovery, interaction, governance/licensing, transport)
  • checks — exact per-finding scoring for all 18 checks
  • cli / api / ci / architecture / faq — full reference, with an API-stability policy
  • New CONTRIBUTING.md and SECURITY.md

Every new finding has a matching remediation guide at /projects/ax-audit/guides.


🐛 Fixes

  • Scorer division by zero: running only weight-0 checks (e.g. --checks rsl) returned NaN; now falls back to a plain average.

⚙️ Compatibility

  • No breaking changes. New checks are informational (weight 0); scores and baselines are unchanged from 3.1.x. Retries can raise scores on flaky endpoints that previously timed out, but the scoring model itself is unchanged.

📦 Install

npx ax-audit@3.6.0 https://your-site.com

Full changelog: v3.1.0...v3.6.0

v3.1.0 — Markdown for Agents: content-negotiation check (15 checks)

06 Jun 14:07

Added — content-negotiation check (informational)

  • content-negotiation (weight 0 in 3.x): probes the homepage with Accept: text/markdown to detect Markdown for Agents support — the content-negotiation pattern implemented by Cloudflare and Vercel and requested by Claude Code, Cursor, and OpenCode. Markdown cuts token usage by ~80% vs HTML for the same content.
    • Validates the negotiated Content-Type (text/markdown).
    • Detects relabeled HTML documents masquerading as Markdown (−25).
    • Validates Vary: Accept so shared caches and CDNs keep the HTML and Markdown representations apart (−15 when missing; accepts Vary: *).
    • Reports the size reduction vs the HTML representation (informational).
    • Partial credit (40) when negotiation is unsupported but a <link rel="alternate" type="text/markdown"> fallback is advertised.
    • Distinguishes HTTP 406 from plain "ignores Accept" in the failure detail.
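
Put together, the penalties listed above suggest a scoring shape like the sketch below. The 25-point and 15-point penalties and the partial credit of 40 come from the list; the starting score of 100, the score of 0 without a fallback, and all names are assumptions.

```typescript
// Sketch of the scoring shape only; starting score and names are assumed.
type NegotiationProbe = {
  contentType: string | null; // Content-Type returned for Accept: text/markdown
  vary: string | null; // Vary header on that response
  bodyLooksLikeHtml: boolean; // body is an HTML document despite the label
  hasAlternateLink: boolean; // <link rel="alternate" type="text/markdown"> in the HTML
};

function varyCoversAccept(vary: string | null): boolean {
  if (!vary) return false;
  const tokens = vary.split(",").map((token) => token.trim().toLowerCase());
  return tokens.includes("*") || tokens.includes("accept");
}

function scoreNegotiation(probe: NegotiationProbe): number {
  const negotiated = (probe.contentType ?? "").toLowerCase().startsWith("text/markdown");
  // Negotiation unsupported: partial credit only if a fallback link is advertised.
  if (!negotiated) return probe.hasAlternateLink ? 40 : 0;
  let score = 100; // assumed starting score
  if (probe.bodyLooksLikeHtml) score -= 25; // relabeled HTML
  if (!varyCoversAccept(probe.vary)) score -= 15; // caches may mix representations
  return score;
}
```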

Added — per-request fetch headers

  • CheckContext.fetch now accepts an optional { headers } argument (new exported type: FetchOptions). Custom headers merge case-insensitively over the defaults, so a custom Accept replaces the default instead of being sent alongside it.
  • The in-memory request cache now keys on URL + normalized (lowercased, sorted) headers — mirroring Vary semantics on the wire, so the HTML and Markdown probes of the same URL never collide.
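
Both behaviors are easy to picture with plain objects. The sketch below is not the package's code: mergeHeaders and cacheKey are hypothetical names, and the key format is an arbitrary choice.

```typescript
// Illustrative sketch; function names and key format are assumptions.
type RequestHeaders = Record<string, string>;

// Merge custom headers over the defaults, comparing names case-insensitively,
// so a custom `accept` replaces the default `Accept` instead of joining it.
function mergeHeaders(defaults: RequestHeaders, custom: RequestHeaders = {}): RequestHeaders {
  const merged: RequestHeaders = {};
  for (const [name, value] of Object.entries(defaults)) merged[name.toLowerCase()] = value;
  for (const [name, value] of Object.entries(custom)) merged[name.toLowerCase()] = value;
  return merged;
}

// Cache key: the URL plus lowercased, name-sorted headers.
function cacheKey(url: string, headers: RequestHeaders): string {
  const normalized = Object.entries(headers)
    .map(([name, value]) => [name.toLowerCase(), value] as const)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, value]) => `${name}:${value}`)
    .join("\n");
  return `${url}\n${normalized}`;
}
```

Because the key includes the Accept value, a text/html probe and a text/markdown probe of the same URL get separate cache entries, while header order and casing do not matter.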

Fixed

  • Scorer division by zero: calculateOverallScore returned NaN when every selected check had weight 0 (e.g. --checks content-negotiation). It now falls back to a plain average, and returns 0 for empty input.
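
The fix amounts to guarding the weighted average. A minimal sketch of that guard, using a hypothetical overallScore function and CheckScore type rather than the real calculateOverallScore signature, and leaving out any rounding the real scorer may apply:

```typescript
// Sketch of the guard; names and the absence of rounding are assumptions.
type CheckScore = { score: number; weight: number };

function overallScore(results: CheckScore[]): number {
  if (results.length === 0) return 0; // empty input
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  if (totalWeight === 0) {
    // Every selected check is informational: use a plain average instead of 0 / 0.
    return results.reduce((sum, r) => sum + r.score, 0) / results.length;
  }
  // Weight-0 checks contribute nothing to either the numerator or the denominator.
  return results.reduce((sum, r) => sum + r.score * r.weight, 0) / totalWeight;
}
```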

Scoring

  • The new check is informational in 3.x: it runs and reports findings but does not affect the overall score, so existing scores and baselines are unchanged. It will gain weight in v4.0, consistent with treating score-affecting changes as breaking (see v3.0.0).

Tests

  • 229 tests total (31 new): content-negotiation (19), fetcher integration against a real local HTTP server (9), and scorer coverage for weight-0 checks (3).

Try it:

npx ax-audit@3.1.0 https://your-site.com --checks content-negotiation

v3.0.0 — Full agent-optimization coverage (14 checks)

30 Apr 16:58

Added — five new checks (full agent-optimization coverage)

  • html-rendering (weight 9%): detects whether the static HTML response actually contains content, since most AI crawlers (GPTBot, ClaudeBot, CCBot, …) do not execute JavaScript. Heuristics: text length, word count, text-to-markup ratio, empty SPA mount points (#root, #__next, #__nuxt, #app, #svelte, #gatsby), semantic landmarks (<main>, <article>, <header>, <footer>, <nav>), single <h1>, <noscript> fallback, and <img alt> coverage.
  • sitemap (weight 4%): locates the sitemap via robots.txt Sitemap: directive or /sitemap.xml, validates XML shape, parses <urlset> and <sitemapindex>, samples child sitemaps from indexes, scores <lastmod> coverage and freshness (>365d → stale), enforces 50k-URL / 50MB limits.
  • seo-basics (weight 7%): <title> length 20–70, <meta name="description"> length 70–160, <link rel="canonical"> (absolute, single), <html lang> (BCP 47), <meta charset="utf-8">, <meta name="viewport">, hreflang completeness with x-default. Title/description duplication detection.
  • tls-https (weight 5%): site is served over HTTPS, HTTP redirects to HTTPS, HSTS max-age >= 6 months (1 year for preload), includeSubDomains, preload directive eligibility per https://hstspreload.org.
  • well-known-ai (weight 3%): emerging AI-specific discovery files — /.well-known/ai.txt (Spawning), /.well-known/genai.txt, /ai-plugin.json (legacy ChatGPT plugin), /agents.json (Wildcard / OpenAgents), /.well-known/nlweb.json (Microsoft NLWeb). Each present file scores; coverage is bonus rather than baseline.
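
Taking the tls-https rules as an example, the HSTS part can be sketched as a header parser plus two thresholds. The sketch reads "6 months" as 180 days and "1 year" as 365 days, which is an assumption, and evaluateHsts is a hypothetical name.

```typescript
// Hedged sketch of the HSTS evaluation; thresholds in seconds are assumed readings.
const SIX_MONTHS_S = 15552000; // 180 days
const ONE_YEAR_S = 31536000; // 365 days

type HstsReport = {
  maxAge: number | null;
  includeSubDomains: boolean;
  preload: boolean;
  longEnough: boolean; // max-age >= 6 months
  preloadEligible: boolean; // 1 year + includeSubDomains + preload
};

function evaluateHsts(header: string): HstsReport {
  // Directives are separated by `;` and their names are case-insensitive.
  const directives = header.split(";").map((d) => d.trim().toLowerCase());
  const maxAgeDirective = directives.find((d) => d.startsWith("max-age="));
  const parsed = maxAgeDirective
    ? Number.parseInt(maxAgeDirective.slice("max-age=".length).replace(/"/g, ""), 10)
    : NaN;
  const maxAge = Number.isNaN(parsed) ? null : parsed;
  const includeSubDomains = directives.includes("includesubdomains");
  const preload = directives.includes("preload");
  return {
    maxAge,
    includeSubDomains,
    preload,
    longEnough: maxAge !== null && maxAge >= SIX_MONTHS_S,
    preloadEligible: maxAge !== null && maxAge >= ONE_YEAR_S && includeSubDomains && preload,
  };
}
```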

Improved — existing checks

  • meta-tags: now validates Open Graph completeness (og:title, og:description, og:url, og:type, og:image, og:site_name) and Twitter Card completeness (twitter:card, twitter:title, twitter:description, twitter:image). Reuses shared HTML utilities for tag matching.
  • agent-json: validates the url field is absolute and matches the audited origin, and that every skills[] entry has both id and description.
  • llms-txt / agent-json / mcp / openapi: validate Content-Type of the fetched resource (text/plain / text/markdown for llms.txt; application/json for the JSON manifests). Penalty: −5 per mismatch.
  • robots-txt: CORE_AI_CRAWLERS extended (now 8 entries: GPTBot, ClaudeBot, ChatGPT-User, Claude-SearchBot, Google-Extended, PerplexityBot, OAI-SearchBot, CCBot). ALL_AI_CRAWLERS extended with MistralAI-User, KagiBot, GeminiBot, Goose, AwarioBot family, Bingbot, ImagesiftBot, omgili, Webzio-Extended, and others (47 known crawlers total).
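
The Content-Type validation added to llms-txt and the JSON manifests boils down to comparing the media type while ignoring parameters such as charset. The helpers below are hypothetical stand-ins written for illustration; they do not reproduce the signature of the package's own utility.

```typescript
// Hypothetical stand-ins, not ax-audit's own Content-Type utility.
function contentTypeMatches(header: string | null, allowed: string[]): boolean {
  if (!header) return false;
  // Keep only the media type: "text/plain; charset=utf-8" -> "text/plain".
  const mediaType = header.split(";")[0].trim().toLowerCase();
  return allowed.includes(mediaType);
}

// 5-point penalty per mismatch, as stated in the list above.
function contentTypePenalty(header: string | null, allowed: string[]): number {
  return contentTypeMatches(header, allowed) ? 0 : 5;
}
```

For llms.txt the allowed list would be text/plain and text/markdown; for the JSON manifests it would be application/json.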

Refactored

  • New shared module src/checks/html-utils.ts with regex-based primitives for HTML inspection (getMetaContent, findLinkTags, findMetaTagsByPrefix, extractVisibleText, countExecutableScripts, getTagAttribute, …). Eliminates duplicated regex code across meta-tags, seo-basics, html-rendering, and structured-data.
  • New shared utility checkContentType in src/checks/utils.ts for consistent Content-Type validation.

Scoring

  • Weights redistributed across 14 checks, total still sums to 100. New highest-weight signals are llms-txt and robots-txt (11% each) followed by html-rendering / structured-data / http-headers (9%).

Tests

  • 198 tests total (77 new). New suites: html-rendering (14), sitemap (12), seo-basics (19), tls-https (11), well-known-ai (8). Plus expanded meta-tags / agent-json / mcp / openapi / llms-txt suites for the new validations.

Breaking

  • Score deltas vs v2.x are expected on the same site because (a) weights were redistributed across 14 checks instead of 9, and (b) Content-Type validation on /llms.txt and the .well-known JSON manifests now applies a −5 penalty per mismatch. Sites previously scoring 100 may drop a few points until the new signals are addressed. Use --baseline to track regressions explicitly.

v2.4.0 — Baseline Comparison

16 Apr 13:09

Baseline Comparison Mode

Track AX score changes over time by saving baselines and comparing against them in subsequent runs.

New CLI flags

  • --save-baseline <path> — save audit results as a baseline JSON file
  • --baseline <path> — compare against a previous baseline, show per-check score deltas (▲/▼)
  • --fail-on-regression <points> — exit with code 1 if any check regresses by more than N points

Works with all output formats (terminal, JSON, HTML).
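
The gating rule behind --fail-on-regression can be sketched as a comparison of per-check scores. The types and the findRegressions function below are hypothetical and do not mirror the package's baseline exports; the sketch also assumes that a check missing from the current run is skipped.

```typescript
// Illustrative sketch; types and names are assumptions.
type Scores = Record<string, number>; // check id -> score
type Regression = { check: string; before: number; after: number; delta: number };

function findRegressions(baseline: Scores, current: Scores, threshold: number): Regression[] {
  const regressions: Regression[] = [];
  for (const [check, before] of Object.entries(baseline)) {
    const after = current[check];
    if (after === undefined) continue; // check not run this time (assumed: skip)
    const delta = after - before;
    // "Regresses by more than N points": a drop of exactly N still passes.
    if (delta < -threshold) regressions.push({ check, before, after, delta });
  }
  return regressions;
}

// Exit code 1 when at least one check regressed past the threshold.
function exitCodeFor(regressions: Regression[]): number {
  return regressions.length > 0 ? 1 : 0;
}
```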

CI/CD usage

# Save baseline on main branch
- run: npx ax-audit https://your-site.com --save-baseline .ax-baseline.json

# Gate PRs on regressions
- run: npx ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 5

Programmatic API

New exports: saveBaseline(), loadBaseline(), diffBaseline(), toBaselineData() with full TypeScript types (BaselineData, BaselineDiff, CheckDiff).

Other

  • 15 new tests (total: 121)
  • Fixed test runner glob that was silently skipping root-level test files

v2.2.1

04 Mar 21:12

  • Improve type safety, remove duplication, fix nullish coalescing
  • Suggest ax-init when score < 100
  • Add remediation guide links to all findings
  • Add green electric logo to README
  • Fix: format robots-txt.ts to pass Prettier check

v2.1.0

04 Mar 21:12

  • Add remediation hints to all audit findings

v2.0.0

04 Mar 21:12

  • Add HTML reporter with score gauge and dark mode
  • Update version to 2.0.0

v1.14.0

04 Mar 21:12

  • Minor internal improvements and version bump

v1.13.0

04 Mar 21:12

  • Add batch URLs support for auditing multiple sites in one run

v1.12.0

04 Mar 21:12

  • Fix Link header parsing with proper RFC 5988 parser