Releases · lucioduran/ax-audit

09 Jun 16:12

v3.6.0

e1984dc

v3.6.0 — AI licensing, cloaking detection, crawl efficiency & full documentation Latest


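This release consolidates everything since v3.1.0 (v3.2.0 → v3.6.0): the four informational checks described below (three added since v3.1.0, plus `content-negotiation`, which shipped in 3.1 and has been hardened since), Content Signals support, parallel batch auditing, fetch retries, a Markdown reporter, and a complete documentation set. ax-audit goes from 15 to 18 checks, and from 229 to 301 tests.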

All four new checks are informational in 3.x — they run and report full findings but carry weight 0, so your existing scores and baselines are unchanged. They gain weight in v4.0.

✨ New checks

`content-negotiation` — Markdown for Agents (v3.2 surface, shipped 3.1; hardened since)

Probes the homepage with Accept: text/markdown — the pattern served by Cloudflare and Vercel and requested by Claude Code, Cursor, and OpenCode (~80% fewer tokens than HTML). Validates the negotiated Content-Type, that the body is real Markdown (not relabeled HTML), Vary: Accept for cache correctness, and reports the size reduction vs HTML. Partial credit for a <link rel="alternate" type="text/markdown"> fallback.
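The validations above can be sketched as pure functions over a response that was requested with `Accept: text/markdown`. This is a minimal illustration, not ax-audit's internals: every name below is hypothetical, and the relabeled-HTML test is a deliberately crude heuristic.

```typescript
// Hypothetical helpers for illustration; not part of ax-audit's API.
type ResponseHeaders = Record<string, string>;

/** Case-insensitive header lookup on a plain object. */
function getHeader(headers: ResponseHeaders, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

/** True when the media type (ignoring parameters) is text/markdown. */
function isMarkdownContentType(headers: ResponseHeaders): boolean {
  const value = getHeader(headers, "content-type") ?? "";
  return value.split(";")[0].trim().toLowerCase() === "text/markdown";
}

/** True when Vary lists Accept, or is the wildcard. */
function varyCoversAccept(headers: ResponseHeaders): boolean {
  const value = getHeader(headers, "vary") ?? "";
  const tokens = value.split(",").map((token) => token.trim().toLowerCase());
  return tokens.includes("accept") || tokens.includes("*");
}

/** Crude heuristic (an assumption): a body that opens like an HTML document is relabeled HTML. */
function looksLikeRelabeledHtml(body: string): boolean {
  return /^\s*(<!doctype html|<html[\s>])/i.test(body);
}

/** Size reduction of the Markdown body relative to the HTML body, in percent. */
function sizeReductionPercent(htmlBytes: number, markdownBytes: number): number {
  if (htmlBytes <= 0) return 0;
  return Math.round((1 - markdownBytes / htmlBytes) * 100);
}
```

Keeping the checks free of network calls makes them easy to unit-test against recorded headers and bodies.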

`rsl` — Really Simple Licensing (v3.3)

Validates RSL 1.0, the machine-readable content-licensing standard endorsed by 1,500+ publishers (Reddit, Yahoo, Medium, O'Reilly). Discovery via all three spec mechanisms — robots.txt License: directive, Link: rel="license" header, and <link rel="license" type="application/rsl+xml">. Document validation: namespace, <content url> requirement, <license> presence, permits/prohibits vocabulary (usage/user/geo), and payment types. Flags pre-1.0 draft tokens with migration hints.
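The three discovery mechanisms can be illustrated with small extractors. This is a sketch under stated assumptions: the function names are hypothetical, the `License:` directive is assumed to carry a single URL, and the `Link` header handling is simplified rather than a full header parser.

```typescript
// Hypothetical helpers for illustration; not part of ax-audit's API.

/** URLs from `License:` lines in robots.txt (assumed form: `License: <url>`). */
function licenseUrlsFromRobots(robotsTxt: string): string[] {
  const urls: string[] = [];
  for (const line of robotsTxt.split(/\r?\n/)) {
    const match = /^\s*license\s*:\s*(\S+)/i.exec(line);
    if (match) urls.push(match[1]);
  }
  return urls;
}

/** Targets of Link header entries whose rel includes "license" (simplified parsing). */
function licenseUrlsFromLinkHeader(linkHeader: string): string[] {
  const urls: string[] = [];
  const entry = /<([^>]*)>([^<]*)/g; // `<target>` followed by its parameters
  let match: RegExpExecArray | null;
  while ((match = entry.exec(linkHeader)) !== null) {
    const rel = /;\s*rel\s*=\s*"?([^";,]*)"?/i.exec(match[2]);
    if (rel && rel[1].toLowerCase().split(/\s+/).includes("license")) {
      urls.push(match[1]);
    }
  }
  return urls;
}

/** href values of <link rel="license" type="application/rsl+xml"> tags. */
function licenseUrlsFromHtml(html: string): string[] {
  const urls: string[] = [];
  const tag = /<link\b[^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = tag.exec(html)) !== null) {
    const rel = /\brel\s*=\s*["']([^"']*)["']/i.exec(match[0]);
    const type = /\btype\s*=\s*["']([^"']*)["']/i.exec(match[0]);
    const href = /\bhref\s*=\s*["']([^"']*)["']/i.exec(match[0]);
    if (
      rel && type && href &&
      rel[1].toLowerCase().split(/\s+/).includes("license") &&
      type[1].toLowerCase() === "application/rsl+xml"
    ) {
      urls.push(href[1]);
    }
  }
  return urls;
}
```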

`agent-access` — Cloaking detection (v3.4)

Probes the homepage with realistic user-agents for the 8 core AI crawlers and compares status and visible-text volume against the baseline. Catches the failure mode invisible to operators: robots.txt allows GPTBot while your WAF returns it a 403 (Cloudflare's "Block AI Crawlers" toggle produces exactly this). Blocks consistent with an explicit robots.txt Disallow are treated as intentional. Includes a verified-bots caveat for WAFs using Web Bot Auth.
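The comparison logic might look roughly like the sketch below. The type names, verdict labels, and the 50% text-volume threshold are assumptions made for illustration; the release notes do not state the actual thresholds.

```typescript
// Hypothetical model of one crawler probe; not ax-audit's API.
interface ProbeResult {
  status: number;            // HTTP status returned to this user-agent
  visibleTextLength: number; // characters of visible text in the body
}

type Verdict = "ok" | "intentional-block" | "cloaked-block" | "thin-content";

/**
 * Compare a crawler probe against the baseline request.
 * minTextRatio is an assumed threshold, not a documented value.
 */
function classifyProbe(
  baseline: ProbeResult,
  probe: ProbeResult,
  disallowedByRobots: boolean,
  minTextRatio = 0.5,
): Verdict {
  const blocked = baseline.status < 400 && probe.status >= 400;
  if (blocked) {
    // A block that matches an explicit robots.txt Disallow is the operator's choice.
    return disallowedByRobots ? "intentional-block" : "cloaked-block";
  }
  if (
    baseline.visibleTextLength > 0 &&
    probe.visibleTextLength / baseline.visibleTextLength < minTextRatio
  ) {
    return "thin-content";
  }
  return "ok";
}
```

For example, a 403 served to a crawler that robots.txt allows would come out as `cloaked-block`, which is the WAF mismatch described above.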

`crawl-efficiency` (v3.5)

Measures the cost of crawling your pages: compression (rewards Brotli, accepts gzip/deflate/zstd), conditional GET (verifies an ETag/Last-Modified validator and that the server answers If-None-Match/If-Modified-Since with a real 304), and response size.
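As a rough illustration of the compression and conditional GET parts, the sketch below grades a `Content-Encoding` value and builds the revalidation headers from a first response's validators. All names and grade labels are hypothetical.

```typescript
// Hypothetical helpers for illustration; not part of ax-audit's API.
type CompressionGrade = "brotli" | "accepted" | "none";

/** Grade the Content-Encoding header: Brotli is preferred, gzip/deflate/zstd are accepted. */
function gradeCompression(contentEncoding: string | undefined): CompressionGrade {
  const codings = (contentEncoding ?? "")
    .split(",")
    .map((coding) => coding.trim().toLowerCase())
    .filter((coding) => coding.length > 0);
  if (codings.includes("br")) return "brotli";
  if (codings.some((coding) => ["gzip", "deflate", "zstd"].includes(coding))) return "accepted";
  return "none";
}

interface Validators {
  etag?: string;
  lastModified?: string;
}

/** Build the request headers for a revalidation request from the first response's validators. */
function conditionalHeaders(validators: Validators): Record<string, string> {
  const headers: Record<string, string> = {};
  if (validators.etag) headers["If-None-Match"] = validators.etag;
  if (validators.lastModified) headers["If-Modified-Since"] = validators.lastModified;
  return headers;
}

/** Conditional GET only counts when a validator exists and revalidation yields a real 304. */
function honorsConditionalGet(validators: Validators, revalidationStatus: number): boolean {
  const hasValidator = Boolean(validators.etag || validators.lastModified);
  return hasValidator && revalidationStatus === 304;
}
```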

🔧 Improvements

Content Signals Policy in `robots-txt` (v3.2)

The robots-txt check now parses Content-Signal: directives (contentsignals.org, CC0) — the search / ai-input / ai-train preferences Cloudflare serves by default on 3.8M+ managed domains. Declared signals are reported per User-agent group; malformed segments, unknown names, and out-of-group placement produce warnings. Informational — no score impact.
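A single directive line could be parsed along these lines. This is a sketch that assumes the comma-separated `name=yes|no` form (for example `Content-Signal: search=yes, ai-train=no`); the function name and warning strings are hypothetical, and per-group tracking is left out.

```typescript
// Hypothetical parser for illustration; not ax-audit's implementation.
const KNOWN_SIGNALS = ["search", "ai-input", "ai-train"];

interface ParsedSignals {
  signals: Record<string, boolean>; // true = yes, false = no
  warnings: string[];
}

/** Parse one `Content-Signal:` line into declared signals plus warnings. */
function parseContentSignal(line: string): ParsedSignals {
  const result: ParsedSignals = { signals: {}, warnings: [] };
  const value = line.replace(/^\s*content-signal\s*:/i, "");
  for (const raw of value.split(",")) {
    const segment = raw.trim();
    if (segment === "") continue;
    const match = /^([a-z-]+)\s*=\s*(yes|no)$/i.exec(segment);
    if (!match) {
      result.warnings.push(`malformed segment: ${segment}`);
      continue;
    }
    const name = match[1].toLowerCase();
    if (!KNOWN_SIGNALS.includes(name)) {
      result.warnings.push(`unknown signal: ${name}`);
      continue;
    }
    result.signals[name] = match[2].toLowerCase() === "yes";
  }
  return result;
}
```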

Infrastructure (v3.6)

- Fetch retries with exponential backoff for transient failures (network errors, timeouts, 408/425/429/5xx), configured with `--retries <n>` (default 2). Previously, a single transient timeout scored a check 0.
- Parallel batch auditing via `--concurrency <n>` and the new `BatchOptions` type, with order-preserving output. The default remains sequential.
- Markdown reporter: `--output markdown` for CI logs and PR comments (single and batch). New exports: `renderMarkdown`, `renderBatchMarkdown`.
- Added Google's official signed AI-agent user-agent `Google-Agent` (agent.bot.goog) to the known-crawler list.
- The CLI now validates `--retries`, `--concurrency`, and `--output`.
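The retry and concurrency behaviour described in this list can be sketched generically. The helpers below are hypothetical and are not ax-audit's implementation; in particular, the 250 ms base delay is an assumption, since the release notes only state the default of 2 retries.

```typescript
// Hypothetical helpers for illustration; not part of ax-audit's API.

/** Statuses treated as transient: 408, 425, 429, and any 5xx. */
function isTransientStatus(status: number): boolean {
  return status === 408 || status === 425 || status === 429 || (status >= 500 && status <= 599);
}

/** Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... (baseMs is an assumed value). */
function backoffDelayMs(attempt: number, baseMs = 250): number {
  return baseMs * 2 ** attempt;
}

/** Run an operation, retrying up to `retries` extra times when it throws. */
async function withRetries<T>(
  operation: () => Promise<T>,
  retries = 2,
  baseMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        const delay = backoffDelayMs(attempt, baseMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

/** Map with a concurrency limit; results keep the order of the input items. */
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim a slot before awaiting
      results[index] = await worker(items[index]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

Writing each result to its input index is what keeps the output order stable even when later items finish first.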

📚 Documentation

A complete documentation set under docs/, shipped in the npm package and mirrored at lucioduran.com/projects/ax-audit/docs:

- getting-started: first audit, reading the report, impact-ordered remediation, baselines
- concepts: the AX standards landscape (discovery, interaction, governance/licensing, transport)
- checks: exact per-finding scoring for all 18 checks
- cli / api / ci / architecture / faq: full reference, with an API-stability policy
- New CONTRIBUTING.md and SECURITY.md

Every new finding has a matching remediation guide at /projects/ax-audit/guides.

🐛 Fixes

Scorer division by zero: running only weight-0 checks (e.g. --checks rsl) returned NaN; now falls back to a plain average.

⚙️ Compatibility

No breaking changes. New checks are informational (weight 0); scores and baselines are unchanged from 3.1.x. Retries can raise scores on flaky endpoints that previously timed out, but the scoring model itself is unchanged.

📦 Install

npx ax-audit@3.6.0 https://your-site.com

Full changelog: v3.1.0...v3.6.0


06 Jun 14:07

lucioduran

v3.1.0

067a307

v3.1.0 — Markdown for Agents: content-negotiation check (15 checks)

Added — content-negotiation check (informational)

content-negotiation (weight 0 in 3.x): probes the homepage with Accept: text/markdown to detect Markdown for Agents support — the content-negotiation pattern implemented by Cloudflare and Vercel and requested by Claude Code, Cursor, and OpenCode. Markdown cuts token usage by ~80% vs HTML for the same content.
- Validates the negotiated Content-Type (text/markdown).
- Detects relabeled HTML documents masquerading as Markdown (−25).
- Validates Vary: Accept so shared caches and CDNs keep the HTML and Markdown representations apart (−15 when missing; accepts Vary: *).
- Reports the size reduction vs the HTML representation (informational).
- Partial credit (40) when negotiation is unsupported but a <link rel="alternate" type="text/markdown"> fallback is advertised.
- Distinguishes HTTP 406 from plain "ignores Accept" in the failure detail.
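The point values in this list suggest a scoring shape like the sketch below. It assumes a starting score of 100 for successful negotiation and 0 when neither negotiation nor the fallback is present; both are assumptions, since only the −25, −15, and 40 figures are given here. The names are hypothetical.

```typescript
// Hypothetical scoring sketch; only the -25, -15 and 40 values come from the notes above.
interface NegotiationFindings {
  negotiatedMarkdown: boolean;      // server answered Accept: text/markdown with text/markdown
  relabeledHtml: boolean;           // body is really an HTML document
  varyCoversAccept: boolean;        // Vary: Accept (or Vary: *) present
  alternateLinkAdvertised: boolean; // <link rel="alternate" type="text/markdown"> found
}

function scoreNegotiation(findings: NegotiationFindings): number {
  if (!findings.negotiatedMarkdown) {
    // Partial credit for the advertised fallback; 0 otherwise (assumed).
    return findings.alternateLinkAdvertised ? 40 : 0;
  }
  let score = 100; // assumed starting point
  if (findings.relabeledHtml) score -= 25;
  if (!findings.varyCoversAccept) score -= 15;
  return score;
}
```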

Added — per-request fetch headers

CheckContext.fetch now accepts an optional { headers } argument (new exported type: FetchOptions). Custom headers merge case-insensitively over the defaults, so a custom Accept replaces the default instead of being sent alongside it.
The in-memory request cache now keys on URL + normalized (lowercased, sorted) headers — mirroring Vary semantics on the wire, so the HTML and Markdown probes of the same URL never collide.
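The merge and cache-key behaviour can be illustrated as follows. This is a sketch of the idea only: the function names and the key layout are hypothetical and may differ from what the fetcher actually does.

```typescript
// Hypothetical helpers for illustration; not the fetcher's actual code.
type RequestHeaders = Record<string, string>;

/** Merge custom headers over defaults, treating header names case-insensitively. */
function mergeHeaders(defaults: RequestHeaders, custom: RequestHeaders): RequestHeaders {
  const merged: RequestHeaders = {};
  for (const [name, value] of Object.entries(defaults)) merged[name.toLowerCase()] = value;
  for (const [name, value] of Object.entries(custom)) merged[name.toLowerCase()] = value;
  return merged;
}

/** Cache key from URL plus lowercased, sorted headers, so header order and case do not matter. */
function cacheKey(url: string, headers: RequestHeaders): string {
  const normalized = Object.entries(headers)
    .map(([name, value]) => `${name.toLowerCase()}:${value}`)
    .sort();
  return `${url}\n${normalized.join("\n")}`;
}
```

With this shape, a probe sent with `accept: text/markdown` and one sent with the default `Accept` produce different keys for the same URL, so the two representations are cached separately.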

Fixed

Scorer division by zero: calculateOverallScore returned NaN when every selected check had weight 0 (e.g. --checks content-negotiation). It now falls back to a plain average, and returns 0 for empty input.
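The fix amounts to guarding the weighted average. The function below is a minimal sketch of that behaviour, not the actual `calculateOverallScore`; the `CheckScore` shape and the rounding are assumptions.

```typescript
// Hypothetical sketch of a weighted score with the weight-0 fallback.
interface CheckScore {
  score: number;  // 0..100
  weight: number; // 0 for informational checks
}

function overallScore(checks: CheckScore[]): number {
  if (checks.length === 0) return 0; // empty input
  const totalWeight = checks.reduce((sum, check) => sum + check.weight, 0);
  if (totalWeight === 0) {
    // Every selected check is informational: fall back to a plain average
    // instead of dividing by zero (which would yield NaN).
    const total = checks.reduce((sum, check) => sum + check.score, 0);
    return Math.round(total / checks.length);
  }
  const weighted = checks.reduce((sum, check) => sum + check.score * check.weight, 0);
  return Math.round(weighted / totalWeight);
}
```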

Scoring

The new check is informational in 3.x: it runs and reports findings but does not affect the overall score, so existing scores and baselines are unchanged. It will gain weight in v4.0, consistent with treating score-affecting changes as breaking (see v3.0.0).

Tests

229 tests total (31 new): content-negotiation (19), fetcher integration against a real local HTTP server (9), and scorer coverage for weight-0 checks (3).

Try it:

npx ax-audit@3.1.0 https://your-site.com --checks content-negotiation


30 Apr 16:58

lucioduran

v3.0.0

1c96db4

v3.0.0 — Full agent-optimization coverage (14 checks)

Added — five new checks (full agent-optimization coverage)

- `html-rendering` (weight 9%): detects whether the static HTML response actually contains content, since most AI crawlers (GPTBot, ClaudeBot, CCBot, …) do not execute JavaScript. Heuristics: text length, word count, text-to-markup ratio, empty SPA mount points (#root, #__next, #__nuxt, #app, #svelte, #gatsby), semantic landmarks (<main>, <article>, <header>, <footer>, <nav>), single <h1>, <noscript> fallback, and <img alt> coverage.
- `sitemap` (weight 4%): locates the sitemap via the robots.txt Sitemap: directive or /sitemap.xml, validates XML shape, parses <urlset> and <sitemapindex>, samples child sitemaps from indexes, scores <lastmod> coverage and freshness (>365d → stale), and enforces the 50k-URL / 50MB limits.
- `seo-basics` (weight 7%): <title> length 20–70, <meta name="description"> length 70–160, <link rel="canonical"> (absolute, single), <html lang> (BCP 47), <meta charset="utf-8">, <meta name="viewport">, hreflang completeness with x-default. Title/description duplication detection.
- `tls-https` (weight 5%): site is served over HTTPS, HTTP redirects to HTTPS, HSTS max-age >= 6 months (1 year for preload), includeSubDomains, preload directive eligibility per https://hstspreload.org.
- `well-known-ai` (weight 3%): emerging AI-specific discovery files: /.well-known/ai.txt (Spawning), /.well-known/genai.txt, /ai-plugin.json (legacy ChatGPT plugin), /agents.json (Wildcard / OpenAgents), /.well-known/nlweb.json (Microsoft NLWeb). Each present file scores; coverage is bonus rather than baseline.
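As one example of these checks, the HSTS rules in `tls-https` can be sketched as a small parser plus two predicates. The names are hypothetical, and "6 months" is taken to mean 180 days, which is an assumption.

```typescript
// Hypothetical HSTS helpers for illustration; not ax-audit's implementation.
interface HstsPolicy {
  maxAge: number; // seconds
  includeSubDomains: boolean;
  preload: boolean;
}

const SIX_MONTHS_SECONDS = 180 * 24 * 60 * 60; // assumed reading of "6 months"
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

/** Parse a Strict-Transport-Security header; null when max-age is missing. */
function parseHsts(header: string): HstsPolicy | null {
  let maxAge: number | null = null;
  let includeSubDomains = false;
  let preload = false;
  for (const raw of header.split(";")) {
    const directive = raw.trim().toLowerCase();
    const match = /^max-age\s*=\s*"?(\d+)"?$/.exec(directive);
    if (match) maxAge = parseInt(match[1], 10);
    else if (directive === "includesubdomains") includeSubDomains = true;
    else if (directive === "preload") preload = true;
  }
  return maxAge === null ? null : { maxAge, includeSubDomains, preload };
}

/** Baseline requirement: max-age of at least six months. */
function meetsMinimumMaxAge(policy: HstsPolicy): boolean {
  return policy.maxAge >= SIX_MONTHS_SECONDS;
}

/** Preload eligibility: one-year max-age, includeSubDomains, and the preload directive. */
function preloadEligible(policy: HstsPolicy): boolean {
  return policy.preload && policy.includeSubDomains && policy.maxAge >= ONE_YEAR_SECONDS;
}
```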

Improved — existing checks

- `meta-tags`: now validates Open Graph completeness (og:title, og:description, og:url, og:type, og:image, og:site_name) and Twitter Card completeness (twitter:card, twitter:title, twitter:description, twitter:image). Reuses shared HTML utilities for tag matching.
- `agent-json`: validates that the url field is absolute and matches the audited origin, and that every skills[] entry has both id and description.
- `llms-txt` / `agent-json` / `mcp` / `openapi`: validate the Content-Type of the fetched resource (text/plain / text/markdown for llms.txt; application/json for the JSON manifests). Penalty: −5 per mismatch.
- `robots-txt`: CORE_AI_CRAWLERS extended (now 8 entries: GPTBot, ClaudeBot, ChatGPT-User, Claude-SearchBot, Google-Extended, PerplexityBot, OAI-SearchBot, CCBot). ALL_AI_CRAWLERS extended with MistralAI-User, KagiBot, GeminiBot, Goose, AwarioBot family, Bingbot, ImagesiftBot, omgili, Webzio-Extended, and others (47 known crawlers total).
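The Content-Type rule can be illustrated with a small penalty calculator. This is a sketch keyed by check name; the function names and the table layout are hypothetical, and only the expected media types and the 5-point penalty come from the notes above.

```typescript
// Hypothetical sketch of the Content-Type rule; not the shared utility itself.
const EXPECTED_MEDIA_TYPES: Record<string, string[]> = {
  "llms-txt": ["text/plain", "text/markdown"],
  "agent-json": ["application/json"],
  "mcp": ["application/json"],
  "openapi": ["application/json"],
};

/** Media type without parameters, lowercased ("text/plain; charset=utf-8" -> "text/plain"). */
function mediaTypeOf(contentType: string | undefined): string {
  return (contentType ?? "").split(";")[0].trim().toLowerCase();
}

/** Total points deducted: 5 for each fetched resource whose Content-Type does not match. */
function contentTypePenalty(observed: Record<string, string | undefined>): number {
  let penalty = 0;
  for (const check of Object.keys(observed)) {
    const allowed = EXPECTED_MEDIA_TYPES[check];
    if (allowed && !allowed.includes(mediaTypeOf(observed[check]))) penalty += 5;
  }
  return penalty;
}
```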

Refactored

- New shared module src/checks/html-utils.ts with regex-based primitives for HTML inspection (getMetaContent, findLinkTags, findMetaTagsByPrefix, extractVisibleText, countExecutableScripts, getTagAttribute, …). Eliminates duplicated regex code across meta-tags, seo-basics, html-rendering, and structured-data.
- New shared utility checkContentType in src/checks/utils.ts for consistent Content-Type validation.

Scoring

Weights redistributed across 14 checks, total still sums to 100. New highest-weight signals are llms-txt and robots-txt (11% each) followed by html-rendering / structured-data / http-headers (9%).

Tests

198 tests total (77 new). New suites: html-rendering (14), sitemap (12), seo-basics (19), tls-https (11), well-known-ai (8). Plus expanded meta-tags / agent-json / mcp / openapi / llms-txt suites for the new validations.

Breaking

Score deltas vs v2.x are expected on the same site because (a) weights were redistributed across 14 checks instead of 9, and (b) Content-Type validation on /llms.txt and the .well-known JSON manifests now applies a −5 penalty per mismatch. Sites previously scoring 100 may drop a few points until the new signals are addressed. Use --baseline to track regressions explicitly.


16 Apr 13:09

lucioduran

v2.4.0

6eaf3c5

v2.4.0 — Baseline Comparison

Baseline Comparison Mode

Track AX score changes over time by saving baselines and comparing against them in subsequent runs.

New CLI flags

- `--save-baseline <path>`: save audit results as a baseline JSON file
- `--baseline <path>`: compare against a previous baseline and show per-check score deltas (▲/▼)
- `--fail-on-regression <points>`: exit with code 1 if any check regresses by more than N points

Works with all output formats (terminal, JSON, HTML).
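The regression gate can be pictured as a per-check diff followed by a threshold test. The sketch below uses hypothetical names and a plain name-to-score map; it does not reflect the real baseline file format or the exported baseline functions.

```typescript
// Hypothetical sketch of a baseline diff; not the real baseline format or API.
type Scores = Record<string, number>; // check name -> score

interface ScoreDelta {
  check: string;
  before: number;
  after: number;
  delta: number; // positive = improvement, negative = regression
}

/** Per-check deltas for checks present in both the baseline and the current run. */
function diffScores(baseline: Scores, current: Scores): ScoreDelta[] {
  return Object.keys(current)
    .filter((check) => Object.prototype.hasOwnProperty.call(baseline, check))
    .map((check) => ({
      check,
      before: baseline[check],
      after: current[check],
      delta: current[check] - baseline[check],
    }));
}

/** True when any check dropped by more than `threshold` points. */
function hasRegression(deltas: ScoreDelta[], threshold: number): boolean {
  return deltas.some((entry) => -entry.delta > threshold);
}
```

A CI wrapper would exit with a non-zero code when `hasRegression` returns true, mirroring the "more than N points" wording of the flag.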

CI/CD usage

# Save baseline on main branch
- run: npx ax-audit https://your-site.com --save-baseline .ax-baseline.json

# Gate PRs on regressions
- run: npx ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 5

Programmatic API

New exports: saveBaseline(), loadBaseline(), diffBaseline(), toBaselineData() with full TypeScript types (BaselineData, BaselineDiff, CheckDiff).

Other

- 15 new tests (total: 121)
- Fixed a test runner glob that was silently skipping root-level test files


04 Mar 21:12

lucioduran

v2.2.1

9f77137

v2.2.1

- Improve type safety, remove duplication, fix nullish coalescing
- Suggest ax-init when score < 100
- Add remediation guide links to all findings
- Add green electric logo to README
- Fix: format robots-txt.ts to pass Prettier check


04 Mar 21:12

lucioduran

v2.1.0

bca3ea1

v2.1.0

Add remediation hints to all audit findings


04 Mar 21:12

lucioduran

v2.0.0

ce78557

v2.0.0

Add HTML reporter with score gauge and dark mode
Update version to 2.0.0


04 Mar 21:12

lucioduran

v1.14.0

4e97526

v1.14.0

Minor internal improvements and version bump


04 Mar 21:12

lucioduran

v1.13.0

9a4352e

v1.13.0

Add batch URLs support for auditing multiple sites in one run


04 Mar 21:12

lucioduran

v1.12.0

76eeb3f

v1.12.0

Fix Link header parsing with proper RFC 5988 parser
