agent-readiness

MIT License · Bun 1.1+ · 14 checks · Zero telemetry

agent-readiness CLI — scanning robots.txt, sitemap, MCP card, OAuth metadata and more from localhost

Check whether your site is ready for AI agents. From localhost.

Same checks isitagentready.com runs — on staging, in CI, before you ship.

The public agent-readiness tools only see what's on the open internet. That rules out http://localhost:3000, your staging deploy behind a VPN, your CI preview, and every PR before it merges — exactly when fixing things is cheap. agent-readiness is a small CLI you point at any URL you can reach.

$ agent-readiness https://example.com
agent-readiness  https://example.com/
scanned          2026-05-20T07:31:09Z

[discoverability] 3/3 passing
  PASS  robots.txt — Valid robots.txt with 11 User-agent group(s)
  PASS  sitemap.xml — Valid sitemap (well-known)
  PASS  Link header advertising api-catalog — Link header advertises api-catalog

[content] 1/1 passing
  PASS  Markdown content negotiation — Server returns text/markdown for Accept: text/markdown

[bot-access] 2/3 passing
  PASS  Content-Signal directive in robots.txt — Recognized signals: ai-train, search
  PASS  AI bot rules in robots.txt — Rules present for: GPTBot, ClaudeBot, ...
  FAIL  Web Bot Auth public key directory — 200 but content-type "text/html" — looks like SPA fallthrough

Install

The repo is private for now. Clone and link:

git clone git@github.com:itaides/agent-readiness.git
cd agent-readiness
bun install
bun link
agent-readiness https://example.com

When the repo flips public, the one-liner becomes:

bun install -g github:itaides/agent-readiness

Scanning a local dev server (HTTPS with self-signed cert, RFC1918, IPv6 loopback — all handled):

agent-readiness http://localhost:3000
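
A rough sketch of how a target could be classified as local (loopback, RFC1918, or IPv6 unique-local), which is the condition under which TLS verification is relaxed automatically. The function name `isLocalHost` and its exact rules are illustrative assumptions, not the contents of src/target.ts.

```typescript
// Hypothetical helper: decides whether a hostname points at the local machine
// or a private network. Not the tool's actual implementation.
function isLocalHost(hostname: string): boolean {
  // URL.hostname keeps the brackets around IPv6 literals, so strip them first.
  const host = hostname.replace(/^\[|\]$/g, "").toLowerCase();

  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "::1") return true; // IPv6 loopback

  // IPv6 unique local addresses (fc00::/7): first group starts with fc or fd.
  if (/^f[cd][0-9a-f]{2}:/.test(host)) return true;

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);

  if (a === 127) return true;                        // 127.0.0.0/8 loopback
  if (a === 10) return true;                         // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true;  // 172.16.0.0/12
  if (a === 192 && b === 168) return true;           // 192.168.0.0/16
  return false;
}
```

A caller would typically pass `new URL(target).hostname` and enable the equivalent of --insecure only when the result is true.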

What it checks

14 checks across 4 categories. Full pass criteria, endpoints, and spec references live in docs/checks.md.

Discoverability (3)

  • robots.txt (RFC 9309)
  • sitemap.xml (with robots.txt fallback discovery)
  • Link header advertising rel="api-catalog" (RFC 8288)
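
As an illustration of the Link header check, here is a minimal sketch that looks for a given relation type in a Link header value. It uses a simplified reading of the RFC 8288 grammar; `parseLinkHeader` and `advertisesRel` are hypothetical names, and the real check may parse more strictly.

```typescript
interface LinkEntry {
  target: string;
  rels: string[];
}

// Simplified parser: handles <uri>; param=value; param="quoted value" entries
// separated by commas. Edge cases such as "<" inside quoted values are ignored.
function parseLinkHeader(header: string): LinkEntry[] {
  const entries: LinkEntry[] = [];
  const linkValue =
    /<([^>]*)>((?:\s*;\s*[^;,=\s]+(?:\s*=\s*(?:"[^"]*"|[^;,\s]*))?)*)/g;
  let match: RegExpExecArray | null;
  while ((match = linkValue.exec(header)) !== null) {
    const params = match[2] ?? "";
    const rel = /;\s*rel\s*=\s*(?:"([^"]*)"|([^;,\s]*))/i.exec(params);
    const relValue = rel ? rel[1] ?? rel[2] ?? "" : "";
    entries.push({
      target: match[1] ?? "",
      // A rel parameter may carry several space-separated relation types.
      rels: relValue.toLowerCase().split(/\s+/).filter(Boolean),
    });
  }
  return entries;
}

function advertisesRel(header: string | null, relation: string): boolean {
  if (!header) return false;
  const wanted = relation.toLowerCase();
  return parseLinkHeader(header).some((entry) => entry.rels.includes(wanted));
}
```

Calling `advertisesRel(response.headers.get("link"), "api-catalog")` on the homepage response would then answer the question this check asks.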

Content (2)

  • Markdown content negotiation (Accept: text/markdown)
  • llms.txt (llmstxt.org — opt-in via --include-llms-txt)
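
The Markdown negotiation check can be pictured roughly as follows: request the page with Accept: text/markdown and inspect the Content-Type that comes back. This is a sketch under assumptions; `FetchLike`, `isMarkdownContentType`, and `servesMarkdown` are illustrative names, and the exact pass criteria live in docs/checks.md.

```typescript
// Minimal structural types so the sketch does not depend on DOM or Node typings.
interface FetchLikeResponse {
  ok: boolean;
  headers: { get(name: string): string | null };
}
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<FetchLikeResponse>;

// True when the media type is text/markdown, ignoring parameters such as charset.
function isMarkdownContentType(contentType: string | null): boolean {
  if (!contentType) return false;
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return mediaType === "text/markdown";
}

// Sends the negotiated request through the supplied fetch implementation.
async function servesMarkdown(url: string, fetchImpl: FetchLike): Promise<boolean> {
  const res = await fetchImpl(url, { headers: { Accept: "text/markdown" } });
  return res.ok && isMarkdownContentType(res.headers.get("content-type"));
}
```

Passing the global `fetch` as `fetchImpl` works in Bun and recent Node versions; a fake implementation makes the function easy to test offline.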

Bot access control (3)

  • Content-Signal: directive in robots.txt (ai-train, ai-input, search)
  • AI bot rules in robots.txt — looks for User-agent lines targeting GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and 10 more
  • Web Bot Auth key directory at /.well-known/http-message-signatures-directory
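
To make the two robots.txt checks concrete, here is a small sketch that scans a robots.txt body for User-agent lines naming known AI bots and for Content-Signal lines. The bot list is only the subset named above, the `name=value` signal syntax is an assumption about the Content-Signal format, and `summarizeRobots` is a hypothetical helper rather than the tool's parser.

```typescript
// Subset of AI bot tokens named in this README; the real check knows more.
const AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"];

interface RobotsSummary {
  aiBots: string[];         // AI bots that have their own User-agent line
  contentSignals: string[]; // signal names found in Content-Signal lines
}

function summarizeRobots(robotsTxt: string): RobotsSummary {
  const aiBots = new Set<string>();
  const signals = new Set<string>();

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon === -1) continue;

    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();

    if (field === "user-agent") {
      const bot = AI_BOTS.find((b) => b.toLowerCase() === value.toLowerCase());
      if (bot) aiBots.add(bot);
    } else if (field === "content-signal") {
      // Assumed form: "Content-Signal: ai-train=no, search=yes"
      for (const part of value.split(",")) {
        const name = (part.split("=")[0] ?? "").trim().toLowerCase();
        if (name) signals.add(name);
      }
    }
  }
  return { aiBots: Array.from(aiBots), contentSignals: Array.from(signals) };
}
```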

Capabilities (6)

  • Agent Skills index at /.well-known/agent-skills/index.json
  • API Catalog at /.well-known/api-catalog (RFC 9727)
  • OAuth Authorization Server Metadata (RFC 8414)
  • OAuth Protected Resource Metadata (RFC 9728)
  • MCP Server Card at /.well-known/mcp/server-card.json
  • WebMCP (heuristic — fetch-only model can't fully detect a runtime JS API; the check states the limitation in its evidence)

Coming, after spec research

A2A Agent Card, x402, UCP, ACP, MPP — tracked in issue #1. The rule: no check ships without explicit pass criteria written into docs/checks.md first.


CLI

agent-readiness <url> [options]

  --preset all|content|api      Check group preset (default: all)
  --only id1,id2,...            Run only these check IDs (overrides --preset)
  --skip id1,id2,...            Skip these check IDs (applied after preset)
  --list-checks                 List every check and preset, then exit
  --include-llms-txt            Run the opt-in llms.txt check
  --format human|json|html|md   Output format (default: human)
  --timeout <ms>                Per-request timeout (default: 5000)
  --concurrency <n>             Parallel requests (default: 4)
  --insecure                    Skip TLS verification (auto-on for loopback/RFC1918)
  --user-agent <ua>             Spoof UA (e.g. "GPTBot/1.0")
  --header "K: V"               Inject a request header (repeatable)

JSON output is pipeable and diffable:

agent-readiness https://example.com --format json | jq '.summary'

HTML output is a single self-contained file — inline CSS, no external assets, dark-mode-aware — ready to open in a browser, attach to a PR comment, or drop into a dashboard:

agent-readiness https://example.com --format html > report.html
open report.html

Markdown output works for PR bodies and Slack threads:

agent-readiness https://example.com --format md

What makes this different

Localhost works

Every check runs from HTTP-level signals — no headless browser, no JS execution. (One exception: WebMCP, where the spec genuinely requires JS; that check documents the limitation in its evidence.) Point it at any URL you can reach: http://localhost:3000, https://staging.internal.corp/, http://[::1]:8080/.

Strict response shape validation

SPA dev servers return 200 OK + text/html for any path, including /.well-known/* paths that should be JSON. A naive scanner ("did we get 200?") reports these as passing. Every JSON check here validates the content-type, parses the body, and checks required fields before claiming a pass. The same logic catches partial implementations in production: a real well-known endpoint serving JSON that's missing required fields fails for the right reason.
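
The three-step validation described above can be sketched like this. It is a minimal illustration, not the helper in src/checks/_shared.ts: `RawResponse`, `ShapeResult`, and `validateJsonShape` are hypothetical names, and accepting `+json` media types is an assumption.

```typescript
interface RawResponse {
  status: number;
  contentType: string | null;
  body: string;
}

type ShapeResult = { pass: true } | { pass: false; reason: string };

function validateJsonShape(res: RawResponse, requiredFields: string[]): ShapeResult {
  if (res.status !== 200) {
    return { pass: false, reason: `status ${res.status}` };
  }

  // Step 1: content-type must be JSON (assumption: "+json" suffixes also count).
  const mediaType = ((res.contentType ?? "").split(";")[0] ?? "").trim().toLowerCase();
  if (mediaType !== "application/json" && !mediaType.endsWith("+json")) {
    return { pass: false, reason: `200 but content-type "${mediaType}"` };
  }

  // Step 2: the body must actually parse as a JSON object.
  let parsed: unknown;
  try {
    parsed = JSON.parse(res.body);
  } catch {
    return { pass: false, reason: "body is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { pass: false, reason: "body is not a JSON object" };
  }

  // Step 3: every required field must be present.
  const obj = parsed as Record<string, unknown>;
  const missing = requiredFields.filter((field) => !(field in obj));
  if (missing.length > 0) {
    return { pass: false, reason: `missing required field(s): ${missing.join(", ")}` };
  }
  return { pass: true };
}
```

An SPA fallthrough response (200 with text/html) fails at step 1, while a JSON document lacking, say, the `issuer` field that RFC 8414 requires fails at step 3.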

Fast feedback

A full scan finishes in under a second against a typical site. Single-file ~36 KB JS bundle. Bun runtime. Two runtime dependencies (p-limit, robots-parser), both inlined.

CI integration (in progress)

JSON output ships today. A --baseline flag for a regression-only failure mode is tracked in the issue queue. The intended behavior: the first scan never breaks CI, but any later push that introduces a regression does.
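
Until that flag exists, the regression-only idea can be approximated in a CI script. The sketch below assumes each scan can be reduced to a list of `{ id, status }` pairs; that shape, the `CheckOutcome` type, and `findRegressions` are all hypothetical and do not describe the tool's actual JSON schema.

```typescript
type Status = "pass" | "fail";

// Hypothetical reduced form of one check result.
interface CheckOutcome {
  id: string;
  status: Status;
}

// Returns the IDs of checks that passed in the baseline and fail now.
// With no baseline (first scan) nothing counts as a regression.
function findRegressions(
  baseline: CheckOutcome[] | null,
  current: CheckOutcome[],
): string[] {
  if (baseline === null) return [];
  const passedBefore = new Set(
    baseline.filter((c) => c.status === "pass").map((c) => c.id),
  );
  return current
    .filter((c) => c.status === "fail" && passedBefore.has(c.id))
    .map((c) => c.id);
}
```

A CI step would exit non-zero only when the returned list is non-empty, so checks that were already failing do not block a merge.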


Privacy

The tool makes HTTP requests only to the target URL you provide. No analytics, no telemetry, no third-party services. See PRIVACY.md.

The code in this repository is the code that runs on your machine. The build artifact (dist/cli.js) is a single ~36 KB bundle.


Security

Reporting a vulnerability: see SECURITY.md. Findings about target sites you scan are between you and those site operators. For bugs in the tool itself (a false pass in shape validation, for example), please report privately first.


Develop

git clone git@github.com:itaides/agent-readiness.git
cd agent-readiness
bun install
bun test            # 6 tests, ~60 ms
bun run build       # dist/cli.js — single ~36 KB bundle
bunx tsc --noEmit -p .

Project layout:

src/
  cli.ts             # arg parsing + main()
  scanner.ts         # parallel check orchestrator
  http.ts            # fetch wrapper: timeout, redirects, body cap, insecure mode
  target.ts          # URL normalize + locality detection (loopback / RFC1918 / ULA)
  report.ts          # human + JSON formatters
  registry.ts        # check list + preset definitions
  types.ts           # Check, CheckResult, Target, HttpClient, Report
  checks/
    _shared.ts       # JSON well-known fetcher + shape validation helper
    *.ts             # one file per check (14 of them)
test/
  checks/*.test.ts   # per-check unit tests
  e2e.test.ts        # all checks against fixture-server
testdata/
  fixture-server.ts  # all-pass + SPA-fallthrough modes
docs/
  checks.md          # spec — pass criteria for every check

Adding a new check:

  1. Add src/checks/<id>.ts with a default export of type Check.
  2. Register it in src/registry.ts.
  3. Document it in docs/checks.md with endpoint + pass criteria + spec URL + asOf date.
  4. Add fixture data to testdata/fixture-server.ts for the all-pass case.
  5. bun test should still pass.
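
For orientation, step 1 might look something like the sketch below. The interfaces are stand-ins written for this example: the real `Check`, `CheckResult`, and `HttpClient` in src/types.ts may have different fields, and the check ID and endpoint are made up.

```typescript
// Hypothetical stand-ins for the types in src/types.ts; real shapes may differ.
interface HttpResult {
  status: number;
  contentType: string | null;
  body: string;
}
interface HttpClient {
  get(path: string): Promise<HttpResult>;
}
interface CheckResult {
  id: string;
  status: "pass" | "fail";
  evidence: string;
}
interface Check {
  id: string;
  category: string;
  run(http: HttpClient): Promise<CheckResult>;
}

const ID = "example-well-known"; // made-up check ID

// Pure evaluation step, kept separate so it can be unit-tested without I/O.
function evaluateExample(res: HttpResult): CheckResult {
  const mediaType = ((res.contentType ?? "").split(";")[0] ?? "").trim().toLowerCase();
  if (res.status !== 200) {
    return { id: ID, status: "fail", evidence: `status ${res.status}` };
  }
  if (mediaType !== "application/json") {
    return { id: ID, status: "fail", evidence: `200 but content-type "${mediaType}"` };
  }
  return { id: ID, status: "pass", evidence: "JSON document served" };
}

const exampleCheck: Check = {
  id: ID,
  category: "capabilities",
  async run(http) {
    // Made-up endpoint; a real check would use the path from its spec.
    return evaluateExample(await http.get("/.well-known/example.json"));
  },
};

// In src/checks/<id>.ts this object would be the default export.
```

Keeping the evaluation pure means the fixture server only needs to cover the end-to-end path, while unit tests feed canned responses straight into `evaluateExample`.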

License

MIT. Inspired by isitagentready.com and the Cloudflare agent-readiness blog post. Independent open-source implementation — not affiliated with Cloudflare.
