diff --git a/.agents/skills/agent-readiness-audit/SKILL.md b/.agents/skills/agent-readiness-audit/SKILL.md new file mode 100644 index 000000000000..afb2e56a3add --- /dev/null +++ b/.agents/skills/agent-readiness-audit/SKILL.md @@ -0,0 +1,228 @@ +--- +name: agent-readiness-audit +description: > + Audit a documentation site for agent-friendliness: discovery, markdown + delivery, crawlability, semantic structure, machine-readable surfaces, + and content legibility. Use when asked to assess docs.docker.com or any + docs site for AI/agent readiness, produce a scored report, compare with + external scanners, or generate a remediation list. Triggers on: + "audit docs for agent readiness", "how agent-friendly is docs.docker.com", + "score our docs for AI agents", "review llms.txt / markdown / crawlability", + "create an agent-readiness remediation plan". +argument-hint: "" +--- + +# Agent Readiness Audit + +Audit the live site, not the source tree alone. Prefer the same fetch path +an external agent would use in the wild: direct HTTP requests, sitemap +sampling, and page-level inspection. + +Do not reduce the result to a homepage-only scan or a binary checklist. + +## 1. Set scope + +Use `$ARGUMENTS` as the base URL when provided. Otherwise infer the base +URL from context and state the assumption. + +Decide whether the host being audited is: + +- a docs-only host +- an app/tool host +- a mixed host + +This matters for optional checks such as MCP, plugin manifests, or other +tool discovery files. Do not penalize a docs-only host for missing +tooling manifests that belong on a separate service. + +For `docs.docker.com`, treat the public docs host as docs-only. Docker's +MCP server is published separately, so missing MCP files on the docs host +should be reported as `N/A`, not as a failure. + +## 2. 
Gather sitewide signals + +Always check these resources first: + +- `/llms.txt` +- `/llms-full.txt` +- `/robots.txt` +- `/sitemap.xml` + +Only check host-level tool manifests when the host is an app/tool host, +mixed host, or explicitly advertises them: + +- `/.well-known/ai-plugin.json` +- `/.well-known/agent.json` +- `/.well-known/agents.json` + +Use the bundled script for a baseline: + +```bash +bash .agents/skills/agent-readiness-audit/scripts/baseline-probes.sh \ + "$ARGUMENTS" +``` + +The script produces baseline evidence only. You still need to interpret +what matters for a docs property and score it with the rubric. + +For docs-only hosts, you may skip tool-manifest probes to reduce noise: + +```bash +CHECK_TOOL_MANIFESTS=0 \ + bash .agents/skills/agent-readiness-audit/scripts/baseline-probes.sh \ + "$ARGUMENTS" +``` + +## 3. Sample representative pages + +Use the sitemap when available. Do not rely on the homepage alone. + +If `llms.txt` exists, sample some URLs from it as well. This helps catch +stale or misleading discovery surfaces that a sitemap-only sample would miss. + +Sample at least 12 pages when the site is large enough, and cover multiple +page types: + +- homepage or docs landing page +- section landing pages +- task guides +- product manuals +- reference or API pages +- tutorial or learning pages + +If the sitemap is missing or unusable, discover pages through internal +links and note the lower confidence. + +If the site has distinct delivery patterns, sample each one. For example: + +- normal content pages +- generated reference pages +- versioned docs +- localized docs + +## 4. 
Run fetch-path checks on each sample + +For each sampled page, verify: + +- HTML fetch status, content type, and final URL +- `Accept: text/markdown` behavior +- direct markdown route behavior such as `.md` or another stable path +- page-level markdown alternate links and whether they actually resolve +- whether page actions such as "Open Markdown" agree with the working route +- whether the HTML title or H1 matches the markdown H1 closely enough for + retrieval parity +- whether main content is present in the initial HTML +- redirect chain length and canonical URL consistency +- obvious chrome/noise in the markdown response + +Do not assume a `.md` mirror exists just because another site uses one. +Verify the actual markdown path the site exposes. + +Treat these as separate signals: + +- negotiated markdown works +- a stable direct markdown URL works +- the page advertises the correct markdown URL + +If the page advertises dead markdown alternates but a working markdown route +exists, do not fail markdown delivery outright. Score it as a discoverability +and consistency problem instead. + +For API or generated reference pages, also verify whether a machine-readable +asset such as OpenAPI YAML is directly linked and fetchable. + +## 5. Judge structure and legibility + +Measure structural signals: + +- exactly one `h1` +- sane heading hierarchy +- `main` and `article` presence where appropriate +- canonical tags +- JSON-LD or breadcrumb structured data +- stable anchors and deep-linkable headings + +Also make a qualitative judgment about agent legibility: + +- markdown strips site chrome cleanly +- headings are specific and task-oriented +- code blocks stay intelligible without client-side JS +- the page is not dominated by banners, injected chat, or nav noise + +Measure code block labeling explicitly when code samples are common. A page +type with many untagged fenced blocks should lose points even if the prose is +otherwise clean. 
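The untagged-fence measurement can be scripted. A minimal sketch, assuming standard triple-backtick fences in the markdown response; in a real audit you would pipe in a fetched markdown body (for example `curl -sSL <markdown-url>`) instead of the inline sample:

````shell
# Count fenced code blocks in a markdown stream and how many carry a
# language tag. This mirrors the kind of evidence the baseline script
# collects, not a definitive implementation.
fence_stats() {
  awk '
    /^```/ {
      line = $0
      sub(/^```[[:space:]]*/, "", line)
      if (!in_block) {
        total++
        if (line != "") tagged++   # any text after the opening fence counts as a tag
        in_block = 1
      } else {
        in_block = 0
      }
    }
    END { printf "fences=%d tagged=%d\n", total, tagged }
  '
}

# Demo on an inline sample with one tagged and one untagged block.
printf '%s\n' '```bash' 'echo hi' '```' '```' 'plain, untagged' '```' | fence_stats
# fences=2 tagged=1
````

A page type where `tagged` trails `fences` by a wide margin is a concrete, citable legibility finding.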
+ +For page types that intentionally render interactive UIs with JavaScript, +judge them separately from normal docs pages. If the HTML shell is thin, +check whether the page still provides: + +- a fetchable markdown summary +- a directly linked machine-readable asset +- a usable non-JS fallback + +## 6. Score with the rubric + +Use [references/rubric.md](references/rubric.md). + +Rules: + +- score only what you verified +- mark non-applicable checks as `N/A` +- normalize the final score against applicable points only +- do not let optional manifest checks dominate the grade + +Apply the foundational caps from the rubric. A site with broken discovery +or broken markdown delivery should not earn a high grade because it has +clean metadata. + +Do not average away a weak page type. If one major page type, such as API +reference, is materially worse than the rest of the corpus, call it out as +the weakest segment and reflect it in the category notes. + +## 7. Compare with external scanners when useful + +If external scanner results are available, compare them to your live +findings. Treat them as secondary evidence. + +If a scanner and the live fetch disagree: + +- trust the live fetch +- report the mismatch explicitly +- explain whether the scanner is testing a different assumption + +## 8. Produce a remediation list + +Turn findings into a short backlog: + +- `P0`: fetchability or discovery blockers +- `P1`: recurring structural or parity issues +- `P2`: polish, optional manifests, or low-impact enhancements + +For each remediation, include: + +- the failing signal +- why it matters to agents +- a concrete fix +- whether it is sitewide or page-type-specific + +## 9. Report in a stable format + +Use [references/report-template.md](references/report-template.md). 
+ +Always include: + +- overall score and grade +- confidence level +- sampled URLs or sample strategy +- category scores +- highest-priority findings +- remediation backlog + +## Notes + +- Favor docs-delivery checks over marketing-site heuristics. +- Do not fail a docs host for lacking MCP or plugin manifests unless the + host itself is meant to expose tools. +- Treat raw byte size as supporting evidence, not as a primary scoring input. +- Prefer short evidence excerpts and commands over long copied page text. diff --git a/.agents/skills/agent-readiness-audit/references/report-template.md b/.agents/skills/agent-readiness-audit/references/report-template.md new file mode 100644 index 000000000000..e1592eb5054b --- /dev/null +++ b/.agents/skills/agent-readiness-audit/references/report-template.md @@ -0,0 +1,63 @@ +# Agent Readiness Report Template + +Use this structure for final audit output. + +```markdown +## Agent Readiness Audit + +**Site:** +**Date:** +**Overall score:** /100 +**Grade:** +**Confidence:** + +### Summary + +<2-4 sentence verdict focused on what an external agent can actually +discover, fetch, and interpret on this site.> + +### Category Scores + +| Category | Score | Notes | +| --- | ---: | --- | +| Discovery and policy | / | | +| Retrieval and markdown delivery | / | | +| Structure and semantics | / | | +| Crawlability and delivery behavior | / | | +| Machine-readable surfaces | / | | +| Content legibility | / | | + +### Sample + +- Sample strategy: +- Sampled pages: +- Page types covered: +- Weakest page type: + +### Findings + +- `P0`: +- `P1`: +- `P2`: + +### Remediation + +- `P0`: , because +- `P1`: , because +- `P2`: , because + +### Evidence + +- Sitewide checks: +- Fetch-path checks: +- Structural checks:

+- Code block checks: +- Scanner comparison: +``` + +## Notes + +- Keep the summary short and outcome-oriented. +- Findings should refer to concrete URLs or page types. +- If a criterion is `N/A`, say why instead of leaving it blank. diff --git a/.agents/skills/agent-readiness-audit/references/rubric.md b/.agents/skills/agent-readiness-audit/references/rubric.md new file mode 100644 index 000000000000..51f089ed6135 --- /dev/null +++ b/.agents/skills/agent-readiness-audit/references/rubric.md @@ -0,0 +1,129 @@ +# Agent Readiness Rubric + +Score the site on a 100-point scale before normalization. If a criterion is +not applicable, remove its points from the denominator instead of treating +it as failed. + +## Grade bands + +- `A`: 90-100 +- `B`: 80-89 +- `C`: 65-79 +- `D`: 50-64 +- `F`: below 50 + +## Confidence levels + +- `High`: sitemap available and at least 12 sampled pages across at least + four page types +- `Medium`: six to 11 sampled pages, or weaker coverage of page types +- `Low`: fewer than six sampled pages, or homepage-biased sampling + +## Foundational caps + +Apply these after computing the raw score: + +- No `sitemap.xml` and no `llms.txt`: maximum grade `C` +- Markdown delivery fails on most sampled pages and no usable alternate + markdown path exists: maximum grade `D` +- Main content is missing from initial HTML on more than 25% of sampled + pages: maximum grade `D` +- `robots.txt` blocks broad crawl access to the docs site and the block is + not clearly intentional: maximum grade `F` + +Optional manifest gaps alone must not drop a docs-only host below `B`. + +## Categories + +### 1. 
Discovery and policy - 15 points + +- `5` `llms.txt` exists, is fetchable, and is useful for agent discovery +- `4` `sitemap.xml` exists and includes the main docs corpus +- `4` `robots.txt` is accessible and does not unintentionally block major + crawl agents or search agents +- `2` curated bulk-discovery aid exists, such as `llms-full.txt` or an + equivalent machine-readable catalog + +When `llms.txt` exists, sample some URLs from it. Stale or misleading +discovery links should reduce this category even if the file itself exists. + +### 2. Retrieval and markdown delivery - 25 points + +- `8` `Accept: text/markdown` works on sampled pages or an equivalent + negotiated markdown response exists +- `5` a stable direct markdown route works on sampled pages +- `5` page-level markdown hints, alternates, or UI actions point to a + working markdown URL +- `4` markdown responses strip navigation chrome and preserve headings, + links, and code blocks cleanly +- `3` HTML and markdown stay in parity across the sampled set + +### 3. Structure and semantics - 20 points + +- `6` sampled pages have one `h1` and a mostly consistent heading hierarchy +- `5` `main` or `article` marks the primary content and the content is + present in the initial HTML +- `4` canonical tags and stable final URLs are correct +- `3` structured data such as breadcrumbs or article metadata exists where + appropriate +- `2` headings expose stable anchors or deep-link targets, and the HTML title + or H1 stays reasonably aligned with the markdown H1 + +### 4. Crawlability and delivery behavior - 15 points + +- `5` crawl directives are sane for a public docs property +- `4` the site does not depend on client-side rendering to expose core + content +- `3` cache and freshness signals are reasonable for bots, such as + `ETag`, `Last-Modified`, or useful cache headers +- `3` redirect chains are short and predictable + +### 5. 
Machine-readable surfaces - 10 points + +- `4` API or reference sections expose OpenAPI, schema, or downloadable + machine-readable assets where relevant +- `3` pages with interactive JavaScript reference UIs still provide a usable + non-JS fallback such as markdown, YAML, or another directly linked asset +- `3` tool manifests such as MCP, plugin, or agent descriptors exist only + when the audited host is actually meant to expose tools + +### 6. Content legibility - 15 points + +- `5` markdown is clean and low-noise rather than a dump of site chrome +- `4` headings and section intros are specific enough for retrieval and + chunking +- `3` fenced code blocks are mostly language-tagged and remain copyable and + interpretable +- `3` repeated banners, chat chrome, consent overlays, or other boilerplate + do not overwhelm the main content + +## Scoring guidance + +Use the full category only when the signal is consistently good across the +sample. Partial credit is expected. + +Examples: + +- A sitewide `llms.txt` that exists but is stale or too shallow may earn + partial credit rather than full credit. +- If markdown works only on some page types, score that criterion based on + observed coverage instead of failing or passing it outright. +- If a working markdown route exists but the page advertises a dead + alternate URL, deduct in markdown discoverability rather than in raw + markdown availability. +- If `llms.txt` exists but points to stale, broken, or inconsistent paths, + deduct in discovery rather than in core fetchability. +- If tool manifests are irrelevant to the host, mark them `N/A`. +- If a major page type is weaker than the rest of the site, note that + explicitly instead of letting stronger page types hide it in the average. + +## Reporting guidance + +For every category, include one line that explains the score: + +- what was tested +- what passed +- what limited the score + +Use evidence from live fetches. 
Do not score from assumptions about the +framework or source repository. diff --git a/.agents/skills/agent-readiness-audit/scripts/baseline-probes.sh b/.agents/skills/agent-readiness-audit/scripts/baseline-probes.sh new file mode 100755 index 000000000000..50875aee068d --- /dev/null +++ b/.agents/skills/agent-readiness-audit/scripts/baseline-probes.sh @@ -0,0 +1,287 @@ +#!/usr/bin/env bash + +set -euo pipefail + +if [[ $# -lt 1 ]]; then + echo "usage: $0 [sample-url ...]" >&2 + exit 1 +fi + +if ! command -v curl >/dev/null 2>&1; then + echo "curl is required" >&2 + exit 1 +fi + +if ! command -v rg >/dev/null 2>&1; then + echo "rg is required" >&2 + exit 1 +fi + +BASE_URL="${1%/}" +shift || true +SAMPLE_SIZE="${SAMPLE_SIZE:-12}" +LLMS_SAMPLE_SIZE="${LLMS_SAMPLE_SIZE:-2}" +CHECK_TOOL_MANIFESTS="${CHECK_TOOL_MANIFESTS:-1}" +TMPDIR="$(mktemp -d)" +trap 'rm -rf "$TMPDIR"' EXIT + +count_matches() { + local pattern="$1" + local file="$2" + rg -o "$pattern" "$file" 2>/dev/null | wc -l | tr -d ' ' || true +} + +header_value() { + local header_file="$1" + local name="$2" + awk -F': ' -v target="$name" ' + tolower($1) == tolower(target) { value = $2 } + END { + gsub(/\r/, "", value) + print value + } + ' "$header_file" +} + +normalize_text() { + printf '%s' "$1" \ + | tr '[:upper:]' '[:lower:]' \ + | sed -E 's/[[:space:]]+/ /g; s/^[[:space:]]+//; s/[[:space:]]+$//; s/ \| docker docs$//' +} + +code_fence_stats() { + local file="$1" + awk ' + BEGIN { in_block = 0; total = 0; tagged = 0 } + /^```/ { + line = $0 + sub(/^```[[:space:]]*/, "", line) + if (!in_block) { + total++ + if (line != "") { + tagged++ + } + in_block = 1 + } else { + in_block = 0 + } + } + END { + printf "%d\t%d\n", total, tagged + } + ' "$file" +} + +resource_probe() { + local url="$1" + local label="$2" + local body="$TMPDIR/resource-body" + local headers="$TMPDIR/resource-headers" + local status + local content_type + local bytes + + status="$(curl -sS -L -o "$body" -D "$headers" -w '%{http_code}' "$url" 
|| true)"
    content_type="$(header_value "$headers" "content-type")"
    bytes="$(wc -c < "$body" | tr -d ' ')"

    printf '%s\t%s\t%s\t%s\t%s\n' "$label" "$url" "$status" "$content_type" "$bytes"
}

page_probe() {
    local url="$1"
    local html="$TMPDIR/page-html"
    local html_headers="$TMPDIR/page-html-headers"
    local md="$TMPDIR/page-md"
    local md_headers="$TMPDIR/page-md-headers"
    local direct_md="$TMPDIR/page-direct-md"
    local direct_md_headers="$TMPDIR/page-direct-md-headers"
    local alt_md="$TMPDIR/page-alt-md"
    local alt_md_headers="$TMPDIR/page-alt-md-headers"
    local status
    local content_type
    local final_url
    local h1_count
    local main_count
    local article_count
    local canonical_count
    local jsonld_count
    local md_alt
    local md_alt_url
    local direct_md_url
    local html_title
    local html_h1
    local md_h1
    local md_status
    local md_content_type
    local md_bytes
    local direct_md_status
    local direct_md_content_type
    local md_alt_status="na"
    local md_alt_content_type="na"
    local title_md_h1_match="no"
    local html_h1_md_h1_match="no"
    local code_blocks_total
    local code_blocks_tagged

    status="$(
        curl -sS -L -o "$html" -D "$html_headers" \
            -w '%{http_code}\t%{url_effective}' "$url" || true
    )"
    content_type="$(header_value "$html_headers" "content-type")"
    final_url="${status#*$'\t'}"
    status="${status%%$'\t'*}"

    h1_count="$(count_matches '<h1[^>]*>' "$html")"
    main_count="$(count_matches '<main[^>]*>' "$html")"
    article_count="$(count_matches '<article[^>]*>' "$html")"
    canonical_count="$(count_matches 'rel="canonical"' "$html")"
    jsonld_count="$(count_matches 'application/ld\+json' "$html")"
    md_alt="$(
        rg -o 'type="text/markdown" href="[^"]+"|href="[^"]+"[^>]*type="text/markdown"' \
            "$html" -m 1 2>/dev/null | sed -E 's/.*href="([^"]+)".*/\1/' || true
    )"

    md_status="$(curl -sS -L -H 'Accept: text/markdown' -o "$md" -D "$md_headers" -w '%{http_code}' "$url" || true)"
    md_content_type="$(header_value "$md_headers" "content-type")"
md_bytes="$(wc -c < "$md" | tr -d ' ')"

    direct_md_url="$(printf '%s' "$final_url" | sed 's#/$##').md"
    direct_md_status="$(curl -sS -L -o "$direct_md" -D "$direct_md_headers" -w '%{http_code}' "$direct_md_url" || true)"
    direct_md_content_type="$(header_value "$direct_md_headers" "content-type")"

    html_title="$(rg -o '<title>[^<]+' "$html" -m 1 2>/dev/null | sed 's/<title>//' || true)"
    html_h1="$(rg -o '<h1[^>]*>[^<]+' "$html" -m 1 2>/dev/null | sed -E 's/<h1[^>]*>//' || true)"
    md_h1="$(awk '/^# / { sub(/^# /, ""); print; exit }' "$md" || true)"

    if [[ -n "$html_title" && -n "$md_h1" ]]; then
        if [[ "$(normalize_text "$html_title")" == "$(normalize_text "$md_h1")" ]]; then
            title_md_h1_match="yes"
        fi
    fi

    if [[ -n "$html_h1" && -n "$md_h1" ]]; then
        if [[ "$(normalize_text "$html_h1")" == "$(normalize_text "$md_h1")" ]]; then
            html_h1_md_h1_match="yes"
        fi
    fi

    IFS=$'\t' read -r code_blocks_total code_blocks_tagged < <(code_fence_stats "$md")

    if [[ -n "$md_alt" ]]; then
        if [[ "$md_alt" =~ ^https?:// ]]; then
            md_alt_url="$md_alt"
        elif [[ "$md_alt" == /* ]]; then
            md_alt_url="${BASE_URL}${md_alt}"
        else
            md_alt_url="${BASE_URL}/${md_alt}"
        fi
        md_alt_status="$(curl -sS -L -o "$alt_md" -D "$alt_md_headers" -w '%{http_code}' "$md_alt_url" || true)"
        md_alt_content_type="$(header_value "$alt_md_headers" "content-type")"
    else
        md_alt_url="na"
    fi

    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
        "$url" \
        "$status" \
        "$content_type" \
        "$final_url" \
        "$h1_count" \
        "$main_count" \
        "$article_count" \
        "$canonical_count" \
        "$jsonld_count" \
        "$md_status" \
        "$md_content_type" \
        "$md_bytes" \
        "$direct_md_url" \
        "$direct_md_status" \
        "$direct_md_content_type" \
        "$md_alt_url" \
        "$md_alt_status" \
        "$md_alt_content_type" \
        "$title_md_h1_match" \
        "$html_h1_md_h1_match" \
        "$code_blocks_total" \
        "$code_blocks_tagged"
}

llms_urls() {
    local \
llms="$TMPDIR/llms-sample.txt"
    local llms_status

    llms_status="$(curl -sS -L -o "$llms" -w '%{http_code}' "$BASE_URL/llms.txt" || true)"
    if [[ "$llms_status" == "200" ]]; then
        rg -o '\(https?://[^)]+\)' "$llms" 2>/dev/null \
            | tr -d '()' \
            | rg "^${BASE_URL//./\\.}" \
            | rg -v '/404\.html$|/search/?$|\.xml$|\.txt$' \
            | awk -v limit="$LLMS_SAMPLE_SIZE" '
                !seen[$0]++ {
                    print
                    count++
                    if (count >= limit) {
                        exit
                    }
                }
            '
    fi
}

sitemap_urls() {
    local sitemap="$TMPDIR/sitemap.xml"
    local sitemap_status

    sitemap_status="$(curl -sS -L -o "$sitemap" -w '%{http_code}' "$BASE_URL/sitemap.xml" || true)"
    if [[ "$sitemap_status" == "200" ]]; then
        rg -o '<loc>[^<]+' "$sitemap" \
            | sed 's/<loc>//' \
            | rg "^${BASE_URL//./\\.}" \
            | rg -v '/404\.html$|/search/?$|\.xml$|\.txt$' \
            | awk '!seen[$0]++ { print }'
    fi
}

sample_urls() {
    if [[ $# -gt 0 ]]; then
        printf '%s\n' "$@"
        return
    fi

    local sample_file="$TMPDIR/sampled-urls.txt"

    {
        llms_urls
        sitemap_urls
    } | awk -v limit="$SAMPLE_SIZE" '
        !seen[$0]++ {
            print
            count++
            if (count >= limit) {
                exit
            }
        }
    ' > "$sample_file"

    if [[ ! \
-s "$sample_file" ]]; then + printf '%s/\n' "$BASE_URL" + else + cat "$sample_file" + fi +} + +printf 'META\tbase-url\t%s\n' "$BASE_URL" +if [[ $# -gt 0 ]]; then + printf 'META\tsample-source\texplicit\n' +else + printf 'META\tsample-source\tllms-and-sitemap-or-homepage\n' +fi +printf '\nSITEWIDE\n' +printf 'label\turl\tstatus\tcontent-type\tbytes\n' +resource_probe "$BASE_URL/llms.txt" "llms.txt" +resource_probe "$BASE_URL/llms-full.txt" "llms-full.txt" +resource_probe "$BASE_URL/robots.txt" "robots.txt" +resource_probe "$BASE_URL/sitemap.xml" "sitemap.xml" +if [[ "$CHECK_TOOL_MANIFESTS" == "1" ]]; then + resource_probe "$BASE_URL/.well-known/ai-plugin.json" "ai-plugin.json" + resource_probe "$BASE_URL/.well-known/agent.json" "agent.json" + resource_probe "$BASE_URL/.well-known/agents.json" "agents.json" +fi + +printf '\nPAGES\n' +printf 'url\tstatus\tcontent-type\tfinal-url\th1\tmain\tarticle\tcanonical\tjsonld\tmd-negotiate-status\tmd-negotiate-content-type\tmd-bytes\tmd-direct-url\tmd-direct-status\tmd-direct-content-type\tmd-alt-url\tmd-alt-status\tmd-alt-content-type\ttitle-md-h1-match\th1-md-h1-match\tcode-blocks-total\tcode-blocks-tagged\n' +while IFS= read -r page_url; do + [[ -z "$page_url" ]] && continue + page_probe "$page_url" +done < <(sample_urls "$@") diff --git a/hugo.yaml b/hugo.yaml index 271d714bbf0f..61c2580ec76d 100644 --- a/hugo.yaml +++ b/hugo.yaml @@ -102,6 +102,12 @@ outputFormats: mediaType: "text/plain" notAlternative: true permalinkable: false + llmsfull: + baseName: llms-full + isPlainText: true + mediaType: "text/plain" + notAlternative: true + permalinkable: false # Enable custom output formats # (only generate the custom output files once) @@ -112,6 +118,7 @@ outputs: - metadata - robots - llms + - llmsfull page: - html - markdown diff --git a/layouts/_partials/footer.html b/layouts/_partials/footer.html index bdefcb71bcba..13ab4f250987 100644 --- a/layouts/_partials/footer.html +++ b/layouts/_partials/footer.html @@ -49,6 +49,11 @@ 
class="inline-flex truncate whitespace-normal" >llms.txt</a > + <a + href="{{ "llms-full.txt" | relURL }}" + class="inline-flex truncate whitespace-normal" + >llms-full.txt</a + > </div> <div class="flex items-center justify-end"> <button @@ -85,7 +90,6 @@ class="border-t border-gray-200 bg-gray-100 px-4 py-4 text-sm text-gray-400 md:border-none dark:border-gray-700 dark:bg-gray-900 dark:text-gray-600" > <span - >Copyright © 2013-{{ time.Now.Year }} Docker Inc. All rights - reserved.</span + >Copyright © 2013-{{ time.Now.Year }} Docker Inc. All rights reserved.</span > </div> diff --git a/layouts/_partials/head.html b/layouts/_partials/head.html index e57796eeaebf..113151d7a6c8 100644 --- a/layouts/_partials/head.html +++ b/layouts/_partials/head.html @@ -2,10 +2,22 @@ <meta name="viewport" content="width=device-width, initial-scale=1" /> {{ partial "meta.html" . }} {{- range .AlternativeOutputFormats -}} - <link rel="{{ .Rel }}" type="{{ .MediaType.Type }}" href="{{ .Permalink }}" /> + {{- if eq .Name "markdown" -}} + <link + rel="{{ .Rel }}" + type="{{ .MediaType.Type }}" + href="{{ partial "utils/markdown-url.html" $ }}" + /> + {{- else -}} + <link + rel="{{ .Rel }}" + type="{{ .MediaType.Type }}" + href="{{ .Permalink }}" + /> + {{- end -}} {{ end -}} {{ partialCached "utils/css.html" "-" "-" }} -<link href="/pagefind/pagefind-component-ui.css" rel="stylesheet"> +<link href="/pagefind/pagefind-component-ui.css" rel="stylesheet" /> {{- if hugo.IsProduction -}} <script src="https://cdn.cookielaw.org/scripttemplates/otSDKStub.js" diff --git a/layouts/_partials/meta.html b/layouts/_partials/meta.html index e5233f447692..5497bd220bf6 100644 --- a/layouts/_partials/meta.html +++ b/layouts/_partials/meta.html @@ -3,7 +3,7 @@ <title>{{ site.Title }} {{ else }} - {{ printf "%s | %s" .LinkTitle site.Title }} + {{ printf "%s | %s" .Title site.Title }} {{ end }} {{ if or (eq .Params.sitemap false) (not hugo.IsProduction) }} @@ -20,11 +20,11 @@ - + - + - + {{ $schema | 
jsonify | safeJS }} - {{- /* Add BreadcrumbList schema */ -}} {{- $breadcrumbs := slice -}} {{- $position := 1 -}} @@ -94,8 +94,8 @@ "@type" "ListItem" "position" $position "item" (dict - "@id" .Permalink - "name" (truncate 110 .LinkTitle) + "@id" .Permalink + "name" (truncate 110 .LinkTitle) ) -}} {{- $breadcrumbs = $breadcrumbs | append $item -}} @@ -108,8 +108,8 @@ "@type" "ListItem" "position" $position "item" (dict - "@id" .Permalink - "name" (truncate 110 .LinkTitle) + "@id" .Permalink + "name" (truncate 110 .Title) ) -}} {{- $breadcrumbs = $breadcrumbs | append $currentItem -}} diff --git a/layouts/_partials/utils/markdown-url.html b/layouts/_partials/utils/markdown-url.html new file mode 100644 index 000000000000..c1c7fadd1b3f --- /dev/null +++ b/layouts/_partials/utils/markdown-url.html @@ -0,0 +1,2 @@ +{{- $path := strings.TrimSuffix "/" .RelPermalink -}} +{{- printf "%s.md" $path | absURL -}} diff --git a/layouts/api.html b/layouts/api.html index f3b815332857..6ba73632bbf6 100644 --- a/layouts/api.html +++ b/layouts/api.html @@ -1,76 +1,174 @@ - + - - {{ $specURL := urls.Parse (printf "/%s%s.yaml" .File.Dir .File.ContentBaseName) }} - {{ .Title }} - - - - - - - - - - + + + + + + + + + + + - - - - - - - {{ if or (strings.HasPrefix .RelPermalink "/reference/api/hub/") (strings.HasPrefix .RelPermalink "/reference/api/registry/") }} - - {{ else }} - - {{ end }} - - + + {{ partial "schema.html" . }} + + + + +
+    [elided: page shell markup rendering {{ .Title }} and {{ .Description }}]
+    {{ if or (strings.HasPrefix .RelPermalink "/reference/api/hub/") (strings.HasPrefix .RelPermalink "/reference/api/registry/") }}
+    [elided: Hub/Registry API spec renderer]
+    {{ else }}
+    [elided: default API spec renderer]
+    {{ end }}
+ + diff --git a/layouts/home.llms.txt b/layouts/home.llms.txt index 6648734b6e4c..2652c6241818 100644 --- a/layouts/home.llms.txt +++ b/layouts/home.llms.txt @@ -5,6 +5,7 @@ # Docker Documentation > MCP endpoint for structured agent access: https://mcp-docs.docker.com/mcp +> Bulk text corpus for offline indexing: https://docs.docker.com/llms-full.txt {{- range $grouped }} ## {{ humanize .Key }} diff --git a/layouts/home.llmsfull.txt b/layouts/home.llmsfull.txt new file mode 100644 index 000000000000..eaa17a21b423 --- /dev/null +++ b/layouts/home.llmsfull.txt @@ -0,0 +1,16 @@ +{{- $pages := where site.RegularPages "Params.sitemap" "!=" false -}} +{{- $sorted := sort $pages "RelPermalink" -}} + +# Docker Documentation full text + +> Source index: https://docs.docker.com/llms.txt +> This file contains page metadata and stable markdown URLs for bulk ingestion. +{{- range $sorted }} + +## {{ .Title }} +URL: {{ .Permalink }} +Markdown: {{ partial "utils/markdown-url.html" . }} +{{- with .Description }} +Description: {{ chomp (replace . "\n" " ") }} +{{- end }} +{{- end }}