Background
Comparison of cli.internetcomputer.org against skills.internetcomputer.org (which recently received SEO improvements, tracked in dfinity/developer-docs#104) revealed several gaps. This issue tracks everything needed to close them.
Versioned deployment: what this means for SEO
The docs site uses a versioned folder structure on the IC asset canister (/0.1/, /0.2/, /main/). Root-level files (index.html, matomo.js, versions.json) are regenerated by CI on every main push via the publish-root-files job, which already reads LATEST_VERSION from versions.json.
This shapes the implementation strategy:
- Root-level changes (dynamically generated by publish-root-files): robots.txt, root sitemap, OG image file. These never require rebuilding old version folders.
- Build-time changes (in astro.config.mjs): meta tags, JSON-LD, RSS link. These apply to all future version builds automatically. The current 0.2/ folder needs a one-time rebuild (push to docs/v0.2 branch) to pick them up, after which no old-version rebuilds are ever needed again.
- Old versions and the development build should be blocked from crawling via robots.txt (/0.1/, /main/), so their missing in-HTML improvements are SEO-irrelevant anyway.
Implement now
1. robots.txt (missing entirely)
No /robots.txt exists at the root. Add dynamic generation to the publish-root-files CI job (.github/workflows/docs.yml) alongside the existing index.html generation:
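User-agent: *
Allow: /<latest-version>/
Disallow: /main/
Disallow: /0.1/ # (list all older versions except latest)
Sitemap: https://cli.internetcomputer.org/<latest-version>/sitemap-index.xml
# LLM crawlers (a crawler that matches a named group ignores the * group, so the Disallow rules are repeated here)
User-agent: GPTBot
Allow: /<latest-version>/
Disallow: /main/
Disallow: /0.1/
User-agent: ClaudeBot
Allow: /<latest-version>/
Disallow: /main/
Disallow: /0.1/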
LATEST_VERSION is already computed in the CI step — reuse it. Disallow lines for old versions should be generated from versions.json. /main/ is always disallowed (development branch, not authoritative).
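The generation step could look roughly like the sketch below. It is illustrative only: the function name `buildRobotsTxt` is hypothetical, and it assumes versions.json can be reduced to a plain array of version strings, which may not match the real file. It repeats the Disallow rules in the named crawler groups, because a crawler that matches a named group ignores the `*` group.

```javascript
// Sketch: build robots.txt from the list of published versions.
// Assumption: `versions` is an array of version strings such as ["0.1", "0.2"];
// the actual shape of versions.json may differ.
function buildRobotsTxt(versions, latestVersion, origin) {
  // /main/ is always blocked, plus every version other than the latest.
  const blocked = [
    'main',
    ...versions.filter((v) => v !== latestVersion && v !== 'main'),
  ];
  const rules = [
    `Allow: /${latestVersion}/`,
    ...blocked.map((v) => `Disallow: /${v}/`),
  ];
  // Each crawler obeys only the most specific group that matches it,
  // so the named LLM crawlers receive the same rules as the wildcard group.
  const groups = ['*', 'GPTBot', 'ClaudeBot'].map((agent) =>
    [`User-agent: ${agent}`, ...rules].join('\n'),
  );
  const sitemap = `Sitemap: ${origin}/${latestVersion}/sitemap-index.xml`;
  return [...groups, sitemap].join('\n\n') + '\n';
}

// Example with two published versions, 0.2 being the latest.
const robotsTxt = buildRobotsTxt(
  ['0.1', '0.2'],
  '0.2',
  'https://cli.internetcomputer.org',
);
console.log(robotsTxt);
```

In CI, the returned string would be written next to the generated index.html in the root folder.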
2. <meta name="robots" content="index, follow, max-image-preview:large">
Starlight doesn't add this. max-image-preview:large tells Google it can show large image previews in search results, a genuine improvement. index, follow is already the default behavior of search-engine crawlers (it is not a browser setting), but stating it makes the intent explicit.
Add globally via Starlight head config in docs-site/astro.config.mjs:
{ tag: 'meta', attrs: { name: 'robots', content: 'index, follow, max-image-preview:large' } },
3. <meta name="author" content="DFINITY Foundation">
Standard HTML meta tag. Add globally via Starlight head config. Semantically correct for a DFINITY-owned docs site.
4. JSON-LD structured data
Starlight injects no structured data. Add WebSite + Organization schemas to the site (via Starlight head config or a custom Head component). Organization with DFINITY Foundation as publisher covers the publisher signal without needing a non-standard <meta name="publisher"> tag.
Example for the home page:
{
"@context": "https://schema.org",
"@type": "WebSite",
"name": "ICP CLI",
"url": "https://cli.internetcomputer.org",
"description": "Command-line tool for developing and deploying applications on the Internet Computer Protocol (ICP)",
"inLanguage": "en-US",
"publisher": {
"@type": "Organization",
"name": "DFINITY Foundation",
"url": "https://dfinity.org"
}
}
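One way to inject this through the Starlight head config is a script entry whose content is the serialized object. The sketch below assumes the head-config route rather than a custom Head component; the variable names are illustrative.

```javascript
// Sketch: a Starlight `head` entry that carries the WebSite JSON-LD.
const websiteJsonLd = {
  '@context': 'https://schema.org',
  '@type': 'WebSite',
  name: 'ICP CLI',
  url: 'https://cli.internetcomputer.org',
  description:
    'Command-line tool for developing and deploying applications on the Internet Computer Protocol (ICP)',
  inLanguage: 'en-US',
  publisher: {
    '@type': 'Organization',
    name: 'DFINITY Foundation',
    url: 'https://dfinity.org',
  },
};

// `<` is escaped so that no string value can close the <script> element early;
// JSON parsers decode \u003c back to `<`.
const jsonLdHeadEntry = {
  tag: 'script',
  attrs: { type: 'application/ld+json' },
  content: JSON.stringify(websiteJsonLd).replace(/</g, '\\u003c'),
};

console.log(jsonLdHeadEntry.content);
```

The entry would sit in the same head array as the meta tags from items 2 and 3.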
5. og:image / twitter:image
Starlight sets twitter:card: summary_large_image but without an actual image — this is actively misleading and renders as a blank preview when shared on LinkedIn, Slack, and Twitter/X.
- Create docs-site/public/og-image.svg (a simple branded SVG; match the style of skills.internetcomputer.org/og-image.svg)
- Copy og-image.svg to the root folder in publish-root-files (so it's accessible at /og-image.svg, version-independent)
- Reference with an absolute URL in Starlight head config:
{ tag: 'meta', attrs: { property: 'og:image', content: 'https://cli.internetcomputer.org/og-image.svg' } },
{ tag: 'meta', attrs: { name: 'twitter:image', content: 'https://cli.internetcomputer.org/og-image.svg' } },
- Note: Twitter/X does not support SVG for OG images. Before production launch, convert to PNG (statically or via a build-time step using @resvg/resvg-js or similar).
6. og:locale and og:type fixes
Starlight currently outputs og:locale: en (should be en_US) and og:type: article on the home page (should be website for index/landing pages — article is correct for content pages).
These are Starlight defaults. Override the home page (src/content/docs/index.mdx or equivalent) with custom frontmatter or a custom Head component to set og:type: website on the landing page only.
For og:locale, override globally via Starlight head config if Starlight exposes this, or via a custom component.
7. llms-full.txt
A bulk concatenation of all pages as defined by the llmstxt.org spec — distinct from the existing llms.txt:
llms.txt + individual .md endpoints → interactive AI agents (selective fetching)
llms-full.txt → scrapers, RAG pipelines, and fine-tuning datasets (bulk ingestion)
Add to docs-site/plugins/astro-agent-docs.mjs alongside the existing llms.txt generation. Also expose at root via publish-root-files (same pattern as llms.txt).
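The concatenation itself is simple; a minimal sketch follows. The page shape (`title`, `url`, `markdown`), the function name, and the separator between pages are assumptions made for illustration; the existing plugin's page model may look different.

```javascript
// Sketch: concatenate every page's Markdown into one llms-full.txt body.
// Assumption: each page is { title, url, markdown }.
function buildLlmsFull(siteTitle, pages) {
  const sections = pages.map(
    (page) =>
      `# ${page.title}\n\nSource: ${page.url}\n\n${page.markdown.trim()}`,
  );
  // A horizontal rule between sections keeps page boundaries visible
  // to whatever ingests the file.
  return [`# ${siteTitle}`, ...sections].join('\n\n---\n\n') + '\n';
}

const llmsFull = buildLlmsFull('ICP CLI', [
  { title: 'Install', url: '/0.2/install/', markdown: 'Install steps.\n' },
  { title: 'Deploy', url: '/0.2/deploy/', markdown: 'Deploy steps.\n' },
]);
console.log(llmsFull);
```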
8. RSS feed
Useful for developers tracking documentation changes. Add feed.xml generation to plugins/astro-agent-docs.mjs (or a dedicated plugin). Reference from Starlight head config:
{ tag: 'link', attrs: { rel: 'alternate', type: 'application/rss+xml', href: '/feed.xml', title: 'ICP CLI documentation updates' } },
Copy to root in publish-root-files (same pattern as llms.txt).
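For reference, a dependency-free RSS 2.0 generator fits in a few lines. This is a sketch under assumptions: the item shape (`title`, `link`, `date`) and both function names are hypothetical, and the real plugin would supply items from its own page data.

```javascript
// Escape the five XML special characters in text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Sketch: build an RSS 2.0 document.
// Assumption: each item is { title, link, date }, with `date` parseable by Date.
function buildRssFeed(channel, items) {
  const itemXml = items.map((item) =>
    [
      '    <item>',
      `      <title>${escapeXml(item.title)}</title>`,
      `      <link>${escapeXml(item.link)}</link>`,
      `      <guid>${escapeXml(item.link)}</guid>`,
      `      <pubDate>${new Date(item.date).toUTCString()}</pubDate>`,
      '    </item>',
    ].join('\n'),
  );
  return (
    [
      '<?xml version="1.0" encoding="UTF-8"?>',
      '<rss version="2.0">',
      '  <channel>',
      `    <title>${escapeXml(channel.title)}</title>`,
      `    <link>${escapeXml(channel.link)}</link>`,
      `    <description>${escapeXml(channel.description)}</description>`,
      ...itemXml,
      '  </channel>',
      '</rss>',
    ].join('\n') + '\n'
  );
}

const feedXml = buildRssFeed(
  {
    title: 'ICP CLI documentation updates',
    link: 'https://cli.internetcomputer.org',
    description: 'Changes to the ICP CLI documentation',
  },
  [
    {
      title: 'Install & setup',
      link: 'https://cli.internetcomputer.org/0.2/install/',
      date: '2024-01-01T00:00:00Z',
    },
  ],
);
console.log(feedXml);
```

As with lastmod in item 9, the item dates are only useful if they come from real change history rather than the build time.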
9. Sitemap lastmod timestamps
Starlight generates sitemaps without lastmod. Add accurate timestamps using git commit history at build time — not the build date, which would mark every page as changed on every build and actively harm crawl budget.
Explicitly configure @astrojs/sitemap in astro.config.mjs with a serialize callback that runs git log -1 --format=%cI -- <source-file> per page.
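A sketch of such a callback is shown below. To keep it self-contained, the two lookups are injected: `sourceFileFor` (maps a page URL to its source file) and `lastCommitDate` (returns the ISO date of the file's last commit) are hypothetical helpers, and `makeSerialize` is an illustrative name. In astro.config.mjs, `lastCommitDate` would wrap the git log command above, for example through Node's child_process module.

```javascript
// Sketch: build a `serialize` callback for @astrojs/sitemap that adds `lastmod`.
// Assumptions: `sourceFileFor(url)` returns a source path or undefined, and
// `lastCommitDate(file)` returns an ISO 8601 string or undefined.
function makeSerialize(sourceFileFor, lastCommitDate) {
  const cache = new Map(); // avoid running git more than once per file
  return function serialize(item) {
    const file = sourceFileFor(item.url);
    if (!file) return item; // no known source file: leave the entry unchanged
    if (!cache.has(file)) cache.set(file, lastCommitDate(file));
    const lastmod = cache.get(file);
    return lastmod ? { ...item, lastmod } : item;
  };
}

// Example with stubbed lookups instead of a real git call.
const gitCalls = [];
const exampleSerialize = makeSerialize(
  (url) => (url.endsWith('/install/') ? 'docs/install.md' : undefined),
  (file) => {
    gitCalls.push(file);
    return '2024-01-01T00:00:00+00:00';
  },
);
console.log(exampleSerialize({ url: 'https://cli.internetcomputer.org/0.2/install/' }));
```

One caveat worth checking: if CI uses a shallow checkout, every file may report the same commit date, so the job likely needs full history for these timestamps to be accurate.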
10. Root-level sitemap
Currently there is no /sitemap.xml at the root domain — each version has its own at /<version>/sitemap-index.xml. Add a root-level sitemap.xml to publish-root-files that references only the latest version's sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://cli.internetcomputer.org/<latest-version>/sitemap-index.xml</loc>
</sitemap>
</sitemapindex>
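Reference this from robots.txt (Sitemap: https://cli.internetcomputer.org/sitemap.xml); once the root sitemap exists, this line replaces the per-version Sitemap line in the item 1 example.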
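Generating the file in publish-root-files only needs the latest version string. A minimal sketch, with `buildRootSitemap` as a hypothetical helper name:

```javascript
// Sketch: root-level sitemap index that points only at the latest version.
function buildRootSitemap(origin, latestVersion) {
  return (
    [
      '<?xml version="1.0" encoding="UTF-8"?>',
      '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
      '  <sitemap>',
      `    <loc>${origin}/${latestVersion}/sitemap-index.xml</loc>`,
      '  </sitemap>',
      '</sitemapindex>',
    ].join('\n') + '\n'
  );
}

const rootSitemap = buildRootSitemap('https://cli.internetcomputer.org', '0.2');
console.log(rootSitemap);
```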
One-time rebuild of current version (0.2)
After the build-time changes above land on main, push to the docs/v0.2 branch to trigger a rebuild of the 0.2/ folder. This is the only old-version rebuild ever needed — all future versions pick up the improvements automatically.
Skipped (not actual improvements)
- Font preconnect: fonts are served via @fontsource npm packages (local), so there is no external CDN to preconnect to
- Standalone <meta name="publisher">: not a standard HTML meta tag; the publisher signal is covered by JSON-LD Organization
- index, follow only: already the crawler default; only worth emitting alongside max-image-preview:large (covered in item 2)
- Making the root / indexable: the root meta name="robots" content="noindex" on the redirect page is intentional and correct; the redirect target (/0.2/) is what should be indexed, not the redirect shell itself