feat(seo): GEO foundations — llms.txt, raw .md, AI bot allowlist, JSON-LD fixes #118

Merged: ftvision merged 1 commit into master from feat/seo-geo on May 14, 2026

@ftvision (Owner)

Summary

Make essays first-class citizens for AI answer engines (Claude, ChatGPT, Perplexity, Gemini) and IDE agents (Cursor, Continue).

  • JSON-LD bug fixes: BlogPosting.dateModified and BlogPosting.image were silently dropped (dateModified because EssayMeta had no updated field, image because essayPostingSchema never passed it through). Adds updated?: string to EssayMeta/PeriodicMeta and both frontmatter validators, then wires both fields into the schema, so sitemap lastmod and JSON-LD dateModified now agree.
  • Explicit AI bot allow-list in robots.ts: 15 named crawlers across OpenAI, Anthropic, Perplexity, Google, Apple, Common Crawl, ByteDance, Meta, Cohere, DuckDuckGo. A blanket * covers them in theory, but several default to "no matching rule = stay out". A comment in the file documents how to flip the training bots to disallow while keeping search/answer bots allowed.
  • Favicon: app/icon.svg (file convention — Next emits the link tags automatically). Black square, serif "A".
  • llms.txt: app/llms.txt/route.ts serves a build-time-generated llms.txt listing every published essay (en + zh), periodic, and series. Each essay links to its raw.md URL.
  • Raw markdown route: /essays/<slug>/raw.md (and /zh/essays/<slug>/raw.md) serves clean markdown: title, author, dates, canonical URL header, then the raw MDX body. LLM crawlers preferentially ingest markdown over rendered HTML (cheaper tokens, fewer parsing errors). Announced from the HTML via <link rel="alternate" type="text/markdown">.
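
For reviewers, the JSON-LD fix in the first bullet amounts to roughly the following. This is a minimal sketch, not the code in this PR: only `essayPostingSchema`, `EssayMeta`, and `updated?: string` are named above. The other field names, the `SITE_URL` constant, and the fallback of `dateModified` to the publish date are assumptions made for illustration.

```typescript
// Hypothetical shape: only `updated?: string` is confirmed by this PR;
// the remaining field names are assumptions.
interface EssayMeta {
  title: string;
  slug: string;
  date: string;     // ISO publish date
  updated?: string; // ISO last-modified date (the field this PR adds)
  image?: string;   // absolute cover-image URL (assumed field name)
}

// Assumed constant; the real code may derive the origin differently.
const SITE_URL = "https://www.feitong.phd";

function essayPostingSchema(meta: EssayMeta): Record<string, unknown> {
  const schema: Record<string, unknown> = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: meta.title,
    url: `${SITE_URL}/essays/${meta.slug}`,
    datePublished: meta.date,
    // Assumption: fall back to the publish date when no `updated` is declared.
    dateModified: meta.updated ?? meta.date,
  };
  // Only emit `image` when the frontmatter provides one.
  if (meta.image !== undefined) {
    schema["image"] = meta.image;
  }
  return schema;
}
```

Feeding the same `updated ?? date` value to the sitemap would be one way to keep `lastmod` and `dateModified` in agreement.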

Test plan

  • pnpm --filter @blog/blog exec tsc --noEmit passes
  • Rich Results Test on https://www.feitong.phd/essays/decision-game: 2 valid items (BlogPosting + BreadcrumbList)
  • After merge + deploy, re-run Rich Results Test and confirm dateModified and image now appear in BlogPosting
  • After merge + deploy: curl https://www.feitong.phd/llms.txt returns essay listing with correct URLs
  • After merge + deploy: curl -i https://www.feitong.phd/essays/decision-game/raw.md returns 200 + markdown body (-i prints headers and body; -I sends a HEAD request and returns headers only)
  • After merge + deploy: curl https://www.feitong.phd/robots.txt shows explicit AI bot entries
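
The robots.txt check could also be scripted rather than eyeballed. A minimal sketch, assuming the deployed file has one `User-agent:` line per named crawler; the bot names are the 15 listed in this PR's commit message, while `AI_BOTS` and `missingBots` are hypothetical names that do not exist in the repo.

```typescript
// The 15 user-agent names listed in this PR's commit message.
const AI_BOTS: readonly string[] = [
  "GPTBot", "OAI-SearchBot", "ChatGPT-User",
  "ClaudeBot", "Claude-User", "Claude-SearchBot",
  "PerplexityBot", "Perplexity-User",
  "Google-Extended", "Applebot-Extended",
  "CCBot", "Bytespider", "Meta-ExternalAgent",
  "cohere-ai", "DuckAssistBot",
];

// Returns the bots that have no `User-agent:` line of their own.
// Matching is case-insensitive on both the directive and the bot name.
function missingBots(
  robotsTxt: string,
  bots: readonly string[] = AI_BOTS,
): string[] {
  const named = new Set(
    robotsTxt
      .split(/\r?\n/)
      .map((line) => /^\s*user-agent\s*:\s*(.+?)\s*$/i.exec(line))
      .filter((m): m is RegExpExecArray => m !== null)
      .map((m) => (m[1] ?? "").toLowerCase()),
  );
  return bots.filter((bot) => !named.has(bot.toLowerCase()));
}
```

On Node 18+, where `fetch` is global, the deployed file can be checked with `missingBots(await (await fetch("https://www.feitong.phd/robots.txt")).text())`; an empty array means every named crawler has an explicit entry.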

Follow-ups (out of scope)

  • Add a summary frontmatter field + render as a quick-answer block above the fold (3-sentence self-contained answer in the first ~150 tokens)
  • FAQPage schema for Q&A-style essays (offer-negotiation, reverse-interview-zh, meeting-how-to-zh)
  • Set up the monthly prompt-tracking baseline from plan/docs/seo/geo-measure.md
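
For the FAQPage follow-up, the target JSON-LD shape is small. A sketch only: the `FaqItem` type and `faqPageSchema` function are hypothetical names, and how question/answer pairs would be pulled out of an essay is left open. The output follows the schema.org FAQPage structure (`mainEntity` holding `Question` items, each with an `acceptedAnswer`).

```typescript
// Hypothetical input type: one question/answer pair from a Q&A-style essay.
interface FaqItem {
  question: string;
  answer: string;
}

// Builds a schema.org FAQPage object ready for JSON.stringify.
function faqPageSchema(items: readonly FaqItem[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: items.map((item) => ({
      "@type": "Question",
      name: item.question,
      acceptedAnswer: {
        "@type": "Answer",
        text: item.answer,
      },
    })),
  };
}
```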

🤖 Generated with Claude Code

…N-LD fixes

Make essays first-class citizens for AI answer engines (Claude, ChatGPT,
Perplexity, Gemini) and IDE agents.

- BlogPosting JSON-LD now emits `dateModified` and `image`. Previously both
  were silently dropped: `dateModified` because `EssayMeta` had no `updated`
  field, `image` because `essayPostingSchema` didn't pass it through. Adds
  `updated?: string` to EssayMeta/PeriodicMeta and both frontmatter
  validators so essays can declare a last-modified date.
- robots.ts names 15 AI/answer-engine crawlers explicitly (GPTBot,
  OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot,
  PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended,
  CCBot, Bytespider, Meta-ExternalAgent, cohere-ai, DuckAssistBot). A
  blanket `*` allow covers them in theory, but some default to "no
  matching rule = stay out".
- app/icon.svg adds a favicon (file-convention; Next emits the link tags).
- app/llms.txt/route.ts serves a build-time-generated llms.txt listing all
  published essays (en + zh), periodics, and series. Each essay links to
  its raw .md URL.
- app/(en|zh)/essays/[slug]/raw.md/route.ts serves clean markdown per
  essay at /essays/<slug>/raw.md. LLM crawlers preferentially ingest
  markdown over rendered HTML (cheaper tokens, fewer parsing errors).
  Both essay pages now announce this via `<link rel="alternate"
  type="text/markdown">`.

Verified with Google's Rich Results Test: BlogPosting + BreadcrumbList
detected and valid on essay pages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ftvision ftvision merged commit 44c5957 into master May 14, 2026