feat(seo): GEO foundations — llms.txt, raw .md, AI bot allowlist, JSON-LD fixes#118
Merged
…N-LD fixes

Make essays first-class citizens for AI answer engines (Claude, ChatGPT, Perplexity, Gemini) and IDE agents.

- BlogPosting JSON-LD now emits `dateModified` and `image`. Previously both were silently dropped: `dateModified` because `EssayMeta` had no `updated` field, `image` because `essayPostingSchema` didn't pass it through. Adds `updated?: string` to EssayMeta/PeriodicMeta and both frontmatter validators so essays can declare a last-modified date.
- robots.ts names 15 AI/answer-engine crawlers explicitly (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended, CCBot, Bytespider, Meta-ExternalAgent, cohere-ai, DuckAssistBot). A blanket `*` allow covers them in theory, but some default to "no matching rule = stay out".
- app/icon.svg adds a favicon (file-convention; Next emits the link tags).
- app/llms.txt/route.ts serves a build-time-generated llms.txt listing all published essays (en + zh), periodics, and series. Each essay links to its raw .md URL.
- app/(en|zh)/essays/[slug]/raw.md/route.ts serves clean markdown per essay at /essays/<slug>/raw.md. LLM crawlers preferentially ingest markdown over rendered HTML (cheaper tokens, fewer parsing errors). Both essay pages now announce this via `<link rel="alternate" type="text/markdown">`.

Verified with Google's Rich Results Test: BlogPosting + BreadcrumbList detected and valid on essay pages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
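The explicit-allowlist idea in the robots.ts item can be sketched roughly as below. This is not the code from the PR: `RobotsRule`, `buildRules`, and `renderRobotsTxt` are hypothetical names, the rule type is a local stand-in rather than Next's own metadata type, and the optional `blocked` set is only an assumed way of switching individual bots to disallow. The crawler names are the ones listed in the commit message.

```typescript
// Hypothetical local type for one robots rule (a stand-in, not Next's type).
type RobotsRule = { userAgent: string; allow?: string; disallow?: string };

// The 15 crawler names listed in the commit message.
const AI_BOTS: readonly string[] = [
  "GPTBot", "OAI-SearchBot", "ChatGPT-User",
  "ClaudeBot", "Claude-User", "Claude-SearchBot",
  "PerplexityBot", "Perplexity-User",
  "Google-Extended", "Applebot-Extended",
  "CCBot", "Bytespider", "Meta-ExternalAgent",
  "cohere-ai", "DuckAssistBot",
];

// Build one explicit rule per named bot, after the blanket `*` allow.
// Bots in `blocked` (an assumed knob for this sketch) get `Disallow: /`.
function buildRules(
  bots: readonly string[],
  blocked: ReadonlySet<string> = new Set<string>(),
): RobotsRule[] {
  const named = bots.map((userAgent): RobotsRule =>
    blocked.has(userAgent)
      ? { userAgent, disallow: "/" }
      : { userAgent, allow: "/" },
  );
  return [{ userAgent: "*", allow: "/" }, ...named];
}

// Render the rules as plain robots.txt text, one group per user agent.
function renderRobotsTxt(rules: readonly RobotsRule[]): string {
  const groups = rules.map((rule) => {
    const lines = [`User-agent: ${rule.userAgent}`];
    if (rule.allow !== undefined) lines.push(`Allow: ${rule.allow}`);
    if (rule.disallow !== undefined) lines.push(`Disallow: ${rule.disallow}`);
    return lines.join("\n");
  });
  return groups.join("\n\n") + "\n";
}

// Usage: every named bot allowed, matching the default described above.
const robotsTxt: string = renderRobotsTxt(buildRules(AI_BOTS));
```

Naming each crawler gives it a matching group of its own, so a bot that treats "no matching rule" as "stay out" still sees an explicit allow.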
Summary
Make essays first-class citizens for AI answer engines (Claude, ChatGPT, Perplexity, Gemini) and IDE agents (Cursor, Continue).
- `BlogPosting.dateModified` and `BlogPosting.image` were silently dropped (`essayPostingSchema` never piped them through, and `EssayMeta` had no `updated` field). Adds `updated?: string` to `EssayMeta`/`PeriodicMeta` and both frontmatter validators, then wires both fields into the schema. Now sitemap `lastmod` and JSON-LD `dateModified` agree.
- `robots.ts`: 15 named crawlers across OpenAI, Anthropic, Perplexity, Google, Apple, Common Crawl, ByteDance, Meta, Cohere, DuckDuckGo. A blanket `*` covers them in theory, but several default to "no matching rule = stay out". A comment in the file documents how to flip the training bots to disallow while keeping search/answer bots allowed.
- `app/icon.svg` adds a favicon (file convention; Next emits the link tags automatically). Black square, serif "A".
- `llms.txt`: `app/llms.txt/route.ts` serves a build-time-generated `llms.txt` listing every published essay (en + zh), periodic, and series. Each essay links to its `raw.md` URL.
- `/essays/<slug>/raw.md` (and `/zh/essays/<slug>/raw.md`) serves clean markdown: title, author, dates, canonical URL header, then the raw MDX body. LLM crawlers preferentially ingest markdown over rendered HTML (cheaper tokens, fewer parsing errors). Announced from the HTML via `<link rel="alternate" type="text/markdown">`.

Test plan
- `pnpm --filter @blog/blog exec tsc --noEmit` passes
- Google's Rich Results Test on `https://www.feitong.phd/essays/decision-game`: 2 valid items (`BlogPosting` + `BreadcrumbList`)
- `dateModified` and `image` now appear in `BlogPosting`
- `curl https://www.feitong.phd/llms.txt` returns essay listing with correct URLs
- `curl -I https://www.feitong.phd/essays/decision-game/raw.md` returns 200; the same URL without `-I` returns the markdown body
- `curl https://www.feitong.phd/robots.txt` shows explicit AI bot entries

Follow-ups (out of scope)
- `summary` frontmatter field + render as a quick-answer block above the fold (3-sentence self-contained answer in the first ~150 tokens)
- `FAQPage` schema for Q&A-style essays (`offer-negotiation`, `reverse-interview-zh`, `meeting-how-to-zh`)
- plan/docs/seo/geo-measure.md

🤖 Generated with Claude Code
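
The `dateModified`/`image` fix described in the summary can be illustrated with a minimal sketch. The real `EssayMeta` and `essayPostingSchema` are not shown in this PR description, so everything here is an assumption except the optional `updated` field: the `title`, `date`, and `image` field names, the `sketchPostingSchema` and `lastModified` helpers, and the fallback to the publication date are all hypothetical.

```typescript
// Hypothetical, trimmed-down stand-in for the repo's EssayMeta type.
interface EssayMetaSketch {
  title: string;
  date: string;      // publication date, ISO 8601 (assumed field name)
  updated?: string;  // optional last-modified date, as added by this PR
  image?: string;    // absolute cover image URL (assumed field name)
}

// One source of truth for "last modified", so a sitemap `lastmod` and the
// JSON-LD `dateModified` cannot drift apart. Falling back to the publication
// date is an assumption of this sketch.
function lastModified(meta: EssayMetaSketch): string {
  return meta.updated ?? meta.date;
}

// Build a BlogPosting JSON-LD object that passes both fields through
// instead of silently dropping them.
function sketchPostingSchema(
  meta: EssayMetaSketch,
  url: string,
): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: meta.title,
    url,
    datePublished: meta.date,
    dateModified: lastModified(meta),
    // Only emit `image` when the essay declares one.
    ...(meta.image ? { image: meta.image } : {}),
  };
}

// Usage with made-up values; the result would be serialized into a
// <script type="application/ld+json"> tag on the essay page.
const exampleJsonLd: string = JSON.stringify(
  sketchPostingSchema(
    { title: "Example essay", date: "2025-01-01", updated: "2025-02-01" },
    "https://example.com/essays/example",
  ),
);
```

Routing both the sitemap and the schema through one helper is one way to get the agreement between `lastmod` and `dateModified` that the summary mentions.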