Add llms.txt to texera.apache.org #4932

@aglinxinyuan

Feature Summary

llms.txt is an emerging convention for publishing a machine-readable index of a site's documentation for LLM agents to consume. It is conceptually similar to a sitemap, but aimed at agents rather than search crawlers.

Texera's documentation is hosted on texera.apache.org. Without an llms.txt, agents have to scan source code or the rendered site to answer questions about Texera. Publishing an llms.txt would give them a structured directory of the site's content, which would:

  • Let agents answer "does Texera support X?" by reading the docs index instead of scanning source code.
  • Serve as the foundation for a Texera agent knowledge base (similar to what Spark has done).
  • Provide a low-cost, structured entry point that we can grow as more content lands on apache.org.
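To make the "reading the docs index" point concrete, here is a minimal sketch of how an agent-side tool might group the links in an llms.txt by section. It assumes the layout described by the llms.txt proposal (an H1 title, a blockquote summary, and H2 sections containing markdown link lists). The function name `parse_llms_txt`, the sample text, and the `example.invalid` URLs are all hypothetical placeholders, not real Texera pages.

```python
import re
from collections import defaultdict

# Matches a markdown list item of the form "- [Title](url): optional description".
LINK_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Group the links in an llms.txt body by their H2 section heading."""
    sections = defaultdict(list)
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    return dict(sections)


# Hypothetical sample; the URLs are placeholders, not real Texera pages.
SAMPLE = """\
# Texera

> Placeholder summary.

## Docs

- [Getting Started](https://example.invalid/docs/getting-started): Placeholder description
- [Operators](https://example.invalid/docs/operators)
"""

index = parse_llms_txt(SAMPLE)
```

With an index like this, an agent can pick the relevant section and fetch only the linked pages it needs, rather than crawling the whole site.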

Example from ng-zorro: https://ng.ant.design/llms.txt

Proposed Solution or Design

  1. Add llms.txt to apache/incubator-texera-site so it is served at texera.apache.org/llms.txt.
  2. Add an AGENTS.md to that repo noting that llms.txt should be kept in sync when site content changes (updates are expected to be infrequent).
  3. Gradually expand coverage as more content moves onto apache.org.
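As an illustration of step 1, the file could be written by hand or rendered from a small page list so that it is easy to keep in sync. The sketch below takes the second approach and assumes the layout described by the llms.txt proposal (H1 title, blockquote summary, H2 sections of markdown links). The helper `render_llms_txt`, the summary text, the section name, and the `example.invalid` URLs are hypothetical; the real entries would come from the pages actually published on texera.apache.org.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body from a title, a summary, and {heading: [(name, url, desc)]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, desc in links:
            entry = f"- [{name}]({url})"
            if desc:  # the description after the colon is optional
                entry += f": {desc}"
            lines.append(entry)
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content; replace with the site's real pages.
llms_txt = render_llms_txt(
    "Texera",
    "Placeholder summary of the project.",
    {
        "Docs": [
            ("Getting Started", "https://example.invalid/docs/getting-started", "Placeholder description"),
            ("Operators", "https://example.invalid/docs/operators", ""),
        ]
    },
)
print(llms_txt)
```

Generating the file from a single list keeps the sync step mentioned in step 2 to a one-line change per page, although a hand-maintained file would work just as well at the current size of the site.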

Impact / Priority

(P3) Low – nice to have, low maintenance cost.

Affected Area

Other (apache/incubator-texera-site)
