Metadata Files
You will learn about the public metadata files the daemon serves: robots.txt, sitemaps, and llms.txt.
GET /robots.txt
Returns the crawler policy file. It points crawlers to the sitemap and discourages access to live query endpoints.
Disallowed paths:
- /api/v1/search
- /api/v1/query
- /api/v1/compose
- /api/v1/client-ip
- per-feed search endpoints
The robots.txt file is an advisory crawler hint, not a security control. It does not list admin paths or private local paths.
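A well-behaved crawler can evaluate this policy before requesting a URL. The following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body and the `example.org` host are assumptions made for illustration; the file the daemon actually serves may be worded differently and also covers the per-feed search endpoints, which are omitted here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body modelled on the disallowed paths listed above.
# The real file served by the daemon may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /api/v1/search
Disallow: /api/v1/query
Disallow: /api/v1/compose
Disallow: /api/v1/client-ip
Sitemap: https://example.org/sitemap.xml
"""


def load_policy(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


policy = load_policy(SAMPLE_ROBOTS_TXT)
base = "https://example.org"  # placeholder host

for path in ("/countries", "/api/v1/search", "/api/v1/client-ip"):
    allowed = policy.can_fetch("*", base + path)
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")

# Sitemap URLs advertised by the policy (Python 3.8+).
print(policy.site_maps())
```

Because the file is advisory, this check only governs what a cooperative client chooses to request; it does not restrict access on the server side.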
GET /sitemap.xml
Returns a Sitemaps.org sitemap index. This index links to individual sitemap shard files.
Each shard uses absolute URLs and the Sitemaps.org XML namespace. Shards stay below 45,000 URLs to remain well under the 50,000-URL sitemap protocol limit.
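The sharding rule can be illustrated with a short sketch. This is not the daemon's implementation: the function names, the `example.org` host, the `/feeds/` path, and the `sitemap-feeds-N.xml` file names are hypothetical, and the 45,000 figure is treated here as an inclusive cap per shard.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SHARD = 45_000  # cap from the text; the protocol limit is 50,000


def split_into_shards(urls: Sequence[str],
                      limit: int = MAX_URLS_PER_SHARD) -> List[List[str]]:
    """Split a URL list into consecutive chunks of at most `limit` entries."""
    return [list(urls[i:i + limit]) for i in range(0, len(urls), limit)]


def build_index(shard_urls: Sequence[str]) -> str:
    """Build a sitemap index document that links to each shard URL."""
    ET.register_namespace("", SITEMAP_NS)  # emit the namespace as the default xmlns
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in shard_urls:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: 100,000 absolute page URLs.
urls = [f"https://example.org/feeds/{i}" for i in range(100_000)]
shards = split_into_shards(urls)

# Hypothetical shard file names that match the /sitemap-*.xml pattern.
shard_urls = [f"https://example.org/sitemap-feeds-{n}.xml"
              for n in range(1, len(shards) + 1)]
index_xml = build_index(shard_urls)

print([len(s) for s in shards])
```

Keeping the cap below the protocol limit leaves headroom, so a shard does not need to be re-split the moment a category grows slightly.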
GET /sitemap-*.xml
Returns an individual sitemap shard file. Each shard covers one category of pages:
- feed detail pages — one URL per public feed
- country detail pages — one URL per country in the public index
- ASN detail pages — one URL per ASN in the public index
- maintainer detail pages — one URL per public maintainer
- index pages — homepage, countries, ASNs, maintainers, methodology
Sitemaps do not include admin routes, API routes, raw file downloads, or private runtime details.
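A client or a test can read a shard and confirm these properties. The sketch below parses an assumed shard body with the standard library and flags any entry that is not an absolute URL or that points under `/api/`. The sample XML, the `example.org` host, and the function names are hypothetical; admin routes are not checked because their paths are not documented here.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical shard body for the index-pages category.
SAMPLE_SHARD = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/countries</loc></url>
  <url><loc>https://example.org/asns</loc></url>
  <url><loc>https://example.org/methodology</loc></url>
</urlset>
"""


def shard_locations(xml_text: str) -> List[str]:
    """Return every <loc> value from a sitemap shard."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", NS)
            if loc.text]


def find_unexpected(locations: List[str]) -> List[str]:
    """Return entries that are not absolute URLs or that point at API routes."""
    unexpected = []
    for loc in locations:
        parts = urlparse(loc)
        is_absolute = bool(parts.scheme and parts.netloc)
        if not is_absolute or parts.path.startswith("/api/"):
            unexpected.append(loc)
    return unexpected


locations = shard_locations(SAMPLE_SHARD)
print(locations)
print(find_unexpected(locations))
```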
GET /llms.txt
Returns a concise Markdown file for AI agents and automated tools. It links to public pages, methodology pages, public API indexes, and the feed catalog.
The file follows the emerging llms.txt convention for curated AI-readable site context. It does not expose admin routes, authenticated operations, local filesystem paths, or private runtime details.
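An automated tool can turn such a file into a simple lookup table. The sketch below assumes the layout used in this section's example content: `## ` headings for sections, and bullets made of a path, an em dash surrounded by spaces, and a description. The parser, its function name, and the abbreviated sample are illustrative only; the served file may use a different bullet format.

```python
from typing import Dict, List, Optional, Tuple

SEP = " \u2014 "  # em dash with spaces, the separator assumed for each bullet

# Abbreviated sample built from the example content in this section.
SAMPLE_LLMS_TXT = "\n".join([
    "# update-ipsets",
    "Public cybercrime IP feed observatory.",
    "## Pages",
    "- /" + SEP + "homepage with IP lookup and feed explorer",
    "- /countries" + SEP + "country index",
    "## API",
    "- /api/v1/sets" + SEP + "feed catalog",
])


def parse_llms_txt(text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group bullets under their `## ` heading as (path, description) pairs."""
    sections: Dict[str, List[Tuple[str, str]]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            path, _, description = line[2:].partition(SEP)
            sections[current].append((path.strip(), description.strip()))
    return sections


sections = parse_llms_txt(SAMPLE_LLMS_TXT)
for name, entries in sections.items():
    print(name, entries)
```

Lines before the first `## ` heading, such as the title and the one-line summary, are ignored by this sketch.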
Example content structure:
# update-ipsets
Public cybercrime IP feed observatory.
## Pages
- / — homepage with IP lookup and feed explorer
- /countries — country index
- /asns — ASN index
- /maintainers — maintainer index
- /methodology — methodology index
## API
- /api/v1/sets — feed catalog
- /api/v1/search — IP lookup
- /api/v1/countries — country index
- /api/v1/asns — ASN index