RFC: AdCP Registry API for cached adagents.json lookup #201

@bokelley

AdCP Registry API Specification

Overview

A centralized registry service that:

  1. Crawls and caches adagents.json files from publisher domains
  2. Provides fast lookups without hitting publisher servers
  3. Tracks verification status across the ecosystem
  4. Discovers new publishers via registry submissions

Use Cases

Sales Agent (us)

  • Fast property discovery: Query registry instead of crawling each publisher
  • Bulk verification: Check authorization for 100s of publishers instantly
  • Publisher discovery: Find publishers accepting new sales agents
  • Monitoring: Track when publishers update their adagents.json

Publisher

  • Submit to registry: Register your domain for discovery
  • See who's authorized: List all sales agents you've authorized
  • Track usage: See which agents are querying your properties

Ecosystem

  • Directory: Searchable list of all publishers and sales agents
  • Analytics: Ecosystem health, adoption metrics
  • Compliance: Verify AdCP spec compliance across ecosystem

API Endpoints

1. Fetch Cached adagents.json

Purpose: Get cached adagents.json without hitting publisher server.

GET /v1/adagents/{publisher_domain}

Response:

{
  "domain": "nytimes.com",
  "adagents_url": "https://nytimes.com/.well-known/adagents.json",
  "data": { /* full adagents.json */ },
  "cached_at": "2025-11-16T20:00:00Z",
  "ttl_seconds": 3600,
  "status": "valid",
  "spec_version": "2.0.0"
}

Cache Headers:

Cache-Control: public, max-age=3600
ETag: "abc123"
Last-Modified: Sun, 16 Nov 2025 20:00:00 GMT

Status Codes:

  • 200 - Cached data available
  • 202 - Accepted, crawl in progress (use Retry-After header)
  • 404 - Publisher not found in registry
  • 410 - Publisher removed adagents.json (Gone)
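
A client could map these status codes to actions roughly as follows. This is a minimal sketch, not part of the proposed library: `interpret_lookup` and the action names are hypothetical, `Retry-After` is assumed to carry a plain number of seconds, and the 5-second default when it is missing is an assumption.

```python
from typing import Mapping, Optional, Tuple


def interpret_lookup(status: int, headers: Mapping[str, str]) -> Tuple[str, Optional[int]]:
    """Map a GET /v1/adagents/{publisher_domain} status to (action, retry_after_seconds)."""
    if status == 200:
        return ("use_cached", None)
    if status == 202:
        # Crawl in progress: honour Retry-After when it is a plain number of seconds.
        raw = headers.get("Retry-After", "")
        delay = int(raw) if raw.isdigit() else 5  # 5s default is an assumption
        return ("retry_later", delay)
    if status == 404:
        return ("not_in_registry", None)
    if status == 410:
        return ("adagents_removed", None)
    return ("unexpected", None)
```

Keeping the mapping separate from the HTTP call makes the 202 retry path easy to test without a live registry.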

2. Verify Agent Authorization

Purpose: Check if a sales agent is authorized by a publisher.

POST /v1/verify
Content-Type: application/json

{
  "publisher_domain": "nytimes.com",
  "agent_url": "https://sales-agent.example.com",
  "property_identifiers": [
    {"type": "domain", "value": "nytimes.com"}
  ]
}

Response:

{
  "authorized": true,
  "properties": [
    {
      "property_id": "site:nytimes.com",
      "name": "The New York Times",
      "tags": ["news", "premium"],
      "authorized_for_agent": true
    }
  ],
  "verified_at": "2025-11-16T20:00:00Z",
  "cache_status": "hit"
}

3. Batch Verification

Purpose: Check authorization for multiple publishers at once.

POST /v1/verify/batch
Content-Type: application/json

{
  "agent_url": "https://sales-agent.example.com",
  "publisher_domains": ["nytimes.com", "washingtonpost.com", "espn.com"]
}

Response:

{
  "agent_url": "https://sales-agent.example.com",
  "results": {
    "nytimes.com": {
      "authorized": true,
      "property_count": 12,
      "tag_count": 5
    },
    "washingtonpost.com": {
      "authorized": false,
      "reason": "agent_not_listed"
    },
    "espn.com": {
      "authorized": true,
      "property_count": 8,
      "tag_count": 3
    }
  },
  "verified_at": "2025-11-16T20:00:00Z"
}
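
To illustrate how a caller might consume the batch response, the sketch below splits `results` into authorized domains and rejected domains with their reasons. The helper name `split_batch_results` is hypothetical, and the `"unknown"` fallback for a missing `reason` is an assumption.

```python
from typing import Any, Dict, List, Tuple


def split_batch_results(response: Dict[str, Any]) -> Tuple[List[str], Dict[str, str]]:
    """Return (authorized domains, {unauthorized domain: reason})."""
    authorized: List[str] = []
    rejected: Dict[str, str] = {}
    for domain, result in response.get("results", {}).items():
        if result.get("authorized"):
            authorized.append(domain)
        else:
            # "reason" appears only on unauthorized entries in the example response.
            rejected[domain] = result.get("reason", "unknown")
    return authorized, rejected


sample = {
    "results": {
        "nytimes.com": {"authorized": True, "property_count": 12, "tag_count": 5},
        "washingtonpost.com": {"authorized": False, "reason": "agent_not_listed"},
        "espn.com": {"authorized": True, "property_count": 8, "tag_count": 3},
    }
}
authorized, rejected = split_batch_results(sample)
print(authorized)  # ['nytimes.com', 'espn.com']
print(rejected)    # {'washingtonpost.com': 'agent_not_listed'}
```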

4. Search Publishers

Purpose: Discover publishers accepting new sales agents.

GET /v1/publishers/search?accepting_agents=true&property_type=website&tags=news

Response:

{
  "total": 145,
  "publishers": [
    {
      "domain": "nytimes.com",
      "name": "The New York Times",
      "property_types": ["website"],
      "tags": ["news", "premium"],
      "authorized_agents_count": 23,
      "accepting_new_agents": true,
      "last_updated": "2025-11-16T20:00:00Z"
    }
  ],
  "next_page": "/v1/publishers/search?page=2&..."
}
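
Paging through search results by following `next_page` could look like the sketch below. It assumes the last page omits `next_page` (or sets it to null), which the spec does not state. `iter_publishers` and the injected `get_json` callable are hypothetical, and `max_pages` is a safety cap added for illustration.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_publishers(
    first_path: str,
    get_json: Callable[[str], Dict[str, Any]],
    max_pages: int = 50,
) -> Iterator[Dict[str, Any]]:
    """Yield publisher entries page by page, following next_page until it is absent."""
    path: Optional[str] = first_path
    pages = 0
    while path and pages < max_pages:
        page = get_json(path)  # caller supplies the HTTP GET + JSON decode
        yield from page.get("publishers", [])
        path = page.get("next_page")
        pages += 1
```

Injecting `get_json` keeps the paging logic independent of any particular HTTP client.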

5. Submit Publisher

Purpose: Register a publisher domain for crawling.

POST /v1/publishers
Content-Type: application/json

{
  "domain": "example-publisher.com",
  "contact_email": "adtech@example.com",
  "notify_on_crawl": true
}

Response:

{
  "domain": "example-publisher.com",
  "status": "pending",
  "crawl_scheduled": "2025-11-16T20:05:00Z",
  "submission_id": "sub_abc123"
}

6. Registry Stats

Purpose: Ecosystem health metrics.

GET /v1/stats

Response:

{
  "publishers": {
    "total": 1234,
    "with_valid_adagents": 890,
    "accepting_agents": 567
  },
  "sales_agents": {
    "total": 89,
    "active_last_30d": 67
  },
  "properties": {
    "total": 12543,
    "by_type": {
      "website": 8901,
      "mobile_app": 2345,
      "ctv_app": 1297
    }
  },
  "last_updated": "2025-11-16T20:00:00Z"
}

Caching Strategy

Registry-Side

  • Crawl frequency: Every 24 hours (configurable per publisher)
  • On-demand crawl: On a cache miss or when ?force_refresh=true is passed
  • Rate limiting: Max 10 req/sec per publisher domain
  • Storage: Redis for hot cache, S3 for cold storage

Client-Side (adcp library)

from adcp import fetch_adagents

# Option 1: Registry first, falling back to the publisher
data = await fetch_adagents(
    "nytimes.com",
    use_registry=True,
    registry_url="https://registry.adcontextprotocol.org",
    fallback_to_direct=True  # Try publisher if registry fails
)

# Option 2: Registry-only (faster, but requires registry)
data = await fetch_adagents(
    "nytimes.com",
    registry_only=True,
    registry_url="https://registry.adcontextprotocol.org"
)
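
The registry-first behaviour with an optional direct fallback could be structured as in the sketch below. It is not the `adcp` library's implementation: `fetch_with_fallback` and the two injected fetchers are hypothetical, and catching every `Exception` is a simplification that a real client would likely narrow to network and HTTP errors.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Fetcher = Callable[[str], Awaitable[Dict[str, Any]]]


async def fetch_with_fallback(
    domain: str,
    from_registry: Fetcher,
    from_publisher: Fetcher,
    fallback_to_direct: bool = True,
) -> Dict[str, Any]:
    """Try the registry first; on failure optionally fetch from the publisher."""
    try:
        return await from_registry(domain)
    except Exception:
        if not fallback_to_direct:
            raise  # registry-only mode: surface the registry failure
        return await from_publisher(domain)
```

With `fallback_to_direct=False` this behaves like the registry-only option: a registry outage is reported to the caller instead of being hidden by a slower direct fetch.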

Cache Headers

Cache-Control: public, max-age=3600, stale-while-revalidate=86400
Vary: Accept-Encoding
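
As an illustration of how a client-side cache might act on these directives, the sketch below classifies a cached entry by its age. The function names and the three state labels are hypothetical; the comparison follows the usual reading of `max-age` (fresh while the age is below it) and `stale-while-revalidate` (serve stale for that many further seconds while refreshing in the background).

```python
import re
from typing import Dict


def parse_cache_control(value: str) -> Dict[str, int]:
    """Extract integer directives such as max-age and stale-while-revalidate."""
    return {name.lower(): int(num) for name, num in re.findall(r"([\w-]+)=(\d+)", value)}


def cache_state(age_seconds: int, cache_control: str) -> str:
    """Classify a cached entry as 'fresh', 'stale_revalidate', or 'expired'."""
    directives = parse_cache_control(cache_control)
    max_age = directives.get("max-age", 0)
    swr = directives.get("stale-while-revalidate", 0)
    if age_seconds < max_age:
        return "fresh"
    if age_seconds < max_age + swr:
        return "stale_revalidate"  # serve the cached copy, refresh in the background
    return "expired"
```

With the header above, an entry is fresh for one hour and may be served stale for a further 24 hours while a refresh is attempted.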

Authentication

Public Endpoints (no auth)

  • GET /v1/adagents/{domain} - Read-only registry access
  • GET /v1/stats - Public metrics
  • GET /v1/publishers/search - Public directory

Authenticated Endpoints (API key)

  • POST /v1/verify/batch - Batch operations
  • POST /v1/publishers - Submit publisher
  • DELETE /v1/publishers/{domain} - Remove publisher (owner only)

Auth Header:

Authorization: Bearer registry_sk_abc123...

Rate Limiting

Public API

  • 100 requests/minute per IP
  • 1000 requests/hour per IP

Authenticated API

  • 1000 requests/minute per API key
  • 50,000 requests/hour per API key

Rate Limit Headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 456
X-RateLimit-Reset: 1700164800
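
A client could use these headers to decide how long to pause before its next request. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but the spec does not state; `seconds_until_allowed` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_allowed(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, or 0.0 if quota remains."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```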

Error Responses

Standard Error Format

{
  "error": {
    "code": "publisher_not_found",
    "message": "Publisher domain not found in registry",
    "suggestion": "Submit this publisher via POST /v1/publishers",
    "docs": "https://docs.adcontextprotocol.org/registry#errors"
  }
}

Error Codes

  • publisher_not_found - Domain not in registry
  • invalid_adagents - Publisher's adagents.json is malformed
  • crawl_failed - Could not fetch from publisher
  • rate_limit_exceeded - Too many requests
  • unauthorized - Invalid API key
  • publisher_unavailable - Domain returns 404/410
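
On the client side, the standard error format could be surfaced as an exception so callers can branch on `code`. This is a sketch only: `RegistryError` and `error_from_body` are hypothetical names, and the `"unknown"` default for a missing code is an assumption.

```python
from typing import Any, Dict, Optional


class RegistryError(Exception):
    """Hypothetical exception carrying the fields of the standard error format."""

    def __init__(self, code: str, message: str, suggestion: Optional[str] = None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.suggestion = suggestion


def error_from_body(body: Dict[str, Any]) -> RegistryError:
    """Build a RegistryError from a parsed error response body."""
    err = body.get("error", {})
    return RegistryError(
        code=err.get("code", "unknown"),
        message=err.get("message", ""),
        suggestion=err.get("suggestion"),
    )
```

A caller could then, for example, catch `RegistryError` and submit the domain via POST /v1/publishers when `code == "publisher_not_found"`.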

Monitoring & Observability

Health Check

GET /health

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "cache": {
    "hit_rate": 0.94,
    "total_entries": 1234
  },
  "crawler": {
    "pending": 12,
    "failed_last_hour": 3
  }
}

Metrics (Prometheus)

# Cache performance
registry_cache_hits_total
registry_cache_misses_total
registry_cache_entries

# Crawler stats
registry_crawls_total{status="success|failed"}
registry_crawl_duration_seconds

# API usage
registry_requests_total{endpoint="/v1/verify",status="200"}
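
The `hit_rate` reported by the health check can presumably be derived from the two cache counters. A minimal sketch of that arithmetic, assuming hit rate means hits divided by total lookups (the spec does not define it):

```python
def cache_hit_rate(hits_total: int, misses_total: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there has been no traffic."""
    lookups = hits_total + misses_total
    return hits_total / lookups if lookups else 0.0
```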

Implementation Priorities

Phase 1: Core Registry (MVP)

  • ✅ Crawl and cache adagents.json files
  • ✅ GET /v1/adagents/{domain} endpoint
  • ✅ Basic verification endpoint
  • ✅ 24-hour cache TTL
  • ✅ Health check

Phase 2: Discovery

  • ✅ Search publishers endpoint
  • ✅ Publisher submission
  • ✅ Stats endpoint
  • ✅ Public directory UI

Phase 3: Optimization

  • ✅ Batch verification
  • ✅ Stale-while-revalidate caching
  • ✅ CDN integration
  • ✅ Analytics dashboard

Phase 4: Advanced

  • ✅ Webhook notifications (publisher updated adagents.json)
  • ✅ Change history/audit log
  • ✅ Compliance checking
  • ✅ Property recommendation engine

Library Integration

Current (direct fetch)

from adcp import fetch_adagents

# Every call hits publisher server
data = await fetch_adagents("nytimes.com")  # 500ms

With Registry

from adcp import fetch_adagents

# Fast registry lookup, fallback to direct
data = await fetch_adagents(
    "nytimes.com",
    use_registry=True,
    registry_url="https://registry.adcontextprotocol.org"  # 50ms
)

Batch Verification (new function)

from adcp import verify_batch_authorization

results = await verify_batch_authorization(
    agent_url="https://sales-agent.example.com",
    publisher_domains=["nytimes.com", "espn.com", ...],  # 100 domains
    registry_url="https://registry.adcontextprotocol.org"
)
# Single request, 200ms total vs 50 seconds direct

Open Questions

  1. Registry Authority: Who operates the official registry?

    • AdCP project directly?
    • Third-party service with governance?
    • Multiple registries with replication?
  2. Publisher Control: How do publishers claim/verify their domain?

    • DNS TXT record verification?
    • Email verification to domain owner?
    • Self-serve portal?
  3. Costs: Who pays for registry infrastructure?

    • Free public service (non-profit model)?
    • Paid tiers for high-volume users?
    • Publisher sponsorships?
  4. Data Privacy: What gets cached?

    • Full adagents.json (public data anyway)?
    • Aggregated stats only?
    • Opt-out mechanism for publishers?
  5. Staleness: How to handle stale cache?

    • Push webhooks when publisher updates?
    • ETags/If-Modified-Since?
    • Force refresh API?
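
If the ETag/If-Modified-Since option under question 5 were chosen, a client or the crawler could revalidate a cached copy with a conditional request and treat a 304 Not Modified response as "still current". The sketch below only builds the request headers; `conditional_headers` is a hypothetical helper, and whether the registry would answer conditional requests is one of the open points above.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build revalidation headers from validators saved with a cached copy."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers
```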

Benefits

For Sales Agents

  • 10x faster property discovery (50ms vs 500ms per publisher)
  • 📊 Bulk operations - verify 100s of publishers in one request
  • 🔍 Discovery - find new publisher partnerships
  • 📈 Monitoring - track when publishers update authorization

For Publishers

  • 📢 Visibility - be discovered by sales agents
  • 🔐 Control - see who's accessing your properties
  • Compliance - validate your adagents.json format
  • 📊 Analytics - understand ecosystem adoption

For Ecosystem

  • 🌐 Public directory of all AdCP participants
  • 📈 Health metrics - track protocol adoption
  • 🚀 Faster adoption - reduce technical barriers
  • 🔧 Better tooling - enable new developer tools

Security Considerations

  1. DDoS Protection: Rate limiting + CDN
  2. Cache Poisoning: Verify signatures on adagents.json
  3. Publisher Verification: DNS TXT records for domain ownership
  4. Data Integrity: Checksums and ETags
  5. Privacy: No PII, only public adagents.json data
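
For item 4, one way to derive a checksum-based ETag is to hash a canonical serialization of the cached document, so that key order and whitespace do not affect the value. This is a sketch under assumptions: the spec does not say how ETags are computed, and `content_etag`, the SHA-256 choice, and the canonical form are illustrative.

```python
import hashlib
import json
from typing import Any


def content_etag(adagents: Any) -> str:
    """Return a quoted ETag derived from a canonical JSON serialization."""
    # sort_keys + compact separators give the same bytes for equivalent documents.
    canonical = json.dumps(adagents, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(canonical).hexdigest() + '"'
```

Two crawls that return semantically identical JSON then yield the same ETag, while any change to the document yields a new one.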

Example: Sales Agent Startup Flow

Without Registry (current):

# Sales agent starts up
publishers = ["nytimes.com", "espn.com", ...]  # 100 publishers
for domain in publishers:
    data = await fetch_adagents(domain)  # 500ms each
    verify_authorization(data, our_url)
# Total: 50 seconds

With Registry:

# Sales agent starts up
publishers = ["nytimes.com", "espn.com", ...]  # 100 publishers
results = await verify_batch_authorization(
    our_url,
    publishers,
    registry_url="https://registry.adcontextprotocol.org"
)
# Total: 200ms (250x faster!)

Next Steps

  1. Spec Review: Share with AdCP community for feedback
  2. Prototype: Build MVP with Phase 1 features
  3. Library Updates: Add registry support to adcp Python library
  4. Beta Testing: Test with 5-10 publishers and sales agents
  5. Production: Launch public registry service
  6. Governance: Establish registry governance model
