Add Rate Limiting and Quota Management to Adapter Schema #60

@mickdarling

Summary

Define rate limiting and quota management as part of the adapter schema, so that systems can monitor API usage, enforce limits, and notify users before a limit is reached.

Motivation

When adapters wrap external APIs:

  • APIs have rate limits that must be respected
  • Users may want to set cost/usage budgets
  • Systems need to prevent runaway API calls (especially in agentic loops)
  • Graceful degradation is better than hard failures

Rate and quota information should be:

  1. Extracted during adapter generation (from API specs/docs)
  2. Configurable by users
  3. Enforced programmatically by the adapter
  4. Surfaced via introspection

Proposed Schema Addition

# Adapter front matter
---
name: example-api
type: adapter
version: "1.0.0"
rate_limits:
  # API-defined limits (from interrogation)
  api_limits:
    - scope: global
      limit: 5000
      window: hour
    - scope: endpoint
      endpoint: search
      limit: 30
      window: minute
  
  # User-configurable quotas
  quotas:
    enabled: true
    limits:
      - metric: calls_per_hour
        warn: 4000      # notification threshold
        pause: 4800     # pause and request confirmation
        hard_stop: 5000 # absolute stop
      - metric: calls_per_day
        warn: 10000
        pause: 15000
      - metric: cost_usd_per_day
        warn: 5.00
        pause: 10.00
        hard_stop: 50.00
      - metric: tokens_per_hour  # for LLM-based APIs
        warn: 100000
        pause: 150000
    
    notifications:
      - trigger: warn
        action: log  # log | notify | callback | block
        message: "Approaching rate limit for {api_name}: {metric} at {current}/{limit}"
      - trigger: pause
        action: notify
        require_confirmation: true
        message: "Rate limit pause for {api_name}. Continue? {current}/{limit}"
      - trigger: hard_stop
        action: block
        message: "Hard stop reached for {api_name}: {metric}"
    
    tracking:
      persist: true  # persist usage across sessions
      reset_schedule: "0 0 * * *"  # cron for resetting daily counters
---
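
The warn → pause → hard_stop thresholds above could be evaluated along these lines. This is a minimal sketch, not part of the proposal: the function names `quota_status` and `render_message` are hypothetical, and it assumes a threshold is "reached" when usage is greater than or equal to it.

```python
# Hypothetical helpers; names and the >= comparison are assumptions.

# Checked from most to least severe so the highest reached level wins.
LEVELS = ("hard_stop", "pause", "warn")


def quota_status(current, thresholds):
    """Return the most severe threshold that `current` has reached, else "ok".

    Thresholds missing from the config (e.g. no hard_stop on calls_per_day)
    are simply skipped.
    """
    for level in LEVELS:
        limit = thresholds.get(level)
        if limit is not None and current >= limit:
            return level
    return "ok"


def render_message(template, **fields):
    """Fill the {api_name}, {metric}, {current}, {limit} placeholders."""
    return template.format(**fields)


# Values copied from the schema example above.
calls_per_hour = {"warn": 4000, "pause": 4800, "hard_stop": 5000}
calls_per_day = {"warn": 10000, "pause": 15000}

print(quota_status(3500, calls_per_hour))   # ok
print(quota_status(4100, calls_per_hour))   # warn
print(quota_status(20000, calls_per_day))   # pause (no hard_stop configured)
print(render_message(
    "Approaching rate limit for {api_name}: {metric} at {current}/{limit}",
    api_name="example-api", metric="calls_per_hour", current=4100, limit=5000,
))
```

Checking the levels from most to least severe keeps the function correct even when a metric omits one of the thresholds.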

Introspection

Quota status should be queryable:

// introspect operation
{
  operation: "get_quota_status",
  response: {
    api: "github-api",
    quotas: [
      { metric: "calls_per_hour", current: 3500, limit: 5000, status: "ok" },
      { metric: "cost_usd_per_day", current: 4.50, limit: 5.00, status: "warn" }
    ],
    next_reset: "2026-01-26T14:00:00Z"
  }
}
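
One way such a response might be assembled from tracked usage is sketched below. The function name `get_quota_status`, its parameters, the choice to report the highest configured threshold as `limit`, and the hourly reset boundary are all assumptions made for illustration; the proposal does not specify them.

```python
from datetime import datetime, timedelta, timezone


def status_for(current, thresholds):
    """Most severe threshold reached (assumes >= semantics), else "ok"."""
    for level in ("hard_stop", "pause", "warn"):
        limit = thresholds.get(level)
        if limit is not None and current >= limit:
            return level
    return "ok"


def get_quota_status(api, usage, quotas, now):
    """Build the response body for a hypothetical get_quota_status operation.

    `usage` maps metric -> current value, `quotas` maps metric -> thresholds.
    `now` must be a timezone-aware UTC datetime.
    """
    entries = []
    for metric, thresholds in quotas.items():
        current = usage.get(metric, 0)
        entries.append({
            "metric": metric,
            "current": current,
            # Assumption: report the highest configured threshold as the limit.
            "limit": max(thresholds.values()),
            "status": status_for(current, thresholds),
        })
    # Assumption: counters reset at the next top of the hour.
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return {
        "api": api,
        "quotas": entries,
        "next_reset": next_hour.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


quotas = {
    "calls_per_hour": {"warn": 4000, "pause": 4800, "hard_stop": 5000},
    "cost_usd_per_day": {"warn": 5.00, "pause": 10.00, "hard_stop": 50.00},
}
usage = {"calls_per_hour": 3500, "cost_usd_per_day": 6.50}
now = datetime(2026, 1, 26, 13, 20, tzinfo=timezone.utc)

status = get_quota_status("example-api", usage, quotas, now)
print(status["next_reset"])                      # 2026-01-26T14:00:00Z
print([q["status"] for q in status["quotas"]])   # ['ok', 'warn']
```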

Behavior

  1. Automatic extraction: Adapter generator extracts rate limits from API specs when available
  2. User override: Users can set stricter quotas than API limits
  3. Graceful handling: warn → pause → hard_stop progression
  4. Intelligent defaults: System can suggest quotas based on typical usage patterns
  5. Cross-adapter aggregation: When several adapters or endpoints share one upstream rate limit, usage is counted against a single combined budget
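
Items 2 and 5 can be illustrated with a short sketch. Everything here is hypothetical: `effective_limit`, `SharedUsage`, and the notion of a named limit "group" are illustrative names, and the rule that a user quota may only tighten an API limit is an assumption drawn from item 2.

```python
from collections import defaultdict


def effective_limit(api_limit, user_hard_stop=None):
    """A user quota may tighten the API limit but never loosen it (assumption)."""
    if user_hard_stop is None:
        return api_limit
    return min(api_limit, user_hard_stop)


class SharedUsage:
    """Counts calls per limit group so several adapters draw on one budget."""

    def __init__(self):
        self._counts = defaultdict(int)

    def record(self, group, calls=1):
        """Add `calls` to the group's counter and return the new total."""
        self._counts[group] += calls
        return self._counts[group]

    def remaining(self, group, limit):
        """Calls left in the group's budget, never negative."""
        return max(limit - self._counts.get(group, 0), 0)


shared = SharedUsage()
shared.record("github", 1200)   # calls made through one adapter
shared.record("github", 800)    # calls made through another adapter

limit = effective_limit(api_limit=5000, user_hard_stop=4800)
print(limit)                              # 4800
print(shared.remaining("github", limit))  # 2800
print(effective_limit(5000, 9000))        # 5000 (looser user value is ignored)
```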

Cost Estimation

For paid APIs, adapters can include cost metadata:

cost:
  model: per_call  # per_call | per_token | per_byte | tiered
  pricing:
    - endpoint: "*"
      cost_per_call: 0.001
    - endpoint: "premium_search"
      cost_per_call: 0.01
  currency: USD
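
A `per_call` estimate from this metadata might be computed as follows. This is a sketch under stated assumptions: the helper names are hypothetical, an exact endpoint match is assumed to take precedence over the `"*"` wildcard, and `Decimal` is used only to avoid floating-point drift in currency sums.

```python
from decimal import Decimal

# Pricing rules copied from the cost metadata above.
PRICING = [
    {"endpoint": "*", "cost_per_call": Decimal("0.001")},
    {"endpoint": "premium_search", "cost_per_call": Decimal("0.01")},
]


def cost_per_call(endpoint, pricing=PRICING):
    """Price for one call; an exact endpoint match beats the "*" wildcard."""
    wildcard = None
    for rule in pricing:
        if rule["endpoint"] == endpoint:
            return rule["cost_per_call"]
        if rule["endpoint"] == "*":
            wildcard = rule["cost_per_call"]
    if wildcard is None:
        raise KeyError(f"no pricing rule for endpoint {endpoint!r}")
    return wildcard


def estimate_cost(call_counts, pricing=PRICING):
    """Total cost for a mapping of endpoint -> number of calls."""
    return sum(
        (cost_per_call(endpoint, pricing) * count
         for endpoint, count in call_counts.items()),
        Decimal("0"),
    )


total = estimate_cost({"search": 1000, "premium_search": 50})
print(total)  # 1.500
```

The running total could then feed a `cost_usd_per_day` quota in the same way call counts feed `calls_per_hour`.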

Tasks

  • Define rate_limits schema
  • Define quotas schema
  • Define cost estimation schema
  • Add introspection operations for quota status
  • Document notification/callback mechanisms
  • Define cross-adapter aggregation rules

Metadata

Labels

  • adapter: Adapter development related
  • enhancement: New feature or request
  • infrastructure: Repository setup and configuration
  • phase-3: Adapter: Adapter specifications and interfaces
  • spec: Core specification content