[FEATURE]: Support multiple API keys with failover for MCP providers #25964

@sharyuke

  • I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request:

Problem Statement

Currently, users who hold multiple API keys for the same MCP provider (e.g., several OpenAI keys, or OpenRouter keys from different vendors) must manually edit the configuration file to switch keys whenever the active key hits a rate limit or fails. This is disruptive during active coding sessions.

Many developers:

  • Use multiple API keys from the same provider for higher rate limits
  • Use keys from different proxy/reseller services for cost optimization
  • Experience rate limiting (429 errors) during intensive coding sessions
  • Want automatic failover without manual intervention

Proposed Solution

Allow configuring multiple API keys as an array in the MCP provider configuration, with automatic failover on errors:

Configuration Example:

{
  "mcp": {
    "openai": {
      "type": "openai",
      "apiKey": ["key-1", "key-2", "key-3"],
      "baseUrl": "https://api.openai.com/v1"
    },
    "openrouter": {
      "type": "openrouter",
      "apiKey": ["$OR_KEY_1", "$OR_KEY_2"],
      "baseUrl": "https://openrouter.ai/api/v1"
    }
  }
}

Expected Behavior

  1. Sequential Failover: Keys should be tried in the configured order; if key-1 fails with a failover-eligible error (429, 5xx, 401/403, or a network error), automatically try key-2
  2. Transparent Logging: Log which key is currently active for debugging purposes
  3. Exhaustion Handling: If all keys fail, report a clear error message listing all failed keys
  4. Health Check: Optionally validate keys on startup and mark unhealthy keys as disabled
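
A minimal sketch of items 1 to 3 above, to make the intended control flow concrete. Everything here is hypothetical: `withFailover`, `maskKey`, the `Attempt` result type, and the injected `log` callback are illustration names, not existing APIs. The sketch treats every failed attempt as failover-eligible and masks keys in log and error output, which is an assumption on top of the request (the issue only asks that failed keys be listed).

```typescript
// Hypothetical result type for one request attempt.
// status === null stands for a network error with no HTTP response.
type Attempt<T> =
  | { kind: "ok"; value: T }
  | { kind: "error"; status: number | null };

// Assumption: show only the last four characters so full keys never reach logs.
function maskKey(key: string): string {
  return key.length <= 4 ? "****" : `****${key.slice(-4)}`;
}

// Try each key in the configured order and return the first successful result.
async function withFailover<T>(
  keys: string[],
  send: (key: string) => Promise<Attempt<T>>,
  log: (msg: string) => void = () => {},
): Promise<T> {
  const failures: string[] = [];
  let position = 0;
  for (const key of keys) {
    position += 1;
    // Transparent logging: record which key is currently active.
    log(`using key ${position}/${keys.length} (${maskKey(key)})`);
    const result = await send(key);
    if (result.kind === "ok") return result.value;
    const reason = result.status === null ? "network error" : `HTTP ${result.status}`;
    failures.push(`${maskKey(key)}: ${reason}`);
  }
  // Exhaustion handling: one clear error that lists every failed key.
  throw new Error(`all ${keys.length} API keys failed: ${failures.join("; ")}`);
}
```

A caller would pass a `send` function that performs the actual provider request with the given key and maps the response to an `Attempt`. The optional startup health check (item 4) is not covered here.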

Proposed Implementation

Config Schema Change

// Current (single key)
apiKey: string

// Proposed (multiple keys)
apiKey: string | string[]
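
One way the widened type could be consumed without touching every call site is to normalize it to a list once, when the config is loaded. The sketch below is an assumption, not existing code: `normalizeApiKeys` is a hypothetical helper, and treating entries that start with `$` (such as `"$OR_KEY_1"` in the example config) as environment variable references is a guess at the intended convention. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Proposed shape of the `apiKey` field.
type ApiKeyConfig = string | string[];

// Hypothetical helper: turn a single key or an array into a non-empty key list.
function normalizeApiKeys(
  apiKey: ApiKeyConfig,
  env: Record<string, string | undefined>,
): string[] {
  const raw = Array.isArray(apiKey) ? apiKey : [apiKey];
  const keys: string[] = [];
  for (const entry of raw) {
    // Assumption: "$NAME" refers to an environment variable called NAME.
    const value = entry.startsWith("$") ? env[entry.slice(1)] : entry;
    if (value === undefined || value.trim() === "") {
      throw new Error(`apiKey entry "${entry}" is empty or unset`);
    }
    // Drop duplicates while preserving the configured order.
    if (keys.indexOf(value) === -1) keys.push(value);
  }
  if (keys.length === 0) {
    throw new Error("apiKey must contain at least one key");
  }
  return keys;
}
```

With this in place, an existing single-key config such as `"apiKey": "key-1"` keeps working and simply becomes a one-element list. In a Node.js setting the caller would typically pass `process.env` as the second argument.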

Error Handling Strategy

  • 429 (Rate Limited): Immediate failover to next key
  • 5xx (Server Error): Retry once with next key after brief backoff
  • 401/403 (Auth Error): Skip to next key immediately
  • Network Error: Retry with next key
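
The strategy above could be expressed as a small pure function that maps a failure to an action. This is a sketch under assumptions: `classifyFailure` and `FailoverAction` are hypothetical names, the 500 ms default is a placeholder for "brief backoff", and treating other 4xx responses as non-recoverable (a different key would not fix a malformed request) is my reading rather than something the list states.

```typescript
// What the client should do after a failed request.
type FailoverAction =
  | { kind: "next-key"; backoffMs: number }
  | { kind: "fail" };

// status === null stands for a network error with no HTTP response.
function classifyFailure(status: number | null, backoffMs = 500): FailoverAction {
  // Network error: retry with the next key.
  if (status === null) return { kind: "next-key", backoffMs: 0 };
  // 429 rate limited: immediate failover.
  if (status === 429) return { kind: "next-key", backoffMs: 0 };
  // 401/403 auth error: this key is unusable, skip it immediately.
  if (status === 401 || status === 403) return { kind: "next-key", backoffMs: 0 };
  // 5xx server error: move to the next key after a brief backoff.
  if (status >= 500 && status <= 599) return { kind: "next-key", backoffMs };
  // Assumption: any other status is a request problem that another key cannot fix.
  return { kind: "fail" };
}
```

Keeping the classification separate from the request loop makes the policy easy to unit test and to adjust, for example if a provider signals rate limiting with a different status code.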

User Experience

  • No additional user interaction required during failover
  • User can query current active key via status command
  • Keys are tried in the configured order in v1; fair rotation to distribute load across keys (round-robin) could be an optional v2 addition
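
To illustrate the status query and the optional rotation, here is a rough sketch of the state such a feature would need to track. `KeyRing` and its methods are hypothetical names, and masking the key in the status output is an assumption, since the request does not say how the active key should be displayed.

```typescript
// Hypothetical holder for the configured keys and the currently active one.
class KeyRing {
  private readonly keys: string[];
  private active = 0;

  constructor(keys: string[]) {
    if (keys.length === 0) throw new Error("at least one API key is required");
    this.keys = keys.slice();
  }

  // The key that the next request should use.
  current(): string {
    return this.keys[this.active];
  }

  // Advance to the next key, wrapping around to the first (round-robin).
  rotate(): string {
    this.active = (this.active + 1) % this.keys.length;
    return this.current();
  }

  // One-line summary suitable for a status command; the key itself is masked.
  status(): string {
    const key = this.current();
    const masked = key.length <= 4 ? "****" : `****${key.slice(-4)}`;
    return `key ${this.active + 1}/${this.keys.length} (${masked})`;
  }
}
```

A status command could print `ring.status()`, and a failover or round-robin policy would call `ring.rotate()` whenever it decides to move on.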

Benefits

  1. Improved Reliability: Automatic failover reduces disruption from rate limiting
  2. Better Resource Utilization: Users can fully utilize all their API keys
  3. Seamless Experience: No manual intervention required during key exhaustion
  4. Cost Optimization: Easier to use keys from multiple providers/resellers

Alternatives Considered

  1. Manual key switching command: Requires user action, not seamless
  2. External load balancer: Adds infrastructure complexity, not user-friendly
  3. Per-request random selection: Doesn't handle rate limits well

Priority

Medium - This is an enhancement that improves reliability but doesn't block core functionality.
