Problem
When an API key is expired, invalid, or misconfigured, the agent only discovers this minutes into a workflow run — after container startup, healthchecks, and initial reasoning turns. The error surfaces as an opaque upstream 401/403 deep in the agent's execution, making it hard to diagnose.
Proposal
Add startup key validation to the api-proxy sidecar (containers/api-proxy/server.js). After all proxy servers are listening (ports 10000–10004), fire lightweight probe requests against each configured provider's API to verify credentials are accepted. Log clear, actionable messages for each result.
Validation endpoints
| Provider | Probe | Auth header | Valid response | Invalid response |
| --- | --- | --- | --- | --- |
| OpenAI | GET /v1/models | Authorization: Bearer {key} | 200 | 401 |
| Anthropic | POST /v1/messages (minimal body) | x-api-key + anthropic-version | 400 (missing fields = key valid) | 401/403 |
| Copilot (COPILOT_GITHUB_TOKEN, non-classic) | GET /models | Authorization: Bearer {token} | 200 | 401 |
| Copilot (COPILOT_API_KEY / classic PAT) | Skip validation | n/a | n/a | Log "validation not supported for this auth mode" |
| Gemini | GET /v1beta/models | x-goog-api-key | 200 | 400/403 |
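One way to keep the probe rules in a single place is a small lookup table plus a classifier. The sketch below is illustrative only: the names `PROBES` and `classifyStatus` are hypothetical, and the third outcome, `inconclusive`, is an assumption for status codes the table does not list (for example a 500 from the provider).

```javascript
// Hypothetical probe table mirroring the validation endpoints listed above.
// None of these names come from server.js.
const PROBES = {
  openai: { method: 'GET', path: '/v1/models', valid: [200], invalid: [401] },
  anthropic: { method: 'POST', path: '/v1/messages', valid: [400], invalid: [401, 403] },
  copilot: { method: 'GET', path: '/models', valid: [200], invalid: [401] },
  gemini: { method: 'GET', path: '/v1beta/models', valid: [200], invalid: [400, 403] },
};

// Map an HTTP status to 'valid', 'invalid', or 'inconclusive'.
// ASSUMPTION: 'inconclusive' covers any status the table does not mention.
function classifyStatus(provider, status) {
  if (!Object.hasOwn(PROBES, provider)) return 'inconclusive';
  const probe = PROBES[provider];
  if (probe.valid.includes(status)) return 'valid';
  if (probe.invalid.includes(status)) return 'invalid';
  return 'inconclusive';
}
```

Note that the same status can mean opposite things per provider: 400 counts as "key accepted" for Anthropic but as "key rejected" for Gemini, which is why the classification is keyed by provider.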
Design constraints
- Only validate default API targets. Custom targets (--openai-api-target, --gemini-api-target, etc.) and non-empty base paths may not support the probe endpoints. Skip validation for these and log a key_validation_skipped event that names the custom API target.
- Use proxyAgent for all requests. Validation requests must route through Squid, same as normal traffic. This also validates that the proxy chain is working.
- Respect startup sequencing. Wrap each server.listen() in a Promise, await Promise.all(...), then fire validation. This ensures the Docker healthcheck (port 10000) can pass before validation begins.
- 10-second timeout per provider. If a probe takes longer, log a warning and move on; the network may not be ready yet.
- Never exit the process. Log errors clearly but don't crash. The key might become valid later, or the agent might not use that provider.
- Shared helper function. A single validateKey(provider, target, path, headers, expectedStatus) function avoids duplicating URL construction, proxy routing, and timeout logic across providers.
- Anthropic probe is inconclusive. Since there's no lightweight GET endpoint, the POST probe should include proper headers (anthropic-version, Content-Type) and classify 400 as "key accepted, request malformed" vs 401/403 as "key rejected." Log the nuance.
- Copilot auth mode matters. COPILOT_GITHUB_TOKEN (non-classic) can be validated via /models. Classic ghp_* PATs and COPILOT_API_KEY (BYOK) cannot; log that validation is skipped, with the reason.
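The sequencing, timeout, and never-crash constraints above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual implementation: `listenAsync` is a hypothetical helper name, and the sixth `opts` parameter is not part of the proposed `validateKey` signature. It is added here only so that the proxy agent, HTTP method, request body, and a test double can be passed in.

```javascript
const https = require('node:https');

const DEFAULT_TIMEOUT_MS = 10_000; // 10-second budget per provider

// Wrap server.listen() in a Promise so callers can await every port.
function listenAsync(server, port) {
  return new Promise((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, () => resolve(server));
  });
}

// Probe one provider. Always resolves and never rejects, so a bad key or a
// network failure cannot crash the sidecar.
// ASSUMPTION: `opts` is illustrative. `opts.agent` stands in for the sidecar's
// proxyAgent, and `opts.request` lets tests inject a fake https.request.
function validateKey(provider, target, path, headers, expectedStatus, opts = {}) {
  const request = opts.request || https.request;
  const timeoutMs = opts.timeoutMs || DEFAULT_TIMEOUT_MS;
  const started = Date.now();

  return new Promise((resolve) => {
    let req;
    // Wall-clock timer: report a timeout and abort the in-flight probe.
    const timer = setTimeout(() => {
      finish('timeout');
      if (req) req.destroy();
    }, timeoutMs);
    // Only the first call has any effect; later resolve() calls are ignored.
    const finish = (outcome, extra = {}) => {
      clearTimeout(timer);
      resolve({ provider, outcome, duration_ms: Date.now() - started, ...extra });
    };

    try {
      req = request(
        { hostname: target, path, method: opts.method || 'GET', headers, agent: opts.agent },
        (res) => {
          res.resume(); // drain the body; only the status code matters
          const outcome = res.statusCode === expectedStatus ? 'success' : 'failed';
          finish(outcome, { status: res.statusCode });
        }
      );
      req.on('error', (err) => finish('error', { error: err.message }));
      req.end(opts.body);
    } catch (err) {
      finish('error', { error: err.message });
    }
  });
}
```

With helpers like these, startup would `await Promise.all(...)` over one `listenAsync` call per proxy server and then call the validation orchestrator without awaiting it, so the agent is never blocked on the probes. Treating every status other than `expectedStatus` as `failed` is a simplification; a real implementation would likely separate "key rejected" from unrelated upstream errors.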
Log format
[INFO] key_validation_success { provider: "openai", message: "OpenAI API key validated successfully", duration_ms: 342 }
[ERROR] key_validation_failed { provider: "anthropic", status: 401, message: "Anthropic API key is invalid or expired. Requests to this provider will fail." }
[WARN] key_validation_skipped { provider: "copilot", message: "Validation skipped — COPILOT_API_KEY auth mode does not support probe endpoint" }
[WARN] key_validation_skipped { provider: "openai", message: "Validation skipped — custom API target (my-llm-router.internal)" }
[WARN] key_validation_timeout { provider: "gemini", message: "Key validation timed out after 10s — network may not be ready" }
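A small formatter can reproduce the line shape shown above. This is only a sketch: `EVENTS` and `formatValidationLog` are hypothetical names, and the sidecar's real logger may already provide structured output, in which case only the outcome-to-event mapping is relevant.

```javascript
// Level and event name per outcome, taken from the sample log lines above.
const EVENTS = {
  success: { level: 'INFO', event: 'key_validation_success' },
  failed: { level: 'ERROR', event: 'key_validation_failed' },
  skipped: { level: 'WARN', event: 'key_validation_skipped' },
  timeout: { level: 'WARN', event: 'key_validation_timeout' },
};

// Render one validation result as "[LEVEL] event { key: value, ... }".
// ASSUMPTION: hypothetical helper, not an existing function in server.js.
function formatValidationLog(outcome, fields) {
  if (!Object.hasOwn(EVENTS, outcome)) {
    throw new Error(`unknown validation outcome: ${outcome}`);
  }
  const { level, event } = EVENTS[outcome];
  const body = Object.entries(fields)
    .map(([key, value]) => `${key}: ${JSON.stringify(value)}`)
    .join(', ');
  return `[${level}] ${event} { ${body} }`;
}

console.log(
  formatValidationLog('success', {
    provider: 'openai',
    message: 'OpenAI API key validated successfully',
    duration_ms: 342,
  })
);
// prints: [INFO] key_validation_success { provider: "openai", message: "OpenAI API key validated successfully", duration_ms: 342 }
```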
Testing
- Unit tests: mock https.request to return various status codes; verify correct log events
- Integration: can be tested manually with --enable-api-proxy and deliberately expired keys
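For the unit tests, a stand-in for https.request that answers with a fixed status code might look like the sketch below. The name `fakeHttpsRequest` is hypothetical, and how it gets wired in depends on the test framework used by server.test.js, which is not stated here; with Jest, for instance, it could be installed via `jest.spyOn(https, 'request').mockImplementation(fake.request)`.

```javascript
const { EventEmitter } = require('node:events');

// Build a fake https.request that replies to every call with `statusCode`
// and records the options it was called with.
// ASSUMPTION: hypothetical test helper, not part of the existing test suite.
function fakeHttpsRequest(statusCode) {
  const calls = [];
  const request = (options, onResponse) => {
    calls.push(options);
    const req = new EventEmitter();
    req.destroy = () => {};
    req.end = () => {
      const res = new EventEmitter();
      res.statusCode = statusCode;
      res.resume = () => {};
      // Respond asynchronously, as a real network request would.
      setImmediate(() => onResponse(res));
    };
    return req;
  };
  return { request, calls };
}
```

A test can then assert on `calls` to confirm the probe hit the expected path and headers, and on the emitted log event to confirm that, say, a 401 produces key_validation_failed.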
Files to modify
- containers/api-proxy/server.js: add validateKey() helper and validateApiKeys() orchestrator
- containers/api-proxy/server.test.js: unit tests for the validation logic
- Export validateKey for testability
Out of scope
- Periodic re-validation (only at startup)
- Key rotation / refresh
- Blocking the agent until validation completes (fire-and-forget after servers listen)