fix(api-proxy): suppress model fallback for Copilot, add excludeEngines#4015

Merged: lpcox merged 2 commits into main from fix/copilot-model-fallback-suppression on May 29, 2026

@lpcox lpcox commented May 29, 2026

Summary

Addresses #3987. modelFallback was silently rewriting retired/restricted Copilot model names to middle-power fallback models, obscuring the real upstream error with generic auth-failure diagnostics.

Changes

1. Suppress middle-power fallback for standard Copilot

When no BYOK env vars are set, the Copilot CLI is authoritative for its own model catalogue. Retired or restricted models should fail fast with a clear upstream error instead of being silently rewritten to a random middle-tier model.

Previously: standard Copilot → fallback enabled → model rewritten → confusing error or wrong model
Now: standard Copilot → fallback suppressed → request goes through with original model → clear 400 model not supported error

2. Suppress for BYOK Copilot targeting GitHub catalog

When BYOK hints are present but the target is still a GitHub Copilot catalog host (api.githubcopilot.com, etc.), the same logic applies — catalog is authoritative.

3. Add modelFallback.excludeEngines config option

New config field: array of engine/provider names for which middle-power fallback is suppressed. This gives gh-aw authors explicit control to disable fallback per-engine without disabling it globally.

{
  "apiProxy": {
    "modelFallback": {
      "enabled": true,
      "strategy": "middle_power",
      "excludeEngines": ["copilot", "anthropic"]
    }
  }
}
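
As a rough illustration of how the suppression conditions above could combine into one per-provider decision, here is a minimal sketch. It is not the code in server.js or copilot.js. The function names, the `hasByokHints` and `targetHost` parameters, and every reason string other than `copilot_standard_authoritative` (which appears in this PR's tests) are assumptions, as is the rule that a catalog host is `githubcopilot.com` or one of its subdomains. The last Copilot branch follows the config spec's statement that BYOK targets outside the GitHub catalog are also left unrewritten.

```javascript
// Hypothetical sketch of a per-provider fallback policy.
// Names and reason strings are illustrative unless noted otherwise.

function isGithubCopilotCatalogHost(host) {
  // Assumption: catalog hosts are githubcopilot.com or a subdomain of it.
  const h = String(host || '').toLowerCase();
  return h === 'githubcopilot.com' || h.endsWith('.githubcopilot.com');
}

function resolveFallbackPolicy({ provider, config, hasByokHints = false, targetHost = '' }) {
  const base = { enabled: Boolean(config.enabled), strategy: config.strategy };
  const suppress = (reason) => ({
    ...base,
    enabled: false,
    suppressed: true,
    suppression_reason: reason,
  });

  if (!base.enabled) return base; // globally disabled: nothing to suppress

  const engine = String(provider).toLowerCase();
  if ((config.excludeEngines || []).includes(engine)) {
    return suppress('excluded_engine'); // hypothetical reason string
  }

  if (engine === 'copilot') {
    // Reason string taken from the test expectation in this PR.
    if (!hasByokHints) return suppress('copilot_standard_authoritative');
    if (isGithubCopilotCatalogHost(targetHost)) {
      return suppress('copilot_byok_catalog_target'); // hypothetical reason string
    }
    // Per the config spec: non-catalog BYOK deployment names are provider-local.
    return suppress('copilot_byok_provider_local'); // hypothetical reason string
  }

  return base; // fallback stays active for every other provider
}
```

With the example config above, `resolveFallbackPolicy({ provider: 'openai', config })` would keep fallback enabled, while `copilot` and `anthropic` would both come back suppressed.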

4. Improved error messaging

  • New model_unavailable diagnostic log: Emitted when Copilot returns 400 model not supported after retries are exhausted. Provides actionable guidance instead of generic auth-error message.
  • Suppressed misleading auth-error log: The generic upstream_auth_error warning ("check that the API key is valid") is no longer emitted for 400 responses that are actually model-not-supported errors.
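
The two bullets above amount to choosing one diagnostic per upstream status. A minimal sketch of that choice follows. The proxy's actual check is `parseModelNotSupportedFromBody`, whose implementation is not shown in this PR, so `looksLikeModelNotSupported` and its text match are stand-in assumptions, and the sketch ignores the "after retries are exhausted" condition.

```javascript
// Hypothetical stand-in for the proxy's parseModelNotSupportedFromBody.
// Assumption: a model error can be recognised by its message text alone.
function looksLikeModelNotSupported(body) {
  if (typeof body !== 'string' || body.length === 0) return false;
  return /model[\s\S]{0,80}?not[ _]supported/i.test(body);
}

// Returns the diagnostic event name to log, or null when none applies.
function pickDiagnostic(statusCode, body) {
  if (statusCode === 401 || statusCode === 403) return 'upstream_auth_error';
  if (statusCode === 400) {
    return looksLikeModelNotSupported(body) ? 'model_unavailable' : 'upstream_auth_error';
  }
  return null;
}
```

The point of the split is that a 400 carrying a model error never reaches the "check that the API key is valid" message.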

Files Changed

  • containers/api-proxy/providers/copilot.js — Suppress fallback for standard Copilot + BYOK catalog targets
  • containers/api-proxy/server.js — Parse excludeEngines, apply per-provider policy
  • containers/api-proxy/upstream-response.js — Add model_unavailable diagnostic, suppress auth-error for model errors
  • src/awf-config-schema.json + docs/awf-config.schema.json — Add excludeEngines field to schema
  • docs/awf-config-spec.md — Document new field and suppression conditions
  • containers/api-proxy/server.network.test.js — Tests for suppression policies
  • containers/api-proxy/server.models.test.js — Updated model transform tests to use non-copilot provider

Testing

  • All 2193 AWF tests pass
  • All 777 api-proxy tests pass (excluding 1 pre-existing failure in server.custom-auth-header.test.js on main)
  • New tests added for excludeEngines config and standard-Copilot suppression

Addresses #3987 — modelFallback was silently rewriting retired/restricted
Copilot model names to middle-power fallback models, obscuring the real
upstream error.

Changes:
- Suppress middle-power fallback for standard Copilot (non-BYOK): the
  Copilot CLI is authoritative for its own model catalogue, so
  retired/restricted models should fail fast with a clear error
- Suppress for BYOK Copilot pointing at GitHub catalog targets (same
  rationale — catalog is authoritative)
- Add modelFallback.excludeEngines config option: array of engine names
  for which fallback is suppressed, allowing per-engine control
- Improve error messaging: emit a model_unavailable diagnostic log when
  Copilot returns 400 'model not supported' after retries are exhausted,
  instead of only the misleading 'check API key' auth error
- Suppress generic upstream_auth_error log for 400s that are actually
  model-not-supported errors (avoids conflating auth issues with model
  availability)

Closes #3987

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 29, 2026 13:53

github-actions Bot commented May 29, 2026

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit e65b1c7


github-actions Bot commented May 29, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 96.57% | 96.62% | 📈 +0.05% |
| Statements | 96.45% | 96.49% | 📈 +0.04% |
| Functions | 98.24% | 98.24% | ➡️ +0.00% |
| Branches | 90.77% | 90.81% | 📈 +0.04% |

📁 Per-file Coverage Changes (1 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/config-writer.ts | 89.3% → 90.9% (+1.65%) | 89.3% → 90.9% (+1.65%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions github-actions Bot mentioned this pull request May 29, 2026

Copilot AI left a comment

Pull request overview

This PR refines the api-proxy “model fallback” behavior to avoid silently rewriting Copilot model names (which can obscure upstream “model not supported” errors), and introduces a per-engine suppression mechanism via modelFallback.excludeEngines.

Changes:

  • Suppress middle-power model fallback for Copilot in “standard” mode and for Copilot requests that still target GitHub Copilot catalog hosts.
  • Add apiProxy.modelFallback.excludeEngines to config schema/spec and apply suppression policy per provider.
  • Improve diagnostics by emitting a model_unavailable log for Copilot “model not supported” errors and suppressing misleading auth warnings in that case; update tests accordingly.

Summary per file:

| File | Description |
|---|---|
| src/awf-config-schema.json | Adds modelFallback.excludeEngines to the source config schema. |
| docs/awf-config.schema.json | Mirrors the schema change in the published docs schema. |
| docs/awf-config-spec.md | Documents excludeEngines and Copilot-specific suppression conditions. |
| containers/api-proxy/server.js | Parses excludeEngines and applies per-provider fallback suppression policy. |
| containers/api-proxy/providers/copilot.js | Suppresses fallback for standard Copilot and GitHub-catalog targets. |
| containers/api-proxy/upstream-response.js | Adds model_unavailable diagnostic and suppresses misleading auth warnings for model errors. |
| containers/api-proxy/server.network.test.js | Adds/updates reflect endpoint tests for suppression policies. |
| containers/api-proxy/server.models.test.js | Updates model transform tests to use a non-Copilot provider where fallback remains active. |

Copilot's findings

Comments suppressed due to low confidence (1)

containers/api-proxy/upstream-response.js:76

  • The 400-path still logs upstream_auth_error even when responseBody is unavailable (e.g. when the 400 response isn't buffered). That can reintroduce the misleading "check that the API key is valid" warning for model-not-supported 400s. Consider skipping the 400 auth warning unless the buffered body is present and confirmed to be a non-model error.
  function logUpstreamAuthError(statusCode, { requestId, provider, targetHost, req, responseBody }) {
    if (statusCode === 401 || statusCode === 403) {
      logRequest('warn', 'upstream_auth_error', {
        request_id: requestId, provider, status: statusCode,
        upstream_host: targetHost, path: sanitizeForLog(req.url),
        message: `Upstream returned ${statusCode} — check that the API key is valid and correctly formatted`,
      });
    } else if (statusCode === 400) {
      // Suppress generic auth-error message when the 400 is a model-not-supported
      // error — that case is handled by the model_unavailable diagnostic.
      if (responseBody && parseModelNotSupportedFromBody(responseBody)) return;
      logRequest('warn', 'upstream_auth_error', {
        request_id: requestId, provider, status: statusCode,
        upstream_host: targetHost, path: sanitizeForLog(req.url),
        message: `Upstream returned ${statusCode} — check that the API key is valid and correctly formatted`,
      });
    }
  • Files reviewed: 8/8 changed files
  • Comments generated: 3
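
One way to read the suppressed suggestion above is as a guard that stays quiet on a 400 whenever the body was not buffered. The sketch below is only an illustration of that idea, not a change made in this PR. `shouldLogAuthWarning` is a hypothetical name, and the model-error check is passed in as a predicate so the example stays self-contained.

```javascript
// Hypothetical guard: decide whether the generic auth warning should be logged.
// isModelNotSupported is any predicate over the buffered body, for example the
// proxy's parseModelNotSupportedFromBody.
function shouldLogAuthWarning(statusCode, responseBody, isModelNotSupported) {
  if (statusCode === 401 || statusCode === 403) return true;
  if (statusCode !== 400) return false;
  // No buffered body: a model error cannot be ruled out, so do not warn.
  if (!responseBody) return false;
  return !isModelNotSupported(responseBody);
}
```

The trade-off is that an unbuffered 400 caused by a bad key would no longer produce the warning either.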

Comment thread containers/api-proxy/server.js Outdated
Comment on lines +104 to +107
const excludeEngines = Array.isArray(parsed.excludeEngines)
  ? parsed.excludeEngines.filter(e => typeof e === 'string').map(e => e.toLowerCase())
  : [];
return { enabled, strategy, excludeEngines };
Comment thread docs/awf-config-spec.md
Comment on lines +816 to 823
- The provider is in the `excludeEngines` list
- Copilot engine in standard mode (no BYOK env vars): the Copilot CLI is
authoritative for its own model catalogue, so retired/restricted model names
should fail fast with a clear upstream error rather than being silently
rewritten to a middle-power fallback
- Copilot is configured for a BYOK non-`githubcopilot` target (for example Azure
OpenAI deployment endpoints), where deployment names are provider-local and
must not be rewritten to catalog model IDs
Comment on lines +446 to +454
it('should suppress fallback for standard Copilot (no BYOK hints)', () => {
  // Default test environment has no BYOK env vars set — standard Copilot
  const result = reflectEndpoints();
  expect(result.model_fallback_effective.copilot).toEqual({
    enabled: false,
    strategy: 'middle_power',
    suppressed: true,
    suppression_reason: 'copilot_standard_authoritative',
  });
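
The test excerpt above depends on the ambient environment having no BYOK hints. A minimal sketch of making that explicit is a helper that clears named variables for the duration of a synchronous callback and then restores them. `withEnvCleared` and the `EXAMPLE_*` variable names are hypothetical. The actual BYOK hint variables are not listed in this PR.

```javascript
// Runs fn with the named environment variables removed, then restores them.
// Synchronous only: an async fn would see the variables restored too early.
function withEnvCleared(names, fn) {
  const saved = new Map(names.map((n) => [n, process.env[n]]));
  for (const n of names) delete process.env[n];
  try {
    return fn();
  } finally {
    for (const [n, value] of saved) {
      if (value === undefined) delete process.env[n];
      else process.env[n] = value;
    }
  }
}

// Usage with placeholder names (not the proxy's real BYOK variables):
// withEnvCleared(['EXAMPLE_BYOK_BASE_URL', 'EXAMPLE_BYOK_KEY'], () => reflectEndpoints());
```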

lpcox commented May 29, 2026

@copilot address review feedback

Copilot AI commented May 29, 2026

@copilot address review feedback

Addressed in commit e584303.

  • Normalized excludeEngines parsing to trim/lowercase/dedupe and always return excludeEngines: [] when unset/invalid.
  • Updated config spec docs to explicitly cover BYOK + GitHub Copilot catalog target fallback suppression.
  • Hardened the standard-Copilot fallback test to explicitly clear/restore BYOK hint env vars.
  • Updated affected reflect expectations and added normalization coverage in tests.

Validated with: npm test -- --runInBand server.network.test.js server.models.test.js (70/70 passing).
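
For reference, the trim/lowercase/dedupe normalization described in the first bullet could look roughly like the sketch below. This is not the code from commit e584303. `normalizeExcludeEngines` is a hypothetical name, and dropping entries that are empty after trimming is an assumption.

```javascript
// Hypothetical normalizer for modelFallback.excludeEngines.
// Always returns an array, so callers never have to handle undefined.
function normalizeExcludeEngines(value) {
  if (!Array.isArray(value)) return [];
  const cleaned = value
    .filter((e) => typeof e === 'string') // ignore non-string entries
    .map((e) => e.trim().toLowerCase())
    .filter((e) => e.length > 0); // assumption: blank entries are dropped
  return [...new Set(cleaned)]; // dedupe, keeping first-seen order
}
```

Returning `[]` for unset or invalid input keeps later membership checks such as `excludeEngines.includes(provider)` free of null guards.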

Copilot finished work on behalf of lpcox May 29, 2026 14:28
@github-actions

Smoke Test: Claude Engine

  • ✅ GitHub API: 2 recent PRs found
  • ✅ GitHub check: playwright_check=PASS
  • ✅ File verify: smoke-test-claude-26643063082.txt exists

Result: PASS

💥 [THE END] — Illustrated by Smoke Claude

@github-actions

🔍 Smoke Test: API Proxy OpenTelemetry Tracing

| Scenario | Result | Notes |
|---|---|---|
| 1. Module Loading | ✅ Pass | otel.js loads successfully; exports: startRequestSpan, setTokenAttributes, endSpan, endSpanError, shutdown, isEnabled + internal helpers |
| 2. Test Suite | ✅ Pass | 33/33 tests passed (otel.test.js) across ProxyAwareOtlpExporter, FileSpanExporter, and shutdown suites |
| 3. Env Var Forwarding | ✅ Pass | api-proxy-service.ts forwards OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, and OTEL_SERVICE_NAME |
| 4. Token Tracker Integration | ✅ Pass | onUsage callback exists in token-tracker-http.js (line 237) as the OTEL hook point |
| 5. OTEL Diagnostics | ✅ Pass | No OTLP endpoint configured in this run; graceful degradation to file fallback (FileSpanExporter) confirmed |

All 5 scenarios passed. OTEL integration is functional.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions

🔬 Smoke Test Results

| Test | Result |
|---|---|
| GitHub MCP connectivity | ✅ PR data fetched successfully |
| GitHub.com HTTP connectivity | ✅ Reachable |
| File write/read (/tmp/gh-aw/agent/smoke-test-copilot-26643063174.txt) | ✅ Content verified |

PR: fix(api-proxy): suppress model fallback for Copilot, add excludeEngines
Author: @lpcox | Assignees: none

Overall: PASS

📰 BREAKING: Report filed by Smoke Copilot

@github-actions

🔥 Smoke Test: Copilot BYOK (Offline) Mode

| Test | Result |
|---|---|
| GitHub MCP (list PRs) | ✅ PR #4012 "feat(api-proxy): Anthropic WIF schema fields and OIDC validation probe" returned |
| GitHub.com connectivity | ⚠️ Pre-step data not injected (template vars unexpanded) |
| File write/read | ⚠️ Pre-step data not injected (template vars unexpanded) |
| BYOK inference (this response) | ✅ Responding via api-proxy → api.githubcopilot.com |

Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy → api.githubcopilot.com.

Overall: PARTIAL — BYOK inference and MCP confirmed ✅; pre-step smoke data vars were not injected.

Author: @lpcox | No assignees.

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions

Gemini Engine Smoke Test Results

  • GitHub MCP Testing: ❌ (mcpscripts missing)
  • GitHub.com Connectivity: ❌ (curl error 35/400 via Squid)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall Status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

@github-actions

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #4015 · sonnet46 1.1M ·

@github-actions

Smoke Test Results — FAIL ❌

| Check | Result |
|---|---|
| Redis PING | ❌ Timeout/no response |
| PostgreSQL pg_isready | ❌ No response on port 5432 |
| PostgreSQL SELECT 1 | ❌ Timeout/no response |

host.docker.internal is not reachable from this environment. Service containers may not be running or are not accessible from the agent sandbox.

🔌 Service connectivity validated by Smoke Services

@github-actions

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.13 | Python 3.12.3 | ❌ NO |
| Node.js | v24.16.0 | v22.22.3 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

@lpcox lpcox merged commit f5a1712 into main May 29, 2026
69 of 70 checks passed
@lpcox lpcox deleted the fix/copilot-model-fallback-suppression branch May 29, 2026 14:47
@github-actions

fix(api-proxy): suppress model fallback for Copilot, add excludeEngines
feat(api-proxy): Anthropic WIF schema fields and OIDC validation probe
GitHub PRs: ✅
Safe CLI: ✅
Playwright: ✅
File write: ✅
Discussion: ✅
Build: ❌
Overall: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex
