Skip to content

feat: improve provider detection template selection#1003

Merged
ding113 merged 6 commits intodevfrom
feat/provider-detect-templates
Apr 9, 2026
Merged

feat: improve provider detection template selection#1003
ding113 merged 6 commits intodevfrom
feat/provider-detect-templates

Conversation

@ding113
Copy link
Copy Markdown
Owner

@ding113 ding113 commented Apr 8, 2026

Summary

  • split provider detection into independent Claude, Codex, OpenAI Compatible, and Gemini template suites with relay-pulse-inspired built-in templates plus a dedicated OpenAI Compatible chat template
  • add automatic template ordering/fallback, parser-based response extraction, and full raw-response passthrough so detection can show complete bodies consistently
  • simplify the provider detection UI to only require model input, and align Gemini path/model handling with generateContent requests

Verification

  • bun run lint:fix
  • bun run typecheck
  • bun run test
  • bun run build

Notes

  • updated a stale provider patch contract test to follow the live MAX_TEXT_LENGTH constant so the full test suite stays green
  • removed an unused private accessor in request-filter-engine while clearing repo-wide lint noise encountered during verification

Greptile Summary

This PR refactors the provider detection system by splitting it into independent per-protocol template suites (Claude, Codex, OpenAI Compatible, Gemini), adding a preset-based auto-selection/fallback mechanism, switching Gemini from streaming (streamGenerateContent?alt=sse) to non-streaming (generateContent), and simplifying the UI to require only a model input. The rawResponse field is propagated through to the frontend for inspection, and the architecture is clean. All remaining findings are P2 suggestions.

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style suggestions that do not block correctness or reliability.

The core logic changes (Gemini non-streaming switch, preset auto-selection, rawResponse passthrough) are all covered by new tests that pass. The three findings are a stale JSDoc comment, a cosmetic latency-display inconsistency in the Gemini UI path, and a minor reliability suggestion for Gemini preset files. None affect runtime correctness.

src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx (Gemini latency display), src/lib/provider-testing/types.ts (stale comment), src/lib/provider-testing/data/gm_flash_basic.json and gm_pro_basic.json (missing system instruction)

Vulnerabilities

No security concerns identified. rawResponse is gated behind an admin session check in testProviderUnified and testProviderGemini. The looksLikeAnthropicProxy heuristic uses hostname-segment matching and explicitly excludes anthropic.com/claude.ai, avoiding credential routing mistakes. API keys are not logged.

Important Files Changed

Filename Overview
src/lib/provider-testing/presets.ts New preset registry with per-suite templates, score-based selection, and fallback ordering — clean and well-structured.
src/lib/provider-testing/test-service.ts New unified test runner with deadline-aware retry loop across preset candidates; logic is sound.
src/actions/providers.ts Exports new ProviderApiTestSuccessDetails/ProviderApiTestFailureDetails types with rawResponse; switches Gemini to non-streaming generateContent; testProviderUnified correctly maps ProviderTestResult to UnifiedTestResult.
src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx UI simplified to model-only input; Gemini path builds synthetic validationDetails with a hardcoded 5 s latency threshold that doesn't match the 60 s Gemini timeout.
src/lib/provider-testing/types.ts Adds optional fields to CodexTestBody and GeminiTestBody; existing JSDoc on rawResponse says 'truncated to 5000 chars' but the new test explicitly verifies the full body is preserved.
src/lib/provider-testing/utils/test-prompts.ts Updated user-agents, defaults, and headers; adds resolveClaudeHeaders with smart proxy detection heuristic; clean refactor.
src/lib/request-filter-engine.ts Removes unused backward-compat private getter/setter pairs; straightforward cleanup.
tests/unit/actions/providers-api-test.test.ts New test file covering testProviderUnified rawResponse passthrough and testProviderGemini response body exposure; well-structured mocking.
tests/unit/actions/providers-patch-contract.test.ts Updates hardcoded 256 to PROVIDER_RULE_LIMITS.MAX_TEXT_LENGTH + 1 to track the live constant; correct fix.
tests/unit/settings/providers/api-test-button.test.tsx New UI smoke test verifying simplified form no longer exposes format/template/keyword/timeout fields.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["UI: ApiTestButton\n(model input only)"] --> B{providerType?}
    B -- "gemini / gemini-cli" --> C["testProviderGemini()\n(providers.ts)"]
    B -- "claude / codex / openai-compatible" --> D["testProviderUnified()\n(providers.ts)"]
    C --> C1["executeProviderApiTest()\nPOST /v1beta/models/{model}:generateContent\nreturns ProviderApiTestResult\n(details.rawResponse)"]
    D --> E["executeProviderTest()\n(test-service.ts)"]
    E --> F["buildAttemptPlans()"]
    F --> G{"custom payload?"}
    G -- yes --> H["Single plan from JSON"]
    G -- no --> I["getExecutionPresetCandidates()\nscore + sort presets"]
    I --> J["PresetConfig[]"]
    J --> K["runSingleAttempt() x N\ndeadline-aware loop"]
    K --> L{"success or\nnon-retryable?"}
    L -- yes --> M["Return ProviderTestResult\n(rawResponse at top-level)"]
    L -- "no, HTTP 400/404/405/415/422" --> K
    M --> N["UnifiedTestResult\n(rawResponse at data root)"]
Loading

Comments Outside Diff (1)

  1. src/lib/provider-testing/types.ts, line 164 (link)

    P2 Stale JSDoc comment contradicts implementation

    The comment says rawResponse is "truncated to 5000 chars", but test-service.ts assigns rawResponse: responseBody with no truncation applied, and test-service.test.ts explicitly asserts that the full 7000+ character body is preserved. The description is actively misleading.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/lib/provider-testing/types.ts
    Line: 164
    
    Comment:
    **Stale JSDoc comment contradicts implementation**
    
    The comment says `rawResponse` is "truncated to 5000 chars", but `test-service.ts` assigns `rawResponse: responseBody` with no truncation applied, and `test-service.test.ts` explicitly asserts that the full 7000+ character body is preserved. The description is actively misleading.
    
    
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/lib/provider-testing/types.ts
Line: 163-164

Comment:
**Stale JSDoc contradicts new test**

The comment says `rawResponse` is "truncated to 5000 chars", but `test-service.test.ts` (added in this PR, lines 64–103) explicitly asserts that a 7 000+ character response body is preserved in full (`expect(result.rawResponse?.length).toBe(responseBody.length)`). A developer reading this comment will likely add truncation in a future refactor, breaking that contract. Please update it to reflect the current, intended behaviour.

```suggestion
  /** Raw response body for user inspection (full body, not truncated) */
  rawResponse?: string;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx
Line: 256-259

Comment:
**Hardcoded 5 s latency threshold mismatches Gemini's 60 s timeout**

`getTimeoutMsForProvider("gemini")` returns `60_000` ms, but the synthetic `validationDetails` built here marks `latencyPassed: false` for any response that takes longer than 5 000 ms. A valid, slow Gemini response (e.g. 8 s) will show the latency check as failed in the `TestResultCard` even though the test overall succeeded. The `testProviderUnified` path avoids this because the threshold is evaluated on the backend. Consider using the same provider-aware threshold here:

```suggestion
          latencyPassed: isSuccess && latencyMs < getTimeoutMsForProvider(resolvedProviderType),
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/lib/provider-testing/data/gm_flash_basic.json
Line: 1-19

Comment:
**No system instruction — "pong" content check may be flaky**

The same applies to `gm_pro_basic.json`. The old body included `"systemInstruction": { "parts": [{ "text": "You are a echo bot. Always say pong." }] }`, which made the `defaultSuccessContains: "pong"` check reliable. Without it, the model must infer the ping/pong convention from training data alone. At `temperature: 0` this typically works, but it's one less guard against a model rephrasing (e.g. "Pong!" or "I'm ready"). Adding back a `systemInstruction` (even a minimal one) would make the validation deterministic.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "fix: tighten provider detection validati..." | Re-trigger Greptile

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 8, 2026

📝 Walkthrough

Walkthrough

引入按预设候选顺序的多次尝试执行路径,扩展并透传 upstream 原始 HTTP 响应(rawResponse),重构 Gemini/其他提供商请求/响应类型与头/路径/预设选择逻辑,并调整前端测试按钮为基于提供商类型的简化流。

Changes

Cohort / File(s) Summary
核心提供商测试逻辑
src/actions/providers.ts, src/lib/provider-testing/test-service.ts, src/lib/provider-testing/test-service.test.ts
将测试流程改为基于 AttemptPlan 的多次尝试循环;引入 rawResponse、usage、streamInfo 等字段;增加重试/切换模板规则、响应解析与验证抽象(parseResponse/buildValidationDetails);新增/调整单元测试覆盖。
前端 API 测试按钮
src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx, tests/unit/settings/providers/api-test-button.test.tsx
移除基于 API 格式的多分支 UI,改为按提供商类型驱动的简化界面(仅模型输入/免责声明/测试按钮/结果);调整相关单元测试。
预设、模板与选择策略
src/lib/provider-testing/presets.ts, src/lib/provider-testing/presets.test.ts
重构 PRESETS 集合并新增字段(path/userAgent/extraHeaders/score/modelHints/urlHints);新增 PresetSelectionContextgetExecutionPresetCandidates,按得分与提示排序候选预设;修改 payload 克隆与模型覆盖行为。
默认请求、头部与 URL 构造
src/lib/provider-testing/utils/test-prompts.ts
更新各提供商默认请求体、headers、USER_AGENTS、BASE_HEADERS 与 API_ENDPOINTS;getTestHeaders/getTestUrl 签名改为支持 providerUrl、overrides 与 pathOverride;Claude 鉴权逻辑条件化;移除 Gemini ?alt=sse 添加。
类型与 fixtures(测试数据)
src/lib/provider-testing/types.ts, src/lib/provider-testing/data/*.json
扩展类型(如 CodexTestBodyGeminiTestBody 新字段及 thinkingConfig/role 等);新增多个 provider 测试 JSON fixture(Claude、Codex、OpenAI、Gemini 变体等)。
单元测试覆盖(actions)
tests/unit/actions/providers-api-test.test.ts, tests/unit/actions/providers-patch-contract.test.ts
新增对 testProviderUnified / testProviderGemini 的单元测试,验证 rawResponse 被透传;修正测试以使用动态常量 PROVIDER_RULE_LIMITS.MAX_TEXT_LENGTH
清理:请求过滤器引擎
src/lib/request-filter-engine.ts
移除私有别名访问器 globalFiltersproviderFilters,内部直接使用 globalGuardFilters/providerGuardFilters

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题准确描述了此 PR 的主要目的——改进提供商检测模板选择,通过引入自动模板排序和回退机制。
Description check ✅ Passed PR description clearly describes the refactoring of provider detection templates, addition of preset selection mechanism, Gemini endpoint changes, and UI simplification.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/provider-detect-templates

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added enhancement New feature or request area:provider area:UI area:core size/XL Extra Large PR (> 1000 lines) labels Apr 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 8, 2026

🧪 测试结果

测试类型 状态
代码质量
单元测试
集成测试
API 测试

总体结果: ✅ 所有测试通过

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the provider API testing system to be more automated and robust. It introduces a preset-based 'Attempt Plan' mechanism that automatically cycles through compatible request templates when initial tests fail with specific protocol-related errors. The user interface for API testing has been simplified by removing manual configuration for formats, timeouts, and payloads in favor of automated probing. Key improvements include the addition of comprehensive test presets for various providers, enhanced response parsing, and the inclusion of raw response data for debugging. Feedback focuses on optimizing deep cloning using structuredClone and addressing potential cumulative timeout issues when executing multiple sequential test attempts.

Comment thread src/lib/provider-testing/presets.ts Outdated
}

// Deep clone to avoid mutating the original
const payload = JSON.parse(JSON.stringify(preset.payload)) as Record<string, unknown>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using JSON.parse(JSON.stringify(...)) for deep cloning is less efficient and can lose certain data types (though likely not an issue for these JSON payloads). Since this project uses Bun, you should prefer the native structuredClone for a cleaner and more performant deep clone.

Suggested change
const payload = JSON.parse(JSON.stringify(preset.payload)) as Record<string, unknown>;
const payload = structuredClone(preset.payload) as Record<string, unknown>;

Comment on lines +250 to +256
for (const plan of plans) {
const result = await runSingleAttempt(config, plan, timeoutMs, slowThresholdMs);
if (result.success || !shouldRetryWithNextTemplate(result)) {
return result;
}
fallbackResult = result;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation executes multiple test plans sequentially, each using the full timeoutMs. If multiple plans fail due to timeout or other retriable errors, the total execution time could significantly exceed the intended timeout, potentially leading to server-side request termination in serverless environments (e.g., Vercel's function timeout). Consider calculating the remaining time for each attempt to ensure the total duration stays within bounds.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 46bde198c1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +61 to +63
const presets = config.preset
? [getPreset(config.preset)].filter((preset): preset is PresetConfig => Boolean(preset))
: getExecutionPresetCandidates({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject unknown preset IDs instead of falling back silently

When config.preset is provided but not found (for example, callers still sending legacy IDs like cx_base after this preset split), this code drops the preset and later falls back to generic defaults. That makes the probe run against a different protocol/path than requested and can return misleading pass/fail results instead of surfacing a clear configuration error to the caller.

Useful? React with 👍 / 👎.

config.successContains ??
preset.defaultSuccessContains ??
DEFAULT_SUCCESS_CONTAINS[config.providerType],
url: getTestUrl(config.providerUrl, config.providerType, config.model, undefined, preset.path),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use preset default model when building preset URLs

Preset execution builds the request URL from config.model only, so if the caller omits model, URL model placeholders are filled from global provider defaults rather than preset.defaultModel. This breaks model-specific presets such as gm_pro_basic: its body is selected, but the URL still targets the default Gemini model (gemini-2.5-flash), so fallback ordering cannot actually probe the intended model variant.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (9)
tests/unit/actions/providers-api-test.test.ts (1)

140-145: 建议优化类型断言的安全性

当前的类型断言 (result.data as { rawResponse?: string } | undefined)?.rawResponse 可以工作,但可以考虑使用更明确的类型守卫来提高可读性。

不过考虑到这是测试代码且断言逻辑正确,这是一个可选的改进。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/actions/providers-api-test.test.ts` around lines 140 - 145, The
test uses a broad type assertion on result.data to access rawResponse; replace
that assertion with a clearer type guard or explicit cast to improve readability
and safety: check result.data is defined and has a rawResponse property (e.g.,
via typeof or an inline type guard) before asserting its value in the expect
call referencing result and rawResponse in the test; keep the same assertion
semantics (expecting '{"message":"pong"}') but use the guard so the test avoids
the `(result.data as {...} | undefined)?` pattern.
src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx (2)

98-98: 建议移除未使用的 prop

enableMultiProviderTypes prop 通过 void 被显式忽略。如果此 prop 不再需要,建议从接口定义和组件参数中移除,以保持代码整洁。

如果这是临时状态(等待后续 PR 使用),可以暂时保留。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/`[locale]/settings/providers/_components/forms/api-test-button.tsx at
line 98, Remove the unused prop enableMultiProviderTypes: delete the `void
enableMultiProviderTypes;` no-op, remove enableMultiProviderTypes from the
component's props interface/type and from the component parameter list (the API
test button component in this file), and update any callers that pass this prop
to stop supplying it; if this prop is intentionally kept for a future PR, add a
comment explaining it is temporarily unused instead of the `void` usage.

190-202: 类型断言可以更明确

当前的 details 类型断言是内联定义的。考虑到 testProviderGemini 的返回类型已在 src/actions/providers.ts 中定义,可以考虑复用或导出该类型以减少重复。

不过作为组件内部的临时类型定义,这种方式也是可接受的。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/`[locale]/settings/providers/_components/forms/api-test-button.tsx
around lines 190 - 202, The inline type assertion for the variable `details`
should be replaced with a shared/declared type from the provider action to avoid
duplication: import or re-export the response/details type used by
`testProviderGemini` (defined in src/actions/providers.ts) and use that type for
`details` instead of the inline union; update the `details` declaration in
`api-test-button.tsx` to reference the shared type (e.g., ProviderTestResponse
or ProviderTestDetails) so the component reuses the canonical shape returned by
`testProviderGemini`.
src/lib/provider-testing/test-service.ts (1)

101-111: 考虑将重试 HTTP 状态码提取为常量。

魔法数字 [400, 404, 405, 415, 422] 的含义不够明确,建议提取为命名常量以提高可读性和可维护性。

建议的重构
+const RETRYABLE_HTTP_CODES = [400, 404, 405, 415, 422] as const;
+
 function shouldRetryWithNextTemplate(result: ProviderTestResult): boolean {
   if (result.status !== "red") {
     return false;
   }

-  if (result.httpStatusCode && [400, 404, 405, 415, 422].includes(result.httpStatusCode)) {
+  if (result.httpStatusCode && RETRYABLE_HTTP_CODES.includes(result.httpStatusCode)) {
     return true;
   }

   return ["client_error", "invalid_request", "content_mismatch"].includes(result.subStatus);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/test-service.ts` around lines 101 - 111, The magic
HTTP status codes in shouldRetryWithNextTemplate should be extracted to a
clearly named constant (e.g., RETRYABLE_HTTP_STATUS_CODES) to improve
readability and maintainability; refactor the array [400, 404, 405, 415, 422]
into that constant (top-level or next to related constants) and replace the
inline array check in shouldRetryWithNextTemplate(result: ProviderTestResult)
with a lookup against
RETRYABLE_HTTP_STATUS_CODES.includes(result.httpStatusCode), keeping the
existing logic and types unchanged.
src/lib/provider-testing/presets.ts (2)

198-210: **getPresetPayload 的 model 注入条件需要文档说明。**Gemini API 使用 URL 路径中的模型名称(如 models/gemini-2.5-flash:generateContent),请求体中不包含 model 字段。 这与当前实现一致。

当前条件逻辑 if (model && "model" in payload) 是正确的设计:

  • 对于 Claude/Codex/OpenAI:payload 包含 model 键,会被覆盖
  • 对于 Gemini:payload 不包含 model 键,模型通过 URL 路径传递

建议添加代码注释说明这一设计决策,以便后续维护者理解。

建议添加注释
 export function getPresetPayload(presetId: string, model?: string): Record<string, unknown> {
   const preset = PRESETS[presetId];
   if (!preset) {
     throw new Error(`Preset not found: ${presetId}`);
   }

   const payload = JSON.parse(JSON.stringify(preset.payload)) as Record<string, unknown>;
+  // Only override model if the preset payload uses a model field in the body.
+  // Gemini API passes the model via URL path, not request body.
   if (model && "model" in payload) {
     payload.model = model;
   }

   return payload;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/presets.ts` around lines 198 - 210, Add a concise
code comment in getPresetPayload explaining why the model is only injected when
payload already has a "model" key (i.e., the if (model && "model" in payload)
check): note that OpenAI/Claude/Codex include model in the request body and
should be overwritten, while Gemini supplies the model via the URL path (e.g.,
models/gemini-...:generateContent) and therefore has no model key in the
payload. Mention this preserves correct behavior across providers and prevents
mistakenly adding a body model for Gemini.

222-259: scorePreset 中存在 modelHints 与硬编码逻辑的重复评分。

对于 Codex 和 Gemini 提供商,modelHints 匹配已经在 227-229 行加了 50 分,但 240-256 行的硬编码逻辑又额外加了 40 分。这导致同一模型匹配可能被双重计算。

例如,cx_codex_basicmodelHints: ["codex"],当模型包含 "codex" 时:

  • 行 227-229:+50 分(modelHints 匹配)
  • 行 241-243:+40 分(硬编码逻辑)

这可能是有意为之以强调特定匹配,但建议添加注释说明设计意图,或考虑统一评分逻辑。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/presets.ts` around lines 222 - 259, The scorePreset
function double-counts model matches because modelHints adds +50 and the
provider-specific hardcoded branches add +40 for the same model (e.g., preset id
"cx_codex_basic"), so update scorePreset to avoid duplicate boosts: in function
scorePreset, detect when a modelHints match already applied
(preset.modelHints?.some(...) result) and skip or reduce the provider-specific
+40 boost for those presets (or alternatively consolidate into a single
provider-specific boost value) for providerType "codex" and
"gemini"/"gemini-cli"; ensure you reference the existing symbols (scorePreset,
PresetConfig, modelHints, providerType, preset.id like "cx_codex_basic" and
"gm_flash_basic") when making the change and add a brief comment documenting the
chosen behavior.
src/lib/provider-testing/utils/test-prompts.ts (3)

163-171: 非代理 Claude 请求同时设置 x-api-keyAuthorization header。

对于直接访问 Anthropic API 的情况(非代理),代码同时设置了两个认证 header。官方 Anthropic API 使用 x-api-key,而某些兼容层可能需要 Authorization。如果这是为了最大兼容性的有意设计,建议添加注释说明。

建议添加注释说明设计意图
+  // For direct Anthropic API access (non-proxy), set both headers for maximum compatibility.
+  // Official API uses x-api-key; some compatibility layers may expect Authorization.
   headers["x-api-key"] = apiKey;
   headers.Authorization = `Bearer ${apiKey}`;
   return headers;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/utils/test-prompts.ts` around lines 163 - 171, The
code sets both x-api-key and Authorization for non-proxy Anthropic requests (see
looksLikeAnthropicProxy check and the headers assignment), which is ambiguous;
either remove the redundant Authorization header for direct Anthropic API calls
and only set headers["x-api-key"] = apiKey, or if keeping both for compatibility
is intentional, add a clear comment above the non-proxy branch explaining that
both headers are set deliberately for maximum compatibility with different
Anthropic-compatible layers (and mention which clients expect which header).

136-150: looksLikeAnthropicProxy 正则表达式可能产生误报。

正则 /proxy|relay|gateway|router|worker|openrouter|api2d|oaipro|gpt/i 中的 gpt 可能匹配非代理 URL(如 chatgpt-official-api.example.com)。如果这是有意为之以覆盖更多代理服务,建议添加注释说明。

建议添加注释或收紧匹配
 function looksLikeAnthropicProxy(providerUrl?: string): boolean {
   if (!providerUrl) {
     return false;
   }

   try {
     const hostname = new URL(providerUrl).hostname.toLowerCase();
     if (hostname.endsWith("anthropic.com") || hostname.endsWith("claude.ai")) {
       return false;
     }
+    // Heuristic: match common proxy/relay service patterns in hostname.
+    // "gpt" included to catch services like "gptproxy" or "gpt-api" relays.
     return /proxy|relay|gateway|router|worker|openrouter|api2d|oaipro|gpt/i.test(hostname);
   } catch {
     return false;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/utils/test-prompts.ts` around lines 136 - 150, The
current looksLikeAnthropicProxy function can false-positive on hostnames because
the regex includes the token "gpt" which matches benign domains like
"chatgpt-official-api.example.com"; tighten the match by replacing the regex in
looksLikeAnthropicProxy to either remove "gpt" or require word boundaries/clear
separators for tokens (e.g., match tokens as whole words or require dot/hyphen
boundaries) and add a short comment above looksLikeAnthropicProxy explaining the
intent (or why "gpt" is included if deliberate) so future readers understand the
tradeoff.

248-249: _apiKey 参数未使用,建议移除或说明保留原因。

参数 _apiKey 带有下划线前缀表示有意未使用,但如果确实不需要,应考虑从函数签名中移除以保持 API 清洁。如果是为了保持向后兼容或未来使用,建议添加注释说明。

如果不再需要,建议移除参数
 export function getTestUrl(
   baseUrl: string,
   providerType: ProviderType,
   model?: string,
-  _apiKey?: string,
   pathOverride?: string
 ): string {

注意:需要同步更新 src/lib/provider-testing/test-service.ts 中的调用(line 97 的 undefined 参数可以移除)。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/utils/test-prompts.ts` around lines 248 - 249, The
parameter _apiKey?: string in the function that currently declares the
parameters "_apiKey?: string, pathOverride?: string" is unused—either remove
_apiKey from that function signature and delete the corresponding undefined
argument at all callers, or keep it but add a clear comment above the parameter
(e.g., "// intentionally unused for backward compatibility") to document why it
is present; ensure you update the caller that currently passes undefined so its
argument list matches the new signature if you remove the parameter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/`[locale]/settings/providers/_components/forms/api-test-button.tsx:
- Line 403: Remove the literal emoji from the disclaimer header and replace it
with the imported Lucide AlertTriangle icon component: update the JSX element
that renders <div className="font-medium mb-1">⚠️ {t("disclaimer.title")}</div>
to render the AlertTriangle icon (with appropriate size/aria-hidden or role
attributes) alongside the localized title returned by t("disclaimer.title"),
ensuring styling and spacing match the original layout.

---

Nitpick comments:
In `@src/app/`[locale]/settings/providers/_components/forms/api-test-button.tsx:
- Line 98: Remove the unused prop enableMultiProviderTypes: delete the `void
enableMultiProviderTypes;` no-op, remove enableMultiProviderTypes from the
component's props interface/type and from the component parameter list (the API
test button component in this file), and update any callers that pass this prop
to stop supplying it; if this prop is intentionally kept for a future PR, add a
comment explaining it is temporarily unused instead of the `void` usage.
- Around line 190-202: The inline type assertion for the variable `details`
should be replaced with a shared/declared type from the provider action to avoid
duplication: import or re-export the response/details type used by
`testProviderGemini` (defined in src/actions/providers.ts) and use that type for
`details` instead of the inline union; update the `details` declaration in
`api-test-button.tsx` to reference the shared type (e.g., ProviderTestResponse
or ProviderTestDetails) so the component reuses the canonical shape returned by
`testProviderGemini`.

In `@src/lib/provider-testing/presets.ts`:
- Around line 198-210: Add a concise code comment in getPresetPayload explaining
why the model is only injected when payload already has a "model" key (i.e., the
if (model && "model" in payload) check): note that OpenAI/Claude/Codex include
model in the request body and should be overwritten, while Gemini supplies the
model via the URL path (e.g., models/gemini-...:generateContent) and therefore
has no model key in the payload. Mention this preserves correct behavior across
providers and prevents mistakenly adding a body model for Gemini.
- Around line 222-259: The scorePreset function double-counts model matches
because modelHints adds +50 and the provider-specific hardcoded branches add +40
for the same model (e.g., preset id "cx_codex_basic"), so update scorePreset to
avoid duplicate boosts: in function scorePreset, detect when a modelHints match
already applied (preset.modelHints?.some(...) result) and skip or reduce the
provider-specific +40 boost for those presets (or alternatively consolidate into
a single provider-specific boost value) for providerType "codex" and
"gemini"/"gemini-cli"; ensure you reference the existing symbols (scorePreset,
PresetConfig, modelHints, providerType, preset.id like "cx_codex_basic" and
"gm_flash_basic") when making the change and add a brief comment documenting the
chosen behavior.

In `@src/lib/provider-testing/test-service.ts`:
- Around line 101-111: The magic HTTP status codes in
shouldRetryWithNextTemplate should be extracted to a clearly named constant
(e.g., RETRYABLE_HTTP_STATUS_CODES) to improve readability and maintainability;
refactor the array [400, 404, 405, 415, 422] into that constant (top-level or
next to related constants) and replace the inline array check in
shouldRetryWithNextTemplate(result: ProviderTestResult) with a lookup against
RETRYABLE_HTTP_STATUS_CODES.includes(result.httpStatusCode), keeping the
existing logic and types unchanged.

In `@src/lib/provider-testing/utils/test-prompts.ts`:
- Around line 163-171: The code sets both x-api-key and Authorization for
non-proxy Anthropic requests (see looksLikeAnthropicProxy check and the headers
assignment), which is ambiguous; either remove the redundant Authorization
header for direct Anthropic API calls and only set headers["x-api-key"] =
apiKey, or if keeping both for compatibility is intentional, add a clear comment
above the non-proxy branch explaining that both headers are set deliberately for
maximum compatibility with different Anthropic-compatible layers (and mention
which clients expect which header).
- Around line 136-150: The current looksLikeAnthropicProxy function can
false-positive on hostnames because the regex includes the token "gpt" which
matches benign domains like "chatgpt-official-api.example.com"; tighten the
match by replacing the regex in looksLikeAnthropicProxy to either remove "gpt"
or require word boundaries/clear separators for tokens (e.g., match tokens as
whole words or require dot/hyphen boundaries) and add a short comment above
looksLikeAnthropicProxy explaining the intent (or why "gpt" is included if
deliberate) so future readers understand the tradeoff.
- Around line 248-249: The parameter _apiKey?: string in the function that
currently declares the parameters "_apiKey?: string, pathOverride?: string" is
unused—either remove _apiKey from that function signature and delete the
corresponding undefined argument at all callers, or keep it but add a clear
comment above the parameter (e.g., "// intentionally unused for backward
compatibility") to document why it is present; ensure you update the caller that
currently passes undefined so its argument list matches the new signature if you
remove the parameter.

In `@tests/unit/actions/providers-api-test.test.ts`:
- Around line 140-145: The test uses a broad type assertion on result.data to
access rawResponse; replace that assertion with a clearer type guard or explicit
cast to improve readability and safety: check result.data is defined and has a
rawResponse property (e.g., via typeof or an inline type guard) before asserting
its value in the expect call referencing result and rawResponse in the test;
keep the same assertion semantics (expecting '{"message":"pong"}') but use the
guard so the test avoids the `(result.data as {...} | undefined)?` pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9c36cf31-9aca-4714-bf48-7d45610a3c2e

📥 Commits

Reviewing files that changed from the base of the PR and between c48ec7f and 46bde19.

📒 Files selected for processing (20)
  • src/actions/providers.ts
  • src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx
  • src/lib/provider-testing/data/cc_beta_cli.json
  • src/lib/provider-testing/data/cc_haiku_basic.json
  • src/lib/provider-testing/data/cx_codex_basic.json
  • src/lib/provider-testing/data/cx_gpt_basic.json
  • src/lib/provider-testing/data/gm_flash_basic.json
  • src/lib/provider-testing/data/gm_pro_basic.json
  • src/lib/provider-testing/data/oa_chat_basic.json
  • src/lib/provider-testing/data/oa_chat_stream.json
  • src/lib/provider-testing/presets.test.ts
  • src/lib/provider-testing/presets.ts
  • src/lib/provider-testing/test-service.test.ts
  • src/lib/provider-testing/test-service.ts
  • src/lib/provider-testing/types.ts
  • src/lib/provider-testing/utils/test-prompts.ts
  • src/lib/request-filter-engine.ts
  • tests/unit/actions/providers-api-test.test.ts
  • tests/unit/actions/providers-patch-contract.test.ts
  • tests/unit/settings/providers/api-test-button.test.tsx
💤 Files with no reviewable changes (1)
  • src/lib/request-filter-engine.ts

Comment thread src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx Outdated
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR introduces a provider detection template system with independent suites for Claude, Codex, OpenAI Compatible, and Gemini providers. It adds automatic template selection with scoring/fallback, parser-based response extraction, and full raw-response passthrough. The UI is simplified to require only model input. Overall the changes are well-scoped with good test coverage.

PR Size: XL

  • Lines changed: 1910 (1237 additions, 673 deletions)
  • Files changed: 20

Split Suggestion: This PR touches 20 files across the stack (UI, actions, new library module, data files, and tests). For future similar work, consider splitting into: (1) Core provider-testing library, (2) Data/preset files, (3) UI integration changes.

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 0 0 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 0 0 0
Simplification 0 0 0 1

High Priority Issues (Should Fix)

No high priority issues identified.

Medium/Low Priority Notes

  1. [LOW] Greptile Comment Verification - The stale JSDoc comment flagged by Greptile (rawResponse truncation) was verified against the actual implementation. The comment in types.ts:163-164 states "truncated to 5000 chars" which correctly matches the implementation in test-service.ts:169 (rawResponse: responseBody.slice(0, 5000)). The Greptile finding appears to be a false positive - the documentation is accurate.

Review Coverage

  • Logic and correctness - Clean. Three-tier validation logic is sound.
  • Security (OWASP Top 10) - Clean. No injection risks identified. API keys handled securely.
  • Error handling - Clean. Network errors classified properly; no silent failures.
  • Type safety - Clean. Proper TypeScript types throughout.
  • Documentation accuracy - Clean. Comments match implementation (verified).
  • Test coverage - Adequate. Unit tests for presets, test-service, and UI components.
  • Code clarity - Good. UI simplification reduces cognitive load.

Notable Implementation Details

Gemini API Changes (lines 4159-4263 in providers.ts):

  • Changed from streamGenerateContent?alt=sse to synchronous generateContent
  • Added proper request body structure with role: "user" and thinkingConfig
  • Correctly extracts modelVersion from response

Error Classification (test-service.ts):

  • Comprehensive network error classification (timeout, DNS, connection refused, SSL)
  • All error paths are logged and surfaced to users

UI Simplification (api-test-button.tsx):

  • Removed preset selector, custom payload, and success-contains fields
  • Model-only input simplifies UX while maintaining functionality

Automated review by Claude AI


let fallbackResult: ProviderTestResult | null = null;
for (const plan of plans) {
const result = await runSingleAttempt(config, plan, timeoutMs, slowThresholdMs);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] [TEST-MISSING-CRITICAL] 新增的模板回退逻辑缺少单测覆盖

Evidence: src/lib/provider-testing/test-service.ts:250

for (const plan of plans) {
  const result = await runSingleAttempt(config, plan, timeoutMs, slowThresholdMs);
  if (result.success || !shouldRetryWithNextTemplate(result)) {
    return result;
  }
  fallbackResult = result;
}

Why this is a problem: 这是本 PR 的核心新行为(协议不匹配时自动尝试下一个模板)。当前 src/lib/provider-testing/test-service.test.ts 只覆盖了解析与 rawResponse 透传,没有覆盖“首次请求返回 400/invalid_request(或 content_mismatch)后会切换到下一个 preset 并最终成功”的路径。仓库规范要求:CLAUDE.mdTest Coverage - "All new features must have unit test coverage of at least 80%"。

Suggested fix (add a unit test that forces a retry to the next preset):

// src/lib/provider-testing/test-service.test.ts

test("openai-compatible 在 400/invalid_request 时应自动回退到下一个模板", async () => {
  const errorBody = JSON.stringify({ error: { message: "bad request" } });
  const okBody = JSON.stringify({
    model: "gpt-4.1-mini",
    choices: [{ message: { role: "assistant", content: "pong" } }],
  });

  fetchMock
    .mockResolvedValueOnce({
      status: 400,
      statusText: "Bad Request",
      headers: new Headers({ "content-type": "application/json" }),
      text: async () => errorBody,
    } as Response)
    .mockResolvedValueOnce({
      status: 200,
      statusText: "OK",
      headers: new Headers({ "content-type": "application/json" }),
      text: async () => okBody,
    } as Response);

  const result = await executeProviderTest({
    providerUrl: "https://api.example.com",
    apiKey: "sk-test",
    providerType: "openai-compatible",
    model: "gpt-4.1-mini",
  });

  expect(fetchMock).toHaveBeenCalledTimes(2);
  expect(result.success).toBe(true);
  expect(result.content).toBe("pong");

  const secondBody = JSON.parse(fetchMock.mock.calls[1]?.[1]?.body as string) as { stream?: boolean };
  expect(secondBody.stream).toBe(true);
});

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR is XL and combines template/preset refactors, new preset ordering + fallback behavior, response parsing changes, and settings UI updates, which increases regression risk.
The main blocker is missing unit coverage for the new multi-template fallback path.

PR Size: XL

  • Lines changed: 1910
  • Files changed: 20
  • Split suggestions:
    • Split A: src/lib/provider-testing/data/* + src/lib/provider-testing/presets.ts (template suite split + scoring)
    • Split B: src/lib/provider-testing/test-service.ts + parsers/raw-response passthrough behavior
    • Split C: settings UI + actions wiring (src/app/[locale]/settings/providers/... + src/actions/providers.ts)

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 0 0 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 1 0 0
Simplification 0 0 0 0

Critical Issues (Must Fix)

None.

High Priority Issues (Should Fix)

  • [TEST-MISSING-CRITICAL] src/lib/provider-testing/test-service.ts:250 New retry/fallback loop is core behavior but has no unit test asserting fallback to the next preset on retryable failures (confidence: 90/100).

Review Coverage

  • Logic and correctness
  • Security (OWASP Top 10)
  • Error handling
  • Type safety
  • Documentation accuracy
  • Test coverage
  • Code clarity

Automated review by Codex AI

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Applied PR size label: size/XL (1910 lines changed, 20 files).
  • Posted 1 inline review comment: src/lib/provider-testing/test-service.ts:250 ([HIGH] [TEST-MISSING-CRITICAL]) — new multi-template retry/fallback loop lacks a unit test that proves fallback to the next preset on retryable failures (comment id 3053450307).
  • Submitted the required review summary via gh pr review --comment.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/actions/providers.ts (1)

4163-4210: ⚠️ Potential issue | 🟡 Minor

非流式测试端点与生产流式检测逻辑的设计差异已被记录

testProviderGemini 确实使用 :generateContent 进行同步请求,而 src/app/v1/_lib/proxy/forwarder.ts 的代理转发器通过检测 :streamGenerateContentalt=sse 或请求体中的 stream 字段来判断流式请求。但函数的 JSDoc 已明确标注用途为"连通性"测试,说明这是有意设计。

建议在第一次请求尝试前的注释中更明确地说明为何选择非流式端点(例如:"使用非流式端点验证基本连通性,流式功能由其他测试覆盖"),以便未来维护者快速理解设计意图。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/actions/providers.ts` around lines 4163 - 4210, Add a brief clarifying
comment in testProviderGemini right before the first request attempt explaining
that this test intentionally uses the non-stream ":generateContent" endpoint to
validate basic connectivity (not streaming behavior), and note that streaming is
handled elsewhere (see proxy/forwarder.ts which detects
":streamGenerateContent", "alt=sse" or a request body "stream" field) so
maintainers understand the design intent.
🧹 Nitpick comments (2)
src/lib/provider-testing/test-service.test.ts (1)

12-19: global.fetch 的恢复方式可能不正确

直接赋值 global.fetch = fetchMock 后,vi.unstubAllGlobals() 无法正确恢复原始的 fetch,因为它只能恢复通过 vi.stubGlobal() 创建的 stub。建议使用 vi.stubGlobal 来确保测试隔离。

建议的修复
 describe("executeProviderTest", () => {
   beforeEach(() => {
     vi.clearAllMocks();
-    global.fetch = fetchMock as typeof fetch;
+    vi.stubGlobal("fetch", fetchMock);
   });

   afterEach(() => {
     vi.unstubAllGlobals();
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/test-service.test.ts` around lines 12 - 19, The
tests assign global.fetch directly to fetchMock which prevents
vi.unstubAllGlobals() from restoring the original; instead, in the beforeEach
replace global.fetch using vi.stubGlobal('fetch', fetchMock) and keep
vi.clearAllMocks() as needed, and in afterEach call vi.unstubAllGlobals() (and
optionally vi.restoreAllMocks()) so the original global.fetch is correctly
restored; update the beforeEach/afterEach that reference global.fetch,
fetchMock, vi.clearAllMocks, vi.stubGlobal, and vi.unstubAllGlobals accordingly.
src/lib/provider-testing/presets.ts (1)

223-265: 评分逻辑中存在重复检查

scorePreset 函数在 Line 242-248 和 251-262 的 provider-specific boost 逻辑中,检查条件 !matchedModelHint && model.includes("codex") 等实际上与 modelHints 数组中的定义重复(例如 cx_codex_basicmodelHints 已包含 ["codex"])。

虽然 !matchedModelHint 条件防止了重复加分,但这种设计使得 modelHints 和 provider-specific boost 之间的关系不够清晰。建议在注释中说明这种设计意图,或者考虑统一通过 modelHints 机制处理。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/presets.ts` around lines 223 - 265, The scorePreset
function currently applies provider-specific boosts (using context.providerType
and checks like !matchedModelHint && model.includes("codex") for preset.id
values) which duplicates signals already present in preset.modelHints; either
document the intent clearly or unify the logic by moving those boosts into the
preset metadata and simplifying scorePreset. Fix by one of two options: (A) add
a clear comment above the provider-specific boost block explaining that boosts
are only applied when modelHints did not match (reference matchedModelHint,
context.providerType, and preset.id) so the signals aren't double-counted; or
(B) refactor so preset definitions carry explicit boost metadata (e.g., add a
numeric boost field per preset or include the provider tokens in modelHints) and
remove the hard-coded provider-specific branches in scorePreset, making the
function only sum preset.score, modelHints matches, urlHints matches, and any
preset-defined boost (reference function scorePreset, variable matchedModelHint,
and preset.modelHints/preset.id). Ensure behavior remains identical during
migration and add/update tests for presets that previously relied on the
codex/gemini branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/provider-testing/test-service.ts`:
- Around line 197-206: The validation currently concatenates parsed.content and
responseBody which can cause false positives when plan.successContains matches a
JSON field name; update the logic around parsed (from parseResponse) so
validationInput is set to parsed.content || responseBody (i.e., prefer the
parsed text when present) or alternatively perform two separate evaluations (one
against parsed.content and one against responseBody) before calling
evaluateContentValidation; adjust the code that builds validationInput and the
evaluateContentValidation(...) invocation (and related variables like
parsed.content, responseBody, validationInput, plan.successContains) accordingly
to ensure only the actual parsed text is used for the primary content check.

---

Outside diff comments:
In `@src/actions/providers.ts`:
- Around line 4163-4210: Add a brief clarifying comment in testProviderGemini
right before the first request attempt explaining that this test intentionally
uses the non-stream ":generateContent" endpoint to validate basic connectivity
(not streaming behavior), and note that streaming is handled elsewhere (see
proxy/forwarder.ts which detects ":streamGenerateContent", "alt=sse" or a
request body "stream" field) so maintainers understand the design intent.

---

Nitpick comments:
In `@src/lib/provider-testing/presets.ts`:
- Around line 223-265: The scorePreset function currently applies
provider-specific boosts (using context.providerType and checks like
!matchedModelHint && model.includes("codex") for preset.id values) which
duplicates signals already present in preset.modelHints; either document the
intent clearly or unify the logic by moving those boosts into the preset
metadata and simplifying scorePreset. Fix by one of two options: (A) add a clear
comment above the provider-specific boost block explaining that boosts are only
applied when modelHints did not match (reference matchedModelHint,
context.providerType, and preset.id) so the signals aren't double-counted; or
(B) refactor so preset definitions carry explicit boost metadata (e.g., add a
numeric boost field per preset or include the provider tokens in modelHints) and
remove the hard-coded provider-specific branches in scorePreset, making the
function only sum preset.score, modelHints matches, urlHints matches, and any
preset-defined boost (reference function scorePreset, variable matchedModelHint,
and preset.modelHints/preset.id). Ensure behavior remains identical during
migration and add/update tests for presets that previously relied on the
codex/gemini branches.

In `@src/lib/provider-testing/test-service.test.ts`:
- Around line 12-19: The tests assign global.fetch directly to fetchMock which
prevents vi.unstubAllGlobals() from restoring the original; instead, in the
beforeEach replace global.fetch using vi.stubGlobal('fetch', fetchMock) and keep
vi.clearAllMocks() as needed, and in afterEach call vi.unstubAllGlobals() (and
optionally vi.restoreAllMocks()) so the original global.fetch is correctly
restored; update the beforeEach/afterEach that reference global.fetch,
fetchMock, vi.clearAllMocks, vi.stubGlobal, and vi.unstubAllGlobals accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d68516f4-8a35-4249-9533-d81f7bc86006

📥 Commits

Reviewing files that changed from the base of the PR and between 46bde19 and 3744824.

📒 Files selected for processing (6)
  • src/actions/providers.ts
  • src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx
  • src/lib/provider-testing/presets.ts
  • src/lib/provider-testing/test-service.test.ts
  • src/lib/provider-testing/test-service.ts
  • src/lib/provider-testing/utils/test-prompts.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/app/[locale]/settings/providers/_components/forms/api-test-button.tsx

Comment thread src/lib/provider-testing/test-service.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3744824b7f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

httpPassed:
responseStatus !== undefined ? responseStatus >= 200 && responseStatus < 300 : false,
httpStatusCode: responseStatus,
latencyPassed: latencyMs <= slowThresholdMs,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Gate latency pass on receiving an HTTP response

buildValidationDetails currently sets latencyPassed from elapsed time alone, but runSingleAttempt calls it with responseStatus as undefined on network exceptions (timeout/DNS/connection errors). That means fast failures can be reported as passing the latency tier even when no response was received, which produces misleading validation output in provider test results. Please make latency pass contingent on having a real HTTP response (or otherwise force it to false on error paths).

Useful? React with 👍 / 👎.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 8, 2026

🧪 测试结果

测试类型 状态
代码质量
单元测试
集成测试
API 测试

总体结果: ✅ 所有测试通过

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/provider-testing/test-service.ts`:
- Around line 269-279: The loop currently enforces a per-attempt minimum of
1000ms which can exceed the configured total timeout; before calling
runSingleAttempt in the plans loop, compute remaining = deadline - Date.now()
and if remaining <= 0 (or remaining < minimumAllowedMs) stop retrying and return
the current fallbackResult (or a timed-out ProviderTestResult), otherwise pass
remainingTimeoutMs = Math.min(remaining, configuredMinimumButNotBeyondRemaining)
(or simply remaining) to runSingleAttempt; update the loop around
runSingleAttempt/remainingTimeoutMs and the retry logic (functions/vars:
deadline, timeoutMs, remainingTimeoutMs, runSingleAttempt,
shouldRetryWithNextTemplate, fallbackResult) so total elapsed time never exceeds
timeoutMs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d0265ffc-f398-4595-bb5f-9c8f40c5f365

📥 Commits

Reviewing files that changed from the base of the PR and between 3744824 and 06e5a63.

📒 Files selected for processing (2)
  • src/lib/provider-testing/test-service.test.ts
  • src/lib/provider-testing/test-service.ts

Comment on lines +269 to +279
const deadline = Date.now() + timeoutMs;

let fallbackResult: ProviderTestResult | null = null;
for (const plan of plans) {
const remainingTimeoutMs = Math.max(1000, deadline - Date.now());
const result = await runSingleAttempt(config, plan, remainingTimeoutMs, slowThresholdMs);
if (result.success || !shouldRetryWithNextTemplate(result)) {
return result;
}
fallbackResult = result;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

超时管理的边界情况值得注意

Math.max(1000, deadline - Date.now()) 确保每次尝试至少有 1000ms,但这可能导致总执行时间超过配置的 timeoutMs。例如:

  • timeoutMs = 5000
  • 首次尝试耗时 4500ms 后返回 400
  • 第二次尝试仍会获得 1000ms(而非 500ms)
  • 总执行时间可达 5500ms

如果严格遵守总超时时间是需求,建议在循环开始时检查剩余时间是否足够:

 for (const plan of plans) {
+  if (Date.now() >= deadline) {
+    break;
+  }
   const remainingTimeoutMs = Math.max(1000, deadline - Date.now());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/provider-testing/test-service.ts` around lines 269 - 279, The loop
currently enforces a per-attempt minimum of 1000ms which can exceed the
configured total timeout; before calling runSingleAttempt in the plans loop,
compute remaining = deadline - Date.now() and if remaining <= 0 (or remaining <
minimumAllowedMs) stop retrying and return the current fallbackResult (or a
timed-out ProviderTestResult), otherwise pass remainingTimeoutMs =
Math.min(remaining, configuredMinimumButNotBeyondRemaining) (or simply
remaining) to runSingleAttempt; update the loop around
runSingleAttempt/remainingTimeoutMs and the retry logic (functions/vars:
deadline, timeoutMs, remainingTimeoutMs, runSingleAttempt,
shouldRetryWithNextTemplate, fallbackResult) so total elapsed time never exceeds
timeoutMs.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 06e5a632be

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


let fallbackResult: ProviderTestResult | null = null;
for (const plan of plans) {
const remainingTimeoutMs = Math.max(1000, deadline - Date.now());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce global timeout budget across template retries

The retry loop gives each fallback attempt at least 1000ms via Math.max(1000, deadline - Date.now()), so the total wall-clock time can exceed the caller’s configured timeoutMs. In practice, if the first probe consumes most of the budget and returns a retryable 4xx, later templates still run for an extra second each, which breaks timeout expectations and can keep the settings UI blocked longer than requested.

Useful? React with 👍 / 👎.

Comment on lines +67 to +71
const preset = getPreset(config.preset);
if (!preset) {
throw new Error(`Preset not found: ${config.preset}`);
}
presets = [preset];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject presets incompatible with provider type

Explicit preset selection only checks existence, not whether the preset belongs to the current providerType. If an API caller sends a mismatched preset (for example a Gemini preset with openai-compatible), this path still uses that preset’s payload/path and can even send unresolved placeholders in the URL, yielding misleading probe failures instead of a clear input validation error. Please validate compatibility before accepting config.preset.

Useful? React with 👍 / 👎.

Comment on lines +189 to +191
const details = (isSuccess ? response.data.details : undefined) as
| ProviderApiTestSuccessDetails
| undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve Gemini failure details in UI result mapping

Failure details from testProviderGemini are dropped because details is only populated when isSuccess is true. That means red Gemini test results lose backend-provided rawResponse/error context, so the details dialog cannot show the same diagnostic body that unified/non-Gemini paths now expose. This makes Gemini failures significantly harder to debug even though the server already returns the data.

Useful? React with 👍 / 👎.

@ding113 ding113 merged commit 0b5683e into dev Apr 9, 2026
16 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Claude Code Hub Roadmap Apr 9, 2026
@github-actions github-actions Bot mentioned this pull request Apr 13, 2026
13 tasks
ding113 added a commit that referenced this pull request Apr 13, 2026
* fix: relax provider routing rule limits

* feat: improve provider detection template selection (#1003)

* feat: improve provider detection templates

* fix: address provider detection review feedback

* chore: format code (feat-provider-detect-templates-02e4d1f)

* test: cover provider detection fallback retries

* chore: format code (feat-provider-detect-templates-4dc1701)

* fix: tighten provider detection validation

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: honor disabled error rules during queued reloads (#1006)

* fix: honor disabled error rules during queued reloads

* fix: propagate matched rule metadata in hedge errors

* fix: avoid hot retries on error rule reload failure

* fix: avoid redundant error-rule reload passes

* fix: await queued error rule reload completion

* fix: prevent STREAM_PROCESSING_ERROR from premature agent pool eviction (#1009)

* fix: prevent STREAM_PROCESSING_ERROR from premature agent pool eviction

The agent pool cleanup timer was destroying HTTP agents after 5min TTL
with no awareness of in-flight streaming requests, causing
ClientDestroyedError and ERR_HTTP2_STREAM_ERROR for long-running
responses. These errors were also misclassified because isTransportError
did not recognize them.

Three fixes:
1. Add UND_ERR_DESTROYED, UND_ERR_CLOSED, and HTTP/2 errors to
   isTransportError() so they classify as STREAM_UPSTREAM_ABORTED
2. Add reference counting (activeRequests) to AgentPool to prevent
   eviction of agents with in-flight requests (30min hard upper bound)
3. Wire up releaseAgent lifecycle in forwarder/response-handler to
   decrement refcount when streams complete

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden agent pool refcount against cross-generation misattribution

Addresses bot review feedback on #1009 covering four root causes that
still let STREAM_PROCESSING_ERROR leak through under concurrency:

- Track dispatcher generation ids so releaseAgent can ignore stale
  releases from old dispatchers after markUnhealthy / 30-min hard
  expire / LRU eviction rebuilds the same cacheKey
- Increment activeRequests for pending-creation waiters, not just the
  creator, so concurrent bursts don't prematurely evict the shared
  agent once the first request completes
- Move agent acquisition inside forwarder's fetch try block so any
  exception between acquire and fetch still releases the refcount
- Clear proxyConfig after a successful proxy-to-direct fallback to
  prevent session.releaseAgent double-decrementing the shared agent
- Add ERR_HTTP2_STREAM_ERROR to transport error codes so wrapped
  HTTP/2 errors on the cause chain are detected by isTransportError

Adds regression coverage for concurrent pending-creation waiter
counting (Promise.all with blocked createAgent), stale dispatcher
generation release safety, and HTTP/2 cause-chain detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: respect case-sensitive provider model ids (#1011)

* fix: respect case-sensitive provider model ids

* fix: keep model picker ids case-sensitive

* chore: update package dependencies to use major version ranges

* fix: address code review findings from PR #1014

- Use nullish coalescing (??) instead of logical OR (||) for parsed.content
  in test-service.ts to preserve empty string as valid content
- Wrap parseResponse() in try-catch to prevent parse exceptions from
  masking HTTP status codes as network_error
- Add releaseSessionAgent() to early-return branches in response-handler.ts
  to prevent agent pool reference count leaks
- Fix error-rule-detector reload() to always queue a fresh pass when
  called during an active reload, preventing stale snapshot reads
- Remove pre-clearing of isInitialized in event handler to avoid
  retry storms when database is unavailable
- Use ICU {max, number} format in i18n messages for locale-aware
  number formatting of rule limits across all 5 languages
- Remove emoji from test comments and fix pool cleanup in finally block

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: format code (dev-56698a2)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core area:provider area:UI enhancement New feature or request size/XL Extra Large PR (> 1000 lines)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

1 participant