Skip to content

feat(web-ui): proxy stream and usage cost accuracy bundle#155

Merged
ymkiux merged 10 commits into
mainfrom
fix/proxy-stream-resilience
May 11, 2026
Merged

feat(web-ui): proxy stream and usage cost accuracy bundle#155
ymkiux merged 10 commits into
mainfrom
fix/proxy-stream-resilience

Conversation

@ymkiux
Copy link
Copy Markdown
Collaborator

@ymkiux ymkiux commented May 9, 2026

Summary

  • Add retry logic for transient upstream errors in proxy SSE stream
  • Graceful SSE abort handling on client disconnect
  • Add cost estimation for claude sessions
  • Add -- separator to npm start share command prefix
  • Tighten preset button spacing and remove redundant custom config button
  • Add clone button to codex and claude provider cards (reuses add modal, zero backend changes)
  • Resolve merge conflicts with main
  • Forward upstream reasoning_content as output_text delta so reasoning models stream visible tokens through the openai bridge
  • Forward Anthropic cache_read_input_tokens and cache_creation_input_tokens through session usage parsers (codex, claude, codebuddy and index-entry paths)
  • Add Anthropic cache_write rates to the public catalog so cache_creation tokens are billed instead of being silently free
  • Calibrate Anthropic 4.x cache_read rates to the published 10%-off discount (opus 1.5, sonnet 0.3, haiku 0.08 per Mtok)
  • Split reasoning_output_tokens out of outputTokens in the cost formula so reasoning tokens are not double-billed against the visible completion

Tests

  • CI passed (18, 22)
  • Clone button verified: pre-fills url/key/useTransform for codex, apiKey/baseUrl/model for claude, user fills new name only
  • Existing add/edit/delete flows unchanged
  • npm test green (unit 531 + bridge 11 + e2e 30+ suites, exit 0)
  • telepub DeepSeek-V4-pro live stream verified: 28 output_text.delta events vs 3 before fix on a single "hi" prompt; reasoning tokens now visible to client end-to-end
  • New unit tests cover Anthropic cache_read/cache_creation field aliasing, cache_write rate participation, and reasoning split from visible output

Summary by CodeRabbit

  • New Features

    • Clone Claude configs and providers with pre-filled fields; new Telepub Codex provider preset; Claude/Codex UI clone actions and i18n entries.
  • Improvements

    • Retry handling for transient upstream errors and SSE keepalive for streaming; HTTP/HTTPS connection pooling tuned; share commands now include -- after npm start; cost estimation expanded to account for cache-creation input tokens and additional model pricing/fallbacks.
  • Tests

    • Added/updated unit tests for streaming abort/retry, reasoning-content forwarding, share-command formatting, and usage parsing.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 9, 2026

📝 Walkthrough

Walkthrough

This PR adds keep-alive connection pooling, transient-network retry for proxy/bridge requests, SSE heartbeat and safe-failure handling, forwards upstream reasoning_content into Responses SSE deltas, adds provider/Claude clone UI and i18n, updates CLI share prefix to npm start --, and extends session usage/pricing (including cacheCreationInputTokens) and cost estimation.

Changes

Network Resilience, SSE, Cloning, and Session Features

Layer / File(s) Summary
HTTP/HTTPS keep-alive pooling
cli.js
Configures keep-alive agents with keepAliveMsecs: 1000 and maxFreeSockets: 4.
Transient error detection & retry helper
cli/builtin-proxy.js, cli/openai-bridge.js
Adds transient-network classification and retryTransientRequest with fixed inter-attempt delays.
OpenAI bridge wiring & SSE delta accumulation
cli/openai-bridge.js
Applies retry wrapper to multiple upstream paths and emits response.output_text.delta combining delta.reasoning_content and delta.content.
Builtin-proxy SSE lifecycle & retry wiring
cli/builtin-proxy.js
Adds SSE keepalive heartbeat, stream-state tracking, safe failure helpers that emit response.failed + [DONE], abort handlers, and wraps proxy/fallback calls with retry logic.
Proxy & network resilience tests
tests/unit/builtin-proxy-responses-shim.test.mjs
Adds tests asserting SSE abort emits response.failed + [DONE] and transient upstream resets are retried and recover.
CLI session usage: cacheCreationInputTokens
cli.js, session parsers
Parses and propagates cache_creation_input_tokens / cacheCreationInputTokens in usage readers and session summary parsers (Codex/Claude/CodeBuddy) and list hydration.
Session usage backend tests
tests/unit/session-usage-backend.test.mjs
Adds tests validating Anthropic cache read/creation token mapping and totalTokens fallback behavior.
Web UI pricing & cost estimation
web-ui/modules/app.computed.session.mjs
Extends known model pricing, adds stripped-model fallback, makes cost estimation unconditional, and incorporates cacheCreationInputTokens into estimates.
CLI share command prefix & tests
web-ui/modules/app.methods.session-actions.mjs, tests/unit/*
getShareCommandPrefixInvocation() now returns npm start -- for non-codexmate; corresponding tests updated.
Provider & Claude clone methods
web-ui/modules/app.methods.claude-config.mjs, web-ui/modules/app.methods.providers.mjs
Adds openCloneClaudeConfigModal(name, config) and openCloneProviderModal(provider) to prefill add/clone modals.
Clone UI templates & i18n
web-ui/partials/index/*, web-ui/modules/i18n.dict.mjs
Removes Claude customConfig preset button, adds clone actions to provider/Claude cards, renders Codex preset buttons that prefill provider add modal, and adds clone-related i18n keys.
Web UI small changes
web-ui/app.js, modals
Clears initial Claude baseUrl/model defaults, changes modal reset behavior, and removes an add-provider hint element.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant BuiltinProxy
  participant OpenAIBridge
  participant Upstream
  Client->>BuiltinProxy: /v1/responses request (stream or JSON)
  BuiltinProxy->>OpenAIBridge: proxy call (wrapped in retryTransientRequest)
  OpenAIBridge->>Upstream: HTTP request or SSE connection
  Upstream-->>OpenAIBridge: SSE deltas (reasoning_content, content) or transient error
  OpenAIBridge->>OpenAIBridge: on transient error -> retryTransientRequest (delay + retry)
  OpenAIBridge-->>BuiltinProxy: proxied JSON or Responses-style SSE (includes output_text deltas)
  BuiltinProxy-->>Client: SSE keepalive, response.output_text.delta, or response.failed + [DONE]
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • awsl233777
  • SurviveM

Poem

🐰 I tap the stream with patient paws,
retries hum softly like tiny laws,
clone a config, ready to share,
keepalive whispers through open air,
deltas stitch reasoning into the fare.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately summarizes the primary changes: proxy stream resilience, cost accuracy improvements, and UI polish bundled together.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/proxy-stream-resilience

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
cli/openai-bridge.js (1)

719-753: ⚡ Quick win

Consolidate the transient retry helper with cli/builtin-proxy.js.

isTransientNetworkError, TRANSIENT_RETRY_DELAYS_MS, and retryTransientRequest here are byte-for-byte identical to the helpers added in cli/builtin-proxy.js (lines 130–164). Two copies of a network-resilience policy will drift the moment a new error code is added on one side. Consider extracting to something like lib/cli-transient-retry.js and importing from both files; nothing in the helper depends on local closures.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/openai-bridge.js` around lines 719 - 753, The transient retry helpers
(isTransientNetworkError, TRANSIENT_RETRY_DELAYS_MS, retryTransientRequest) are
duplicated; extract them into a shared module and import from both places:
create a new module (e.g., cliTransientRetry) that exports
isTransientNetworkError, TRANSIENT_RETRY_DELAYS_MS, and retryTransientRequest
(preserve current behavior and signatures, including executor argument and
delays), replace the local definitions in both files with imports of those
exported symbols, and run tests/lint to ensure no remaining duplicates or broken
references.
cli/builtin-proxy.js (1)

130-139: ⚡ Quick win

isTransientNetworkError does not classify local request timeouts as transient.

proxyRequestJson and streamChatCompletionsAsResponsesSse both finish a req.setTimeout firing with { ok: false, error: 'timeout' } (lines 211-212 and 1005-1008). The new matcher only looks for ETIMEDOUT, so a bare 'timeout' slips through and retryTransientRequest will not retry it. This is an inconsistency with the older shouldFallbackFromUpstreamResponsesFailure (lines 121-128) which already treats /timeout/i as recoverable, and timeouts are arguably the most common "connect-phase" failure the PR aims to cover. Consider extending the regex (or the timeout error string) so the retry policy and the fallback heuristic agree.

🔁 Suggested matcher tweak
     function isTransientNetworkError(error) {
         const text = String(error || '').trim();
         if (!text) return false;
         if (/socket hang up/i.test(text)) return true;
-        if (/ECONNRESET|ECONNREFUSED|EPIPE|EPROTO|ETIMEDOUT/i.test(text)) return true;
+        if (/ECONNRESET|ECONNREFUSED|EPIPE|EPROTO|ETIMEDOUT|\btimeout\b/i.test(text)) return true;
         if (/EAI_AGAIN/i.test(text)) return true;
         if (/UND_ERR_SOCKET/i.test(text)) return true;
         if (/disconnected before|secure tls|tls handshake/i.test(text)) return true;
         return false;
     }

(If you keep the helpers duplicated, mirror this in cli/openai-bridge.js.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 130 - 139, isTransientNetworkError
currently misses bare "timeout" errors so retries aren't triggered; update
isTransientNetworkError to treat plain timeout strings as transient (e.g.,
extend the existing regex checks to include /timeout/i or explicitly check for
'timeout') so it matches the behavior of
shouldFallbackFromUpstreamResponsesFailure and aligns with proxyRequestJson and
streamChatCompletionsAsResponsesSse which emit "{ ok: false, error: 'timeout'
}"; ensure the same change is mirrored in the duplicate helper in
cli/openai-bridge.js if present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cli/builtin-proxy.js`:
- Around line 130-139: isTransientNetworkError currently misses bare "timeout"
errors so retries aren't triggered; update isTransientNetworkError to treat
plain timeout strings as transient (e.g., extend the existing regex checks to
include /timeout/i or explicitly check for 'timeout') so it matches the behavior
of shouldFallbackFromUpstreamResponsesFailure and aligns with proxyRequestJson
and streamChatCompletionsAsResponsesSse which emit "{ ok: false, error:
'timeout' }"; ensure the same change is mirrored in the duplicate helper in
cli/openai-bridge.js if present.

In `@cli/openai-bridge.js`:
- Around line 719-753: The transient retry helpers (isTransientNetworkError,
TRANSIENT_RETRY_DELAYS_MS, retryTransientRequest) are duplicated; extract them
into a shared module and import from both places: create a new module (e.g.,
cliTransientRetry) that exports isTransientNetworkError,
TRANSIENT_RETRY_DELAYS_MS, and retryTransientRequest (preserve current behavior
and signatures, including executor argument and delays), replace the local
definitions in both files with imports of those exported symbols, and run
tests/lint to ensure no remaining duplicates or broken references.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b59e2b64-67bb-4e9c-9882-878cf63f054e

📥 Commits

Reviewing files that changed from the base of the PR and between 3e1deb0 and 3406025.

📒 Files selected for processing (4)
  • cli.js
  • cli/builtin-proxy.js
  • cli/openai-bridge.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
📜 Review details
🔇 Additional comments (2)
cli.js (1)

307-316: Keep-alive pool tuning looks good.

Line 307 through Line 316 are consistent with the resilience objective: bounded free-socket pools plus short keepalive probing should reduce half-dead socket reuse without changing call-site behavior.

tests/unit/builtin-proxy-responses-shim.test.mjs (1)

558-657: Tests cover the new SSE failure and retry paths well.

Both tests exercise the contract described in the PR — response.failed + [DONE] on a mid-stream abort, and a successful retry after a transient connection reset — and they assert on the proxy's external behavior (status, headers, SSE events, retry count) rather than on internal state, so they'll stay stable across implementation tweaks.

ymkiux added 2 commits May 11, 2026 00:55
…m config button

Co-Authored-By: ymkiux <ymkiux@users.noreply.github.com>
npm does not forward arguments after npm start to the underlying
script. The generated share commands used npm start add ... which
npm interprets as its own subcommand. Adding -- makes npm pass
the remaining arguments through to node cli.js.

Co-Authored-By: ymkiux <ymkiux@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@web-ui/partials/index/panel-config-codex.html`:
- Around line 52-56: The click handler directly mutates parts of newProvider and
can leak prior modal state; instead add a dedicated method (e.g.,
openAddPresetWithTemplate or resetAndApplyTemplate) that first resets the
draft/newProvider to a clean default object, then copies the template fields
(tpl.name, tpl.url, tpl.model into newProvider._suggestedModel, tpl.useTransform
into newProvider.useTransform, etc.), and finally sets showAddModal = true;
update the `@click` to call this new method so the form is always initialized
before applying tpl values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a32fd1a2-e731-4c81-94a7-8cfb54128008

📥 Commits

Reviewing files that changed from the base of the PR and between 3406025 and e04f3fa.

📒 Files selected for processing (5)
  • tests/unit/provider-share-command.test.mjs
  • tests/unit/web-ui-behavior-parity.test.mjs
  • web-ui/modules/app.methods.session-actions.mjs
  • web-ui/partials/index/panel-config-claude.html
  • web-ui/partials/index/panel-config-codex.html
📜 Review details
🔇 Additional comments (4)
web-ui/modules/app.methods.session-actions.mjs (1)

257-257: Good fix: npm command forwarding is now explicit.

Using npm start -- here is the correct invocation boundary for downstream CLI args.

tests/unit/provider-share-command.test.mjs (1)

197-197: Tests are correctly realigned with the new CLI prefix behavior.

These expectations now consistently validate npm start -- ... across provider and Claude share command builders.

Also applies to: 211-211, 256-256

tests/unit/web-ui-behavior-parity.test.mjs (1)

1193-1193: Parity assertions look correct and consistent.

Clipboard command expectations now properly reflect the npm start -- invocation contract.

Also applies to: 1235-1235

web-ui/partials/index/panel-config-claude.html (1)

48-48: Layout tweak looks good.

The margin-top: 0 adjustment is a safe, targeted UI refinement for the preset button group.

Comment on lines +52 to +56
@click="newProvider.name = tpl.name;
newProvider.url = tpl.url;
newProvider._suggestedModel = tpl.model || '';
newProvider.useTransform = !!tpl.useTransform;
showAddModal = true">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reset preset form state before applying template values.

This click path only overwrites a subset of newProvider; any prior modal state can leak into the new preset flow. Use a dedicated handler that resets the draft object, then applies tpl fields before opening the modal.

Proposed change
-                            `@click`="newProvider.name = tpl.name;
-                                    newProvider.url = tpl.url;
-                                    newProvider._suggestedModel = tpl.model || '';
-                                    newProvider.useTransform = !!tpl.useTransform;
-                                    showAddModal = true">
+                            `@click`="applyCodexProviderTemplate(tpl)">
<!-- In component methods (outside this line range), reset draft first, then assign template fields -->
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/partials/index/panel-config-codex.html` around lines 52 - 56, The
click handler directly mutates parts of newProvider and can leak prior modal
state; instead add a dedicated method (e.g., openAddPresetWithTemplate or
resetAndApplyTemplate) that first resets the draft/newProvider to a clean
default object, then copies the template fields (tpl.name, tpl.url, tpl.model
into newProvider._suggestedModel, tpl.useTransform into
newProvider.useTransform, etc.), and finally sets showAddModal = true; update
the `@click` to call this new method so the form is always initialized before
applying tpl values.

Add Claude model pricing to known catalog, remove Claude exclusion
from cost estimation, and add date-suffix fuzzy matching for model
ID lookup (e.g. claude-sonnet-4-6-20250514 matches claude-sonnet-4-6).
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/unit/web-ui-logic.test.mjs (1)

1146-1149: 💤 Low value

Strengthen the assertion to actually prove Claude pricing was applied.

The test currently asserts only the absence of the “暂不含 Claude” prefix and 2/2 coverage, but the cost itself isn't checked. Since 2/2 coverage will also hold if the Claude entry happens to match with $0 (e.g. a future regression that nulls out Claude rates), pinning the computed costCard.value (or matching its title against a $x.xx pattern that includes the Claude session) would catch silent pricing regressions for the catalog entries this PR introduces.

🧪 Suggested extra assertion
     const costCard = cards.find((card) => card.key === 'estimated-cost');
     assert(costCard, 'missing estimated cost summary card');
     assert.ok(!costCard.title.includes('暂不含 Claude'), 'Claude exclusion prefix should not appear');
     assert.match(costCard.title, /覆盖 2\/2 个会话/);
+    // claude-sonnet-4-6: input 3, output 15 per MTok → 150000*3/1e6 + 100000*15/1e6 = 0.45 + 1.50 = $1.95
+    // gpt-5.3-codex (300k input incl 100k cached, 100k output) is exercised by the earlier test, so
+    // here we mainly want to confirm Claude actually contributed a non-zero amount.
+    assert.notStrictEqual(costCard.value, '$0.00');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/web-ui-logic.test.mjs` around lines 1146 - 1149, Add an assertion
that verifies the computed price is non-zero and includes Claude's contribution:
after locating costCard (the object found via cards.find where card.key ===
'estimated-cost'), assert that costCard.value (or costCard.title if you prefer)
matches a dollar-amount pattern like /\$\d+\.\d{2}/ and/or contains the expected
Claude-specific amount or label to ensure Claude pricing was actually applied
(e.g., assert.match(costCard.value, /\$\d+\.\d{2}/) and
assert.ok(costCard.value.includes('Claude') ||
costCard.title.includes('Claude'), or compare to a known expected numeric
total). Ensure you reference costCard.value/title and the existing costCard
lookup so the new assertion sits immediately after the current checks.
web-ui/modules/app.computed.session.mjs (1)

254-256: ⚡ Quick win

Stub leaves a chain of unreachable code — collapse the gating or remove it.

After making shouldEstimateUsageCostForSession() an unconditional true, several pieces become dead and will confuse future readers:

  • All three call sites (lines 274, 808, 914) still pass a session argument that the function ignores.
  • skippedUnsupportedSessions is incremented inside the unreachable else (line 275), and the field is still surfaced on the summary object (line 311) where every consumer will now see 0.
  • In sessionUsageSummaryCards the estimatedCostPrefix branch on skippedUnsupportedSessions > 0 (lines 648–650) — and consequently the usage.estimatedCost.note.excludesClaudePrefix i18n key referenced there — can never fire.

Either re-introduce real gating (if there is still a class of sessions you want to exclude) or fully drop the stub, the field, the dead branch, and the now-unused i18n entry. Today's state is harder to maintain than either end of that spectrum.

♻️ Sketch of the simplification
-function shouldEstimateUsageCostForSession() {
-    return true;
-}
-
 function estimateUsageCostSummary(sessions, providersList, currentProvider) {
     const list = Array.isArray(sessions) ? sessions : [];
     const pricingIndex = buildUsagePricingIndex(providersList);
     let totalCostUsd = 0;
     let estimatedSessions = 0;
     let totalTokens = 0;
     let estimatedTokens = 0;
     let configuredSessions = 0;
     let catalogSessions = 0;
     let missingPricingSessions = 0;
     let missingTokenSessions = 0;
     let supportedSessions = 0;
-    let skippedUnsupportedSessions = 0;

     for (const session of list) {
         if (!session || typeof session !== 'object') continue;
-        if (!shouldEstimateUsageCostForSession(session)) {
-            skippedUnsupportedSessions += 1;
-            continue;
-        }
         supportedSessions += 1;
         ...
     }
     ...
     return {
         ...
-        skippedUnsupportedSessions
     };
 }

Then drop the prefix branch in sessionUsageSummaryCards and the now-orphaned i18n key.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/modules/app.computed.session.mjs` around lines 254 - 256, The function
shouldEstimateUsageCostForSession() has been turned into an unconditional true
which leaves dead callers/fields/branches; either restore meaningful gating
logic or remove all artifacts that assume a false path: update or remove calls
that pass a session argument (call sites in this file around the former lines
274, 808, 914), remove the skippedUnsupportedSessions counter and its increment
and remove the skippedUnsupportedSessions field from the summary object produced
(around the summary assembly ~line 311), and clean up the
sessionUsageSummaryCards code by deleting the estimatedCostPrefix branch that
checks skippedUnsupportedSessions > 0 (and remove the corresponding
usage.estimatedCost.note.excludesClaudePrefix i18n key); ensure all remaining
code consistently uses the simplified always-true behavior if you choose to
delete the gating, or reintroduce proper session-based checks inside
shouldEstimateUsageCostForSession(session) if you want selective exclusion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/unit/web-ui-logic.test.mjs`:
- Around line 1146-1149: Add an assertion that verifies the computed price is
non-zero and includes Claude's contribution: after locating costCard (the object
found via cards.find where card.key === 'estimated-cost'), assert that
costCard.value (or costCard.title if you prefer) matches a dollar-amount pattern
like /\$\d+\.\d{2}/ and/or contains the expected Claude-specific amount or label
to ensure Claude pricing was actually applied (e.g.,
assert.match(costCard.value, /\$\d+\.\d{2}/) and
assert.ok(costCard.value.includes('Claude') ||
costCard.title.includes('Claude'), or compare to a known expected numeric
total). Ensure you reference costCard.value/title and the existing costCard
lookup so the new assertion sits immediately after the current checks.

In `@web-ui/modules/app.computed.session.mjs`:
- Around line 254-256: The function shouldEstimateUsageCostForSession() has been
turned into an unconditional true which leaves dead callers/fields/branches;
either restore meaningful gating logic or remove all artifacts that assume a
false path: update or remove calls that pass a session argument (call sites in
this file around the former lines 274, 808, 914), remove the
skippedUnsupportedSessions counter and its increment and remove the
skippedUnsupportedSessions field from the summary object produced (around the
summary assembly ~line 311), and clean up the sessionUsageSummaryCards code by
deleting the estimatedCostPrefix branch that checks skippedUnsupportedSessions >
0 (and remove the corresponding usage.estimatedCost.note.excludesClaudePrefix
i18n key); ensure all remaining code consistently uses the simplified
always-true behavior if you choose to delete the gating, or reintroduce proper
session-based checks inside shouldEstimateUsageCostForSession(session) if you
want selective exclusion.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 588114e2-53b0-48cd-9b01-512e8ae549cf

📥 Commits

Reviewing files that changed from the base of the PR and between e04f3fa and db37e83.

📒 Files selected for processing (6)
  • tests/unit/web-ui-behavior-parity.test.mjs
  • tests/unit/web-ui-logic.test.mjs
  • web-ui/logic.codex.mjs
  • web-ui/modules/app.computed.dashboard.mjs
  • web-ui/modules/app.computed.session.mjs
  • web-ui/modules/i18n.dict.mjs
✅ Files skipped from review due to trivial changes (1)
  • web-ui/modules/i18n.dict.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/web-ui-behavior-parity.test.mjs
📜 Review details
🔇 Additional comments (8)
web-ui/logic.codex.mjs (4)

27-29: LGTM!

The normalizeUrl utility correctly handles edge cases by checking the type and providing a safe fallback for non-string inputs.


31-43: LGTM!

The function correctly:

  • Guards against invalid input (line 32)
  • Normalizes URLs for consistent matching
  • Provides a sensible fallback for OpenAI-named providers without URLs
  • Returns shallow copies to prevent mutations

The logic is sound and follows defensive programming practices.


48-56: LGTM!

The CODEX_PROVIDER_TEMPLATES constant is well-structured and properly frozen for immutability. The MuAPI template correctly aligns with the corresponding HOST_RULES entry (line 24) and includes the necessary useTransform flag.


5-12: Verify these model names against OpenAI's current API documentation.

The model names in DEFAULT_OPENAI_CODEX_CATALOG do not match OpenAI's documented model naming conventions:

  • gpt-5-codex, gpt-5, gpt-5-mini — Nonstandard variants; GPT-5 was not available as of mid-2024, and no official -codex or -mini variants have been announced
  • gpt-4.1 — OpenAI does not use decimal versioning; standard names are gpt-4, gpt-4-turbo
  • o4-mini, o3-mini — OpenAI's reasoning series uses o1, o3 (no o4 announced); reasoning models do not have -mini variants

Confirm these are actual, currently available OpenAI models and not placeholder/test names.

web-ui/modules/app.computed.dashboard.mjs (2)

1-2: LGTM!

The import statement correctly brings in CODEX_PROVIDER_TEMPLATES from the new module with the appropriate relative path.


34-36: LGTM!

The codexProviderTemplates() computed property correctly returns the frozen constant and follows the same pattern as other computed properties in this module.

web-ui/modules/app.computed.session.mjs (2)

244-250: LGTM — date-suffix fallback is well-scoped.

The lookahead (?=[-_]|$) correctly anchors -YYYYMMDD to a boundary so model ids like claude-3-5-sonnet-20241022 map back to claude-3-5-sonnet without accidentally stripping non-date numeric segments. The check strippedModel !== model avoids a redundant second lookup on the common path.


156-156: ⚡ Quick win

gpt-5.2-codex pricing verified against official OpenAI API documentation.

Pricing confirmed correct. OpenAI's official pricing for gpt-5.2-codex (as of May 2026) matches the values in the code: input $1.75, output $14.00, and cached input $0.175 per 1M tokens.

Co-Authored-By: ymkiux <ymkiux@users.noreply.github.com>
@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch from db37e83 to b613245 Compare May 11, 2026 05:42
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@web-ui/modules/app.methods.providers.mjs`:
- Line 256: The useTransform detection is too permissive: change the condition
that sets useTransform (currently using !!(provider.codexmate_bridge ||
'').trim()) to only enable transforms when provider.codexmate_bridge explicitly
equals 'openai' (after trimming) or when the provider.url matches
/\/bridge\/openai\//; update the expression that computes useTransform to check
(provider.codexmate_bridge || '').trim() === 'openai' ||
/\/bridge\/openai\//.test(provider.url || '') so clone flow matches the
edit-flow behavior and avoids enabling transform for arbitrary non-empty
codexmate_bridge values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0cf1fb07-56fb-4531-9268-5aec9f26a1aa

📥 Commits

Reviewing files that changed from the base of the PR and between db37e83 and b613245.

📒 Files selected for processing (9)
  • cli.js
  • tests/unit/provider-share-command.test.mjs
  • tests/unit/web-ui-behavior-parity.test.mjs
  • web-ui/modules/app.methods.claude-config.mjs
  • web-ui/modules/app.methods.providers.mjs
  • web-ui/modules/app.methods.session-actions.mjs
  • web-ui/modules/i18n.dict.mjs
  • web-ui/partials/index/panel-config-claude.html
  • web-ui/partials/index/panel-config-codex.html
✅ Files skipped from review due to trivial changes (2)
  • tests/unit/web-ui-behavior-parity.test.mjs
  • tests/unit/provider-share-command.test.mjs
🚧 Files skipped from review as they are similar to previous changes (3)
  • web-ui/partials/index/panel-config-codex.html
  • web-ui/modules/app.methods.session-actions.mjs
  • cli.js
📜 Review details
🔇 Additional comments (3)
web-ui/modules/i18n.dict.mjs (1)

755-756: Clone i18n entries are complete and consistent.

Nice addition of both label + ARIA keys in zh and en; this keeps the new clone actions localized and accessible.

Also applies to: 1036-1037, 1811-1812, 2092-2093

web-ui/modules/app.methods.claude-config.mjs (1)

46-54: Clone-modal initializer looks good.

The method cleanly seeds clone state (blank target name + copied config fields) and reuses the existing add modal flow.

web-ui/partials/index/panel-config-claude.html (1)

169-174: Clone action wiring is correct and accessible.

Good hook-up to openCloneClaudeConfigModal(...), and ARIA/title localization is in place for icon-only action.

name: '',
url: normalizeProviderUrl(provider.url || ''),
key: '',
useTransform: !!(provider.codexmate_bridge || '').trim() || /\/bridge\/openai\//.test(provider.url || '')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

useTransform detection is too broad in clone flow.

This currently turns on transform for any non-empty provider.codexmate_bridge, but edit flow only treats 'openai' as transform. Keep these rules consistent to avoid cloning providers into the wrong mode.

Suggested fix
-                useTransform: !!(provider.codexmate_bridge || '').trim() || /\/bridge\/openai\//.test(provider.url || '')
+                useTransform:
+                    (typeof provider.codexmate_bridge === 'string' && provider.codexmate_bridge.trim() === 'openai')
+                    || /\/bridge\/openai\//.test(provider.url || '')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/modules/app.methods.providers.mjs` at line 256, The useTransform
detection is too permissive: change the condition that sets useTransform
(currently using !!(provider.codexmate_bridge || '').trim()) to only enable
transforms when provider.codexmate_bridge explicitly equals 'openai' (after
trimming) or when the provider.url matches /\/bridge\/openai\//; update the
expression that computes useTransform to check (provider.codexmate_bridge ||
'').trim() === 'openai' || /\/bridge\/openai\//.test(provider.url || '') so
clone flow matches the edit-flow behavior and avoids enabling transform for
arbitrary non-empty codexmate_bridge values.

@ymkiux ymkiux changed the title fix(proxy): retry transient upstream errors and graceful sse abort feat(web-ui): proxy resilience, cost estimation, provider clone and ui polish May 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/web-ui-behavior-parity.test.mjs`:
- Around line 340-342: The allowlist for modal keys in the test is incomplete:
the keys 'openCloneClaudeConfigModal' and 'openCloneProviderModal' (and
'isTransformProvider' if relevant) were only added in the parityAgainstHead ===
true branch; update both branches so these modal keys are present regardless of
parityAgainstHead. Locate the allowlist arrays used when parityAgainstHead is
checked in web-ui-behavior-parity.test.mjs (the branches around the
parityAgainstHead conditional) and add 'openCloneClaudeConfigModal' and
'openCloneProviderModal' to the array in both the true branch and the false
branch so the test sees the same allowed modal keys in either case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a7a26469-b00c-47b4-b5cf-7195bbd96c7b

📥 Commits

Reviewing files that changed from the base of the PR and between b613245 and c30b9dc.

📒 Files selected for processing (2)
  • tests/unit/web-ui-behavior-parity.test.mjs
  • web-ui/partials/index/panel-config-codex.html
🚧 Files skipped from review as they are similar to previous changes (1)
  • web-ui/partials/index/panel-config-codex.html
📜 Review details
🧰 Additional context used
🪛 GitHub Actions: ci / 0_ci (22).txt
tests/unit/web-ui-behavior-parity.test.mjs

[error] 596-596: Unit test failed in 'captured bundled app skeleton only exposes expected data key drift versus parity baseline'. AssertionError: Expected deepStrictEqual mismatch (expected [], got ['openCloneClaudeConfigModal','openCloneProviderModal']). Command: node tests/unit/run.mjs (via npm run test:unit).

🪛 GitHub Actions: ci / 2_ci (18).txt
tests/unit/web-ui-behavior-parity.test.mjs

[error] 596-596: Web UI behavior parity test failed (AssertionError): expected strictly deep-equal. Actual leaked data keys ['openCloneClaudeConfigModal','openCloneProviderModal'], expected [].

🪛 GitHub Actions: ci / ci (18)
tests/unit/web-ui-behavior-parity.test.mjs

[error] 596-596: Unit test failed in "captured bundled app skeleton only exposes expected data key drift versus parity baseline". AssertionError: expected strictly deep-equal arrays, but actual contained ['openCloneClaudeConfigModal','openCloneProviderModal'] while expected [].

🪛 GitHub Actions: ci / ci (22)
tests/unit/web-ui-behavior-parity.test.mjs

[error] 596-596: Captured bundled app skeleton only exposes expected data key drift versus parity baseline failed: Expected values to be strictly deep-equal. Actual keys ['openCloneClaudeConfigModal', 'openCloneProviderModal'] did not match expected []. AssertionError: ERR_ASSERTION (deepStrictEqual).

🔇 Additional comments (1)
tests/unit/web-ui-behavior-parity.test.mjs (1)

1206-1206: LGTM! Test expectations correctly updated for -- separator.

The expected clipboard command strings have been correctly updated to include the -- separator after npm start, aligning with the PR's stated changes to the share command format.

Also applies to: 1248-1248

Comment thread tests/unit/web-ui-behavior-parity.test.mjs
@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch from c30b9dc to 3ba441f Compare May 11, 2026 08:23
@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch from 3ba441f to 7afd9d6 Compare May 11, 2026 08:43
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@web-ui/logic.codex.mjs`:
- Line 60: The model identifier in web-ui/logic.codex.mjs is using incorrect
case; change the model field value from 'DeepSeek-V4-pro' to the lowercase
'deepseek-v4-pro' in the object where model is defined so API calls use the
correct identifier. Also update HOST_RULES (or add an entry for the telepub.cn
provider) so getCodexModelCatalogForProvider returns a non-empty catalog for
that host if you intend multiple selectable models; alternatively keep the
explicit lowercase model as the default in the same object.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 141b3f3f-93b9-4cd5-a205-108fd4d9c3e6

📥 Commits

Reviewing files that changed from the base of the PR and between c30b9dc and d670372.

📒 Files selected for processing (3)
  • tests/unit/web-ui-behavior-parity.test.mjs
  • web-ui/logic.codex.mjs
  • web-ui/partials/index/panel-config-codex.html
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/web-ui-behavior-parity.test.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • web-ui/partials/index/panel-config-codex.html
📜 Review details
🔇 Additional comments (2)
web-ui/logic.codex.mjs (2)

56-62: Clean addition of new provider template.

The new Telepub template follows the established pattern and structure. The syntax is correct and consistent with the existing MuAPI template.


59-59: ⚡ Quick win

No issues. The URL structure is intentional—the /voyage/api path is the actual endpoint expected by the service, as validated by the test suite in tests/unit/builtin-proxy-responses-shim.test.mjs, which explicitly mocks endpoints like /voyage/api/chat/completions and /voyage/api/responses.

Comment thread web-ui/logic.codex.mjs
label: 'Telepub',
name: 'telepub',
url: 'https://voyage.prod.telepub.cn/voyage/api',
model: 'DeepSeek-V4-pro',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What is the DeepSeek V4 Pro model and what are the correct model identifiers?

💡 Result:

DeepSeek V4 Pro is DeepSeek’s “V4” family flagship open-weight Mixture-of-Experts (MoE) large language model checkpoint (OpenAI/Anthropic-compatible API model and Hugging Face weights). It is described as “DeepSeek-V4-Pro” with 1.6T total parameters (49B activated) and supports a 1M-token context length.[1][2] Correct model identifiers (what to pass as the API model name): 1) deepseek-v4-pro (V4 Pro) — official DeepSeek API identifier.[3] Related/compatibility identifiers you may encounter (not what “V4 Pro” specifically is): 2) deepseek-v4-flash (V4 Flash, smaller tier in the same V4 family).[3] 3) deepseek-chat and deepseek-reasoner are legacy compatibility IDs that route to V4-Flash (non-thinking vs thinking mode), and are planned to retire later; they are not V4 Pro identifiers.[4][5] Correct open-weight repository name (Hugging Face) for V4 Pro weights: 4) deepseek-ai/DeepSeek-V4-Pro.[6] Sources: [1] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro [2] https://api-docs.deepseek.com/news/news260424 [3] https://api-docs.deepseek.com/quick_start/pricing [4] https://deepseekai.guide/api/deepseek-api-documentation/ [5] https://api-docs.deepseek.com/quick_start/pricing [6] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

Citations:


🏁 Script executed:

# First, let's see the file and context around line 60
head -80 web-ui/logic.codex.mjs | tail -30

Repository: SakuraByteCore/codexmate

Length of output: 909


🏁 Script executed:

# Look for case normalization or how models are processed
rg -n "model.*lower|toLowerCase|normalize" web-ui/ -A 2 -B 2

Repository: SakuraByteCore/codexmate

Length of output: 50381


🏁 Script executed:

# Check how getCodexModelCatalogForProvider is implemented and used
rg -n "getCodexModelCatalogForProvider|HOST_RULES" web-ui/ -A 3 -B 1

Repository: SakuraByteCore/codexmate

Length of output: 2145


Change the model identifier to lowercase 'deepseek-v4-pro' to match the official DeepSeek API specification.

The model name should be 'deepseek-v4-pro' (lowercase), not 'DeepSeek-V4-pro'. According to the official DeepSeek API documentation, the correct model identifier is lowercase. When this model field is passed to the API, the case mismatch may cause request failures.

Additionally, telepub.cn is not present in HOST_RULES, so getCodexModelCatalogForProvider will return an empty catalog for this provider. While the explicit model field provides a working default, consider adding a HOST_RULES entry if multiple models should be available through the catalog.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/logic.codex.mjs` at line 60, The model identifier in
web-ui/logic.codex.mjs is using incorrect case; change the model field value
from 'DeepSeek-V4-pro' to the lowercase 'deepseek-v4-pro' in the object where
model is defined so API calls use the correct identifier. Also update HOST_RULES
(or add an entry for the telepub.cn provider) so getCodexModelCatalogForProvider
returns a non-empty catalog for that host if you intend multiple selectable
models; alternatively keep the explicit lowercase model as the default in the
same object.

@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch 2 times, most recently from 377f9d4 to 7aee259 Compare May 11, 2026 09:29
@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch 2 times, most recently from 62293f6 to 9f11a88 Compare May 11, 2026 09:48
@ymkiux ymkiux force-pushed the fix/proxy-stream-resilience branch from 9f11a88 to 8fd8b1c Compare May 11, 2026 10:03
@ymkiux ymkiux changed the title feat(web-ui): proxy resilience, cost estimation, provider clone and ui polish feat(web-ui): proxy reasoning stream and ui polish bundle May 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cli/openai-bridge.js (1)

1371-1396: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Streaming retry can double-emit response.created after partial output to the client.

streamChatCompletionsAsResponsesSse returns { ok: false, error } on upstream socket errors (line 1127). In Node.js, ClientRequest can emit 'error' after the 'response' event during streaming—for example, if the upstream socket dies mid-stream with ECONNRESET or socket hang up. If the error message matches isTransientNetworkError, retryTransientRequest invokes the executor again on the same res, where:

  • res.headersSent is already true, so writeHead is skipped (correct), but
  • streamChatCompletionsAsResponsesSse creates a fresh state object with a new responseId and emits a second response.created SSE event, resets the sequence counter, and registers another res.once('close', abortUpstream) listener.

Clients receive two response lifecycles in one connection (different ids, restarted sequence numbers), which breaks client-side logic. The fallback at lines 1384–1396 only protects after all retries are exhausted and does not prevent this intermediate state.

Suggest preventing retry once any bytes have been written to the client:

Proposed fix — short-circuit streaming retry after headers are sent
-                const streamed = await retryTransientRequest(() => streamChatCompletionsAsResponsesSse(upstreamUrl, {
+                const streamed = await retryTransientRequest((attempt) => {
+                    // Once any bytes have been emitted to the client, retries would
+                    // double-emit response.created / reset sequence numbers. Only the
+                    // first attempt is allowed to start streaming to the client.
+                    if (attempt > 0 && (res.headersSent || res.writableEnded || res.destroyed)) {
+                        return { ok: false, status: 0, error: 'stream already started', retry: false };
+                    }
+                    return streamChatCompletionsAsResponsesSse(upstreamUrl, {
                         method: 'POST',
                         body: chatBody,
                         headers: {
                             ...(authHeader ? { Authorization: authHeader } : {}),
                             ...upstreamHeaders
                         },
                         maxBytes: maxUpstreamBytes,
                         httpAgent,
                         httpsAgent,
                         res,
                         model: typeof chatBody.model === 'string' ? chatBody.model : ''
-                }));
+                    });
+                });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/openai-bridge.js` around lines 1371 - 1396, The streaming retry can
re-emit a new response lifecycle after sending partial output; to fix,
short-circuit retry when the client has already received bytes: in the caller
that invokes retryTransientRequest() around streamChatCompletionsAsResponsesSse
(the block creating `streamed`), detect if the response has already started
(e.g., check res.headersSent or res.socket && typeof res.socket.bytesWritten ===
'number' && res.socket.bytesWritten > 0) and if so, skip retrying and
immediately treat the upstream error as final (return the `streamed` failure
result and run the existing error-path logic), so
`streamChatCompletionsAsResponsesSse`, `response.created` and the SSE state are
not re-emitted on retries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cli/openai-bridge.js`:
- Around line 1371-1396: The streaming retry can re-emit a new response
lifecycle after sending partial output; to fix, short-circuit retry when the
client has already received bytes: in the caller that invokes
retryTransientRequest() around streamChatCompletionsAsResponsesSse (the block
creating `streamed`), detect if the response has already started (e.g., check
res.headersSent or res.socket && typeof res.socket.bytesWritten === 'number' &&
res.socket.bytesWritten > 0) and if so, skip retrying and immediately treat the
upstream error as final (return the `streamed` failure result and run the
existing error-path logic), so `streamChatCompletionsAsResponsesSse`,
`response.created` and the SSE state are not re-emitted on retries.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: eabff2d0-949c-4b4e-8724-0c04302edd39

📥 Commits

Reviewing files that changed from the base of the PR and between d670372 and 5d3f804.

📒 Files selected for processing (6)
  • cli/openai-bridge.js
  • tests/unit/openai-bridge-upstream-responses.test.mjs
  • web-ui/app.js
  • web-ui/logic.codex.mjs
  • web-ui/modules/app.methods.claude-config.mjs
  • web-ui/partials/index/modals-basic.html
💤 Files with no reviewable changes (1)
  • web-ui/partials/index/modals-basic.html
🚧 Files skipped from review as they are similar to previous changes (2)
  • web-ui/logic.codex.mjs
  • web-ui/modules/app.methods.claude-config.mjs
📜 Review details
🔇 Additional comments (4)
web-ui/app.js (1)

277-278: LGTM! Empty defaults support the clone feature and are more neutral.

Changing baseUrl and model from hardcoded 智谱GLM-specific values to empty strings allows the clone modal to properly pre-fill from the source configuration while keeping defaults neutral for new configurations.

cli/openai-bridge.js (2)

719-753: LGTM on the transient matcher and retry helper.

The matcher covers the documented set of transient signals (and intentionally skips bare 'timeout', so socket inactivity timeouts won't loop). retryTransientRequest correctly short-circuits on ok, on results that carry a non-zero status (so HTTP responses pass through without retry), and on non-transient errors. The unref() on the backoff timer is a nice touch.


1312-1321: LGTM on applying the retry wrapper to the non-streaming upstream calls.

proxyRequestJson does not touch the client res until the executor resolves, so retries here are idempotent from the client's perspective and the upstream HTTP method/body/headers are stable across attempts.

Also applies to: 1403-1413, 1464-1474

tests/unit/openai-bridge-upstream-responses.test.mjs (1)

172-229: LGTM — good coverage for the reasoning_content forwarding behavior.

The test exercises the new delta.reasoning_contentresponse.output_text.delta path end-to-end through the HTTP handler, asserts the per-segment deltas, the concatenated text in response.output_text.done (thinking-stepanswer), and the terminating [DONE]. The structure matches the other bridge tests in this file.

@ymkiux ymkiux changed the title feat(web-ui): proxy reasoning stream and ui polish bundle feat(web-ui): proxy stream and usage cost accuracy bundle May 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web-ui/modules/app.computed.session.mjs (1)

315-340: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Claude cost estimation systematically undercharges by treating inputTokens as inclusive of cached tokens.

Per Anthropic's API semantics, input_tokens represents only tokens after the last cache breakpoint and is additive with cache_read_input_tokens and cache_creation_input_tokens. The parseClaudeSessionSummary parser in cli.js preserves this convention without normalization (confirmed by test: inputTokens=1000, cachedInputTokens=5000, cacheCreationInputTokens=2000, totalTokens=8500).

However, estimateUsageCostForSession in web-ui/modules/app.computed.session.mjs applies the formula billableInputTokens = Math.max(0, (inputTokens || 0) - cachedInputTokens - cacheCreationInputTokens) uniformly across all session sources. This formula assumes inputTokens already includes cached tokens (Codex/OpenAI convention), causing it to zero out billable input for Claude sessions where cached tokens dominate.

For the test case with inputTokens=200000, cachedInputTokens=50000, cacheCreationInputTokens=30000, outputTokens=100000 and Claude Sonnet pricing (input $3/M, cacheRead $0.3/M, cacheWrite $3.75/M, output $15/M), the current formula incorrectly yields billableInputTokens=120000 and cost $1.99, when the correct value is billableInputTokens=200000 and cost $2.23 — a 10.8% undercharge driven by subtracting 80k tokens that were never part of inputTokens.

The source-based differentiation approach in the review (checking session.source !== 'claude' to determine token semantics) is sound and aligns with how the application already distinguishes session types.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/modules/app.computed.session.mjs` around lines 315 - 340,
estimateUsageCostForSession is subtracting cached tokens from inputTokens for
all sessions which undercharges Claude sessions because Claude's inputTokens are
already after the cache; change the billable input calculation to respect
session source: compute billableInputTokens = Math.max(0, (inputTokens||0)) when
session.source === 'claude' (or currentProvider indicates Anthropic), otherwise
keep the existing Math.max(0, (inputTokens||0) - cachedInputTokens -
cacheCreationInputTokens) behavior; update any dependent variables
(fallbackSessionTokens/hasTokenBreakdown) if they assume the old subtraction so
totals remain consistent when resolveUsagePricingForSession and the pricing
rates are applied.
🧹 Nitpick comments (2)
web-ui/modules/app.computed.session.mjs (2)

244-250: 💤 Low value

Date-suffix stripping only consults knownByModel; consider also retrying byProvider/byModel.

If a user configures a provider with an exact pinned model id (e.g. claude-3-5-sonnet-20240620) but their sessions report the bare base id (claude-3-5-sonnet) — or the reverse — this fallback won't bridge them because it only re-looks up in the public catalog. Low priority; the public catalog covers the common case, but worth a follow-up to apply the stripped lookup uniformly across byProvider/byModel as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/modules/app.computed.session.mjs` around lines 244 - 250, The
date-suffix stripping currently only retries the public-catalog lookup via
pricingIndex.knownByModel; update the logic around strippedModel (in
web-ui/modules/app.computed.session.mjs) to also retry the internal lookups
pricingIndex.byProvider and pricingIndex.byModel (and any code paths that
consume them) when strippedModel !== model — i.e., if strippedKnown is not
found, attempt pricingIndex.byProvider.get(strippedModel) and
pricingIndex.byModel.get(strippedModel) (or the equivalent lookup functions) and
return the first match so pinned/exact model ids and bare base ids are bridged
consistently.

254-256: 💤 Low value

shouldEstimateUsageCostForSession is now unconditional; downstream "excludes Claude" branch is dead.

Now that this returns true for every session, estimatedCost.skippedUnsupportedSessions is always 0, which makes the excludesClaudePrefix branch at lines 651–653 and the corresponding i18n key effectively unreachable. Consider either inlining the predicate at the call sites and removing skippedUnsupportedSessions/excludesClaudePrefix from the summary contract, or keeping the predicate as a meaningful filter (e.g., gating on missing model/provider). Not blocking — leaving as-is is fine for backward compatibility.

Also applies to: 272-277

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web-ui/modules/app.computed.session.mjs` around lines 254 - 256, The helper
shouldEstimateUsageCostForSession currently always returns true, which makes
estimatedCost.skippedUnsupportedSessions and the excludesClaudePrefix branch
dead; either: (A) remove the helper and inline the predicate at call sites, then
delete skippedUnsupportedSessions/excludesClaudePrefix from the summary contract
and i18n usage, or (B) restore a meaningful predicate in
shouldEstimateUsageCostForSession (e.g., check session.model and
session.provider and exclude providers like "claude") so it actually filters
unsupported sessions; update all occurrences (including the duplicate at the
other shouldEstimateUsageCostForSession instance around lines 272–277) and
adjust any callers that rely on skippedUnsupportedSessions/excludesClaudePrefix
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@web-ui/modules/app.computed.session.mjs`:
- Around line 315-340: estimateUsageCostForSession is subtracting cached tokens
from inputTokens for all sessions which undercharges Claude sessions because
Claude's inputTokens are already after the cache; change the billable input
calculation to respect session source: compute billableInputTokens = Math.max(0,
(inputTokens||0)) when session.source === 'claude' (or currentProvider indicates
Anthropic), otherwise keep the existing Math.max(0, (inputTokens||0) -
cachedInputTokens - cacheCreationInputTokens) behavior; update any dependent
variables (fallbackSessionTokens/hasTokenBreakdown) if they assume the old
subtraction so totals remain consistent when resolveUsagePricingForSession and
the pricing rates are applied.

---

Nitpick comments:
In `@web-ui/modules/app.computed.session.mjs`:
- Around line 244-250: The date-suffix stripping currently only retries the
public-catalog lookup via pricingIndex.knownByModel; update the logic around
strippedModel (in web-ui/modules/app.computed.session.mjs) to also retry the
internal lookups pricingIndex.byProvider and pricingIndex.byModel (and any code
paths that consume them) when strippedModel !== model — i.e., if strippedKnown
is not found, attempt pricingIndex.byProvider.get(strippedModel) and
pricingIndex.byModel.get(strippedModel) (or the equivalent lookup functions) and
return the first match so pinned/exact model ids and bare base ids are bridged
consistently.
- Around line 254-256: The helper shouldEstimateUsageCostForSession currently
always returns true, which makes estimatedCost.skippedUnsupportedSessions and
the excludesClaudePrefix branch dead; either: (A) remove the helper and inline
the predicate at call sites, then delete
skippedUnsupportedSessions/excludesClaudePrefix from the summary contract and
i18n usage, or (B) restore a meaningful predicate in
shouldEstimateUsageCostForSession (e.g., check session.model and
session.provider and exclude providers like "claude") so it actually filters
unsupported sessions; update all occurrences (including the duplicate at the
other shouldEstimateUsageCostForSession instance around lines 272–277) and
adjust any callers that rely on skippedUnsupportedSessions/excludesClaudePrefix
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 16372ff1-c16b-452c-8fc5-c9e085ecab96

📥 Commits

Reviewing files that changed from the base of the PR and between 5d3f804 and fead0b5.

📒 Files selected for processing (4)
  • cli.js
  • tests/unit/session-usage-backend.test.mjs
  • tests/unit/web-ui-logic.test.mjs
  • web-ui/modules/app.computed.session.mjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci (22)
  • GitHub Check: ci (22)
🔇 Additional comments (7)
cli.js (3)

307-316: Shared keep-alive agent setup looks good.

Using module-scoped agents here keeps outbound connection reuse consistent across the bridge and built-in proxies without scattering transport config.


3649-3690: Usage normalization stays nicely centralized.

Pulling cacheCreationInputTokens into the same normalization/state-update path as the other usage counters keeps the fallback totalTokens math and session aggregation logic aligned.

Also applies to: 3708-3726


3966-4080: Good propagation of cache-creation usage through session summaries.

Threading the new field through the per-source parsers and the Claude index/head/tail merge path should keep downstream cost estimation working even when different session sources expose different subsets of usage metadata.

Also applies to: 4117-4235, 4271-4395, 4685-4777

tests/unit/web-ui-logic.test.mjs (3)

1103-1141: This $1.99 expectation encodes the Claude billable-input bug.

The fixture (inputTokens: 200000 with cachedInputTokens: 50000 + cacheCreationInputTokens: 30000) only yields $1.99 because the current estimator subtracts the cache fields from inputTokens. Under real Anthropic semantics those three fields are additive (non-cached input is already 200k), and the correct estimate is $2.23:

  • input: 200000 × $3 = $0.60
  • cache_read: 50000 × $0.30 = $0.015
  • cache_write: 30000 × $3.75 = $0.1125
  • output: 100000 × $15 = $1.50
  • total: $2.2275 → $2.23

If the root-cause issue flagged in web-ui/modules/app.computed.session.mjs is addressed, this assertion (and the 2/2 coverage tests below) will need to be updated in lockstep. Not raising as a separate finding — see the major-issue comment on estimateUsageCostForSession.


1053-1057: LGTM — $0.8500 matches the visible/reasoning output split.

With the new visibleOutputTokens = outputTokens − reasoningOutputTokens carve-out and the fixture's outputTokens: 50000, reasoningOutputTokens: 50000, output billing splits into 0 visible + 50000 reasoning at the configured rate ($8/M), which lands the total at exactly $0.85 (input $0.40 + cacheRead $0.05 + reasoning $0.40 + visible $0). The 4-fraction-digit formatting ($0.8500) comes from precise: false triggering the < 1 branch returning 4 digits.


1143-1190: LGTM — Claude is correctly included in cost coverage now that the unconditional predicate is in place.

Asserting absence of 暂不含 Claude and the 2/2 coverage matches the updated shouldEstimateUsageCostForSession behavior. The first (Codex) and second (Claude) sessions both have provider+model+token-breakdown sufficient for an estimate; coverage of 2/2 is correct. Note: the Claude session here uses cachedInputTokens: 0, which masks the Anthropic-convention issue raised on app.computed.session.mjs — a session with non-zero cachedInputTokens would expose the undercharge.

tests/unit/session-usage-backend.test.mjs (1)

669-759: LGTM — assertions correctly capture Anthropic's additive cache-token semantics.

The fixture's usage shape (input_tokens: 1000 separate from cache_read_input_tokens: 5000 and cache_creation_input_tokens: 2000) and the totalTokens === 8500 fallback assertion correctly reflect Anthropic's API contract that input_tokens excludes cache reads/creations. Note that this convention is at odds with how estimateUsageCostForSession in web-ui/modules/app.computed.session.mjs consumes these fields — see the comment on that file.

The bindings setup duplicates the prior parseClaudeSessionSummary test almost verbatim, but it's consistent with the existing per-test instantiation pattern in this file, so no refactor needed here.

@ymkiux ymkiux merged commit 2f63044 into main May 11, 2026
7 checks passed
@ymkiux ymkiux deleted the fix/proxy-stream-resilience branch May 11, 2026 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant