feat: add support for tiered model pricing #67605
Conversation
🔒 Aisle Security Analysis

We found 2 potential security issues in this PR:
1. 🟡 Unbounded response buffering in pricing catalog fetch (memory/availability DoS)
Description
Vulnerable code:

```ts
const buffer = await response.arrayBuffer();
if (buffer.byteLength > MAX_PRICING_CATALOG_BYTES) {
  throw new Error(`${source} pricing response too large: ${buffer.byteLength} bytes`);
}
```

Recommendation

Avoid buffering the entire response before checking its size; enforce the limit while streaming instead. Example (Node/undici fetch):

```ts
async function readJsonObjectWithLimit(response: Response, source: string) {
  const reader = response.body?.getReader();
  if (!reader) throw new Error(`${source} response has no body`);
  const chunks: Uint8Array[] = [];
  let total = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    if (!value) continue;
    total += value.byteLength;
    if (total > MAX_PRICING_CATALOG_BYTES) {
      try { await reader.cancel(); } catch {}
      throw new Error(`${source} pricing response too large: >${MAX_PRICING_CATALOG_BYTES} bytes`);
    }
    chunks.push(value);
  }
  const buffer = Buffer.concat(chunks, total);
  const payload = JSON.parse(buffer.toString("utf8"));
  if (!payload || typeof payload !== "object" || Array.isArray(payload)) {
    throw new Error(`${source} pricing response is not a JSON object`);
  }
  return payload as Record<string, unknown>;
}
```

Additionally, consider limiting decompressed size (the streaming check covers this) and setting reasonable fetch/agent limits (timeouts, max response size) to reduce DoS risk.

2. 🟡 Unbounded concurrency in channels.status can exhaust resources (DoS)
Description
Because the non-probe path sets `limit` to the full input length, concurrency grows unbounded with the number of accounts or plugins. Vulnerable code:

```ts
const { results } = await runTasksWithConcurrency({
  tasks: accountIds.map(...),
  limit: probe ? CHANNEL_STATUS_PROBE_CONCURRENCY : accountIds.length || 1,
});
...
const { results: channelResults } = await runTasksWithConcurrency({
  tasks: plugins.map(...),
  limit: probe ? CHANNEL_STATUS_PROBE_CONCURRENCY : plugins.length || 1,
});
```

Recommendation

Always cap concurrency regardless of input size. For example:

```ts
const MAX_STATUS_CONCURRENCY = 10; // tune per deployment
const accountLimit = probe ? CHANNEL_STATUS_PROBE_CONCURRENCY : MAX_STATUS_CONCURRENCY;
const pluginLimit = probe ? CHANNEL_STATUS_PROBE_CONCURRENCY : MAX_STATUS_CONCURRENCY;
await runTasksWithConcurrency({ tasks, limit: accountLimit });
await runTasksWithConcurrency({ tasks: pluginTasks, limit: pluginLimit });
```

Additionally consider:
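For context, a bounded task runner of the kind the snippet above assumes can be sketched as follows. This is a hypothetical `runBounded`; the repo's actual `runTasksWithConcurrency` and its signature may differ.

```typescript
// Minimal bounded task runner: each task is a thunk, at most `limit`
// run at once, and results preserve input order. Rejections are
// captured per-task rather than aborting the whole batch.
async function runBounded<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<PromiseSettledResult<T>[]> {
  const results: PromiseSettledResult<T>[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++; // single-threaded JS: safe index claim
      try {
        results[index] = { status: "fulfilled", value: await tasks[index]() };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a helper like this, callers can pass a hard cap such as `MAX_STATUS_CONCURRENCY` and never fall back to an input-sized limit.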
Analyzed PR: #67605 · Last updated on: 2026-04-21T01:59:59Z
Evidence — Manual Test Transcript

Config used:

```json
"cost": {
  "tieredPricing": [
    { "input": 1, "output": 1, "cacheRead": 0, "cacheWrite": 0, "range": [0, 13000] },
    { "input": 10000, "output": 10000, "cacheRead": 0, "cacheWrite": 0, "range": [13000, 128000] },
    { "input": 10000, "output": 10000, "cacheRead": 0, "cacheWrite": 0, "range": [128000] }
  ]
}
```

Test session (GLM-5V-Turbo via OpenRouter, context window 128k):

Key observations:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 25d509ab83
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Greptile Summary

This PR fixes three integration gaps that blocked end-to-end tiered pricing: the Zod … The core …

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/type improvements; no logic bugs found. The bug fixes are correct (Zod schema, resolveModelCost passthrough, scanTranscriptFile recompute). computeTieredCost math is verified against test comments. The LiteLLM fetch has a proper timeout and graceful fallback. The two P2 comments (duplicate branches, range type width) don't block correctness. No files require special attention.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/infra/session-cost-usage.ts
Line: 258-264
Comment:
**Redundant duplicate branches**
Both the `if` and `else if` bodies call `estimateUsageCost` with identical arguments; the only difference is the condition under which the call happens. These can be collapsed into a single branch.
```suggestion
if ((cost?.tieredPricing && cost.tieredPricing.length > 0) || entry.costTotal === undefined) {
// When tiered pricing is configured, always recompute to override
// the flat-rate cost that the transport layer wrote into the transcript.
// Otherwise, only fill in missing cost estimates.
entry.costTotal = estimateUsageCost({ usage: entry.usage, cost });
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/config/types.models.ts
Line: 66-72
Comment:
**`range: [number, number]` doesn't match the documented one-element shorthand**
The PR description documents `[start]` (one-element array) as the supported shorthand for an unbounded top tier. The Zod schema correctly accepts it via `z.union([z.tuple([z.number(), z.number()]), z.tuple([z.number()])])`, and `normalizeTieredPricing` normalizes it to `[start, Infinity]` at load time. However, this TypeScript type still says `[number, number]`, so any TypeScript caller (e.g. an extension author) who writes `range: [128_000]` will get a compile error and have no indication that the JSON shorthand exists.
Widening the type to match the Zod schema input makes the contract self-documenting:
```suggestion
tieredPricing?: Array<{
input: number;
output: number;
cacheRead: number;
cacheWrite: number;
/** Bounded tier: `[start, end)`. Open-ended top tier: `[start]` (normalized to `[start, Infinity]` at load time). */
range: [number, number] | [number];
}>;
```
How can I resolve this? If you propose a fix, please make it concise.

Reviews (1) — Last reviewed commit: "feat: add support for tiered model prici..."
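Beyond the type widening, the load-time normalization mentioned above can be sketched as follows. The shapes are assumed for illustration; the repo's actual `normalizeTieredPricing` may differ.

```typescript
type TierRange = [number, number] | [number];

interface PricingTier {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  range: TierRange;
}

// Normalize the `[start]` shorthand to `[start, Infinity]` so that
// downstream cost math only ever sees bounded two-element ranges.
function normalizeTierRange(tier: PricingTier): PricingTier & { range: [number, number] } {
  const [start, end] = tier.range;
  return { ...tier, range: [start, end ?? Number.POSITIVE_INFINITY] };
}
```

With the widened `[number, number] | [number]` type, TypeScript callers can write `range: [128_000]` and this normalization makes it equivalent to `[128_000, Infinity]`.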
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 344e4eb584
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f32a6c53c5
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab4b3724f9
```ts
fetchOpenRouterPricingCatalog(fetchImpl).catch((error: unknown) => {
  log.warn(`OpenRouter pricing fetch failed: ${String(error)}`);
  return new Map<string, OpenRouterPricingEntry>();
}),
```
Preserve existing entries when one pricing source fails
Catching a catalog fetch error and returning an empty map makes a transient source outage look like a successful empty response; if the other source still returns at least one model, the later full-cache replacement drops every model that depended on the failed source. In a mixed config (for example, OpenRouter-only models plus a LiteLLM-tiered model), a single OpenRouter failure will silently remove OpenRouter prices until the next successful refresh, regressing cost lookups despite having a previously healthy cache.
Fixed — The `.catch()` handlers now track which source failed (`openRouterFailed` / `litellmFailed`). On partial failure, models missing from `nextPricing` are back-filled from the existing cache, so a single-source outage no longer silently drops pricing for models that depended on the failed source. When both sources fail, the entire existing cache is retained as before.
```diff
@@ -36,6 +37,8 @@ type OpenRouterModelPayload = {
 export { getCachedGatewayModelPricing };

 const OPENROUTER_MODELS_URL = "https://openrouter.ai/api/v1/models";
 const LITELLM_PRICING_URL =
   "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json";
```
/nit this is technically not an official API, and although it should be stable, it's a bit odd to me to depend on it at runtime. Something like pulling the JSON into the repo during a build step would be more reliable. Not consequential, though, as we handle the failures gracefully.
Acknowledged — agreed it's not an official API. We handle fetch failures gracefully (empty catalog → retain existing cache), so the risk is limited. We can consider vendoring the JSON at build time in a follow-up if this becomes fragile.
```diff
@@ -2767,6 +2767,51 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
       cacheWrite: {
         type: "number",
       },
       tieredPricing: {
```
Looks like `docs/.generated/config-baseline.sha256` should be updated and committed.
```ts
  continue;
}
const inputInTier = Math.min(inputRemaining, tierWidth);
const fraction = inputInTier / totalInputTokens;
```
Does the billing actually apply proportionally within the request? Or is it a fixed cost per request based on the token threshold?
It's proportional. The algorithm splits the input tokens across tier boundaries, then uses each tier's share as a fraction to apportion output/cache costs proportionally. For example, with 40k input tokens and tiers at [0, 32k) and [32k, 128k): 80% of input falls in tier 1 and 20% in tier 2, so output and cache costs are also split 80/20 across those tiers. This matches how providers like Anthropic and Google bill — the per-token rate depends on which tier the token falls into, not a single fixed rate for the whole request.
I know the code applies it proportionally, but is this actually aligned with how most/all providers do the billing? I thought not.
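For reference, the proportional apportionment being discussed above can be sketched like this. The tier shape and per-million-token rates are assumptions for illustration, and cache costs are omitted; the repo's actual `computeTieredCost` may differ in detail.

```typescript
interface PricedTier {
  input: number; // USD per million input tokens (assumed unit)
  output: number; // USD per million output tokens (assumed unit)
  range: [number, number]; // input-token interval [start, end)
}

// Split input tokens across tier boundaries, then use each tier's
// input share as the fraction that apportions the output cost.
function tieredCost(inputTokens: number, outputTokens: number, tiers: PricedTier[]): number {
  if (inputTokens <= 0) return 0;
  let cost = 0;
  for (const tier of tiers) {
    const [start, end] = tier.range;
    // Input tokens falling inside this tier's interval.
    const inTier = Math.min(Math.max(inputTokens - start, 0), end - start);
    if (inTier <= 0) continue;
    const fraction = inTier / inputTokens; // this tier's share of input
    cost += (inTier * tier.input + outputTokens * fraction * tier.output) / 1_000_000;
  }
  return cost;
}
```

With 40k input tokens and tiers `[0, 32k)` and `[32k, 128k)`, 80% of input lands in tier 1 and 20% in tier 2, so the output cost is also split 80/20 across the two tiers' rates.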
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 850a516792
```ts
// Only use provider-qualified key to avoid cross-provider pricing collisions.
return params.catalog.get(`${params.ref.provider}/${params.ref.model}`);
```
Resolve LiteLLM tiers for wrapped OpenRouter refs
collectConfiguredModelPricingRefs includes wrapper-style refs like openrouter/anthropic/claude-sonnet-4-6, but this lookup only queries LiteLLM with the literal provider/model key from that ref. That means it searches for openrouter/anthropic/... instead of the underlying provider key (anthropic/...), so litellmPricing is missed and the merge path never adds tieredPricing for wrapped OpenRouter models. In practice, tiered pricing silently does not apply for those models even when LiteLLM has tier data.
Fixed — resolveLiteLLMPricingForRef now unwraps wrapper-provider refs (e.g. openrouter/anthropic/claude-sonnet-4-6 → looks up anthropic/claude-sonnet-4-6 in the LiteLLM catalog), consistent with how OpenRouter catalog lookups already handle nested refs. This ensures tiered pricing is correctly resolved for wrapped models.
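The unwrapping described in this fix can be sketched as follows. `lookupWithUnwrap` is a hypothetical helper under assumed catalog shapes; the real `resolveLiteLLMPricingForRef` may differ.

```typescript
// Try the provider-qualified key first; if the "model" segment itself
// contains a provider prefix (a wrapper-style ref such as
// "openrouter/anthropic/claude-sonnet-4-6"), retry under the
// underlying provider key ("anthropic/claude-sonnet-4-6").
function lookupWithUnwrap<T>(
  catalog: Map<string, T>,
  provider: string,
  model: string,
): T | undefined {
  const direct = catalog.get(`${provider}/${model}`);
  if (direct !== undefined) return direct;
  if (model.includes("/")) return catalog.get(model);
  return undefined;
}
```

This keeps the collision-avoiding provider-qualified lookup as the primary path while still resolving tier data for wrapped refs.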
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 74dbb06fa2
```ts
for (const result of results) {
  if (result.status === "fulfilled") {
    resolvedAccounts[result.value.accountId] = result.value.account;
    accounts.push(result.value.snapshot);
  }
```
Preserve default account snapshot when account tasks reject
Filtering Promise.allSettled results down to only fulfilled entries drops failed accounts entirely, which creates a new path where the default account snapshot is missing even though other accounts succeeded. In that case defaultAccount falls back to another account (accounts[0]), so buildChannelSummary can combine the default account id/object with a non-default snapshot and report incorrect health/config state for the default channel account.
Fixed — Rejected account tasks now produce a minimal degraded snapshot (configured: false, error) instead of being silently dropped. This ensures accounts.find() always locates the default account entry, preventing the fallback to accounts[0] from mixing a non-default account's snapshot with the default account's identity.
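The degraded-snapshot approach can be sketched like this. The snapshot shape is assumed; the sketch relies on `Promise.allSettled` preserving input order so rejected entries can be matched back to their account ids.

```typescript
interface AccountSnapshot {
  accountId: string;
  configured: boolean;
  error?: string;
}

// Map settled results back to snapshots one-for-one: fulfilled tasks
// pass through, rejected tasks become a minimal degraded snapshot
// instead of disappearing from the result set.
function collectSnapshots(
  accountIds: string[],
  settled: PromiseSettledResult<AccountSnapshot>[],
): AccountSnapshot[] {
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? result.value
      : { accountId: accountIds[i], configured: false, error: String(result.reason) },
  );
}
```

Because every input id yields an output entry, `accounts.find()` can always locate the default account, and the `accounts[0]` fallback never mixes identities.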
```ts
if (result.status === "fulfilled") {
  channelsMap[result.value.pluginId] = result.value.summary;
  accountsMap[result.value.pluginId] = result.value.accounts;
  defaultAccountIdMap[result.value.pluginId] = result.value.defaultAccountId;
}
```
Surface rejected channel-status jobs instead of dropping channels
At the channel level, rejected tasks from Promise.allSettled are silently ignored and the handler still returns success, so a thrown buildChannelAccounts/buildChannelSummary causes that plugin to disappear from channels, channelAccounts, and channelDefaultAccountId. Because the CLI formatter only prints channels present in these payload maps, users get an incomplete status report with no error signal instead of seeing the failed channel as degraded.
Force-pushed 74dbb06 to 9374604.
Landed in b938e63.

Maintainer follow-up before merge:
- Replayed the PR on current main without the old merge-from-main noise.
- Added latest bundled Moonshot Kimi K2.6/K2.5 cost estimates.
- Hardened tiered range billing, oversized remote pricing catalogs, and channel status probe concurrency/timeout.
- Added changelog credit.

Verification:
- pnpm test src/utils/usage-format.test.ts src/gateway/model-pricing-cache.test.ts src/gateway/server-methods/channels.status.test.ts extensions/moonshot/provider-catalog.test.ts
- pnpm check:changed
- pnpm check
- GitHub checks green

Thanks @sliverp.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 937460463c
```ts
if (!nextPricing.has(key)) {
  const existing = getCachedGatewayModelPricing({
```
Preserve tiered cache data during LiteLLM fetch outages
When litellmFailed is true, the refresh only back-fills models that are completely missing from nextPricing. For models that still resolve via OpenRouter, the key is present, so previously cached tieredPricing is dropped and replaced with flat-only pricing for the outage window. This causes temporary cost regressions (tiered → flat) even though valid tier data is already in memory; the fallback should also preserve existing tier metadata for overlapping keys when the LiteLLM source failed.
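The fallback this finding asks for can be sketched as follows. The cache entry shape and the `applyOutageFallback` name are assumptions: when the LiteLLM source failed, cached tier metadata is preserved for overlapping keys and keys that vanished entirely are back-filled.

```typescript
interface CachedPricing {
  input: number;
  output: number;
  tieredPricing?: unknown[];
}

// On a failed LiteLLM refresh: keep fresh flat rates where available,
// but carry forward previously cached tier metadata for keys that
// still resolved via OpenRouter, and back-fill keys that disappeared.
function applyOutageFallback(
  next: Map<string, CachedPricing>,
  previous: Map<string, CachedPricing>,
  litellmFailed: boolean,
): Map<string, CachedPricing> {
  if (!litellmFailed) return next;
  const merged = new Map(next);
  for (const [key, old] of previous) {
    const fresh = merged.get(key);
    if (!fresh) {
      merged.set(key, old); // model dropped by the outage
    } else if (!fresh.tieredPricing && old.tieredPricing) {
      merged.set(key, { ...fresh, tieredPricing: old.tieredPricing });
    }
  }
  return merged;
}
```

This avoids the temporary tiered-to-flat regression for models that overlap both sources during the outage window.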
Adds tiered model pricing support for cost tracking, keeps configured pricing ahead of cached catalog values, and includes the latest Moonshot Kimi K2.6/K2.5 cost estimates.

Thanks @sliverp.
Summary
The `computeTieredCost` function and `ModelCostConfig` type existed internally but were never wired into the config-to-display pipeline. `calculateCost()` in `@mariozechner/pi-ai` is untouched. Tiered cost recomputation happens at read time in `scanTranscriptFile`.

Change Type (select all)
Regression Test Plan (if applicable)
Tests: `src/config/zod-schema.core.test.ts`, `src/utils/usage-format.test.ts`, `src/infra/session-cost-usage.test.ts`

Cases: (1) `cost` with a valid `tieredPricing` array, (2) `resolveModelCost` preserves `tieredPricing` in output, (3) `scanTranscriptFile` computes cost using tiered pricing when configured.

User-visible / Behavior Changes
- Users can configure `tieredPricing` inside `cost` in `openclaw.json` model definitions.
- Extensions can ship `tieredPricing` in their model catalog `cost` objects.
- The `/usage` page displays tiered-pricing-computed costs when `tieredPricing` is configured.
- `tieredPricing` range format: `[start, end]` for bounded tiers, `[start]` for the unbounded top tier.
- When `tieredPricing` is present, it takes priority over flat-rate `input`/`output` fields.

Diagram (if applicable)
Security Impact (required)
Repro + Verification
Environment
Steps
1. Add `tieredPricing` to a model's `cost` in `openclaw.json`
2. Open the `/usage` page

Expected
Human Verification (required)
- `range: [128000]` (single-element unbounded tier), `tieredPricing`-only cost (no top-level input/output fields)
- `/usage` page after rebuild (pending rebuild + restart)

OpenRouter / LiteLLM Merge Priority Strategy
Design Principle
The gateway pricing cache merges data from two sources with different strengths:
Merge Rules
Priority Order
Rationale
LiteLLM's tiered data adds information that OpenRouter doesn't provide, so it takes priority. But for flat pricing, OpenRouter tends to be more accurate, so when both only offer flat rates, OpenRouter is the authoritative source.
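A minimal sketch of these merge rules follows. The entry shape and the `mergePricing` name are assumptions for illustration; the actual cache merge may differ.

```typescript
interface PricingEntry {
  input: number;
  output: number;
  tieredPricing?: unknown[];
}

// LiteLLM wins when it contributes tier data OpenRouter lacks: graft
// its tiers onto OpenRouter's flat rates. When both sources only
// offer flat rates, OpenRouter is authoritative.
function mergePricing(
  openRouter?: PricingEntry,
  litellm?: PricingEntry,
): PricingEntry | undefined {
  if (!openRouter) return litellm;
  if (!litellm) return openRouter;
  if (litellm.tieredPricing && litellm.tieredPricing.length > 0) {
    return { ...openRouter, tieredPricing: litellm.tieredPricing };
  }
  return openRouter;
}
```

Under this rule a model present in both catalogs keeps OpenRouter's flat `input`/`output` rates while gaining LiteLLM's tier boundaries, and single-source models fall through unchanged.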