Gen3 Admin Financial Controls Model Pricing
- Open Financial Controls from the Control Panel sidebar.
- Select the Model Pricing tab (route: /dashboard/billing?tab=models).
- Confirm each configured model from Models appears with a pricing status badge.
- Filter by Provider or Status when you need to focus on unresolved rows or one vendor.
- Use Reset selected to online defaults or Reset all to online defaults after catalog changes, then click Save model pricing when manual edits are complete.
- Ask a tenant owner to review the Billing tab under Observability to confirm settled usage reflects your policy.
Model pricing turns inference, embedding, speech, and image activity into infrastructure-credit burn tenants see in analytics. If prices drift from provider rate cards—or stay Unresolved or Unsupported—chargeback, budget warnings, and owner-facing billing summaries become misleading even when usage is healthy.
The Model Pricing workspace lives on the Control Panel Financial Controls route (/dashboard/billing, tab models). It prices every model registered in Models → Configured Models, including multi-capability models and compound routers such as Groq Compound.
For provider list prices and mapping examples, see Provider rate cards. For retained-content meters, see Storage pricing. For the full billing-policy surface, see Financial Controls.
| Area | Purpose |
|---|---|
| Search | Filter by provider name, model name, or model key (press Enter to apply) |
| Provider | Restrict to one inference provider |
| Status | Filter by pricing status or enabled/disabled |
| Sort | Order by provider, model, input price, output price, or status |
| Table | One row per configured model with expandable capability pricing |
| Bulk actions | Export, import, clear, reset online, save |
The summary bar reports how many model rows are configured, how many are active, and how many are selected on the current page.
| Status | Meaning | Typical next step |
|---|---|---|
| Auto-priced | GT AI OS resolved a price from an online catalog snapshot (OpenRouter and/or LiteLLM) | Review after provider price changes; re-run online reset if needed |
| Manual | An operator entered or imported prices, or disabled a row | Document Source / notes; keep aligned with your rate-card policy |
| Unresolved | No online source matched, or catalog fetch failed | Enter manual token or unit prices, enable the row, save; or fix model key/provider mapping |
| Unsupported | Online catalog has the model but not in a meter shape GT AI OS can apply (for example non-token image rates) | Enter manual unit price for the capability, enable, save |
Status badges appear on the model row. When pricingStatusReason is set (for example model not found in litellm catalog), it prints under the badge.
Rows that are Unresolved, Unsupported, or Disabled show guidance: enter manual pricing, keep Enabled on, and save to use the model for inference billing.
When you reset to online defaults, the Control Panel backend refreshes cached snapshots (about 12-hour TTL) and resolves each capability profile:
| Source | URL | Used when |
|---|---|---|
| OpenRouter catalog | https://openrouter.ai/api/v1/models | Provider type or name is OpenRouter |
| LiteLLM model catalog | https://api.litellm.ai/model_catalog | Primary non-OpenRouter resolution |
| LiteLLM raw JSON | https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json | Fallback when the API catalog does not match |
OpenRouter is consulted first for OpenRouter providers. Other providers use LiteLLM catalog, then the raw GitHub JSON. Manual rows are preserved on background sync unless you force an online reset.
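The lookup order described above can be sketched roughly as follows. This is an illustration only, not the Control Panel backend: the function names and the `"manual"` status string are hypothetical, and the assumption that OpenRouter providers fall through to the LiteLLM sources when the OpenRouter catalog has no match is inferred from "consulted first", not confirmed.

```python
OPENROUTER_CATALOG = "https://openrouter.ai/api/v1/models"
LITELLM_CATALOG = "https://api.litellm.ai/model_catalog"
LITELLM_RAW_JSON = (
    "https://raw.githubusercontent.com/BerriAI/litellm/main/"
    "model_prices_and_context_window.json"
)


def catalog_lookup_order(provider_type: str, provider_name: str = "") -> list:
    """Return catalog URLs in the order they would be consulted (hypothetical helper)."""
    names = (provider_type.strip().lower(), provider_name.strip().lower())
    litellm_chain = [LITELLM_CATALOG, LITELLM_RAW_JSON]
    if "openrouter" in names:
        # Assumption: OpenRouter providers fall back to LiteLLM after OpenRouter.
        return [OPENROUTER_CATALOG] + litellm_chain
    return litellm_chain


def should_overwrite(pricing_status: str, force_online_reset: bool) -> bool:
    """Manual rows survive background sync unless an online reset is forced."""
    if pricing_status == "manual":  # hypothetical status value
        return force_online_reset
    return True
```

Keeping the order in one small function makes it easy to check which source a given provider would hit first when a row stays Unresolved.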
For models whose inference provider is OpenRouter, GT AI OS uses two different price sources:
| Phase | Price source | Purpose |
|---|---|---|
| Pre-request reservation | Model pricing row (catalog-synced or manual) | Hold infrastructure credits before inference runs |
| Post-request settlement (chat) | OpenRouter usage.cost from the API response | Authoritative billed amount when present |
| Settlement fallback | Model pricing row | Used only when usage.cost is missing or zero |
Catalog sync does not set your bill. OpenRouter returns per-request usage.cost in chat completion responses (final chunk when streaming). The resource cluster passes that through at settlement (costSource: provider_passthrough). The Model Pricing workspace rates are fallback estimates for reservations and for the rare case where OpenRouter omits usage.cost.
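A minimal sketch of that settlement rule, assuming token counts and the pricing row are already known. The function name and the `"model_pricing_row"` label are hypothetical; only `provider_passthrough` comes from the text above.

```python
from decimal import Decimal
from typing import Optional, Tuple


def settle_chat_cost(
    usage_cost: Optional[Decimal],
    prompt_tokens: int,
    completion_tokens: int,
    input_per_million: Decimal,
    output_per_million: Decimal,
) -> Tuple[Decimal, str]:
    """Return (settled cost in USD, cost source) for one chat request."""
    # Prefer the provider-reported cost when it is present and non-zero.
    if usage_cost is not None and usage_cost > 0:
        return usage_cost, "provider_passthrough"
    # Otherwise fall back to the Model Pricing row (per-1M token rates).
    fallback = (
        Decimal(prompt_tokens) * input_per_million
        + Decimal(completion_tokens) * output_per_million
    ) / Decimal(1_000_000)
    return fallback, "model_pricing_row"  # hypothetical label
```

With made-up rates of 3 and 15 USD per 1M tokens, a request with 1,000 prompt tokens and 500 completion tokens settles at 0.0105 USD only when `usage_cost` is missing or zero; otherwise the reported cost is used unchanged.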
What the OpenRouter catalog import actually contains: GT AI OS reads pricing.prompt and pricing.completion from GET /api/v1/models. OpenRouter documents this top-level pricing object as the lowest listed prompt/completion price across all OpenRouter endpoints for that model slug — a catalog floor, not the price of any single routed request. Per-endpoint rates (which can be higher) live under GET /api/v1/models/{author}/{slug}/endpoints. OpenRouter’s model-list sort options (for example pricing-low-to-high) use a separate weighted average across pricing dimensions for ordering the catalog; that sort metric is not what GT AI OS imports.
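To make the catalog-floor import concrete, the sketch below converts one model entry's `pricing.prompt` and `pricing.completion` into per-1M rates. It assumes those fields are USD-per-token decimal strings, which is how the OpenRouter model list is commonly returned; the helper name and the sample entry (including its prices) are invented for illustration.

```python
from decimal import Decimal
from typing import Tuple


def floor_rates_per_million(model_entry: dict) -> Tuple[Decimal, Decimal]:
    """Convert per-token catalog prices to USD per 1M tokens (hypothetical helper)."""
    pricing = model_entry["pricing"]
    # Assumption: values are per-token USD amounts, usually encoded as strings.
    prompt_per_token = Decimal(str(pricing["prompt"]))
    completion_per_token = Decimal(str(pricing["completion"]))
    million = Decimal(1_000_000)
    return prompt_per_token * million, completion_per_token * million


# Made-up entry shaped like one item of the model list; not real prices.
sample_entry = {
    "id": "example-author/example-model",
    "pricing": {"prompt": "0.0000025", "completion": "0.00001"},
}
input_rate, output_rate = floor_rates_per_million(sample_entry)
```

The resulting rates are still a floor across endpoints, so they suit reservations and fallback settlement rather than reconciliation against a provider invoice.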
After Reset … to online defaults, auto-priced OpenRouter rows show a status reason explaining this fallback behavior. Token price inputs for OpenRouter models are labeled as catalog floor / fallback rates.
- Optionally filter to the provider or status you care about.
- Select rows with checkboxes, or rely on filter-scoped reset:
  - Reset selected to online defaults: only checked rows on the current result set.
  - Reset all matching to online defaults: every row matching active search/provider/status filters (confirmation dialog).
  - Reset all to online defaults: entire catalog when no filters are active.
- Read the notice summary (`auto-priced`, `unresolved`, `unsupported` counts).
- Review rows that stayed unresolved; add manual prices where needed.
- Click Save model pricing to persist. Reset updates server state immediately, but any cells you edited locally before the reset remain unsaved until you save.
Reset replaces manual prices when an online source resolves the row. It does not invent prices for unsupported metering shapes.
- Locate the model row (search or provider filter).
- For token I/O capabilities, set Input token price / 1M and Output token price / 1M.
- For audio duration, input characters, or image count capabilities, set the unit price field (audio price per hour, input characters per 1M, or price per generated image).
- Optionally fill Source / notes (`priceSource`) with your internal reference or provider doc link.
- Ensure Enabled is checked.
- Click Save model pricing.
Typing prices marks the capability Manual in the UI draft. Saving writes the catalog version tenants consume.
Each configured model can expose one or more capability rows derived from model capabilities in Models:
| Request type (`requestType`) | Operator label | Pricing method | Fields you edit |
|---|---|---|---|
| `chat` | chat | token I/O | Input / output per 1M tokens |
| `embed` | embed | token I/O | Input / output per 1M tokens (output often 0) |
| `image_analysis` | image analysis | token I/O | Input / output per 1M tokens |
| `transcription` | transcription | audio duration (`audio_seconds`) | Unit price (USD per hour; runtime meters audio seconds) |
| `speech_synthesis` | speech synthesis | input characters | Unit price (USD per 1M input characters) |
| `image_generation` | image generation | image count | Unit price (USD per generated image) |
Additional normalized types (translation, web_search) follow the same catalog rules when present on a model.
The Capabilities and Method columns summarize request type, pricing method, billing unit, and rounding policy (for example ceil to second for STT, whole image for image generation).
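The four pricing methods reduce to simple arithmetic. The sketch below shows one way to compute each, using the units and rounding named above (ceil to second for transcription, whole images for image generation). Function names are hypothetical, and the real runtime may apply further rounding that is not described here.

```python
import math
from decimal import Decimal

MILLION = Decimal(1_000_000)


def token_io_cost(input_tokens: int, output_tokens: int,
                  input_per_million: Decimal, output_per_million: Decimal) -> Decimal:
    """chat, embed, image_analysis: per-1M token prices."""
    return (Decimal(input_tokens) * input_per_million
            + Decimal(output_tokens) * output_per_million) / MILLION


def transcription_cost(audio_seconds: float, price_per_hour: Decimal) -> Decimal:
    """Audio duration: metered in seconds (rounded up), priced per hour."""
    billable_seconds = math.ceil(audio_seconds)
    return Decimal(billable_seconds) * price_per_hour / Decimal(3600)


def speech_synthesis_cost(input_characters: int, price_per_million_chars: Decimal) -> Decimal:
    """Input characters: priced per 1M characters."""
    return Decimal(input_characters) * price_per_million_chars / MILLION


def image_generation_cost(image_count: int, price_per_image: Decimal) -> Decimal:
    """Image count: whole images only."""
    return Decimal(image_count) * price_per_image
```

For example, 89.2 seconds of audio at a made-up 0.36 USD per hour bills as 90 seconds, or 0.009 USD.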
Groq Compound (groq/compound) and similar compound routers show:
- Groq Compound underlying model overrides — per-underlying input and output token price per 1M
- Groq Compound built-in tool pricing — per-invocation charges for tools reported in Compound inference responses (advanced search, basic search, visit website, code execution, browser automation)
Each underlying model lists display name, model key, whether pricing comes from a configured model row or manual override, and editable token prices. Built-in tool rows list the tool name/key and an editable price per invocation (defaults match Groq list pricing).
Configure prices for underlying models in Models when possible; use compound overrides when chargeback must differ from the child model catalog row or Groq tool list prices.
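As a rough illustration of how a compound request could be priced from these two sections, the sketch below sums token cost per underlying model and adds per-invocation tool charges. The input shapes, key names, and prices are hypothetical; the actual Compound response format and Groq list prices are not reproduced here.

```python
from decimal import Decimal
from typing import Dict, List, Tuple


def compound_request_cost(
    underlying_usage: List[Tuple[str, int, int]],
    tool_invocations: Dict[str, int],
    model_prices: Dict[str, Tuple[Decimal, Decimal]],
    tool_prices: Dict[str, Decimal],
) -> Decimal:
    """Sum underlying-model token cost and built-in tool charges.

    underlying_usage: (model_key, input_tokens, output_tokens) per underlying model
    model_prices:     model_key -> (input per 1M, output per 1M)
    tool_prices:      tool_key -> price per invocation
    """
    total = Decimal("0")
    for model_key, input_tokens, output_tokens in underlying_usage:
        input_rate, output_rate = model_prices[model_key]
        total += (Decimal(input_tokens) * input_rate
                  + Decimal(output_tokens) * output_rate) / Decimal(1_000_000)
    for tool_key, count in tool_invocations.items():
        total += Decimal(count) * tool_prices[tool_key]
    return total
```

A compound override simply swaps the entry in `model_prices` or `tool_prices` for that router, leaving the child model's own catalog row untouched.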
Export all or Export selected downloads a CSV with canonical columns:
modelId, modelKey, providerName, modelName, requestType, pricingMethod, billingUnit, unit, unitPrice, priceSource, pricingStatus, pricingStatusReason, inputPricePerMillion, outputPricePerMillion, currency, active
Import CSV merges into the current workspace by modelId or modelKey, optionally scoped with requestType for multi-capability models. Unmatched references are skipped and reported in the notice.
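The matching rule can be sketched as below, assuming workspace rows and CSV rows are plain dictionaries keyed by the canonical column names. The function name and the `"manual"` status value are hypothetical, and updating every capability row when `requestType` is omitted is an assumption about the real merge, not documented behavior.

```python
PRICE_FIELDS = ("inputPricePerMillion", "outputPricePerMillion", "unitPrice", "priceSource")


def merge_import(rows: list, imported: list) -> list:
    """Merge imported CSV rows into workspace rows in place; return unmatched rows."""
    unmatched = []
    for incoming in imported:
        model_id = incoming.get("modelId")
        model_key = incoming.get("modelKey")
        # Match on modelId or modelKey, whichever the CSV row provides.
        matches = [
            row for row in rows
            if (model_id and row.get("modelId") == model_id)
            or (model_key and row.get("modelKey") == model_key)
        ]
        # Optionally narrow to one capability on multi-capability models.
        request_type = incoming.get("requestType")
        if request_type:
            matches = [row for row in matches if row.get("requestType") == request_type]
        if not matches:
            unmatched.append(incoming)  # skipped and reported in the notice
            continue
        for row in matches:
            for field in PRICE_FIELDS:
                value = incoming.get(field)
                if value not in (None, ""):
                    row[field] = value
            row["pricingStatus"] = "manual"  # imported prices count as Manual
    return unmatched
```

Returning the unmatched rows mirrors the import notice, which is the place to look before saving.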
Aliases accepted on import include provider, input_price_per_million, output_price_per_million, pricing_status, enabled → active, and similar snake_case forms.
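One way to picture the alias handling is a header normalizer that maps accepted names onto the canonical export columns. This is a sketch under assumptions: the helper names are hypothetical, `provider` is assumed to map to `providerName`, and the generic snake_case conversion stands in for the "similar snake_case forms" mentioned above.

```python
from typing import Optional

CANONICAL_COLUMNS = [
    "modelId", "modelKey", "providerName", "modelName", "requestType",
    "pricingMethod", "billingUnit", "unit", "unitPrice", "priceSource",
    "pricingStatus", "pricingStatusReason", "inputPricePerMillion",
    "outputPricePerMillion", "currency", "active",
]

# Aliases that are not plain snake_case versions of a canonical name.
EXPLICIT_ALIASES = {"provider": "providerName", "enabled": "active"}


def snake_to_camel(name: str) -> str:
    """input_price_per_million -> inputPricePerMillion"""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def normalize_header(name: str) -> Optional[str]:
    """Return the canonical column for a CSV header, or None if unrecognized."""
    key = name.strip()
    if key in CANONICAL_COLUMNS:
        return key
    if key in EXPLICIT_ALIASES:
        return EXPLICIT_ALIASES[key]
    camel = snake_to_camel(key)
    return camel if camel in CANONICAL_COLUMNS else None
```

Running a spreadsheet's header row through such a check before upload is a quick way to spot columns the import would ignore.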
After import: review statuses, fix unmatched keys, click Save model pricing.
Clear selected removes prices on checked rows (sets unresolved-style empty pricing in the draft); save only if that is intentional.
| Action | Save required? |
|---|---|
| Edit input/output/unit prices in the grid | Yes — Save model pricing |
| Toggle Enabled | Yes |
| Import CSV | Yes (import updates draft state; notice reminds you) |
| Reset to online defaults | Server applies reset immediately; still review and save if you combine with local edits |
| Export CSV | No |
Read-only sessions (non–Super Admin or read-only license posture) disable save, reset, and import.
Configuration is Control Panel–only. Validation is tenant-side:
- Ensure Financial controls are enabled and infrastructure credits are funded on Financial Controls → Infrastructure Balance.
- Sign in as a tenant owner.
- Open Observability → Billing tab.
- Compare model and storage breakdowns to the prices you configured after representative usage.
Managers and tenant users do not see the Billing tab; use an owner account for verification.
- Reconcile model pricing whenever Models gains or changes providers.
- Run Reset all to online defaults after major OpenRouter or LiteLLM catalog updates, then manually fix remaining Unsupported rows.
- Keep Source / notes populated for manual rows (contract ID, rate-card date, or URL).
- Price embeddings and vision capabilities separately when the same model key serves multiple request types.
- Set local/Ollama models to `$0.00` input and output when you intentionally exclude pass-through API cost (see Provider rate cards).
- Confirm the model exists and is enabled under Models.
- Check provider type and model key match the upstream catalog (OpenRouter slugs vs LiteLLM keys).
- Retry Reset … to online defaults after cluster egress to the catalog URLs is restored.
- Enter manual prices, enable the row, save.
- Common when OpenRouter lists image pricing not expressed as per-image counts, or LiteLLM lists transcription/TTS in a non-metered shape.
- Enter the correct unit price for that capability method and save as Manual.
- Verify the row is Enabled and saved.
- Confirm tenants use the priced model ID, not an alias bypassing the catalog.
- Check capability row (chat vs embed) matches the request type actually metered.
- Include `modelKey` or `modelId` exactly as in the workspace.
- Add `requestType` when importing a non-chat capability on a multi-capability model.
From Financial Controls → Pricing Guides or the ? shelf, ask GT Helper questions such as:
- “How do I fix unresolved pricing for Azure OpenAI gpt-4o?”
- “What CSV columns do I need to bulk-update embedding prices?”
- “When should I reset model pricing to online defaults?”