docs(pricing-may-2026): customer-supplied inference (BYOK + CIE + BYOLLM) #115
…LLM)
Part of the May 14, 2026 pricing-and-packaging docs launch.
- BYOK is now available on the Free plan; page rewritten to open eligibility, refresh model examples, and add the BYOK/CIE/BYOLLM comparison table.
- New Custom Inference Endpoint (CIE) page for OpenAI Chat Completions-compatible endpoints (OpenRouter, LiteLLM, z.ai, internal gateways). Sidebar entry added under Plans and billing.
- BYOLLM reframed as Enterprise-only managed inference. AWS Bedrock GA; Google Vertex AI and Azure AI Foundry on the roadmap. Cloud-native credentials now span IAM/OIDC across all three cloud providers.
- 10-employee org rule applies to BYOK and CIE; larger orgs need Business or Enterprise.
- Platform-credits caveats: on Business/Enterprise local agent runs, customer-supplied inference still consumes platform credits even though no AI credits are charged.
- plans-and-billing/index.mdx updated to surface the new CIE page.
Co-Authored-By: Oz <oz-agent@warp.dev>
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request.
Overview
This PR adds and updates documentation for BYOK, Custom inference endpoint, and BYOLLM pricing/billing behavior.
Concerns
- The new platform-credit caveats conflict with several blanket statements that customer-supplied inference consumes no credits. Those statements need to be narrowed to AI/inference credits so Business and Enterprise local agent runs are not documented as free of platform-credit usage.
- The sidebar change removes the existing File locations page from Terminal > Settings file navigation even though the PR is scoped to pricing docs.
- The comparison tables are not fully consistent across the BYOK/CIE/BYOLLM pages, despite the PR description calling out consistency as an editorial rule.
Verdict
Found: 0 critical, 4 important, 1 suggestion
Request changes
|
|
||
| When a request routes through BYOLLM: | ||
|
|
||
| * **Warp does not consume credits** for that request. |
| Warp supports **Bring your own API key (BYOK)** for users who want to connect Warp's agents to their own Anthropic, OpenAI, or Google API accounts. | ||
|
|
||
| This lets you use your own API keys to access models directly, giving you full control over model selection, billing, and data routing. See [Model Choice](/agent-platform/capabilities/model-choice/) for a list of supported models. | ||
| BYOK gives you full control over model selection, billing, and data routing. See [Model Choice](/agent-platform/capabilities/model-choice/) for the full list of supported models. When you route a request through your own key, Warp **never consumes your** [credits](/support-and-community/plans-and-billing/credits/) for that request. |
The "never consumes your credits" wording should be narrowed to AI/inference credits.
|
|
||
| A **Custom inference endpoint (CIE)** lets you connect Warp's agents to any OpenAI-compatible inference endpoint, so you can route AI requests through your preferred model router, hosted gateway, or internal infrastructure. | ||
|
|
||
| CIE is the right fit when you want to choose your provider, consolidate billing through a third-party router, or run inference behind your own gateway — without giving up the agent experience inside Warp. When a CIE is configured and selected, Warp **never consumes your** [credits](/support-and-community/plans-and-billing/credits/) for the request. |
The "never consumes your credits" claim is too broad; qualify it as AI/inference credits so that local Business and Enterprise runs are not documented as using zero platform credits.
| items: [ | ||
| { slug: 'terminal/settings', label: 'Overview' }, | ||
| { slug: 'terminal/settings/all-settings', label: 'All settings reference' }, | ||
| { slug: 'terminal/settings/file-locations', label: 'File locations' }, |
| | Name | Meaning | Plans | | ||
| | --- | --- | --- | | ||
| | **[Bring your own API key](/support-and-community/plans-and-billing/bring-your-own-api-key/)** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans | | ||
| | **Custom inference endpoint** (CIE) | Connect Warp to an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway. | Free and all eligible paid plans | |
💡 [SUGGESTION] This plan-availability cell does not match the BYOLLM page's comparison table, which includes the 10-employee org qualifier and Business/Enterprise requirement for larger orgs; keep the shared comparison table wording consistent across the three pages.
… + CIE platform-credits callouts
Both BYOK and CIE pages now spell out that self-serve billing for platform credits (including Business BYOK / CIE) doesn't start until July 1, 2026. Between May 14 and June 30, 2026, platform-credit consumption is visible in the Warp app's usage breakdown for transparency on Build, Max, and Business, but no platform credits are deducted from your Reload pool or counted against your spend cap. Enterprise plans are billed per contract from May 14 and aren't affected by this preview period.
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
Per launch direction: keep the Enterprise BYOLLM page largely unchanged for this launch. The BYOK/CIE/BYOLLM comparison still lives on the BYOK and CIE pages, so readers landing on either of those will see the three-way framing.
This restores:
- The original AWS-Bedrock-focused frontmatter description and opening paragraph (instead of the cross-provider reframing).
- The original 'BYOLLM currently supports AWS Bedrock only. Coming soon: Azure Foundry and Google Vertex support.' caveat.
- The original 'Cloud-native credentials - Authenticate using each user's AWS IAM identity' key feature.
- The original 'How is BYOLLM different from BYOK?' FAQ with its 4-row comparison table.
- The original Related resources list.
Drops the launch-era additions:
- The 'How BYOLLM differs from BYOK and Custom inference endpoint' section with the three-way comparison table.
- The :::note about centrally configured BYOK / CIE for Enterprise being a fast-follow.
- The :::note about platform credits for BYOLLM-routed local runs.
Co-Authored-By: Oz <oz-agent@warp.dev>
…remove preview-period notes, restore File locations sidebar
- Restore the 'File locations' sidebar entry under Settings file (added on main by PR #110, accidentally dropped during the rebase).
- Drop the 'CIE' abbreviation throughout the customer-supplied inference pages. Use the full name 'custom inference endpoint' (or 'your endpoint' / 'endpoint-routed model' in context) instead.
- Narrow the 'never consumes Warp credits' claim to 'doesn't consume AI credits' on the BYOK and custom inference endpoint pages, since Business / Enterprise local agent runs still consume platform credits.
- Rewrite the 'No Warp credits consumed' Key features bullet on the custom inference endpoint page so it accurately calls out the platform-credits caveat on Business / Enterprise.
- Drop the 'Self-serve preview period' paragraph from the platform-credits :::note callouts on the BYOK and custom inference endpoint pages. The July 1, 2026 cutover lives only in pricing-faqs.mdx now; canonical feature pages don't carry the launch-period detail.
Co-Authored-By: Oz <oz-agent@warp.dev>
…ce endpoint openings, consolidate plan notes
- Reframe the BYOK and custom inference endpoint opening copy around model selection and data routing instead of billing. Move the AI-credits-consumption details out of the intro and down into the dedicated billing sections where they belong.
- Collapse the two stacked :::note callouts about plan availability and the 10-or-fewer-employees rule into a single, briefer note on each page.
- Move the Business / Enterprise platform-credits caveat off the top of the BYOK page and into the 'Credit usage' subsection alongside the related credit details.
- Trim the 'BYOK on Enterprise and Business plans' section on the BYOK page so it doesn't restate the org-size rule already covered up top.
- Replace the redundant 'Plan availability' section on the custom inference endpoint page with a focused 'Centrally managed configuration' section that only covers what's still unique to that page (user-level config today, admin-managed coming later).
- Light copy polish on phrasing in both files.
Co-Authored-By: Oz <oz-agent@warp.dev>
…e credits claim to AI credits
Per follow-up review, undo the polish on the BYOK intro and restore the original three-paragraph opening verbatim:
- Title back to 'Bring Your Own API Key' (Title Case)
- 'Warp supports Bring Your Own Key (BYOK) for users who want to connect Warp's agents to their own Anthropic, OpenAI, or Google API accounts.'
- 'This lets you use your own API keys to access models directly, giving you full control over model selection, billing, and data routing. See Model Choice for a list of supported models.'
- 'BYOK provides greater flexibility in model access and ensures Warp never consumes your AI credits for requests routed through your own keys.'
The only substantive change vs the original is narrowing 'credits' to 'AI credits' in that last sentence, per earlier feedback that the unqualified 'never consumes Warp credits' claim is too broad now that Business / Enterprise local runs can consume platform credits.
The combined plan-availability + 10-employee :::note below the intro stays as-is. Everything below the intro (BYOK works, Enabling BYOK, billing behavior, Credit usage with the platform-credits note, ZDR, Enterprise/Business config, Related resources) is unchanged.
Co-Authored-By: Oz <oz-agent@warp.dev>
…laims in BYOK + CIE credit sections
- Drop the 'No AI credits are consumed' bullet and the 'credit transparency footer shows 0 credits used' sentence from BYOK's Credit usage subsection. Replaced with a more general framing that says inference is billed through your provider account rather than drawing from your Warp AI credits, alongside the existing platform credits caveat for Business / Enterprise.
- Same softening on the custom inference endpoint page's Warp AI credits subsection: collapse the three firm bullets into one general sentence and keep the platform-credits note.
This avoids the misleadingly absolute '0 credits' claim, which is inaccurate for Business / Enterprise local runs where platform credits can still apply.
Co-Authored-By: Oz <oz-agent@warp.dev>
…ad with powering Warp's agents
Mirror the BYOK page's intro pattern so it's explicit upfront that a custom inference endpoint is used to power Warp's agents.
New opening:
Warp supports custom inference endpoints for users who want to power Warp's agents with any OpenAI-compatible inference endpoint: a model router, hosted gateway, or internal infrastructure they already run. This lets you route AI requests through your preferred provider, run inference behind your own gateway, or use a router like OpenRouter or LiteLLM, while keeping the agent experience inside Warp.
No other changes.
Co-Authored-By: Oz <oz-agent@warp.dev>
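For reviewers unfamiliar with the term, "OpenAI-compatible" in the opening above generally means an endpoint that accepts Chat Completions-style requests. The sketch below only builds such a request and does not send it. The gateway URL, API key, and model name are placeholders invented for illustration, and nothing here describes how Warp itself constructs or routes requests.

```typescript
// Illustrative only: builds (does not send) a Chat Completions-style request.
// The base URL, key, and model name used below are hypothetical placeholders.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatCompletionsRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  // Strip trailing slashes so the path joins cleanly.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // Chat Completions requests carry a model identifier and a message list.
    body: JSON.stringify({ model, messages }),
  };
}

const exampleRequest = buildChatCompletionsRequest(
  "https://gateway.example.com/v1/",
  "sk-placeholder",
  "example-model",
  [{ role: "user", content: "Summarize the last command's output." }],
);
```

An endpoint that accepts this shape (a bearer token, a `model` string, and a `messages` array posted to a `/chat/completions` path) is what routers such as OpenRouter or LiteLLM typically expose, which is why one configuration format can cover all of them.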
- reference/cli/api-keys.mdx: drop the lingering '(with pay-as-you-go fallback if enabled in the contract)' parenthetical from the Personal API keys Enterprise bullet, matching the phrasing used elsewhere on this PR.
- credits.mdx: tighten BYOLLM scope to reflect actual launch state (AWS Bedrock today; Azure Foundry and Google Vertex coming soon) in both the AI/compute/platform overview and the platform-credits eligibility bullets. Also normalizes 'Amazon Bedrock' to 'AWS Bedrock' to match the canonical BYOLLM page.
- credits.mdx: refresh the example model list from Claude Opus 4.6 / 4.5 / GPT-5.4 / GPT-5.3 Codex / Gemini 3 Pro to the current set (Claude Opus 4.7, Claude Sonnet 4.6, GPT-5.5, Gemini 3.1 Pro) so PR #114 and PR #115's BYOK page cite the same models.
Co-Authored-By: Oz <oz-agent@warp.dev>
- bring-your-own-api-key.mdx intro: fix the wrong BYOK expansion
('Bring Your Own Key (BYOK)') to match the page title and standard
usage ('Bring Your Own API Key (BYOK)').
- bring-your-own-api-key.mdx + custom-inference-endpoint.mdx
comparison tables: tighten the BYOLLM row so it reflects current
launch scope ('AWS Bedrock today; Azure Foundry and Google Vertex
coming soon') instead of implying all three ship at launch.
Co-Authored-By: Oz <oz-agent@warp.dev>
tylerlam-warp left a comment
I didn't review everything (i.e. skipped the custom inference page) but left some comments
|
|
||
| :::note | ||
| BYOK is currently only available on Warp's paid plans, starting with Build. Learn more about plans and pricing [warp.dev/pricing](https://www.warp.dev/pricing). | ||
| BYOK is available on Free and all eligible paid plans for individual users and organizations with 10 or fewer employees, subject to Warp's [Terms of Service](https://www.warp.dev/terms-of-service). Larger organizations need a Business or Enterprise plan. See [warp.dev/pricing](https://www.warp.dev/pricing) for current availability. |
I kinda prefer "Larger organizations need a Business or Enterprise plan" to "Larger organizations require a Business or Enterprise plan" but that's just me
| See [warp.dev/pricing](https://www.warp.dev/pricing) for current plan availability. | ||
|
|
||
| ## How does BYOK work? | ||
| Platform credits may apply for local agent runs on Business and Enterprise when using BYOK, a custom inference endpoint, or BYOLLM. See [platform credits](/support-and-community/plans-and-billing/platform-credits/). |
Don't platform credits also apply for cloud runs?
| When you add your own model API keys in Warp, those keys are stored **locally on your device** and are **never synced to the cloud**. | ||
|
|
||
| Warp uses these API keys to directly route your agent requests to the model provider you've configured. | ||
| Warp uses these API keys to route your agent requests directly to the model provider you've configured. |
Is this correct to say? Requests still technically go through our server right?
- bring-your-own-api-key.mdx 'Platform credits' note: Tyler correctly
pointed out that platform credits also apply for cloud agent runs.
Rewrite the line to lead with the cloud-agent case ('apply to every
cloud agent run on any plan') and then cover the local-runs case
('and to local agent runs on Business and Enterprise when using
BYOK, a custom inference endpoint, or BYOLLM').
- bring-your-own-api-key.mdx 'How BYOK works' opening: drop the
misleading 'directly to the model provider' phrasing since requests
still flow through Warp's infrastructure. Now reads 'Warp uses
these API keys when routing your agent requests to the model
provider you've configured.'
Tyler's third comment was a stylistic preference for 'need' over
'require' on the page note, which already uses 'need' here. The
parallel 'require' phrasing in pricing-faqs.mdx will be normalized on
PR #116.
Co-Authored-By: Oz <oz-agent@warp.dev>
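The rewritten platform-credits note can be restated as a small predicate, which makes the cloud and local cases easier to compare. This is a sketch of the rule as worded in the commit message above, not Warp's billing logic; the type and function names are hypothetical, and the function assumes the run already uses customer-supplied inference (BYOK, a custom inference endpoint, or BYOLLM).

```typescript
// Hypothetical names; encodes only the rule as stated in the commit message.
type BillingPlan = "free" | "build" | "max" | "business" | "enterprise";
type RunType = "local" | "cloud";

// Assumes the run uses customer-supplied inference
// (BYOK, a custom inference endpoint, or BYOLLM).
function platformCreditsMayApply(plan: BillingPlan, runType: RunType): boolean {
  // Cloud agent runs: platform credits apply on any plan.
  if (runType === "cloud") {
    return true;
  }
  // Local agent runs: the note lists only Business and Enterprise.
  return plan === "business" || plan === "enterprise";
}
```

The sketch returns `false` for local runs on other plans only because the note does not list them; it says nothing about AI credits, which the pages treat separately.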
Tyler's stylistic preference (raised on PR #115) was for 'need' over 'require' on the BYOK/CIE plan-availability sentences. The PR #115 page already uses 'need'; the equivalent phrasing in pricing-faqs.mdx ('Companies or organizations with more than 10 employees require a Warp Business or Enterprise plan to use these features') now reads 'need' in both places (BYOK FAQ + Custom Inference Endpoint FAQ). Co-Authored-By: Oz <oz-agent@warp.dev>
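The org-size rule behind the 'need' sentence reduces to a single threshold check. The following is a minimal sketch under stated assumptions: the names are hypothetical, every listed plan is treated as an "eligible" plan (the pages do not enumerate which paid plans qualify), and the rule is applied to BYOK and custom inference endpoints only, since BYOLLM is Enterprise-only.

```typescript
// Hypothetical names; a sketch of the 10-employee rule for BYOK and
// custom inference endpoints, not an authoritative eligibility check.
type OrgPlan = "free" | "build" | "max" | "business" | "enterprise";

const MAX_EMPLOYEES_WITHOUT_BUSINESS = 10;

function meetsOrgSizeRule(plan: OrgPlan, employeeCount: number): boolean {
  // Individuals and organizations with 10 or fewer employees: any plan.
  if (employeeCount <= MAX_EMPLOYEES_WITHOUT_BUSINESS) {
    return true;
  }
  // Larger organizations need a Business or Enterprise plan.
  return plan === "business" || plan === "enterprise";
}
```

The threshold is inclusive: an organization with exactly 10 employees still qualifies on any plan, while one with 11 needs Business or Enterprise.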
Umbrella tracker for the May 21, 2026 pricing and packaging docs launch. Populates as the three thematic PRs land: - #114 - Credits, billing, and cloud-agent team billing - #115 - BYOK + custom inference endpoint + BYOLLM - #116 - Plans summary, pricing FAQs, teams copy Co-Authored-By: Oz <oz-agent@warp.dev>
Part of the May 14, 2026 Warp pricing-and-packaging docs launch. Targets the umbrella branch hyc/plan-updates.
This PR carries the "bring your own" inference docs. Companion thematic PRs:
- hyc/plan-updates-credits-billing: credits, billing, service-account model
- hyc/plan-updates-plans-faqs-teams: plan summary, pricing FAQs, team-management consequences
What's covered
- plans-and-billing/index.mdx updated to surface the new Custom inference endpoint page in the landing list.
Files changed
- src/content/docs/support-and-community/plans-and-billing/bring-your-own-api-key.mdx
- src/content/docs/support-and-community/plans-and-billing/custom-inference-endpoint.mdx (new)
- src/content/docs/enterprise/enterprise-features/bring-your-own-llm.mdx
- src/sidebar.ts
- src/content/docs/support-and-community/plans-and-billing/index.mdx
Editorial rules followed
warp.dev/pricing.
Co-Authored-By: Oz <oz-agent@warp.dev>