docs(pricing-may-2026): customer-supplied inference (BYOK + CIE + BYOLLM) by hongyi-chen · Pull Request #115 · warpdotdev/docs

hongyi-chen · 2026-05-19T23:56:22Z

Part of the May 14, 2026 Warp pricing-and-packaging docs launch. Targets the umbrella branch hyc/plan-updates.

This PR carries the "bring your own" inference docs. Companion thematic PRs:

hyc/plan-updates-credits-billing — credits, billing, service-account model
hyc/plan-updates-plans-faqs-teams — plan summary, pricing FAQs, team-management consequences

What's covered

BYOK is now available on the Free plan (previously paid-only). Page rewritten to open eligibility, refresh model examples (Claude Opus 4.7, Claude Sonnet 4.6, GPT-5.5, Gemini 3.1 Pro), and add the BYOK/CIE/BYOLLM comparison table.
Custom Inference Endpoint (CIE) is a new feature for OpenAI Chat Completions–compatible endpoints (OpenRouter, LiteLLM, z.ai, internal gateways). New page added; sidebar entry added under Plans and billing.
BYOLLM reframed as Enterprise-only managed inference. AWS Bedrock is GA today; Google Vertex AI and Azure AI Foundry on the roadmap; approved internal gateways evaluated case-by-case. "Cloud-native credentials" key feature now covers IAM/OIDC across all three cloud providers, not just AWS.
10-employee org rule applies to BYOK and CIE: available to individuals and orgs with ≤10 employees; larger orgs need Business or Enterprise.
Platform-credits caveats woven into BYOK and CIE billing copy — on Business/Enterprise local agent runs, customer-supplied inference still consumes platform credits even though no AI credits are charged.
plans-and-billing/index.mdx updated to surface the new Custom inference endpoint page in the landing list.
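
The 10-employee rule above can be read as a small predicate. The following is a minimal sketch of that reading only, not Warp's implementation: the `Account` shape, the `SMALL_ORG_LIMIT` constant, and the `canUseByokOrCie` function are hypothetical names introduced for illustration, and the sketch ignores the Terms of Service and "eligible paid plans" qualifiers that the docs attach to the rule.

```typescript
// Hypothetical model of the eligibility rule described in this PR.
// None of these names come from Warp's codebase.
type Plan = "Free" | "Build" | "Max" | "Business" | "Enterprise";

interface Account {
  plan: Plan;
  // Left undefined for an individual user with no organization.
  orgEmployeeCount?: number;
}

// "10 or fewer employees" per the PR description.
const SMALL_ORG_LIMIT = 10;

function canUseByokOrCie(account: Account): boolean {
  const employees = account.orgEmployeeCount;
  // Individuals and small organizations qualify on any plan, including Free.
  if (employees === undefined || employees <= SMALL_ORG_LIMIT) {
    return true;
  }
  // Larger organizations need a Business or Enterprise plan.
  return account.plan === "Business" || account.plan === "Enterprise";
}
```

Under this reading, an 11-person organization on Build is not eligible, while the same organization on Business is.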

Files changed

src/content/docs/support-and-community/plans-and-billing/bring-your-own-api-key.mdx
src/content/docs/support-and-community/plans-and-billing/custom-inference-endpoint.mdx (new)
src/content/docs/enterprise/enterprise-features/bring-your-own-llm.mdx
src/sidebar.ts
src/content/docs/support-and-community/plans-and-billing/index.mdx

Editorial rules followed

No per-plan monthly credit counts hard-coded; references link to warp.dev/pricing.
Three-way BYOK / CIE / BYOLLM comparison table is consistent across all three pages.

Co-Authored-By: Oz <oz-agent@warp.dev>

…LLM)

Part of the May 14, 2026 pricing-and-packaging docs launch.

- BYOK is now available on the Free plan; page rewritten to open eligibility, refresh model examples, and add the BYOK/CIE/BYOLLM comparison table.
- New Custom Inference Endpoint (CIE) page for OpenAI Chat Completions–compatible endpoints (OpenRouter, LiteLLM, z.ai, internal gateways). Sidebar entry added under Plans and billing.
- BYOLLM reframed as Enterprise-only managed inference. AWS Bedrock GA; Google Vertex AI and Azure AI Foundry on the roadmap. Cloud-native credentials now span IAM/OIDC across all three cloud providers.
- 10-employee org rule applies to BYOK and CIE; larger orgs need Business or Enterprise.
- Platform-credits caveats: on Business/Enterprise local agent runs, customer-supplied inference still consumes platform credits even though no AI credits are charged.
- plans-and-billing/index.mdx updated to surface the new CIE page.

Co-Authored-By: Oz <oz-agent@warp.dev>

vercel · 2026-05-19T23:56:25Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | May 21, 2026 4:10am |

oz-for-oss · 2026-05-19T23:56:35Z

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

oz-for-oss

Overview

This PR adds and updates documentation for BYOK, Custom inference endpoint, and BYOLLM pricing/billing behavior.

Concerns

The new platform-credit caveats conflict with several blanket statements that customer-supplied inference consumes no credits. Those statements need to be narrowed to AI/inference credits so Business and Enterprise local agent runs are not documented as free of platform-credit usage.
The sidebar change removes the existing File locations page from Terminal > Settings file navigation even though the PR is scoped to pricing docs.
The comparison tables are not fully consistent across the BYOK/CIE/BYOLLM pages, despite the PR description calling out consistency as an editorial rule.

Verdict

Found: 0 critical, 4 important, 1 suggestion

Request changes


oz-for-oss · 2026-05-19T23:59:32Z


 When a request routes through BYOLLM:

 * **Warp does not consume credits** for that request.


⚠️ [IMPORTANT] This now conflicts with the new platform-credit note below: if BYOLLM local runs still consume platform credits, this bullet should say Warp does not consume AI/inference credits rather than credits generally.

oz-for-oss · 2026-05-19T23:59:32Z

+Warp supports **Bring your own API key (BYOK)** for users who want to connect Warp's agents to their own Anthropic, OpenAI, or Google API accounts.

-This lets you use your own API keys to access models directly, giving you full control over model selection, billing, and data routing. See [Model Choice](/agent-platform/capabilities/model-choice/) for a list of supported models.
+BYOK gives you full control over model selection, billing, and data routing. See [Model Choice](/agent-platform/capabilities/model-choice/) for the full list of supported models. When you route a request through your own key, Warp **never consumes your** [credits](/support-and-community/plans-and-billing/credits/) for that request.


⚠️ [IMPORTANT] The new note says Business and Enterprise local runs can still consume platform credits, so this blanket never consumes your credits wording should be narrowed to AI/inference credits.

oz-for-oss · 2026-05-19T23:59:32Z

+
+A **Custom inference endpoint (CIE)** lets you connect Warp's agents to any OpenAI-compatible inference endpoint, so you can route AI requests through your preferred model router, hosted gateway, or internal infrastructure.
+
+CIE is the right fit when you want to choose your provider, consolidate billing through a third-party router, or run inference behind your own gateway — without giving up the agent experience inside Warp. When a CIE is configured and selected, Warp **never consumes your** [credits](/support-and-community/plans-and-billing/credits/) for the request.


⚠️ [IMPORTANT] The platform-credit caveat later on the page contradicts this blanket never consumes your credits claim; qualify this as AI/inference credits so local Business and Enterprise runs are not documented as using zero platform credits.

oz-for-oss · 2026-05-19T23:59:32Z

 							items: [
 								{ slug: 'terminal/settings', label: 'Overview' },
 								{ slug: 'terminal/settings/all-settings', label: 'All settings reference' },
-								{ slug: 'terminal/settings/file-locations', label: 'File locations' },


⚠️ [IMPORTANT] This removes the existing File locations page from the Settings file sidebar in an otherwise pricing-focused PR; restore it unless the page is intentionally being removed from navigation.
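
An accidental navigation drop like this one can be caught by comparing the sidebar entries before and after a change. The sketch below is illustrative only: `SidebarItem` and `droppedSlugs` are hypothetical names, and it assumes a flat list of `{ slug, label }` items like the ones in the hunk above, whereas the real `src/sidebar.ts` is nested.

```typescript
// Hypothetical helper; assumes a flat list of sidebar items.
interface SidebarItem {
  slug: string;
  label: string;
}

// Returns the slugs present in `before` that no longer appear in `after`.
function droppedSlugs(before: SidebarItem[], after: SidebarItem[]): string[] {
  return before
    .filter((item) => !after.some((other) => other.slug === item.slug))
    .map((item) => item.slug);
}

// The three entries from the diff hunk above, before the change.
const beforeItems: SidebarItem[] = [
  { slug: "terminal/settings", label: "Overview" },
  { slug: "terminal/settings/all-settings", label: "All settings reference" },
  { slug: "terminal/settings/file-locations", label: "File locations" },
];

// The same list with the File locations entry removed.
const afterItems: SidebarItem[] = beforeItems.slice(0, 2);
```

Calling `droppedSlugs(beforeItems, afterItems)` returns the single slug `terminal/settings/file-locations`, which a reviewer can then confirm was or was not removed on purpose.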

oz-for-oss · 2026-05-19T23:59:33Z

+| Name | Meaning | Plans |
+| --- | --- | --- |
+| **[Bring your own API key](/support-and-community/plans-and-billing/bring-your-own-api-key/)** (BYOK) | Use your own API key for OpenAI, Anthropic, or Google models. Keys are stored locally on your device. | Free and all eligible paid plans |
+| **Custom inference endpoint** (CIE) | Connect Warp to an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway. | Free and all eligible paid plans |


💡 [SUGGESTION] This plan-availability cell does not match the BYOLLM page's comparison table, which includes the 10-employee org qualifier and Business/Enterprise requirement for larger orgs; keep the shared comparison table wording consistent across the three pages.

… + CIE platform-credits callouts

Both BYOK and CIE pages now spell out that self-serve billing for platform credits (including Business BYOK / CIE) doesn't start until July 1, 2026. Between May 14 and June 30, 2026, platform-credit consumption is visible in the Warp app's usage breakdown for transparency on Build, Max, and Business, but no platform credits are deducted from your Reload pool or counted against your spend cap. Enterprise plans are billed per contract from May 14 and aren't affected by this preview period.

Co-Authored-By: Oz <oz-agent@warp.dev>

Co-Authored-By: Oz <oz-agent@warp.dev>

Per launch direction: keep the Enterprise BYOLLM page largely unchanged for this launch. The BYOK/CIE/BYOLLM comparison still lives on the BYOK and CIE pages, so readers landing on either of those will see the three-way framing.

This restores:

- The original AWS-Bedrock-focused frontmatter description and opening paragraph (instead of the cross-provider reframing).
- The original 'BYOLLM currently supports AWS Bedrock only. Coming soon: Azure Foundry and Google Vertex support.' caveat.
- The original 'Cloud-native credentials - Authenticate using each user's AWS IAM identity' key feature.
- The original 'How is BYOLLM different from BYOK?' FAQ with its 4-row comparison table.
- The original Related resources list.

Drops the launch-era additions:

- The 'How BYOLLM differs from BYOK and Custom inference endpoint' section with the three-way comparison table.
- The :::note about centrally configured BYOK / CIE for Enterprise being a fast-follow.
- The :::note about platform credits for BYOLLM-routed local runs.

Co-Authored-By: Oz <oz-agent@warp.dev>

…remove preview-period notes, restore File locations sidebar

- Restore the 'File locations' sidebar entry under Settings file (added on main by PR #110, accidentally dropped during the rebase).
- Drop the 'CIE' abbreviation throughout the customer-supplied inference pages. Use the full name 'custom inference endpoint' (or 'your endpoint' / 'endpoint-routed model' in context) instead.
- Narrow the 'never consumes Warp credits' claim to 'doesn't consume AI credits' on the BYOK and custom inference endpoint pages, since Business / Enterprise local agent runs still consume platform credits.
- Rewrite the 'No Warp credits consumed' Key features bullet on the custom inference endpoint page so it accurately calls out the platform-credits caveat on Business / Enterprise.
- Drop the 'Self-serve preview period' paragraph from the platform-credits :::note callouts on the BYOK and custom inference endpoint pages. The July 1, 2026 cutover lives only in pricing-faqs.mdx now; canonical feature pages don't carry the launch-period detail.

Co-Authored-By: Oz <oz-agent@warp.dev>

…ce endpoint openings, consolidate plan notes

- Reframe the BYOK and custom inference endpoint opening copy around model selection and data routing instead of billing. Move the AI-credits-consumption details out of the intro and down into the dedicated billing sections where they belong.
- Collapse the two stacked :::note callouts about plan availability and the 10-or-fewer-employees rule into a single, briefer note on each page.
- Move the Business / Enterprise platform-credits caveat off the top of the BYOK page and into the 'Credit usage' subsection alongside the related credit details.
- Trim the 'BYOK on Enterprise and Business plans' section on the BYOK page so it doesn't restate the org-size rule already covered up top.
- Replace the redundant 'Plan availability' section on the custom inference endpoint page with a focused 'Centrally managed configuration' section that only covers what's still unique to that page (user-level config today, admin-managed coming later).
- Light copy polish on phrasing in both files.

Co-Authored-By: Oz <oz-agent@warp.dev>

…e credits claim to AI credits

Per follow-up review, undo the polish on the BYOK intro and restore the original three-paragraph opening verbatim:

- Title back to 'Bring Your Own API Key' (Title Case)
- 'Warp supports Bring Your Own Key (BYOK) for users who want to connect Warp's agents to their own Anthropic, OpenAI, or Google API accounts.'
- 'This lets you use your own API keys to access models directly, giving you full control over model selection, billing, and data routing. See Model Choice for a list of supported models.'
- 'BYOK provides greater flexibility in model access and ensures Warp never consumes your AI credits for requests routed through your own keys.'

The only substantive change vs the original is narrowing 'credits' to 'AI credits' in that last sentence, per earlier feedback that the unqualified 'never consumes Warp credits' claim is too broad now that Business / Enterprise local runs can consume platform credits.

The combined plan-availability + 10-employee :::note below the intro stays as-is. Everything below the intro (BYOK works, Enabling BYOK, billing behavior, Credit usage with the platform-credits note, ZDR, Enterprise/Business config, Related resources) is unchanged.

Co-Authored-By: Oz <oz-agent@warp.dev>

…laims in BYOK + CIE credit sections

- Drop the 'No AI credits are consumed' bullet and the 'credit transparency footer shows 0 credits used' sentence from BYOK's Credit usage subsection. Replaced with a more general framing that says inference is billed through your provider account rather than drawing from your Warp AI credits, alongside the existing platform credits caveat for Business / Enterprise.
- Same softening on the custom inference endpoint page's Warp AI credits subsection: collapse the three firm bullets into one general sentence and keep the platform-credits note.

This avoids the misleadingly absolute '0 credits' claim, which is inaccurate for Business / Enterprise local runs where platform credits can still apply.

Co-Authored-By: Oz <oz-agent@warp.dev>

…ad with powering Warp's agents

Mirror the BYOK page's intro pattern so it's explicit upfront that a custom inference endpoint is used to power Warp's agents. New opening:

Warp supports custom inference endpoints for users who want to power Warp's agents with any OpenAI-compatible inference endpoint: a model router, hosted gateway, or internal infrastructure they already run.

This lets you route AI requests through your preferred provider, run inference behind your own gateway, or use a router like OpenRouter or LiteLLM, while keeping the agent experience inside Warp.

No other changes.

Co-Authored-By: Oz <oz-agent@warp.dev>
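
For readers unfamiliar with the term, "OpenAI-compatible" generally means the endpoint accepts a Chat Completions request: a POST to `<base URL>/chat/completions` with a bearer token and a JSON body carrying `model` and `messages`. The sketch below only builds such a request and makes no claim about how Warp issues it. `buildChatCompletionRequest`, `PreparedRequest`, the gateway URL, and the model name are all hypothetical placeholders.

```typescript
// Illustrative only: the general shape of a Chat Completions request.
// The function and type names here are assumptions, not Warp APIs.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatCompletionRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  // Strip trailing slashes so the path is joined exactly once.
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Placeholder values; substitute your router or gateway details.
const exampleRequest = buildChatCompletionRequest(
  "https://gateway.example.com/v1/",
  "sk-placeholder",
  "example-model",
  [{ role: "user", content: "Summarize the last command's output." }],
);
```

The resulting object can be passed to `fetch(exampleRequest.url, exampleRequest)` to test whether a router or internal gateway answers in the expected format.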

- reference/cli/api-keys.mdx: drop the lingering '(with pay-as-you-go fallback if enabled in the contract)' parenthetical from the Personal API keys Enterprise bullet, matching the phrasing used elsewhere on this PR.
- credits.mdx: tighten BYOLLM scope to reflect actual launch state (AWS Bedrock today; Azure Foundry and Google Vertex coming soon) in both the AI/compute/platform overview and the platform-credits eligibility bullets. Also normalizes 'Amazon Bedrock' to 'AWS Bedrock' to match the canonical BYOLLM page.
- credits.mdx: refresh the example model list from Claude Opus 4.6 / 4.5 / GPT-5.4 / GPT-5.3 Codex / Gemini 3 Pro to the current set (Claude Opus 4.7, Claude Sonnet 4.6, GPT-5.5, Gemini 3.1 Pro) so PR #114 and PR #115's BYOK page cite the same models.

Co-Authored-By: Oz <oz-agent@warp.dev>

- bring-your-own-api-key.mdx intro: fix the wrong BYOK expansion ('Bring Your Own Key (BYOK)') to match the page title and standard usage ('Bring Your Own API Key (BYOK)').
- bring-your-own-api-key.mdx + custom-inference-endpoint.mdx comparison tables: tighten the BYOLLM row so it reflects current launch scope ('AWS Bedrock today; Azure Foundry and Google Vertex coming soon') instead of implying all three ship at launch.

Co-Authored-By: Oz <oz-agent@warp.dev>

tylerlam-warp

I didn't review everything (i.e. skipped the custom inference page) but left some comments

tylerlam-warp · 2026-05-21T03:41:19Z


 :::note
-BYOK is currently only available on Warp's paid plans, starting with Build. Learn more about plans and pricing [warp.dev/pricing](https://www.warp.dev/pricing).
+BYOK is available on Free and all eligible paid plans for individual users and organizations with 10 or fewer employees, subject to Warp's [Terms of Service](https://www.warp.dev/terms-of-service). Larger organizations need a Business or Enterprise plan. See [warp.dev/pricing](https://www.warp.dev/pricing) for current availability.


I kinda prefer "Larger organizations need a Business or Enterprise plan" to "Larger organizations require a Business or Enterprise plan" but that's just me

tylerlam-warp · 2026-05-21T03:43:45Z

+See [warp.dev/pricing](https://www.warp.dev/pricing) for current plan availability.

-## How does BYOK work?
+Platform credits may apply for local agent runs on Business and Enterprise when using BYOK, a custom inference endpoint, or BYOLLM. See [platform credits](/support-and-community/plans-and-billing/platform-credits/).


Don't platform credits also apply for cloud runs?

tylerlam-warp · 2026-05-21T03:44:31Z

 When you add your own model API keys in Warp, those keys are stored **locally on your device** and are **never synced to the cloud**.

-Warp uses these API keys to directly route your agent requests to the model provider you've configured.
+Warp uses these API keys to route your agent requests directly to the model provider you've configured.


Is this correct to say? Requests still technically go through our server right?

- bring-your-own-api-key.mdx 'Platform credits' note: Tyler correctly pointed out that platform credits also apply for cloud agent runs. Rewrite the line to lead with the cloud-agent case ('apply to every cloud agent run on any plan') and then cover the local-runs case ('and to local agent runs on Business and Enterprise when using BYOK, a custom inference endpoint, or BYOLLM').
- bring-your-own-api-key.mdx 'How BYOK works' opening: drop the misleading 'directly to the model provider' phrasing since requests still flow through Warp's infrastructure. Now reads 'Warp uses these API keys when routing your agent requests to the model provider you've configured.'

Tyler's third comment was a stylistic preference for 'need' over 'require' on the page note, which already uses 'need' here. The parallel 'require' phrasing in pricing-faqs.mdx will be normalized on PR #116.

Co-Authored-By: Oz <oz-agent@warp.dev>
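
The credit behavior the thread converged on can be summarized as two independent checks, one for AI credits and one for platform credits. This is a rough sketch of that reading rather than Warp's billing logic: `AgentRun`, `usesAiCredits`, and `platformCreditsMayApply` are hypothetical names, and the one case the note does not cover (a local run on Warp-provided inference) returns `false` here purely as an assumption.

```typescript
// Hypothetical model of the credit rules discussed in this PR.
type Plan = "Free" | "Build" | "Max" | "Business" | "Enterprise";
type RunLocation = "local" | "cloud";
type InferenceSource = "warp" | "byok" | "custom-endpoint" | "byollm";

interface AgentRun {
  plan: Plan;
  location: RunLocation;
  inference: InferenceSource;
}

// Customer-supplied inference (BYOK, custom endpoint, BYOLLM) is billed by
// the customer's provider, so it does not draw on Warp AI credits.
function usesAiCredits(run: AgentRun): boolean {
  return run.inference === "warp";
}

function platformCreditsMayApply(run: AgentRun): boolean {
  // Stated in the revised note: every cloud agent run, on any plan.
  if (run.location === "cloud") {
    return true;
  }
  // Stated in the revised note: local runs on Business and Enterprise
  // that use customer-supplied inference.
  const businessOrEnterprise =
    run.plan === "Business" || run.plan === "Enterprise";
  // Assumption: local runs on Warp-provided inference are not covered by
  // the note, so this sketch returns false for them.
  return businessOrEnterprise && run.inference !== "warp";
}
```

Keeping the two checks separate mirrors the review feedback: a BYOK run on Business can use no AI credits and still consume platform credits.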

Tyler's stylistic preference (raised on PR #115) was for 'need' over 'require' on the BYOK/CIE plan-availability sentences. The PR #115 page already uses 'need'; the equivalent phrasing in pricing-faqs.mdx ('Companies or organizations with more than 10 employees require a Warp Business or Enterprise plan to use these features') now reads 'need' in both places (BYOK FAQ + Custom Inference Endpoint FAQ).

Co-Authored-By: Oz <oz-agent@warp.dev>

Umbrella tracker for the May 21, 2026 pricing and packaging docs launch. Populates as the three thematic PRs land:

- #114 - Credits, billing, and cloud-agent team billing
- #115 - BYOK + custom inference endpoint + BYOLLM
- #116 - Plans summary, pricing FAQs, teams copy

Co-Authored-By: Oz <oz-agent@warp.dev>

cla-bot Bot added the cla-signed label May 19, 2026

vercel Bot had a problem deploying to Preview May 19, 2026 23:57 Failure

hongyi-chen mentioned this pull request May 19, 2026

docs(pricing-may-2026): May 14 2026 pricing + packaging launch #113

Closed

oz-for-oss Bot reviewed May 19, 2026

View reviewed changes

vercel Bot had a problem deploying to Preview May 20, 2026 00:09 Failure

docs(pricing-may-2026): correct launch date May 14 → May 21, 2026

d3c5b54

Co-Authored-By: Oz <oz-agent@warp.dev>

vercel Bot had a problem deploying to Preview May 20, 2026 18:01 Failure

vercel Bot had a problem deploying to Preview May 21, 2026 00:10 Failure

vercel Bot deployed to Preview May 21, 2026 00:17 View deployment

vercel Bot deployed to Preview May 21, 2026 00:25 View deployment

vercel Bot deployed to Preview May 21, 2026 00:28 View deployment

vercel Bot deployed to Preview May 21, 2026 00:31 View deployment

vercel Bot deployed to Preview May 21, 2026 00:33 View deployment

vercel Bot deployed to Preview May 21, 2026 01:50 View deployment

tylerlam-warp reviewed May 21, 2026

View reviewed changes

vercel Bot deployed to Preview May 21, 2026 04:10 View deployment

hongyi-chen mentioned this pull request May 21, 2026

docs(pricing-may-2026): May 21, 2026 pricing + packaging launch #119

Open

hongyi-chen merged commit 65cb4fe into hyc/plan-updates May 21, 2026
4 checks passed

hongyi-chen deleted the hyc/plan-updates-byok-cie-byollm branch May 21, 2026 04:49

hongyi-chen merged 11 commits into hyc/plan-updates from hyc/plan-updates-byok-cie-byollm
