OpenCode requests unaffordable output budget for gpt-5.4 in normal serve/web sessions and fails with 402 #22158

@rosspeoples

Description
When OpenCode is configured to use gpt-5.4 through an OpenAI-compatible provider path, normal prompts in serve/web mode can fail immediately because OpenCode requests an output budget that is too large for the available model credits.

In our repro, even a trivial prompt failed because OpenCode requested up to 32000 output tokens. The provider returned a 402 error indicating the request required more credits or fewer max tokens.

Representative error returned through our provider gateway:

This request requires more credits, or fewer max_tokens. You requested up to 32000 tokens, but can only afford 8421.

Important control case:

  • the same model (gpt-5.4) worked correctly when we called the provider gateway directly and explicitly capped max_tokens to a smaller value like 512
  • so the model itself was usable in our environment; the failure was specifically OpenCode's default request shape/output budget for that model in serve/web usage
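The control call above was a plain OpenAI-compatible chat-completions request with an explicit output cap. A minimal sketch of the request body we sent is below; the helper name is illustrative, and only the model name, prompt, and cap come from this report:

```python
def build_chat_request(model: str, prompt: str, max_tokens: int) -> dict:
    """Build an OpenAI-compatible /chat/completions request body.

    The explicit max_tokens cap is the only difference from the failing
    OpenCode-generated request, which asked for up to 32000 output tokens.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Capping output tokens kept the request affordable for our credits.
        "max_tokens": max_tokens,
    }

body = build_chat_request(
    "gpt-5.4",
    "What is 2 plus 2? Reply in one sentence.",
    512,
)
```

Posting this body directly to the provider gateway succeeded, while the same prompt through OpenCode failed with the 402 error above.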

This made gpt-5.4 unusable as the default model in our Build-integrated OpenCode deployment.

Plugins

Build-local plugins:

  • secret-guard.js
  • workspace-shell.js

OpenCode version

1.3.15

Steps to reproduce

  1. Run OpenCode in serve/web mode.
  2. Configure an OpenAI-compatible provider path for gpt-5.4.
  3. Make gpt-5.4 the default model.
  4. Send a trivial prompt like: What is 2 plus 2? Reply in one sentence.
  5. Observe the request fails immediately with a provider-side 402 affordability error.
  6. As a control, call the same provider directly with the same model and a capped max_tokens like 512.
  7. Observe the direct provider call succeeds.

Screenshot and/or share link

No share link. Representative session error from OpenCode:

litellm.APIError: APIError: OpenrouterException - {"error":{"message":"This request requires more credits, or fewer max_tokens. You requested up to 32000 tokens, but can only afford 8421." ... }}

Operating System

Linux (Kubernetes pod on Fedora CoreOS host)

Terminal

Web UI / serve mode behind a reverse proxy launcher
