fix: widen Kimi completion budget by wbxl2000 · Pull Request #17 · MoonshotAI/kimi-code

wbxl2000 · 2026-05-25T07:42:40Z

Background

Kimi reasoning models use max_completion_tokens for both reasoning_content and the final content. The previous default path resolved a single desired budget and then applied min(desired, remaining). Because the default desired value was 32000, ordinary turns still sent max_completion_tokens: 32000 even when the model had much more output room left in the context window.

That can trigger a real failure mode: thinking may consume the entire 32k budget, and the backend can return HTTP 200 with thinking content but no final summary/content. The empty-summary compaction guard is handled in a separate PR; this PR fixes the upstream default budget that makes that failure more likely.
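
For illustration, the previous default path can be sketched as follows. This is a minimal reconstruction from the description above, not the actual source: the function name, the constant name, and the example token counts are all assumptions.

```typescript
// Hypothetical reconstruction of the old default path: a single desired
// budget, clamped by whatever room is left in the context window.
const OLD_DEFAULT_DESIRED = 32000;

function legacyCompletionCap(
  remaining: number,
  desired: number = OLD_DEFAULT_DESIRED,
): number {
  return Math.min(desired, remaining);
}

// Assumed example: a 262144-token window with about 10000 input tokens.
// Roughly 252k tokens are still free, yet the request is capped at 32000,
// so reasoning_content and the final content must share that 32k.
const exampleRemaining = 262144 - 10000;
const exampleLegacyCap = legacyCompletionCap(exampleRemaining); // 32000
```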

Changes

- Split completion budget semantics into configuration-level and request-level values:
  - CompletionBudgetConfig.hardCap is an explicit user-configured maximum.
  - CompletionBudgetConfig.fallback is used only when the model context window is unknown.
- Change default cap calculation:
  - When max_context_tokens is known and no hard cap is configured, use the safe remaining window: max_context_tokens - estimated_input - safety_margin.
  - When a hard cap is configured, use min(hardCap, remaining).
  - When the context window is unknown, fall back to loop_control.reserved_context_size, then 32000.
- Preserve environment variable behavior:
  - KIMI_MODEL_MAX_COMPLETION_TOKENS takes priority over legacy KIMI_MODEL_MAX_TOKENS.
  - Positive integers are explicit hard caps.
  - 0 or negative values disable client-side clamping entirely.
- Rename ordinary-turn plumbing to completionBudgetConfig so the configuration object is not confused with the final cap sent to the backend.
- Update English and Chinese environment variable docs and add a changeset.
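
The cap calculation rules listed above can be sketched roughly as follows. Only the names CompletionBudgetConfig, hardCap, and fallback come from this PR; the function name, the BudgetInputs shape, the flooring of a negative remaining window at 0, and the handling of a hard cap when the window is unknown are assumptions made for illustration.

```typescript
interface CompletionBudgetConfig {
  hardCap?: number;  // explicit user-configured maximum
  fallback?: number; // assumed to carry loop_control.reserved_context_size when set
}

// Hypothetical request-level inputs; the real plumbing may differ.
interface BudgetInputs {
  maxContextTokens?: number; // undefined when the context window is unknown
  estimatedInputTokens: number;
  safetyMargin: number;
}

const LAST_RESORT_FALLBACK = 32000;

function resolveCompletionCap(
  config: CompletionBudgetConfig,
  inputs: BudgetInputs,
): number {
  const { maxContextTokens, estimatedInputTokens, safetyMargin } = inputs;

  if (maxContextTokens === undefined) {
    // Unknown window: use the fallback chain. Preferring hardCap here is an
    // assumption; the PR only states when fallback applies.
    return config.hardCap ?? config.fallback ?? LAST_RESORT_FALLBACK;
  }

  // Safe remaining window. Flooring at 0 is an assumption, not from the PR.
  const remaining = Math.max(
    0,
    maxContextTokens - estimatedInputTokens - safetyMargin,
  );

  // No hard cap: use the whole safe remaining window.
  // Hard cap: never exceed it, and never exceed the remaining window.
  return config.hardCap === undefined
    ? remaining
    : Math.min(config.hardCap, remaining);
}
```

With a 262144-token window, 10000 estimated input tokens, and a 1024-token margin (all example values), this sketch yields 251120 when no hard cap is set and 32000 when hardCap is 32000.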

Behavior Impact

By default, Kimi ordinary turns are no longer capped at 32k when the model context window is known. Instead, they use the safe remaining context window. This reduces the chance that a reasoning model spends the entire output budget on thinking and returns no final content.

Callers that want the old 32k behavior can set:

KIMI_MODEL_MAX_COMPLETION_TOKENS=32000

Callers that want to leave completion-token handling entirely to the backend can set:

KIMI_MODEL_MAX_COMPLETION_TOKENS=0
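
As a rough sketch of how these environment values might be interpreted on the client side: the two variable names and the positive / zero-or-negative rules come from this PR, while the function name, the result type, and the treatment of empty or non-integer values as "unset" are assumptions.

```typescript
// Hypothetical result type: what the environment asks the client to do.
type EnvBudget =
  | { kind: "default" }                 // nothing usable set: use the default path
  | { kind: "hardCap"; value: number }  // positive integer: explicit hard cap
  | { kind: "unclamped" };              // 0 or negative: no client-side clamping

function readEnvBudget(env: Record<string, string | undefined>): EnvBudget {
  // The newer variable takes priority over the legacy one.
  const raw =
    env["KIMI_MODEL_MAX_COMPLETION_TOKENS"] ?? env["KIMI_MODEL_MAX_TOKENS"];

  // Assumption: empty or non-integer values are treated as unset.
  if (raw === undefined || raw.trim() === "") return { kind: "default" };
  const parsed = Number(raw.trim());
  if (!Number.isInteger(parsed)) return { kind: "default" };

  return parsed > 0
    ? { kind: "hardCap", value: parsed }
    : { kind: "unclamped" };
}
```

Passing the environment as a plain record (for example process.env in Node.js) keeps the sketch easy to exercise without touching real process state.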

Verification

pnpm vitest run packages/agent-core/test/utils/completion-budget.test.ts packages/agent-core/test/agent/kosong-llm.test.ts
pnpm run typecheck
pnpm --dir docs run build
git diff --check

7Sageer

LGTM — the refactor is clean (splitting desired into hardCap/fallback) and unit coverage looks good.

One non-blocking item to confirm before merge: the default path now sends max_completion_tokens ≈ remaining context (~255k on 256k-context models, vs. the old 32k). This is safe only if the Kimi backend has no separate per-request output cap below the context window and does not pessimistically reserve scheduling budget by max_completion_tokens — worth a quick check with the backend owner.

Minor follow-up: the same helper is also used by compaction (compaction/full.ts:453-461), which this PR doesn't touch; its inline comment now describes the old "clamp to reserved size" behavior and should be realigned.

fix: widen Kimi completion budget

6084dc7

7Sageer approved these changes May 25, 2026

wbxl2000 merged commit bfbd522 into main May 25, 2026
6 checks passed

wbxl2000 deleted the fix-completion-budget-remaining branch May 25, 2026 10:42
