feat(zai): expose configurable max output tokens for GLM models (#161) by proyectoauraorg · Pull Request #274 · Zoo-Code-Org/Zoo-Code

proyectoauraorg · 2026-05-24T00:51:36Z

Summary

Z.ai GLM models (glm-5.1, glm-5-turbo) default to a 20% output clamp. The runtime already honors an explicit modelMaxTokens (createStreamWithThinking: max_tokens = options.modelMaxTokens || getModelMaxOutputTokens(...)), but the settings UI never surfaced a control — those models only exposed reasoning effort. This closes that gap.
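The override path quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: `resolveMaxTokens`, `clampedDefault`, and the `ModelInfoLike` and `HandlerOptionsLike` shapes are hypothetical names, the 20% clamp is assumed to be a share of the context window, and the model numbers are made up for the example.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface ModelInfoLike {
    contextWindow: number
    maxTokens?: number
}

interface HandlerOptionsLike {
    modelMaxTokens?: number
}

// Assumption: the default clamp caps output at 20% (one fifth) of the context window.
function clampedDefault(model: ModelInfoLike): number {
    const cap = Math.floor(model.contextWindow / 5)
    return Math.min(model.maxTokens ?? cap, cap)
}

// Mirrors the shape of `options.modelMaxTokens || getModelMaxOutputTokens(...)`:
// any truthy explicit value wins, otherwise the clamp applies.
function resolveMaxTokens(options: HandlerOptionsLike, model: ModelInfoLike): number {
    return options.modelMaxTokens || clampedDefault(model)
}

// Illustrative numbers only.
const exampleModel: ModelInfoLike = { contextWindow: 200000, maxTokens: 131072 }

const unsetResult = resolveMaxTokens({}, exampleModel)
const explicitResult = resolveMaxTokens({ modelMaxTokens: 64000 }, exampleModel)
const zeroResult = resolveMaxTokens({ modelMaxTokens: 0 }, exampleModel)

console.log(unsetResult, explicitResult, zeroResult)
```

Because the fallback uses `||` rather than `??`, an explicit value of `0` is treated the same as an unset value and falls through to the clamp.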

Change

New optional capability flag supportsMaxTokens on modelInfoSchema, set on the configurable GLM models in both internationalZAiModels and mainlandZAiModels.
Reuses the existing ThinkingBudget settings component to render a max-output-tokens slider, gated on supportsMaxTokens && !supportsReasoningBudget. It defaults to the runtime's existing clamp when unset (behavior unchanged until the user edits), and persists an explicit value as modelMaxTokens.
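The gating and defaulting rules described above could look roughly like this. It is a sketch only: `showsStandaloneMaxTokensSlider` and `sliderValue` are hypothetical helpers, and the use of `??` for the unset case is an assumption about the component, not something the description confirms.

```typescript
// Hypothetical subset of the model capability flags named in this PR.
interface ModelCapabilities {
    supportsMaxTokens?: boolean
    supportsReasoningBudget?: boolean
}

// Gate from the description: supportsMaxTokens && !supportsReasoningBudget.
function showsStandaloneMaxTokensSlider(info: ModelCapabilities): boolean {
    return Boolean(info.supportsMaxTokens) && !info.supportsReasoningBudget
}

// The slider shows the stored override if there is one, otherwise the
// runtime's default clamp. Nothing is persisted by computing this value.
function sliderValue(modelMaxTokens: number | undefined, runtimeDefault: number): number {
    return modelMaxTokens ?? runtimeDefault
}

const glmLike: ModelCapabilities = { supportsMaxTokens: true }
const budgetModel: ModelCapabilities = { supportsMaxTokens: true, supportsReasoningBudget: true }

console.log(showsStandaloneMaxTokensSlider(glmLike)) // gate open
console.log(showsStandaloneMaxTokensSlider(budgetModel)) // gate closed
console.log(sliderValue(undefined, 40000), sliderValue(64000, 40000))
```

Keeping the displayed value a pure function of the stored setting and the runtime default is one way to get the "behavior unchanged until the user edits" property: only a change handler would write `modelMaxTokens`.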

Tests

Backend (zai.spec.ts): the flag is present on the configurable GLM models (absent on glm-4.7), and an explicit modelMaxTokens override is sent as max_tokens instead of the clamp.
Webview (ThinkingBudget.spec.tsx): slider renders alongside reasoning effort, defaults to the clamp when unset, reflects an override, does NOT persist on initial render (init vs user-edit), persists on user change, and does not render when the flag is absent.

Closes #161

Summary by CodeRabbit

New Features
- GLM-5.1 and GLM-5-turbo now advertise configurable max-output-tokens; settings expose a dedicated max-output-tokens slider for supported models.
- If a model supports max tokens, users can persist a custom modelMaxTokens; otherwise existing token-clamp and reasoning-effort controls remain.
Chores
- Export logic refined to preserve or strip token-related fields based on model capabilities.
Tests
- Added coverage for capability advertising, slider behavior, persistence, and export rules.

coderabbitai · 2026-05-24T00:51:52Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5b912694-d7bf-4cfc-8960-5805c54c88fe

📥 Commits

Reviewing files that changed from the base of the PR and between a99ea3c and 6cdbb6b.

📒 Files selected for processing (7)

packages/types/src/model.ts
packages/types/src/providers/zai.ts
src/api/providers/__tests__/zai.spec.ts
src/core/config/ProviderSettingsManager.ts
src/core/config/__tests__/ProviderSettingsManager.spec.ts
webview-ui/src/components/settings/ThinkingBudget.tsx
webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx

📝 Walkthrough

Walkthrough

Adds an optional supportsMaxTokens model capability, marks the Z.ai GLM variants as supporting it, surfaces a conditional max-output-tokens slider with runtime defaults and persistence, and adds backend and UI tests validating capability advertising and explicit override behavior.

Changes

Max Output Tokens Capability for Z.ai GLM

Layer / File(s)	Summary
Type contract and provider metadata `packages/types/src/model.ts`, `packages/types/src/providers/zai.ts`	`modelInfoSchema` adds optional `supportsMaxTokens`; GLM-5.1 and GLM-5-turbo (international and mainland) are marked `supportsMaxTokens: true`.
Backend token override behavior `src/api/providers/__tests__/zai.spec.ts`	Tests verify `supportsMaxTokens` is advertised for configurable GLMs and that explicit `modelMaxTokens` passed into `ZAiHandler` becomes `max_tokens` in the provider `create` call.
Standalone max-tokens slider control `webview-ui/src/components/settings/ThinkingBudget.tsx`	Imports `getModelMaxOutputTokens`, computes `defaultMaxOutputTokens` when `modelMaxTokens` is unset, and conditionally renders a standalone `max-output-tokens` slider for models with `supportsMaxTokens`, persisting user changes via `setApiConfigurationField`.
Reasoning-effort and max-tokens integration `webview-ui/src/components/settings/ThinkingBudget.tsx`	Reworks reasoning-effort branch to render the max-tokens slider alongside the reasoning-effort `Select`, with updated placeholder and `onValueChange` mapping for `"disable"` vs `"none"`/effort values.
UI test coverage for max-tokens behavior `webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx`	Adds tests for slider rendering with `supportsMaxTokens`, default/clamped initialization, explicit override behavior, non-persistence on initial render, persistence on user interaction, and conditional hiding when flag is absent.
Provider settings export filtering `src/core/config/ProviderSettingsManager.ts`, `src/core/config/__tests__/ProviderSettingsManager.spec.ts`	Splits export gating so `modelMaxThinkingTokens` is removed only when reasoning budgets are unsupported, while `modelMaxTokens` is removed only when the model supports neither reasoning budgets nor configurable max tokens; tests added for both cases.

Sequence Diagram

sequenceDiagram
  participant ThinkingBudget as ThinkingBudget (Webview UI)
  participant SettingsStore as SettingsStore (setApiConfigurationField)
  participant ProviderSettings as ProviderSettingsManager
  participant ZAiHandler as ZAiHandler (Server)
  participant ZAiAPI as Z.ai API (create)

  ThinkingBudget->>SettingsStore: user changes modelMaxTokens (persist)
  SettingsStore->>ProviderSettings: saved apiConfiguration/profile
  ProviderSettings->>ZAiHandler: runtime loads apiConfiguration
  ZAiHandler->>ZAiAPI: create(request with max_tokens = modelMaxTokens)
  ZAiAPI-->>ZAiHandler: response
  ZAiHandler-->>ThinkingBudget: result/state

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

edelauna
taltas
hannesrudolph

Poem

A rabbit twists a tiny knob, 🐇
Numbers fall like token throb,
Clamp once held the output tight,
Now a slider brings delight,
Hop, persist, and set it right.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding a configurable max output tokens feature for Z.ai GLM models.
Description check	✅ Passed	The PR description covers all required sections including summary, changes, tests, and issue reference with clear implementation details.
Linked Issues check	✅ Passed	The PR fully addresses issue `#161`'s requirements: exposes max output control for GLM models, preserves defaults, persists explicit values, and includes comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All changes directly support the linked issue objective of exposing configurable max output tokens for Z.ai GLM models; no out-of-scope modifications detected.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx`:
- Around line 320-326: The test fixture glmApiConfiguration is widening
apiProvider to string causing a TS2322 error; fix by typing the fixture with the
ProviderSettings literal type (e.g., change the declaration of
glmApiConfiguration to use "satisfies ProviderSettings" — optionally combined
with "as const" to preserve literal types) so the apiProvider remains the
literal "zai" when passed into ThinkingBudget with defaultProps.
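The widening problem and the `satisfies` fix can be illustrated with trimmed-down stand-in types. `ProviderName`, this `ProviderSettings` shape, and `describeConfig` are hypothetical simplifications; the real fixture and types live in the repository. The `satisfies` operator requires TypeScript 4.9 or later.

```typescript
// Hypothetical, trimmed-down stand-ins for the project's types.
type ProviderName = "zai" | "anthropic"

interface ProviderSettings {
    apiProvider?: ProviderName
    apiModelId?: string
    modelMaxTokens?: number
}

// Without an annotation, `apiProvider` is inferred as `string`,
// which is not assignable to ProviderName (the TS2322 case).
const widened = { apiProvider: "zai", apiModelId: "glm-5.1" }
const widenedProvider: string = widened.apiProvider

// `satisfies` checks the object against ProviderSettings while keeping
// the literal type "zai" on the property.
const glmApiConfiguration = {
    apiProvider: "zai",
    apiModelId: "glm-5.1",
} satisfies ProviderSettings

// Stand-in for a component prop that expects ProviderSettings.
function describeConfig(config: ProviderSettings): string {
    return `${config.apiProvider ?? "unknown"}/${config.apiModelId ?? "unknown"}`
}

// Compiles: the fixture is assignable because "zai" was not widened.
const described = describeConfig(glmApiConfiguration)
console.log(widenedProvider, described)
```

Passing `widened` to `describeConfig` would fail to compile, which is the error the review comment describes.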

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 995ad486-eba2-459e-9c28-6802c53ac7db

📥 Commits

Reviewing files that changed from the base of the PR and between b5c5e21 and 9bd5fbd.

📒 Files selected for processing (5)

packages/types/src/model.ts
packages/types/src/providers/zai.ts
src/api/providers/__tests__/zai.spec.ts
webview-ui/src/components/settings/ThinkingBudget.tsx
webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx

codecov · 2026-05-24T01:24:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

edelauna · 2026-05-24T04:23:34Z

Profile export strips modelMaxTokens for GLM models (src/core/config/ProviderSettingsManager.ts:572)

The export cleanup at line 572 deletes modelMaxTokens when supportsReasoningBudget is falsy. GLM models only advertise supportsMaxTokens + supportsReasoningEffort — neither reasoning-budget flag — so any custom max-tokens value the user sets via the new slider will be silently stripped on profile export. Re-importing the profile then resets the slider to the default clamp.

Suggested fix:

const keepMaxTokens = supportsReasoningBudget || modelInfo.supportsMaxTokens
if (!supportsReasoningBudget) {
    delete configs[name].modelMaxThinkingTokens
}
if (!keepMaxTokens) {
    delete configs[name].modelMaxTokens
}

(This file isn't in the PR diff so can't be posted as an inline comment.)
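A self-contained version of the suggested split, for reference. This is a sketch, not the code in `ProviderSettingsManager.ts`: `stripUnsupportedTokenFields`, `ProfileLike`, and `ModelInfoLike` are hypothetical names, and deriving `supportsReasoningBudget` from the two budget flags is an assumption.

```typescript
// Hypothetical shapes; the real profile and model info types are larger.
interface ModelInfoLike {
    supportsReasoningBudget?: boolean
    requiredReasoningBudget?: boolean
    supportsMaxTokens?: boolean
}

interface ProfileLike {
    modelMaxTokens?: number
    modelMaxThinkingTokens?: number
}

// Returns a cleaned copy instead of mutating the stored profile.
function stripUnsupportedTokenFields(profile: ProfileLike, modelInfo: ModelInfoLike): ProfileLike {
    const cleaned: ProfileLike = { ...profile }

    // Assumption: "reasoning budget" means either budget flag is set.
    const supportsReasoningBudget = Boolean(
        modelInfo.supportsReasoningBudget || modelInfo.requiredReasoningBudget,
    )
    const keepMaxTokens = supportsReasoningBudget || Boolean(modelInfo.supportsMaxTokens)

    // Thinking tokens only make sense with a reasoning budget.
    if (!supportsReasoningBudget) {
        delete cleaned.modelMaxThinkingTokens
    }
    // Max tokens survive for budget models and for supportsMaxTokens models.
    if (!keepMaxTokens) {
        delete cleaned.modelMaxTokens
    }
    return cleaned
}

const stored: ProfileLike = { modelMaxTokens: 64000, modelMaxThinkingTokens: 8192 }

const glmExport = stripUnsupportedTokenFields(stored, { supportsMaxTokens: true })
const plainExport = stripUnsupportedTokenFields(stored, {})
const budgetExport = stripUnsupportedTokenFields(stored, { supportsReasoningBudget: true })

console.log(glmExport, plainExport, budgetExport)
```

With this split, a GLM-style profile keeps `modelMaxTokens` through an export and import round trip, while a model with neither capability still has both fields removed.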

edelauna · 2026-05-24T04:23:21Z

+		) : null
+
 	// Models with supportsReasoningBinary (binary reasoning) show a simple on/off toggle
 	if (isReasoningSupported) {


If a model ever has both supportsReasoningBinary: true and supportsMaxTokens: true, the standalone slider won't render — this block returns before maxOutputTokensControl is reached. Worth including {maxOutputTokensControl} here alongside the checkbox, even if no current model hits this?

proyectoauraorg · May 26, 2026

Great catch. You're right that supportsReasoningBinary models (like deepseek-v3.2) would have their modelMaxThinkingTokens incorrectly stripped under the current logic.

I've updated the guard to include supportsReasoningBinary in the reasoning-capable check:

const supportsReasoning =
    modelInfo.supportsReasoningBudget ||
    modelInfo.requiredReasoningBudget ||
    modelInfo.supportsReasoningBinary

if (supportsReasoning) {
    // keep modelMaxThinkingTokens: budget, required, or binary
} else {
    delete configs[name].modelMaxThinkingTokens
}

This ensures binary-reasoning models retain the field while non-reasoning models still get it cleaned up. Thanks for flagging this!

proyectoauraorg · 2026-05-24T09:33:13Z

While reviewing the round-trip behavior for this feature, I found that ProviderSettingsManager.export() strips modelMaxTokens for any model without a reasoning budget — which would silently wipe the GLM max-output value this PR introduces on export/import. I've split the two deletions so modelMaxTokens is preserved whenever the model sets supportsMaxTokens (as GLM does), while modelMaxThinkingTokens still requires a reasoning budget. Added tests covering both branches (GLM preserves, a plain model strips both).

…Code-Org#161) Z.ai GLM models (glm-5.1, glm-5-turbo) default to a 20% output clamp and the runtime already honors an explicit modelMaxTokens, but the settings UI never surfaced a control for it — those models only exposed reasoning effort. Adds an optional supportsMaxTokens capability flag (set on the GLM models), and reuses the ThinkingBudget settings component to render a max-output-tokens slider gated on supportsMaxTokens && !supportsReasoningBudget. The slider defaults to the existing output clamp when unset, so behavior is unchanged until the user edits it; an explicit value persists as modelMaxTokens. Closes Zoo-Code-Org#161

…rName (Zoo-Code-Org#161)

…urable max output (Zoo-Code-Org#161)

proyectoauraorg · 2026-05-26T08:37:55Z

@edelauna Thanks for the detailed code review identifying the modelMaxTokens bug!

I've created a fix PR that addresses the scoping issue: PR #332

The fix extracts maxOutputTokensControl as a named JSX variable declared before both the binary reasoning and budget reasoning conditional branches, making the max output tokens slider available for all reasoning models (including supportsReasoningBinary models like Zhipu GLM).

Could you please review PR #332 when you get a chance? 🙏
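The control-flow issue raised in review, and the fix described above, can be modeled without React by returning the names of the controls each branch would render. Everything here is hypothetical: `renderedControls`, the `Flags` shape, and the control names are illustrative, and the real component's branches may differ.

```typescript
// Hypothetical capability flags relevant to the branch order.
interface Flags {
    supportsReasoningBinary?: boolean
    supportsReasoningBudget?: boolean
    supportsMaxTokens?: boolean
}

function renderedControls(flags: Flags): string[] {
    // Declared before any branch so every return path can include it.
    const maxOutputTokensControl: string[] =
        flags.supportsMaxTokens && !flags.supportsReasoningBudget ? ["max-output-tokens"] : []

    // Binary reasoning: the early return now carries the slider along.
    if (flags.supportsReasoningBinary) {
        return ["reasoning-toggle", ...maxOutputTokensControl]
    }

    // Budget reasoning: assumed to use its own budget controls instead.
    if (flags.supportsReasoningBudget) {
        return ["thinking-budget"]
    }

    // No reasoning controls: the standalone slider, if the model supports it.
    return maxOutputTokensControl
}

const binaryWithMax = renderedControls({ supportsReasoningBinary: true, supportsMaxTokens: true })
const standalone = renderedControls({ supportsMaxTokens: true })
const neither = renderedControls({})

console.log(binaryWithMax, standalone, neither)
```

If the control were built after the binary branch instead, the first case would return only the toggle, which is the gap the review comment points out.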

proyectoauraorg requested review from JamesRobert20, edelauna, hannesrudolph, navedmerchant and taltas as code owners May 24, 2026 00:51

coderabbitai Bot reviewed May 24, 2026

Comment thread webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx Outdated

edelauna reviewed May 24, 2026

proyectoauraorg added 3 commits May 25, 2026 23:58

fix(test): type GLM apiConfiguration mock as const to satisfy Provide…

58cc57a

…rName (Zoo-Code-Org#161)

fix(config): preserve modelMaxTokens on export for models with config…

6cdbb6b

…urable max output (Zoo-Code-Org#161)

proyectoauraorg force-pushed the feat/161-zai-glm-max-output branch from a99ea3c to 6cdbb6b Compare May 26, 2026 05:59

proyectoauraorg mentioned this pull request May 26, 2026

fix(ThinkingBudget): extract maxOutputTokensControl to resolve undefined reference in binary reasoning path #332

Open

feat(zai): expose configurable max output tokens for GLM models (#161) #274
proyectoauraorg wants to merge 3 commits into Zoo-Code-Org:main from proyectoauraorg:feat/161-zai-glm-max-output

2 participants
