fix(core): Cap maxOutputTokens in chat compression to prevent API errors #7579

Closed
qingshanyuluo wants to merge 3 commits into google-gemini:main from
qingshanyuluo:fix/max-tokens-bug

Conversation

@qingshanyuluo

TLDR

This pull request fixes the issue described in #7578.

Dive Deeper

Reviewer Test Plan

Testing Matrix

🍏 🪟 🐧
npm run
npx
Docker
Podman - -
Seatbelt - -

Linked issues / bugs

@qingshanyuluo qingshanyuluo requested a review from a team as a code owner September 2, 2025 05:41
@google-cla

google-cla bot commented Sep 2, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up-to-date status, view the checks section at the bottom of the pull request.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @qingshanyuluo, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the chat compression mechanism could inadvertently request an output token count exceeding API limits, leading to errors. By introducing a maximum cap on the maxOutputTokens parameter, the change ensures the stability and reliability of chat compression interactions with the API, preventing service disruptions.

Highlights

  • API Error Prevention: Capped maxOutputTokens at 65535 during chat compression to prevent API errors caused by requesting too many output tokens.
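
The cap described above amounts to a clamp on the requested output budget. The following is a minimal sketch, not the code from this PR: only the maxOutputTokens field and the value 65535 come from the conversation. The helper names, the config shape, and the assumption that the requested budget is derived from the token count of the history being compressed are all hypothetical.

```typescript
// Value taken from this PR; every other name below is hypothetical.
const MAX_COMPRESSION_OUTPUT_TOKENS = 65535;

// Minimal stand-in for a generation config; only maxOutputTokens matters here.
interface CompressionConfigSketch {
  maxOutputTokens: number;
}

// Clamp a requested output budget so it never exceeds the cap.
function capOutputTokens(
  requested: number,
  cap: number = MAX_COMPRESSION_OUTPUT_TOKENS,
): number {
  return Math.min(requested, cap);
}

// Assumption: the budget is derived from the token count of the history
// being compressed, which can be far larger than the cap.
function buildCompressionConfig(historyTokenCount: number): CompressionConfigSketch {
  return { maxOutputTokens: capOutputTokens(historyTokenCount) };
}
```

With this shape, a small history passes through unchanged, while an oversized one is reduced to the cap before the request is sent.
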
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a fix to cap the maxOutputTokens for chat compression, preventing potential API errors. The change is straightforward and effective. I've added one suggestion to improve maintainability by moving the newly introduced constant to a more appropriate scope, consistent with the existing codebase.


this.getChat().setHistory(historyToCompress);

const MAX_COMPRESSION_OUTPUT_TOKENS = 65535;
Contributor

high

For better maintainability and consistency with other constants in this file (e.g., COMPRESSION_TOKEN_THRESHOLD), consider defining MAX_COMPRESSION_OUTPUT_TOKENS at the top level of the module, outside the GeminiClient class (around line 114). This makes the constant more visible and easier to modify if the API limit changes in the future.

Adding a comment explaining the origin of this value (e.g., // This is the maximum value allowed by the Gemini API for maxOutputTokens) would also be helpful for future maintainers.
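
The placement suggested in this review could look roughly like the sketch below. It is illustrative only: ClientSketch and compressionOutputBudget are hypothetical stand-ins rather than the real GeminiClient code, and the doc comment wording is an assumption. Whether 65535 is in fact the API maximum is the reviewer's example wording and is not verified here.

```typescript
/**
 * Upper bound applied to maxOutputTokens when compressing chat history.
 * The value comes from this PR; the reviewer's example comment attributes
 * it to the API limit (an unverified assumption in this sketch).
 */
const MAX_COMPRESSION_OUTPUT_TOKENS = 65535;

// Hypothetical stand-in for the client class; not the real GeminiClient.
class ClientSketch {
  // The constant lives at module scope, so it is declared once and is
  // visible to every method instead of being re-created inside one method.
  compressionOutputBudget(requestedTokens: number): number {
    return Math.min(requestedTokens, MAX_COMPRESSION_OUTPUT_TOKENS);
  }
}
```

Keeping the value at module scope next to the file's other constants means a future limit change touches a single line.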

Author

completed

黄龙杰 and others added 2 commits September 2, 2025 14:25
…ent and documentation

- Move MAX_COMPRESSION_OUTPUT_TOKENS constant to module level for better visibility
- Add comprehensive JSDoc comment explaining the API limit origin
- Follow consistent pattern with other constants in the file