perf(core): skip model routing classification when redundant by akh64bit · Pull Request #25554 · google-gemini/gemini-cli

akh64bit · 2026-04-16T18:58:44Z

Summary

This PR introduces an optimization in the `ModelRouterService` that skips the lightweight model classification step when both the pro and flash tiers resolve to the same underlying model. This happens, for example, when the user overrides both tiers with a setting such as `model.gemma4Variant`. Skipping the redundant classification call removes one API round trip from every request in this scenario, which improves Time To First Token (TTFT).

Details

The `ClassifierStrategy` and `GemmaClassifierStrategy` now resolve the model for both the pro and flash tiers before calling the classification LLM. If the two resolved models match, the strategy takes a fast path: it returns the resolved model immediately, reports a latency of zero, and sets a reasoning string indicating that classification was skipped.
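The fast path described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the `RoutingDecision` shape, the `trySkipClassification` and `routeWithClassifier` helpers, the `resolveTier` and `classify` callbacks, and the reasoning strings are all hypothetical names chosen for the example.

```typescript
// Hypothetical types for illustration; the real interfaces in the repository may differ.
type Tier = "pro" | "flash";

interface RoutingDecision {
  model: string;
  source: string;
  latencyMs: number;
  reasoning: string;
}

// Returns a decision without calling the classifier when both tiers
// resolve to the same model; returns undefined otherwise.
function trySkipClassification(
  proModel: string,
  flashModel: string,
  source: string,
): RoutingDecision | undefined {
  if (proModel !== flashModel) return undefined;
  return {
    model: proModel,
    source,
    latencyMs: 0,
    reasoning: "Classification skipped: pro and flash resolve to the same model.",
  };
}

// Assumed flow: resolve both tiers first, and only call the
// (expensive) classifier when the result could change the outcome.
async function routeWithClassifier(
  prompt: string,
  resolveTier: (tier: Tier) => string,
  classify: (prompt: string) => Promise<Tier>,
  source = "Classifier",
): Promise<RoutingDecision> {
  const proModel = resolveTier("pro");
  const flashModel = resolveTier("flash");

  const skipped = trySkipClassification(proModel, flashModel, source);
  if (skipped) return skipped;

  const start = Date.now();
  const tier = await classify(prompt);
  return {
    model: tier === "pro" ? proModel : flashModel,
    source,
    latencyMs: Date.now() - start,
    reasoning: `Classifier selected the ${tier} tier.`,
  };
}
```

Resolving both tiers before the classifier call keeps the check cheap: it is a string comparison, so the common path pays almost nothing when the tiers differ.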

Related Issues

How to Validate

1. Configure `.gemini/settings.json` to route both tiers to a specific model. For example, set "gemma4Variant": "gemma-4-31b-it".
2. Run a command such as gemini "hello".
3. Check the debug logs or DevTools. The routing step should report zero latency, and the entry for the Classifier (or GemmaClassifier) source should indicate that classification was skipped.
Tests have been added to classifierStrategy.test.ts and gemmaClassifierStrategy.test.ts to assert this behavior.

Pre-Merge Checklist

Introduces a new `model.gemma4Variant` setting that allows users to optionally redirect all requests destined for `gemini-pro` and `gemini-flash` (and their related aliases) to the selected Gemma 4 variant (`gemma-4-26b-a4b-it` or `gemma-4-31b-it`). The router model (`flash-lite`) remains unaffected.
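As a rough illustration of the redirect described above, the resolution step might look something like the sketch below. The `applyGemma4Redirect` function, the `ModelSettings` shape, and the short `pro`/`flash` alias entries are assumptions made for this example; only the variant names, `gemini-pro`, `gemini-flash`, and `flash-lite` come from the PR text.

```typescript
// The two variants named in the PR description.
type Gemma4Variant = "gemma-4-26b-a4b-it" | "gemma-4-31b-it";

// Hypothetical settings shape; assumes the setting is optional.
interface ModelSettings {
  gemma4Variant?: Gemma4Variant;
}

// Models whose requests are redirected. The "pro" and "flash" entries are
// hypothetical aliases; the real alias list lives in the repository.
const REDIRECTED_MODELS: ReadonlySet<string> = new Set([
  "gemini-pro",
  "gemini-flash",
  "pro",
  "flash",
]);

// Returns the Gemma 4 variant for redirected models when the setting is
// present; every other model (for example the flash-lite router) is unchanged.
function applyGemma4Redirect(requestedModel: string, settings: ModelSettings): string {
  if (settings.gemma4Variant && REDIRECTED_MODELS.has(requestedModel)) {
    return settings.gemma4Variant;
  }
  return requestedModel;
}
```

With the variant set, both `gemini-pro` and `gemini-flash` map to the same model, which is exactly the situation in which the classification step becomes redundant.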

gemini-cli · 2026-04-16T18:58:54Z

Hi @akh64bit, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

gemini-code-assist · 2026-04-16T18:59:03Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a performance optimization for the model routing service by detecting and bypassing unnecessary classification steps when model tiers resolve to identical targets. Additionally, it expands the CLI's capabilities by adding support for routing requests to specific Gemma 4 model variants, complete with necessary configuration updates, documentation, and validation tests.

Highlights

- Performance Optimization: Introduced a fast-path in ModelRouterService to skip redundant model classification when both pro and flash tiers resolve to the same model, significantly improving Time To First Token (TTFT).
- Gemma 4 Support: Added support for routing requests to Gemma 4 models (gemma-4-26b-a4b-it and gemma-4-31b-it) via the model.gemma4Variant configuration setting.
- Configuration and Documentation: Updated settings schema, CLI configuration, and documentation to support the new Gemma 4 routing options.
- Testing: Added comprehensive integration and unit tests to verify the new routing logic and the classification skip optimization.

Ignored Files

Ignored by pattern: .gemini/** (1)
- .gemini/settings.json

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions · 2026-04-16T19:01:57Z

Size Change: +8.02 kB (+0.02%)

Total Size: 33.6 MB

Filename	Size	Change
`./bundle/chunk-B4ZHRZIJ.js`	0 B	-3.8 kB (removed)	🏆
`./bundle/chunk-H6CYASTD.js`	0 B	-3.43 MB (removed)	🏆
`./bundle/chunk-HA6367YL.js`	0 B	-14.5 MB (removed)	🏆
`./bundle/chunk-LV2OUWFA.js`	0 B	-1.97 MB (removed)	🏆
`./bundle/core-7A6JUEEV.js`	0 B	-46.7 kB (removed)	🏆
`./bundle/devtoolsService-EW7GFBVS.js`	0 B	-28.4 kB (removed)	🏆
`./bundle/gemini-ZHRKSWKT.js`	0 B	-553 kB (removed)	🏆
`./bundle/interactiveCli-EGAYNZZX.js`	0 B	-1.29 MB (removed)	🏆
`./bundle/oauth2-provider-BPIBGFXK.js`	0 B	-9.16 kB (removed)	🏆
`./bundle/chunk-CD4XTC63.js`	1.97 MB	+1.97 MB (new file)	🆕
`./bundle/chunk-DL5RXHDU.js`	14.5 MB	+14.5 MB (new file)	🆕
`./bundle/chunk-INX4K5LW.js`	3.8 kB	+3.8 kB (new file)	🆕
`./bundle/chunk-RMJEFT2R.js`	3.43 MB	+3.43 MB (new file)	🆕
`./bundle/core-M4AIFNNX.js`	46.8 kB	+46.8 kB (new file)	🆕
`./bundle/devtoolsService-U4MOSHLI.js`	28.4 kB	+28.4 kB (new file)	🆕
`./bundle/gemini-RZLRC7XS.js`	553 kB	+553 kB (new file)	🆕
`./bundle/interactiveCli-P5S75ZY2.js`	1.29 MB	+1.29 MB (new file)	🆕
`./bundle/oauth2-provider-H6BBWTCF.js`	9.16 kB	+9.16 kB (new file)	🆕

ℹ️ View Unchanged

Filename	Size	Change
`./bundle/bundled/third_party/index.js`	8 MB	0 B
`./bundle/chunk-34MYV7JD.js`	2.45 kB	0 B
`./bundle/chunk-5AUYMPVF.js`	858 B	0 B
`./bundle/chunk-5PS3AYFU.js`	1.18 kB	0 B
`./bundle/chunk-664ZODQF.js`	124 kB	0 B
`./bundle/chunk-DAHVX5MI.js`	206 kB	0 B
`./bundle/chunk-IUUIT4SU.js`	56.5 kB	0 B
`./bundle/chunk-RJTRUG2J.js`	39.8 kB	0 B
`./bundle/cleanup-IX5GZ2QQ.js`	0 B	-932 B (removed)	🏆
`./bundle/devtools-36NN55EP.js`	696 kB	0 B
`./bundle/dist-T73EYRDX.js`	356 B	0 B
`./bundle/events-XB7DADIJ.js`	418 B	0 B
`./bundle/examples/hooks/scripts/on-start.js`	188 B	0 B
`./bundle/examples/mcp-server/example.js`	1.43 kB	0 B
`./bundle/gemini.js`	4.97 kB	0 B
`./bundle/getMachineId-bsd-TXG52NKR.js`	1.55 kB	0 B
`./bundle/getMachineId-darwin-7OE4DDZ6.js`	1.55 kB	0 B
`./bundle/getMachineId-linux-SHIFKOOX.js`	1.34 kB	0 B
`./bundle/getMachineId-unsupported-5U5DOEYY.js`	1.06 kB	0 B
`./bundle/getMachineId-win-6KLLGOI4.js`	1.72 kB	0 B
`./bundle/memoryDiscovery-CMNXJICE.js`	0 B	-980 B (removed)	🏆
`./bundle/multipart-parser-KPBZEGQU.js`	11.7 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js`	222 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js`	229 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js`	13.4 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js`	132 B	0 B
`./bundle/sandbox-macos-permissive-open.sb`	890 B	0 B
`./bundle/sandbox-macos-permissive-proxied.sb`	1.31 kB	0 B
`./bundle/sandbox-macos-restrictive-open.sb`	3.36 kB	0 B
`./bundle/sandbox-macos-restrictive-proxied.sb`	3.56 kB	0 B
`./bundle/sandbox-macos-strict-open.sb`	4.82 kB	0 B
`./bundle/sandbox-macos-strict-proxied.sb`	5.02 kB	0 B
`./bundle/src-QVCVGIUX.js`	47 kB	0 B
`./bundle/tree-sitter-7U6MW5PS.js`	274 kB	0 B
`./bundle/tree-sitter-bash-34ZGLXVX.js`	1.84 MB	0 B
`./bundle/cleanup-UGAIL5OE.js`	932 B	+932 B (new file)	🆕
`./bundle/memoryDiscovery-44SKOJDH.js`	980 B	+980 B (new file)	🆕

compressed-size-action

gemini-code-assist

Code Review

This pull request introduces support for Gemma 4 models (gemma-4-26b-a4b-it and gemma-4-31b-it) by implementing a routing mechanism that redirects Gemini Pro and Flash requests to a user-configured Gemma 4 variant. The implementation includes updates to the configuration schema, model resolution logic, and documentation. Furthermore, the classifier routing strategies were optimized to skip redundant classification steps when both Pro and Flash tiers resolve to the same model. I have no feedback to provide.

Fixes an issue where the CLI hangs on 'Thinking...' for models (like Gemma 4) that return thought text in the 'thought' field instead of 'text'. Also updates Gemma 4 model definitions to accurately reflect their 'thinking' capabilities.
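One way to picture the fix described above is a small normalization helper that accepts thought content from either field. This is a speculative sketch, not the change made in the commit: `PartLike` is a local, hypothetical shape (not a type from any SDK), and it assumes that `thought` may be either a boolean flag accompanying `text` or a string carrying the thought itself.

```typescript
// Hypothetical, simplified shape of a streamed response part.
// Assumption: `thought` is either a flag on a text part or the thought text itself.
interface PartLike {
  text?: string;
  thought?: boolean | string;
}

// Returns the thought text carried by a part, or undefined when the part
// is ordinary answer content.
function extractThoughtText(part: PartLike): string | undefined {
  // Case 1 (assumed Gemma 4 style): the thought text is in `thought`.
  if (typeof part.thought === "string" && part.thought.length > 0) {
    return part.thought;
  }
  // Case 2: `thought` is a flag and the content is in `text`.
  if (part.thought === true && part.text) {
    return part.text;
  }
  return undefined;
}
```

Handling both shapes in one place means downstream code that decides when "Thinking..." ends only needs to check a single return value.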


github-actions · 2026-04-16T20:36:51Z

🛑 Action Required: Evaluation Approval

Steering changes have been detected in this PR. To prevent regressions, a maintainer must approve the evaluation run before this PR can be merged.

Maintainers:

Go to the Workflow Run Summary.
Click the yellow 'Review deployments' button.
Select the 'eval-gate' environment and click 'Approve'.

Once approved, the evaluation results will be posted here automatically.

akh64bit added 4 commits April 15, 2026 01:51

docs: Add Gemma 4 routing instructions to README and docs

61096af

test: add integration test for Gemma 4 routing

44f9b59

perf(core): skip model routing classification when redundant

5f181f9

akh64bit requested review from a team as code owners April 16, 2026 18:58

gemini-code-assist bot reviewed Apr 16, 2026


gemini-cli bot added the priority/p1 label (Important and should be addressed in the near term.) Apr 16, 2026

akh64bit added 2 commits April 16, 2026 19:16

fix(core): pass config to supportsModernFeatures to respect model def…

0a119e6

…inition flags

akh64bit requested a review from a team as a code owner April 16, 2026 20:36

akh64bit requested a deployment to eval-gate April 16, 2026 20:36 — with GitHub Actions Waiting

github-actions bot mentioned this pull request Apr 17, 2026

📊 AI CLI Tools Community Daily Report 2026-04-17 gsscsd/big_model_radar#198

Open

akh64bit wants to merge 6 commits into main from redirect-to-gemma4
