feat(cli): add foundational configuration schema for multimodal voice mode #21651
frostbyte012 wants to merge 1 commit into google-gemini:main
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request establishes the foundational configuration schema for the planned Hands-Free Multimodal Voice Mode. By integrating new voice-related settings into the core configuration, it enables persistent storage of user preferences for voice interaction without affecting existing text-based functionality. The change aligns with the project's monorepo architecture and reuses the existing configuration management.
Code Review
This pull request introduces the foundational configuration schema for the new multimodal voice mode. The changes are well-structured and add the necessary settings to settingsSchema.ts, along with updating the auto-generated JSON schema and documentation.
I've found one high-severity issue related to an inconsistency in the requiresRestart flag for the new voice settings group, which could lead to incorrect behavior regarding application restarts. My detailed comment provides a suggestion for resolving this.
  type: 'object',
  label: 'Voice Mode',
  category: 'Experimental',
  requiresRestart: true,
There's an inconsistency in the requiresRestart flags for the new voice settings. The parent voice object is marked with requiresRestart: true, while some of its properties (vadSensitivity and ttsVoice) are correctly marked with requiresRestart: false.
This conflicts with the pattern used in other settings groups like general and ui, where the parent object has requiresRestart: false and only the specific properties that need a restart are marked as such. This inconsistency can lead to confusion and potential bugs in how restart requirements are determined.
To align with the existing architecture and ensure clarity, the parent voice object's requiresRestart flag should be false.
Suggested change:
- requiresRestart: true,
+ requiresRestart: false,
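To illustrate the suggestion, here is a minimal TypeScript sketch of a voice group whose parent flag is false, with the restart requirement decided per property. It assumes a simplified schema shape: SettingDefinition and keysRequiringRestart are hypothetical stand-ins, not the actual gemini-cli types or restart logic. The child labels are invented for illustration, and the flags shown for enabled and inputDevice are assumptions, since the review only states the values for vadSensitivity and ttsVoice.

```typescript
// Simplified stand-in for a schema entry; NOT the real gemini-cli type.
interface SettingDefinition {
  type: 'object' | 'boolean' | 'string' | 'number' | 'enum';
  label: string;
  category?: string;
  requiresRestart: boolean;
  properties?: { [key: string]: SettingDefinition };
}

// The parent group follows the suggestion: requiresRestart is false.
const voice: SettingDefinition = {
  type: 'object',
  label: 'Voice Mode',
  category: 'Experimental',
  requiresRestart: false,
  properties: {
    // Assumed flags: the review does not state these two values.
    enabled: { type: 'boolean', label: 'Enabled', requiresRestart: true },
    inputDevice: { type: 'string', label: 'Input Device', requiresRestart: true },
    // Stated in the review as requiresRestart: false.
    vadSensitivity: { type: 'number', label: 'VAD Sensitivity', requiresRestart: false },
    ttsVoice: { type: 'enum', label: 'TTS Voice', requiresRestart: false },
  },
};

// Hypothetical helper: collect the dotted keys of leaf settings that need a restart.
// Only leaf flags are consulted, so the group-level flag cannot contradict them.
function keysRequiringRestart(def: SettingDefinition, prefix: string): string[] {
  if (!def.properties) {
    return def.requiresRestart ? [prefix] : [];
  }
  const keys: string[] = [];
  for (const name of Object.keys(def.properties)) {
    const childKeys = keysRequiringRestart(def.properties[name], prefix + '.' + name);
    for (const childKey of childKeys) {
      keys.push(childKey);
    }
  }
  return keys;
}

const restartKeys = keysRequiringRestart(voice, 'voice');
console.log(restartKeys); // [ 'voice.enabled', 'voice.inputDevice' ]
```

With this shape, a change to voice.vadSensitivity or voice.ttsVoice would not trigger a restart prompt, which matches the pattern the review describes for the general and ui groups.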
@google-cla-bot check
de8a04f to bd3cf49
@google-cla-bot check
5c077db to 7912089
c50fe8c to 34c2478
CLA is signed and verified. I have squashed the commits into a single clean feature commit and resolved all previous schema inconsistencies. Ready for CI checks and review!
Description
This PR introduces the foundational configuration schema for the Hands-Free Multimodal Voice Mode (GSoC 2026 Idea 11).
By integrating these settings into the core settingsSchema.ts, the CLI can now persistently store user preferences for voice interaction without impacting current text-based workflows. This follows the project's monorepo architecture and leverages the existing deep-merge logic for User and Workspace settings.
Changes
- Added the voice configuration block to SETTINGS_SCHEMA in packages/cli/src/config/settingsSchema.ts.
- Defined the properties enabled (boolean), inputDevice (string), vadSensitivity (number), and ttsVoice (enum).
- Updated schemas/settings.schema.json.
- Regenerated the settings documentation via npm run docs:settings.
Testing Done
- Verified build via npm run build.
- Confirmed schema inference in the configuration engine.
- Ran npm run docs:settings to ensure documentation consistency across the monorepo.
Related Issue
Fixes #21649
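For reviewers, the deep-merge behavior mentioned in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the CLI's actual merge code: deepMerge and PlainObject are hypothetical names, the setting values are made up, and the sketch assumes Workspace settings take precedence over User settings.

```typescript
// Hypothetical helper type for this sketch only.
type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical deep merge: nested objects are merged key by key,
// and for every other value the override wins.
function deepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = {};
  for (const key of Object.keys(base)) {
    result[key] = base[key];
  }
  for (const key of Object.keys(override)) {
    const current = result[key];
    const incoming = override[key];
    result[key] =
      isPlainObject(current) && isPlainObject(incoming)
        ? deepMerge(current, incoming)
        : incoming;
  }
  return result;
}

// Illustrative values only; the real defaults and ranges are not given in this PR.
const userSettings: PlainObject = {
  voice: { enabled: true, inputDevice: 'default', vadSensitivity: 0.5 },
};
const workspaceSettings: PlainObject = {
  voice: { vadSensitivity: 0.8 },
};

// Assumption: Workspace settings are applied on top of User settings.
const merged = deepMerge(userSettings, workspaceSettings);
const mergedVoice = merged['voice'] as PlainObject;
console.log(mergedVoice);
```

In this sketch the workspace file overrides only vadSensitivity, while enabled and inputDevice carry over from the user file. That is the property that lets a nested voice block be stored at either level without one file wiping out the other.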