feat: feature-gate local inference dependencies (#7976)
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a77457d21f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
#[cfg(feature = "local-inference")]
pub mod local_inference;
Keep goose-cli opted into local-inference
Gating goose::providers::local_inference here breaks the standalone CLI build, because crates/goose-cli/Cargo.toml:23 still depends on goose with default-features = false and crates/goose-cli/src/cli.rs:1477-1480 imports goose::providers::local_inference::* unconditionally. In the common cargo build -p goose-cli path, that module no longer exists, so the package stops compiling unless the CLI crate also enables the new feature.
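One way to resolve this is for the CLI to declare its own default-enabled feature that forwards to the library's. This is a sketch of that shape, not the actual crates/goose-cli/Cargo.toml (paths and versions are illustrative):

```toml
# Hypothetical sketch: goose-cli keeps default-features = false on goose
# but re-exposes local-inference as its own default feature, so the
# standalone `cargo build -p goose-cli` path keeps compiling.
[dependencies]
goose = { path = "../goose", default-features = false }

[features]
default = ["local-inference"]
local-inference = ["goose/local-inference"]
```

A downstream build that wants the slim CLI can then use `cargo build -p goose-cli --no-default-features`, provided the CLI's own imports of the gated module are also behind `#[cfg(feature = "local-inference")]`.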
#[cfg(feature = "local-inference")]
let router = router
    .merge(dictation::routes(state.clone()))
    .merge(local_inference::routes(state));
Gate the OpenAPI registrations with local-inference
This makes routes::dictation and routes::local_inference disappear when local-inference is off, but crates/goose-server/src/openapi.rs:477-492 and crates/goose-server/src/openapi.rs:664-677 still reference those handlers and schema types unconditionally. As a result, cargo build/check -p goose-server --no-default-features now fails with unresolved-module errors, so the new server-side feature flag cannot actually be turned off.
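The underlying rule can be sketched std-only: any code that references a cfg-gated module must itself be gated, or `--no-default-features` fails with unresolved-module errors. All names here (`base_routes`, `local_inference_routes`) are hypothetical, not the goose-server API:

```rust
// Std-only sketch of cfg-gated registration. The gated function and the
// call site that references it carry the same cfg, so a build without
// the feature never mentions the missing module.

fn base_routes() -> Vec<&'static str> {
    vec!["/agent/reply", "/dictation/config"]
}

#[cfg(feature = "local-inference")]
fn local_inference_routes() -> Vec<&'static str> {
    vec!["/local-inference/models"]
}

fn all_routes() -> Vec<&'static str> {
    let mut routes = base_routes();
    // Compiled out entirely when the feature is off.
    #[cfg(feature = "local-inference")]
    routes.extend(local_inference_routes());
    routes
}

fn main() {
    println!("{:?}", all_routes());
}
```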
];

#[cfg(feature = "local-inference")]
pub const LOCAL_PROVIDER_DEF: DictationProviderDef = DictationProviderDef {
    provider: DictationProvider::Local,
Return the local dictation provider in /dictation/config
Moving DictationProvider::Local out of PROVIDERS changes crates/goose-server/src/routes/dictation.rs:192-220, which still builds the config response by iterating only PROVIDERS. The desktop settings screen then renders its provider dropdown from that response (ui/desktop/src/components/settings/dictation/DictationSettings.tsx:108-150), so with local-inference enabled the local option disappears from Settings and users can no longer select or manage local dictation.
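One way to keep the config response complete is a helper that appends the gated variant when the feature is compiled in. Only `DictationProvider::Local` comes from the PR; the `Cloud` variant and `all_providers` name here are illustrative stand-ins:

```rust
// Sketch: the enum variant and the list entry share the same cfg, so
// the config response includes Local exactly when the feature is built.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DictationProvider {
    Cloud, // stand-in for the real cloud providers
    #[cfg(feature = "local-inference")]
    Local,
}

fn all_providers() -> Vec<DictationProvider> {
    let mut providers = vec![DictationProvider::Cloud];
    #[cfg(feature = "local-inference")]
    providers.push(DictationProvider::Local);
    providers
}

fn main() {
    // With the feature off, the list simply omits Local.
    println!("{:?}", all_providers());
}
```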
Follow-up plan

If this PR looks good, I'd like to apply the same pattern to other heavy dependency groups in subsequent PRs. Each would follow the same approach. The goal is to let downstream consumers (e.g., projects that only need the core agent runtime with cloud providers) opt out of dependency groups they don't use.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 40a5110cba
Heavy dependencies for local LLM inference (candle, llama-cpp-2) and local Whisper transcription (symphonia, rubato, tokenizers) are now optional behind the `local-inference` Cargo feature. This allows downstream consumers of the `goose` crate to opt out of ~200MB+ of compiled dependencies when they only need the core agent runtime with cloud providers.

The feature is enabled by default, so existing builds are unaffected. To build without it: `cargo build --no-default-features`

Dependencies made optional:
- candle-core, candle-nn, candle-transformers (Whisper ML inference)
- llama-cpp-2 (local LLM inference)
- tokenizers (HuggingFace tokenizer)
- symphonia (audio decoding)
- rubato (audio resampling)
- byteorder (audio processing)

Gated modules:
- goose::providers::local_inference
- goose::dictation::whisper
- DictationProvider::Local variant
- goose-server dictation & local_inference routes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
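The manifest pattern this commit describes is Cargo's standard optional-dependency feature. A minimal sketch with placeholder versions, not the actual crates/goose/Cargo.toml:

```toml
# Hypothetical sketch: heavy deps become optional and are tied together
# by a single default-enabled feature; cuda implies it because the
# CUDA paths live inside the gated code.
[dependencies]
candle-core = { version = "*", optional = true }
llama-cpp-2 = { version = "*", optional = true }
symphonia = { version = "*", optional = true }

[features]
default = ["local-inference"]
local-inference = ["dep:candle-core", "dep:llama-cpp-2", "dep:symphonia"]
cuda = ["local-inference", "candle-core?/cuda"]
```

The `dep:` prefix keeps the optional crates from leaking implicit features of their own, and the `?/` weak-dependency syntax enables candle's cuda support only if candle-core is already activated.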
…example

- Add `all_providers()` function that returns all provider defs including Local when `local-inference` is enabled, fixing a behavioral regression where the `/dictation/config` endpoint would not list the Local provider
- Add an `[[example]]` entry for test_whisper with `required-features` so `cargo build --examples` works without `local-inference`
- Use separate cfg blocks instead of `#[allow(unused_mut)]`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Force-pushed from 1d3b17b to a356b06
… and OpenAPI

- Remove cfg gate from the dictation module; only local whisper model management routes are gated behind local-inference
- Split OpenAPI into a base ApiDoc plus a conditional LocalInferenceApiDoc merged at schema generation time
- Gate the LocalModels CLI command and dispatch behind local-inference
- Add the local-inference feature to goose-cli (default-enabled)
- Make the cuda feature imply local-inference in both server and CLI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a4763c4244
Merge main into feature/gate-local-inference, keeping both the new session_events routes from main and the local-inference feature gating. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Force-pushed from e02c105 to 3545d65
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db9ae93381
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
…e tab Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Force-pushed from db9ae93 to 758e7aa
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 758e7aadc0
The /features endpoint returns compile-time feature flags which are not sensitive and need to be accessible before authentication setup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
- Default the local-inference feature flag to false on fetch failure to prevent showing disabled UI that leads to 404 errors
- Reset the active settings tab to 'models' when local-inference becomes unavailable while the tab is selected

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: da0fb81232
if (!localInference && loadedProvider === 'local') {
  loadedProvider = null;
  await upsert('voice_dictation_provider', '', false);
Avoid clearing local dictation provider before flags load
This branch can wipe a valid voice_dictation_provider=local setting during initial render: FeaturesContext starts with localInference as false until /features resolves, so users who already selected local dictation will hit this path and persist '' back to config even when local inference is actually enabled. That creates a silent, destructive preference reset on each mount under normal startup timing; defer this reset until feature loading is complete (e.g., gate on !isLoading).
@soilSpoon This looks like a valid comment from codex -- voice_dictation_provider can get reset to '' if this code runs before the response from /features comes back. I think it needs to only reset the config if !isLoading
I pushed a commit fixing this
Prevent clearing a valid voice_dictation_provider=local setting during initial render. FeaturesContext starts with localInference=false until /features resolves, so the reset check would fire prematurely and wipe the user's preference. Gate the effect on !isFeaturesLoading so the local inference flag reflects the actual server state before any provider reset logic runs. Signed-off-by: jh-block <jhugo@block.xyz>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6b472f3f0
localInference: features['local-inference'] ?? false,
codeMode: features['code-mode'] ?? true,
Treat missing /features as unknown, not disabled
ExternalBackendSection allows the desktop to target arbitrary goosed versions, including older servers that predate the new /features endpoint. In that case features stays empty and this fallback forces localInference=false, which hides the Local Inference tab and onboarding flow even though those older servers still expose the local-model routes. The same false-negative happens after any transient /features failure, because the context never retries.
if (!localInference && loadedProvider === 'local') {
  loadedProvider = null;
  await upsert('voice_dictation_provider', '', false);
Avoid clearing local dictation on feature-probe failures
Even with the new loading guard, any permanent failure to populate localInference (for example, connecting the desktop to an older external server with no /features route, or a startup request failure) leaves it false. When a user who already has voice_dictation_provider=local opens Dictation settings, this branch persists '' back to config and silently disables a previously working dictation setup, so the feature probe can now destroy user preferences.
* origin/main:
  fix: handle reasoning content blocks in OpenAI-compat streaming parser (#8078)
  chore(acp): build native packages on latest mac (#8075)
  Display delegate sub agents logs in UI (#7519)
  Update tar version to avoid CVE-2026-33056 (#8073)
  refactor: consolidate duplicated dependencies into workspace (#8041)
  tui: set up for publishing via github actions (#8020)
  feat: feature-gate local inference dependencies (#7976)
  feat: ability to manage sub recipes in desktop ui (#6360)
* main: (37 commits)
  fix: handle reasoning content blocks in OpenAI-compat streaming parser (#8078)
  chore(acp): build native packages on latest mac (#8075)
  Display delegate sub agents logs in UI (#7519)
  Update tar version to avoid CVE-2026-33056 (#8073)
  refactor: consolidate duplicated dependencies into workspace (#8041)
  tui: set up for publishing via github actions (#8020)
  feat: feature-gate local inference dependencies (#7976)
  feat: ability to manage sub recipes in desktop ui (#6360)
  Tweak the release process: no more merge to main (#7994)
  fix: gemini models via databricks (#8042)
  feat(apps): Pass toolInfo to MCP Apps via hostContext (#7506)
  fix: remove configured marker when deleting oauth provider configuration (#7887)
  docs: add vmware-aiops MCP extension documentation (#8055)
  Show setup instructions for ACP providers in settings modal (#8065)
  deps: replace sigstore-verification with sigstore-verify to kill vulns (#8064)
  feat(acp): add session/set_config and stabilize list, delete and close (#7984)
  docs: Correct `gosoe` typo to `goose` (#8062)
  fix: use default provider and model when provider in session no longer exists (#8035)
  feat: add GOOSE_SHELL env var to configure preferred shell (#7909)
  fix(desktop): fullscreen header bar + always-visible close controls (#8033)
  ...
Summary
- `local-inference` Cargo feature (enabled by default)
- Lets consumers of the `goose` crate skip ~200MB+ of compiled deps when only cloud providers are needed
- `cargo build -p goose --no-default-features` now excludes candle, llama-cpp-2, symphonia, rubato, tokenizers, and byteorder

Motivation
Downstream projects that use `goose` as a library (e.g., with `default-features = false`) currently compile the full local inference stack even if they only need `Agent::reply()` with cloud providers. This PR gates the heaviest dependencies behind a feature flag while keeping all defaults unchanged.

Changes
crates/goose/Cargo.toml
- `local-inference` feature gating 8 optional dependencies
- `cuda` feature now implies `local-inference`
- Default features include `local-inference` (no behavior change)

Source gating (`#[cfg(feature = "local-inference")]`)
- `providers::local_inference` module
- `dictation::whisper` module
- `DictationProvider::Local` enum variant + related code

crates/goose-server
- `local-inference` feature (default: enabled)
- Gates the `dictation` and `local_inference` route modules
- `InferenceRuntime` in `AppState`

Test plan
- `cargo check -p goose --no-default-features`: compiles without candle/llama/symphonia
- `cargo check -p goose`: compiles with all defaults (no behavior change)
- `cargo check -p goose-server`: compiles with all defaults
- `cargo test -p goose --no-default-features --lib`: 922 tests pass (1 pre-existing snapshot failure unrelated to this change)

🤖 Generated with Claude Code