feat: feature-gate local inference dependencies#7976

Merged
jh-block merged 15 commits into block:main from soilSpoon:feature/gate-local-inference
Mar 23, 2026

Conversation

@soilSpoon
Contributor

Summary

  • Makes heavy local inference dependencies optional behind a new local-inference Cargo feature (enabled by default)
  • Allows downstream consumers of the goose crate to skip ~200MB+ of compiled deps when only cloud providers are needed
  • cargo build -p goose --no-default-features now excludes candle, llama-cpp-2, symphonia, rubato, tokenizers, and byteorder

Motivation

Downstream projects that use goose as a library (e.g., with default-features = false) currently compile the full local inference stack even if they only need Agent::reply() with cloud providers. This PR gates the heaviest dependencies behind a feature flag while keeping all defaults unchanged.

Changes

crates/goose/Cargo.toml

  • New local-inference feature gating 8 optional dependencies
  • cuda feature now implies local-inference
  • Default features include local-inference (no behavior change)
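The feature wiring described above can be sketched roughly as follows (dependency list abbreviated and version numbers illustrative; the exact entries are in the PR diff):

```toml
[features]
# On by default, so a plain `cargo build -p goose` is unchanged.
default = ["local-inference"]
# Activates the heavy optional dependencies only when requested.
local-inference = ["dep:candle-core", "dep:llama-cpp-2", "dep:tokenizers"]
# CUDA support is meaningless without the local stack, so it implies the feature.
cuda = ["local-inference"]

[dependencies]
candle-core = { version = "0.8", optional = true }
llama-cpp-2 = { version = "0.1", optional = true }
tokenizers = { version = "0.20", optional = true }
```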

Source gating (#[cfg(feature = "local-inference")])

  • providers::local_inference module
  • dictation::whisper module
  • DictationProvider::Local enum variant + related code
  • Provider registry registration
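The source gating pattern can be illustrated with a self-contained sketch (module body inlined here for brevity; names are illustrative, not the exact PR code):

```rust
// Compiled only when the feature is enabled.
#[cfg(feature = "local-inference")]
pub mod local_inference {
    pub fn runtime_name() -> &'static str {
        "llama-cpp-2"
    }
}

// Callers such as the provider registry can probe the flag at compile
// time instead of importing the gated module unconditionally.
pub fn local_inference_enabled() -> bool {
    cfg!(feature = "local-inference")
}

fn main() {
    // Built without the feature (e.g. --no-default-features), this prints false.
    println!("{}", local_inference_enabled());
}
```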

crates/goose-server

  • Propagates local-inference feature (default: enabled)
  • Gates dictation and local_inference route modules
  • Gates InferenceRuntime in AppState

Test plan

  • cargo check -p goose --no-default-features — compiles without candle/llama/symphonia
  • cargo check -p goose — compiles with all defaults (no behavior change)
  • cargo check -p goose-server — compiles with all defaults
  • cargo test -p goose --no-default-features --lib — 922 tests pass (1 pre-existing snapshot failure unrelated to this change)
  • CI validation

🤖 Generated with Claude Code


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a77457d21f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +30 to 31
#[cfg(feature = "local-inference")]
pub mod local_inference;


P1: Keep goose-cli opted into local-inference

Gating goose::providers::local_inference here breaks the standalone CLI build, because crates/goose-cli/Cargo.toml:23 still depends on goose with default-features = false and crates/goose-cli/src/cli.rs:1477-1480 imports goose::providers::local_inference::* unconditionally. In the common cargo build -p goose-cli path, that module no longer exists, so the package stops compiling unless the CLI crate also enables the new feature.
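A sketch of the fix Codex is pointing at — goose-cli re-exposing the flag so the standalone CLI build keeps compiling (feature names follow the PR; the real manifest may differ):

```toml
# crates/goose-cli/Cargo.toml (sketch)
[features]
default = ["local-inference"]
local-inference = ["goose/local-inference"]
cuda = ["local-inference", "goose/cuda"]

[dependencies]
goose = { path = "../goose", default-features = false }
```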


Comment on lines +49 to +52
#[cfg(feature = "local-inference")]
let router = router
.merge(dictation::routes(state.clone()))
.merge(local_inference::routes(state));
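The shadowing pattern in the snippet above can be shown self-contained with a plain route list standing in for axum's `Router` (an illustrative sketch, not the server code):

```rust
fn build_routes() -> Vec<&'static str> {
    let routes = vec!["/agent", "/config"];

    // Merge the gated routes only when the feature is compiled in; the
    // shadowed binding keeps everything after this block identical in
    // both configurations.
    #[cfg(feature = "local-inference")]
    let routes = {
        let mut merged = routes;
        merged.extend(["/dictation", "/local-inference"]);
        merged
    };

    routes
}

fn main() {
    // Without the feature, only the ungated routes remain.
    assert_eq!(build_routes(), vec!["/agent", "/config"]);
}
```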


P1: Gate the OpenAPI registrations with local-inference

This makes routes::dictation and routes::local_inference disappear when local-inference is off, but crates/goose-server/src/openapi.rs:477-492 and crates/goose-server/src/openapi.rs:664-677 still reference those handlers and schema types unconditionally. As a result, cargo build/check -p goose-server --no-default-features now fails with unresolved-module errors, so the new server-side feature flag cannot actually be turned off.


Comment on lines 74 to +78
];

#[cfg(feature = "local-inference")]
pub const LOCAL_PROVIDER_DEF: DictationProviderDef = DictationProviderDef {
provider: DictationProvider::Local,


P1: Return the local dictation provider in /dictation/config

Moving DictationProvider::Local out of PROVIDERS changes crates/goose-server/src/routes/dictation.rs:192-220, which still builds the config response by iterating only PROVIDERS. The desktop settings screen then renders its provider dropdown from that response (ui/desktop/src/components/settings/dictation/DictationSettings.tsx:108-150), so with local-inference enabled the local option disappears from Settings and users can no longer select or manage local dictation.


@soilSpoon
Contributor Author

Follow-up plan

If this PR looks good, I'd like to apply the same pattern to other heavy dependency groups in subsequent PRs:

| PR | Feature flag | Dependencies | Est. savings |
| --- | --- | --- | --- |
| This PR | local-inference | candle, llama-cpp-2, symphonia, rubato, tokenizers, byteorder | ~200MB+ |
| Follow-up 1 | code-analysis | tree-sitter + 9 language parsers | ~50-100MB |
| Follow-up 2 | aws-providers | aws-config, aws-sdk-bedrockruntime, aws-sdk-sagemakerruntime, aws-smithy-types | ~30-50MB |
| Follow-up 3 | telemetry | opentelemetry, opentelemetry_sdk, opentelemetry-otlp, tracing-opentelemetry | ~15-30MB |

Each would follow the same approach: optional = true deps, #[cfg(feature)] module gating, default-enabled, zero behavior change for existing users.

The goal is to let downstream consumers (e.g., projects that only need Agent::reply() with cloud providers) build with a minimal dependency footprint while keeping the full-featured default intact.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: 40a5110cba

soilSpoon and others added 3 commits March 18, 2026 14:50
Heavy dependencies for local LLM inference (candle, llama-cpp-2) and
local Whisper transcription (symphonia, rubato, tokenizers) are now
optional behind the `local-inference` Cargo feature.

This allows downstream consumers of the `goose` crate to opt out of
~200MB+ of compiled dependencies when they only need the core agent
runtime with cloud providers.

The feature is enabled by default, so existing builds are unaffected.
To build without it: `cargo build --no-default-features`

Dependencies made optional:
- candle-core, candle-nn, candle-transformers (Whisper ML inference)
- llama-cpp-2 (local LLM inference)
- tokenizers (HuggingFace tokenizer)
- symphonia (audio decoding)
- rubato (audio resampling)
- byteorder (audio processing)

Gated modules:
- goose::providers::local_inference
- goose::dictation::whisper
- DictationProvider::Local variant
- goose-server dictation & local_inference routes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
…example

- Add `all_providers()` function that returns all provider defs including
  Local when `local-inference` is enabled, fixing a behavioral regression
  where `/dictation/config` endpoint would not list the Local provider
- Add `[[example]]` entry for test_whisper with `required-features` so
  `cargo build --examples` works without `local-inference`
- Use separate cfg blocks instead of `#[allow(unused_mut)]`
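The `all_providers()` fix and the "separate cfg blocks" point can be sketched together (provider names are placeholders, not the real registry entries):

```rust
const PROVIDERS: &[&str] = &["openai", "elevenlabs"];

/// The /dictation/config handler iterates this instead of the static
/// PROVIDERS slice, so Local reappears whenever the feature is enabled.
fn all_providers() -> Vec<&'static str> {
    let providers;
    // Separate cfg blocks avoid the `#[allow(unused_mut)]` that a single
    // conditional `push` would otherwise need when the feature is off.
    #[cfg(feature = "local-inference")]
    {
        let mut with_local = PROVIDERS.to_vec();
        with_local.push("local");
        providers = with_local;
    }
    #[cfg(not(feature = "local-inference"))]
    {
        providers = PROVIDERS.to_vec();
    }
    providers
}

fn main() {
    // The cloud providers are always listed first, in either configuration.
    assert!(all_providers().starts_with(PROVIDERS));
}
```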

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
@soilSpoon soilSpoon force-pushed the feature/gate-local-inference branch from 1d3b17b to a356b06 on March 18, 2026 05:51
… and OpenAPI

- Remove cfg gate from dictation module; only local whisper model
  management routes are gated behind local-inference
- Split OpenAPI into base ApiDoc + conditional LocalInferenceApiDoc
  merged at schema generation time
- Gate LocalModels CLI command and dispatch behind local-inference
- Add local-inference feature to goose-cli (default-enabled)
- Make cuda feature imply local-inference in both server and CLI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: a4763c4244

@jh-block jh-block self-assigned this Mar 19, 2026
@DOsinga DOsinga added the needs_human label ("label to set when a robot looks at a PR and can't handle it") Mar 20, 2026
Merge main into feature/gate-local-inference, keeping both the new
session_events routes from main and the local-inference feature gating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
@soilSpoon soilSpoon force-pushed the feature/gate-local-inference branch from e02c105 to 3545d65 on March 23, 2026 05:54

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: db9ae93381

soilSpoon and others added 7 commits March 23, 2026 16:51
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
…e tab

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
@soilSpoon soilSpoon force-pushed the feature/gate-local-inference branch from db9ae93 to 758e7aa on March 23, 2026 07:52

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: 758e7aadc0

soilSpoon and others added 2 commits March 23, 2026 17:23
The /features endpoint returns compile-time feature flags which are
not sensitive and need to be accessible before authentication setup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>
- Default local-inference feature flag to false on fetch failure to
  prevent showing disabled UI that leads to 404 errors
- Reset active settings tab to 'models' when local-inference becomes
  unavailable while the tab is selected

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: DaeHee Lee <lee111dae11@proton.me>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: da0fb81232

Comment on lines +50 to +52
if (!localInference && loadedProvider === 'local') {
loadedProvider = null;
await upsert('voice_dictation_provider', '', false);


P1: Avoid clearing local dictation provider before flags load

This branch can wipe a valid voice_dictation_provider=local setting during initial render: FeaturesContext starts with localInference as false until /features resolves, so users who already selected local dictation will hit this path and persist '' back to config even when local inference is actually enabled. That creates a silent, destructive preference reset on each mount under normal startup timing; defer this reset until feature loading is complete (e.g., gate on !isLoading).
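The suggested fix can be reduced to a pure helper (names follow the discussion, not the actual `DictationSettings.tsx` code):

```typescript
interface FeatureState {
  localInference: boolean;
  isLoading: boolean; // true until the /features response has resolved
}

// Never reset while flags are still loading: localInference defaults to
// false before /features resolves, which would wipe a valid
// voice_dictation_provider=local setting on every mount.
function shouldResetLocalProvider(
  features: FeatureState,
  provider: string | null
): boolean {
  if (features.isLoading) return false;
  return !features.localInference && provider === 'local';
}
```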


Collaborator

@jh-block jh-block Mar 23, 2026


@soilSpoon This looks like a valid comment from codex -- voice_dictation_provider can get reset to '' if this code runs before the response from /features comes back. I think it needs to only reset the config if !isLoading

Collaborator


I pushed a commit fixing this

@soilSpoon soilSpoon requested review from jh-block March 23, 2026 08:57
Prevent clearing a valid voice_dictation_provider=local setting during
initial render. FeaturesContext starts with localInference=false until
/features resolves, so the reset check would fire prematurely and wipe
the user's preference. Gate the effect on !isFeaturesLoading so the
local inference flag reflects the actual server state before any
provider reset logic runs.

Signed-off-by: jh-block <jhugo@block.xyz>
@jh-block jh-block enabled auto-merge March 23, 2026 15:38

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Reviewed commit: f6b472f3f0

Comment on lines +33 to +34
localInference: features['local-inference'] ?? false,
codeMode: features['code-mode'] ?? true,


P2: Treat missing /features as unknown, not disabled

ExternalBackendSection allows the desktop to target arbitrary goosed versions, including older servers that predate the new /features endpoint. In that case features stays empty and this fallback forces localInference=false, which hides the Local Inference tab and onboarding flow even though those older servers still expose the local-model routes. The same false-negative happens after any transient /features failure, because the context never retries.
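One way to address this would be a tri-state read that distinguishes "server said false" from "server never answered" (an illustrative sketch, not the shipped FeaturesContext code):

```typescript
type FlagValue = boolean | 'unknown';

// Older goosed builds without a /features endpoint return no flags at
// all; reporting 'unknown' lets the UI keep its prior behavior instead
// of hiding features the server may still expose.
function readFlag(features: Record<string, boolean>, key: string): FlagValue {
  return key in features ? features[key] : 'unknown';
}
```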


Comment on lines +52 to +54
if (!localInference && loadedProvider === 'local') {
loadedProvider = null;
await upsert('voice_dictation_provider', '', false);


P1: Avoid clearing local dictation on feature-probe failures

Even with the new loading guard, any permanent failure to populate localInference (for example, connecting the desktop to an older external server with no /features route, or a startup request failure) leaves it false. When a user who already has voice_dictation_provider=local opens Dictation settings, this branch persists '' back to config and silently disables a previously working dictation setup, so the feature probe can now destroy user preferences.


@jh-block jh-block added this pull request to the merge queue Mar 23, 2026
Merged via the queue into block:main with commit c493c61 Mar 23, 2026
21 checks passed
@soilSpoon soilSpoon deleted the feature/gate-local-inference branch March 23, 2026 16:03
wpfleger96 added a commit that referenced this pull request Mar 23, 2026
* origin/main:
  fix: handle reasoning content blocks in OpenAI-compat streaming parser (#8078)
  chore(acp): build native packages on latest mac (#8075)
  Display delegate sub agents logs in UI (#7519)
  Update tar version to avoid CVE-2026-33056 (#8073)
  refactor: consolidate duplicated dependencies into workspace (#8041)
  tui: set up for publishing via github actions (#8020)
  feat: feature-gate local inference dependencies (#7976)
  feat: ability to manage sub recipes in desktop ui (#6360)
lifeizhou-ap added a commit that referenced this pull request Mar 24, 2026
* main: (37 commits)
  fix: handle reasoning content blocks in OpenAI-compat streaming parser (#8078)
  chore(acp): build native packages on latest mac (#8075)
  Display delegate sub agents logs in UI (#7519)
  Update tar version to avoid CVE-2026-33056 (#8073)
  refactor: consolidate duplicated dependencies into workspace (#8041)
  tui: set up for publishing via github actions (#8020)
  feat: feature-gate local inference dependencies (#7976)
  feat: ability to manage sub recipes in desktop ui (#6360)
  Tweak the release process: no more merge to main (#7994)
  fix: gemini models via databricks (#8042)
  feat(apps): Pass toolInfo to MCP Apps via hostContext (#7506)
  fix: remove configured marker when deleting oauth provider configuration (#7887)
  docs: add vmware-aiops MCP extension documentation (#8055)
  Show setup instructions for ACP providers in settings modal (#8065)
  deps: replace sigstore-verification with sigstore-verify to kill vulns (#8064)
  feat(acp): add session/set_config and stabilize list, delete and close (#7984)
  docs: Correct `gosoe` typo to `goose` (#8062)
  fix: use default provider and model when provider in session no longer exists (#8035)
  feat: add GOOSE_SHELL env var to configure preferred shell (#7909)
  fix(desktop): fullscreen header bar + always-visible close controls (#8033)
  ...