refactor(config): unify LLM config under global namespace #44

Merged: iohub merged 1 commit into main from feat-openai-go on May 6, 2026

@iohub iohub commented May 6, 2026

  • Move [llm] section to [global.llm] in configuration files
  • Migrate provider definitions from [llm.providers.*] to [global.llm.providers.*]
  • Update Go config to use c.Global.LLM.Providers instead of c.LLM.Providers
  • Remove legacy LLMConfig struct and GetActiveProvider() fallback logic
  • Update Go LLM client to reference c.Config.Global.LLM
  • Simplify Rust config by removing LlmConfig, ProviderConfig, AgentConfig structs
  • Add default_embedding_db_uri() and default_graph_db_uri() functions
  • Update documentation (README, ARCHITECTURE) to reflect new config structure

Summary by Sourcery

Unify LLM configuration under the [global.llm] namespace and remove legacy per-root [llm] structures in both Go and Rust components.

New Features:

  • Add support for impl_plan as a tools.llm override target.
  • Introduce default paths for embedding and graph databases in the Rust codebase configuration.

Enhancements:

  • Centralize provider definitions under global.llm.providers in config files and Go configuration structures.
  • Update Go LLM client and provider resolution logic to use the unified global.llm configuration and stricter error handling when no provider is configured.
  • Simplify Rust configuration by dropping unused LLM and agent-related structs, retaining only codebase-related settings.
  • Adjust validation and helper methods to handle a missing global.llm block safely and return empty provider lists when unconfigured.
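
As an illustration of the nil-safe helper behavior described in the last enhancement, here is a minimal Go sketch. The type and method names follow the reviewer's guide, but the bodies are assumptions and not the merged code; in particular, sorting the names is added here only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// ProviderConfig mirrors a subset of the fields listed in the reviewer's guide.
type ProviderConfig struct {
	Model string
}

// GlobalLLMConfig corresponds to the [global.llm] block.
type GlobalLLMConfig struct {
	UseProvider string
	Providers   map[string]ProviderConfig
}

// TopLevelConfig holds the optional [global.llm] block; LLM is nil when it is absent.
type TopLevelConfig struct {
	LLM *GlobalLLMConfig
}

// Config is reduced to the single field this sketch needs.
type Config struct {
	Global TopLevelConfig
}

// GetProviderNames returns the configured provider names, or an empty
// (non-nil) slice when [global.llm] is missing.
func (c *Config) GetProviderNames() []string {
	names := []string{}
	if c.Global.LLM == nil {
		return names
	}
	for name := range c.Global.LLM.Providers {
		names = append(names, name)
	}
	sort.Strings(names) // assumption: sorted for stable output
	return names
}

func main() {
	unconfigured := &Config{}
	fmt.Println(len(unconfigured.GetProviderNames())) // prints 0

	configured := &Config{Global: TopLevelConfig{LLM: &GlobalLLMConfig{
		Providers: map[string]ProviderConfig{"b": {}, "a": {}},
	}}}
	fmt.Println(configured.GetProviderNames()) // prints [a b]
}
```

Returning an empty slice instead of nil lets callers range over the result without a separate check for an unconfigured file.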

Documentation:

  • Update example TOML configs, README files, and architecture docs to reflect the new global.llm and global.llm.providers configuration schema.


sourcery-ai Bot commented May 6, 2026

Reviewer's Guide

Refactors LLM configuration to live exclusively under the [global.llm] namespace, removes legacy/duplicate LLM config structures and fallbacks in Go and Rust, updates resolution logic and tooling to use the unified config, adds a new tool override, and refreshes defaults and documentation to match.

Class diagram for unified Go LLM config under global.llm

classDiagram
    class ProviderConfig {
        +string Model
        +float64 Temperature
        +int MaxTokens
        +string APIBaseURL
        +string APIKey
        +string AWSRegion
        +string ModelProvider
    }

    class GlobalLLMConfig {
        +string UseProvider
        +map[string]ProviderConfig Providers
    }

    class ToolLLMOverride {
        +string UseProvider
    }

    class ToolsLLMConfig {
        +string UseProvider
        +ToolLLMOverride* MicroAgent
        +ToolLLMOverride* Thinking
        +ToolLLMOverride* ImplPlan
    }

    class AgentLLMOverride {
        +string UseProvider
    }

    class AgentsLLMConfig {
        +string UseProvider
        +map[string]AgentLLMOverride Agents
    }

    class TopLevelConfig {
        +GlobalLLMConfig* LLM
    }

    class AppConfig {
        +bool EnableStreaming
    }

    class AgentConfig {
        +int ConductorMaxSteps
        +int CodingMaxSteps
        +int RepoMaxSteps
        +string Lang
    }

    class Config {
        +TopLevelConfig Global
        +AgentsLLMConfig Agents
        +ToolsLLMConfig Tools
        +AppConfig App
        +AgentConfig Agent
        +ProviderConfig getProvider(name string)
        +ToolLLMOverride* getToolOverride(toolName string)
        +ProviderConfig ResolveProvider(agentName string, toolName string)
        +[]string GetProviderNames()
        +string resolveEffectiveProviderName()
        +error validate()
    }

    class Client {
        +Config* Config
        +ProviderConfig* ResolveProvider(agentName string, toolName string)
        +string ResolveProviderName(agentName string, toolName string)
        +string resolveProviderName(provider* ProviderConfig)
        +string GenerateCompletionWithMemory(ctx context.Context, memory []Message, prompt string)
    }

    Config --> TopLevelConfig : Global
    Config --> AgentsLLMConfig : Agents
    Config --> ToolsLLMConfig : Tools
    Config --> AppConfig : App
    Config --> AgentConfig : Agent

    TopLevelConfig --> GlobalLLMConfig : LLM
    GlobalLLMConfig --> ProviderConfig : Providers
    AgentsLLMConfig --> AgentLLMOverride : Agents
    ToolsLLMConfig --> ToolLLMOverride : MicroAgent
    ToolsLLMConfig --> ToolLLMOverride : Thinking
    ToolsLLMConfig --> ToolLLMOverride : ImplPlan

    Client --> Config
    Client --> ProviderConfig

Class diagram for simplified Rust config with codebase defaults

classDiagram
    class Config {
        +CodeBaseConfig codebase
        +Config load()
    }

    class CodeBaseConfig {
        +bool enable_embedding
        +string embedding_db_uri
        +string graph_db_uri
        +EmbeddingConfig embedding
    }

    class EmbeddingConfig {
        +string model
        +Option~usize~ dimensions
    }

    class default_embedding_db_uri {
        +string default_embedding_db_uri()
    }

    class default_graph_db_uri {
        +string default_graph_db_uri()
    }

    Config --> CodeBaseConfig : codebase
    CodeBaseConfig --> EmbeddingConfig : embedding

    default_embedding_db_uri ..> CodeBaseConfig : serde_default_embedding_db_uri
    default_graph_db_uri ..> CodeBaseConfig : serde_default_graph_db_uri

Flow diagram for Go provider resolution order without legacy llm fallback

flowchart TD
    A["ResolveProvider(agentName, toolName)"] --> B{"toolName not empty?"}
    B -->|yes| C["toolOverride = getToolOverride(toolName)"]
    C --> D{"toolOverride not nil and toolOverride.UseProvider not empty?"}
    D -->|yes| E["return getProvider(toolOverride.UseProvider)"]
    D -->|no| F{"agentName not empty?"}
    B -->|no| F

    F --> G["agentOverride = Agents.Agents[agentName]"]
    G --> H{"agentOverride.UseProvider not empty?"}
    H -->|yes| I["return getProvider(agentOverride.UseProvider)"]
    H -->|no| J{"Agents.UseProvider not empty?"}
    J -->|yes| K["return getProvider(Agents.UseProvider)"]
    J -->|no| L{"Global.LLM not nil and Global.LLM.UseProvider not empty?"}

    L -->|yes| M["return getProvider(Global.LLM.UseProvider)"]
    L -->|no| N["error: no LLM provider configured"]
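
The resolution order in the flowchart can be sketched in Go roughly as follows. This is a simplified illustration, not the code in internal/config/config.go: the return types, the tool-name strings "micro_agent" and "thinking", and the getProvider error messages are assumptions. The sketch also assumes that an empty agentName falls through to the Agents.UseProvider check, which the flowchart does not label explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

type ProviderConfig struct{ Model string }

type GlobalLLMConfig struct {
	UseProvider string
	Providers   map[string]ProviderConfig
}

type ToolLLMOverride struct{ UseProvider string }

type ToolsLLMConfig struct {
	MicroAgent, Thinking, ImplPlan *ToolLLMOverride
}

type AgentLLMOverride struct{ UseProvider string }

type AgentsLLMConfig struct {
	UseProvider string
	Agents      map[string]AgentLLMOverride
}

type TopLevelConfig struct{ LLM *GlobalLLMConfig }

type Config struct {
	Global TopLevelConfig
	Agents AgentsLLMConfig
	Tools  ToolsLLMConfig
}

// getToolOverride maps a tool name to its override; the key strings are assumed.
func (c *Config) getToolOverride(toolName string) *ToolLLMOverride {
	switch toolName {
	case "micro_agent":
		return c.Tools.MicroAgent
	case "thinking":
		return c.Tools.Thinking
	case "impl_plan":
		return c.Tools.ImplPlan
	}
	return nil
}

// getProvider looks a provider up in global.llm.providers, tolerating a nil block.
func (c *Config) getProvider(name string) (*ProviderConfig, error) {
	if c.Global.LLM == nil {
		return nil, errors.New("global.llm is not configured")
	}
	p, ok := c.Global.LLM.Providers[name]
	if !ok {
		return nil, fmt.Errorf("provider %q not found in global.llm.providers", name)
	}
	return &p, nil
}

// ResolveProvider applies the precedence tool override > agent override >
// agents default > global default, and errors when nothing is configured.
func (c *Config) ResolveProvider(agentName, toolName string) (*ProviderConfig, error) {
	if toolName != "" {
		if o := c.getToolOverride(toolName); o != nil && o.UseProvider != "" {
			return c.getProvider(o.UseProvider)
		}
	}
	if agentName != "" {
		if o, ok := c.Agents.Agents[agentName]; ok && o.UseProvider != "" {
			return c.getProvider(o.UseProvider)
		}
	}
	if c.Agents.UseProvider != "" {
		return c.getProvider(c.Agents.UseProvider)
	}
	if c.Global.LLM != nil && c.Global.LLM.UseProvider != "" {
		return c.getProvider(c.Global.LLM.UseProvider)
	}
	return nil, errors.New("no LLM provider configured")
}

func main() {
	cfg := &Config{
		Global: TopLevelConfig{LLM: &GlobalLLMConfig{
			UseProvider: "default",
			Providers: map[string]ProviderConfig{
				"fast": {Model: "model-a"}, "smart": {Model: "model-b"}, "default": {Model: "model-c"},
			},
		}},
		Agents: AgentsLLMConfig{Agents: map[string]AgentLLMOverride{"coding": {UseProvider: "fast"}}},
		Tools:  ToolsLLMConfig{ImplPlan: &ToolLLMOverride{UseProvider: "smart"}},
	}
	p, _ := cfg.ResolveProvider("coding", "impl_plan")
	fmt.Println(p.Model) // prints model-b (tool override wins)
	p, _ = cfg.ResolveProvider("coding", "thinking")
	fmt.Println(p.Model) // prints model-a (agent override)
	_, err := (&Config{}).ResolveProvider("", "")
	fmt.Println(err) // prints no LLM provider configured
}
```

The provider and agent names in main are placeholders chosen for the example.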

File-Level Changes

Change: Unify Go LLM configuration under global.llm and remove legacy [llm] section and fallbacks.
Details:
  • Remove LLMConfig and the llm field from the root Config struct in favor of Global.LLM.Providers and Global.LLM.UseProvider.
  • Extend GlobalLLMConfig to carry the shared Providers map that was previously on LLMConfig.
  • Delete GetActiveProvider and legacy provider resolution fallback using the [llm] section; ResolveProvider now errors when no provider is configured.
  • Update getProvider, GetProviderNames, resolveEffectiveProviderName, and validate to work through c.Global.LLM and to handle nil Global.LLM safely.
Files:
  • internal/config/config.go

Change: Adjust Go LLM client to use the new global.llm provider configuration and resolution API, and add a new impl_plan tool override.
Details:
  • Replace uses of Config.LLM and GetActiveProvider with Config.Global.LLM and ResolveProvider in the LLM client.
  • Update provider-name resolution helpers to iterate over Config.Global.LLM.Providers.
  • Add ImplPlan field to ToolsLLMConfig and wire it into getToolOverride so tools.llm.impl_plan can override the provider.
  • Change logging in LLM client to log the selected provider via Global.LLM.UseProvider.
Files:
  • internal/config/config.go
  • internal/llm/llm.go

Change: Simplify Rust configuration to drop LLM-related structs and introduce defaults for codebase storage paths.
Details:
  • Remove Rust-side LlmConfig, ProviderConfig, AppConfig, and AgentConfig from the public Config and TOML parsing layer.
  • Narrow Config to only include CodeBaseConfig, matching current usage in the codebase component.
  • Introduce default_embedding_db_uri and default_graph_db_uri helpers and wire them via serde defaults on CodeBaseConfig.embedding_db_uri and graph_db_uri.
  • Mark enable_embedding with serde(default) to allow it to be omitted in config files.
Files:
  • codebase/src/config.rs

Change: Migrate TOML examples and samples to the new [global.llm] and [global.llm.providers.*] layout and rely on defaults for codebase URIs.
Details:
  • Update sample config to remove the legacy [llm] section and move all provider definitions to [global.llm.providers.*].
  • Tighten the documented LLM provider selection precedence chain to tools.llm > agents.llm > global.llm and drop llm from comments.
  • Remove explicit embedding_db_uri and graph_db_uri entries in sample config to rely on new Rust defaults.
  • Update English and Chinese READMEs and both ARCHITECTURE docs to demonstrate [global.llm] and [global.llm.providers.*] instead of [llm].
Files:
  • config/config.toml
  • README.md
  • README_zh.md
  • codebase/docs/ARCHITECTURE.md
  • docs/ARCHITECTURE.md

@iohub iohub merged commit 7d3f514 into main May 6, 2026
1 check passed

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high-level feedback:

  • Several log lines in internal/llm/llm.go still log c.Config.Global.LLM.UseProvider as the model/provider, which will be misleading when an agent/tool override is used; consider logging the resolved provider name (e.g., via ResolveProviderName or by wiring through the name used in ResolveProvider).
  • The validation error "no providers configured in LLM section" in config.validate() no longer matches the refactored structure (providers are now under global.llm.providers); updating this message to reference the new namespace will make configuration issues clearer.
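
To make the second point concrete, a validation routine that names the new namespace might look like the sketch below. It is only an illustration: the exact checks, the message wording, and the TOML key name use_provider are assumptions, not taken from config.validate() in the repository.

```go
package main

import (
	"errors"
	"fmt"
)

type ProviderConfig struct{ Model string }

// GlobalLLMConfig corresponds to the [global.llm] block.
type GlobalLLMConfig struct {
	UseProvider string
	Providers   map[string]ProviderConfig
}

type TopLevelConfig struct{ LLM *GlobalLLMConfig }

type Config struct{ Global TopLevelConfig }

// validate reports configuration problems using the post-refactor section
// names, so the message points at the place the user has to edit.
func (c *Config) validate() error {
	llm := c.Global.LLM
	if llm == nil || len(llm.Providers) == 0 {
		return errors.New("no providers configured under [global.llm.providers]")
	}
	if llm.UseProvider != "" {
		if _, ok := llm.Providers[llm.UseProvider]; !ok {
			// "use_provider" is an assumed TOML key name.
			return fmt.Errorf("global.llm.use_provider %q is not defined in [global.llm.providers]", llm.UseProvider)
		}
	}
	return nil
}

func main() {
	// A config with no [global.llm] block fails with the namespaced message.
	fmt.Println((&Config{}).validate())

	valid := &Config{Global: TopLevelConfig{LLM: &GlobalLLMConfig{
		UseProvider: "default",
		Providers:   map[string]ProviderConfig{"default": {Model: "model-a"}},
	}}}
	fmt.Println(valid.validate()) // prints <nil>
}
```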
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="codebase/src/config.rs" line_range="5" />
<code_context>
-    pub app: AppConfig,
-    pub agent: AgentConfig,
-    pub codebase: CodeBaseConfig,
+fn default_embedding_db_uri() -> String {
+    let home = dirs::home_dir().unwrap_or_default();
+    home.join(".codeactor/data/embedding")
</code_context>
<issue_to_address>
**suggestion:** default_*_db_uri uses unwrap_or_default while Config::load errors on missing home_dir, leading to inconsistent behavior.

Here a missing home directory quietly falls back to an empty `PathBuf`, yielding a relative `.codeactor/...` path. Please either share the same `home_dir` resolution (and error behavior) as `Config::load`, or intentionally support this relative-path fallback and document it so default URIs and the loader behave consistently.

Suggested implementation:

```rust
use std::fs;
use std::path::PathBuf;
use tracing::info;

/// Returns the base data directory used by default DB URIs.
///
/// This intentionally falls back to a *relative* `.codeactor/data` directory
/// when no home directory can be resolved (e.g. in some containerized or
/// restricted environments). Callers that require a hard failure on missing
/// home directories should not use this helper and instead mirror the
/// `Config::load` behavior.
fn default_data_dir() -> PathBuf {
    dirs::home_dir()
        .map(|home| home.join(".codeactor/data"))
        .unwrap_or_else(|| PathBuf::from(".codeactor/data"))
}

/// Default embedding DB URI.
///
/// If a home directory is available, this will be
/// `$HOME/.codeactor/data/embedding`. Otherwise it falls back to the relative
/// path `.codeactor/data/embedding` from the current working directory.
fn default_embedding_db_uri() -> String {
    default_data_dir()
        .join("embedding")
        .to_string_lossy()
        .to_string()
}

/// Default graph DB URI.
///
/// If a home directory is available, this will be
/// `$HOME/.codeactor/data/graph`. Otherwise it falls back to the relative
/// path `.codeactor/data/graph` from the current working directory.
fn default_graph_db_uri() -> String {
    default_data_dir()
        .join("graph")
        .to_string_lossy()
        .to_string()
}

```

To make the behavior fully consistent across the codebase (as mentioned in your review comment), you should also:
1. Refactor `Config::load` to use the new `default_data_dir()` helper (or a variant of it) instead of resolving `dirs::home_dir()` independently.
2. If `Config::load` must continue to *error* when `home_dir` is missing, introduce a separate helper (e.g. `fn resolve_home_dir_or_error() -> Result<PathBuf, ConfigError>`) that encapsulates that behavior, and document the distinction between:
   - "strict" home-dir resolution used by `Config::load`, and
   - the "lenient" fallback used by `default_*_db_uri`.
3. Update any documentation or config help text to mention that default DB URIs may fall back to a relative `.codeactor/data/...` path when no home directory is available, so users are not surprised by the behavior.
</issue_to_address>


