Skip to content

Multi-Model Provider Support#4

Merged
OBenner merged 35 commits intodevelopfrom
auto-claude/046-multi-model-provider-support
Mar 1, 2026
Merged

Multi-Model Provider Support#4
OBenner merged 35 commits intodevelopfrom
auto-claude/046-multi-model-provider-support

Conversation

@OBenner
Copy link
Owner

@OBenner OBenner commented Feb 6, 2026

Support multiple LLM providers beyond Claude including OpenAI GPT, Google Gemini, and local models via Ollama. Allow users to choose models based on task complexity, cost, or preference.


Note

Medium Risk
Touches core agent session creation and .env mutation from the UI, which can impact build execution and configuration correctness across providers; most other changes are additive (pricing tables, IPC/UI, and tests).

Overview
Adds multi-provider AI engine configuration (Claude/LiteLLM/OpenRouter) via new .env variables and updates agent session creation to go through a provider factory (notably in agents/coder.py, plus new provider-backed session helpers for planner/QA).

Expands cost tracking to cover additional model families (OpenAI, Gemini, Zhipu GLM, Ollama zero-cost) with richer pricing tables and warnings when unknown models are encountered, and wires the desktop app to load cost_report.json over IPC.

Introduces an Electron “AI Providers” settings section with IPC handlers to read/write provider settings into the project .env, a model-list IPC endpoint, plus new UI/i18n/types for provider settings and a cost breakdown chart. Adds new E2E test suites/docs for OpenAI, Ollama, and per-agent provider selection, and removes older standalone verification scripts.

Written by Cursor Bugbot for commit 31a285a. Configure here.

Summary by CodeRabbit

  • New Features

    • Multi-provider AI support with per-agent provider/model overrides, provider settings UI, IPC handlers, and model enumeration.
    • Project cost reporting UI: provider/model cost breakdown chart and project cost report loader.
  • Documentation

    • Expanded backend .env example with multi-provider configuration, model listings, fallback & cost guidance.
  • Tests

    • Extensive new/updated E2E and unit tests for providers, per-agent selection, and cost tracking (including Ollama/OpenAI scenarios).
  • i18n

    • Added cost- and provider-related translations (en/fr).

OBenner and others added 26 commits February 3, 2026 20:34
…e multi-provider models

- Add logging support with logger for cost tracking
- Enhance calculate_cost() docstring to document multi-provider support
  (Claude, OpenAI, Google Gemini, Ollama)
- Add warning log when unknown model falls back to default pricing
- Add debug log for all cost calculations
- Improve code clarity with explicit pricing lookup and error handling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…tion logic

- Add create_planner_session() factory function that uses provider config
- Replace direct create_client() call with provider factory pattern
- Follow pattern from core/providers/factory.py for consistency
- Support multi-provider architecture (Claude, with stubs for LiteLLM/OpenRouter)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…n examples

- Add MULTI-PROVIDER CONFIGURATION section explaining AI_ENGINE_PROVIDER
- Add PER-AGENT PROVIDER CONFIGURATION section with AGENT_PROVIDER_<TYPE> env vars
- Document supported providers (claude, litellm, openrouter)
- Add provider-specific API key documentation (OpenAI, Google, OpenRouter, Ollama)
- Include example strategies (Privacy-First, Cost Optimization, Hybrid, Google Gemini)
- Expand AGENT-SPECIFIC MODEL CONFIGURATION to list OpenAI, Gemini, and Ollama models
- Update cost tracking note to mention multi-provider support
…e, OpenRouter)

Added API key input fields to ProviderSettingsSection component:
- OpenAI API Key input (for LiteLLM provider with GPT-4 models)
- Google AI API Key input (for LiteLLM provider with Gemini models)
- OpenRouter API Key input (for OpenRouter provider)
- Save button to persist settings
- Conditional rendering based on selected provider
- i18n translation keys in English and French

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…e to ProviderSettingsSection

- Added state for plannerModel, coderModel, and qaModel
- Added UI for model selection inputs for each agent type (planner, coder, qa)
- Model selection appears when provider is litellm or openrouter
- Added translation keys in English and French (aiProvider.models)
- Integrated model values into handleSave function

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…and run task

Created comprehensive E2E testing suite for OpenAI provider integration:

Automated Tests (tests/test_openai_provider_e2e.py):
- Provider configuration from environment variables
- OpenAI cost tracking and pricing validation
- Provider factory integration tests
- E2E tests with mocked API (fast, offline)
- Optional live API tests (requires OPENAI_API_KEY)
- Per-agent provider configuration tests

Test Coverage:
- OpenAI GPT-4 provider configuration
- Cost calculation accuracy for GPT-4 models
- Multi-model cost tracking
- Mixed Claude/OpenAI usage scenarios
- Per-agent provider selection

The test suite includes:
1. Unit tests for provider config and cost tracking
2. Integration tests for provider factory
3. E2E tests (mocked) for full workflow
4. E2E tests (live) for actual API integration
5. Manual test procedure documentation in docstrings

Run with: pytest tests/test_openai_provider_e2e.py -v

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added comprehensive README for OpenAI provider E2E tests:
- Test execution instructions (mocked and live API modes)
- Manual test procedure reference
- Test coverage details
- Troubleshooting guide
- Validation instructions

Completes subtask-7-1: E2E test for OpenAI provider configuration.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…anner=opus, coder=gpt-4)

- Created comprehensive E2E test suite for per-agent provider selection
- Test file: tests/test_per_agent_provider_e2e.py
- Tests planner using Claude Opus and coder using OpenAI GPT-4
- Verifies cost_report.json shows both Claude and OpenAI usage
- Includes mocked and live API test scenarios
- Created verification script to validate core functionality
- All verification tests pass successfully
Fixes:
- Added Claude 3.5 Sonnet models (2 variants)
- Added Claude 3 models (Opus, Sonnet, Haiku)
- Added Zhipu AI GLM-4 series models (9 variants)
- Updated .env.example with pricing documentation
- Updated module docstrings

Models Added:
- Claude 3.5: claude-3-5-sonnet-20241022, claude-3-5-sonnet-20240620
- Claude 3: claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307
- GLM: glm-4.7, glm-4.5, glm-4-plus, glm-4, glm-4-air, glm-4-airx, glm-4-flash, glm-4-flashx, glm-3-turbo

Verified:
- All new models present in MODEL_PRICING
- Cost calculation works correctly for all new models
- Documentation updated in .env.example

QA Fix Session: 1

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 8 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

This PR is being reviewed by Cursor Bugbot

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key")

# Create provider config for planner
config = ProviderConfig.from_env(agent_type="planner")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests call from_env with unsupported agent_type parameter

High Severity

ProviderConfig.from_env() does not accept an agent_type parameter, but the E2E tests in test_per_agent_provider_e2e.py call ProviderConfig.from_env(agent_type="planner") and ProviderConfig.from_env(agent_type="coder"). Since the method signature is from_env(cls), the agent_type keyword argument will cause a TypeError at runtime. This means the per-agent provider selection feature is untested — the tests that validate it will crash rather than actually testing the behavior.

Fix in Cursor Fix in Web

{providerCost.modelBreakdown
.sort((a, b) => b.cost - a.cost)
.map((model) => {
const modelPercentage = (model.cost / providerCost.cost) * 100;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Division by zero when provider cost is zero

Medium Severity

modelPercentage is computed as model.cost / providerCost.cost * 100. When providerCost.cost is zero (which is the expected case for Ollama/local models — a key feature of this PR), this produces NaN or Infinity, rendering as "NaN%" in the UI.

Fix in Cursor Fix in Web

print("OK")
except ImportError as e:
print(f"FAIL: {e}")
exit(1)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally committed test file in backend root

Low Severity

test_reviewer_import.py appears to be a quick manual verification script that mutates os.environ at import time. It's placed in the backend root rather than the tests/ directory and is not part of the test suite. It looks like a development artifact that was accidentally included in the commit.

Fix in Cursor Fix in Web

"roadmap_discovery": "claude",
"competitor_analysis": "claude",
"ideation": "claude",
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AGENT_DEFAULT_PROVIDERS dict is defined but never referenced

Medium Severity

AGENT_DEFAULT_PROVIDERS is a 46-line dictionary that is defined and documented in .env.example as priority level 3 for provider resolution, but get_provider_for_agent never reads from it. The function falls through directly to the global AI_ENGINE_PROVIDER config, skipping the per-agent defaults entirely. This is dead code that creates a false expectation for anyone reading the docs.

Fix in Cursor Fix in Web

},
# ========================================
# Fallback
# ========================================
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing claude-opus-4-20250514 from cost pricing database

Medium Severity

The model claude-opus-4-20250514 is used throughout the codebase (tests, frontend model lists, provider adapters) but is absent from MODEL_PRICING. Cost calculations for this model silently fall back to default Sonnet pricing ($3/$15 per 1M tokens) instead of the correct Opus pricing ($15/$75), resulting in significantly underreported costs.

Fix in Cursor Fix in Web

coderModel,
qaModel,
});
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provider settings save button only logs to console

High Severity

The handleSave function only calls console.log with the settings (including API keys) instead of invoking the SETTINGS_SAVE_PROVIDER IPC handler that's already registered on the backend. When a user configures a non-Claude provider, enters API keys and model selections, and clicks "Save," nothing is persisted. The component also never calls SETTINGS_LOAD_PROVIDER to populate fields with existing values, so previously saved settings won't appear either. The entire provider settings UI is non-functional.

Fix in Cursor Fix in Web

return `${varName}=${vars[varName]}`;
}
}
return line;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Env generator uncomments variables creating duplicate definitions

Medium Severity

generateProviderEnvContent receives ALL parsed env variables in vars, not just provider-related ones. When processing lines, it uncomments any # VAR_NAME= comment line if vars[varName] is defined. If the same variable exists both as a comment (e.g., a documentation example) and as an active assignment, the comment gets uncommented while the active line is also updated, creating duplicate variable definitions in the .env file.

Fix in Cursor Fix in Web

session_config,
project_dir=project_dir,
spec_dir=spec_dir,
agent_type=agent_type_for_session,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coder passes Claude-specific kwargs to any provider's create_session

High Severity

When AI_ENGINE_PROVIDER is set to litellm or openrouter, provider.create_session() is called with Claude-specific keyword arguments (project_dir, spec_dir, agent_type, max_thinking_tokens) that only ClaudeAgentProvider.create_session() accepts. LiteLLMProvider.create_session(self, config) and OpenRouterProvider.create_session(self, config) don't accept these kwargs, so this call will crash with a TypeError. Unlike create_planner_session, create_qa_fixer_session, and create_qa_reviewer_session, which explicitly check provider.name == "claude" and raise a helpful NotImplementedError, the coder path has no such guard.

Fix in Cursor Fix in Web

@OBenner
Copy link
Owner Author

OBenner commented Feb 7, 2026

/review

@pantoaibot
Copy link

pantoaibot bot commented Feb 7, 2026

PR Summary:

Summary: Introduces multi-model/provider support across backend agents, cost tracking, frontend settings/UI, IPC, and a large E2E test suite — switching agent sessions to a provider factory (ProviderConfig + create_engine_provider) and adding provider-aware cost reporting and UI controls.

Key changes:

  • Backend: provider factory & session model

    • Replaces direct create_client usage with ProviderConfig.from_env() and create_engine_provider(); sessions created via provider.create_session and expose .client.
    • Coder, Planner, QA reviewer/fixer now use the provider/session factory (new create_*_session helpers and run wrappers).
    • Many agent modules updated to accept provider/session abstraction; some provider branches (non-claude) are TODO and raise NotImplementedError.
    • New utilities to select provider per agent (phase_config: AGENT_DEFAULT_PROVIDERS, get_provider_for_agent).
    • Removed legacy manual fallback/verification scripts; added new verification helpers and scripts for per-agent/provider flows.
  • Cost tracking & models

    • core.cost_tracking expanded: MODEL_PRICING now contains many providers/models (Claude, OpenAI, Google Gemini, Ollama, Zhipu GLM).
    • Cost calculation supports multi-provider models, logs warnings for unknown models, and treats local Ollama models as zero-cost.
    • CostTracker persists cost_report.json (records, total_cost, last_updated) for UI consumption.
  • QA & agent modules

    • QA reviewer and fixer now include provider factory factory functions and async run wrappers to unify session creation across providers.
    • Memory integration and logging preserved; new error handling and session lifecycle around provider sessions.
  • Frontend: provider settings, IPC, analytics

    • New ProviderSettingsSection UI for choosing provider, API keys, and per-agent model overrides.
    • Added IPC handlers: SETTINGS_SAVE_PROVIDER, SETTINGS_LOAD_PROVIDER, SETTINGS_GET_AVAILABLE_MODELS, PROJECT_LOAD_COST_REPORT.
    • Added cost analytics: CostBreakdownChart component and exports; i18n entries for provider/cost UI strings.
    • Settings dialog updated to include "Providers" section and wiring placeholders for save/load.
    • Shared types extended to include ProviderSettings and CostReport/UsageRecord types.
  • Tests & docs

    • Large set of new E2E test files added: tests for OpenAI, Ollama, per-agent provider selection; test import helper for QA reviewer.
    • Test docs added (README_E2E_OPENAI.md) and manual test procedures present in tests.
    • Legacy test_fallback_manual.py removed.

Breaking / migration notes:

  • create_client usage removed — code that previously called core.client.create_client must be updated to use ProviderConfig/create_engine_provider and provider.create_session (session.client).
  • New env var AI_ENGINE_PROVIDER (and per-agent AGENT_PROVIDER_) and per-agent model env vars (AGENT_MODEL_) are supported — update .env or settings accordingly.
  • Some provider paths still only implement Claude; non-claude providers may raise NotImplementedError until adapters are added.
  • cost_report.json format added/extended — frontend and tooling expecting this file should read the new shape (spec_dir, total_cost, records, last_updated).
  • New IPC channels introduced (provider settings + cost report) — renderer code using IPC will need to adopt them.

Risk/impact:

  • Medium: core session creation and provider selection changed across agent code paths and settings/UI integration. Review integrations where create_client was used and update environment/config to set AI_ENGINE_PROVIDER and per-agent overrides.

Reviewed by Panto AI

Comment on lines 284 to +314

# Create client (fresh context) with phase-specific model and thinking
# Create provider using factory pattern for multi-provider support
# The provider is selected via AI_ENGINE_PROVIDER env var (default: claude)
provider_config = ProviderConfig.from_env()
provider = create_engine_provider(provider_config)

# Create session with phase-specific model and thinking
# Use appropriate agent_type for correct tool permissions and thinking budget
client = create_client(
project_dir,
spec_dir,
phase_model,
agent_type="planner" if first_run else "coder",
agent_type_for_session = "planner" if first_run else "coder"
from core.providers.base import SessionConfig

session_config = SessionConfig(
name=f"{agent_type_for_session}-session-{iteration}",
model=phase_model,
extra={
"agent_type": agent_type_for_session,
"max_thinking_tokens": phase_thinking_budget,
},
)

session = provider.create_session(
session_config,
project_dir=project_dir,
spec_dir=spec_dir,
agent_type=agent_type_for_session,
max_thinking_tokens=phase_thinking_budget,
)

# Get the underlying Claude SDK client from the session
client = session.client

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL_BUG] The new provider/session creation assumes provider.create_session(...) returns an object with .client and that the .client implements 'async with' context manager semantics. This will break if a provider implementation differs (different attribute names or sync client). Add explicit validation and error handling: - Validate provider and returned session: if not hasattr(session, 'client') raise a clear ProviderError. - Validate client supports async context manager (has aenter/aexit), or adapt by wrapping sync clients in an async executor or providing a session.close() fallback. - Surround provider.create_session with try/except to catch provider creation errors and log/provide a helpful message (include provider_config.provider name and model). This avoids hard crashes when a non-claude provider is configured.

# After creating the session in apps/backend/agents/coder.py
from core.providers.errors import ProviderError  # if such error type exists

session = provider.create_session(
    session_config,
    project_dir=project_dir,
    spec_dir=spec_dir,
    agent_type=agent_type_for_session,
    max_thinking_tokens=phase_thinking_budget,
)

# Validate session shape and client capabilities for multi-provider safety
if not hasattr(session, "client"):
    raise ProviderError(
        f"Provider session for agent_type='{agent_type_for_session}' is missing 'client' attribute "
        f"(provider={getattr(provider_config, 'provider', 'unknown')}, model={phase_model})"
    )

client = session.client

# Optionally adapt non-async clients if future providers return a sync client
if not hasattr(client, "__aenter__") or not hasattr(client, "__aexit__"):
    raise ProviderError(
        f"Client returned by provider does not support async context manager usage: "
        f"type={type(client)} (provider={getattr(provider_config, 'provider', 'unknown')}, model={phase_model})"
    )

Comment on lines +79 to +96
def create_multi_provider_env_config(spec_dir: Path) -> Path:
"""Create .env file with per-agent provider configuration."""
env_file = spec_dir.parent.parent.parent / ".env"
env_content = """# Per-Agent Provider Configuration
# Planner uses Claude Opus
AGENT_PROVIDER_PLANNER=claude
AGENT_MODEL_PLANNER=claude-opus-4-20250514

# Coder uses OpenAI GPT-4
AGENT_PROVIDER_CODER=litellm
AGENT_MODEL_CODER=gpt-4
OPENAI_API_KEY=test-key-123

# Default provider (for other agents)
AI_ENGINE_PROVIDER=claude
"""
env_file.write_text(env_content)
return env_file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL_BUG] create_multi_provider_env_config writes a .env file in the derived path (spec_dir.parent.parent.parent / ".env"). Avoid writing real API keys or persistent config into repo-root-like locations during tests. Use monkeypatch.setenv to set environment variables in-process, or write to a temporary file that you delete after the test. If writing files is required, ensure they live under an isolated temp directory and are cleaned up, and avoid using real secrets.

# In tests/test_per_agent_provider_e2e.py
from pathlib import Path


def create_multi_provider_env_config(spec_dir: Path) -> dict[str, str]:
    """Return env vars for multi-provider config without touching the real .env.

    The caller is responsible for applying these with monkeypatch.setenv.
    """
    return {
        "AGENT_PROVIDER_PLANNER": "claude",
        "AGENT_MODEL_PLANNER": "claude-opus-4-20250514",
        "AGENT_PROVIDER_CODER": "litellm",
        "AGENT_MODEL_CODER": "gpt-4",
        "OPENAI_API_KEY": "test-key-123",
        "AI_ENGINE_PROVIDER": "claude",
    }


def apply_multi_provider_env(monkeypatch, spec_dir: Path) -> None:
    """Helper to apply test-only env vars for multi-provider config."""
    for key, value in create_multi_provider_env_config(spec_dir).items():
        monkeypatch.setenv(key, value)

Comment on lines +21 to +65
# Save original env vars
orig_provider = os.environ.get("AI_ENGINE_PROVIDER")
orig_planner = os.environ.get("AGENT_PROVIDER_PLANNER")
orig_coder = os.environ.get("AGENT_PROVIDER_CODER")

try:
# Set test configuration
os.environ["AI_ENGINE_PROVIDER"] = "claude"
os.environ["AGENT_PROVIDER_PLANNER"] = "claude"
os.environ["AGENT_MODEL_PLANNER"] = "claude-opus-4-20250514"
os.environ["AGENT_PROVIDER_CODER"] = "litellm"
os.environ["AGENT_MODEL_CODER"] = "gpt-4"

# Test planner uses claude
planner_provider = get_provider_for_agent("planner")
assert planner_provider == "claude", f"✗ Planner should use Claude, got {planner_provider}"
print("✓ Planner uses Claude")

# Test coder uses litellm
coder_provider = get_provider_for_agent("coder")
assert coder_provider == "litellm", f"✗ Coder should use LiteLLM, got {coder_provider}"
print("✓ Coder uses LiteLLM")

# Test fallback to default
os.environ.pop("AGENT_PROVIDER_QA_REVIEWER", None)
qa_provider = get_provider_for_agent("qa_reviewer")
assert qa_provider == "claude", f"✗ QA reviewer should fall back to Claude, got {qa_provider}"
print("✓ QA reviewer falls back to default Claude")

finally:
# Restore original env vars
if orig_provider:
os.environ["AI_ENGINE_PROVIDER"] = orig_provider
else:
os.environ.pop("AI_ENGINE_PROVIDER", None)

if orig_planner:
os.environ["AGENT_PROVIDER_PLANNER"] = orig_planner
else:
os.environ.pop("AGENT_PROVIDER_PLANNER", None)

if orig_coder:
os.environ["AGENT_PROVIDER_CODER"] = orig_coder
else:
os.environ.pop("AGENT_PROVIDER_CODER", None)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[REFACTORING] You save and restore only AGENT_PROVIDER_* env vars (orig_provider, orig_planner, orig_coder) but the test also sets AGENT_MODEL_PLANNER, AGENT_MODEL_CODER and could set other vars (OPENAI_API_KEY, AGENT_PROVIDER_QA_REVIEWER, etc.). Expand the save/restore logic to capture and restore all environment variables you mutate (including AGENT_MODEL_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.), or use monkeypatch.setenv/delenv inside pytest fixtures. That prevents side effects on the developer environment when running the quick verification script.

def test_per_agent_provider_configuration():
    """Test that per-agent provider configuration works."""
    from phase_config import get_provider_for_agent

    print("Testing per-agent provider configuration...")

    # Snapshot all env vars we may touch so we can fully restore later
    env_keys = [
        "AI_ENGINE_PROVIDER",
        "AGENT_PROVIDER_PLANNER",
        "AGENT_MODEL_PLANNER",
        "AGENT_PROVIDER_CODER",
        "AGENT_MODEL_CODER",
        "AGENT_PROVIDER_QA_REVIEWER",
        "AGENT_MODEL_QA_REVIEWER",
        "OPENAI_API_KEY",
        "ANTHROPIC_API_KEY",
    ]
    orig_env = {k: os.environ.get(k) for k in env_keys}

    try:
        # Set test configuration
        os.environ["AI_ENGINE_PROVIDER"] = "claude"
        os.environ["AGENT_PROVIDER_PLANNER"] = "claude"
        os.environ["AGENT_MODEL_PLANNER"] = "claude-opus-4-20250514"
        os.environ["AGENT_PROVIDER_CODER"] = "litellm"
        os.environ["AGENT_MODEL_CODER"] = "gpt-4"

        planner_provider = get_provider_for_agent("planner")
        assert planner_provider == "claude", f"✗ Planner should use Claude, got {planner_provider}"
        print("✓ Planner uses Claude")

        coder_provider = get_provider_for_agent("coder")
        assert coder_provider == "litellm", f"✗ Coder should use LiteLLM, got {coder_provider}"
        print("✓ Coder uses LiteLLM")

        os.environ.pop("AGENT_PROVIDER_QA_REVIEWER", None)
        qa_provider = get_provider_for_agent("qa_reviewer")
        assert qa_provider == "claude", f"✗ QA reviewer should fall back to Claude, got {qa_provider}"
        print("✓ QA reviewer falls back to default Claude")

    finally:
        # Fully restore original environment
        for key, value in orig_env.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value

Comment on lines +313 to +326
export interface ProviderSettings {
// Selected provider (claude, litellm, openrouter)
provider?: 'claude' | 'litellm' | 'openrouter';

// API Keys
openaiApiKey?: string;
googleApiKey?: string;
openrouterApiKey?: string;

// Per-agent model configuration
plannerModel?: string;
coderModel?: string;
qaModel?: string;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[REFACTORING] ProviderSettings.provider is currently a union of 'claude' | 'litellm' | 'openrouter'. Given the goal to support multiple backends and future providers (OpenAI, Google/Gemini, Ollama, self-hosted), consider widening this to a string union (e.g., provider?: 'claude' | 'litellm' | 'openrouter' | 'openai' | 'google' | 'ollama' | string) or an enum that's easy to extend. This improves forward compatibility and avoids type churn when adding new providers.

// apps/frontend/src/shared/types/settings.ts

// Provider Settings for Multi-Model Support
export type ProviderId =
  | 'claude'
  | 'litellm'
  | 'openrouter'
  | 'openai'
  | 'google'
  | 'ollama'
  | (string & {}); // allow forward-compatible custom providers

export interface ProviderSettings {
  // Selected provider (claude, litellm, openrouter, openai, google, ollama, etc.)
  provider?: ProviderId;

  // API Keys
  openaiApiKey?: string;
  googleApiKey?: string;
  openrouterApiKey?: string;

  // Per-agent model configuration
  plannerModel?: string;
  coderModel?: string;
  qaModel?: string;
}

Comment on lines +309 to +325
// Cost tracking types (from cost_tracking.py)
export interface UsageRecord {
agent_type: string;
model: string;
input_tokens: number;
output_tokens: number;
cost: number;
timestamp: string;
}

export interface CostReport {
spec_dir: string;
total_cost: number;
records: UsageRecord[];
last_updated: string;
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[REFACTORING] You added UsageRecord/CostReport TS interfaces (copied from cost_tracking.py). Consider placing these shared types in a single frontend-shared types file (or generating them from a canonical schema) so the shapes stay in sync with backend cost tracking. Also add optional fields and nullability where appropriate to mirror backend behavior (e.g., cost may be absent or null in some records).

// apps/frontend/src/shared/types/cost.ts

export interface UsageRecord {
  agent_type: string;
  model: string;
  input_tokens: number;
  output_tokens: number;
  cost?: number | null;
  timestamp: string;
}

export interface CostReport {
  spec_dir: string;
  total_cost?: number | null;
  records: UsageRecord[];
  last_updated: string;
}
// apps/frontend/src/shared/types/task.ts
// ...existing imports...
-import interface declarations for UsageRecord/CostReport here
+// Cost tracking types
+export type { UsageRecord, CostReport } from './cost';

@pantoaibot
Copy link

pantoaibot bot commented Feb 7, 2026

Reviewed up to commit:31a285a2f1a19028e09284cb80014c121ff15a41

Additional Suggestion
apps/backend/core/cost_tracking.py, line:313-381 UsageRecord is deserialized using UsageRecord.from_dict(...) (see loading logic around lines ~378-381) but the dataclass defines only to_dict() and does not implement from_dict. Add a @classmethod from_dict(cls, data: dict) -> 'UsageRecord' to safely construct instances (validate types, coerce missing fields) or change the loader to construct UsageRecord(...) directly. Example: @classmethod def from_dict(cls, d): return cls(agent_type=d['agent_type'], model=d['model'], input_tokens=int(d['input_tokens']), output_tokens=int(d['output_tokens']), cost=float(d.get('cost', 0.0)), timestamp=d.get('timestamp', datetime.utcnow().isoformat()+'Z')). Ensure robust error handling for malformed records to avoid raising at load time.
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Mapping


@dataclass
class UsageRecord:
    """Single usage record for an agent session."""

    agent_type: str
    model: str
    input_tokens: int
    output_tokens: int
    cost: float
    timestamp: str

    def to_dict(self) -> dict[str, Any]:
        """Convert to dictionary for JSON serialization."""
        return {
            "agent_type": self.agent_type,
            "model": self.model,
            "input_tokens": self.input_tokens,
            "output_tokens": self.output_tokens,
            "cost": self.cost,
            "timestamp": self.timestamp,
        }

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "UsageRecord":
        """Safely construct a UsageRecord from a dict.

        This is used when loading historical cost_report.json files. It is
        intentionally defensive so that newly added fields or partially
        missing data don't crash cost loading.
        """
        try:
            return cls(
                agent_type=str(data.get("agent_type", "unknown")),
                model=str(data.get("model", "unknown")),
                input_tokens=int(data.get("input_tokens", 0)),
                output_tokens=int(data.get("output_tokens", 0)),
                cost=float(data.get("cost", 0.0)),
                timestamp=str(
                    data.get(
                        "timestamp",
                        datetime.utcnow().isoformat() + "Z",
                    )
                ),
            )
        except (TypeError, ValueError, KeyError) as exc:
            # If a record is malformed, log and fall back to a zeroed-out record
            logger.warning(
                "Failed to deserialize UsageRecord from %r: %s", data, exc
            )
            return cls(
                agent_type="unknown",
                model="unknown",
                input_tokens=0,
                output_tokens=0,
                cost=0.0,
                timestamp=datetime.utcnow().isoformat() + "Z",
            )


class CostTracker:
    ...
    def _load_existing_records(self) -> None:
        """Load existing records from cost_report.json if present."""
        if not self._report_file.exists():
            self.records = []
            return

        try:
            with open(self._report_file, "r", encoding="utf-8") as f:
                data = json.load(f)

            raw_records = data.get("records", [])
            self.records = [
                UsageRecord.from_dict(record)
                for record in raw_records
                if isinstance(record, dict)
            ]
        except (json.JSONDecodeError, KeyError, TypeError):
            # If file is corrupted, start fresh
            logger.warning(
                "Failed to load existing cost records from %s; starting fresh",
                self._report_file,
            )
            self.records = []

Reviewed by Panto AI

OBenner and others added 2 commits February 28, 2026 21:35
Merge origin/develop into PR #4 branch, resolving 9 file conflicts:
- .env.example: use unified Auto Code naming
- coder.py, planner.py, fixer.py: keep both provider factory and develop imports
- settings-handlers.ts: merge getAvailableModels + provider config handlers
- AppSettings.tsx: merge all new sections and imports
- en/common.json, fr/common.json: merge cost + memory/contextViewer sections
- settings.ts: merge ProviderSettings + keyboard shortcuts + AIProviderConfig types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ocks

- Fix circular import: prompts_pkg → agents → coder/planner → prompts_pkg
  by making load_template and is_first_run lazy imports
- Add updateProviderConfig/getProviderConfig/validateProviderConfig to
  ElectronAPI Window type declaration and browser mock
- Wire ProviderSettingsSection to use existing updateProviderConfig IPC
  instead of non-existent saveProviderSettings
- Fix test_planner.py mocks: create_client → create_planner_session
- Fix provider e2e tests: config.model → config.get_model_for_provider()
- Fix AGENT_MODEL_<TYPE> override to apply to correct provider's model field
- Fix Python 3.14 compatibility: ast.Num/Str/Bytes removed in 3.14
- Fix test_issue_884 mocks for new provider factory pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end multi‑provider AI engine support: provider config and factory, per‑agent provider resolution, provider-backed AgentSession usage across agents, expanded multi‑provider cost tracking and reporting, frontend provider settings + cost UI, and extensive tests and docs. Changes are primarily new features and test additions; most logic shifts provider/session creation.

Changes

Cohort / File(s) Summary
Env & docs
apps/backend/.env.example
Expanded multi‑provider and per‑agent env guidance, model lists, cost/fallback notes (documentation only).
Provider core
apps/backend/core/providers/config.py, apps/backend/core/providers/factory.py, apps/backend/core/providers/__init__.py
Introduce ProviderConfig.from_env(agent_type), per‑agent overrides, create_agent_session factory, Claude‑specific SessionConfig and export of factory.
Agent integration
apps/backend/agents/coder.py, apps/backend/agents/planner.py, apps/backend/qa/fixer.py, apps/backend/qa/reviewer.py
Replace direct client creation with provider-backed AgentSession flows, add create_*_session helpers, lazy imports to avoid cycles, and validate session.client presence.
Provider resolution
apps/backend/phase_config.py
Add AGENT_DEFAULT_PROVIDERS and get_provider_for_agent(agent_type) with precedence: AGENT_PROVIDER_ → AI_ENGINE_PROVIDER → defaults.
Cost tracking & utils
apps/backend/core/cost_tracking.py, verify_per_agent_provider.py
Expand MODEL_PRICING across providers, fallback pricing, defensive UsageRecord.from_dict, logging/warnings on unknown models, and standalone verification script.
Prompts & AST
apps/backend/prompts_pkg/prompts.py, apps/backend/merge/rename_detector.py
Move template loader to lazy import; unify AST constant detection for compatibility across Python versions.
Frontend IPC & main
apps/frontend/src/main/ipc-handlers/settings-handlers.ts, apps/frontend/src/main/ipc-handlers/project-handlers.ts, apps/frontend/src/shared/constants/ipc.ts
Add IPC handlers for provider settings (save/load/get models) and project cost report loading with path validation and error handling; new IPC channel constants.
Frontend UI & components
apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx, apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx, apps/frontend/src/renderer/components/settings/AppSettings.tsx, .../settings/index.ts, .../analytics/index.ts
Add ProviderSettingsSection UI, ProviderSettingsSection export, CostBreakdownChart component + types, change nav id to 'providers', and update barrel exports.
Frontend types & i18n
apps/frontend/src/shared/types/settings.ts, apps/frontend/src/shared/types/task.ts, apps/frontend/src/shared/types/ipc.ts, apps/frontend/src/shared/i18n/locales/*
Add ProviderSettings, extend AIEngineProvider (add zhipuai), augment AIProviderConfig with per‑agent fields, add UsageRecord/CostReport types, ElectronAPI provider methods, and EN/FR translations.
Frontend wiring & mocks
apps/frontend/src/renderer/lib/mocks/settings-mock.ts, apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx
Add mock provider config APIs and implement load/save flows for ProviderSettingsSection (handles provider selection, API keys, per‑agent models).
Tests: provider & E2E
tests/test_openai_provider_e2e.py, tests/test_ollama_provider_e2e.py, tests/test_per_agent_provider_e2e.py, tests/test_issue_884_plan_schema.py, tests/test_planner.py, ...tests/*
Large set of new/updated tests for OpenAI/Ollama/per‑agent flows; tests updated to mock create_engine_provider/create_session; many E2E test modules added.
Test hygiene & formatting
tests/...
Widespread test formatting, import reorders, typing modernizations, minor behavioral adjustments in tests (mostly non‑functional).

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent (planner/coder/qa)
    participant Config as ProviderConfig
    participant Factory as create_engine_provider
    participant Provider as Provider (Claude/LiteLLM/Ollama/...)
    participant Session as AgentSession
    participant Client as SDK Client
    participant Cost as CostTracker

    Agent->>Config: ProviderConfig.from_env(agent_type)
    Config-->>Agent: resolved provider & model
    Agent->>Factory: create_engine_provider(config)
    Factory-->>Provider: instantiate provider impl
    Agent->>Provider: provider.create_session(SessionConfig)
    Provider-->>Session: AgentSession (contains session.client)
    Session-->>Client: initialize underlying SDK client (session.client)
    Agent->>Client: send prompts / requests
    Client-->>Agent: responses + token usage
    Agent->>Cost: record UsageRecord (agent_type, model, tokens, cost)
    Cost-->>Disk: write/update cost_report.json
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

  • #85 — closely matches this multi‑provider abstraction (provider config, factory/session creation, frontend provider wiring).
  • #58 — overlaps provider factory/config, cost tracking, and frontend provider settings; similar scope.
  • #5 — related per‑agent provider/model persistence and provider/session plumbing.

Poem

🐰 Hop, hop—providers line up in a row,

Claude, LiteLLM, Ollama set to go.
Sessions wake, clients hum and rhyme,
Tokens counted, costs recorded in time.
I nibble the docs and celebrate the climb.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Multi-Model Provider Support' clearly summarizes the main objective of adding multi-provider AI engine support (Claude, OpenAI, Google Gemini, Zhipu GLM, LiteLLM/OpenRouter, Ollama) as documented in the PR description.
Docstring Coverage ✅ Passed Docstring coverage is 94.93% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch auto-claude/046-multi-model-provider-support

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

- Merge latest origin/develop which includes PR #5 (Runtime Provider
  and Model Selection with Task Restart)
- Resolve conflict in planner.py: keep provider factory + add
  get_provider_config from develop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 25

♻️ Duplicate comments (10)
apps/backend/qa/reviewer.py (1)

831-858: ⚠️ Potential issue | 🔴 Critical

QA reviewer factory still hardcodes Claude-only behavior.

Line 831 reads env without agent_type, and Lines 835-858 reject all non-Claude providers. This breaks per-agent reviewer provider selection and contradicts the new multi-provider runtime path.

🔧 Proposed fix (delegate to shared factory)
-from core.providers import create_engine_provider
-from core.providers.base import SessionConfig
-from core.providers.config import ProviderConfig
+from core.providers import create_agent_session
@@
 def create_qa_reviewer_session(
@@
-    # Create provider from environment configuration
-    config = ProviderConfig.from_env()
-    provider = create_engine_provider(config)
-
-    # For Claude provider, create session with provider-specific parameters
-    if provider.name == "claude":
-        from core.providers.adapters.claude import ClaudeAgentProvider
-
-        if not isinstance(provider, ClaudeAgentProvider):
-            raise TypeError(f"Expected ClaudeAgentProvider, got {type(provider)}")
-
-        # Create session using provider's create_session method
-        session = provider.create_session(
-            config=SessionConfig(
-                name="qa-reviewer-session",
-                model=model,
-            ),
-            project_dir=project_dir,
-            spec_dir=spec_dir,
-            agent_type="qa_reviewer",
-            max_thinking_tokens=max_thinking_tokens,
-        )
-        return session
-    else:
-        # For other providers, implement their session creation
-        # TODO: Add support for LiteLLM and OpenRouter providers
-        raise NotImplementedError(
-            f"Provider {provider.name} not yet implemented for QA reviewer agent"
-        )
+    return create_agent_session(
+        agent_type="qa_reviewer",
+        project_dir=project_dir,
+        spec_dir=spec_dir,
+        model=model,
+        max_thinking_tokens=max_thinking_tokens,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/qa/reviewer.py` around lines 831 - 858, The QA reviewer factory
currently hardcodes Claude-only behavior: change ProviderConfig.from_env() to
include agent_type="qa_reviewer" so the environment selection can pick per-agent
providers, and stop rejecting non-claude engines — delegate session creation to
the shared provider factory instead of raising NotImplementedError; specifically
use create_engine_provider(config) and call provider.create_session(...) with
SessionConfig(name="qa-reviewer-session", model=model) and the existing kwargs
(project_dir, spec_dir, agent_type="qa_reviewer", max_thinking_tokens) for any
provider type, keeping the Claude type-check only if provider-specific behavior
is required.
apps/backend/agents/planner.py (1)

64-91: ⚠️ Potential issue | 🔴 Critical

create_planner_session currently breaks non-Claude providers and bypasses per-agent provider resolution.

Line 64 uses ProviderConfig.from_env() without agent_type, and Lines 67-91 hard-fail all non-Claude providers with NotImplementedError. This blocks planner runs when AGENT_PROVIDER_PLANNER or a non-Claude global provider is configured.

🔧 Proposed fix (delegate to shared provider factory)
-from core.providers import create_engine_provider
-from core.providers.base import SessionConfig
-from core.providers.config import ProviderConfig
+from core.providers import create_agent_session
@@
 def create_planner_session(
@@
-    # Create provider from environment configuration
-    config = ProviderConfig.from_env()
-    provider = create_engine_provider(config)
-
-    # For Claude provider, create session with provider-specific parameters
-    if provider.name == "claude":
-        from core.providers.adapters.claude import ClaudeAgentProvider
-
-        if not isinstance(provider, ClaudeAgentProvider):
-            raise TypeError(f"Expected ClaudeAgentProvider, got {type(provider)}")
-
-        # Create session using provider's create_session method
-        session = provider.create_session(
-            config=SessionConfig(
-                name="planner-session",
-                model=model,
-            ),
-            project_dir=project_dir,
-            spec_dir=spec_dir,
-            agent_type="planner",
-            max_thinking_tokens=max_thinking_tokens,
-        )
-        return session
-    else:
-        # For other providers, implement their session creation
-        # TODO: Add support for LiteLLM and OpenRouter providers
-        raise NotImplementedError(
-            f"Provider {provider.name} not yet implemented for planner agent"
-        )
+    return create_agent_session(
+        agent_type="planner",
+        project_dir=project_dir,
+        spec_dir=spec_dir,
+        model=model,
+        max_thinking_tokens=max_thinking_tokens,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/planner.py` around lines 64 - 91, The planner session
code currently uses ProviderConfig.from_env() without the agent_type and
hard-fails non-Claude providers; change it to call
ProviderConfig.from_env(agent_type="planner") so env-specific per-agent provider
resolution is used, then delegate session creation to the provider instance
instead of special-casing Claude: keep the existing
create_engine_provider(config) call, remove the Claude-only branch, and call
provider.create_session(...) (with the same SessionConfig, project_dir,
spec_dir, agent_type="planner", max_thinking_tokens) for any provider that
implements it; if a provider lacks create_session, raise a clear TypeError or
fallback to the shared provider factory (reuse
create_engine_provider/provider.create_session) rather than NotImplementedError.
apps/backend/qa/fixer.py (1)

745-768: ⚠️ Potential issue | 🔴 Critical

QA fixer session factory is still Claude-only and skips per-agent provider config.

Line 745 uses ProviderConfig.from_env() without agent_type, and Lines 748-768 reject all non-Claude providers. That breaks AGENT_PROVIDER_QA_FIXER and global non-Claude QA setups.

🔧 Proposed fix (use shared agent session factory)
-from core.providers import create_engine_provider
-from core.providers.base import SessionConfig
-from core.providers.config import ProviderConfig
+from core.providers import create_agent_session
@@
 def create_qa_fixer_session(
@@
-    config = ProviderConfig.from_env()
-    provider = create_engine_provider(config)
-
-    if provider.name == "claude":
-        from core.providers.adapters.claude import ClaudeAgentProvider
-
-        if not isinstance(provider, ClaudeAgentProvider):
-            raise TypeError(f"Expected ClaudeAgentProvider, got {type(provider)}")
-
-        session = provider.create_session(
-            config=SessionConfig(
-                name="qa-fixer-session",
-                model=model,
-            ),
-            project_dir=project_dir,
-            spec_dir=spec_dir,
-            agent_type="qa_fixer",
-            max_thinking_tokens=max_thinking_tokens,
-        )
-        return session
-    else:
-        raise NotImplementedError(
-            f"Provider {provider.name} not yet implemented for QA fixer agent"
-        )
+    return create_agent_session(
+        agent_type="qa_fixer",
+        project_dir=project_dir,
+        spec_dir=spec_dir,
+        model=model,
+        max_thinking_tokens=max_thinking_tokens,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/qa/fixer.py` around lines 745 - 768, The factory currently
hardcodes Claude and ignores per-agent provider config; change
ProviderConfig.from_env() to include the agent type (e.g.,
ProviderConfig.from_env(agent_type="qa_fixer")) and stop rejecting non-Claude
providers: call create_engine_provider(config) and return
provider.create_session(...) for any provider instead of raising
NotImplementedError; remove the Claude-only type check for ClaudeAgentProvider
and, if provider-specific adapters are required, route through the shared agent
session factory (or a common create_agent_session helper) while still passing
SessionConfig(name="qa-fixer-session", model=model) and the same
project_dir/spec_dir/agent_type/max_thinking_tokens to provider.create_session.
tests/test_openai_provider_e2e.py (2)

95-101: ⚠️ Potential issue | 🟠 Major

Remove .env fallback writes from test helpers.

The fallback path still writes OPENAI_API_KEY to disk; keep provider setup process-local with monkeypatch.setenv only.

Suggested change
 def create_openai_env_config(
     spec_dir: Path,
     openai_api_key: str = "test-key",
-    monkeypatch: pytest.MonkeyPatch | None = None,
-) -> Path:
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
@@
-    if monkeypatch:
-        monkeypatch.setenv("AI_ENGINE_PROVIDER", "litellm")
-        monkeypatch.setenv("LITELLM_MODEL", "gpt-4")
-        monkeypatch.setenv("OPENAI_API_KEY", openai_api_key)
-        return spec_dir.parent.parent.parent / ".env"
-
-    env_file = spec_dir.parent.parent.parent / ".env"
-    env_content = f"""# OpenAI Provider Configuration
-AI_ENGINE_PROVIDER=litellm
-LITELLM_MODEL=gpt-4
-OPENAI_API_KEY={openai_api_key}
-"""
-    env_file.write_text(env_content)
-    return env_file
+    monkeypatch.setenv("AI_ENGINE_PROVIDER", "litellm")
+    monkeypatch.setenv("LITELLM_MODEL", "gpt-4")
+    monkeypatch.setenv("OPENAI_API_KEY", openai_api_key)
As per coding guidelines, `tests/**` should ensure proper mocking and test isolation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_openai_provider_e2e.py` around lines 95 - 101, The test currently
writes an .env file via env_file.write_text(...) using env_content which leaks
OPENAI_API_KEY to disk; remove the fallback write (delete the
env_file/env_content/write_text usage in tests/test_openai_provider_e2e.py) and
instead set the environment key process-locally using
monkeypatch.setenv("OPENAI_API_KEY", openai_api_key) in the test or fixture that
configures the provider (references: env_file, env_content, write_text).

52-52: ⚠️ Potential issue | 🟠 Major

Patch the adapter-imported symbol instead of litellm.completion at package root.

Patching the root symbol can miss calls if the provider uses a module-local litellm import, so external calls may not actually be intercepted.

#!/bin/bash
# Find real LiteLLM completion call sites and imports to patch the exact module symbol.
rg -nP --type=py '\blitellm\.completion\s*\('
rg -nP --type=py '(^from litellm import|^import litellm)'

As per coding guidelines, tests/** should ensure proper mocking and test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_openai_provider_e2e.py` at line 52, The test currently patches the
package-root symbol with with patch("litellm.completion") as mock_completion:,
which can miss calls if the provider module imports litellm locally; update the
patch target to the adapter/provider module's imported symbol (i.e., patch the
module-local name used by the code under test, for example
patch("your_provider_module.litellm.completion") or
patch("your_provider_module.completion") depending on how the provider imports
it) so the mock intercepts actual calls from the provider; locate the import in
the provider module and replace the root patch with that module-specific import
path.
tests/test_ollama_provider_e2e.py (2)

52-52: ⚠️ Potential issue | 🟠 Major

Patch the adapter call-site symbol rather than litellm.completion globally.

Global patching is brittle when provider code imports litellm into module scope.

#!/bin/bash
# Locate completion call sites/import locations to patch the exact symbol used by provider code.
rg -nP --type=py '\blitellm\.completion\s*\('
rg -nP --type=py '(^from litellm import|^import litellm)'

As per coding guidelines, tests/** should ensure proper mocking and test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_ollama_provider_e2e.py` at line 52, Replace the global patch of
litellm.completion in tests/test_ollama_provider_e2e.py with a patch of the
exact symbol the provider imports (i.e., patch the adapter call-site used by the
provider code rather than "litellm.completion"); locate the provider module
import (use the suggested rg commands) and change with
patch("path.to.provider_module.completion") or patch.object(provider_module,
"completion") where provider_module is the module that imported litellm,
ensuring mock_completion still refers to the same patched symbol.

93-101: ⚠️ Potential issue | 🟠 Major

Avoid .env fallback writes in test helper paths.

Even for Ollama, keep provider configuration process-local to avoid persistent test artifacts.

As per coding guidelines, tests/** should ensure proper test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_ollama_provider_e2e.py` around lines 93 - 101, The test helper
currently writes a persistent .env file to repo test helper paths via the
env_file variable; instead, make the provider configuration process-local by not
writing to repo files—use a temporary file or the pytest
tmp_path/tmp_path_factory fixture (or set environment variables directly via
monkeypatch.setenv) instead of env_file.write_text; update the helper that
constructs env_content (the code creating env_file and returning it) to return a
temp path or simply apply monkeypatched env vars so no .env is left on disk
after tests.
tests/test_per_agent_provider_e2e.py (2)

95-110: ⚠️ Potential issue | 🟠 Major

Keep per-agent test config in-process; remove .env write fallback.

This helper still writes provider/API-key config to disk; tests should use monkeypatch only.

As per coding guidelines, tests/** should ensure proper mocking and test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_per_agent_provider_e2e.py` around lines 95 - 110, The helper
currently writes a .env file to disk (see env_file, env_content and
env_file.write_text) which breaks test isolation; change this helper to stop
writing files and instead set the same environment keys using monkeypatch.setenv
for AGENT_PROVIDER_PLANNER, AGENT_MODEL_PLANNER, AGENT_PROVIDER_CODER,
AGENT_MODEL_CODER, OPENAI_API_KEY and AI_ENGINE_PROVIDER (and return whatever
the tests expect, e.g., None or the monkeypatch fixture), removing
env_file.write_text and the file return so tests run entirely in-process without
touching disk.

52-52: ⚠️ Potential issue | 🟠 Major

Patch the provider module symbol, not the litellm root symbol.

Root-level patching can miss module-local imports and let real calls slip through.

#!/bin/bash
# Confirm where LiteLLM completion is called/imported so tests patch exact call-site symbols.
rg -nP --type=py '\blitellm\.completion\s*\('
rg -nP --type=py '(^from litellm import|^import litellm)'

As per coding guidelines, tests/** should ensure proper mocking and test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_per_agent_provider_e2e.py` at line 52, The test is patching the
root symbol "litellm.completion" which can miss module-local imports; change the
patch to target the provider module's imported symbol (the exact name used in
the module under test) instead of the litellm root — e.g. replace with
patch("<provider_module>.completion") or use patch.object on the provider module
where it does "from litellm import completion" (locate the provider module
import of completion by searching for "from litellm import" or "import litellm"
in the codebase) so that the local reference used by the code under test is
mocked.
apps/frontend/src/main/ipc-handlers/settings-handlers.ts (1)

1085-1087: 🧹 Nitpick | 🔵 Trivial

Silent error handling may mask issues for users.

When an error occurs during Ollama model detection (line 1085), the handler returns success: true with an empty models array. This hides the actual failure reason from users, who won't know if Ollama isn't running, the script failed, or another issue occurred.

Consider logging the error and returning a more informative response:

Proposed fix
-            } catch {
-              return { success: true, data: { models: [] } };
+            } catch (ollamaError) {
+              console.warn('[SETTINGS_GET_AVAILABLE_MODELS][ollama] Detection failed:', ollamaError);
+              // Return empty list but with warning - Ollama may not be running
+              return { 
+                success: true, 
+                data: { models: [] },
+                // Optionally: add a warning field if the type supports it
+              };
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/main/ipc-handlers/settings-handlers.ts` around lines 1085 -
1087, The catch block that currently returns { success: true, data: { models: []
} } when detecting Ollama models should not swallow errors; instead log the
caught error (use the module's logger/processLogger or console.error) and return
an informative failure response (e.g., success: false with an error message or
error field) so callers know Ollama detection failed; update the catch in
apps/frontend/src/main/ipc-handlers/settings-handlers.ts (the try/catch around
Ollama model detection) to log the error and return a failure result rather than
a silent empty models array.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/agents/coder.py`:
- Around line 827-864: The code eagerly creates a provider session/client
(ProviderConfig.from_env, create_engine_provider, provider.create_session,
session.client) even when use_process_isolation is True and the isolated path
calls run_agent_session_isolated(...) and never uses that client; move or delay
creation of provider.create_session and the session.client access until after
checking use_process_isolation so that if use_process_isolation is True you
instead call run_agent_session_isolated(...) directly with
provider_config/ProviderConfig or provider name as needed and skip creating the
session/client; apply the same change to the duplicate block around the other
occurrence (the region referenced at lines 1042-1074) to avoid unnecessary
failures/latency.
- Around line 829-835: The current call to ProviderConfig.from_env() ignores
per-agent overrides; change the code so ProviderConfig.from_env receives the
session agent type (use the existing agent_type_for_session variable) before
creating the provider and session—i.e., obtain provider_config =
ProviderConfig.from_env(agent_type=agent_type_for_session) (or the equivalent
named parameter used by ProviderConfig), then pass that config into
create_engine_provider(provider_config) so
AGENT_PROVIDER_PLANNER/AGENT_PROVIDER_CODER are honored for planner vs coder
sessions.

In `@apps/backend/core/cost_tracking.py`:
- Around line 470-474: The debug log uses an f-string which eagerly formats
values; change the logger call in cost calculation to use lazy formatting by
passing a format string and the variables as arguments (replace the f-string
inside logger.debug in cost tracking where logger.debug(...) currently
references model, input_tokens, output_tokens, total_cost) so the log becomes
logger.debug("Cost calculation: model=%s, input_tokens=%s, output_tokens=%s,
cost=$%.6f", model, input_tokens, output_tokens, total_cost) to defer
interpolation when the log level is enabled.
- Around line 454-459: Replace the f-string in the logger call with lazy
`%`-style formatting so the string interpolation is deferred when the log level
is disabled: update the logger.warning call that references MODEL_PRICING (the
block checking `if model not in MODEL_PRICING`) to pass a format string and
`model` as an argument instead of using f-strings.

In `@apps/backend/core/providers/factory.py`:
- Around line 258-260: Remove the redundant runtime re-import "from pathlib
import Path as _Path" in factory.py and instead add "Path" to the module-level
regular imports (the same section that already imports other runtime symbols) so
runtime code uses the standard Path symbol; keep the TYPE_CHECKING block
unchanged (which already imports Path for type hints) and update any references
in the file that currently use "_Path" to use "Path" so imports are not
duplicated.
- Around line 232-289: The create_agent_session factory lacks a return type;
annotate its signature to return AgentSession and import AgentSession under
TYPE_CHECKING (or from its defining module) so static checkers know the return
type; update the function def for create_agent_session to include ->
AgentSession and ensure any necessary typing imports (e.g., TYPE_CHECKING and
AgentSession) are added, leaving the rest of the logic (ProviderConfig,
create_engine_provider, SessionConfig, provider.create_session) unchanged.

In `@apps/backend/merge/rename_detector.py`:
- Around line 79-96: The comparator rebuilds _constant_types on every recursive
call; move the constant-type detection out of the recursive comparator to module
scope so it is computed once. Create a module-level _CONSTANT_TYPES (or
similarly named) that performs the same logic including checking ast.Constant
and any legacy ast.Num/ast.Str/ast.Bytes, then in the comparator (the function
using node1/node2 in rename_detector.py) replace the local _constant_types build
with a simple isinstance(node, _CONSTANT_TYPES) check and keep the existing
value extraction logic unchanged.

In `@apps/backend/phase_config.py`:
- Around line 626-634: The current logic returns
AGENT_DEFAULT_PROVIDERS.get(agent_type) (default_provider) before consulting the
AI_ENGINE_PROVIDER env var, causing the global provider to be ignored for most
agents; change precedence so that the environment global_provider
(os.environ.get("AI_ENGINE_PROVIDER")) is checked first and returned if present,
otherwise fall back to AGENT_DEFAULT_PROVIDERS.get(agent_type); reference the
symbols agent_type, AGENT_DEFAULT_PROVIDERS, default_provider, global_provider
and ensure any provider normalization/validation is applied consistently when
reading AI_ENGINE_PROVIDER.

In `@apps/frontend/src/main/ipc-handlers/project-handlers.ts`:
- Around line 545-565: The code builds costReportPath from
getSpecsDir(project.autoBuildPath) which may be relative to the process CWD;
resolve the specs directory against the selected project root (e.g. use
path.resolve(project.path /* or project.root */,
getSpecsDir(project.autoBuildPath)) ), then build costReportPath from that
absolute specsDir, update the security check to compare resolvedPath against
this resolvedSpecsDir, and use the resolvedPath when calling fileExists and
fsPromises.readFile; references: getSpecsDir, project.autoBuildPath,
costReportPath, fileExists, fsPromises.readFile.

In `@apps/frontend/src/main/ipc-handlers/settings-handlers.ts`:
- Around line 1026-1055: The hardcoded model arrays assigned to the variable
models inside the provider switch cases for 'claude', 'litellm', and
'openrouter' will go stale; add an inline comment above each case (or a single
block above the switch) noting the last-updated date and a TODO to fetch/update
models dynamically, and refactor into a single helper like
getModelsForProvider(provider) or MODEL_LISTS constant (referencing the switch
cases 'claude', 'litellm', 'openrouter' and the variable models) so it’s clear
where to replace the static lists with API-based fetching in the future.
- Around line 519-521: The provider type cast for building providerSettings is
too narrow; change the cast on envVars['AI_ENGINE_PROVIDER'] to the full
AIEngineProvider union (or remove the cast and coerce safely) so
ProviderSettings.provider can be any of 'claude' | 'litellm' | 'openrouter' |
'openai' | 'google' | 'ollama'; update the default accordingly (e.g., default to
'claude' or a safe AIEngineProvider) and ensure usage in the providerSettings
assignment (the ProviderSettings object and the AI_ENGINE_PROVIDER env lookup)
matches the AIEngineProvider type used elsewhere (see ProviderSettings and
AIEngineProvider symbols).

In `@apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx`:
- Around line 181-183: The render mutates providerCost.modelBreakdown by calling
sort() directly; update the CostBreakdownChart render to avoid in-place mutation
by creating a shallow copy of modelBreakdown before sorting (e.g., use spread or
slice) when producing the sorted list used in .map(), so
providerCost.modelBreakdown remains unchanged; locate the usage around
providerCost.modelBreakdown.sort(...) and replace with a sorted copy.
- Around line 196-198: In CostBreakdownChart
(apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx) replace
the hardcoded "tokens" text with a react-i18next translation key: import and
call useTranslation() (or the existing hook) and use t('analytics.tokens') (or
the agreed key) instead of the literal "tokens" wherever you render
formatTokens(model.inputTokens + model.outputTokens) and the other occurrence;
ensure you pass a count if pluralization is needed (e.g., t('analytics.tokens',
{ count: totalTokens })) and update/define the corresponding translation entry
in the i18n resource files.

In `@apps/frontend/src/renderer/components/settings/AppSettings.tsx`:
- Line 79: The AppSection union defines both 'provider' and 'providers' which
creates duplicate sections; pick a single canonical name (e.g., 'providers'),
update the AppSection type to remove the other entry, and then rename all usages
(route keys, menu items, switch/case branches, component props, and any
constants or route definitions that reference 'provider' or 'providers') to the
chosen name so routing and UI map to a single section; ensure imports/exports
and any components named Provider... vs Providers... are consistently updated to
the same identifier and update any tests or type annotations that reference the
old name.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`:
- Around line 55-243: The component ProviderSettingsSection uses unprefixed i18n
keys (e.g., t('aiProvider.title'), t('aiProvider.providers.claude.name'),
t('aiProvider.hints.envOverride'), etc. and the save label
t('common:actions.save')); update all local keys used in this file to use the
settings namespace by replacing occurrences like t('aiProvider.*') with
t('settings:aiProvider.*') (and similarly for nested keys such as providers,
apiKeys, models, hints), keeping external namespaces (e.g.,
'common:actions.save') unchanged; search for all uses of the t(...) call in
ProviderSettingsSection.tsx (including placeholders, labels, descriptions) and
perform the namespace substitutions so keys become settings:aiProvider.*.
- Around line 38-43: The config object constructed in ProviderSettingsSection
(variable config) omits the per-agent model selections so plannerModel,
coderModel, and qaModel collected in component state/UI are not persisted;
update the config to include plannerModel: plannerModel || undefined,
coderModel: coderModel || undefined, and qaModel: qaModel || undefined (or
equivalent optional values) so those state fields are included in the payload
sent/saved.
- Around line 106-246: The Save button is currently rendered only inside the
block gated by selectedProvider !== 'claude', preventing saves when the user
selects 'claude'; move the Button that calls handleSave so it is rendered
outside (after) the conditional block that checks selectedProvider !== 'claude'
(or render it unconditionally) while keeping the provider-specific API key and
model inputs (the litellm/openrouter JSX and the model selection divs)
unchanged; ensure references to selectedProvider, handleSave, and the Save
Button JSX are preserved and wired to the existing onClick handler.

In `@apps/frontend/src/renderer/lib/mocks/settings-mock.ts`:
- Around line 34-42: The provider mocks are too limited: update the mock
implementations for getProviderConfig, validateProviderConfig (and optionally
updateProviderConfig) to include all supported providers (claude, openai,
google, litellm, openrouter, ollama) in
validateProviderConfig.availableProviders and return a realistic
AIProviderConfig-shaped object from getProviderConfig containing
provider-specific fields (API keys, model names, base URLs, and any
provider-specific flags) so UI provider-selection and provider-specific form
fields are exercised; locate the functions getProviderConfig,
validateProviderConfig, and updateProviderConfig in settings-mock.ts and ensure
the returned shapes follow the AIProviderConfig interface for each provider.

In `@tests/README_E2E_OPENAI.md`:
- Around line 67-88: The markdown headings (e.g., "### Provider Configuration
Tests", "### Cost Tracking Tests", "### Provider Factory Tests", "### E2E
Integration Tests") are missing a blank line after them; update the
README_E2E_OPENAI.md content so each "### ..." heading is followed by a single
blank line before the subsequent list or paragraph to satisfy MD022. Locate the
heading lines in the diff and insert one blank line after each heading line to
ensure consistent spacing throughout the Test Coverage section.

In `@tests/test_ollama_provider_e2e.py`:
- Around line 313-330: The test only asserts ProviderConfig values; update it to
instantiate the provider via create_engine_provider and assert the returned
object is the expected provider type and usable: after creating config =
ProviderConfig.from_env(), call create_engine_provider(config) and verify the
result (e.g., not None and instance of the LiteLLM/Ollama provider class or that
it exposes the expected methods like send or generate); ensure you still keep
the environment monkeypatches (AI_ENGINE_PROVIDER, LITELLM_MODEL,
OLLAMA_API_BASE) and replace the existing final assertions with checks against
the provider instance returned by create_engine_provider to validate end-to-end
factory behavior.
- Around line 341-379: The test test_e2e_ollama_provider_mock currently bypasses
the provider path by calling CostTracker.log_usage directly; instead, exercise
the real provider/client flow so usage is generated by the code under test: use
ProviderConfig.from_env to instantiate the configured provider/client (via
whatever provider factory your code uses), perform a mock-driven generation/call
through that client (using the mock_ollama_api fixture) so the provider session
triggers cost collection, and then assert the CostTracker
(CostTracker(spec_dir=spec_dir)) recorded the expected usage; replace the direct
tracker.log_usage call with a real provider invocation that produces
input/output tokens so CostTracker captures them.

In `@tests/test_openai_provider_e2e.py`:
- Around line 286-305: The test currently only validates ProviderConfig and
never calls the factory; update test_create_litellm_provider to actually invoke
create_engine_provider with the config returned from ProviderConfig.from_env()
(or pass config.provider and config.get_model_for_provider() as required) and
assert the returned object's type or key behavior (e.g., class name or a known
method/property) to ensure create_engine_provider is exercised; reference
create_engine_provider and ProviderConfig in the assertions and keep environment
setup/monkeypatching as-is so the factory creates the LiteLLM provider.
- Around line 369-406: The test currently only calls CostTracker.log_usage and
never exercises provider/session code or the mock_openai_api fixture; update
test_e2e_openai_provider_mock to instantiate the provider from ProviderConfig
and make a minimal model call so the mocked API is hit: after config =
ProviderConfig.from_env(), call the ProviderConfig-provided factory (e.g.,
ProviderConfig.from_env().create_provider() or equivalent factory method on
ProviderConfig) to obtain a provider instance, create a session or client (e.g.,
provider.create_session() or provider.client) and perform a tiny generate/call
(e.g., provider.generate(prompt="test") or provider.call_model(...)) and assert
the response; keep the existing CostTracker usage but ensure the provider call
uses the mock_openai_api fixture so provider/session integration is validated.

In `@tests/test_per_agent_provider_e2e.py`:
- Around line 415-468: The test test_e2e_multi_provider_mock currently only sets
env vars and calls ProviderConfig.from_env and CostTracker.log_usage directly,
so it doesn't exercise provider sessions; replace the direct usage logging with
calls to the mocked provider/session fixtures and assert those sessions were
invoked: add (or use existing) a mock provider fixture to intercept network
calls for the planner and coder, instantiate the actual agent run/execute method
that uses ProviderConfig (e.g. call the planner and coder agent entrypoints used
elsewhere in tests), then assert the mock clients received expected requests and
that CostTracker recorded usage (keep calls to create_simple_spec and
create_implementation_plan). Ensure you reference and use
ProviderConfig.from_env, the agent execution entrypoints used in production, the
CostTracker instance, and tracker.log_usage assertions to validate both session
invocation and cost tracking.
- Around line 310-346: The tests only validate ProviderConfig.from_env but never
exercise the factory; update both test_create_claude_provider_for_planner and
test_create_litellm_provider_for_coder to call the factory
create_engine_provider (or the module function that builds a provider from a
ProviderConfig) using the ProviderConfig returned by ProviderConfig.from_env (or
pass agent_type if factory supports it), then assert the returned provider
instance is the expected implementation (e.g., Claude provider class for
"claude" and LiteLLM provider class for "litellm") and/or that its model/name
matches get_model_for_provider(); keep the existing environment setup and
imports but add the factory invocation and assertions against the created
provider instance.

---

Duplicate comments:
In `@apps/backend/agents/planner.py`:
- Around line 64-91: The planner session code currently uses
ProviderConfig.from_env() without the agent_type and hard-fails non-Claude
providers; change it to call ProviderConfig.from_env(agent_type="planner") so
env-specific per-agent provider resolution is used, then delegate session
creation to the provider instance instead of special-casing Claude: keep the
existing create_engine_provider(config) call, remove the Claude-only branch, and
call provider.create_session(...) (with the same SessionConfig, project_dir,
spec_dir, agent_type="planner", max_thinking_tokens) for any provider that
implements it; if a provider lacks create_session, raise a clear TypeError or
fallback to the shared provider factory (reuse
create_engine_provider/provider.create_session) rather than NotImplementedError.

In `@apps/backend/qa/fixer.py`:
- Around line 745-768: The factory currently hardcodes Claude and ignores
per-agent provider config; change ProviderConfig.from_env() to include the agent
type (e.g., ProviderConfig.from_env(agent_type="qa_fixer")) and stop rejecting
non-Claude providers: call create_engine_provider(config) and return
provider.create_session(...) for any provider instead of raising
NotImplementedError; remove the Claude-only type check for ClaudeAgentProvider
and, if provider-specific adapters are required, route through the shared agent
session factory (or a common create_agent_session helper) while still passing
SessionConfig(name="qa-fixer-session", model=model) and the same
project_dir/spec_dir/agent_type/max_thinking_tokens to provider.create_session.

In `@apps/backend/qa/reviewer.py`:
- Around line 831-858: The QA reviewer factory currently hardcodes Claude-only
behavior: change ProviderConfig.from_env() to include agent_type="qa_reviewer"
so the environment selection can pick per-agent providers, and stop rejecting
non-claude engines — delegate session creation to the shared provider factory
instead of raising NotImplementedError; specifically use
create_engine_provider(config) and call provider.create_session(...) with
SessionConfig(name="qa-reviewer-session", model=model) and the existing kwargs
(project_dir, spec_dir, agent_type="qa_reviewer", max_thinking_tokens) for any
provider type, keeping the Claude type-check only if provider-specific behavior
is required.

In `@apps/frontend/src/main/ipc-handlers/settings-handlers.ts`:
- Around line 1085-1087: The catch block that currently returns { success: true,
data: { models: [] } } when detecting Ollama models should not swallow errors;
instead log the caught error (use the module's logger/processLogger or
console.error) and return an informative failure response (e.g., success: false
with an error message or error field) so callers know Ollama detection failed;
update the catch in apps/frontend/src/main/ipc-handlers/settings-handlers.ts
(the try/catch around Ollama model detection) to log the error and return a
failure result rather than a silent empty models array.

In `@tests/test_ollama_provider_e2e.py`:
- Line 52: Replace the global patch of litellm.completion in
tests/test_ollama_provider_e2e.py with a patch of the exact symbol the provider
imports (i.e., patch the adapter call-site used by the provider code rather than
"litellm.completion"); locate the provider module import (use the suggested rg
commands) and change with patch("path.to.provider_module.completion") or
patch.object(provider_module, "completion") where provider_module is the module
that imported litellm, ensuring mock_completion still refers to the same patched
symbol.
- Around line 93-101: The test helper currently writes a persistent .env file to
repo test helper paths via the env_file variable; instead, make the provider
configuration process-local by not writing to repo files—use a temporary file or
the pytest tmp_path/tmp_path_factory fixture (or set environment variables
directly via monkeypatch.setenv) instead of env_file.write_text; update the
helper that constructs env_content (the code creating env_file and returning it)
to return a temp path or simply apply monkeypatched env vars so no .env is left
on disk after tests.

In `@tests/test_openai_provider_e2e.py`:
- Around line 95-101: The test currently writes an .env file via
env_file.write_text(...) using env_content which leaks OPENAI_API_KEY to disk;
remove the fallback write (delete the env_file/env_content/write_text usage in
tests/test_openai_provider_e2e.py) and instead set the environment key
process-locally using monkeypatch.setenv("OPENAI_API_KEY", openai_api_key) in
the test or fixture that configures the provider (references: env_file,
env_content, write_text).
- Line 52: The test currently patches the package-root symbol with with
patch("litellm.completion") as mock_completion:, which can miss calls if the
provider module imports litellm locally; update the patch target to the
adapter/provider module's imported symbol (i.e., patch the module-local name
used by the code under test, for example
patch("your_provider_module.litellm.completion") or
patch("your_provider_module.completion") depending on how the provider imports
it) so the mock intercepts actual calls from the provider; locate the import in
the provider module and replace the root patch with that module-specific import
path.

In `@tests/test_per_agent_provider_e2e.py`:
- Around line 95-110: The helper currently writes a .env file to disk (see
env_file, env_content and env_file.write_text) which breaks test isolation;
change this helper to stop writing files and instead set the same environment
keys using monkeypatch.setenv for AGENT_PROVIDER_PLANNER, AGENT_MODEL_PLANNER,
AGENT_PROVIDER_CODER, AGENT_MODEL_CODER, OPENAI_API_KEY and AI_ENGINE_PROVIDER
(and return whatever the tests expect, e.g., None or the monkeypatch fixture),
removing env_file.write_text and the file return so tests run entirely
in-process without touching disk.
- Line 52: The test is patching the root symbol "litellm.completion" which can
miss module-local imports; change the patch to target the provider module's
imported symbol (the exact name used in the module under test) instead of the
litellm root — e.g. replace with patch("<provider_module>.completion") or use
patch.object on the provider module where it does "from litellm import
completion" (locate the provider module import of completion by searching for
"from litellm import" or "import litellm" in the codebase) so that the local
reference used by the code under test is mocked.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4de763f and b460650.

📒 Files selected for processing (35)
  • apps/backend/.env.example
  • apps/backend/agents/coder.py
  • apps/backend/agents/planner.py
  • apps/backend/core/cost_tracking.py
  • apps/backend/core/providers/__init__.py
  • apps/backend/core/providers/config.py
  • apps/backend/core/providers/factory.py
  • apps/backend/merge/rename_detector.py
  • apps/backend/phase_config.py
  • apps/backend/prompts_pkg/prompts.py
  • apps/backend/qa/fixer.py
  • apps/backend/qa/reviewer.py
  • apps/frontend/src/main/ipc-handlers/project-handlers.ts
  • apps/frontend/src/main/ipc-handlers/settings-handlers.ts
  • apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx
  • apps/frontend/src/renderer/components/analytics/index.ts
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx
  • apps/frontend/src/renderer/components/settings/index.ts
  • apps/frontend/src/renderer/lib/mocks/settings-mock.ts
  • apps/frontend/src/shared/constants/ipc.ts
  • apps/frontend/src/shared/i18n/locales/en/common.json
  • apps/frontend/src/shared/i18n/locales/en/settings.json
  • apps/frontend/src/shared/i18n/locales/fr/common.json
  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/types/ipc.ts
  • apps/frontend/src/shared/types/settings.ts
  • apps/frontend/src/shared/types/task.ts
  • tests/README_E2E_OPENAI.md
  • tests/test_issue_884_plan_schema.py
  • tests/test_ollama_provider_e2e.py
  • tests/test_openai_provider_e2e.py
  • tests/test_per_agent_provider_e2e.py
  • tests/test_planner.py
  • verify_per_agent_provider.py

Comment on lines +369 to +406
def test_e2e_openai_provider_mock(self, test_env_openai, monkeypatch, mock_openai_api):
"""
E2E Test (MOCKED): Configure OpenAI provider and verify flow.

This test uses mocked API responses for fast, offline testing.
"""
from core.providers.config import ProviderConfig
from core.cost_tracking import CostTracker

temp_dir, spec_dir, project_dir = test_env_openai

# Step 1: Configure environment (simulating UI configuration)
monkeypatch.setenv("AI_ENGINE_PROVIDER", "litellm")
monkeypatch.setenv("LITELLM_MODEL", "gpt-4")
monkeypatch.setenv("OPENAI_API_KEY", "test-mock-key-123")

# Step 2: Verify provider configuration loaded correctly
config = ProviderConfig.from_env()
assert config.provider == "litellm", f"Provider should be litellm, got {config.provider}"
assert config.get_model_for_provider() == "gpt-4", f"Model should be gpt-4, got {config.get_model_for_provider()}"

# Step 3: Create minimal spec and plan
spec_file = create_simple_spec(spec_dir)
plan_file = create_implementation_plan(spec_dir)

assert spec_file.exists(), "spec.md should be created"
assert plan_file.exists(), "implementation_plan.json should be created"

# Step 4: Simulate agent execution with cost tracking
tracker = CostTracker(spec_dir=spec_dir)

# Simulate coder agent using GPT-4
tracker.log_usage(
agent_type="coder",
model="gpt-4",
input_tokens=1000,
output_tokens=500,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Mocked E2E path does not hit provider/session integration.

The test simulates usage via CostTracker.log_usage only, so provider creation and API call flow are not validated despite the mock fixture.

As per coding guidelines, tests/** should be comprehensive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_openai_provider_e2e.py` around lines 369 - 406, The test currently
only calls CostTracker.log_usage and never exercises provider/session code or
the mock_openai_api fixture; update test_e2e_openai_provider_mock to instantiate
the provider from ProviderConfig and make a minimal model call so the mocked API
is hit: after config = ProviderConfig.from_env(), call the
ProviderConfig-provided factory (e.g.,
ProviderConfig.from_env().create_provider() or equivalent factory method on
ProviderConfig) to obtain a provider instance, create a session or client (e.g.,
provider.create_session() or provider.client) and perform a tiny generate/call
(e.g., provider.generate(prompt="test") or provider.call_model(...)) and assert
the response; keep the existing CostTracker usage but ensure the provider call
uses the mock_openai_api fixture so provider/session integration is validated.

Comment on lines +415 to +468
def test_e2e_multi_provider_mock(self, test_env_multi_provider, monkeypatch):
"""
E2E Test (MOCKED): Configure different providers and verify flow.

This test uses mocked API responses for fast, offline testing.
"""
from core.providers.config import ProviderConfig
from core.cost_tracking import CostTracker

temp_dir, spec_dir, project_dir = test_env_multi_provider

# Step 1: Configure per-agent providers (simulating UI configuration)
monkeypatch.setenv("AI_ENGINE_PROVIDER", "claude")
monkeypatch.setenv("AGENT_PROVIDER_PLANNER", "claude")
monkeypatch.setenv("AGENT_MODEL_PLANNER", "claude-opus-4-20250514")
monkeypatch.setenv("AGENT_PROVIDER_CODER", "litellm")
monkeypatch.setenv("AGENT_MODEL_CODER", "gpt-4")
monkeypatch.setenv("OPENAI_API_KEY", "test-mock-key-123")
monkeypatch.setenv("ANTHROPIC_API_KEY", "test-mock-key-456")

# Step 2: Verify provider configurations loaded correctly
planner_config = ProviderConfig.from_env(agent_type="planner")
assert planner_config.provider == "claude", f"Planner provider should be claude, got {planner_config.provider}"
assert planner_config.get_model_for_provider() == "claude-opus-4-20250514", f"Planner model should be opus, got {planner_config.get_model_for_provider()}"

coder_config = ProviderConfig.from_env(agent_type="coder")
assert coder_config.provider == "litellm", f"Coder provider should be litellm, got {coder_config.provider}"
assert coder_config.get_model_for_provider() == "gpt-4", f"Coder model should be gpt-4, got {coder_config.get_model_for_provider()}"

# Step 3: Create minimal spec and plan
spec_file = create_simple_spec(spec_dir)
plan_file = create_implementation_plan(spec_dir)

assert spec_file.exists(), "spec.md should be created"
assert plan_file.exists(), "implementation_plan.json should be created"

# Step 4: Simulate agent execution with cost tracking
tracker = CostTracker(spec_dir=spec_dir)

# Simulate planner agent using Claude Opus
tracker.log_usage(
agent_type="planner",
model="claude-opus-4-20250514",
input_tokens=2000,
output_tokens=1000,
)

# Simulate coder agent using GPT-4
tracker.log_usage(
agent_type="coder",
model="gpt-4",
input_tokens=1000,
output_tokens=500,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Mocked multi-provider E2E is not actually mocked integration.

This test does not include the mock fixture and only logs usage directly; provider/session execution remains untested.

As per coding guidelines, tests/** should be comprehensive and properly isolated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_per_agent_provider_e2e.py` around lines 415 - 468, The test
test_e2e_multi_provider_mock currently only sets env vars and calls
ProviderConfig.from_env and CostTracker.log_usage directly, so it doesn't
exercise provider sessions; replace the direct usage logging with calls to the
mocked provider/session fixtures and assert those sessions were invoked: add (or
use existing) a mock provider fixture to intercept network calls for the planner
and coder, instantiate the actual agent run/execute method that uses
ProviderConfig (e.g. call the planner and coder agent entrypoints used elsewhere
in tests), then assert the mock clients received expected requests and that
CostTracker recorded usage (keep calls to create_simple_spec and
create_implementation_plan). Ensure you reference and use
ProviderConfig.from_env, the agent execution entrypoints used in production, the
CostTracker instance, and tracker.log_usage assertions to validate both session
invocation and cost tracking.

Copy link
Contributor

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
apps/frontend/src/shared/types/task.ts (2)

390-405: 🛠️ Refactor suggestion | 🟠 Major

Move cost-report interfaces to a canonical cost types module.

These cost-specific interfaces don’t belong in task.ts; keeping them here increases drift risk as cost schema evolves. Re-export from a dedicated apps/frontend/src/shared/types/cost.ts instead.

♻️ Proposed refactor in this file
-// Cost tracking types (from cost_tracking.py)
-export interface UsageRecord {
-  agent_type: string;
-  model: string;
-  input_tokens: number;
-  output_tokens: number;
-  cost: number;
-  timestamp: string;
-}
-
-export interface CostReport {
-  spec_dir: string;
-  total_cost: number;
-  records: UsageRecord[];
-  last_updated: string;
-}
+// Cost tracking types
+export type { UsageRecord, CostReport } from './cost';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/shared/types/task.ts` around lines 390 - 405, Extract the
cost-specific interfaces UsageRecord and CostReport from task.ts into a new
canonical module (create types/cost.ts) and export them there; then remove their
definitions from task.ts and re-export them from task.ts by importing {
UsageRecord, CostReport } from the new cost module so existing imports that
reference these symbols continue to work; also search for any local references
to UsageRecord/CostReport and update imports to the new module if any files
import them directly from the moved location.

396-403: ⚠️ Potential issue | 🟠 Major

cost and total_cost are typed too strictly for multi-provider pricing fallbacks.

Line 396 and Line 402 being required number can mask runtime payloads where pricing is unknown/unavailable. Use optional nullable numbers to avoid unsafe assumptions in consumers.

🛡️ Proposed type-safety fix
 export interface UsageRecord {
   agent_type: string;
   model: string;
   input_tokens: number;
   output_tokens: number;
-  cost: number;
+  cost?: number | null;
   timestamp: string;
 }
 
 export interface CostReport {
   spec_dir: string;
-  total_cost: number;
+  total_cost?: number | null;
   records: UsageRecord[];
   last_updated: string;
 }
#!/bin/bash
set -euo pipefail

echo "== Locate backend cost schema definitions =="
COST_FILES="$(fd 'cost_tracking.py' -t f || true)"
if [ -z "${COST_FILES}" ]; then
  echo "cost_tracking.py not found"
else
  # Inspect whether cost/total_cost are Optional/nullable/conditionally set
  rg -n -C4 "class UsageRecord|class CostReport|cost\\b|total_cost\\b|to_dict|json.dump|unknown model" ${COST_FILES}
fi

echo
echo "== Inspect frontend consumers for strict numeric assumptions =="
rg -n --type=ts -C3 "\\b(cost|total_cost)\\b"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/shared/types/task.ts` around lines 396 - 403, The
UsageRecord and CostReport numeric fields are too strict: change
UsageRecord.cost and CostReport.total_cost to optional nullable types (e.g.,
cost?: number | null and total_cost?: number | null) so consumers won’t assume a
concrete numeric value; update any code that reads UsageRecord or CostReport to
handle undefined/null (guards or fallback values) and adjust related type
annotations/usages referencing these symbols to accommodate the optional
nullable shape.
apps/backend/agents/planner.py (1)

64-91: ⚠️ Potential issue | 🔴 Critical

create_planner_session still blocks multi-provider planner runs and skips planner-specific overrides.

This code path ignores AGENT_PROVIDER_PLANNER/AGENT_MODEL_PLANNER by using ProviderConfig.from_env() without agent_type, and it hard-fails non-Claude providers with NotImplementedError. Follow-up planning will break as soon as planner is configured to anything other than Claude.

🔧 Proposed fix (delegate to shared factory)
-from core.providers import create_engine_provider
-from core.providers.base import SessionConfig
-from core.providers.config import ProviderConfig, get_provider_config
+from core.providers import create_agent_session
+from core.providers.config import get_provider_config
...
 def create_planner_session(
     project_dir: Path,
     spec_dir: Path,
     model: str | None = None,
     max_thinking_tokens: int | None = None,
 ):
-    # Create provider from environment configuration
-    config = ProviderConfig.from_env()
-    provider = create_engine_provider(config)
-
-    # For Claude provider, create session with provider-specific parameters
-    if provider.name == "claude":
-        from core.providers.adapters.claude import ClaudeAgentProvider
-
-        if not isinstance(provider, ClaudeAgentProvider):
-            raise TypeError(f"Expected ClaudeAgentProvider, got {type(provider)}")
-
-        # Create session using provider's create_session method
-        session = provider.create_session(
-            config=SessionConfig(
-                name="planner-session",
-                model=model,
-            ),
-            project_dir=project_dir,
-            spec_dir=spec_dir,
-            agent_type="planner",
-            max_thinking_tokens=max_thinking_tokens,
-        )
-        return session
-    else:
-        # For other providers, implement their session creation
-        # TODO: Add support for LiteLLM and OpenRouter providers
-        raise NotImplementedError(
-            f"Provider {provider.name} not yet implemented for planner agent"
-        )
+    return create_agent_session(
+        agent_type="planner",
+        project_dir=project_dir,
+        spec_dir=spec_dir,
+        model=model,
+        max_thinking_tokens=max_thinking_tokens,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/planner.py` around lines 64 - 91, create_planner_session
incorrectly uses ProviderConfig.from_env() without the planner agent_type and
then hard-fails non-claude providers; update it to load provider config honoring
planner-specific env vars (e.g., call
ProviderConfig.from_env(agent_type="planner") or otherwise pass
agent_type="planner" into the shared factory), then delegate session creation to
the common provider/session factory instead of special-casing Claude (so use
create_engine_provider with the planner-aware config and call the provider's
create_session when available), remove the unconditional NotImplementedError for
non-claude providers, and keep the existing Claude type-check
(ClaudeAgentProvider) only as a safeguard while allowing other provider
implementations to create planner sessions via their create_session methods.
apps/backend/agents/coder.py (1)

868-905: ⚠️ Potential issue | 🟠 Major

Session bootstrap happens too early and ignores agent-scoped provider selection.

This block creates provider/session even when process isolation is enabled (unused client), and ProviderConfig.from_env() is called without agent_type, so planner/coder-specific provider overrides are skipped.

♻️ Proposed restructuring
-        provider_config = ProviderConfig.from_env()
-        provider = create_engine_provider(provider_config)
-
-        # Create session with phase-specific model and thinking
-        # Use appropriate agent_type for correct tool permissions and thinking budget
-        agent_type_for_session = "planner" if first_run else "coder"
+        # Derive once and reuse across both execution paths
+        agent_type_for_session = "planner" if current_phase == "planning" else "coder"
         from core.providers.base import SessionConfig
-
-        session_config = SessionConfig(
-            name=f"{agent_type_for_session}-session-{iteration}",
-            model=phase_model,
-            extra={
-                "agent_type": agent_type_for_session,
-                "max_thinking_tokens": phase_thinking_budget,
-            },
-        )
-
-        # Only pass Claude-specific kwargs to Claude provider
-        if provider.name == "claude":
-            session = provider.create_session(
-                session_config,
-                project_dir=project_dir,
-                spec_dir=spec_dir,
-                agent_type=agent_type_for_session,
-                max_thinking_tokens=phase_thinking_budget,
-            )
-        else:
-            session = provider.create_session(session_config)
-
-        if not hasattr(session, "client"):
-            raise AttributeError(
-                f"Provider {provider.name} session missing 'client' attribute"
-            )
-
-        client = session.client
...
         use_process_isolation = (
             os.getenv("AGENT_PROCESS_ISOLATION", "").lower() == "true"
         )
 
         if use_process_isolation:
             # Run in isolated subprocess for crash resistance
-            agent_type = "planner" if first_run else "coder"
             if verbose or iteration == 1:
                 print_status(
                     "Process isolation: ENABLED (crash-resistant mode)", "info"
                 )
             status, response, usage_metadata = await run_agent_session_isolated(
                 project_dir=project_dir,
                 spec_dir=spec_dir,
-                agent_type=agent_type,
+                agent_type=agent_type_for_session,
                 model=phase_model,
                 starting_message=prompt,
                 system_prompt=None,
                 max_thinking_tokens=phase_thinking_budget,
-                session_name=f"{agent_type}-session-{iteration}",
+                session_name=f"{agent_type_for_session}-session-{iteration}",
                 limits=None,  # Use default ResourceLimits
             )
         else:
+            provider_config = ProviderConfig.from_env(agent_type=agent_type_for_session)
+            provider = create_engine_provider(provider_config)
+            session_config = SessionConfig(
+                name=f"{agent_type_for_session}-session-{iteration}",
+                model=phase_model,
+                extra={
+                    "agent_type": agent_type_for_session,
+                    "max_thinking_tokens": phase_thinking_budget,
+                },
+            )
+            if provider.name == "claude":
+                session = provider.create_session(
+                    session_config,
+                    project_dir=project_dir,
+                    spec_dir=spec_dir,
+                    agent_type=agent_type_for_session,
+                    max_thinking_tokens=phase_thinking_budget,
+                )
+            else:
+                session = provider.create_session(session_config)
+            if not hasattr(session, "client"):
+                raise AttributeError(
+                    f"Provider {provider.name} session missing 'client' attribute"
+                )
+            client = session.client
             # Run in current process (legacy mode)
             async with client:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/coder.py` around lines 868 - 905, The provider/session is
being created too early and without agent-scoped selection; update the flow so
ProviderConfig.from_env receives the agent_type (use agent_type_for_session) and
only create the provider and call create_session when process isolation is
disabled or the session/client is actually needed; specifically, change the code
around ProviderConfig.from_env, create_engine_provider, and
provider.create_session so you pass agent_type into
ProviderConfig.from_env(agent_type=agent_type_for_session), defer calling
create_engine_provider(...) and provider.create_session(...) until after the
process-isolation check, and skip creating/using session.client when process
isolation is enabled (or otherwise not required), while keeping the existing
Claude-specific create_session kwargs for provider.name == "claude".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/core/providers/config.py`:
- Around line 184-198: The per-agent model override block doesn't handle the
ZhipuAI provider, so when provider == AIEngineProvider.ZHIPUAI.value the
agent_model is never applied; add an elif branch alongside the others to set
zhipuai_model = agent_model when provider equals AIEngineProvider.ZHIPUAI.value
(matching the pattern used for claude_model, openai_model, google_model,
litellm_model, openrouter_model, and ollama_model).

In `@apps/backend/core/providers/factory.py`:
- Around line 282-284: The Raises docstring in the factory function is
incorrect: it currently lists NotImplementedError but the implementation
actually surfaces provider/factory-specific errors and can raise AttributeError
for malformed sessions; update the Raises section of the function's docstring
(in apps/backend/core/providers/factory.py) to list the actual exceptions thrown
(e.g., provider/factory-related exception classes used in this module and
AttributeError for malformed session objects) and ensure the description matches
the code paths in the functions that construct providers (search for the factory
function and call sites that raise provider/factory errors and AttributeError).

---

Duplicate comments:
In `@apps/backend/agents/coder.py`:
- Around line 868-905: The provider/session is being created too early and
without agent-scoped selection; update the flow so ProviderConfig.from_env
receives the agent_type (use agent_type_for_session) and only create the
provider and call create_session when process isolation is disabled or the
session/client is actually needed; specifically, change the code around
ProviderConfig.from_env, create_engine_provider, and provider.create_session so
you pass agent_type into
ProviderConfig.from_env(agent_type=agent_type_for_session), defer calling
create_engine_provider(...) and provider.create_session(...) until after the
process-isolation check, and skip creating/using session.client when process
isolation is enabled (or otherwise not required), while keeping the existing
Claude-specific create_session kwargs for provider.name == "claude".

In `@apps/backend/agents/planner.py`:
- Around line 64-91: create_planner_session incorrectly uses
ProviderConfig.from_env() without the planner agent_type and then hard-fails
non-claude providers; update it to load provider config honoring
planner-specific env vars (e.g., call
ProviderConfig.from_env(agent_type="planner") or otherwise pass
agent_type="planner" into the shared factory), then delegate session creation to
the common provider/session factory instead of special-casing Claude (so use
create_engine_provider with the planner-aware config and call the provider's
create_session when available), remove the unconditional NotImplementedError for
non-claude providers, and keep the existing Claude type-check
(ClaudeAgentProvider) only as a safeguard while allowing other provider
implementations to create planner sessions via their create_session methods.

In `@apps/frontend/src/shared/types/task.ts`:
- Around line 390-405: Extract the cost-specific interfaces UsageRecord and
CostReport from task.ts into a new canonical module (create types/cost.ts) and
export them there; then remove their definitions from task.ts and re-export them
from task.ts by importing { UsageRecord, CostReport } from the new cost module
so existing imports that reference these symbols continue to work; also search
for any local references to UsageRecord/CostReport and update imports to the new
module if any files import them directly from the moved location.
- Around line 396-403: The UsageRecord and CostReport numeric fields are too
strict: change UsageRecord.cost and CostReport.total_cost to optional nullable
types (e.g., cost?: number | null and total_cost?: number | null) so consumers
won’t assume a concrete numeric value; update any code that reads UsageRecord or
CostReport to handle undefined/null (guards or fallback values) and adjust
related type annotations/usages referencing these symbols to accommodate the
optional nullable shape.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b460650 and 3018fc8.

📒 Files selected for processing (5)
  • apps/backend/agents/coder.py
  • apps/backend/agents/planner.py
  • apps/backend/core/providers/config.py
  • apps/backend/core/providers/factory.py
  • apps/frontend/src/shared/types/task.ts

- Replace empty ProviderSettingsSectionProps interface with type alias
- Replace non-null assertions with optional chaining in shell-escape test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx (4)

237-244: ⚠️ Potential issue | 🟠 Major

Save button is hidden when claude is selected.

The Save button is inside the selectedProvider !== 'claude' conditional block. If a user switches from another provider back to claude, they cannot save that change since the button disappears.

🔧 Proposed fix to move Save button outside conditional
           </div>
-
-            {/* Save button */}
-            <div className="pt-2">
-              <Button onClick={handleSave}>
-                {t('common:actions.save')}
-              </Button>
-            </div>
           </div>
         )}
+
+        {/* Save button - always visible */}
+        <div className="pt-4">
+          <Button onClick={handleSave} disabled={saving}>
+            {saving ? t('common:actions.saving') : t('common:actions.save')}
+          </Button>
+        </div>
       </div>
     </SettingsSection>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`
around lines 237 - 244, The Save button is currently rendered inside the
selectedProvider !== 'claude' conditional so it disappears when the user selects
'claude'; move the Button (and its container div with onClick={handleSave}) out
of that conditional rendering in ProviderSettingsSection so the Button always
renders regardless of selectedProvider value, ensuring handleSave remains the
click handler and existing i18n key t('common:actions.save') is preserved.

53-54: 🛠️ Refactor suggestion | 🟠 Major

Use explicit namespace prefixes for all i18n keys.

All t('aiProvider.*') calls throughout this file should use the explicit settings: namespace prefix (e.g., t('settings:aiProvider.title')). This applies to approximately 30+ occurrences in the file.

As per coding guidelines: "Use translation keys with namespace format (e.g., navigation:items.githubPRs)".

♻️ Example fix pattern
-      title={t('aiProvider.title')}
-      description={t('aiProvider.description')}
+      title={t('settings:aiProvider.title')}
+      description={t('settings:aiProvider.description')}

Apply this pattern to all t('aiProvider.*') calls in the file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`
around lines 53 - 54, The i18n keys in ProviderSettingsSection (component
ProviderSettingsSection in
apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx) use
shorthand keys like t('aiProvider.title'); update every t('aiProvider.*') call
to use the explicit namespace prefix t('settings:aiProvider.*') (there are ~30+
occurrences). Edit the JSX and any helper/constant usages in that file to
replace the key pattern accordingly so all translation calls follow the
"settings:" namespace format.

36-41: ⚠️ Potential issue | 🟠 Major

Per-agent model selections are not included in the save payload.

plannerModel, coderModel, and qaModel are collected via UI inputs but never added to the config object, so user-entered model settings are silently discarded.

🔧 Proposed fix
       const config: Partial<import('../../../shared/types/settings').AIProviderConfig> = {
         provider: selectedProvider,
         openaiApiKey: openaiApiKey || undefined,
         googleApiKey: googleApiKey || undefined,
         openrouterApiKey: openrouterApiKey || undefined,
+        plannerModel: plannerModel || undefined,
+        coderModel: coderModel || undefined,
+        qaModel: qaModel || undefined,
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`
around lines 36 - 41, The save payload built in ProviderSettingsSection.tsx
omits the per-agent model fields, so add plannerModel, coderModel, and qaModel
to the config object (the const config currently containing provider,
openaiApiKey, googleApiKey, openrouterApiKey) and set each to the corresponding
local state/props (e.g., plannerModel || undefined, coderModel || undefined,
qaModel || undefined) so user selections from the UI are persisted when the save
function runs.

17-30: ⚠️ Potential issue | 🟠 Major

Missing initial config load on component mount.

The component doesn't load existing provider settings when mounted. Users won't see their current configuration, only empty defaults. Consider adding a useEffect to fetch current settings via window.electronAPI?.getProviderConfig?.() or similar.

🔧 Proposed fix to add config loading
 export function ProviderSettingsSection(props: ProviderSettingsSectionProps) {
   const { t } = useTranslation(['settings', 'common']);
   const [selectedProvider, setSelectedProvider] = useState<ProviderType>('claude');
   const [openaiApiKey, setOpenaiApiKey] = useState('');
   const [googleApiKey, setGoogleApiKey] = useState('');
   const [openrouterApiKey, setOpenrouterApiKey] = useState('');
 
   // Model selection per agent type
   const [plannerModel, setPlannerModel] = useState('');
   const [coderModel, setCoderModel] = useState('');
   const [qaModel, setQaModel] = useState('');
 
   const [saving, setSaving] = useState(false);
   const [saveStatus, setSaveStatus] = useState<'idle' | 'success' | 'error'>('idle');
+  const [loading, setLoading] = useState(true);
+
+  useEffect(() => {
+    const loadConfig = async () => {
+      try {
+        const config = await window.electronAPI?.getProviderConfig?.();
+        if (config) {
+          setSelectedProvider(config.provider ?? 'claude');
+          setOpenaiApiKey(config.openaiApiKey ?? '');
+          setGoogleApiKey(config.googleApiKey ?? '');
+          setOpenrouterApiKey(config.openrouterApiKey ?? '');
+          setPlannerModel(config.plannerModel ?? '');
+          setCoderModel(config.coderModel ?? '');
+          setQaModel(config.qaModel ?? '');
+        }
+      } finally {
+        setLoading(false);
+      }
+    };
+    loadConfig();
+  }, []);

Add useEffect to imports:

-import { useState } from 'react';
+import { useState, useEffect } from 'react';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`
around lines 17 - 30, Add a useEffect inside the ProviderSettingsSection to load
existing provider config on mount by calling
window.electronAPI?.getProviderConfig?.(), then populate the component state
setters (setSelectedProvider, setOpenaiApiKey, setGoogleApiKey,
setOpenrouterApiKey, setPlannerModel, setCoderModel, setQaModel) with returned
values (guarding for undefined keys), and handle errors by setting saveStatus to
'error' or logging; ensure the call runs only once (empty dependency array) and
gracefully handles a missing API or malformed response.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`:
- Around line 71-96: There are duplicate/conflicting AIEngineProvider type
definitions; remove the local redefinition in AdvancedSettings.tsx and import
the canonical AIEngineProvider from the shared types file (the export in
shared/types/settings.ts where AIEngineProvider is defined) so both
AdvancedSettings.tsx and ProviderSettingsSection.tsx use the same union type;
update any references in AdvancedSettings.tsx to use the imported
AIEngineProvider and ensure the UI logic (e.g., the selectedProvider checks)
still accepts 'claude' | 'litellm' | 'openrouter' via that shared type.

---

Duplicate comments:
In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`:
- Around line 237-244: The Save button is currently rendered inside the
selectedProvider !== 'claude' conditional so it disappears when the user selects
'claude'; move the Button (and its container div with onClick={handleSave}) out
of that conditional rendering in ProviderSettingsSection so the Button always
renders regardless of selectedProvider value, ensuring handleSave remains the
click handler and existing i18n key t('common:actions.save') is preserved.
- Around line 53-54: The i18n keys in ProviderSettingsSection (component
ProviderSettingsSection in
apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx) use
shorthand keys like t('aiProvider.title'); update every t('aiProvider.*') call
to use the explicit namespace prefix t('settings:aiProvider.*') (there are ~30+
occurrences). Edit the JSX and any helper/constant usages in that file to
replace the key pattern accordingly so all translation calls follow the
"settings:" namespace format.
- Around line 36-41: The save payload built in ProviderSettingsSection.tsx omits
the per-agent model fields, so add plannerModel, coderModel, and qaModel to the
config object (the const config currently containing provider, openaiApiKey,
googleApiKey, openrouterApiKey) and set each to the corresponding local
state/props (e.g., plannerModel || undefined, coderModel || undefined, qaModel
|| undefined) so user selections from the UI are persisted when the save
function runs.
- Around line 17-30: Add a useEffect inside the ProviderSettingsSection to load
existing provider config on mount by calling
window.electronAPI?.getProviderConfig?.(), then populate the component state
setters (setSelectedProvider, setOpenaiApiKey, setGoogleApiKey,
setOpenrouterApiKey, setPlannerModel, setCoderModel, setQaModel) with returned
values (guarding for undefined keys), and handle errors by setting saveStatus to
'error' or logging; ensure the call runs only once (empty dependency array) and
gracefully handles a missing API or malformed response.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3018fc8 and 0c7bc89.

📒 Files selected for processing (2)
  • apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx
  • apps/frontend/src/shared/utils/__tests__/shell-escape.test.ts

OBenner and others added 2 commits March 1, 2026 00:46
Backend:
- Fix per-agent model override to route to correct provider field
- Add ZhipuAI to per-agent model override block
- Fix provider precedence in get_provider_for_agent
- Defer provider/session creation in coder until after isolation check
- Remove Claude-only NotImplementedError from qa_fixer and qa_reviewer
- Fix factory.py docstring and move Path import to module level
- Add claude-opus-4 pricing and defensive error handling in cost_tracking
- Use lazy %-style formatting in logger calls
- Fix Python 3.14 compatibility for ast.Num/Str/Bytes removal
- Move _CONSTANT_TYPES to module scope in rename_detector

Frontend:
- Fix ProviderSettingsSection: proper i18n namespaces, always-visible save
  button, model field inputs, useEffect config loading, shared type import
- Fix CostBreakdownChart: prevent in-place sort mutation, use i18n for tokens
- Fix AppSettings: remove duplicate 'provider' from AppSection type
- Fix AdvancedSettings: import AIEngineProvider from shared types
- Fix settings-handlers: widen provider type cast, log Ollama errors
- Fix project-handlers: resolve costReportPath, add path validation
- Add realistic provider config mocks to settings-mock.ts
- Add plannerModel/coderModel/qaModel to AIProviderConfig type

Tests:
- Skip litellm-dependent tests when litellm is not installed
- Remove .env file fallback writes in e2e test fixtures
- Add create_engine_provider factory assertions in provider tests
- Fix README markdown heading spacing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 'zhipuai' to AIEngineProvider type union
- Add zhipuaiApiKey/zhipuaiModel to AIProviderConfig interface
- Add ZhipuAI option in ProviderSettingsSection dropdown
- Add ZhipuAI API key input field
- Add ZhipuAI to AdvancedSettings provider list
- Add i18n translations (en/fr) for ZhipuAI provider
- Update mock to include zhipuai fields and available provider

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test User and others added 2 commits March 1, 2026 21:07
- Fix S1244: replace float equality with pytest.approx in ollama e2e tests
- Fix S3457: remove f prefix from f-strings without substitutions
- Fix S3776: reduce cognitive complexity in settings-handlers.ts by
  extracting applyProviderSettingsToVars, readEnvFileSafe,
  generateFreshEnvContent, collectActiveVarNames, updateEnvLine,
  fetchOllamaModels helpers and STATIC_PROVIDER_MODELS constant
- Fix S3776: reduce complexity in ProviderSettingsSection.tsx by
  extracting ProviderApiKeyFields and PerAgentModelFields sub-components
- Fix S6582: use optional chaining (result.data?.models) in settings-handlers.ts
- Fix TOCTOU: replace existsSync+readFileSync with ENOENT-catching readEnvFileSafe
  in saveProviderSettings and loadProviderSettings handlers
- Fix i18n: add 'tokens' key to cost section in en/common.json and fr/common.json
- Apply ruff format to test files (import sorting, whitespace normalization)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 35

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
tests/test_critique_integration.py (1)

1-306: 🧹 Nitpick | 🔵 Trivial

Consider converting to pytest format to align with coding guidelines.

While the tests are comprehensive and well-structured, the file uses a standalone script pattern rather than pytest conventions. As per coding guidelines, tests should follow pytest conventions. Consider refactoring to:

  • Use pytest's test discovery (functions automatically discovered)
  • Replace manual assertions with pytest assertions (better error messages)
  • Remove the custom main() test runner
  • Leverage pytest features like fixtures and parametrize where applicable

This would enable running these tests via pytest alongside other project tests and provide better integration with CI/CD pipelines.

As per coding guidelines: "Ensure tests are comprehensive and follow pytest conventions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_critique_integration.py` around lines 1 - 306, This file uses a
manual test runner (main() + if __name__ == "__main__") instead of pytest
discovery; remove the main() function and the __main__ block so pytest can
auto-discover the existing top-level test_* functions
(test_critique_data_structures, test_critique_prompt_generation,
test_critique_response_parsing, test_implementation_plan_integration,
test_complete_workflow, test_summary_formatting), keep/convert any setup to
pytest fixtures if needed (e.g., replace shared setup/print calls with
fixtures), and ensure there are no side-effect sys.exit or prints that interfere
with CI; optionally replace any complex repeated assertions with pytest
parametrize/fixtures for clarity.
tests/test_implementation_plan.py (1)

1787-1812: 🧹 Nitpick | 🔵 Trivial

Consider moving the import to module level.

The from core.progress import get_next_subtask import is repeated inside each test method. While this works for test isolation (especially if the import might fail), it's unconventional. Consider moving to module level or class level with a skip marker if the module might not be available.

# At module level with conditional skip:
try:
    from core.progress import get_next_subtask
    HAS_PROGRESS = True
except ImportError:
    HAS_PROGRESS = False

# Then in class:
`@pytest.mark.skipif`(not HAS_PROGRESS, reason="core.progress not available")
class TestStuckSubtaskSkipping:
    ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_implementation_plan.py` around lines 1787 - 1812, The test
currently imports get_next_subtask inside test_stuck_subtask_is_skipped; move
that import to the module or class level to avoid repeating it in every test and
to make availability checks/conditional skips easier: add a top-level guarded
import for from core.progress import get_next_subtask (or an equivalent
class-level import) and expose a HAS_PROGRESS flag to use with
pytest.mark.skipif on the test class or test function so tests fail-fast/skip
when core.progress is unavailable; update references in
test_stuck_subtask_is_skipped to use the moved get_next_subtask symbol.
tests/test_breaking_change_detector.py (1)

641-647: ⚠️ Potential issue | 🟡 Minor

Assert required keyword-only parameter classification too.

The test verifies presence in params and optional classification for c, but it doesn't validate that b is marked required. Add that assertion to fully cover the contract.

Proposed test completion
         assert "a" in element.params
         assert "b" in element.params
         assert "c" in element.params
         # b is keyword-only required, c is keyword-only optional
+        assert "b" in element.required_params
         assert "c" in element.optional_params

As per coding guidelines, tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_breaking_change_detector.py` around lines 641 - 647, The test
currently checks that params contains "a","b","c" and that "c" is in
element.optional_params but doesn't assert that "b" is recognized as required
keyword-only; update the assertions after obtaining element from api_elements to
also assert that "b" is not in element.optional_params (or that "b" is in
element.required_params if that attribute exists) so the test verifies required
vs optional classification for keyword-only params (use the existing variables
element, params, optional_params to locate where to add this assertion).
apps/frontend/src/shared/i18n/locales/fr/settings.json (1)

717-734: ⚠️ Potential issue | 🟡 Minor

aiProvider.providers is missing localized entries for some supported providers.

This map includes claude/litellm/openrouter/zhipuai but not openai/google/ollama. If those providers are selectable, labels/descriptions will be incomplete in French.

🌍 Proposed localization completion
       "openrouter": {
         "name": "OpenRouter",
         "description": "Routage multi-modèles cloud avec plus de 400 modèles"
       },
+      "openai": {
+        "name": "OpenAI",
+        "description": "Modèles GPT d'OpenAI"
+      },
+      "google": {
+        "name": "Google Gemini",
+        "description": "Modèles Gemini de Google"
+      },
+      "ollama": {
+        "name": "Ollama (Local)",
+        "description": "Modèles locaux exécutés sur votre machine"
+      },
       "zhipuai": {
         "name": "ZhipuAI (GLM)",
         "description": "Modèles GLM de Zhipu AI (glm-4, glm-4-flash)"
       }

As per coding guidelines "All user-facing text must use i18n translation keys ... Update all language files (minimum: English and French) when adding new text."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/shared/i18n/locales/fr/settings.json` around lines 717 -
734, The providers map for aiProvider.providers is missing localized entries for
openai, google, and ollama in French; add corresponding keys under "providers"
with "name" and "description" for "openai", "google", and "ollama" to match
existing entries (e.g., similar structure to
"claude"/"litellm"/"openrouter"/"zhipuai"), and ensure you also add/update the
same keys in the English locale so both languages include labels/descriptions
for these selectable providers.
apps/backend/core/cost_tracking.py (1)

59-315: 🧹 Nitpick | 🔵 Trivial

Comprehensive multi-provider pricing table.

The MODEL_PRICING dictionary provides good coverage for all supported providers. Consider adding a comment about the data source dates or a reminder to periodically verify pricing accuracy, as API pricing can change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/core/cost_tracking.py` around lines 59 - 315, Add a short
explanatory comment above the MODEL_PRICING dictionary that references the data
source/date and a reminder to periodically verify pricing (e.g., "source:
provider docs/pricing pages as of <month year>; review/update quarterly or when
provider announces changes"); ensure it directly mentions MODEL_PRICING so
future maintainers know the mapping is time-sensitive and where to check.
tests/test_qa_report_recurring.py (1)

15-15: 🧹 Nitpick | 🔵 Trivial

Remove unused imports.

Dict and List from typing are imported but no longer used after the type annotation modernization to list[dict] on lines 193-194 and 203.

♻️ Proposed fix
-from typing import Dict, List
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_qa_report_recurring.py` at line 15, Remove the now-unused typing
imports by deleting "Dict" and "List" from the top-level import statement in
tests/test_qa_report_recurring.py; update the import to only include what is
still used (or remove the entire "from typing ..." line if nothing else is used)
so the file no longer imports Dict or List after the type annotations were
switched to list[dict] on the qa report functions.
♻️ Duplicate comments (9)
apps/frontend/src/main/ipc-handlers/project-handlers.ts (2)

563-566: 🧹 Nitpick | 🔵 Trivial

Consider adding file size guard for large cost reports.

While JSON parsing errors are caught, reading an unexpectedly large file could block the main process. Consider adding a size check as suggested in the past review comment.

🛡️ Proposed fix with size guard
         // Read and parse cost report
+        const stat = await fsPromises.stat(costReportPath);
+        const maxSizeBytes = 1024 * 1024; // 1MB safety limit
+        if (stat.size > maxSizeBytes) {
+          return {
+            success: false,
+            error: 'Cost report file is too large'
+          };
+        }
+
         const costReportContent = await fsPromises.readFile(costReportPath, 'utf-8');
         const costReport: CostReport = JSON.parse(costReportContent);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/main/ipc-handlers/project-handlers.ts` around lines 563 -
566, The code reads and JSON-parses the cost report via fsPromises.readFile into
costReportContent (parsed to CostReport) without guarding against very large
files; add a size check before reading by calling
fsPromises.stat(costReportPath) and comparing stat.size to a defined
MAX_COST_REPORT_BYTES constant, and if the file is too large, throw or return an
error (or log and abort) rather than reading the file, then proceed to read and
JSON.parse only when size is within the limit; update the handler around
costReportPath/costReportContent/CostReport to enforce this guard and surface a
clear error message.

539-553: ⚠️ Potential issue | 🟡 Minor

Add path separator to prefix check to prevent directory escape.

The startsWith check on line 551 doesn't include a path separator, which could allow a malicious specId to escape if a directory exists with a matching prefix. For example, if specs is the base and specs-malicious exists as a sibling directory, the check would pass.

🛡️ Proposed fix
         // Ensure resolved path is within specs directory
         const resolvedPath = path.resolve(costReportPath);
         const resolvedSpecsDir = path.resolve(specsDir);
-        if (!resolvedPath.startsWith(resolvedSpecsDir)) {
+        if (!resolvedPath.startsWith(resolvedSpecsDir + path.sep)) {
           return { success: false, error: 'Invalid spec ID' };
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/frontend/src/main/ipc-handlers/project-handlers.ts` around lines 539 -
553, The resolved path containment check is vulnerable because
resolvedPath.startsWith(resolvedSpecsDir) can be bypassed by sibling dirs with
the same prefix; update the validation around specId/specsDir/costReportPath to
ensure resolvedSpecsDir is treated as a directory prefix (e.g., compare using
resolvedSpecsDir + path.sep or use a path-aware containment check) before
returning success — modify the check that uses resolvedPath and resolvedSpecsDir
so it requires the separator after resolvedSpecsDir (or uses path.relative and
verifies it does not start with '..') to prevent directory escape.
apps/backend/phase_config.py (1)

604-636: 🧹 Nitpick | 🔵 Trivial

Provider precedence is correctly implemented.

The function now correctly applies the priority order: agent-specific env var → global AI_ENGINE_PROVIDER → agent defaults → fallback. This addresses the previous review concern about global settings being shadowed.

Consider validating returned provider values.

Environment variables could contain invalid provider names. Consider adding validation similar to ProviderConfig.from_env():

💡 Optional validation
 def get_provider_for_agent(agent_type: str) -> str:
+    valid_providers = {"claude", "openai", "google", "litellm", "openrouter", "zhipuai", "ollama"}
+
     # 1. Check for agent-specific environment variable override
     env_var_name = f"AGENT_PROVIDER_{agent_type.upper()}"
     env_provider = os.environ.get(env_var_name)
     if env_provider:
-        return env_provider
+        env_provider = env_provider.lower()
+        if env_provider in valid_providers:
+            return env_provider
 
     # 2. Global AI_ENGINE_PROVIDER env var overrides the hardcoded defaults
     global_provider = os.environ.get("AI_ENGINE_PROVIDER")
     if global_provider:
-        return global_provider
+        global_provider = global_provider.lower()
+        if global_provider in valid_providers:
+            return global_provider
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/phase_config.py` around lines 604 - 636, The
get_provider_for_agent function should validate the provider string returned
from environment or defaults: after obtaining env_provider, global_provider, or
default_provider (from AGENT_DEFAULT_PROVIDERS) call a validation/normalization
step (e.g., reusing ProviderConfig.from_env() logic or a small whitelist of
allowed providers like {"claude","litellm","openrouter"}) to ensure the chosen
value is valid; if invalid, log or fallback to a safe default ("claude") and
return that. Update get_provider_for_agent to perform this validation and
normalization before returning any provider.
tests/test_openai_provider_e2e.py (2)

368-412: ⚠️ Potential issue | 🟠 Major

Mocked E2E still bypasses provider/session integration.

test_e2e_openai_provider_mock directly logs usage and never executes provider creation/session calls, so integration regressions in provider wiring are not covered.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_openai_provider_e2e.py` around lines 368 - 412, The test
test_e2e_openai_provider_mock currently only calls ProviderConfig.from_env and
directly uses CostTracker.log_usage, so it never exercises provider/session
creation or the provider wiring; modify the test to instantiate the provider and
session flow after loading ProviderConfig (e.g., call the provider factory or
method that creates a session from ProviderConfig, such as the code path that
uses ProviderConfig.get_model_for_provider() to create a provider
client/session) and assert the session/client is created and used (mock the
provider client network calls with mock_openai_api), then call the same
CostTracker.log_usage to record costs—this ensures the test covers provider
creation, session wiring, and still uses mocks for isolation.

53-65: ⚠️ Potential issue | 🟠 Major

Patch the runtime call-site module instead of litellm.completion root.

Root-level patching can miss calls if provider code uses a locally bound import path.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

tests/test_per_agent_provider_e2e.py (2)

50-64: ⚠️ Potential issue | 🟠 Major

Patch the provider-module call site, not litellm.completion at package root.

This mock is fragile and may miss calls if the code under test uses a module-local litellm binding.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

#!/bin/bash
# Verify where litellm is imported/called so patch target matches runtime call site.
rg -nP --type=py -C3 '\bimport\s+litellm\b|\blitellm\.completion\s*\('
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_per_agent_provider_e2e.py` around lines 50 - 64, The test
currently patches "litellm.completion" at the package root which is fragile if
the code under test imports litellm into its own module namespace; update
mock_multi_provider_api to patch the actual call site used by the code under
test (i.e., patch the module-local binding, not the package root). Locate where
the code under test imports or references litellm (the module that contains the
function/class under test) and change patch("litellm.completion") to
patch("<module_where_used>.litellm.completion") so the mock intercepts the
runtime call; keep the mocked response object (mock_gpt4_response) and yield the
patched object as before. Ensure tests run and adjust the import path if the
test uses different modules that import litellm.

428-489: ⚠️ Potential issue | 🟠 Major

test_e2e_multi_provider_mock still bypasses provider/session execution.

The test validates env/config and writes usage directly via CostTracker.log_usage; it does not execute provider factory/session flow, so integration regressions can still pass.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

tests/test_ollama_provider_e2e.py (2)

53-69: ⚠️ Potential issue | 🟠 Major

Patch the module-level call site used by provider code, not litellm.completion root.

This patch target can miss actual runtime calls depending on import binding in the adapter/provider module.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_ollama_provider_e2e.py` around lines 53 - 69, The test currently
patches litellm.completion directly which can miss calls if the Ollama provider
imports completion into its own module; change the patch target to the
provider's module-level symbol (the module where the Ollama adapter/provider
imports or references completion) so that calls made by the provider code are
intercepted; replace the patch("litellm.completion") usage with a patch against
the provider module's completion symbol (the same place used by the provider
under test), keeping the same mock_response/mock_completion setup.

358-427: ⚠️ Potential issue | 🟠 Major

Mocked Ollama E2E still bypasses provider/session invocation.

The test records usage directly via CostTracker and does not execute provider/session logic, so it cannot catch integration regressions.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_ollama_provider_e2e.py` around lines 358 - 427, The test currently
bypasses provider/session code by calling CostTracker.log_usage directly; update
test_e2e_ollama_provider_mock to exercise the provider stack under the mock
instead: use ProviderConfig.from_env() to construct the provider (or call the
factory that returns a Provider instance), open a ProviderSession or call the
provider's inference method (e.g., session.send, provider.predict or similar) so
the request flows through the same code paths that trigger CostTracker
recording, and remove the direct CostTracker.log_usage call; keep
mock_ollama_api in place so the provider I/O is mocked and assert the same
cost_report.json and record values after performing the provider interaction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/.env.example`:
- Line 219: Update the explanatory comment that lists supported cost-tracking
providers to include Zhipu GLM; find the line reading "Supports cost tracking
for Claude, OpenAI, Google Gemini, and Ollama (zero cost)" and append or insert
"Zhipu GLM" so the list matches the newly added Zhipu cost support.

In `@apps/backend/agents/coder.py`:
- Around line 869-870: The process-isolation branch recomputes the session agent
type from first_run and may use the wrong type once first_run flips; instead
compute and store agent_type_for_session once (as currently set) and use that
stored variable inside the isolation code path rather than re-evaluating
first_run; update references in the isolation block(s) that currently derive
agent type from first_run (including the other occurrence around lines
~1057-1068) to use agent_type_for_session so planner sessions always run with
the intended "planner" value when first_run was true.

In `@apps/backend/agents/planner.py`:
- Around line 159-160: Before assigning and using session.client in the async
context, add defensive validation that session has a client attribute and that
the retrieved client object supports async context management (i.e., has
callable __aenter__ and __aexit__); if not, raise a clear TypeError (or return
an informative error) that mentions "session.client" and the expected async
context-manager interface so callers get actionable diagnostics. Perform this
check immediately around the code that sets client = session.client and before
any "async with client:" usage in this module (also replicate the same
validation for the second occurrence around the block currently at lines
205-208) so both paths fail fast with a helpful message when an adapter
deviates.
- Around line 80-85: The non-Claude branch that calls provider.create_session
with SessionConfig(name="planner-session", model=model) is missing the extra
metadata (agent_type and thinking budget) used by the Claude path, causing
behavior drift; update the SessionConfig passed to provider.create_session in
planner.py to include the same extra metadata keys (e.g., extra={"agent_type":
"planner", "thinking_budget": <value>} or the existing constants used in the
Claude branch) so both providers create sessions with consistent metadata.

In `@apps/backend/qa/fixer.py`:
- Around line 792-799: The call to run_qa_fixer_session inside run_qa_fixer
omits the selected model so the delegate falls back to its default; update the
async with client block to forward the active model (e.g. add
model=fix_session.model or the appropriate selected model variable) to
run_qa_fixer_session so the session uses the intended model rather than
"sonnet".
- Around line 760-765: The non-Claude session creation via
provider.create_session(SessionConfig(...)) is missing propagation of QA agent
metadata (agent_type and thinking) used in the Claude path; update the
non-Claude branch where SessionConfig is constructed to include the same
metadata keys (e.g., agent_type="qa_fixer" or "qa_reviewer" as appropriate and
thinking metadata) so QA agents have the same tool wiring; ensure this metadata
is passed into SessionConfig (or provider.create_session) when
ELECTRON_MCP_ENABLED may enable Electron MCP tools, matching the Claude-path
fields so the qa_reviewer/qa_fixer sessions get automatic access to Electron MCP
tools for E2E testing.

In `@apps/backend/qa/reviewer.py`:
- Around line 897-903: Before using session.client in reviewer.py, add defensive
checks: verify session is not None and has a non-None attribute client (check
hasattr(session, "client") and session.client is not None), confirm the client
supports async context management (hasattr(client, "__aenter__") and
hasattr(client, "__aexit__")), and if any check fails raise a clear, descriptive
exception mentioning the session/provider and that run_qa_agent_session cannot
proceed; then proceed to use async with client and call run_qa_agent_session
only when these guards pass.
- Around line 847-852: The non-Claude session creation using
provider.create_session(SessionConfig(...)) is missing reviewer agent metadata;
update the SessionConfig passed to create_session (the SessionConfig used for
the QA reviewer) to include agent-specific fields such as
agent_type="qa_reviewer" and a sensible thinking_budget value, and when
ELECTRON_MCP_ENABLED is true ensure the session metadata includes access
flags/metadata enabling Electron MCP tools (e.g., include an
electron_mcp_enabled or tools list entry so qa_reviewer/qa_fixer get automatic
MCP tool routing). Ensure these fields are added alongside the existing name and
model in the SessionConfig so QA behavior and tool routing remain consistent
with the Claude branch.

In `@apps/frontend/src/main/ipc-handlers/settings-handlers.ts`:
- Around line 108-115: The keyMap array and the provider validation flow are
missing ZhipuAI entries so Zhipu settings aren’t persisted/loaded/validated; add
Zhipu keys to the keyMap (e.g., map ProviderSettings fields like 'zhipuApiKey'
and any Zhipu model fields to their env names), and extend the provider
validation/selection logic (the same code paths that handle
OpenAI/Google/OpenRouter) to include the ZhipuAI branch so its API key and model
settings are read, written and validated consistently (update functions
referenced by ProviderSettings, keyMap and the provider validation code path).
- Around line 437-466: The SETTINGS_SAVE_PROVIDER IPC handler performs a
read-modify-write on the .env using readEnvFileSafe, parseEnvFile,
applyProviderSettingsToVars, generateProviderEnvContent and writeFileSync and
can be raced by concurrent handlers (also the similar handler at 1152-1191).
Serialize these operations by adding a per-project (or per-envPath) mutex/queue:
acquire the lock at the start of the handler, then perform readEnvFileSafe →
parseEnvFile → applyProviderSettingsToVars → generateProviderEnvContent →
writeFileSync, and finally release the lock (ensure release on errors), so
concurrent saves for the same project/envPath are queued and cannot overwrite
each other. Use a small in-memory Map keyed by projectId or envPath to store
Promises or a simple mutex class to implement the queue. Ensure both handlers
use the same lock mechanism.
- Around line 1000-1075: The SETTINGS_GET_AVAILABLE_MODELS handler rejects valid
providers because STATIC_PROVIDER_MODELS only lists keys for claude, litellm,
and openrouter; update the provider mapping so supported providers like openai,
google, and zhipuai resolve to model lists (either by adding entries to
STATIC_PROVIDER_MODELS or adding an alias map that maps 'openai' -> 'litellm' /
'openrouter' etc.), and then use that mapped key in the IPC handler before
returning Unknown provider; adjust lookup in the IPC handler
(IPC_CHANNELS.SETTINGS_GET_AVAILABLE_MODELS) to consult the alias map or
expanded STATIC_PROVIDER_MODELS and ensure fetchOllamaModels remains used only
for 'ollama'.
- Line 450: The code builds envPath with path.join(project.path,
project.autoBuildPath, '.env') which can resolve outside the selected project
root if autoBuildPath is absolute or contains traversal; change to resolve and
validate the final path is inside project.path: compute const resolvedEnvPath =
path.resolve(project.path, project.autoBuildPath, '.env') and then verify
path.relative(project.path, resolvedEnvPath) does not start with '..' (and for
Windows also ensure drive letters match); if the check fails, reject/throw or
fallback to a safe default instead of using resolvedEnvPath. Ensure you update
both places that set envPath (the occurrences referencing project.autoBuildPath
and project.path).

In `@apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx`:
- Around line 64-70: getProviderName currently returns hardcoded English labels
and misses some provider IDs; replace the names map with translation-key lookups
using the react-i18next hook (useTranslation) so labels call
t('analytics:provider.openai'), t('analytics:provider.claude'), etc., and add
keys for all supported providers including litellm, openrouter, zhipuai, ollama
and google; update the getProviderName function to normalize provider
(provider.toLowerCase()) and map to t(...) for each id, falling back to provider
when no key exists, and add the corresponding translation entries to English and
French language files using the namespace:key pattern (analytics:provider.<id>).

In `@apps/frontend/src/renderer/components/settings/AdvancedSettings.tsx`:
- Around line 27-31: The AI_ENGINE_PROVIDERS array in AdvancedSettings.tsx
doesn't include all members of the AIEngineProvider union (missing 'openai',
'google', 'ollama'), so the dropdown cannot display or reselect those values;
update AI_ENGINE_PROVIDERS to include entries for 'openai', 'google', and
'ollama' with appropriate labelKey/descriptionKey strings matching your i18n
keys, and ensure any other occurrences (the second block noted around lines
515-549) are updated consistently so the select control can match and retain
existing AIEngineProvider values.

In `@apps/frontend/src/renderer/components/settings/AppSettings.tsx`:
- Line 22: Remove the unused Cloud import from the AppSettings.tsx module:
locate the import statement that includes the Cloud symbol (likely in the
imports near the top of the file) and delete just the Cloud identifier (or the
entire import line if it only imports Cloud) so the file no longer references an
unused symbol.

In `@apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx`:
- Around line 343-349: The JSX contains hardcoded fallback strings passed to the
i18n translator (t) — replace those inline literals with proper translation keys
and remove the string fallbacks: update the t(...) calls inside
ProviderSettingsSection (where saving, saveStatus and the Button render are
used) to use keys like "settings:provider.saving", "settings:provider.saved",
and "settings:provider.error" (or your project’s naming convention) and ensure
you add corresponding entries to the English and French language JSON/YAML
files; keep the same t(...) calls but remove the second-argument literal
fallback and add the new keys to the locale files so the UI texts come from i18n
instead of hardcoded strings.
- Around line 271-303: The provider dropdown is missing SelectItem entries for
the supported provider values, causing existing configs (openai, google, ollama)
to be non-editable; add SelectItem elements with value="openai", value="google",
and value="ollama" inside the same SelectContent block (matching the pattern
used for "claude"/"litellm"/"openrouter"/"zhipuai") and use the corresponding
translation keys (e.g. settings:aiProvider.providers.openai.name and
.description, and similarly for google and ollama) so the UI covers the full set
of selectedProvider options.

In `@apps/frontend/src/shared/i18n/locales/en/settings.json`:
- Around line 44-47: Add the matching French translations for the new
"providers" translation object you added in settings.json: replicate the keys
"providers.title" and "providers.description" in the French locale settings.json
with appropriate French text so the French bundle no longer falls back to
English; ensure the key names exactly match ("providers", with nested "title"
and "description") and mirror any other new keys added in the same diff range
(e.g., lines near 763-864) so all user-facing strings exist in both English and
French.
- Around line 769-786: The locales file's "providers" map is missing entries for
openai, google, and ollama so selected providers won't render; add keys
"openai", "google", and "ollama" to the "providers" object with appropriate
"name" and "description" values (e.g., OpenAI — official OpenAI API, Google —
Google PaLM/Generative AI, Ollama — local Ollama models) so the
aiProvider.providers translations include these providers and UI can display
them consistently; update the "providers" object near the existing "claude",
"litellm", "openrouter", and "zhipuai" entries.

In `@apps/frontend/src/shared/types/settings.ts`:
- Around line 381-389: ProviderSettings currently lacks a field for ZhipuAI
credentials while AIEngineProvider includes 'zhipuai'; add an optional
zhipuaiApiKey?: string property to the ProviderSettings interface so it can
represent ZhipuAI API keys (update the ProviderSettings declaration to include
zhipuaiApiKey alongside openaiApiKey, googleApiKey, openrouterApiKey, etc.).

In `@tests/conftest.py`:
- Around line 335-357: The git-related subprocess.run calls in the test fixture
use capture_output=True but do not pass check=True, so failures are ignored and
can corrupt fixtures; update the subprocess.run invocations that run git
commands (e.g., the git config, git add, git commit, git branch -M, and other
git add/commit/rev-parse calls elsewhere in tests/conftest.py) to include
check=True so a non-zero exit raises and fails the test setup immediately
(locate calls by searching for subprocess.run([... "git", ...], cwd=temp_dir,
capture_output=True) and add check=True).

In `@tests/manual_user_correction_test.py`:
- Around line 43-46: The file currently uses a manual script pattern (prints,
main(), and if __name__ == "__main__") which prevents pytest discovery; refactor
by removing the main() wrapper and the top-level execution guard and convert the
procedural logic into one or more pytest test functions (e.g.,
test_user_correction_end_to_end) that accept built-in fixtures such as tmp_path
instead of manual temp handling; replace print-based checks and any manual
control flow with assert statements to validate expected outcomes, and move any
setup/teardown into fixtures or helper functions so pytest can discover and run
the tests automatically.

In `@tests/README_E2E_OPENAI.md`:
- Around line 99-100: Update the spec directory references to use the full,
consistent directory name "046-multi-model-provider-support" instead of the
short "046": change the example command that calls python
.auto-claude/specs/046-multi-model-provider-support/validate_openai_e2e.py
--spec-id <SPEC_ID> and any other occurrences shown (the shorter "046"
references at the later example lines) so all paths and copy/paste snippets
consistently use "046-multi-model-provider-support".

In `@tests/test_agent_flow.py`:
- Around line 82-87: The git subprocess invocations (e.g., get_latest_commit
using subprocess.run(... ["git", "rev-parse", "HEAD"] ) and the other
subprocess.run calls that execute git) currently omit check=True and can
silently fail; update each subprocess.run call that runs a git command to
include check=True so failures raise CalledProcessError and tests fail fast, and
if the surrounding code expects returncode checks adapt it to either let the
exception propagate or catch subprocess.CalledProcessError and handle/log
accordingly (search for subprocess.run usages in this file to find all instances
to update).

In `@tests/test_architecture_validator.py`:
- Around line 12-20: Remove the redundant module-level sys.path modification in
tests/test_architecture_validator.py: delete the sys.path.insert(0,
str(Path(__file__).parent.parent / "apps" / "backend")) line (the sys.path
mutation) because conftest.py already sets the import path globally; leave the
other imports and tempfile/Path usage intact so only the duplicated sys.path
insertion is removed.

In `@tests/test_breaking_change_detector.py`:
- Around line 300-305: The test currently nests if-checks so it silently passes
when no changes are found; update the assertions to explicitly require both that
changes is non-empty and that at least one signature change exists. Replace the
outer conditional with an assertion like assert len(changes) > 0 (or assert
changes) and replace the inner conditional with an assertion that
signature_changes is non-empty (or assert any(c.change_type ==
"signature_change" for c in changes)) and then assert the severity of the first
signature_change is in ["medium", "high"]; target the variables changes and
signature_changes in tests/test_breaking_change_detector.py to make the test
fail when no signature change is detected.
- Around line 12-18: Remove the module-level sys.path.insert(...) from the test
file and instead centralize test-path setup in a pytest fixture (e.g., in
conftest.py) that prepends the desired path for the test session using pytest's
monkeypatch.syspath_prepend(str(Path(...))) or a session-scoped fixture that
temporarily modifies sys.path; specifically, delete the sys.path.insert call and
replace it with a session-scoped fixture named (for example) backend_path that
computes the same Path(...) value and calls monkeypatch.syspath_prepend(...) so
path changes are isolated and cleaned up after tests.

In `@tests/test_graphiti.py`:
- Around line 65-67: Remove the permanent pytest.mark.skip on
test_status_when_missing_openai_key and instead make the test explicitly isolate
env state: in the test (test_status_when_missing_openai_key) use monkeypatch to
set relevant provider env vars (e.g., OPENAI_API_KEY and any other provider
keys) to empty or delete them, and patch/load_dotenv (or call dotenv.load_dotenv
with a temp path) so the app loads a deterministic no-key environment; ensure
any code that reads env (e.g., get_status or Graphiti config helpers) is
exercised under that mocked environment and restore state after the test.

In `@tests/test_long_running_stability.py`:
- Around line 271-275: The checkpoint_interval_minutes parameter in
run_automated_stability_test can be zero which will cause a
division/modulo-by-zero where checkpoint checks occur; add an explicit guard at
the start of run_automated_stability_test to validate
checkpoint_interval_minutes (e.g., raise a ValueError if <= 0) or normalize it
to a safe minimum (e.g., checkpoint_interval = max(1,
checkpoint_interval_minutes)) and then use that normalized value wherever
checkpoint checks or modulo operations are performed; reference the
run_automated_stability_test function and the checkpoint_interval_minutes
parameter to locate and update the validation/normalization logic.

In `@tests/test_merge_orchestrator.py`:
- Around line 140-152: The test runs git commands via subprocess.run (the calls
executing "git checkout -b auto-claude/task-001", "git add .", and "git commit
-m 'Add new function'") without check=True, allowing silent failures; modify
those subprocess invocations in tests/test_merge_orchestrator.py to fail fast by
either passing check=True to subprocess.run or replacing with
subprocess.check_call/check_output so any git setup error raises immediately and
surfaces in the test output.

In `@tests/test_merge_semantic_analyzer.py`:
- Around line 100-102: The test currently computes [c for c in analysis.changes
if c.change_type == ChangeType.ADD_CLASS] and discards it so it doesn’t actually
assert that a class was detected; update the test (test_python_class_detection)
to capture that list (e.g., added_classes = [...]) and assert on it (for example
assert len(added_classes) > 0 or assert any(...) to verify a class or its
methods were detected), using the existing analysis variable and
ChangeType.ADD_CLASS symbol so the test fails when no class addition is present.

In `@tests/test_ollama_provider_e2e.py`:
- Around line 176-192: Replace the environment-only assertions in
test_ollama_no_api_key_required with assertions that exercise the provider
initialization and a minimal session/generation call: using the test_env_ollama
and monkeypatch fixtures, set AI_ENGINE_PROVIDER and LITELLM_MODEL as you
already do, then call the project's provider initialization factory/function
(the code path that creates the lite-llm/Ollama provider — e.g., the create/get
provider function your tests normally use) to instantiate the provider/session
and assert it returns a usable object and can produce a minimal response (for
example, verify provider is not None and provider.generate()/session.create()
returns a non-error result or expected structure). Keep the existing env
removals (monkeypatch.delenv) and add these provider-level assertions so the
test verifies behavior, not just environment state.

In `@tests/test_openai_provider_e2e.py`:
- Around line 174-186: The test test_openai_api_key_required currently only
checks that OPENAI_API_KEY is unset; instead, invoke the actual litellm provider
initialization/validation code path (e.g., call the provider factory or init
function used elsewhere in tests such as Provider.create, load_provider, or
config.validate for the litellm provider) and assert the expected failure:
either that it raises a specific exception, returns an error result, or emits a
warning (use pytest.raises or caplog to assert the message). Update
test_openai_api_key_required to set AI_ENGINE_PROVIDER and LITELLM_MODEL as
before, remove the bare getenv assertion, call the provider init/validation (the
same function used by runtime), and assert the correct behavior (exception
type/message or logged warning) to validate missing OPENAI_API_KEY handling.

In `@tests/test_pr_worktree_manager.py`:
- Around line 8-9: The dynamic import using importlib.util should validate the
spec and its loader before executing to avoid obscure AttributeErrors during
test collection; after calling importlib.util.spec_from_file_location and before
spec.loader.exec_module (and before importlib.util.module_from_spec if you rely
on the loader), check that spec is not None and spec.loader is not None, and
raise a clear RuntimeError (or AssertionError) with a descriptive message if
either is missing so failures are deterministic and informative; apply the same
checks around the other import blocks referenced (lines ~20-22) that use
spec/spec.loader.
- Around line 76-79: Replace the direct subprocess.run([...], cwd=origin_dir,
...) call (search for subprocess.run and origin_dir in
tests/test_pr_worktree_manager.py) with the repository's platform execution
utility used by backend/tests (import the shared exec helper from the
tests/platform abstraction module) and invoke it to run "git init --bare" with
the same cwd; also use the platform/path joining utilities from that module to
construct any paths instead of manual concatenation.

---

Outside diff comments:
In `@apps/backend/core/cost_tracking.py`:
- Around line 59-315: Add a short explanatory comment above the MODEL_PRICING
dictionary that references the data source/date and a reminder to periodically
verify pricing (e.g., "source: provider docs/pricing pages as of <month year>;
review/update quarterly or when provider announces changes"); ensure it directly
mentions MODEL_PRICING so future maintainers know the mapping is time-sensitive
and where to check.

In `@apps/frontend/src/shared/i18n/locales/fr/settings.json`:
- Around line 717-734: The providers map for aiProvider.providers is missing
localized entries for openai, google, and ollama in French; add corresponding
keys under "providers" with "name" and "description" for "openai", "google", and
"ollama" to match existing entries (e.g., similar structure to
"claude"/"litellm"/"openrouter"/"zhipuai"), and ensure you also add/update the
same keys in the English locale so both languages include labels/descriptions
for these selectable providers.

In `@tests/test_breaking_change_detector.py`:
- Around line 641-647: The test currently checks that params contains
"a","b","c" and that "c" is in element.optional_params but doesn't assert that
"b" is recognized as required keyword-only; update the assertions after
obtaining element from api_elements to also assert that "b" is not in
element.optional_params (or that "b" is in element.required_params if that
attribute exists) so the test verifies required vs optional classification for
keyword-only params (use the existing variables element, params, optional_params
to locate where to add this assertion).

In `@tests/test_critique_integration.py`:
- Around line 1-306: This file uses a manual test runner (main() + if __name__
== "__main__") instead of pytest discovery; remove the main() function and the
__main__ block so pytest can auto-discover the existing top-level test_*
functions (test_critique_data_structures, test_critique_prompt_generation,
test_critique_response_parsing, test_implementation_plan_integration,
test_complete_workflow, test_summary_formatting), keep/convert any setup to
pytest fixtures if needed (e.g., replace shared setup/print calls with
fixtures), and ensure there are no side-effect sys.exit or prints that interfere
with CI; optionally replace any complex repeated assertions with pytest
parametrize/fixtures for clarity.

In `@tests/test_implementation_plan.py`:
- Around line 1787-1812: The test currently imports get_next_subtask inside
test_stuck_subtask_is_skipped; move that import to the module or class level to
avoid repeating it in every test and to make availability checks/conditional
skips easier: add a top-level guarded import for from core.progress import
get_next_subtask (or an equivalent class-level import) and expose a HAS_PROGRESS
flag to use with pytest.mark.skipif on the test class or test function so tests
fail-fast/skip when core.progress is unavailable; update references in
test_stuck_subtask_is_skipped to use the moved get_next_subtask symbol.

In `@tests/test_qa_report_recurring.py`:
- Line 15: Remove the now-unused typing imports by deleting "Dict" and "List"
from the top-level import statement in tests/test_qa_report_recurring.py; update
the import to only include what is still used (or remove the entire "from typing
..." line if nothing else is used) so the file no longer imports Dict or List
after the type annotations were switched to list[dict] on the qa report
functions.

---

Duplicate comments:
In `@apps/backend/phase_config.py`:
- Around line 604-636: The get_provider_for_agent function should validate the
provider string returned from environment or defaults: after obtaining
env_provider, global_provider, or default_provider (from
AGENT_DEFAULT_PROVIDERS) call a validation/normalization step (e.g., reusing
ProviderConfig.from_env() logic or a small whitelist of allowed providers like
{"claude","litellm","openrouter"}) to ensure the chosen value is valid; if
invalid, log or fallback to a safe default ("claude") and return that. Update
get_provider_for_agent to perform this validation and normalization before
returning any provider.

In `@apps/frontend/src/main/ipc-handlers/project-handlers.ts`:
- Around line 563-566: The code reads and JSON-parses the cost report via
fsPromises.readFile into costReportContent (parsed to CostReport) without
guarding against very large files; add a size check before reading by calling
fsPromises.stat(costReportPath) and comparing stat.size to a defined
MAX_COST_REPORT_BYTES constant, and if the file is too large, throw or return an
error (or log and abort) rather than reading the file, then proceed to read and
JSON.parse only when size is within the limit; update the handler around
costReportPath/costReportContent/CostReport to enforce this guard and surface a
clear error message.
- Around line 539-553: The resolved path containment check is vulnerable because
resolvedPath.startsWith(resolvedSpecsDir) can be bypassed by sibling dirs with
the same prefix; update the validation around specId/specsDir/costReportPath to
ensure resolvedSpecsDir is treated as a directory prefix (e.g., compare using
resolvedSpecsDir + path.sep or use a path-aware containment check) before
returning success — modify the check that uses resolvedPath and resolvedSpecsDir
so it requires the separator after resolvedSpecsDir (or uses path.relative and
verifies it does not start with '..') to prevent directory escape.

In `@tests/test_ollama_provider_e2e.py`:
- Around line 53-69: The test currently patches litellm.completion directly
which can miss calls if the Ollama provider imports completion into its own
module; change the patch target to the provider's module-level symbol (the
module where the Ollama adapter/provider imports or references completion) so
that calls made by the provider code are intercepted; replace the
patch("litellm.completion") usage with a patch against the provider module's
completion symbol (the same place used by the provider under test), keeping the
same mock_response/mock_completion setup.
- Around line 358-427: The test currently bypasses provider/session code by
calling CostTracker.log_usage directly; update test_e2e_ollama_provider_mock to
exercise the provider stack under the mock instead: use
ProviderConfig.from_env() to construct the provider (or call the factory that
returns a Provider instance), open a ProviderSession or call the provider's
inference method (e.g., session.send, provider.predict or similar) so the
request flows through the same code paths that trigger CostTracker recording,
and remove the direct CostTracker.log_usage call; keep mock_ollama_api in place
so the provider I/O is mocked and assert the same cost_report.json and record
values after performing the provider interaction.

In `@tests/test_openai_provider_e2e.py`:
- Around line 368-412: The test test_e2e_openai_provider_mock currently only
calls ProviderConfig.from_env and directly uses CostTracker.log_usage, so it
never exercises provider/session creation or the provider wiring; modify the
test to instantiate the provider and session flow after loading ProviderConfig
(e.g., call the provider factory or method that creates a session from
ProviderConfig, such as the code path that uses
ProviderConfig.get_model_for_provider() to create a provider client/session) and
assert the session/client is created and used (mock the provider client network
calls with mock_openai_api), then call the same CostTracker.log_usage to record
costs—this ensures the test covers provider creation, session wiring, and still
uses mocks for isolation.

In `@tests/test_per_agent_provider_e2e.py`:
- Around line 50-64: The test currently patches "litellm.completion" at the
package root which is fragile if the code under test imports litellm into its
own module namespace; update mock_multi_provider_api to patch the actual call
site used by the code under test (i.e., patch the module-local binding, not the
package root). Locate where the code under test imports or references litellm
(the module that contains the function/class under test) and change
patch("litellm.completion") to patch("<module_where_used>.litellm.completion")
so the mock intercepts the runtime call; keep the mocked response object
(mock_gpt4_response) and yield the patched object as before. Ensure tests run
and adjust the import path if the test uses different modules that import
litellm.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c7bc89 and b4d2477.

📒 Files selected for processing (137)
  • apps/backend/.env.example
  • apps/backend/agents/coder.py
  • apps/backend/agents/planner.py
  • apps/backend/core/cost_tracking.py
  • apps/backend/core/providers/config.py
  • apps/backend/core/providers/factory.py
  • apps/backend/merge/rename_detector.py
  • apps/backend/phase_config.py
  • apps/backend/qa/fixer.py
  • apps/backend/qa/reviewer.py
  • apps/frontend/src/main/ipc-handlers/project-handlers.ts
  • apps/frontend/src/main/ipc-handlers/settings-handlers.ts
  • apps/frontend/src/renderer/components/analytics/CostBreakdownChart.tsx
  • apps/frontend/src/renderer/components/settings/AdvancedSettings.tsx
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/renderer/components/settings/ProviderSettingsSection.tsx
  • apps/frontend/src/renderer/lib/mocks/settings-mock.ts
  • apps/frontend/src/shared/constants/ipc.ts
  • apps/frontend/src/shared/i18n/locales/en/common.json
  • apps/frontend/src/shared/i18n/locales/en/settings.json
  • apps/frontend/src/shared/i18n/locales/fr/common.json
  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/types/settings.ts
  • tests/README_E2E_OPENAI.md
  • tests/agents/test_code_reviewer.py
  • tests/conftest.py
  • tests/e2e/test_context_optimization.py
  • tests/integration/test_failure_analysis_e2e.py
  • tests/manual_user_correction_test.py
  • tests/qa_report_helpers.py
  • tests/review_fixtures.py
  • tests/runners/github/test_code_review_service.py
  • tests/runners/github/test_import_chain.py
  • tests/test_agent.py
  • tests/test_agent_configs.py
  • tests/test_agent_flow.py
  • tests/test_analyzer_port_detection.py
  • tests/test_architecture_validator.py
  • tests/test_artifacts.py
  • tests/test_auth.py
  • tests/test_breaking_change_detector.py
  • tests/test_builtin_templates.py
  • tests/test_check_encoding.py
  • tests/test_ci_discovery.py
  • tests/test_claude_provider.py
  • tests/test_client.py
  • tests/test_context_gatherer.py
  • tests/test_critique_integration.py
  • tests/test_debug_assistant_integration.py
  • tests/test_decision_tracking_e2e.py
  • tests/test_dependency_validator.py
  • tests/test_discovery.py
  • tests/test_finding_validation.py
  • tests/test_fixtures.py
  • tests/test_followup.py
  • tests/test_github_bot_detection.py
  • tests/test_github_pr_e2e.py
  • tests/test_github_pr_review.py
  • tests/test_graphiti.py
  • tests/test_graphiti_search.py
  • tests/test_implementation_plan.py
  • tests/test_integration_phase4.py
  • tests/test_issue_884_plan_schema.py
  • tests/test_json_output.py
  • tests/test_learning_from_failures_e2e.py
  • tests/test_long_running_stability.py
  • tests/test_merge_ai_resolver.py
  • tests/test_merge_auto_merger.py
  • tests/test_merge_conflict_detector.py
  • tests/test_merge_conflict_markers.py
  • tests/test_merge_file_tracker.py
  • tests/test_merge_fixtures.py
  • tests/test_merge_orchestrator.py
  • tests/test_merge_parallel.py
  • tests/test_merge_semantic_analyzer.py
  • tests/test_merge_types.py
  • tests/test_model_resolution.py
  • tests/test_multi_model_orchestration.py
  • tests/test_ollama_provider_e2e.py
  • tests/test_openai_provider_e2e.py
  • tests/test_output_validator.py
  • tests/test_owasp_scanner.py
  • tests/test_per_agent_provider_e2e.py
  • tests/test_performance_analyzer.py
  • tests/test_phase_event.py
  • tests/test_platform.py
  • tests/test_pr_worktree_manager.py
  • tests/test_prevention_config.py
  • tests/test_prevention_scanner.py
  • tests/test_qa_criteria.py
  • tests/test_qa_loop.py
  • tests/test_qa_loop_enhancements.py
  • tests/test_qa_report_config.py
  • tests/test_qa_report_iteration.py
  • tests/test_qa_report_manual_plan.py
  • tests/test_qa_report_project_detection.py
  • tests/test_qa_report_recurring.py
  • tests/test_qa_tools.py
  • tests/test_recovery.py
  • tests/test_review_approval.py
  • tests/test_review_feedback.py
  • tests/test_review_helpers.py
  • tests/test_review_integration.py
  • tests/test_review_state.py
  • tests/test_review_validation.py
  • tests/test_review_verdict.py
  • tests/test_risk_classifier.py
  • tests/test_roadmap_validation.py
  • tests/test_scan_secrets.py
  • tests/test_scope_analyzer.py
  • tests/test_sdk_structured_output.py
  • tests/test_security.py
  • tests/test_security_auditor.py
  • tests/test_security_cache.py
  • tests/test_security_scanner.py
  • tests/test_service_orchestrator.py
  • tests/test_session_context.py
  • tests/test_session_persistence_e2e.py
  • tests/test_session_resume.py
  • tests/test_signature_parser_verification.py
  • tests/test_spec_complexity.py
  • tests/test_spec_phases.py
  • tests/test_spec_pipeline.py
  • tests/test_spec_validate_pkg_validators_context_validator.py
  • tests/test_spec_validate_pkg_validators_prereqs_validator.py
  • tests/test_spec_validate_pkg_validators_spec_document_validator.py
  • tests/test_statistics_tool.py
  • tests/test_structured_outputs.py
  • tests/test_template_generator.py
  • tests/test_template_library.py
  • tests/test_template_registry.py
  • tests/test_test_generation.py
  • tests/test_token_optimization.py
  • tests/test_validation_strategy.py
  • tests/test_workspace.py
  • tests/test_workspace_integration.py
  • tests/test_worktree.py
💤 Files with no reviewable changes (1)
  • tests/review_fixtures.py

Comment on lines +869 to +870
agent_type_for_session = "planner" if first_run else "coder"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Process-isolation path can use the wrong agent type for planning sessions.

agent_type_for_session is computed correctly, but isolation recomputes from first_run after first_run may have flipped to False, so planner prompts can run with agent_type="coder".

💡 Suggested patch
         if use_process_isolation:
             # Run in isolated subprocess for crash resistance
-            agent_type = "planner" if first_run else "coder"
+            agent_type = agent_type_for_session
             if verbose or iteration == 1:
                 print_status(
                     "Process isolation: ENABLED (crash-resistant mode)", "info"
                 )

Also applies to: 1057-1068

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/coder.py` around lines 869 - 870, The process-isolation
branch recomputes the session agent type from first_run and may use the wrong
type once first_run flips; instead compute and store agent_type_for_session once
(as currently set) and use that stored variable inside the isolation code path
rather than re-evaluating first_run; update references in the isolation block(s)
that currently derive agent type from first_run (including the other occurrence
around lines ~1057-1068) to use agent_type_for_session so planner sessions
always run with the intended "planner" value when first_run was true.

Comment on lines +80 to +85
session = provider.create_session(
SessionConfig(
name="planner-session",
model=model,
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep planner session metadata consistent across providers.

The non-Claude path omits extra metadata (agent_type, thinking budget), creating behavior drift versus the Claude branch.

💡 Suggested patch
     else:
         session = provider.create_session(
             SessionConfig(
                 name="planner-session",
                 model=model,
+                extra={
+                    "agent_type": "planner",
+                    "max_thinking_tokens": max_thinking_tokens,
+                },
             )
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/planner.py` around lines 80 - 85, The non-Claude branch
that calls provider.create_session with SessionConfig(name="planner-session",
model=model) is missing the extra metadata (agent_type and thinking budget) used
by the Claude path, causing behavior drift; update the SessionConfig passed to
provider.create_session in planner.py to include the same extra metadata keys
(e.g., extra={"agent_type": "planner", "thinking_budget": <value>} or the
existing constants used in the Claude branch) so both providers create sessions
with consistent metadata.

Comment on lines +159 to +160
# Get the underlying SDK client from the session
client = session.client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add defensive session/client validation before entering async context.

session.client is assumed to exist and support async with; when an adapter deviates, this fails late with low-context errors.

💡 Suggested patch
     # Get the underlying SDK client from the session
-    client = session.client
+    if not hasattr(session, "client"):
+        raise AttributeError("Planner provider session is missing 'client'")
+    client = session.client
+    if not hasattr(client, "__aenter__") or not hasattr(client, "__aexit__"):
+        raise TypeError(
+            f"Planner provider client must support async context manager, got {type(client)}"
+        )

Also applies to: 205-208

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/agents/planner.py` around lines 159 - 160, Before assigning and
using session.client in the async context, add defensive validation that session
has a client attribute and that the retrieved client object supports async
context management (i.e., has callable __aenter__ and __aexit__); if not, raise
a clear TypeError (or return an informative error) that mentions
"session.client" and the expected async context-manager interface so callers get
actionable diagnostics. Perform this check immediately around the code that sets
client = session.client and before any "async with client:" usage in this module
(also replicate the same validation for the second occurrence around the block
currently at lines 205-208) so both paths fail fast with a helpful message when
an adapter deviates.

Comment on lines +760 to +765
session = provider.create_session(
SessionConfig(
name="qa-fixer-session",
model=model,
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Propagate QA agent metadata in the non-Claude session path.

The non-Claude branch drops agent_type/thinking metadata, which risks inconsistent QA tool wiring versus the Claude path.

💡 Suggested patch
     else:
         session = provider.create_session(
             SessionConfig(
                 name="qa-fixer-session",
                 model=model,
+                extra={
+                    "agent_type": "qa_fixer",
+                    "max_thinking_tokens": max_thinking_tokens,
+                },
             )
         )

Based on learnings: When implementing agent code, ensure QA agents (qa_reviewer and qa_fixer) have automatic access to Electron MCP tools for E2E testing the frontend Electron app when ELECTRON_MCP_ENABLED=true.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/qa/fixer.py` around lines 760 - 765, The non-Claude session
creation via provider.create_session(SessionConfig(...)) is missing propagation
of QA agent metadata (agent_type and thinking) used in the Claude path; update
the non-Claude branch where SessionConfig is constructed to include the same
metadata keys (e.g., agent_type="qa_fixer" or "qa_reviewer" as appropriate and
thinking metadata) so QA agents have the same tool wiring; ensure this metadata
is passed into SessionConfig (or provider.create_session) when
ELECTRON_MCP_ENABLED may enable Electron MCP tools, matching the Claude-path
fields so the qa_reviewer/qa_fixer sessions get automatic access to Electron MCP
tools for E2E testing.

Comment on lines +100 to 102
[c for c in analysis.changes if c.change_type == ChangeType.ADD_CLASS]
# Depending on implementation, might detect class or its methods
assert len(analysis.changes) > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

test_python_class_detection no longer verifies class detection.

Line 100 computes a list and discards it, so this test passes even if no class addition is detected.

Suggested fix
-        # Should detect the Greeter class
-        [c for c in analysis.changes if c.change_type == ChangeType.ADD_CLASS]
-        # Depending on implementation, might detect class or its methods
-        assert len(analysis.changes) > 0
+        # Should detect the Greeter class
+        class_additions = [
+            c for c in analysis.changes if c.change_type == ChangeType.ADD_CLASS
+        ]
+        assert len(class_additions) >= 1

As per coding guidelines tests/**: “Ensure tests are comprehensive and follow pytest conventions.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_merge_semantic_analyzer.py` around lines 100 - 102, The test
currently computes [c for c in analysis.changes if c.change_type ==
ChangeType.ADD_CLASS] and discards it so it doesn’t actually assert that a class
was detected; update the test (test_python_class_detection) to capture that list
(e.g., added_classes = [...]) and assert on it (for example assert
len(added_classes) > 0 or assert any(...) to verify a class or its methods were
detected), using the existing analysis variable and ChangeType.ADD_CLASS symbol
so the test fails when no class addition is present.

Comment on lines +176 to +192
def test_ollama_no_api_key_required(self, test_env_ollama, monkeypatch):
"""Test that Ollama does not require an API key (local model)."""
temp_dir, spec_dir, project_dir = test_env_ollama

# Set provider without any API keys
monkeypatch.setenv("AI_ENGINE_PROVIDER", "litellm")
monkeypatch.setenv("LITELLM_MODEL", "ollama/llama3")

# Remove any API keys that might be set
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("GOOGLE_API_KEY", raising=False)
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)

# This should work fine for Ollama (no API key needed)
assert os.getenv("OPENAI_API_KEY") is None
assert os.getenv("GOOGLE_API_KEY") is None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

test_ollama_no_api_key_required should assert system behavior, not only env state.

Please assert provider creation/session behavior succeeds without API keys for Ollama, instead of only checking os.getenv(...) is None.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_ollama_provider_e2e.py` around lines 176 - 192, Replace the
environment-only assertions in test_ollama_no_api_key_required with assertions
that exercise the provider initialization and a minimal session/generation call:
using the test_env_ollama and monkeypatch fixtures, set AI_ENGINE_PROVIDER and
LITELLM_MODEL as you already do, then call the project's provider initialization
factory/function (the code path that creates the lite-llm/Ollama provider —
e.g., the create/get provider function your tests normally use) to instantiate
the provider/session and assert it returns a usable object and can produce a
minimal response (for example, verify provider is not None and
provider.generate()/session.create() returns a non-error result or expected
structure). Keep the existing env removals (monkeypatch.delenv) and add these
provider-level assertions so the test verifies behavior, not just environment
state.

Comment on lines +174 to +186
def test_openai_api_key_required(self, test_env_openai, monkeypatch):
"""Test that OpenAI API key is required for litellm provider."""
temp_dir, spec_dir, project_dir = test_env_openai

# Set provider but not API key
monkeypatch.setenv("AI_ENGINE_PROVIDER", "litellm")
monkeypatch.setenv("LITELLM_MODEL", "gpt-4")
monkeypatch.delenv("OPENAI_API_KEY", raising=False)

# This should log a warning or fail gracefully
# Actual implementation may vary
assert os.getenv("OPENAI_API_KEY") is None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

test_openai_api_key_required is asserting environment state, not behavior.

This test should assert the actual failure/validation path (e.g., provider creation or config validation), not only that OPENAI_API_KEY is unset.

As per coding guidelines: “tests/**: Ensure tests are comprehensive and follow pytest conventions. Check for proper mocking and test isolation.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_openai_provider_e2e.py` around lines 174 - 186, The test
test_openai_api_key_required currently only checks that OPENAI_API_KEY is unset;
instead, invoke the actual litellm provider initialization/validation code path
(e.g., call the provider factory or init function used elsewhere in tests such
as Provider.create, load_provider, or config.validate for the litellm provider)
and assert the expected failure: either that it raises a specific exception,
returns an error result, or emits a warning (use pytest.raises or caplog to
assert the message). Update test_openai_api_key_required to set
AI_ENGINE_PROVIDER and LITELLM_MODEL as before, remove the bare getenv
assertion, call the provider init/validation (the same function used by
runtime), and assert the correct behavior (exception type/message or logged
warning) to validate missing OPENAI_API_KEY handling.

Comment on lines +8 to +9
# Import the module to test - use direct path to avoid package imports
import importlib.util
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add defensive checks before module execution for clearer test collection failures.

The dynamic import path setup is fine for isolation, but add explicit validation for spec/spec.loader before execution so failures are deterministic and actionable instead of attribute errors during collection.

Proposed hardening
 spec = importlib.util.spec_from_file_location("pr_worktree_manager", module_path)
+if spec is None or spec.loader is None:
+    raise ImportError(f"Unable to load module spec for: {module_path}")
 pr_worktree_module = importlib.util.module_from_spec(spec)
 spec.loader.exec_module(pr_worktree_module)

Also applies to: 20-22

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_pr_worktree_manager.py` around lines 8 - 9, The dynamic import
using importlib.util should validate the spec and its loader before executing to
avoid obscure AttributeErrors during test collection; after calling
importlib.util.spec_from_file_location and before spec.loader.exec_module (and
before importlib.util.module_from_spec if you rely on the loader), check that
spec is not None and spec.loader is not None, and raise a clear RuntimeError (or
AssertionError) with a descriptive message if either is missing so failures are
deterministic and informative; apply the same checks around the other import
blocks referenced (lines ~20-22) that use spec/spec.loader.

Comment on lines +76 to +79
["git", "init", "--bare"],
cwd=origin_dir,
check=True,
capture_output=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use platform execution utilities instead of direct subprocess.run.

Line 76 executes git commands directly; this violates the repo rule requiring platform abstraction for command execution. Please route this through the shared execution utility used by backend/tests.

As per coding guidelines, "Use path joining utilities and execution utilities from the platform abstraction module instead of manual string concatenation for file paths and command execution".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_pr_worktree_manager.py` around lines 76 - 79, Replace the direct
subprocess.run([...], cwd=origin_dir, ...) call (search for subprocess.run and
origin_dir in tests/test_pr_worktree_manager.py) with the repository's platform
execution utility used by backend/tests (import the shared exec helper from the
tests/platform abstraction module) and invoke it to run "git init --bare" with
the same cwd; also use the platform/path joining utilities from that module to
construct any paths instead of manual concatenation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 1, 2026

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/backend/.env.example (1)

38-225: ⚠️ Potential issue | 🟡 Minor

Comprehensive multi-provider documentation. Past review addressed.

The multi-provider configuration documentation is thorough and well-structured. The past review comment requesting Zhipu GLM be added to the cost tracking list (line 219) has been properly addressed ✅.

However, there's a minor documentation gap: OpenRouter is mentioned as a provider option (lines 47, 57, with OPENROUTER_API_KEY configuration), but it's not included in the cost tracking provider list at line 219. This omission could confuse users who select OpenRouter and expect cost tracking.

Recommendation: Either add OpenRouter to the cost tracking list if supported, or add a brief note explaining why cost tracking is unavailable for OpenRouter (e.g., if it aggregates multiple providers and tracks costs differently).

📝 Suggested clarification

If OpenRouter supports cost tracking:

-# - Supports cost tracking for Claude, OpenAI, Google Gemini, Zhipu GLM, and Ollama (zero cost)
+# - Supports cost tracking for Claude, OpenAI, Google Gemini, Zhipu GLM, OpenRouter, and Ollama (zero cost)

Or, if OpenRouter doesn't support cost tracking, add a note:

 # - Supports cost tracking for Claude, OpenAI, Google Gemini, Zhipu GLM, and Ollama (zero cost)
+# - Note: OpenRouter cost tracking is handled by the underlying provider models
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/.env.example` around lines 38 - 225, The documentation mentions
OpenRouter as a provider (OPENROUTER_API_KEY / "OpenRouter") but omits it from
the cost tracking provider list; update the docs to either (A) add "OpenRouter"
to the cost tracking list if cost_tracking supports it (ensure you mention
OpenRouter in the same section that lists supported cost-tracked providers), or
(B) add a short note next to the cost tracking section clarifying that
OpenRouter is not cost-tracked (e.g., because it aggregates multiple backends)
so users know why it's excluded—update the multi-provider/cost tracking text and
reference "OPENROUTER_API_KEY" and "cost tracking" in the change for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@apps/backend/.env.example`:
- Around line 38-225: The documentation mentions OpenRouter as a provider
(OPENROUTER_API_KEY / "OpenRouter") but omits it from the cost tracking provider
list; update the docs to either (A) add "OpenRouter" to the cost tracking list
if cost_tracking supports it (ensure you mention OpenRouter in the same section
that lists supported cost-tracked providers), or (B) add a short note next to
the cost tracking section clarifying that OpenRouter is not cost-tracked (e.g.,
because it aggregates multiple backends) so users know why it's excluded—update
the multi-provider/cost tracking text and reference "OPENROUTER_API_KEY" and
"cost tracking" in the change for clarity.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b4d2477 and a41a92c.

📒 Files selected for processing (1)
  • apps/backend/.env.example

@OBenner OBenner merged commit 31d45c4 into develop Mar 1, 2026
19 checks passed
@OBenner OBenner deleted the auto-claude/046-multi-model-provider-support branch March 1, 2026 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant