Skip to content

[codex] Add cost-aware task model router#205

Merged
OBenner merged 4 commits intodevelopfrom
codex/intelligent-model-router
Apr 29, 2026
Merged

[codex] Add cost-aware task model router#205
OBenner merged 4 commits intodevelopfrom
codex/intelligent-model-router

Conversation

@OBenner
Copy link
Copy Markdown
Owner

@OBenner OBenner commented Apr 29, 2026

Base Branch

  • This PR targets the develop branch (required for all feature/fix PRs)
  • This PR targets main (hotfix only - maintainers)

Description

Adds runtime, cost-aware model routing for agent subtasks. The router analyzes work type, predicted risks, file count, and task wording to select a low/medium/high complexity route, then maps that route to a configured provider/model.

Related Issue

Closes #

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 📚 Documentation
  • ♻️ Refactor
  • 🧪 Test

Area

  • Frontend
  • Backend
  • Fullstack

Commit Message Format

Follow conventional commits: <type>: <subject>

Types: feat, fix, docs, style, refactor, test, chore

Example: feat: add user authentication system

Checklist

  • I've synced with develop branch
  • I've tested my changes locally
  • I've followed the code principles (SOLID, DRY, KISS)
  • My PR is small and focused (< 400 lines ideally)
  • (Python only) All file operations specify encoding="utf-8" for text files

Platform Testing Checklist

CRITICAL: This project supports Windows, macOS, and Linux. Platform-specific bugs are a common source of breakage.

  • Windows tested (either on Windows or via CI)
  • macOS tested (either on macOS or via CI)
  • Linux tested (CI covers this)
  • Used centralized platform/ module instead of direct process.platform checks
  • No hardcoded paths (used Path / existing config resolution)

If you only have access to one OS: CI now tests on all platforms. Ensure all checks pass before submitting.

CI/Testing Requirements

  • All CI checks pass on all platforms (Windows, macOS, Linux)
  • All existing tests pass
  • New features include test coverage
  • Bug fixes include regression tests

Screenshots

N/A - backend-only change.

Feature Toggle

  • Behind localStorage flag: use_feature_name
  • Behind settings toggle
  • Behind environment variable/config
  • N/A - Feature is complete and ready for all users

Configured through apps/backend/config/model_routing.yaml and MODEL_ROUTER_{LEVEL}_PROVIDER / MODEL_ROUTER_{LEVEL}_MODEL environment overrides.

Breaking Changes

Breaking: No

Details:
create_agent_session() remains backward compatible. The new subtask parameter is optional, and explicit model overrides still bypass routing.

Validation

  • .venv/bin/python -m ruff --version - ruff 0.14.10
  • .venv/bin/python -m ruff check apps/backend/ tests/test_task_router.py - passed
  • .venv/bin/python -m ruff format apps/backend/ tests/test_task_router.py --check --diff - passed
  • .venv/bin/python -m pytest tests/test_provider_factory.py tests/test_task_router.py - 72 passed
  • .venv/bin/python -m py_compile apps/backend/core/providers/task_router.py apps/backend/core/providers/task_router_config.py apps/backend/core/providers/factory.py apps/backend/context/prioritizer.py tests/test_task_router.py - passed

Summary by CodeRabbit

Release Notes

  • New Features

    • Tasks are now automatically routed to the most suitable AI model based on complexity analysis and resource requirements.
    • Added cost estimation for task routing decisions.
  • Tests

    • Expanded test coverage for task routing and model selection logic.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

Adds a cost-aware task complexity router and YAML configuration that scores subtasks (risk, file counts, keywords, security bonus) into high/medium/low tiers and selects provider/model/max_tokens; integrates routing into session creation and includes tests and config-loading/validation logic.

Changes

Cohort / File(s) Summary
Routing config
apps/backend/config/model_routing.yaml
New YAML defining complexity thresholds, scoring weights, file-count rules, keyword signals, provider/model mappings per tier, max_tokens, default input/output token estimates, and env-var override patterns.
Routing implementation
apps/backend/core/providers/task_router.py, apps/backend/core/providers/task_router_config.py
New TaskComplexityRouter and TaskRoute dataclass; computes normalized complexity score from base/risk/file/keyword/security signals, clamps and maps to low/medium/high, selects provider/model from config, estimates cost; config loader merges defaults with YAML, applies env overrides and validations.
Factory integration
apps/backend/core/providers/factory.py
`create_agent_session(..., subtask: dict
Minor utility change
apps/backend/context/prioritizer.py
Replaced direct UTC import with module-level alias UTC = timezone.utc.
Tests
tests/test_task_router.py
New pytest suite covering scoring logic (clamping, thresholds, keyword matches), malformed/empty subtasks robustness, config loading/overrides/validation, and factory integration behavior.

Sequence Diagram

sequenceDiagram
    participant Factory as Agent Factory
    participant Router as TaskComplexityRouter
    participant Config as Routing Config Loader
    participant Models as Model Pricing/Registry

    Factory->>Router: route(subtask, agent_type)
    Router->>Config: load_model_routing_config()
    Config-->>Router: routing config (thresholds, mappings)
    Router->>Models: validate provider/model & pricing
    Router->>Router: compute complexity score (base, risk, files, keywords, security)
    Router->>Router: clamp score → map to tier (low/medium/high)
    Router->>Router: select provider/model and estimate cost
    Router-->>Factory: TaskRoute(provider, model, complexity, score, cost, reasoning)
    Factory->>Factory: apply route to ProviderConfig (clone & mutate)
    Factory-->>Factory: return configured session
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I counted risks and files with care,
I sniffed for keywords in the air.
I hopped through configs, chose a model true,
Routed each task to what it’s due.
Tests applaud my little hop and cue.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title '[codex] Add cost-aware task model router' directly and accurately summarizes the primary change: introducing a new cost-aware task routing system for model selection.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/intelligent-model-router

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/config/model_routing.yaml`:
- Around line 19-25: Remove the duplicate "migrate" entry from the
architecture_keywords list in model_routing.yaml so each keyword is unique;
locate the architecture_keywords array (symbol: architecture_keywords) and
delete the redundant "migrate" string occurrence to avoid skewing scoring and
make tuning clearer.

In `@apps/backend/core/providers/factory.py`:
- Around line 41-42: The code uses str(getattr(route, "provider")) and
str(getattr(route, "model")), which triggers Ruff B009; replace these constant
getattr calls with direct attribute access (e.g., use str(route.provider) and
str(route.model) or simply route.provider and route.model if they are already
strings) to silence the lint error and preserve the same behavior for the
TaskRoute-like object.

In `@apps/backend/core/providers/task_router_config.py`:
- Around line 9-20: The import block at the top of the module is unsorted and
triggers Ruff I001; reorder and format imports into the standard groups (future,
standard library, third-party, local application) and sort each group
alphabetically, ensuring single blank lines between groups; specifically
reorganize the imports that include "from __future__ import annotations", stdlib
names (copy, logging, os, pathlib.Path, typing.Any), third-party names (yaml),
and local providers (from core.providers.config import AIEngineProvider and from
core.providers.cost_calculator import MODEL_PRICING) to comply with isort/Ruff
rules (or run isort/ruff --fix) so the linter passes.
- Around line 177-180: The YAML load can raise yaml.YAMLError but callers expect
ValueError per the function docstring; wrap the yaml.safe_load(config_file) call
in a try/except that catches yaml.YAMLError (and optionally MarkedYAMLError) and
re-raise a ValueError with a clear message including the path to the config file
so the contract is honored; keep the rest of the logic (assigning to loaded and
the isinstance check) unchanged and reference the existing variables
`config_file`, `loaded`, and `path` when creating the error message.

In `@apps/backend/core/providers/task_router.py`:
- Around line 155-162: The substring checks on description cause false
positives; update the any(...) checks that compare description against
architecture_keywords and trivial_keywords to use boundary-aware matching (e.g.,
regex word boundaries or tokenization) instead of simple "in" substring checks
so terms like "platform" won't match "format"; specifically replace the
predicates in the blocks referencing description, architecture_keywords,
trivial_keywords and the scoring updates using architecture_keyword_score and
trivial_keyword_score to use re.search(r'\b' + re.escape(keyword) + r'\b',
description) (or equivalent token-based lookup) before adjusting score.
- Around line 206-208: The current count uses len(files_to_modify) +
len(files_to_create) which double-counts paths appearing in both lists; update
the logic in the function handling subtask (referencing files_to_modify and
files_to_create) to build a deduplicated set of file paths from both lists
(e.g., combine them and take set()) and return the length of that set so shared
paths are counted only once; ensure you still handle missing keys/default empty
lists as before.

In `@tests/test_task_router.py`:
- Around line 23-124: The tests rely on default MODEL_ROUTER_* behavior but
don’t clear environment overrides, so add an autouse pytest fixture (e.g.,
isolate_model_router_env) that uses monkeypatch to remove any environment keys
starting with "MODEL_ROUTER_" before each test runs; implement it in the test
module and ensure it runs automatically (autouse=True) so TaskComplexityRouter
instantiations in tests use the intended defaults and the route/score assertions
remain deterministic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b1124ae9-4f27-4861-abdd-082f3f7f6508

📥 Commits

Reviewing files that changed from the base of the PR and between 71a78c2 and 7081309.

📒 Files selected for processing (6)
  • apps/backend/config/model_routing.yaml
  • apps/backend/context/prioritizer.py
  • apps/backend/core/providers/factory.py
  • apps/backend/core/providers/task_router.py
  • apps/backend/core/providers/task_router_config.py
  • tests/test_task_router.py

Comment on lines +19 to +25
architecture_keywords:
- "architect"
- "design"
- "refactor"
- "migrate"
- "migrate"
- "rewrite"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Remove duplicate keyword entry in architecture_keywords.

Line 24 repeats "migrate", which is redundant and makes score tuning harder to reason about.

Suggested cleanup
   architecture_keywords:
     - "architect"
     - "design"
     - "refactor"
     - "migrate"
-    - "migrate"
     - "rewrite"
     - "restructure"
     - "overhaul"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/config/model_routing.yaml` around lines 19 - 25, Remove the
duplicate "migrate" entry from the architecture_keywords list in
model_routing.yaml so each keyword is unique; locate the architecture_keywords
array (symbol: architecture_keywords) and delete the redundant "migrate" string
occurrence to avoid skewing scoring and make tuning clearer.

Comment thread apps/backend/core/providers/factory.py Outdated
Comment thread apps/backend/core/providers/task_router_config.py
Comment thread apps/backend/core/providers/task_router_config.py Outdated
Comment thread apps/backend/core/providers/task_router.py
Comment thread apps/backend/core/providers/task_router.py Outdated
Comment thread tests/test_task_router.py
@OBenner OBenner force-pushed the codex/intelligent-model-router branch from 7081309 to a5e641e Compare April 29, 2026 14:27
@OBenner OBenner changed the title Add cost-aware task model router [codex] Add cost-aware task model router Apr 29, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 176-183: The loader currently opens the file with path.open(...)
and may let OSError (e.g., permission denied) bubble out, violating the
documented config-load contract; modify the block around path.open and
yaml.safe_load in the function that reads the model routing config so that any
OSError (or other IO-related exceptions from opening/reading the file) is caught
and re-raised as a ValueError with a clear message like "Cannot read model
routing config: {path}" while preserving the original exception as the __cause__
(raise ... from exc), keeping the existing yaml.YAMLError handling and the check
that loaded is a dict.

In `@apps/backend/core/providers/task_router.py`:
- Around line 66-68: In route(), don't assume subtask is dict-like: validate
that the incoming subtask is a mapping (e.g., isinstance(subtask, dict) or has
required attributes) before passing to detect_work_type and
self.risk_analyzer.analyze_subtask_risks; if it's malformed, replace it with an
empty dict (safe_subtask = {}), optionally log a warning/error referencing the
original value, and then call detect_work_type(safe_subtask) and
self.risk_analyzer.analyze_subtask_risks(safe_subtask) to avoid AttributeError
from downstream analyzers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ac303e77-55f7-4e7a-b2f1-6d603f78d225

📥 Commits

Reviewing files that changed from the base of the PR and between 7081309 and 0fb43aa.

📒 Files selected for processing (6)
  • apps/backend/config/model_routing.yaml
  • apps/backend/context/prioritizer.py
  • apps/backend/core/providers/factory.py
  • apps/backend/core/providers/task_router.py
  • apps/backend/core/providers/task_router_config.py
  • tests/test_task_router.py

Comment thread apps/backend/core/providers/task_router_config.py
Comment thread apps/backend/core/providers/task_router.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 130-155: The _validate_config() routine currently checks provider
and model independently and assumes routing and complexity_thresholds are
mappings; update _validate_config to first assert that config.get("routing") and
config.get("complexity_thresholds") are dicts (raise ValueError if not), then
for each complexity level ("high","medium","low") assert routing[level] is a
dict and contains both "provider" and "model", normalize provider aliases (e.g.,
map "claude" -> "anthropic") and lowercase them, verify the provider is in
AIEngineProvider (use AIEngineProvider values set), verify model exists in
MODEL_PRICING, and finally ensure the MODEL_PRICING entry for that model maps to
the same normalized provider (so invalid pairs like provider: google + model:
gpt-4o fail at load time); also validate complexity_thresholds entries are
numeric and that high > medium > low.

In `@tests/test_task_router.py`:
- Around line 287-370: Add a test that verifies passing an explicit model to
create_agent_session bypasses routing: call factory.create_agent_session with
subtask={"..."} and model="gpt-4o-mini", patch ProviderConfig.from_env, patch
factory.create_engine_provider to return a DummyProvider, and patch
TaskComplexityRouter (or assert its constructor/methods are not called); then
assert create_engine_provider was called with a ProviderConfig whose
openai_model (or model field) equals the explicit model and that the returned
session.config.model equals that model, and assert TaskComplexityRouter was not
invoked (use symbols create_agent_session, ProviderConfig.from_env,
create_engine_provider, TaskComplexityRouter to locate code).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 97d5949a-a288-4c54-ab11-a897443a129e

📥 Commits

Reviewing files that changed from the base of the PR and between 0fb43aa and f0c23e6.

📒 Files selected for processing (3)
  • apps/backend/core/providers/task_router.py
  • apps/backend/core/providers/task_router_config.py
  • tests/test_task_router.py

Comment thread apps/backend/core/providers/task_router_config.py
Comment thread tests/test_task_router.py
@github-actions github-actions Bot added size/XL and removed size/L labels Apr 29, 2026
@sonarqubecloud
Copy link
Copy Markdown

@OBenner OBenner merged commit b2477f7 into develop Apr 29, 2026
19 checks passed
@OBenner OBenner deleted the codex/intelligent-model-router branch April 29, 2026 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant