[codex] Add cost-aware task model router by OBenner · Pull Request #205 · OBenner/Auto-Coding

OBenner · 2026-04-29T13:57:12Z

Base Branch

This PR targets the develop branch (required for all feature/fix PRs)
This PR targets main (hotfix only - maintainers)

Description

Adds runtime, cost-aware model routing for agent subtasks. The router analyzes work type, predicted risks, file count, and task wording to select a low/medium/high complexity route, then maps that route to a configured provider/model.

Related Issue

Closes #

Type of Change

Area

Frontend
Backend
Fullstack

Commit Message Format

Follow conventional commits: <type>: <subject>

Types: feat, fix, docs, style, refactor, test, chore

Example: feat: add user authentication system

Checklist

I've synced with develop branch
I've tested my changes locally
I've followed the code principles (SOLID, DRY, KISS)
My PR is small and focused (< 400 lines ideally)
(Python only) All file operations specify encoding="utf-8" for text files

Platform Testing Checklist

CRITICAL: This project supports Windows, macOS, and Linux. Platform-specific bugs are a common source of breakage.

Windows tested (either on Windows or via CI)
macOS tested (either on macOS or via CI)
Linux tested (CI covers this)
Used centralized platform/ module instead of direct process.platform checks
No hardcoded paths (used Path / existing config resolution)

If you only have access to one OS: CI now tests on all platforms. Ensure all checks pass before submitting.

CI/Testing Requirements

All CI checks pass on all platforms (Windows, macOS, Linux)
All existing tests pass
New features include test coverage
Bug fixes include regression tests

Screenshots

N/A - backend-only change.

Feature Toggle

Behind localStorage flag: use_feature_name
Behind settings toggle
Behind environment variable/config
N/A - Feature is complete and ready for all users

Configured through apps/backend/config/model_routing.yaml and MODEL_ROUTER_{LEVEL}_PROVIDER / MODEL_ROUTER_{LEVEL}_MODEL environment overrides.

Breaking Changes

Breaking: No

Details:
create_agent_session() remains backward compatible. The new subtask parameter is optional, and explicit model overrides still bypass routing.

Validation

.venv/bin/python -m ruff --version - ruff 0.14.10
.venv/bin/python -m ruff check apps/backend/ tests/test_task_router.py - passed
.venv/bin/python -m ruff format apps/backend/ tests/test_task_router.py --check --diff - passed
.venv/bin/python -m pytest tests/test_provider_factory.py tests/test_task_router.py - 72 passed
.venv/bin/python -m py_compile apps/backend/core/providers/task_router.py apps/backend/core/providers/task_router_config.py apps/backend/core/providers/factory.py apps/backend/context/prioritizer.py tests/test_task_router.py - passed

Summary by CodeRabbit

Release Notes

New Features
- Tasks are now automatically routed to the most suitable AI model based on complexity analysis and resource requirements.
- Added cost estimation for task routing decisions.
Tests
- Expanded test coverage for task routing and model selection logic.

coderabbitai · 2026-04-29T13:57:27Z

📝 Walkthrough

Walkthrough

Adds a cost-aware task complexity router and YAML configuration that scores subtasks (risk, file counts, keywords, security bonus) into high/medium/low tiers and selects provider/model/max_tokens; integrates routing into session creation and includes tests and config-loading/validation logic.

Changes

Cohort / File(s)	Summary
Routing config `apps/backend/config/model_routing.yaml`	New YAML defining complexity thresholds, scoring weights, file-count rules, keyword signals, provider/model mappings per tier, `max_tokens`, default input/output token estimates, and env-var override patterns.
Routing implementation `apps/backend/core/providers/task_router.py`, `apps/backend/core/providers/task_router_config.py`	New `TaskComplexityRouter` and `TaskRoute` dataclass; computes normalized complexity score from base/risk/file/keyword/security signals, clamps and maps to low/medium/high, selects provider/model from config, estimates cost; config loader merges defaults with YAML, applies env overrides and validations.
Factory integration `apps/backend/core/providers/factory.py`	`create_agent_session(..., subtask: dict
Minor utility change `apps/backend/context/prioritizer.py`	Replaced direct `UTC` import with module-level alias `UTC = timezone.utc`.
Tests `tests/test_task_router.py`	New pytest suite covering scoring logic (clamping, thresholds, keyword matches), malformed/empty subtasks robustness, config loading/overrides/validation, and factory integration behavior.

Sequence Diagram

sequenceDiagram
    participant Factory as Agent Factory
    participant Router as TaskComplexityRouter
    participant Config as Routing Config Loader
    participant Models as Model Pricing/Registry

    Factory->>Router: route(subtask, agent_type)
    Router->>Config: load_model_routing_config()
    Config-->>Router: routing config (thresholds, mappings)
    Router->>Models: validate provider/model & pricing
    Router->>Router: compute complexity score (base, risk, files, keywords, security)
    Router->>Router: clamp score → map to tier (low/medium/high)
    Router->>Router: select provider/model and estimate cost
    Router-->>Factory: TaskRoute(provider, model, complexity, score, cost, reasoning)
    Factory->>Factory: apply route to ProviderConfig (clone & mutate)
    Factory-->>Factory: return configured session

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I counted risks and files with care,
I sniffed for keywords in the air.
I hopped through configs, chose a model true,
Routed each task to what it’s due.
Tests applaud my little hop and cue.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[codex] Add cost-aware task model router' directly and accurately summarizes the primary change: introducing a new cost-aware task routing system for model selection.
Docstring Coverage	✅ Passed	Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch codex/intelligent-model-router

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/config/model_routing.yaml`:
- Around line 19-25: Remove the duplicate "migrate" entry from the
architecture_keywords list in model_routing.yaml so each keyword is unique;
locate the architecture_keywords array (symbol: architecture_keywords) and
delete the redundant "migrate" string occurrence to avoid skewing scoring and
make tuning clearer.

In `@apps/backend/core/providers/factory.py`:
- Around line 41-42: The code uses str(getattr(route, "provider")) and
str(getattr(route, "model")), which triggers Ruff B009; replace these constant
getattr calls with direct attribute access (e.g., use str(route.provider) and
str(route.model) or simply route.provider and route.model if they are already
strings) to silence the lint error and preserve the same behavior for the
TaskRoute-like object.

In `@apps/backend/core/providers/task_router_config.py`:
- Around line 9-20: The import block at the top of the module is unsorted and
triggers Ruff I001; reorder and format imports into the standard groups (future,
standard library, third-party, local application) and sort each group
alphabetically, ensuring single blank lines between groups; specifically
reorganize the imports that include "from __future__ import annotations", stdlib
names (copy, logging, os, pathlib.Path, typing.Any), third-party names (yaml),
and local providers (from core.providers.config import AIEngineProvider and from
core.providers.cost_calculator import MODEL_PRICING) to comply with isort/Ruff
rules (or run isort/ruff --fix) so the linter passes.
- Around line 177-180: The YAML load can raise yaml.YAMLError but callers expect
ValueError per the function docstring; wrap the yaml.safe_load(config_file) call
in a try/except that catches yaml.YAMLError (and optionally MarkedYAMLError) and
re-raise a ValueError with a clear message including the path to the config file
so the contract is honored; keep the rest of the logic (assigning to loaded and
the isinstance check) unchanged and reference the existing variables
`config_file`, `loaded`, and `path` when creating the error message.

In `@apps/backend/core/providers/task_router.py`:
- Around line 155-162: The substring checks on description cause false
positives; update the any(...) checks that compare description against
architecture_keywords and trivial_keywords to use boundary-aware matching (e.g.,
regex word boundaries or tokenization) instead of simple "in" substring checks
so terms like "platform" won't match "format"; specifically replace the
predicates in the blocks referencing description, architecture_keywords,
trivial_keywords and the scoring updates using architecture_keyword_score and
trivial_keyword_score to use re.search(r'\b' + re.escape(keyword) + r'\b',
description) (or equivalent token-based lookup) before adjusting score.
- Around line 206-208: The current count uses len(files_to_modify) +
len(files_to_create) which double-counts paths appearing in both lists; update
the logic in the function handling subtask (referencing files_to_modify and
files_to_create) to build a deduplicated set of file paths from both lists
(e.g., combine them and take set()) and return the length of that set so shared
paths are counted only once; ensure you still handle missing keys/default empty
lists as before.

In `@tests/test_task_router.py`:
- Around line 23-124: The tests rely on default MODEL_ROUTER_* behavior but
don’t clear environment overrides, so add an autouse pytest fixture (e.g.,
isolate_model_router_env) that uses monkeypatch to remove any environment keys
starting with "MODEL_ROUTER_" before each test runs; implement it in the test
module and ensure it runs automatically (autouse=True) so TaskComplexityRouter
instantiations in tests use the intended defaults and the route/score assertions
remain deterministic.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b1124ae9-4f27-4861-abdd-082f3f7f6508

📥 Commits

Reviewing files that changed from the base of the PR and between 71a78c2 and 7081309.

📒 Files selected for processing (6)

apps/backend/config/model_routing.yaml
apps/backend/context/prioritizer.py
apps/backend/core/providers/factory.py
apps/backend/core/providers/task_router.py
apps/backend/core/providers/task_router_config.py
tests/test_task_router.py

coderabbitai · 2026-04-29T14:02:42Z

+  architecture_keywords:
+    - "architect"
+    - "design"
+    - "refactor"
+    - "migrate"
+    - "migrate"
+    - "rewrite"


🧹 Nitpick | 🔵 Trivial

Remove duplicate keyword entry in architecture_keywords.

Line 24 repeats "migrate", which is redundant and makes score tuning harder to reason about.

Suggested cleanup

architecture_keywords: - "architect" - "design" - "refactor" - "migrate" - - "migrate" - "rewrite" - "restructure" - "overhaul"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/backend/config/model_routing.yaml` around lines 19 - 25, Remove the duplicate "migrate" entry from the architecture_keywords list in model_routing.yaml so each keyword is unique; locate the architecture_keywords array (symbol: architecture_keywords) and delete the redundant "migrate" string occurrence to avoid skewing scoring and make tuning clearer.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 176-183: The loader currently opens the file with path.open(...)
and may let OSError (e.g., permission denied) bubble out, violating the
documented config-load contract; modify the block around path.open and
yaml.safe_load in the function that reads the model routing config so that any
OSError (or other IO-related exceptions from opening/reading the file) is caught
and re-raised as a ValueError with a clear message like "Cannot read model
routing config: {path}" while preserving the original exception as the __cause__
(raise ... from exc), keeping the existing yaml.YAMLError handling and the check
that loaded is a dict.

In `@apps/backend/core/providers/task_router.py`:
- Around line 66-68: In route(), don't assume subtask is dict-like: validate
that the incoming subtask is a mapping (e.g., isinstance(subtask, dict) or has
required attributes) before passing to detect_work_type and
self.risk_analyzer.analyze_subtask_risks; if it's malformed, replace it with an
empty dict (safe_subtask = {}), optionally log a warning/error referencing the
original value, and then call detect_work_type(safe_subtask) and
self.risk_analyzer.analyze_subtask_risks(safe_subtask) to avoid AttributeError
from downstream analyzers.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ac303e77-55f7-4e7a-b2f1-6d603f78d225

📥 Commits

Reviewing files that changed from the base of the PR and between 7081309 and 0fb43aa.

📒 Files selected for processing (6)

apps/backend/config/model_routing.yaml
apps/backend/context/prioritizer.py
apps/backend/core/providers/factory.py
apps/backend/core/providers/task_router.py
apps/backend/core/providers/task_router_config.py
tests/test_task_router.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 130-155: The _validate_config() routine currently checks provider
and model independently and assumes routing and complexity_thresholds are
mappings; update _validate_config to first assert that config.get("routing") and
config.get("complexity_thresholds") are dicts (raise ValueError if not), then
for each complexity level ("high","medium","low") assert routing[level] is a
dict and contains both "provider" and "model", normalize provider aliases (e.g.,
map "claude" -> "anthropic") and lowercase them, verify the provider is in
AIEngineProvider (use AIEngineProvider values set), verify model exists in
MODEL_PRICING, and finally ensure the MODEL_PRICING entry for that model maps to
the same normalized provider (so invalid pairs like provider: google + model:
gpt-4o fail at load time); also validate complexity_thresholds entries are
numeric and that high > medium > low.

In `@tests/test_task_router.py`:
- Around line 287-370: Add a test that verifies passing an explicit model to
create_agent_session bypasses routing: call factory.create_agent_session with
subtask={"..."} and model="gpt-4o-mini", patch ProviderConfig.from_env, patch
factory.create_engine_provider to return a DummyProvider, and patch
TaskComplexityRouter (or assert its constructor/methods are not called); then
assert create_engine_provider was called with a ProviderConfig whose
openai_model (or model field) equals the explicit model and that the returned
session.config.model equals that model, and assert TaskComplexityRouter was not
invoked (use symbols create_agent_session, ProviderConfig.from_env,
create_engine_provider, TaskComplexityRouter to locate code).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 97d5949a-a288-4c54-ab11-a897443a129e

📥 Commits

Reviewing files that changed from the base of the PR and between 0fb43aa and f0c23e6.

📒 Files selected for processing (3)

apps/backend/core/providers/task_router.py
apps/backend/core/providers/task_router_config.py
tests/test_task_router.py

sonarqubecloud · 2026-04-29T18:30:22Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions Bot added area/backend size/L labels Apr 29, 2026

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Add cost-aware task model router

a5e641e

OBenner force-pushed the codex/intelligent-model-router branch from 7081309 to a5e641e Compare April 29, 2026 14:27

OBenner changed the title ~~Add cost-aware task model router~~ [codex] Add cost-aware task model router Apr 29, 2026

Fix model router review feedback

0fb43aa

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread apps/backend/core/providers/task_router_config.py

Comment thread apps/backend/core/providers/task_router.py Outdated

fix: harden model router edge cases

f0c23e6

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread apps/backend/core/providers/task_router_config.py

Comment thread tests/test_task_router.py

fix: validate model router config strictly

96d3785

github-actions Bot added size/XL and removed size/L labels Apr 29, 2026

OBenner merged commit b2477f7 into develop Apr 29, 2026
19 checks passed

OBenner deleted the codex/intelligent-model-router branch April 29, 2026 18:34

Conversation

OBenner commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Base Branch

Description

Related Issue

Type of Change

Area

Commit Message Format

Checklist

Platform Testing Checklist

CI/Testing Requirements

Screenshots

Feature Toggle

Breaking Changes

Validation

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud Bot commented Apr 29, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

OBenner commented Apr 29, 2026 •

edited

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading