Add capability-aware routing & failover (tools/vision/reasoning/json) by BillJr99 · Pull Request #33 · BillJr99/llmproxy

BillJr99 · 2026-05-29T12:45:05Z

Summary

Adds capability-aware routing and failover to llmproxy. Requests are routed preferentially to models tagged with the capabilities they need (tools, vision, reasoning, json), and the proxy fails over to the next candidate when a model that claims a capability returns a response that does not deliver it.

Key Changes

Core Capability Detection & Routing

Capability detectors: Implemented pure detector functions for each capability:
- _request_has_tools(), _tool_use_forced(), _response_has_tool_call() for tools
- _request_has_image() for vision
- _request_wants_reasoning() for reasoning
- _request_wants_json(), _response_is_json() for json
Capability metadata: Added _CAPABILITIES dict mapping each capability to its request detector, strict detector (for forced cases), and response validator
Model tagging: New model_capabilities config field allows tagging models with supported capabilities (case-insensitive, auto-populated from scrapers)
Proactive ordering: _order_by_capability() reorders candidates so models supporting needed capabilities are tried first, with fallback to unknown-capability models
Reactive failover: Modified _proxy_cycling_non_streaming() to detect when a 200 response failed to deliver a forced capability (e.g., tool_choice: "required" but no tool calls) and automatically try the next candidate

Virtual Endpoints

Added _CAPABILITY_VIRTUALS constant defining capability-based virtual models
New virtual endpoints: llmproxy__tools, llmproxy__vision, llmproxy__tools/free, llmproxy__vision/free (and legacy llmproxy/ forms)
Implemented _get_capability_model_candidates() and _get_capability_free_candidates() selectors
Updated _get_virtual_candidates() to dispatch capability-based virtual models
Virtual models appear in /v1/models list when at least one model is tagged with that capability

Configuration & Setup

Added model_capabilities field to config schema (optional, top-level object)
Setup wizard now includes "Tag model with capabilities" and "Remove capability tag" menu options
Defensive parsing: _model_capabilities() handles missing/malformed config gracefully

Scraper Integration

OpenRouter source now extracts capabilities from supported_parameters (tools/reasoning/structured_outputs) and architecture.input_modalities (image → vision)
Scraper aggregation merges capabilities from high-confidence sources
apply_updates() stores capabilities only for free-tier models (parallel to free_limits)
Capabilities are dropped when models are removed from the free set
Config sync is add-only: user-set capabilities are never overwritten by scraper updates
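
The OpenRouter mapping and the add-only sync might look roughly like this. Both functions are hypothetical helpers, not the PR's code. Two assumptions are worth flagging: structured_outputs is mapped to the json capability (the description lists the parameter but not the target tag), and "add-only" is interpreted as "add entries for models that have none, leave existing entries untouched".

```python
from typing import Any, Dict, Iterable, List, Set


def capabilities_from_openrouter(entry: Dict[str, Any]) -> Set[str]:
    """Derive capability tags from one OpenRouter model entry."""
    params = set(entry.get("supported_parameters") or [])
    modalities = set((entry.get("architecture") or {}).get("input_modalities") or [])
    caps: Set[str] = set()
    if "tools" in params:
        caps.add("tools")
    if "reasoning" in params:
        caps.add("reasoning")
    if "structured_outputs" in params:
        caps.add("json")  # assumption: structured outputs imply the json capability
    if "image" in modalities:
        caps.add("vision")
    return caps


def merge_add_only(
    existing: Dict[str, List[str]], scraped: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Add scraped tags for models with no entry; never touch existing entries."""
    merged = {model: list(caps) for model, caps in existing.items()}
    for model, caps in scraped.items():
        if model not in merged:
            merged[model] = sorted(caps)
    return merged
```

Leaving existing entries untouched means a tag that a user removed by hand is not re-added on the next scrape, at the cost of not picking up newly discovered capabilities for models that are already listed.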

Testing

Comprehensive test suite in tests/test_capabilities.py covering:
- All detector functions (tools, vision, reasoning, json)
- Capability ordering and failover logic
- Virtual endpoint dispatch
- Defensive config parsing
Scraper tests verify capabilities are only stored for free models and dropped on removal

Notable Implementation Details

Safe defaults: Response validators return True on malformed JSON or unexpected shapes to avoid spurious failover on unparseable responses
Streaming limitation: Reactive 200-body capability checks only apply to non-streaming requests; streaming still benefits from proactive ordering but cannot inspect delta chunks without buffering
Tool choice semantics: tool_choice: "auto" and tool_choice: "none" never trigger failover, even when the response contains no tool calls (the model may legitimately answer without tools)
Stable reordering: Capability ordering never drops candidates, so incomplete metadata never causes hard failures
Provider/model lookup: Capability lookups support both bare model IDs and full provider/model forms, matching existing model_reasoning behavior
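
The stable reordering and the two lookup forms can be sketched together. This is a minimal illustration, not the PR's _order_by_capability(): caps_for() and order_by_capability() are hypothetical names, the provider prefix is assumed to be everything before the first "/", and placing models that are tagged but lack a needed capability last is an assumption (the description only says supporting models go first, followed by unknown ones).

```python
from typing import Dict, Iterable, List, Optional, Set


def caps_for(model_id: str, model_capabilities: Dict[str, Iterable[str]]) -> Optional[Set[str]]:
    """Look up tags by full provider/model ID first, then by bare model ID."""
    tags = model_capabilities.get(model_id)
    if tags is None and "/" in model_id:
        tags = model_capabilities.get(model_id.split("/", 1)[1])
    return None if tags is None else {t.lower() for t in tags}


def order_by_capability(
    candidates: List[str], needed: Set[str], model_capabilities: Dict[str, Iterable[str]]
) -> List[str]:
    """Reorder candidates without dropping any: supporting, then unknown, then lacking."""
    if not needed:
        return list(candidates)

    def rank(model_id: str) -> int:
        caps = caps_for(model_id, model_capabilities)
        if caps is None:
            return 1  # untagged model: capability unknown
        return 0 if needed <= caps else 2

    # sorted() is stable, so the original order is preserved within each rank.
    return sorted(candidates, key=rank)
```

Because the function only sorts, the output always contains every input candidate, so a model with missing or wrong tags is tried later rather than excluded.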

https://claude.ai/code/session_019YMQmPWsAUtALVqqY9FHPo

Virtual models previously only failed over on HTTP status >= 400, so a free model that returns 200 while silently ignoring tools/function calls looked like a success and broke tool-using clients.

This adds a general model-capability framework (tools / vision / reasoning / json):

- Runtime (server.py): a capability registry with request detectors, strict detectors, and response validators. Virtual-model requests now (1) proactively prefer candidates that support the needed capabilities via a stable reorder that never drops candidates, and (2) reactively fail over when a *forced* capability isn't delivered by a 200 (tools forced but no tool_calls; JSON mode but non-JSON body). Reactive 200-body detection is non-streaming only; vision/reasoning rely on existing HTTP-error failover.
- New capability virtual endpoints: llmproxy__tools, llmproxy__tools/free, llmproxy__vision, llmproxy__vision/free (plus legacy llmproxy/ forms), advertised in /v1/models when backing candidates exist, with config hints.
- New optional model_capabilities config map (model -> [caps]), threaded through the scraper pipeline (Evidence, OpenRouter supported_parameters/input_modalities, aggregate/apply_updates/regenerate/reconcile), providers.py, config.example.json, and the setup wizard (tag/remove/view + auto-populate). Empty = full backward compat; reconcile is add-only so hand-set tags are never pruned.
- Docs: README capability section + endpoints + model_capabilities field.
- Tests: detectors, ordering, reactive failover, virtual dispatch/hints, and pipeline (OpenRouter mapping, apply_updates, add-only reconcile, wizard shape).

https://claude.ai/code/session_019YMQmPWsAUtALVqqY9FHPo
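
The reactive non-streaming path described in this commit message can be sketched as a loop over candidates. This is a simplified illustration, not _proxy_cycling_non_streaming() itself: cycle_non_streaming(), the send callable, and the forced_checks parameter are hypothetical, and returning the last attempt when every candidate fails is an assumption.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical transport: sends the request body to one model, returns (status, parsed body).
Send = Callable[[str, Dict[str, Any]], Tuple[int, Any]]
Attempt = Tuple[str, int, Any]


def cycle_non_streaming(
    candidates: List[str],
    body: Dict[str, Any],
    send: Send,
    forced_checks: List[Callable[[Any], bool]],
) -> Optional[Attempt]:
    """Try candidates in order; accept the first 200 that passes every forced check."""
    last: Optional[Attempt] = None
    for model in candidates:
        status, resp = send(model, body)
        last = (model, status, resp)
        if status >= 400:
            continue  # existing behavior: fail over on HTTP errors
        if all(check(resp) for check in forced_checks):
            return last  # success: every forced capability was delivered
        # A 200 that ignored a forced capability is treated like a failure.
    return last  # assumption: surface the final attempt if nothing succeeded
```

Here forced_checks would hold the response validators for only those capabilities the request forced, so a request with tool_choice: "auto" passes an empty list and the loop behaves exactly like the status-only failover.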

BillJr99 merged commit e97e21e into main May 29, 2026
4 checks passed

BillJr99 merged 1 commit into main from claude/hermes-free-models-tool-calls-24M2k
