feat: Auto-pin quality scoring, OpenRouter tier refactor and live usage sidebar #1332
Description
OpenRouter integration: per-model billing tier
… billing_tier flag, and stabilize generated config IDs across refreshes.
… billing_tier/anonymous_enabled; split anonymous flag into granular per-tier enables in YAML/env config.
… openrouter/free auto-select entry and drop its references from the LLM router's is_premium_model logic.
… /endpoints health data into the quality score used for auto-pin ranking.
… /endpoints health enrichment.
Auto-model pin service: quality-aware, tier-locked selection
… app/services/quality_score.py) with dedicated unit tests.
… 138 + service cleanup).
Chat streaming: preflight + early repin
… stream_new_chat.
… busy_mutex thread-lock management to prevent stale middleware interference, with new unit test.
Zero: user table + live usage meters
… 139_add_user_to_zero_publication.py to selectively replicate user usage metrics.
… userTable schema + queries.user.me() synced query to the Zero client.
… /users/me and token-status for fields now served live by Zero.
Documents
… documents-api.service.ts.
Motivation and Context
FIX #
Screenshots
API Changes
Change Type
Testing Performed
Checklist
High-level PR Summary
This PR implements an Auto (Fastest) model pinning system that intelligently selects and persists LLM models for chat threads. The system uses a quality-scoring algorithm (provider prestige, recency, pricing, context window, capabilities) combined with health monitoring from OpenRouter endpoints to rank available models. Key features include: per-thread model pinning that survives across turns, runtime rate-limit recovery with automatic failover to alternative models, health-based gating to exclude unreliable providers, and tiered selection (operator-curated YAML configs lock first for premium users, then dynamic OpenRouter models). The implementation also includes preflight health checks before expensive agent operations, a runtime cooldown mechanism to prevent immediate reselection of failed models, and comprehensive changes to OpenRouter integration including per-model tier derivation (free vs premium based on pricing signals) and stable deterministic config IDs that survive catalog churn. Additionally, there are frontend improvements for clickable file paths in markdown and a new API endpoint to resolve documents by virtual path.
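The selection flow summarized above (tier lock, health gating, runtime cooldown, sticky per-thread pin, quality ranking) can be illustrated with a minimal sketch. This is not the code in auto_model_pin_service.py: the names Candidate, CooldownRegistry and pick_model, the 60-second cooldown, and the idea of a single precomputed quality number are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view of one selectable model config."""
    config_id: int
    model: str
    tier: str        # "free" or "premium" (assumed tier labels)
    quality: float   # assumed precomputed quality score, higher is better
    healthy: bool    # assumed outcome of the health gate


class CooldownRegistry:
    """Remembers recently failed configs so they are not reselected at once."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._until: Dict[int, float] = {}

    def mark_failed(self, config_id: int) -> None:
        # Called after a runtime failure such as a rate limit.
        self._until[config_id] = self._clock() + self._ttl

    def is_cooling(self, config_id: int) -> bool:
        return self._clock() < self._until.get(config_id, float("-inf"))


def pick_model(candidates: Iterable[Candidate], tier: str,
               cooldowns: CooldownRegistry,
               pinned_id: Optional[int] = None) -> Optional[Candidate]:
    """Return the model to use for a thread, or None if nothing is eligible."""
    eligible = [
        c for c in candidates
        if c.tier == tier and c.healthy and not cooldowns.is_cooling(c.config_id)
    ]
    # Keep the existing pin while it remains eligible, so the choice
    # survives across turns.
    for c in eligible:
        if c.config_id == pinned_id:
            return c
    if not eligible:
        return None
    # Otherwise take the best score; ties go to the lower config_id so the
    # result is deterministic.
    return max(eligible, key=lambda c: (c.quality, -c.config_id))
```

In this sketch, failover after a rate limit amounts to calling mark_failed on the pinned config and then calling pick_model again, which falls through to the next-best eligible candidate until the cooldown expires.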
⏱️ Estimated Review Time: 3+ hours
💡 Review Order Suggestion
surfsense_backend/alembic/versions/138_add_thread_auto_model_pinning_fields.py
surfsense_backend/app/db.py
surfsense_backend/app/services/quality_score.py
surfsense_backend/app/services/openrouter_integration_service.py
surfsense_backend/app/config/__init__.py
surfsense_backend/app/services/auto_model_pin_service.py
surfsense_backend/app/services/llm_router_service.py
surfsense_backend/app/agents/new_chat/middleware/busy_mutex.py
surfsense_backend/app/tasks/chat/stream_new_chat.py
surfsense_backend/app/routes/search_spaces_routes.py
surfsense_backend/app/routes/documents_routes.py
surfsense_web/lib/apis/documents-api.service.ts
surfsense_web/components/assistant-ui/markdown-text.tsx
surfsense_web/components/layout/ui/shell/LayoutShell.tsx
surfsense_backend/app/config/global_llm_config.example.yaml
surfsense_backend/tests/unit/services/test_auto_model_pin_service.py
surfsense_backend/tests/unit/agents/new_chat/test_busy_mutex.py
surfsense_backend/tests/unit/test_stream_new_chat_contract.py
surfsense_backend/tests/unit/services/test_quality_score.py
surfsense_backend/tests/unit/services/test_openrouter_integration_service.py
surfsense_backend/tests/unit/services/test_or_health_enrichment.py
surfsense_backend/tests/unit/services/test_llm_router_pool_filter.py
surfsense_backend/tests/unit/services/test_openrouter_legacy_config.py
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Refactor