feat: Auto-pin quality scoring, OpenRouter tier refactor and live usage sidebar #1332

Merged
MODSetter merged 26 commits into MODSetter:dev from AnishSarkar22:feat/model-pinnning-mode
May 1, 2026

Conversation

@AnishSarkar22
Contributor

@AnishSarkar22 AnishSarkar22 commented May 1, 2026

Description

OpenRouter integration: per-model billing tier

  • Derive billing tier per-model from catalogue pricing instead of a global billing_tier flag, and stabilize generated config IDs across refreshes.
  • Deprecate global billing_tier / anonymous_enabled; split anonymous flag into granular per-tier enables in YAML/env config.
  • Remove the virtual openrouter/free auto-select entry and drop its references from the LLM router's is_premium_model logic.
  • Blend per-model /endpoints health data into the quality score used for auto-pin ranking.
  • Clear healthy-status cache on catalogue refresh so new data flows through immediately.
  • Add unit tests covering the pool filter, per-model tier derivation, legacy-config deprecation warnings, and /endpoints health enrichment.
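
The per-model tier derivation and stable config IDs listed above could look roughly like the sketch below. It assumes each catalogue entry carries a `pricing` mapping with string `prompt` and `completion` prices; the function names, the "free"/"premium" return values, and the hash-based ID scheme are illustrative assumptions, not the code in this PR.

```python
import hashlib
from decimal import Decimal, InvalidOperation


def derive_tier(pricing: dict) -> str:
    """Return "free" only when both prompt and completion prices parse to zero."""
    try:
        prompt = Decimal(str(pricing.get("prompt", "")))
        completion = Decimal(str(pricing.get("completion", "")))
    except InvalidOperation:
        # Missing or malformed pricing: fall back to "premium" so a paid
        # model is never exposed on the free tier by accident.
        return "premium"
    return "free" if prompt == 0 and completion == 0 else "premium"


def stable_config_id(model_id: str) -> int:
    """Hypothetical ID scheme: derive the ID from the model slug alone,
    so it does not change when the catalogue is reordered or refreshed."""
    digest = hashlib.sha256(model_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")
```

Deriving the ID from the slug rather than from list position is one way to keep IDs fixed across refreshes; the PR may use a different scheme.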

Auto-model pin service: quality-aware, tier-locked selection

  • Add a new pure-function quality scoring module (app/services/quality_score.py) with dedicated unit tests.
  • Stamp "Auto (Fastest)" ranking metadata on the YAML configs so tiers are self-describing.
  • Quality-aware, tier-locked candidate selection gated by health status.
  • Runtime cooldown on error-prone candidates + improved candidate selection logic.
  • Short-TTL healthy-status cache so preflight checks can be reused cheaply.
  • Simplify the thread-level pinning schema by removing unused fields and indexes (migration 138 + service cleanup).
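
As a rough illustration of quality-aware, tier-locked selection gated by health, the sketch below blends a precomputed static score with a health signal and picks the best healthy candidate within one tier. The `Candidate` fields, the `health_weight` default, and the function names are assumptions made for this example; the real scoring lives in app/services/quality_score.py and may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    model_id: str
    tier: str                # "free" or "premium"
    static_score: float      # 0..1, assumed to be computed elsewhere
    uptime: Optional[float]  # 0..1 health signal, None when unknown
    healthy: bool = True


def quality_score(c: Candidate, health_weight: float = 0.3) -> float:
    """Blend the static score with health; unknown health leaves it unchanged."""
    if c.uptime is None:
        return c.static_score
    return (1 - health_weight) * c.static_score + health_weight * c.uptime


def pick(candidates: Iterable[Candidate], tier: str) -> Optional[Candidate]:
    """Best-scoring healthy candidate in the requested tier, or None."""
    eligible = [c for c in candidates if c.tier == tier and c.healthy]
    return max(eligible, key=quality_score, default=None)
```

Keeping the scorer a pure function of its inputs, as the PR describes, makes it easy to unit test without network access.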

Chat streaming: preflight + early repin

  • Add a lightweight LLM preflight probe for the auto-pin flow in stream_new_chat.
  • Wire the preflight + early repin logic into the auto-mode chat flow, with contract tests.
  • Harden busy_mutex thread-lock management to prevent stale middleware interference, with new unit test.
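
The preflight and early-repin idea can be sketched as follows: probe the pinned model cheaply before starting the expensive stream, and move to the next candidate if the probe fails. The function name, its parameters, and the `probe` callable are hypothetical stand-ins; the actual wiring in stream_new_chat is not shown in this description.

```python
from typing import Callable, Iterable, Optional


def preflight_and_repin(
    pinned: str,
    fallbacks: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first model whose probe succeeds, trying the pinned one first."""
    for model_id in [pinned, *fallbacks]:
        try:
            if probe(model_id):
                return model_id
        except Exception:
            # A probe that raises (for example on a rate limit) counts as a failure.
            continue
    return None  # No candidate passed; the caller decides how to surface this.
```

If the returned model differs from `pinned`, the caller would persist the new pin before streaming begins, so the turn never starts on a model already known to be failing.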

Zero: user table + live usage meters

  • Alembic migration 139_add_user_to_zero_publication.py to selectively replicate user usage metrics.
  • Add userTable schema + queries.user.me() synced query to the Zero client.
  • Sidebar: live premium-token meter via Zero.
  • Sidebar: live pages meter via Zero for authenticated users.
  • Settings: live buy-tokens meter via Zero.
  • Stop REST-polling /users/me and token-status for fields now served live by Zero.

Documents

  • New backend endpoint to retrieve a document by virtual path, with matching web client method in documents-api.service.ts.

Motivation and Context

FIX #

Screenshots

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR implements an Auto (Fastest) model pinning system that intelligently selects and persists LLM models for chat threads. The system uses a quality-scoring algorithm (provider prestige, recency, pricing, context window, capabilities) combined with health monitoring from OpenRouter endpoints to rank available models. Key features include: per-thread model pinning that survives across turns, runtime rate-limit recovery with automatic failover to alternative models, health-based gating to exclude unreliable providers, and tiered selection (operator-curated YAML configs lock first for premium users, then dynamic OpenRouter models). The implementation also includes preflight health checks before expensive agent operations, a runtime cooldown mechanism to prevent immediate reselection of failed models, and comprehensive changes to OpenRouter integration including per-model tier derivation (free vs premium based on pricing signals) and stable deterministic config IDs that survive catalog churn. Additionally, there are frontend improvements for clickable file paths in markdown and a new API endpoint to resolve documents by virtual path.
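
The runtime cooldown mentioned in the summary might be implemented along these lines. This is a minimal in-memory sketch; the class name, the injectable clock, and the per-model expiry map are assumptions for illustration, not the service's actual design.

```python
import time
from typing import Callable, Dict


class Cooldown:
    """Tracks models that failed recently so they are not reselected immediately."""

    def __init__(self, seconds: float, clock: Callable[[], float] = time.monotonic):
        self._seconds = seconds
        self._clock = clock
        self._until: Dict[str, float] = {}

    def mark_failed(self, model_id: str) -> None:
        # Start (or restart) the cooldown window for this model.
        self._until[model_id] = self._clock() + self._seconds

    def is_cooling(self, model_id: str) -> bool:
        expires = self._until.get(model_id)
        if expires is None:
            return False
        if self._clock() >= expires:
            # Window elapsed: forget the entry so the model is eligible again.
            del self._until[model_id]
            return False
        return True
```

Candidate selection would then skip any model for which `is_cooling` returns true. Passing the clock in as a parameter keeps the behaviour testable without sleeping.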

⏱️ Estimated Review Time: 3+ hours

💡 Review Order Suggestion
Order File Path
1 surfsense_backend/alembic/versions/138_add_thread_auto_model_pinning_fields.py
2 surfsense_backend/app/db.py
3 surfsense_backend/app/services/quality_score.py
4 surfsense_backend/app/services/openrouter_integration_service.py
5 surfsense_backend/app/config/__init__.py
6 surfsense_backend/app/services/auto_model_pin_service.py
7 surfsense_backend/app/services/llm_router_service.py
8 surfsense_backend/app/agents/new_chat/middleware/busy_mutex.py
9 surfsense_backend/app/tasks/chat/stream_new_chat.py
10 surfsense_backend/app/routes/search_spaces_routes.py
11 surfsense_backend/app/routes/documents_routes.py
12 surfsense_web/lib/apis/documents-api.service.ts
13 surfsense_web/components/assistant-ui/markdown-text.tsx
14 surfsense_web/components/layout/ui/shell/LayoutShell.tsx
15 surfsense_backend/app/config/global_llm_config.example.yaml
16 surfsense_backend/tests/unit/services/test_auto_model_pin_service.py
17 surfsense_backend/tests/unit/agents/new_chat/test_busy_mutex.py
18 surfsense_backend/tests/unit/test_stream_new_chat_contract.py
19 surfsense_backend/tests/unit/services/test_quality_score.py
20 surfsense_backend/tests/unit/services/test_openrouter_integration_service.py
21 surfsense_backend/tests/unit/services/test_or_health_enrichment.py
22 surfsense_backend/tests/unit/services/test_llm_router_pool_filter.py
23 surfsense_backend/tests/unit/services/test_openrouter_legacy_config.py
⚠️ Inconsistent Changes Detected
File Path | Warning
surfsense_web/components/assistant-ui/markdown-text.tsx | Frontend changes for clickable file paths in markdown appear unrelated to the core Auto model pinning feature
surfsense_web/components/layout/ui/shell/LayoutShell.tsx | CSS isolation change (adding 'isolate' class) seems disconnected from model pinning functionality
surfsense_backend/app/routes/documents_routes.py | New document resolution endpoint by virtual path doesn't appear connected to the model pinning system
surfsense_web/lib/apis/documents-api.service.ts | Frontend API service additions for document resolution are not part of the model pinning feature

Summary by CodeRabbit

Release Notes

  • New Features

    • Added document lookup by file path for knowledge base navigation.
    • Clickable file paths in chat now resolve and open documents directly.
    • New user data query system for page and token usage tracking.
  • Bug Fixes

    • Fixed concurrent chat turn lock conflicts that could incorrectly release newer locks.
    • Improved rate-limit recovery: models auto-switch during errors instead of failing the turn.
    • Removed stale user state caching that prevented fresh data on page load.
  • Refactor

    • Enhanced auto model selection with quality scoring and health monitoring to favor reliable models.
    • OpenRouter now dynamically assigns per-model billing tiers instead of global settings.
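
One common way to avoid a stale turn releasing a newer lock, as described in the bug-fix note above, is to hand out an ownership token on acquire and require it on release. The sketch below is a single-process, in-memory illustration with hypothetical names (`ThreadLocks`, `force`); it is not the busy_mutex middleware itself and is not safe for concurrent use without additional synchronization.

```python
import uuid
from typing import Dict, Optional


class ThreadLocks:
    """Per-thread lock table where only the current owner can release."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def acquire(self, thread_id: str, force: bool = False) -> Optional[str]:
        # Refuse if the thread is busy, unless the caller takes over explicitly.
        if thread_id in self._owners and not force:
            return None
        token = uuid.uuid4().hex
        self._owners[thread_id] = token
        return token

    def release(self, thread_id: str, token: str) -> bool:
        # A stale holder presents an old token and must not clear a newer lock.
        if self._owners.get(thread_id) != token:
            return False
        del self._owners[thread_id]
        return True
```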

@vercel

vercel Bot commented May 1, 2026

@AnishSarkar22 is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented May 1, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d5046be3-c84e-40b6-b3bd-df4383b78ab0

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@AnishSarkar22 AnishSarkar22 changed the title from "Feat/model pinnning mode" to "feat: auto-pin quality scoring + OpenRouter per-model tiers + Zero live usage" May 1, 2026
@AnishSarkar22 AnishSarkar22 changed the title from "feat: auto-pin quality scoring + OpenRouter per-model tiers + Zero live usage" to "feat: Auto-pin quality scoring, OpenRouter tier refactor and live usage sidebar" May 1, 2026
@AnishSarkar22 AnishSarkar22 marked this pull request as ready for review May 1, 2026 22:12
@MODSetter MODSetter merged commit 451a989 into MODSetter:dev May 1, 2026
13 of 18 checks passed

2 participants