feat(router): RAPS Bayesian reputation scoring (#1886) by bug-ops · Pull Request #2091 · bug-ops/zeph

bug-ops · 2026-03-21T22:50:11Z

Summary

Adds per-provider quality reputation tracking to AgentRouter using Beta distributions (RAPS — Reputation-Adjusted Provider Selection). Reputation tracks tool execution quality outcomes (invalid argument failures) separately from API availability, and adjusts routing scores over time to prefer providers that produce valid tool calls.

- Per-provider Beta(alpha, beta) quality distributions; default uniform prior (1,1)
- Session-level decay (decay_factor = 0.95) shrinks evidence toward the prior on each load
- Minimum observation threshold (min_observations = 5) gates routing influence
- Cascade strategy is fully excluded: no mutex overhead, no wasted collection
- Only InvalidParams tool errors count as quality failures; network/transient/timeout errors are excluded
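The prior, decay, and observation gate listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `reputation.rs`: the `QualityBeta` type, its method names, the exact decay rule (scaling the evidence above the uniform prior), and the way observations are counted are all hypothetical.

```rust
/// Hypothetical Beta(alpha, beta) quality record for a single provider.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct QualityBeta {
    pub alpha: f64,
    pub beta: f64,
}

impl QualityBeta {
    /// Uniform prior Beta(1, 1): no evidence in either direction.
    pub fn uniform() -> Self {
        Self { alpha: 1.0, beta: 1.0 }
    }

    /// A valid tool call adds to alpha; an invalid-argument failure adds to beta.
    pub fn record(&mut self, success: bool) {
        if success {
            self.alpha += 1.0;
        } else {
            self.beta += 1.0;
        }
    }

    /// Evidence accumulated beyond the uniform prior (assumed counting rule).
    pub fn observations(&self) -> f64 {
        (self.alpha - 1.0) + (self.beta - 1.0)
    }

    /// Posterior mean alpha / (alpha + beta).
    pub fn mean(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }

    /// Assumed decay rule: scale the evidence above the prior by `factor`.
    pub fn decay(&mut self, factor: f64) {
        self.alpha = 1.0 + (self.alpha - 1.0) * factor;
        self.beta = 1.0 + (self.beta - 1.0) * factor;
    }

    /// Reputation factor, or `None` while below the observation threshold.
    pub fn factor(&self, min_observations: u32) -> Option<f64> {
        (self.observations() >= f64::from(min_observations)).then(|| self.mean())
    }
}
```

Under these assumptions, a provider with four successes and one failure sits exactly at the threshold of 5 with a mean of 5/7; one decay step at 0.95 drops its evidence to 4.75, so it stops influencing routing until new outcomes arrive.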

Architecture critique fixes

Three critical math errors from architecture review are resolved:

- CRIT-1 (per-provider sampling): `ema_reputation_factor()` computes each provider's own Beta mean independently, so the argmax comparison across providers is meaningful (a single shared sample would cancel out of the comparison).
- CRIT-2 (unbounded EMA): blending uses the multiplicative formula `ema_score * (1 + weight * (rep_factor - 0.5) * 2)`. The adjustment is proportional to the existing EMA value, replaces the unbounded additive term, and is neutral at `rep_factor = 0.5`.
- CRIT-3 (Thompson guarantees): `shift_thompson_priors()` adds weighted quality evidence into the Thompson Beta parameters before sampling via `select_with_priors()`. This preserves the single-distribution sampling property; no convex blend of two samples is involved.
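A minimal sketch of the three fixes, for illustration only. `adjust_ema` is a direct transcription of the formula quoted in CRIT-2; `pick_best` and `shift_priors` are hypothetical helpers, and the exact rule used by `shift_thompson_priors()` (here: add `weight` times the quality evidence above the uniform prior) is an assumption.

```rust
/// Multiplicative adjustment quoted in CRIT-2.
/// With rep_factor in [0, 1] the multiplier stays within [1 - weight, 1 + weight].
pub fn adjust_ema(ema_score: f64, rep_factor: f64, weight: f64) -> f64 {
    ema_score * (1.0 + weight * (rep_factor - 0.5) * 2.0)
}

/// CRIT-1: every provider carries its own reputation factor, so the argmax
/// can actually change. Tuples are (name, ema_score, rep_factor).
pub fn pick_best<'a>(providers: &[(&'a str, f64, f64)], weight: f64) -> Option<&'a str> {
    providers
        .iter()
        .map(|&(name, ema, rep)| (name, adjust_ema(ema, rep, weight)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}

/// CRIT-3 (assumed rule): fold weighted quality evidence into the Thompson
/// Beta parameters, so a single distribution is sampled afterwards.
/// Both arguments are (alpha, beta) pairs.
pub fn shift_priors(thompson: (f64, f64), quality: (f64, f64), weight: f64) -> (f64, f64) {
    (
        thompson.0 + weight * (quality.0 - 1.0),
        thompson.1 + weight * (quality.1 - 1.0),
    )
}
```

With `weight = 0.3`, a provider with EMA score 0.75 and reputation factor 0.9 scores 0.93 and overtakes one with EMA score 0.80 and factor 0.2, which scores 0.656; with `weight = 0.0` the original EMA ordering is unchanged.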

Config

[llm.router.reputation]
enabled = true
decay_factor = 0.95   # (0.0, 1.0], lower = faster forgetting
weight = 0.3          # [0.0, 1.0], blend strength
min_observations = 5  # gate: minimum quality events before routing is affected
# state_path = "~/.config/zeph/router_reputation_state.json"
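The ranges in the comments above suggest a validation step along these lines. This is a sketch only: `ReputationSettings` and `validate()` are hypothetical names rather than the `ReputationConfig` struct added in this PR, and the `Default` values simply mirror the example section (the crate's real defaults for `enabled` and `weight` are not stated here).

```rust
/// Hypothetical mirror of the `[llm.router.reputation]` section.
#[derive(Debug, Clone, PartialEq)]
pub struct ReputationSettings {
    pub enabled: bool,
    pub decay_factor: f64,
    pub weight: f64,
    pub min_observations: u32,
}

impl Default for ReputationSettings {
    /// Values copied from the example config, not from the crate.
    fn default() -> Self {
        Self {
            enabled: true,
            decay_factor: 0.95,
            weight: 0.3,
            min_observations: 5,
        }
    }
}

impl ReputationSettings {
    /// Enforce the documented ranges: decay_factor in (0.0, 1.0], weight in [0.0, 1.0].
    /// NaN fails both checks.
    pub fn validate(&self) -> Result<(), String> {
        if !(self.decay_factor > 0.0 && self.decay_factor <= 1.0) {
            return Err(format!(
                "decay_factor must be in (0.0, 1.0], got {}",
                self.decay_factor
            ));
        }
        if !(0.0..=1.0).contains(&self.weight) {
            return Err(format!("weight must be in [0.0, 1.0], got {}", self.weight));
        }
        Ok(())
    }
}
```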

Files changed

| File | Change |
| --- | --- |
| `crates/zeph-llm/src/router/reputation.rs` | New: `ReputationTracker`, decay, prune, save/load, 28 tests |
| `crates/zeph-llm/src/router/mod.rs` | Reputation fields, `with_reputation()`, `record_quality_outcome()`, EMA/Thompson blending |
| `crates/zeph-llm/src/router/thompson.rs` | `select_with_priors()`, `get_distribution()` |
| `crates/zeph-llm/src/any.rs` | Delegate `record_quality_outcome()` and `save_reputation_state()` |
| `crates/zeph-llm/src/provider.rs` | Default no-op `record_quality_outcome()` in trait |
| `crates/zeph-core/src/config/providers.rs` | `ReputationConfig` struct, `RouterConfig.reputation` field |
| `crates/zeph-core/src/bootstrap/provider.rs` | `apply_reputation_if_enabled()` |
| `crates/zeph-core/src/agent/tool_execution/native.rs` | Quality outcome recording after tool execution |
| `src/init.rs` | `reputation: None` in wizard `RouterConfig` literal |
| `CHANGELOG.md` | Unreleased entry |
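The "default no-op in trait" pattern from `provider.rs` can be illustrated as below. The trait name, method signature, and both implementing types are hypothetical stand-ins; the real `LlmProvider` signature is not shown in this PR description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a provider trait with a default no-op hook.
pub trait QualitySink {
    /// Providers that do not track reputation inherit this empty body.
    fn record_quality_outcome(&mut self, _provider: &str, _success: bool) {}
}

/// A plain provider: relies on the default no-op.
pub struct PlainProvider;

impl QualitySink for PlainProvider {}

/// A router-like provider: overrides the hook and counts outcomes per provider.
#[derive(Default)]
pub struct RoutingProvider {
    outcomes: HashMap<String, (u32, u32)>,
}

impl QualitySink for RoutingProvider {
    fn record_quality_outcome(&mut self, provider: &str, success: bool) {
        let entry = self.outcomes.entry(provider.to_owned()).or_insert((0, 0));
        if success {
            entry.0 += 1;
        } else {
            entry.1 += 1;
        }
    }
}

impl RoutingProvider {
    /// (successes, failures) recorded for `provider`; zeros if never seen.
    pub fn counts(&self, provider: &str) -> (u32, u32) {
        self.outcomes.get(provider).copied().unwrap_or((0, 0))
    }
}
```

The benefit of this shape is that call sites such as the tool executor can report outcomes unconditionally, while only the router pays for bookkeeping.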

Tests

+28 unit tests in reputation.rs. Total: 6327 passed, 15 skipped.

Test plan

- `cargo +nightly fmt --check`: pass
- `cargo clippy --workspace --features full -- -D warnings`: pass
- `cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins`: 6327 passed
- Merged with main, no conflicts remaining

Add per-provider quality reputation tracking to AgentRouter using Beta distributions (RAPS, Reputation-Adjusted Provider Selection).

Key design decisions:
- Tool execution quality outcomes (InvalidParams only) shift routing scores separately from API availability, tracked in Beta(alpha, beta) per provider
- Session-level decay (default 0.95) shrinks evidence toward the uniform prior, preventing stale observations from permanently biasing routing
- Minimum observation threshold (default 5) gates all routing influence until enough data is accumulated
- Cascade strategy is a no-op: reputation is not used for fixed cost tiers

Architecture critique fixes (all 3 CRIT issues resolved):
- CRIT-1: reputation factor computed per-provider (each has its own Beta mean), not a single shared sample that cancels in argmax comparison
- CRIT-2: EMA blending uses multiplicative formula `ema_score * (1 + weight * (rep_factor - 0.5) * 2)`, bounded proportionally to existing EMA score rather than unbounded additive term
- CRIT-3: Thompson Sampling priors shifted by quality reputation parameters before sampling via shift_thompson_priors() + select_with_priors(), preserving single-distribution sampling guarantees (no convex combination of two samples)

Implementation:
- New module: crates/zeph-llm/src/router/reputation.rs with ReputationTracker, ReputationEntry (embeds BetaDist from thompson.rs), apply_decay(), prune(), atomic save/load with 0o600 permissions, 28 unit tests
- RouterProvider: reputation/reputation_state_path/reputation_weight fields, with_reputation() builder, record_quality_outcome(), save_reputation_state(), reputation_stats(); last_active_provider tracking for correct attribution
- ThompsonState: select_with_priors() for shifted-prior sampling, get_distribution() accessor
- LlmProvider trait: record_quality_outcome() default no-op
- AnyProvider: delegates record_quality_outcome() and save_reputation_state()
- native.rs: InvalidParams classified as quality failure; success recorded; network/transient/timeout errors excluded from quality signal
- bootstrap/provider.rs: apply_reputation_if_enabled() wires config to router, skips Cascade strategy
- Config: [llm.router.reputation] section with enabled, decay_factor, weight, min_observations, state_path
- State persisted to ~/.config/zeph/router_reputation_state.json

+28 tests (reputation.rs), total: 6321
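The quality-signal classification described for native.rs can be sketched as a three-way mapping. The `ToolFailure` enum and `quality_outcome` function are hypothetical; only the rule itself (InvalidParams is a failure, success is recorded, network/transient/timeout errors are ignored) comes from the description above.

```rust
/// Hypothetical error categories for a tool call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolFailure {
    InvalidParams,
    Network,
    Transient,
    Timeout,
}

/// Map a tool result to a quality signal:
/// `Some(true)` for success, `Some(false)` for invalid arguments,
/// `None` when the failure says nothing about the provider's output quality.
pub fn quality_outcome(result: Result<(), ToolFailure>) -> Option<bool> {
    match result {
        Ok(()) => Some(true),
        Err(ToolFailure::InvalidParams) => Some(false),
        Err(ToolFailure::Network | ToolFailure::Transient | ToolFailure::Timeout) => None,
    }
}
```

Returning `None` rather than a neutral score keeps availability problems out of the Beta counts entirely, which matches the stated goal of tracking quality separately from API availability.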

bug-ops added 2 commits March 21, 2026 23:46

merge: sync with main, resolve CHANGELOG conflict

97b36df

github-actions bot added the documentation, llm, rust, core, enhancement, and size/XL labels Mar 21, 2026

bug-ops enabled auto-merge (squash) March 21, 2026 22:51

fix(router): prefix unused variable a with underscore in test

b68d807

bug-ops merged commit 2c4eba4 into main Mar 21, 2026
25 checks passed

bug-ops deleted the feat-issue-1886-raps-bayesian-reputation branch March 21, 2026 23:03

bug-ops linked an issue Mar 21, 2026 that may be closed by this pull request

research(routing): RAPS Bayesian reputation scoring for AgentRouter robustness #1886

Closed

bug-ops merged 3 commits into main from feat-issue-1886-raps-bayesian-reputation
