refactor(llm): DRY router fallback error path and fix spawn_blocking for state saves #3432

Merged
bug-ops merged 1 commit into main from refactor/m48/arch-cleanup
Apr 26, 2026

bug-ops commented Apr 26, 2026

Summary

  • Extract a private `record_fallback_error` helper to deduplicate the error-arm logic shared by the `chat` and `chat_stream` fallback loops. The success paths diverge intentionally (quality gate + CoE vs plain stream) and are kept separate; a NOTE in both methods documents the decision.
  • Convert `save_thompson_state`, `save_bandit_state`, and `save_reputation_state` to `pub async fn` wrapping `tokio::task::spawn_blocking`, which resolves the FIXME for CPU-bound work on the async executor.
  • `AnyProvider::save_router_state` now runs all three saves concurrently via `tokio::join!`.
  • The single caller in `zeph-core::agent::mod` is updated with `.await`.

Motivation

Part of the m48 architectural cleanup plan (PR 2 of 8). The router was the file with the highest bug density (5 follow-up PRs in recent history). Deduplicating the error path and offloading blocking I/O reduces the surface for future regressions.

Test plan

  • cargo +nightly fmt --check — clean
  • cargo clippy --workspace --all-features -- -D warnings — 0 warnings
  • cargo nextest run --config-file .github/nextest.toml -p zeph-llm -p zeph-core --lib --bins — 847 + 1502 passed

github-actions bot added the llm, rust, core, refactor, and size/M labels Apr 26, 2026
…for state saves

Extract private `record_fallback_error` helper to deduplicate the error
arm logic shared by `chat` and `chat_stream` fallback loops. The success
paths diverge (quality gate + CoE vs plain stream open) and are kept
separate; a NOTE comment in both methods documents the design decision.
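
The shape of this deduplication might look roughly like the sketch below. It is not the router's actual code: the `Router` fields, `ProviderError`, the method signatures, and the placeholder success paths are all assumptions made for illustration. Only the name `record_fallback_error` and the split between a shared error arm and separate success paths come from the description above.

```rust
use std::collections::HashMap;

// Hypothetical error type; the real provider error is not shown in this PR page.
#[derive(Debug, Clone, PartialEq)]
struct ProviderError(String);

// Hypothetical router state: per-provider failure counts and the last error seen.
#[derive(Default)]
struct Router {
    failures: HashMap<String, u32>,
    last_error: Option<ProviderError>,
}

type Attempt<'a> = (&'a str, Result<&'a str, ProviderError>);

impl Router {
    // Shared error arm: both fallback loops record a failure the same way.
    fn record_fallback_error(&mut self, provider: &str, err: ProviderError) {
        *self.failures.entry(provider.to_string()).or_insert(0) += 1;
        self.last_error = Some(err);
    }

    fn exhausted(&self) -> ProviderError {
        self.last_error
            .clone()
            .unwrap_or_else(|| ProviderError("no providers".into()))
    }

    // NOTE: the success path differs from `chat_stream` on purpose
    // (placeholder for extra checks on the full response).
    fn chat(&mut self, attempts: &[Attempt]) -> Result<String, ProviderError> {
        for (name, outcome) in attempts {
            match outcome {
                Ok(text) => return Ok(format!("checked:{text}")),
                Err(e) => self.record_fallback_error(name, e.clone()),
            }
        }
        Err(self.exhausted())
    }

    // NOTE: the success path differs from `chat` on purpose
    // (placeholder for handing back a stream of chunks).
    fn chat_stream(&mut self, attempts: &[Attempt]) -> Result<Vec<String>, ProviderError> {
        for (name, outcome) in attempts {
            match outcome {
                Ok(text) => return Ok(text.split_whitespace().map(String::from).collect()),
                Err(e) => self.record_fallback_error(name, e.clone()),
            }
        }
        Err(self.exhausted())
    }
}

fn main() {
    let mut router = Router::default();
    let attempts = [
        ("primary", Err(ProviderError("timeout".into()))),
        ("backup", Ok("hello world")),
    ];
    println!("{:?}", router.chat(&attempts));
    println!("{:?}", router.chat_stream(&attempts));
    println!("primary failures: {:?}", router.failures.get("primary"));
}
```

Keeping only the error arm in the helper means a future change to failure bookkeeping lands in one place, while each loop stays free to handle success in its own way.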

Convert `save_thompson_state`, `save_bandit_state`, and
`save_reputation_state` from sync to `pub async fn` using
`tokio::task::spawn_blocking` to avoid blocking the async executor on
serialization + filesystem writes. `AnyProvider::save_router_state` now
runs all three saves concurrently via `tokio::join!`. The sole caller in
`zeph-core` is updated with `.await`.

Resolves the FIXME at router/mod.rs:~733 (CPU-bound work on async thread).
bug-ops force-pushed the refactor/m48/arch-cleanup branch from f1632f1 to 02240de April 26, 2026 01:13
bug-ops enabled auto-merge (squash) April 26, 2026 01:13
bug-ops merged commit 160c5c8 into main Apr 26, 2026
32 checks passed
bug-ops deleted the refactor/m48/arch-cleanup branch April 26, 2026 01:19

Labels

  • core: zeph-core crate
  • llm: zeph-llm crate (Ollama, Claude)
  • refactor: Code refactoring without functional changes
  • rust: Rust code changes
  • size/M: Medium PR (51-200 lines)
