fix(skills): normalize :latest suffix in embedding model name comparison#2910

Merged: bug-ops merged 1 commit into main from 2894-tui-hang-embed-model-latest on Apr 11, 2026

@bug-ops (Owner) commented Apr 11, 2026

Summary

Fixes #2894: the TUI hangs for 3-4 minutes at startup when the config specifies nomic-embed-text-v2-moe but the Qdrant zeph_skills collection was populated with nomic-embed-text-v2-moe:latest.

Root cause: EmbeddingRegistry::sync used string equality for model name comparison. Ollama silently appends :latest when no tag is specified, so a bare model name in config always mismatched the stored tag, triggering a full collection recreation and sequential re-embedding of all skills.
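The mismatch described above can be illustrated with a minimal sketch. The function bodies here are assumptions written for illustration, not the code from this PR; only the name normalize_model_name and the behavior (strip a trailing :latest, leave other tags alone) come from the description. model_has_changed is a hypothetical helper named after the unit test listed in the test plan.

```rust
/// Strip a trailing `:latest` tag so that a bare model name and its
/// Ollama-expanded form compare equal. Any other tag is left untouched.
/// Illustrative body; the real implementation may differ.
fn normalize_model_name(name: &str) -> &str {
    name.strip_suffix(":latest").unwrap_or(name)
}

/// Hypothetical helper: compare the stored and configured model names
/// after normalization instead of using raw string equality.
fn model_has_changed(stored: &str, configured: &str) -> bool {
    normalize_model_name(stored) != normalize_model_name(configured)
}
```

With raw equality, "nomic-embed-text-v2-moe:latest" and "nomic-embed-text-v2-moe" differ and force a full re-embed; with the normalized comparison they are treated as the same model, while an explicit tag such as :v2 still counts as a change.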

Changes

  • Model name normalization: normalize_model_name() strips the :latest suffix before comparison (Ollama-specific, documented in a comment). Applied at comparison time only; stored Qdrant payloads are not modified, so no migration is needed.
  • Parallel embedding: the sequential embed loop is replaced with buffer_unordered(concurrency), where concurrency: usize defaults to 4 and is clamped to a minimum of 1. Each future returns (key, hash, Result) in completion order; individual failures warn and skip without aborting the batch.
  • TUI progress indicator: an on_progress: Option<Box<dyn Fn(usize, usize) + Send>> callback is added to EmbeddingRegistry::sync and called in real time inside the streaming loop (not after collect). Wired in agent/mod.rs via session.status_tx to emit "Syncing skills: N/M" to the TUI spinner. No Channel trait changes.
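The clamp, the warn-and-skip handling, and the progress callback from the list above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: a plain sequential loop stands in for the buffer_unordered stream used in the PR, and ProgressFn, effective_concurrency, status_line, and embed_all are hypothetical names. Only the callback type and the "Syncing skills: N/M" text come from the description.

```rust
/// Alias for the callback type quoted in the PR description.
type ProgressFn = Box<dyn Fn(usize, usize) + Send>;

/// Clamp the configured concurrency to at least 1
/// (hypothetical helper; the PR only states the clamp exists).
fn effective_concurrency(configured: usize) -> usize {
    configured.max(1)
}

/// Format the status text the TUI spinner is said to display.
fn status_line(done: usize, total: usize) -> String {
    format!("Syncing skills: {}/{}", done, total)
}

/// Embed every (key, text) pair, skipping failures with a warning and
/// invoking the callback after each item finishes, not after the batch.
/// NOTE: a sequential loop replaces the concurrent stream for brevity.
fn embed_all<F>(
    items: &[(&str, &str)],
    embed: F,
    on_progress: Option<ProgressFn>,
) -> Vec<(String, Vec<f32>)>
where
    F: Fn(&str) -> Result<Vec<f32>, String>,
{
    let total = items.len();
    let mut embedded = Vec::with_capacity(total);
    for (i, &(key, text)) in items.iter().enumerate() {
        match embed(text) {
            Ok(vector) => embedded.push((key.to_string(), vector)),
            // One failed skill must not abort the rest of the batch.
            Err(err) => eprintln!("warn: skipping skill {}: {}", key, err),
        }
        if let Some(callback) = &on_progress {
            callback(i + 1, total);
        }
    }
    embedded
}
```

Calling the callback inside the loop is what lets a spinner advance while work is still in flight; reporting only after the results are collected would leave the UI silent for the whole sync. A caller could pass a closure that forwards status_line(done, total) to a channel sender, which is roughly the role session.status_tx plays in the description.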

Test plan

  • normalize_model_name("model:latest") returns "model" (unit test added)
  • normalize_model_name("model:v2") returns "model:v2" (unit test added; only :latest is stripped)
  • model_has_changed_latest_vs_bare_is_false: root cause of #2894 covered by unit test
  • concurrency_zero_clamped_to_one — guard unit test added
  • Integration tests for on_progress and partial failure marked #[ignore] (require Qdrant)
  • cargo nextest run --workspace --features full --lib --bins — 8166 passed
  • cargo clippy --workspace --features full -- -D warnings — clean
  • cargo +nightly fmt --check — clean

EmbeddingRegistry::sync used string equality to compare the stored
embedding model name against the config value. Ollama silently appends
:latest when no tag is specified, so configs with "model" and collections
populated with "model:latest" triggered a full collection recreation and
re-embedding of all skills on every startup (~3-4 minutes for 123 skills).

Three-part fix:
- Add normalize_model_name() that strips :latest before comparison
  (Ollama-specific, applied at comparison time only, no stored data change)
- Replace sequential embed loop with buffer_unordered(4) and add
  concurrency: usize field (default 4, clamped to 1 minimum)
- Add on_progress callback to EmbeddingRegistry::sync; wire to
  session.status_tx in agent/mod.rs to emit "Syncing skills: N/M"
  in real time during re-embedding (TUI spinner compliance)

Closes #2894
@github-actions (bot) added labels on Apr 11, 2026: documentation, skills, memory, rust, core, bug, size/L
@bug-ops bug-ops enabled auto-merge (squash) April 11, 2026 16:46
@bug-ops bug-ops merged commit 2219860 into main Apr 11, 2026
30 checks passed
@bug-ops bug-ops deleted the 2894-tui-hang-embed-model-latest branch April 11, 2026 16:53