v0.1.15 — Load balancing + Ollama resilience

ssdavidai released this 23 Feb 11:56

Changes

Staggered worker startup

Workers now launch 10 seconds apart instead of all at once. This prevents a "thundering herd" on startup, where the three OpenClaw tools and the surveyor all fire LLM/API calls simultaneously and compete for Anthropic API slots, Ollama compute, and Milvus locks.

Startup order: curator → (10s) → janitor → (10s) → distiller → (10s) → surveyor
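
The staggered launch can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: start_staggered, start_worker, and delay_s are hypothetical names, and the sketch assumes workers are started as asyncio tasks.

```python
import asyncio

# Startup order taken from the release notes above.
WORKER_ORDER = ["curator", "janitor", "distiller", "surveyor"]


async def start_staggered(workers, start_worker, delay_s=10.0):
    """Launch workers one at a time, waiting delay_s between launches.

    start_worker is a hypothetical coroutine function that takes a worker
    name and runs that worker.
    """
    tasks = []
    for i, name in enumerate(workers):
        if i > 0:
            # Pause before every worker except the first one.
            await asyncio.sleep(delay_s)
        tasks.append(asyncio.create_task(start_worker(name)))
    return tasks
```

Keeping the delay as a parameter makes it easy to shorten in tests while leaving the 10-second default for normal runs.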

Ollama embedding resilience

  • Connection pooling: Persistent httpx.AsyncClient reused across all embedding requests instead of creating a new TCP connection per request
  • More retries: Increased from 3 to 5 attempts, with exponential backoff (up to 32s wait)
  • Request throttle: 200ms delay between sequential embedding requests to reduce Ollama memory pressure during sustained batch operations
  • Proper cleanup: HTTP client closed on daemon shutdown
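
The retry and throttle behaviour described above might look something like the sketch below. It is an illustration under stated assumptions, not the project's implementation: retry_with_backoff, embed_all, and embed_one are hypothetical names, the 1-second base delay is a guess, and 32s is treated as a cap on each wait because the notes do not spell out the exact schedule. In the real daemon, embed_one would presumably send its request through the shared httpx.AsyncClient, which is closed with aclose() on shutdown.

```python
import asyncio


async def retry_with_backoff(call, attempts=5, base_s=1.0, cap_s=32.0,
                             sleep=asyncio.sleep):
    """Await call() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            # Real code would likely catch only transport/timeout errors.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await sleep(min(cap_s, base_s * 2 ** attempt))


async def embed_all(texts, embed_one, throttle_s=0.2, sleep=asyncio.sleep):
    """Embed texts sequentially, pausing throttle_s between requests.

    embed_one is a hypothetical coroutine function that returns the
    embedding for a single text.
    """
    results = []
    for i, text in enumerate(texts):
        if i > 0:
            # Throttle: short pause between consecutive requests.
            await sleep(throttle_s)
        results.append(
            await retry_with_backoff(lambda: embed_one(text), sleep=sleep)
        )
    return results
```

The sleep function is injectable so the waits can be replaced with a stub in tests instead of actually pausing.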

Included from v0.1.14

  • Separate OpenClaw agents per tool (curator/janitor/distiller no longer share vault-curator)
  • Agent AGENTS.md updated to allow /tmp/ writes for manifest files
  • Watchdog directory event filtering (no more "Is a directory" errors)

Install

pip install alfred-vault==0.1.15