v0.1.15 — Load balancing + Ollama resilience
Changes
Staggered worker startup
Workers now launch 10 seconds apart instead of all at once. This prevents a "thundering herd" on startup, where all three OpenClaw tools plus the surveyor fired LLM/API calls simultaneously and competed for Anthropic API slots, Ollama compute, and Milvus locks.
Startup order: curator → (10s) → janitor → (10s) → distiller → (10s) → surveyor
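As a rough illustration of the staggered launch, here is a minimal asyncio sketch. It is not the project's actual launcher: `run_worker` and `start_staggered` are hypothetical names, and the real workers may well run as separate processes rather than tasks. Only the worker order and the 10 second gap come from the notes above.

```python
import asyncio

# Worker names in the startup order listed above.
WORKERS = ["curator", "janitor", "distiller", "surveyor"]


async def run_worker(name: str, started: list) -> None:
    # Hypothetical stand-in for a real worker loop: it only records
    # that the worker was started.
    started.append(name)


async def start_staggered(names, delay: float = 10.0) -> list:
    """Start each worker `delay` seconds after the previous one."""
    started = []
    tasks = []
    for i, name in enumerate(names):
        if i > 0:
            # Wait before launching every worker except the first.
            await asyncio.sleep(delay)
        tasks.append(asyncio.create_task(run_worker(name, started)))
    await asyncio.gather(*tasks)
    return started


if __name__ == "__main__":
    # A short delay keeps the demo quick; the release uses 10 seconds.
    print(asyncio.run(start_staggered(WORKERS, delay=0.1)))
```

Sleeping between launches rather than after them means the first worker starts immediately and the last one starts 30 seconds in.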
Ollama embedding resilience
- Connection pooling: Persistent httpx.AsyncClient reused across all embedding requests instead of creating a new TCP connection per request
- More retries: Increased from 3 → 5 attempts with exponential backoff (up to 32s wait)
- Request throttle: 200ms delay between sequential embedding requests to reduce Ollama memory pressure during sustained batch operations
- Proper cleanup: HTTP client closed on daemon shutdown
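The retry and throttle behaviour listed above could look roughly like the sketch below. It is an illustration under assumptions, not the shipped code: `backoff_delay`, `embed_with_retry`, `embed_batch`, and the injected `send` callable are hypothetical names, and the 2 second base delay is a guess. The notes only give the attempt count (5), the backoff ceiling (32s), and the throttle (200ms).

```python
import asyncio

MAX_ATTEMPTS = 5       # from the notes: 5 attempts
MAX_BACKOFF_S = 32.0   # from the notes: waits capped at 32s
THROTTLE_S = 0.2       # from the notes: 200ms between requests
BASE_BACKOFF_S = 2.0   # assumption: the base delay is not stated


def backoff_delay(attempt: int, base: float = BASE_BACKOFF_S,
                  cap: float = MAX_BACKOFF_S) -> float:
    """Delay after failed attempt number `attempt` (0-based), doubling up to `cap`."""
    return min(base * (2 ** attempt), cap)


async def embed_with_retry(send, text, *, attempts: int = MAX_ATTEMPTS,
                           sleep=asyncio.sleep):
    """Call `send(text)`, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return await send(text)
        except Exception:
            # Real code would likely catch a narrower error type,
            # for example httpx.TransportError.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await sleep(backoff_delay(attempt))


async def embed_batch(send, texts, *, throttle: float = THROTTLE_S,
                      sleep=asyncio.sleep):
    """Embed texts one at a time, pausing `throttle` seconds between requests."""
    results = []
    for i, text in enumerate(texts):
        if i > 0:
            await sleep(throttle)
        results.append(await embed_with_retry(send, text, sleep=sleep))
    return results
```

In a setup like the one described, `send` would wrap a POST on a single `httpx.AsyncClient` created once at daemon start, so connections are pooled, and the daemon would call `await client.aclose()` on shutdown. Passing `sleep` in as a parameter is only there to make the timing testable.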
Included from v0.1.14
- Separate OpenClaw agents per tool (curator/janitor/distiller no longer share vault-curator)
- Agent AGENTS.md updated to allow /tmp/ writes for manifest files
- Watchdog directory event filtering (no more "Is a directory" errors)
Install
pip install alfred-vault==0.1.15