Run a small tool-calling LLM (Qwen3-1.7B or 4B) alongside the memory system.
- Chat interface in the web GUI
- Tools: query agent memory, web search, configure system settings
- Helps non-technical users set up agents via conversation
- Investigate Outlines for structured generation
- Investigate DSPy for prompt optimization
- Can run on the NPU alongside the embedding and reranking models