Description
Source
"Less is More: Better Reasoning for Language Models with Fewer Tools" (arXiv:2411.15399)
Core Idea
Filter which tool schemas are included in the LLM prompt on a per-turn basis, exposing only schemas relevant to the current subtask. The paper demonstrates that reducing the visible tool set significantly improves function-calling accuracy and reduces unnecessary tool invocations.
The mechanism: a lightweight relevance classifier (embedding similarity between current task description and tool descriptions) gates which tool definitions are injected into the context before each LLM call.
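A minimal sketch of the similarity score behind that gate, assuming embeddings are plain `f32` slices. The function name is illustrative and is not an existing Zeph API.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns 0.0 when the lengths differ, the input is empty,
/// or either vector has zero norm (an assumed convention).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

Treating degenerate inputs as zero similarity means a tool with a missing or malformed embedding is filtered out rather than causing a panic. Whether that is the right default for Zeph is an open design choice.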
Current Zeph Gap
All registered tools are injected into every LLM request regardless of relevance. With 20+ tools (shell, scrape, memory_search, memory_save, search_code, scheduler, load_skill, MCP tools, etc.), sending every schema on every request:
- Wastes context budget (tool schemas can consume 3-8k tokens)
- Degrades tool selection accuracy (LLM must discriminate among all options)
- Reduces effective context window for task-relevant content
Implementation Sketch
- At startup, compute and cache embeddings for each tool description
- At context-build time (before each LLM call), embed the current task/user message
- Compute cosine similarity between task embedding and each tool embedding
- Include only tools scoring above a threshold (e.g., 0.3) plus always-on tools (memory_search, memory_save, load_skill)
- Log filtered tool count in debug output for observability
Always-on tools (never filtered): memory_search, memory_save, load_skill — core agent capabilities
Filterable tools: shell, scrape, search_code, scheduler, MCP tools, A2A tools
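The steps above could be sketched roughly as follows. `ToolFilter`, `select`, and the inline `cosine` helper are hypothetical names introduced for illustration. The sketch assumes the task embedding has already been produced by whatever embedding backend is in use.

```rust
use std::collections::HashMap;

/// Tools that bypass the relevance filter (per the always-on list above).
const ALWAYS_ON: [&str; 3] = ["memory_search", "memory_save", "load_skill"];

/// Cosine similarity; returns 0.0 for mismatched lengths or zero-norm vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Hypothetical filter holding the embeddings cached at startup.
struct ToolFilter {
    /// Tool name -> embedding of that tool's description.
    embeddings: HashMap<String, Vec<f32>>,
    /// Minimum similarity for a filterable tool to be exposed (e.g. 0.3).
    threshold: f32,
}

impl ToolFilter {
    /// Returns the names of tools to expose this turn, sorted for stable output.
    fn select(&self, task_embedding: &[f32]) -> Vec<String> {
        let mut selected: Vec<String> = self
            .embeddings
            .iter()
            .filter(|(name, emb)| {
                ALWAYS_ON.contains(&name.as_str())
                    || cosine(task_embedding, emb.as_slice()) > self.threshold
            })
            .map(|(name, _)| name.clone())
            .collect();
        selected.sort();
        // Observability: report how many schemas survived the filter.
        eprintln!(
            "tool filter: kept {}/{} tools",
            selected.len(),
            self.embeddings.len()
        );
        selected
    }
}
```

Sorting the result keeps the injected schema order deterministic across turns. That ordering is an assumption of this sketch, not something the proposal requires.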
Complexity
Low. Tool description embeddings can be precomputed once and cached in a HashMap<tool_name, Vec<f32>>. The filter runs at context-build time (already a pipeline step in zeph-core). No model changes required.
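One way the startup cache might be built, assuming tool descriptions are available as `(name, description)` pairs. The `embed` closure stands in for whichever embedding backend is configured. Both it and `build_embedding_cache` are hypothetical names, not existing Zeph APIs.

```rust
use std::collections::HashMap;

/// Builds the tool name -> embedding cache once, at startup.
/// `embed` is a placeholder for the real embedding call (hypothetical).
fn build_embedding_cache<F>(tools: &[(&str, &str)], embed: F) -> HashMap<String, Vec<f32>>
where
    F: Fn(&str) -> Vec<f32>,
{
    tools
        .iter()
        .map(|(name, description)| (name.to_string(), embed(description)))
        .collect()
}
```

Because the cache is keyed by tool name, tools registered later (for example, MCP tools discovered at runtime) would need their embeddings inserted when they are registered. This sketch does not cover that path.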
Expected Benefit
- 30-60% reduction in tool schema tokens per request (from full list to 3-5 relevant tools)
- Improved tool selection accuracy (fewer irrelevant options = fewer wrong selections)
- More effective context window utilization
See Also
- CompositeExecutor in zeph-tools (parallel execution already in place)
- Context builder in zeph-core/src/agent/context/ (where tool schemas are injected)
- ToolDefinition caching (PR #1xxx, already done — embeddings would be similar)