feat(tools): dynamic tool schema filtering to reduce context waste and improve selection accuracy (#2020) by bug-ops · Pull Request #2026 · bug-ops/zeph

bug-ops · 2026-03-20T11:24:23Z

Summary

Implement dynamic tool schema filtering based on arXiv:2411.15399 ("Less is More: Better Reasoning for Language Models with Fewer Tools"). The filter selects which tool schemas are exposed to the LLM on each turn, reducing context waste and improving tool selection accuracy.

Implementation

New module: zeph-tools/src/schema_filter.rs with ToolSchemaFilter, InclusionReason enum, cosine similarity ranking
Config-driven: [agent.tool_filter] section with enabled, top_k, always_on, min_description_words
Ranking strategy: Top-K by similarity (default 6) + always-on tools + name mentions + short MCP descriptions
Dynamic filtering: Iteration 0 uses filtered set, iterations 1+ use full set (prevents tool starvation)
Graceful degradation: Embedding failure → all tools included
Startup: Embeds all tool descriptions at initialization via maybe_init_tool_schema_filter()
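
The ranking strategy above can be sketched roughly as follows. This is a minimal illustration, not the code in schema_filter.rs: `ToolEntry`, `select_tools`, and their signatures are hypothetical, and treating "short description" as "fewer than min_description_words words" is an assumption. Only the `InclusionReason` variant names come from this PR.

```rust
use std::collections::HashSet;

/// Why a tool schema was kept for this turn (variant names taken from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InclusionReason {
    AlwaysOn,
    NameMentioned,
    ShortDescription,
    SimilarityRank,
}

/// Hypothetical stand-in for a tool with a pre-computed description embedding.
pub struct ToolEntry {
    pub id: String,
    pub description: String,
    pub embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 for empty, mismatched, or zero-norm inputs.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Apply the pre-filters first, then add the `top_k` most similar remaining tools.
pub fn select_tools(
    query: &str,
    query_embedding: &[f32],
    tools: &[ToolEntry],
    always_on: &HashSet<String>,
    top_k: usize,
    min_description_words: usize,
) -> Vec<(String, InclusionReason)> {
    // Whole-word, case-insensitive match of a tool id inside the query.
    let mentioned = |id: &str| {
        query
            .split(|c: char| !(c.is_alphanumeric() || c == '_'))
            .any(|word| word.eq_ignore_ascii_case(id))
    };

    let mut selected = Vec::new();
    let mut candidates: Vec<(&ToolEntry, f32)> = Vec::new();

    for tool in tools {
        if always_on.contains(&tool.id) {
            selected.push((tool.id.clone(), InclusionReason::AlwaysOn));
        } else if mentioned(&tool.id) {
            selected.push((tool.id.clone(), InclusionReason::NameMentioned));
        } else if tool.description.split_whitespace().count() < min_description_words {
            // Assumption: descriptions this short embed poorly, so include them outright.
            selected.push((tool.id.clone(), InclusionReason::ShortDescription));
        } else {
            let score = cosine_similarity(query_embedding, &tool.embedding);
            candidates.push((tool, score));
        }
    }

    // Rank-based selection: highest similarity first, no score threshold.
    candidates.sort_by(|a, b| b.1.total_cmp(&a.1));
    for (tool, _) in candidates.into_iter().take(top_k) {
        selected.push((tool.id.clone(), InclusionReason::SimilarityRank));
    }
    selected
}
```

Because selection is by rank rather than by score cutoff, the sketch needs no per-model threshold tuning; the pre-filters run first so they do not consume top-K slots.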

Token Savings

30-60% reduction in tool schema tokens per request (tested at 20+ tools)
Scales linearly with tool count: at 50 tools ~76% reduction
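
As a rough sanity check on these figures, the sketch below estimates the reduction under two simplifying assumptions that are not stated in the PR: every tool schema costs about the same number of tokens, and only the top-K and always-on tools are kept (no name mentions or short-description includes). The helper name `schema_reduction` is hypothetical.

```rust
/// Estimated fraction of tool schemas removed when at most `top_k + always_on`
/// of `total` tools are sent. Assumes equally sized schemas and no overlap
/// between the top-K and always-on groups.
pub fn schema_reduction(total: usize, top_k: usize, always_on: usize) -> f64 {
    if total == 0 {
        return 0.0;
    }
    // Never keep more tools than exist.
    let kept = (top_k + always_on).min(total);
    1.0 - kept as f64 / total as f64
}
```

With the defaults (top_k = 6 and six always-on tools), at most 12 schemas are kept, so the estimate is 1 - 12/50 = 76% at 50 tools and 1 - 12/20 = 40% at 20 tools, which is consistent with the numbers reported above.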

Quality

✅ 5956 tests pass (0 failures)
✅ Clippy clean (zero warnings, --features full)
✅ All validators approved (security, performance, testing, code review)
✅ Spec-compliant: all 3 critical issues from architecture review resolved
✅ Disabled by default (`enabled: false`) — opt-in until empirically validated
✅ Backward compatible: old configs work without `[agent.tool_filter]`

Related

Fixes #2020 (research(tools): dynamic tool schema filtering to reduce context waste and improve selection accuracy)
Inspired by community feedback in #2020
Future: #2024 (research(tools): tool dependency graph for sequential tool availability)

Commits

`6ca9cd71` feat(tools): add dynamic tool schema filtering
`2f871977` fix(tools): address validation findings

Test Plan

Unit tests: 14 schema_filter + 8 cosine_similarity tests
Integration: context builder → native executor
Regression: no failures, feature disabled by default
Live: enable config, observe token reduction

All validators signed off. Ready to merge.

Implement embedding-based tool schema filtering that reduces the number of tool definitions sent to the LLM per turn. Only the most relevant tools are selected, based on cosine similarity between the user query and pre-computed tool description embeddings.

Key design decisions:
- Single `compute_filtered_tool_ids()` call per turn in rebuild_system_prompt()
- Cached result consumed by native tool path (iteration 0 only, full set for 1+)
- Config-driven always-on classification (no ToolDef struct changes)
- Rank-based top-K selection (model-agnostic, no threshold tuning)
- Pre-filters: tool name string matching + short MCP description auto-include
- InclusionReason enum for observability (AlwaysOn/NameMentioned/SimilarityRank/ShortDescription)
- Disabled by default (enabled=false), opt-in via [agent.tool_filter]
- Prompt-path providers skip filtering (preserves stable cache block)
- Graceful degradation when embedding is unsupported or fails

Config section:

    [agent.tool_filter]
    enabled = false
    top_k = 6
    always_on = ["memory_search", "memory_save", "load_skill", "bash", "read", "edit"]
    min_description_words = 5
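
To illustrate how the config defaults and the "iteration 0 only" rule might fit together, here is a minimal sketch. `ToolFilterConfig` and `tools_for_iteration` are hypothetical names; the real types in zeph may differ. The default values are the ones listed in the config section above, and a `None` filtered set stands in for the degraded case where embedding is unsupported or failed.

```rust
/// Hypothetical mirror of the `[agent.tool_filter]` config section.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolFilterConfig {
    pub enabled: bool,
    pub top_k: usize,
    pub always_on: Vec<String>,
    pub min_description_words: usize,
}

impl Default for ToolFilterConfig {
    /// Defaults as listed in the commit message; filtering is opt-in.
    fn default() -> Self {
        Self {
            enabled: false,
            top_k: 6,
            always_on: ["memory_search", "memory_save", "load_skill", "bash", "read", "edit"]
                .iter()
                .map(|s| s.to_string())
                .collect(),
            min_description_words: 5,
        }
    }
}

/// Choose which tool ids to expose for a given tool-loop iteration.
/// `filtered` is `None` when no filtered set could be computed this turn.
pub fn tools_for_iteration<'a>(
    config: &ToolFilterConfig,
    iteration: usize,
    all_tools: &'a [String],
    filtered: Option<&'a [String]>,
) -> &'a [String] {
    match filtered {
        // Only the first iteration sees the reduced set.
        Some(subset) if config.enabled && iteration == 0 => subset,
        // Disabled, later iterations, or degraded mode: expose everything.
        _ => all_tools,
    }
}
```

Falling back to the full set in every non-matching case keeps the failure mode conservative: a misconfiguration or embedding error costs tokens rather than hiding a tool the model needs.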

- Extract cosine_similarity to zeph-common::math (DRY: single canonical source used by zeph-tools and zeph-memory via re-export)
- Add maybe_init_tool_schema_filter() async builder and wire into all 3 entry points (runner, daemon, acp) so the filter is actually initialized
- Switch find_mentioned_tool_ids to word-boundary-aware matching to prevent false positives (e.g. "read" no longer matches inside "thread")
- Add NoEmbedding fallback: tools without cached embeddings (e.g. MCP tools added after startup) are auto-included instead of silently dropped
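
The word-boundary fix can be illustrated with a short sketch. The function name matches the one mentioned above, but the signature and the definition of a word character (alphanumeric or underscore) are assumptions made for this example, not the implementation in zeph-tools.

```rust
/// Assumption for this sketch: tool names consist of alphanumerics and underscores.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Return the ids of tools whose name appears in `query` as a whole word
/// (case-insensitive), so "read" does not match inside "thread".
pub fn find_mentioned_tool_ids(query: &str, tool_ids: &[&str]) -> Vec<String> {
    let query = query.to_lowercase();
    tool_ids
        .iter()
        .filter(|id| {
            let needle = id.to_lowercase();
            if needle.is_empty() {
                return false;
            }
            query.match_indices(needle.as_str()).any(|(start, matched)| {
                // Inspect the characters immediately around the match.
                let before = query[..start].chars().next_back();
                let after = query[start + matched.len()..].chars().next();
                !before.map_or(false, is_word_char) && !after.map_or(false, is_word_char)
            })
        })
        .map(|id| id.to_string())
        .collect()
}
```

Treating the underscore as a word character also keeps a hypothetical tool named `search` from matching inside `memory_search`.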

bug-ops added 2 commits March 20, 2026 12:01

bug-ops added labels tools (Tool execution and MCP integration) and research (Research-driven improvement) Mar 20, 2026

github-actions bot added labels documentation (Improvements or additions to documentation), memory (zeph-memory crate (SQLite)), rust (Rust code changes), core (zeph-core crate), enhancement (New feature or request), and size/XL (Extra large PR (500+ lines)) Mar 20, 2026

bug-ops merged commit 4e2f0f1 into main Mar 20, 2026
25 checks passed

bug-ops deleted the dynamic-tool-schema-filtering branch March 20, 2026 11:36

bug-ops merged 2 commits into main from dynamic-tool-schema-filtering
