Feature hasn't been suggested before.
Describe the enhancement you want to request
Specifically, I have verified this is not a duplicate of #5416, #4317 or similar. These features are about generic prompt caching, while my suggestion specifically targets tools and skills.
Disclaimer: I did use Claude Sonnet 4.6 to help me structure the request. I certify that I (the human) am the owner and respondent of the idea.
Problem
We are seeing the adoption of long-running agents with a wide range of functionalities provided by tools and skills. As users build out more complex agents, the number of registered tools and skills can grow significantly.
With that many tools and skills:
- Every LLM step carries the full catalog in the context window: tool definitions through the AI SDK's `tools` dict, and skill names/descriptions through the `SkillTool` description string. This grows linearly with the number of tools/skills and is re-sent on every step.
- Manually controlling the tools and skills becomes overwhelming for users. In fact, these agents are likely automatic, so humans may not even be in the loop.
Proposed Solution
I build on the 80/20 principle: 80% of the time, the LLM only needs 20% (or a small subset) of the available tools/skills to complete the task. By keeping the full catalog in a discoverable but not immediately visible "L2 cache", we can reduce context size and improve relevance without sacrificing capability.
An L1/L2 cache layer sitting between the raw tool/skill discovery and the LLM call:
- L1: Actively used items. Injected normally (full schema for tools, listed in the `SkillTool` description for skills).
- L2: Registered but cold items. Hidden from the LLM's default context. The LLM is told a discovery mechanism exists and uses it when standard tools don't suffice.
- LRU promotion: Every successful tool/skill use updates `last_used_at`. On each promotion, if L1 exceeds the configured max, the least recently used L1 item is demoted to L2 (see the sketch after this list).
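To make the promotion/demotion flow concrete, here is a minimal sketch of what such a cache layer could look like. The names (`ToolCache`, `CachedItem`, `record_use`) are hypothetical and not part of any existing SDK API; the actual integration point would depend on how the maintainers want to hook into tool/skill registration.

```python
# Minimal sketch of the proposed L1/L2 tool cache with LRU promotion.
# All class/method names here are illustrative, not an existing API.
import time
from dataclasses import dataclass, field


@dataclass
class CachedItem:
    name: str
    definition: dict          # full tool schema or skill description
    last_used_at: float = 0.0


@dataclass
class ToolCache:
    max_l1: int = 10
    l1: dict[str, CachedItem] = field(default_factory=dict)  # hot: injected into context
    l2: dict[str, CachedItem] = field(default_factory=dict)  # cold: discoverable only

    def record_use(self, name: str) -> None:
        """Update last_used_at and promote to L1 on every successful use."""
        item = self.l1.get(name) or self.l2.pop(name, None)
        if item is None:
            return  # uncached tools/skills pass through untouched
        item.last_used_at = time.time()
        self.l1[name] = item
        self._evict_if_needed()

    def _evict_if_needed(self) -> None:
        """Demote the least recently used L1 item once L1 exceeds max_l1."""
        while len(self.l1) > self.max_l1:
            lru_name = min(self.l1, key=lambda n: self.l1[n].last_used_at)
            self.l2[lru_name] = self.l1.pop(lru_name)

    def active_tools(self) -> list[dict]:
        """Definitions injected into the LLM context on each step (L1 only)."""
        return [item.definition for item in self.l1.values()]
```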
This does not change how tools/skills that are not registered in the cache behave. An uncached tool (e.g. one loaded through a local `skill` folder or defined inside the JSON file) always passes through normally. This ensures 100% backward compatibility.
The LLM is provided with a single tool that allows it to perform a vector search over the L2 cache using a natural-language description of what it needs. The recall of the vector search can be improved over time with better algorithms.
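As an illustration of that single discovery tool, here is a sketch that builds on the hypothetical `ToolCache` above. The bag-of-words similarity is only a stand-in for a real embedding model; the function name `discover_tools` and its signature are assumptions, not an existing interface.

```python
# Sketch of the single discovery tool the LLM would call when L1 doesn't suffice.
# The "embedding" below is a placeholder; a real vector model would replace it.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Placeholder bag-of-words representation standing in for a vector embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def discover_tools(cache: ToolCache, query: str, top_k: int = 3) -> list[str]:
    """Search the L2 catalog with a natural-language description and return
    the best-matching tool/skill names, which can then be promoted to L1."""
    q = embed(query)
    scored = sorted(
        cache.l2.values(),
        key=lambda item: cosine(q, embed(f"{item.name} {item.definition.get('description', '')}")),
        reverse=True,
    )
    return [item.name for item in scored[:top_k]]
```

The key design point is that the LLM only ever sees this one extra tool, regardless of how many cold tools/skills sit in L2, so context growth stays constant rather than linear.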