Implement remote skills discovery with disk cache and dedicated tools#1822
Implement remote skills discovery with disk cache and dedicated tools#1822dgageot merged 1 commit intodocker:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
I've reviewed the changes for remote skills discovery implementation. The overall design is solid, but I found a couple of minor issues that could be improved:
Summary
This PR successfully implements remote skill sources with disk caching and a dedicated SkillsToolset. The security measures (URL validation, path traversal prevention, size limits) are appropriate. However, there are two low-severity issues related to cache handling and concurrent access patterns.
Add support for remote skill sources via the well-known skills discovery spec (https://github.com/cloudflare/agent-skills-discovery-rfc). The skills config now accepts both the existing boolean form and a list of sources: skills: true # equivalent to ["local"] skills: - local - https://example.com Remote skills are fetched from /.well-known/skills/index.json, and all files listed in the index are prefetched into a disk cache under the XDG cache directory (~/.cache/cagent/skills/ on Linux). Cache-Control headers are respected for expiry. Instead of depending on read_file/fetch tools, skills now get their own dedicated toolset (SkillsToolset) with two tools: - read_skill: loads a skill's SKILL.md by name - read_skill_file: loads supporting resources (references, scripts, assets) by skill name and relative path The agent never sees whether a skill is local or remote. The toolset handles progressive disclosure per the spec: only name+description at startup, SKILL.md on activation, supporting files on demand. Other changes: - Replace Skills *bool config field with SkillsConfig supporting both bool and list-of-sources YAML/JSON formats - Add paths.GetCacheDir() using os.UserCacheDir() for XDG compliance - Remove skills system prompt injection from session.go (now handled by the toolset's Instructable interface) - Remove filesystem/read_file validation requirement for skills - Validate skill sources (must be "local" or HTTP/HTTPS URL) Assisted-By: cagent
|
Now I'm thinking maybe we should remove the agent top-level |
|
@rumpl I'm not necessarily against this.. I just worry a little bit about the proliferation of default tools. I wonder if it could make sense to (also?) have a way for the user to inject whatever discovered skills are available at will, e.g. with Just food for thought though |
|
That's auto-loading by the user, sure, but feels orthogonal to remote skills |
|
orthogonal? i was referring to this
I feel we don't want a toolsets definition to be the only way to define things. If i don't want to give the agent extra tools, the top level property to allow me to #someskill still seems to make sense to me. Totally fine with remote skills per-se (assuming the v0.1 draft rfc gets generally adopted by the ecosystem) |
Add support for remote skill sources via the well-known skills discovery spec (https://github.com/cloudflare/agent-skills-discovery-rfc). The skills config now accepts both the existing boolean form and a list of sources:
Remote skills are fetched from /.well-known/skills/index.json, and all files listed in the index are prefetched into a disk cache under the XDG cache directory (~/.cache/cagent/skills/ on Linux). Cache-Control headers are respected for expiry.
Instead of depending on read_file/fetch tools, skills now get their own dedicated toolset (SkillsToolset) with two tools:
The agent never sees whether a skill is local or remote. The toolset handles progressive disclosure per the spec: only name+description at startup, SKILL.md on activation, supporting files on demand.
Other changes:
Assisted-By: cagent