Skip to content

let skills run agentic + gate on integrations via requires:#20

Merged
cjus merged 1 commit into
mainfrom
carlos/skills-as-agents
May 15, 2026
Merged

let skills run agentic + gate on integrations via requires:#20
cjus merged 1 commit into
mainfrom
carlos/skills-as-agents

Conversation

@cjus
Copy link
Copy Markdown
Owner

@cjus cjus commented May 15, 2026

Why

Two related changes that take operator-defined skills from "templated single-shot prompts" to a real workflow primitive, while making the boot-time honesty story tighter.

1. Skills run with the full tool surface (skills-as-agents)

The v1 design treated a SKILL.md as a one-shot text transform: maxTurns: 1, disallowedTools deny-list of everything useful, canUseTool: deny-all. That ruled out the whole class of skills the operator actually wants — "append this entry to my Notion Journal," "draft a Gmail reply about X," "look this up in the projects DB." Those need tool calls.

This PR lifts the constraint without weakening the safety story:

  • Claude tiers (primary / secondary). runSkill now passes the same claude_code tool preset a normal turn uses. Only Agent and Task stay denied (sub-agents off at the SDK + policy layers, belt-and-suspenders). The interactive canUseTool factory plus the PreToolUse / PostToolUse / PostToolUseFailure hooks come from the same factory runAgent uses, so cost cap, loop detector, and the Telegram-confirm UX behave identically inside a skill. Integrations MCP server attached when present.
  • Ollama tier. runSkillBareWithTools routes the body through the same runToolLoop driver as a regular Ollama turn when OllamaSkillDeps.tools + toolTiers + broker are wired. Self-tool entry (skills__<self>) filtered out of the catalog the body sees — direct recursion impossible. Pure text-transform skills (no integrations / tools wired) still fall through to the one-shot /api/chat path; back-compat preserved.
  • New frontmatter: max_turns (integer 1–10, default 1). Doubles as the SDK maxTurns on Claude tiers and runToolLoop.maxIterations on Ollama — one knob constrains both paths. Indirect cycles (A → skills__Bskills__A) bounded by max_turns plus the shared loop detector (third identical (tool, input) per turn → deny).

2. requires: frontmatter gates skills on integration deps

Today a /log skill that calls notion_search appears in /help + autocomplete regardless of whether NOTION_API_KEY is set. Operator types /log foo❌ Failed to log: notion_search not available. Gate is invisible until use-time.

New requires: field (bare string or string array, entries match [a-z][a-z0-9_-]{0,31}) names integrations the skill depends on. Missing at boot → skill skipped with a non-fatal skills.load_error and absent from the registry, /help, and setMyCommands autocomplete. loadSkillsSync gains a 4th arg loadedIntegrationNames, sourced from integration result.sources.filter(s => s.toolCount > 0) (probe-failed integrations do NOT satisfy the gate — a skill that calls notion_search needs the tool to actually exist).

Omitting requires: preserves back-compat — tldr and friends still load unconditionally.

Verification

  • npm run typecheck clean
  • bun test 740/740 green (10 new tests for requires: in skills.test.ts, fixture updates in skill-tools.test.ts)
  • Live boot: NOTION_API_KEY absent → /log absent from /help + Telegram autocomplete; key set + Journal database shared with the Solrac integration → /log <text> chains notion_searchnotion_create_page and returns the page URL. Confirmed end-to-end in this session.

Docs

Updated in this PR (same logical change, easier to review together):

  • docs/USAGE.mdSKILL.md schema example now includes max_turns; rewrote "What skills can do," Ollama-tier cost paragraph, recursion-safety bullet, latency note.
  • docs/ARCHITECTURE.md — frontmatter schema (2 required + 4 optional), Claude + Ollama execution paragraphs, "Skills as tools" header now disambiguates the two axes (skills using tools vs skills callable as tools by the agent), recursion-safety mechanism (self-filter + max_turns + loop detector, not zero-tools).
  • docs/FEATURES.md — skills bullet updated.
  • docs/GLOSSARY.md — Solrac-skill entry updated.
  • docs/ROADMAP.md — OQ#16 restructured into the two axes; Phase 2 reframed as axis-2 expansion only.

Stale solrac/PLAN.md (the shipped scheduler plan from bff8ca4) deleted — canonical story now lives in docs/USAGE.md + docs/ARCHITECTURE.md.

Anti-goal check

No anti-goal reversal. Pure additive feature work. SDK pin (0.2.119) untouched.

Deferred (non-blocking)

  • CHANGELOG entry intentionally not added in this PR — fold into a follow-up batch with adjacent unreleased entries if desired.
  • OQ#1 — skill-to-skill requires: not implemented. Integration granularity only. Avoids ordering games.
  • OQ#2 — /help hint for skipped skills: silent. Boot log is the audit trail.
  • OQ#3 — env-var force-disable (SOLRAC_SKILL_<NAME>_DISABLED): separate axis from dep gating.
  • OQ#5 — tool-level requires_tools: integration granularity sufficient at v1.

Test plan

  • Reviewer reads the rewritten runSkill (src/commands.ts) and confirms the hook trio mirrors runAgent's pattern (agent.ts)
  • Reviewer verifies runSkillBareWithTools filters the self-tool entry before passing the catalog to runToolLoop
  • Reviewer confirms loadedIntegrationNames is built from result.sources.filter(s => s.toolCount > 0) (not result.tools, which would let a probe-failed integration with cached metadata satisfy the gate)
  • Reviewer skims the doc diff and confirms wording is consistent across USAGE / ARCHITECTURE / GLOSSARY / ROADMAP

skills are no longer tool-less single-shot turns. claude tiers get the
full claude_code preset (agent/task still denied); ollama tier routes
through the same runToolLoop the regular ollama turn uses when
integrations are wired. budget is per-skill max_turns frontmatter
(1-10, default 1), bounded by the existing cost cap + loop detector +
three-tier policy. self-tool entry filtered from the catalog the body
sees (direct recursion impossible); indirect cycles bounded by
max_turns and the shared loop detector.

new requires: frontmatter gates a skill on named integrations being
loaded at boot. bare string or array. missing deps -> skill skipped
with skills.load_error, absent from /help and telegram autocomplete.
loadSkillsSync gains a 4th arg loadedIntegrationNames, sourced from
integration result.sources filtered to toolCount > 0 (probe-failed
integrations do not satisfy the gate).

docs (features, usage, architecture, glossary, roadmap) updated to
match. stale scheduler PLAN.md at repo root deleted (shipped in
bff8ca4; canonical story now lives in usage + architecture).
@cjus cjus merged commit d6119d6 into main May 15, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant