feat(skills): external MCP server per skill (#37) #38
closes #35 (backend slice)
- internal/skills: Service wraps Anthropic Beta.Skills (List, Upload, Delete) and persists skill_id/name/description/version in Postgres. UploadDir parses SKILL.md frontmatter, walks the folder, and preserves relative paths via a namedReader so the multipart upload matches what the API expects.
- internal/skills/zip.go: rejects path-traversal entries and zips without exactly one top-level folder + SKILL.md.
- cmd/skills-sync: CLI driven by SKILLS_SOURCES that bulk-uploads every SKILL.md folder it finds; idempotent via UNIQUE(name) + UPSERT.
- internal/agent/provision.go: SkillIDsFn opt; per-user agents now attach the agent toolset (bash/code-execution) alongside their MCP toolset only when at least one skill is configured.
- cmd/provision/main.go: optional SKILLS_AGENT_IDS env to attach skills at bootstrap.
- 00005_skills.sql migration with UNIQUE(name) + index on anthropic_skill_id.
- /api/skills CRUD wired in main.go; Makefile target skills-sync; SKILLS_SOURCES + SKILLS_AGENT_IDS in .env.example.
Co-authored-by: Cursor <cursoragent@cursor.com>
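The zip.go checks described above (no path traversal, exactly one top-level folder, SKILL.md inside it) can be sketched as follows. This is a minimal illustration and not the code in internal/skills/zip.go: the function name `validateSkillZip` is hypothetical, and it assumes SKILL.md must sit directly under the top-level folder.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"path"
	"strings"
)

// validateSkillZip (hypothetical name) inspects entry names only; nothing is extracted.
func validateSkillZip(names []string) error {
	roots := map[string]bool{}
	hasManifest := false
	for _, name := range names {
		slashed := strings.ReplaceAll(name, `\`, "/")
		if path.IsAbs(slashed) {
			return fmt.Errorf("absolute path entry %q", name)
		}
		// Reject any ".." segment, even one that would stay inside the archive.
		for _, seg := range strings.Split(slashed, "/") {
			if seg == ".." {
				return fmt.Errorf("path traversal entry %q", name)
			}
		}
		clean := path.Clean(slashed)
		if clean == "." {
			continue
		}
		parts := strings.Split(clean, "/")
		roots[parts[0]] = true
		// Assumption: SKILL.md lives directly under the single top-level folder.
		if len(parts) == 2 && parts[1] == "SKILL.md" {
			hasManifest = true
		}
	}
	if len(roots) != 1 {
		return fmt.Errorf("want exactly one top-level folder, found %d", len(roots))
	}
	if !hasManifest {
		return errors.New("missing SKILL.md directly under the top-level folder")
	}
	return nil
}

func main() {
	// Build a small archive in memory, then validate its entry names.
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range []string{"foo/SKILL.md", "foo/reference.md"} {
		if _, err := zw.Create(name); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		panic(err)
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	fmt.Println("validation error:", validateSkillZip(names))
}
```

Working on entry names rather than extracted files keeps the check cheap and means a hostile archive is rejected before anything touches disk.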
closes #35 (mobile slice)
- mobile/services/skills.ts: typed client for /api/skills CRUD; uses expo/fetch for the multipart upload so we can build FormData.
- mobile/app/(app)/skills.tsx: list + upload (zip via DocumentPicker) + delete (with confirm) + sync button. Empty state CTAs upload.
- settings.tsx: link to Skills card.
- _layout.tsx: register hidden 'skills' route so deep links work.
- docs/SKILLS.md: SKILL.md format, folder layout, sandbox constraints (no network, bundle wheels), make skills-sync workflow, REST surface.
- mobile/package.json: expo-document-picker dependency.
Co-authored-by: Cursor <cursoragent@cursor.com>
…ome pattern)
Demonstrates how to build a skill whose compute can't fit in Anthropic's
sandbox (native deps, data files, side effects). Splits the skill in two:
- backend/internal/astrology + internal/mcp/platforms/astrology.go:
pure-Go sun-sign computation registered as the astrology_birth_summary
MCP tool. Stub for moon/ascendant/houses with comment pointing at
mshafiee/swephgo for full Swiss Ephemeris accuracy.
- skills/astrology/{SKILL.md, reference.md}: tiny skill that tells
Claude to gather birth data, call the MCP tool, read reference.md
for interpretation tables, and compose a tight reading.
docs/SKILLS.md gains an 'Authoring skills for this template' section
that documents the pattern, with the astrology skill as the worked
example. Also calls out the constraints surfaced during live testing:
no nested archives (.whl/.zip/.tar/.tgz/.gz are silently skipped) and
display_title uniqueness on re-upload (issue #35 follow-up to fix
via Versions.New).
Co-authored-by: Cursor <cursoragent@cursor.com>
Surfaced during live testing of the astrology skill upload:
- ~/.claude/skills/foo is canonically a symlink to the real repo. The sync's e.IsDir() check returned false on symlink entries; switch to os.Stat (follows symlinks) and resolve via filepath.EvalSymlinks before walking, since filepath.Walk does not follow symlinks itself. Preserve the visible folder name in the uploaded paths so the skill still appears as 'foo' to Anthropic even when the symlink target is named differently.
- Anthropic's API rejects 'Skill cannot contain nested zip files'. Skip .whl, .zip, .tar, .tgz, .gz extensions in openSkillFiles so the rest of the skill still uploads. The astrology skill ships pyswisseph wheels under scripts/wheels/ that triggered this.
Co-authored-by: Cursor <cursoragent@cursor.com>
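The two fixes in this commit can be sketched together as below. It is an illustration under assumptions, not the actual openSkillFiles: `collectSkillFiles`, `isNestedArchive`, and `must` are hypothetical names, and it uses filepath.WalkDir (which, like filepath.Walk, does not follow symlinks) after resolving the root.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
	"strings"
)

// isNestedArchive reports whether a file has one of the skipped extensions.
func isNestedArchive(name string) bool {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".whl", ".zip", ".tar", ".tgz", ".gz":
		return true
	}
	return false
}

// collectSkillFiles returns upload paths rooted at the visible folder name,
// even when dir is a symlink to a differently named target.
func collectSkillFiles(dir string) ([]string, error) {
	info, err := os.Stat(dir) // follows symlinks, unlike DirEntry.IsDir
	if err != nil {
		return nil, err
	}
	if !info.IsDir() {
		return nil, fmt.Errorf("%s is not a directory", dir)
	}
	visible := filepath.Base(dir)
	resolved, err := filepath.EvalSymlinks(dir)
	if err != nil {
		return nil, err
	}
	var out []string
	err = filepath.WalkDir(resolved, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || isNestedArchive(p) {
			return nil // keep walking; nested archives are skipped, not fatal
		}
		rel, err := filepath.Rel(resolved, p)
		if err != nil {
			return err
		}
		out = append(out, path.Join(visible, filepath.ToSlash(rel)))
		return nil
	})
	return out, err
}

func must(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	tmp, err := os.MkdirTemp("", "skills")
	must(err)
	defer os.RemoveAll(tmp)

	target := filepath.Join(tmp, "real-repo")
	must(os.MkdirAll(filepath.Join(target, "scripts", "wheels"), 0o755))
	must(os.WriteFile(filepath.Join(target, "SKILL.md"), []byte("# demo\n"), 0o644))
	must(os.WriteFile(filepath.Join(target, "scripts", "wheels", "dep.whl"), nil, 0o644))

	link := filepath.Join(tmp, "foo") // visible name differs from the target
	must(os.Symlink(target, link))

	files, err := collectSkillFiles(link)
	fmt.Println(files, err)
}
```

Because the extension check uses only the final suffix, a name such as `bundle.tar.gz` is caught by the `.gz` case.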
closes the H1/H2/H3 + M1/M2/M3 findings from the forensic audit on PR #36 and three more constraints surfaced during live testing.

H1: display_title uniqueness on re-upload (audit + live)
Anthropic rejects Skills.New when a skill with that display_title already exists. Service.UploadDir now looks up the prior anthropic skill ID for that name and uses Beta.Skills.Versions.New on collision; only first uploads call Beta.Skills.New.

H2: newly-uploaded skills don't reach cached per-user agents
refreshAgentsAsync fires after every successful upload/delete: it enumerates every users.anthropic_agent_id and pushes the current skill list via Beta.Agents.Update. Fired-and-forgotten so upload latency stays low; failures are logged.

H3: Fiber's 4 MiB default body limit rejects wheel-bundled skills
Bumped fiber.Config.BodyLimit to 64 MiB (matches Anthropic's per-skill cap). Verified live with a 6 MiB upload that previously 413'd.

NEW: Cannot delete skill while versions exist
deleteAllVersions enumerates every version via the Versions service and deletes each before calling Skills.Delete. 404s on individual versions are swallowed.

NEW: SDK helper BetaManagedAgentsSkillParamsOfCustom omits required Type
Returns 400 'skills[0].type: Field required'. Wrapped in skills.SkillParams which sets Type to BetaManagedAgentsCustomSkillParamsTypeCustom. Provisioner refactored to use it.

NEW: Anthropic's cloud cannot reach loopback MCP URLs
createAgent now fails fast with an actionable hint pointing at ngrok/cloudflared instead of waiting for Anthropic to return a generic 400.

M1: silent truncation at 20-skill cap
AnthropicIDs now logs the names of skills it dropped.

M2: brittle 404 detection
Added is404 helper using errors.As(*anthropic.Error{}); replaces the strings.Contains heuristic in Delete + deleteAllVersions.

M3: bash-source brittleness in Makefile
Dropped the 'source backend/.env' incantation; the CLI uses godotenv.Load() so the Makefile target is one line now.

Bonus: Astrology binding sets NoCredentials: true so the demonstration tool doesn't trigger the credentials guard for credentialless plugins. skills-sync waits 2s before exiting so the async refresh goroutine can drain (was logging 'closed pool' in CLI output).
Co-authored-by: Cursor <cursoragent@cursor.com>
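The loopback fail-fast in createAgent could look roughly like the sketch below. The helper names `isLoopbackURL` and `checkMCPURL` are hypothetical, and treating `localhost` and `*.localhost` as loopback without a DNS lookup is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// isLoopbackURL reports whether the URL's host is localhost or a loopback IP.
// It does not resolve DNS, so a public name pointing at 127.0.0.1 is not caught.
func isLoopbackURL(raw string) (bool, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return false, err
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return false, fmt.Errorf("no host in URL %q", raw)
	}
	if host == "localhost" || strings.HasSuffix(host, ".localhost") {
		return true, nil
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback(), nil
}

// checkMCPURL fails fast with an actionable hint instead of letting the
// remote API reject the agent later with a generic error.
func checkMCPURL(raw string) error {
	loopback, err := isLoopbackURL(raw)
	if err != nil {
		return err
	}
	if loopback {
		return fmt.Errorf("MCP URL %s is loopback and unreachable from Anthropic's cloud; expose it through a tunnel such as ngrok or cloudflared", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"http://127.0.0.1:8080/mcp",
		"http://[::1]:9000/mcp",
		"https://example.ngrok.app/mcp",
	} {
		fmt.Printf("%s -> %v\n", raw, checkMCPURL(raw))
	}
}
```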
Skills can now ship their own MCP server via a `skill.yaml` file. The host discovers, health-checks, and wires these into the per-user agent alongside the built-in engagement server, so the agent gains the skill's tools without recompiling the backend.
- skill.yaml parsing: transport (http/stdio), url, command, image
- 00006_skill_mcp_servers.sql: persists the server config per skill with CASCADE DELETE on the parent skills row
- Health check: smoke-tests tools/list before marking healthy
- HealthyMCPServers / RecheckHealth queries for the provisioner and a future admin endpoint
- Provisioner (provision.go): builds the MCPServers list dynamically from engagement + healthy skill servers. Enforces the 10-server cap with a logged warning on overflow.
- docker-compose.skills.yml overlay for running skill sidecars
- docs/SKILLS.md documents the pattern, layout, and constraints
The in-process astrology demo stays as a working example; extracting it to its own repo + MCP server is tracked in #37 step 6.
Co-authored-by: Cursor <cursoragent@cursor.com>
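A minimal sketch of validating the four skill.yaml fields listed above. The struct and field names are hypothetical, YAML decoding is left out, and the rule that http requires a url is an assumption; only the "stdio without command is an error" rule is stated in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// MCPServerConfig (hypothetical) mirrors the skill.yaml fields named in this PR.
type MCPServerConfig struct {
	Transport string // "http" or "stdio"
	URL       string // assumed required for http
	Command   string // required for stdio per the audit below
	Image     string // optional container image for a sidecar
}

// Validate returns an error for configs the sync should refuse to persist.
func (c MCPServerConfig) Validate() error {
	switch c.Transport {
	case "http":
		if c.URL == "" {
			return errors.New("mcp_server: http transport requires url")
		}
	case "stdio":
		if c.Command == "" {
			return errors.New("mcp_server: stdio transport requires command")
		}
	default:
		return fmt.Errorf("mcp_server: unknown transport %q", c.Transport)
	}
	return nil
}

func main() {
	configs := []MCPServerConfig{
		{Transport: "http", URL: "http://astrology:8080/mcp"}, // illustrative URL
		{Transport: "stdio"},
	}
	for _, c := range configs {
		fmt.Printf("%+v -> %v\n", c, c.Validate())
	}
}
```

Returning the error, rather than logging it, lets the caller decide whether one bad skill should abort the whole sync or only be reported.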
…ep 6+8)
Remove internal/astrology/ and platforms/astrology.go from the template. The astrology compute now lives in teslashibe/astrology-skill as a standalone Go MCP server wrapping the real pyswisseph chart.py.
- Drop Astrology() from platforms.All()
- Update docker-compose.skills.yml with a real sidecar entry that builds from ../astrology-skill
- Delete skills/astrology/ (bundled slim skill); the canonical source is now the astrology-skill repo itself
Template ships zero domain-specific tools. All 14 social platforms stay in-process (shared auth/rate-limits/credential storage). New capabilities come from external skill repos with their own MCP servers.
Co-authored-by: Cursor <cursoragent@cursor.com>
Forensic Audit: PR #38 vs Issue #37
Audit of
1. Findings (severity-ordered, evidence first)
HIGH: Overflow
| AC | Status | Gap |
|---|---|---|
| skill.yaml parsing → upload + row with healthy=true | ✅ Implemented | None |
| No skill.yaml → Anthropic only, no row | ✅ Implemented | None |
| 2 healthy servers → 3 MCPServers + 3 MCPToolsets | ✅ Implemented | None |
| 11 servers over cap → oldest 10 attached + warning naming overflow | Fixed | Was only logging first overflow, break instead of continue |
| URL returns valid tools/list → healthy=true | ✅ Implemented | None |
| Unreachable URL → healthy=false + warning + row persisted | Fixed | Warning log wasn't explicit enough |
| Astrology sidecar → routes through skill MCP | ✅ Architecture correct | End-to-end test requires running astrology-skill repo |
| stdio without command → error returned | Fixed | Was swallowed as warning, not returned as error |
| MCP server crash → tool_error, other tools continue | ✅ Architecture correct | Anthropic platform behavior; separate MCPServer entries give isolation |
3. User Stories
- As a skill author, I want to ship my skill as a self-contained repo with its own MCP server (in any language), so that I can add compute capabilities without modifying the host binary.
- As a template forker, I want `make skills-sync` to discover skill.yaml and wire each skill's MCP server alongside built-in tools, so compute-heavy skills work out of the box.
- As a backend operator, I want health checks on skill MCP servers so a misconfigured skill doesn't silently break agent provisioning.
- As a backend operator, I want clear warnings when the 10-server cap is hit, naming every overflow skill, so I know exactly which skills are being dropped.
4. Acceptance Criteria (verified)
- Given valid skill.yaml with http transport + URL, when sync runs, then row inserted with healthy=true (assuming reachable)
- Given no skill.yaml, when sync runs, then Anthropic upload only, no row
- Given 2 healthy skill MCP servers, when new user provisions, then 3 MCPServers + 3 MCPToolsets
- Given 11 skill servers over cap, when provisioning, then oldest 10 attached + warning naming ALL overflow (FIXED)
- Given reachable URL with valid tools/list, when syncing, then healthy=true
- Given unreachable URL, when syncing, then healthy=false + explicit warning + row persisted (FIXED)
- Given astrology sidecar running, when agent calls tool, then routes through skill MCP server (architecture verified)
- Given stdio without command, then sync returns error + continues (FIXED)
- Given MCP server crashes mid-session, then tool_error for that tool, others continue (architecture verified)
5. Risks and Follow-up Actions
| # | Risk / Action | Severity | Status |
|---|---|---|---|
| 1 | `refreshAgentsAsync` doesn't push MCPServers to existing agents; `SetAgentRefreshHook` added but needs provisioner wiring | Medium | Hook added, provisioner method TODO |
| 2 | No unit tests for any MCP server code paths | Medium | Needs test suite |
| 3 | `healthCheck` uses `http.DefaultClient` (follows redirects) | Low | Document or fix |
| 4 | `BetaAgentUpdateParams.Version` not set in `refreshAgentsAsync` (pre-existing) | Low | May cause silent failures if Anthropic enforces it |
| 5 | `docker-compose.skills.yml` build context points to `../astrology-skill`, which requires the external repo to be cloned as a sibling | Low | Document in SKILLS.md or use image-pull pattern |
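For risk 3, one possible shape of a health check that does not follow redirects is sketched below. It is not the PR's healthCheck: it assumes the skill server answers a single JSON-RPC `tools/list` POST with a plain JSON body, whereas a real MCP server may require an initialize handshake or reply over SSE. `noRedirectClient` and `toolsListOK` are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// noRedirectClient hands 3xx responses back to the caller instead of following them.
var noRedirectClient = &http.Client{
	Timeout: 5 * time.Second,
	CheckRedirect: func(*http.Request, []*http.Request) error {
		return http.ErrUseLastResponse
	},
}

// toolsListOK reports whether body is a JSON-RPC success carrying a tools array.
// An empty array still counts as healthy; a missing array or an error member does not.
func toolsListOK(body []byte) bool {
	var resp struct {
		Result *struct {
			Tools []json.RawMessage `json:"tools"`
		} `json:"result"`
		Error json.RawMessage `json:"error"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return false
	}
	return resp.Result != nil && resp.Result.Tools != nil && len(resp.Error) == 0
}

// healthCheck posts tools/list and treats any non-200 status (including a
// redirect, which is no longer followed) as unhealthy.
func healthCheck(ctx context.Context, endpoint string) (bool, error) {
	payload := []byte(`{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return false, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json")
	res, err := noRedirectClient.Do(req)
	if err != nil {
		return false, err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return false, fmt.Errorf("tools/list returned HTTP %d", res.StatusCode)
	}
	body, err := io.ReadAll(io.LimitReader(res.Body, 1<<20)) // cap the read at 1 MiB
	if err != nil {
		return false, err
	}
	return toolsListOK(body), nil
}

func main() {
	// Illustrative endpoint; nothing is expected to listen here.
	healthy, err := healthCheck(context.Background(), "http://127.0.0.1:9/mcp")
	fmt.Println(healthy, err)
}
```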
Assumptions
- Anthropic's `BetaAgentUpdateParams` tolerates `Version: 0` (existing behavior, not verified)
- Anthropic Managed Agents isolate MCP server failures per-toolset (architecture assumption)
- `created_at ASC` ordering in `HealthyMCPServers` matches the issue's "oldest" semantics
Unknowns
- Whether Anthropic ties conversation history to the agent ID (affects re-provisioning strategy)
- Whether `BetaAgentUpdateParamsToolUnion` is structurally compatible with `BetaAgentNewParamsToolUnion` for the refresh hook implementation
- HIGH: overflow break→continue so ALL overflow skill MCP servers are named in the warning, not just the first
- MEDIUM: skill.yaml validation errors (e.g. stdio without command) now returned from UploadDir so SyncDirs surfaces them as errors
- MEDIUM: added SetAgentRefreshHook on skills.Service so the provisioner can push MCPServers changes to existing agents
- LOW: explicit WARNING log when health check marks a server unhealthy
- LOW: RecheckHealth now logs individual scan/update/unhealthy errors instead of silently continuing
Co-authored-by: Cursor <cursoragent@cursor.com>
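The break→continue fix can be illustrated with the sketch below. `SkillServer` and `capServers` are hypothetical names, the input is assumed to arrive oldest first (the `created_at ASC` ordering mentioned in the audit), and whether the built-in engagement server counts toward the cap is left open here by passing the limit in as a parameter.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// SkillServer (hypothetical) is one healthy skill MCP server row.
type SkillServer struct {
	Skill string
	URL   string
}

// capServers keeps the first limit servers and returns the names of all the rest.
func capServers(servers []SkillServer, limit int) (attached []SkillServer, overflow []string) {
	for _, s := range servers {
		if len(attached) >= limit {
			overflow = append(overflow, s.Skill)
			continue // a break here would record only the first dropped skill
		}
		attached = append(attached, s)
	}
	return attached, overflow
}

func main() {
	var servers []SkillServer
	for i := 1; i <= 12; i++ {
		servers = append(servers, SkillServer{Skill: fmt.Sprintf("skill-%02d", i)})
	}
	attached, overflow := capServers(servers, 10)
	if len(overflow) > 0 {
		log.Printf("WARNING: MCP server cap reached, attached %d, dropped: %s",
			len(attached), strings.Join(overflow, ", "))
	}
}
```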
Managed Agents emit session.status_idle between tool-call rounds (while waiting for MCP results) as well as after the agent's final text. The previous code exited on any status_idle after seeing activity, which closed the stream before tool results and the final reading arrived.

Track pendingTools: tool_use increments, tool_result decrements. Only forward status_idle as 'done' when pendingTools == 0 and at least one text/tool event was seen this turn.

Also add per-event debug logging (agent stream [session]: type=...) to diagnose Managed Agents Beta event ordering issues.
Co-authored-by: Cursor <cursoragent@cursor.com>
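The pendingTools rule described above can be sketched as a small state machine. The `turnTracker` type is hypothetical, and the event strings are simplified labels taken from this commit message, not necessarily the exact type names the Managed Agents stream emits.

```go
package main

import "fmt"

// turnTracker (hypothetical) decides when a status_idle event really ends the turn.
type turnTracker struct {
	pendingTools int  // tool_use events still waiting for a tool_result
	sawActivity  bool // at least one text or tool event seen this turn
}

// observe consumes one event type and reports whether the turn is done.
func (t *turnTracker) observe(eventType string) bool {
	switch eventType {
	case "text":
		t.sawActivity = true
	case "tool_use":
		t.pendingTools++
		t.sawActivity = true
	case "tool_result":
		if t.pendingTools > 0 {
			t.pendingTools--
		}
		t.sawActivity = true
	case "session.status_idle":
		// Idle between tool rounds must not close the stream.
		return t.pendingTools == 0 && t.sawActivity
	}
	return false
}

func main() {
	tracker := &turnTracker{}
	events := []string{
		"text", "tool_use", "session.status_idle",
		"tool_result", "text", "session.status_idle",
	}
	for _, ev := range events {
		fmt.Printf("%-20s done=%v\n", ev, tracker.observe(ev))
	}
}
```

In this sequence only the second idle event ends the turn, because the first one arrives while a tool call is still outstanding.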
Closes #37.
Skills can now ship their own MCP server via a `skill.yaml` file. The host discovers, health-checks, and wires these into the per-user agent alongside the built-in engagement server, with no recompilation required.
Depends on #36 (base skills pipeline). This branch merges `feat/skills` first, then adds the external MCP layer.
Summary
- `skill.yaml` parsing: `transport` (http/stdio), `url`, `command`, `image`
- `00006_skill_mcp_servers` table with CASCADE DELETE on the parent `skills` row
- Health check: smoke-tests `tools/list` against the URL; marks `healthy = true/false`
- Provisioner builds the `MCPServers` list dynamically: `engagement` + healthy skill servers, capped at 10
- `docker-compose.skills.yml` overlay for sidecar skill containers
- `docs/SKILLS.md` documents the external MCP pattern, folder layout, and constraints
What a skill folder looks like now
Test plan
- `go build ./...`
- `make db-reset && make up`: confirm `00006_skill_mcp_servers` migration runs
- `skill.yaml` containing `mcp_server.transport: http` pointing at a running MCP server → verify `skill_mcp_servers` row with `healthy = true`
- No `skill.yaml` → verify no `skill_mcp_servers` row (existing behavior unchanged)
- `skill.yaml` pointing at an unreachable URL → verify `healthy = false`, row persisted
Made with Cursor