feat(skill): exa-search — semantic web search + fetch_url decision rule (SAM-43) by spashii · Pull Request #70 · Dembrane/sam

spashii · 2026-05-24T16:35:11Z

What this enables

Sam learns when to reach for Exa (semantic URL discovery) vs fetch_url (read a URL you already have). The decision rule lives in the skill's when_to_use field so Sam sees it from the catalog without reading the body.

Consequences

Sam stops grepping the open web via fetch_url against URLs it doesn't have — that pattern doesn't actually work.
New cost-discipline rules in the skill body keep Exa queries lean: one query per task, refine don't paginate, includeDomains when the source family is known.
Dual-path: src/skills/exa-search/skill.md + .agents/skills/exa-search/SKILL.md symlink for OpenCode compat (SAM-45 pattern).

How to verify

pytest tests/runtime/test_source_integrity.py — 27 passed (the new skill picks up parametrized frontmatter + cron tests).
Next time Sam needs to find a doc page it doesn't have a URL for — catalog should surface this skill.

API shape verified

POST https://api.exa.ai/search, x-api-key header, numResults (camelCase), contents.highlights: true. Confirmed against https://exa.ai/docs/reference/search. My initial draft used num_results (snake_case), which would have failed at runtime.
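As an illustration of the request shape listed above, here is a minimal Python sketch. The skill itself issues the call from bash, so this is a translation and not the skill's actual code. The function names `build_exa_request` and `search`, the default of 5 results, and the `query` field name are assumptions for the example; the URL, the `x-api-key` header, `numResults`, `contents.highlights`, and `includeDomains` come from the description.

```python
import json
import os
import urllib.request

EXA_SEARCH_URL = "https://api.exa.ai/search"


def build_exa_request(query, api_key, num_results=5, include_domains=None):
    """Build (but do not send) a POST request for the Exa search endpoint."""
    payload = {
        "query": query,                    # assumed field name for the search text
        "numResults": num_results,         # camelCase, not num_results
        "contents": {"highlights": True},  # ask for snippets in the same response
    }
    if include_domains:
        # Narrow the search when the source family is already known.
        payload["includeDomains"] = list(include_domains)
    return urllib.request.Request(
        EXA_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def search(query, **kwargs):
    """Send the request. Needs network access and EXA_API_KEY in the environment."""
    request = build_exa_request(query, os.environ["EXA_API_KEY"], **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload shape checkable without spending a query.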

Tier

Tier 1 (skill + frontmatter). No runtime changes.

Closes SAM-43.

… decision rule

Operator added EXA_API_KEY to .env; this skill is the operating manual. The decision rule lives in 'when_to_use' so Sam sees it from the catalog without reading the skill body: have the URL → fetch_url; need to find URLs → exa-search → fetch_url on the chosen one.

API shape verified against https://exa.ai/docs/reference/search:
- POST https://api.exa.ai/search
- Auth via 'x-api-key' header (or Bearer)
- numResults param is camelCase (corrected from the initial draft)
- contents.highlights: true gives snippets without a separate /contents endpoint (which doesn't exist, confirmed via docs)

Cost discipline section is load-bearing: Exa charges per query, and fan-out is the most common token-trap. One query per task; refine don't paginate; includeDomains when the source family is known.

Dual-path: src/skills/exa-search/skill.md + .agents/skills/exa-search/SKILL.md symlink (SAM-45 dual-compat pattern from PR #66).

Closes SAM-43.
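The decision rule in this commit message can be sketched as a small function. This is only an illustration of the rule's logic; the function name `choose_tool` and the URL test via `urlparse` are assumptions, since in practice the rule is prose in the skill's `when_to_use` field and Sam applies it as an LLM.

```python
from urllib.parse import urlparse


def choose_tool(target):
    """Return the tool sequence for a request, following the when_to_use rule."""
    parsed = urlparse(target.strip())
    has_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    if has_url:
        # Already have the URL: read it directly.
        return ["fetch_url"]
    # No URL yet: discover candidates first, then read the chosen one.
    return ["exa-search", "fetch_url"]
```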

linear · 2026-05-24T16:35:14Z

SAM-43

…ized catalog test

Two pieces of skill-usage evaluation, both small:

1. Daily-maintenance §1 'Skill usage scan' subsection. Aggregates yesterday's audit log for read_file calls on src/skills/<name>/skill.md paths → per-skill discovery counts. Operator decision rule: consistent zero-reads with obvious triggers = catalog issue (refine frontmatter), obsolete skill (delete), or genuinely missed (note + watch). Counts land in §4 journal synthesis under a new '### Skill usage' sub-section so the trend is queryable.

2. Parametrized 'every skill in src/skills/<name>/' must appear in the assembled system prompt's catalog. Added at the eval-harness structural layer. Trigger: PR #70 shipped exa-search without any catalog-presence test. Under the prior single-skill check (test_skill_creator_visible_in_catalog), a new skill could ship invisible. Parametrize fixes it for every future skill automatically.

Plus .gitignore entry for mining/ so the blog scratch from session-jsonl mining doesn't keep leaking into PRs (the entry on the ask_operator branch in PR #71 hasn't merged yet).

Together (1) + (2) cover the operator's catalog-presence and discovery-observability questions on SAM-43. The deeper Opus-as-judge ('did Sam apply the right skill') stays as a separate follow-up.

Tests: 9 skills defended by the parametrize. 16 eval tests pass (was 8). Full suite: 143 passed locally.
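The skill usage scan in item 1 amounts to counting `read_file` calls per skill path. A minimal sketch follows, assuming the audit log is JSON Lines with `tool` and `path` fields; that format, and the function names, are hypothetical, because the commit message does not describe the log's schema.

```python
import json
import re
from collections import Counter

# Matches src/skills/<name>/skill.md and captures <name>.
SKILL_PATH = re.compile(r"src/skills/([^/]+)/skill\.md$")


def count_skill_reads(audit_lines):
    """Count read_file calls per skill from JSON Lines audit entries."""
    counts = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("tool") != "read_file":
            continue
        match = SKILL_PATH.search(entry.get("path", ""))
        if match:
            counts[match.group(1)] += 1
    return counts


def zero_read_skills(all_skills, counts):
    """Skills with no reads: candidates for the operator's decision rule."""
    return sorted(name for name in all_skills if counts[name] == 0)
```

A zero count only flags a skill for review. Deciding between catalog issue, obsolete skill, or genuinely missed remains a judgment call.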

## What this enables

Adds `EXA_API_KEY` to `infra/config.yaml`'s `secrets:` map. Terraform creates the GCP Secret Manager resource; the CI deploy binds it as a Cloud Run env var via the existing `--set-secrets` flag.

## Why

PR #70 shipped the `exa-search` skill. It calls `api.exa.ai` using `$EXA_API_KEY` from env. Without this binding, the skill loads in the catalog but the bash call would fail at runtime: Sam would see the missing env var and report it as a blocker.

## Runbook for the operator (you)

Two commands locally after merge, run from the repo root:

```bash
(cd infra && terraform apply)        # creates GCP Secret Manager resource (idempotent)
./infra/scripts/upload-secrets.sh    # uploads EXA_API_KEY value from local .env
```

Then the next deploy (any merge to main) picks it up.

## Tier

Tier 3 (`infra/`). One-line config addition. No code change.

## Test plan

- [x] Diff is one line: declarative config, no logic
- [ ] After merge + your two commands + next deploy: Sam invokes the exa-search skill in a test thread and the curl returns valid JSON (not 401)

Part of SAM-43.
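The failure mode described under "Why" (skill present, env var absent) can be checked before any query is sent. A minimal sketch, assuming a hypothetical helper `missing_secrets`; nothing in the PR says the runtime performs such a check.

```python
import os


def missing_secrets(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

For example, `missing_secrets(["EXA_API_KEY"])` returns an empty list once the secret is bound to the running service.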

## What this enables

Two pieces of skill-usage evaluation, addressing the SAM-43 questions on catalog verification + audit-log-based usage measurement:

### 1. Skill usage scan in daily-maintenance §1

New subsection that aggregates yesterday's audit log for `read_file` calls on `src/skills/<name>/skill.md` paths → per-skill discovery counts. Counts land in §4 journal synthesis under a new `### Skill usage` sub-section so the trend is queryable over time.

Decision rule when a skill has consistent zero reads: catalog issue (refine frontmatter), obsolete skill (delete), or genuinely missed (note + watch). Sam-as-LLM makes the call.

### 2. Parametrized catalog-presence test

`test_every_skill_visible_in_catalog` in `tests/eval/test_structural.py` parametrizes over every `src/skills/<name>/` directory. Asserts the skill name appears in the assembled system prompt. Adding a new skill automatically gets defended; no manual test addition needed.

**Trigger:** PR #70 shipped exa-search without any catalog-presence test. Under the prior single-skill check, a new skill could ship invisible. This parametrize fixes that for every future skill.

## Consequences

- Operator sees per-skill usage trends daily without doing any querying.
- Any future skill that doesn't appear in the catalog fails CI immediately (parametrize includes its case automatically).
- The deeper "did Sam apply the right skill" eval (Opus-as-judge) stays as a follow-up; it's a meta-eval rubric design problem, not a build task.

## What this doesn't cover

- Doesn't measure whether Sam applied the skill *correctly* once it was read. Just discovery.
- Doesn't fire alerts when a skill is consistently zero-used; that's the operator's daily-synthesis call.

## How to verify

- `pytest tests/eval/test_structural.py`: 16 passed (was 8); 9 new parametrized cases, one per skill.
- After merge + a real cron fire: §4 journal synthesis includes a `### Skill usage` block with per-skill counts.

## Bonus

`.gitignore` adds `mining/` so blog-scratch files from session-jsonl mining don't keep leaking into PRs (the entry on PR #71 hasn't merged yet).

## Tier

Tier 1 (skill prose + tests). No runtime changes.

Closes the catalog-verification + skill-usage-observability part of SAM-43.
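The catalog-presence check described in section 2 can be sketched as two small helpers. This is not the code in `tests/eval/test_structural.py`; the names `discover_skills` and `missing_from_catalog` are hypothetical, and the real test obtains the assembled system prompt from the runtime, not from a plain string.

```python
from pathlib import Path


def discover_skills(skills_root):
    """List skill names: one per directory directly under the skills root."""
    root = Path(skills_root)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def missing_from_catalog(skill_names, system_prompt):
    """Return skill names that do not appear in the assembled prompt text."""
    return [name for name in skill_names if name not in system_prompt]
```

In a pytest suite, the list from `discover_skills("src/skills")` could feed `pytest.mark.parametrize`, giving one test case per skill directory so that a newly added skill is covered without editing the test.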

## Why

PR #70 (merged) landed a single `.agents/skills/exa-search/SKILL.md` symlink as a preview of the OpenCode dual-path pattern, ahead of the runtime support that was supposed to come in via #66. Review of #66 found the SKILL.md files there aren't real symlinks (see #66 (comment)), so #66 needs rework. In the meantime this lone entry is:

- **Inconsistent**: the only `.agents/skills/` member on main, while every other skill lives only at `src/skills/<name>/skill.md`.
- **Inert**: `src/runtime/prompts.py::_build_skill_catalog` globs `src/skills/` only, so the `.agents/skills/` symlink isn't read by the daemon.

Remove it now to keep the tree consistent. Reintroduce the full dual-path set in one go when #66 is reworked with real symlinks + matching runtime support.

## Diff

One file: `.agents/skills/exa-search/SKILL.md` (symlink) deleted. `src/skills/exa-search/skill.md` (the real source) is untouched, so the skill itself still works exactly as on main today.

## How to verify

- `git ls-tree HEAD .agents/` → empty / no exa-search dir
- `pytest tests/runtime/test_source_integrity.py`: exa-search picked up via `src/skills/exa-search/skill.md` like every other skill

## Bypass note

Admin-merging to land before #66 is reworked; the change is a one-file deletion of an inert symlink.
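Whether a `SKILL.md` entry is a real symlink, the point raised in the review of #66, can be checked from the working tree. A minimal sketch using `pathlib`; the function name `classify_skill_entry` is hypothetical and is not part of the repository.

```python
from pathlib import Path


def classify_skill_entry(path):
    """Describe a dual-path entry: real symlink, broken symlink, file, or missing."""
    p = Path(path)
    if p.is_symlink():
        # exists() follows the link, so False here means a dangling target.
        return "symlink" if p.exists() else "broken symlink"
    if p.is_file():
        return "regular file"
    return "missing"
```

A file that merely contains the target path as text is reported as "regular file".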

spashii enabled auto-merge May 24, 2026 16:35

spashii mentioned this pull request May 24, 2026

infra: bind EXA_API_KEY secret for the exa-search skill (SAM-43) #72

Merged

2 tasks

spashii mentioned this pull request May 24, 2026

feat(skills): skill-usage observability + parametrized catalog test #73

Merged

spashii disabled auto-merge May 24, 2026 17:28

Merge branch 'main' into sam/exa-search-skill

89bd0e3

spashii merged commit 82760c0 into main May 24, 2026
2 checks passed

spashii deleted the sam/exa-search-skill branch May 24, 2026 17:29

spashii mentioned this pull request May 24, 2026

revert(skills): remove stray .agents/skills/exa-search symlink #74

Merged

spashii mentioned this pull request May 24, 2026

feat(runtime): support opencode skill structure with symlinks #66

Closed

spashii merged 2 commits into main from sam/exa-search-skill
