two production skills from noticed for disciplined, benchmark-driven optimization.
| skill | what it optimizes |
|---|---|
| query-optimization-lab | slow / OOMing database queries (ClickHouse, Postgres) — local benchmark loop |
| agent-eval-lab | low-scoring LLM agent evals — onboarding evals + agent live evals |
both skills share one loop: measure, hypothesize, change one thing, re-measure, keep or discard. the judge is immutable. the benchmark decides.
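that loop is mechanical enough to sketch. a minimal Python illustration of its shape — `benchmark` and `changes` are hypothetical stand-ins for your real scripts, not anything these skills ship:

```python
# sketch of the shared loop: measure, change one thing, re-measure, keep or discard.
# `benchmark` and the entries of `changes` are hypothetical callables.

def optimize(variant, benchmark, changes, metric="elapsed_ns"):
    """Try each change in isolation; keep it only if the chosen metric improves."""
    best = variant
    baseline = benchmark(best)
    for change in changes:
        candidate = change(best)        # change exactly one thing
        result = benchmark(candidate)   # re-measure on the same dataset
        if result[metric] < baseline[metric]:
            best, baseline = candidate, result  # keep
        # else: discard — the benchmark decides
    return best, baseline
```

each candidate is measured against the current best, so a discarded change never contaminates later experiments.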
see WORKSHOP.md for the full guide.
```
# install via Claude Code plugin marketplace
/plugin marketplace add noticedso/noticed-labs
/plugin install noticed-labs@noticed-labs
```

or clone and point Claude Code at the local path:

```
git clone https://github.com/noticedso/noticed-labs ~/code/noticed-labs
/plugin marketplace add ~/code/noticed-labs
/plugin install noticed-labs@noticed-labs
```

once installed, the skills auto-trigger when Claude Code sees a slow query, an OOMing pipeline, a failing eval scenario, or a regression in agent behavior.
```
noticed-labs/
├── .claude-plugin/
│   ├── plugin.json            plugin metadata (Claude Code reads this)
│   └── marketplace.json       makes this repo a one-plugin marketplace
├── skills/
│   ├── query-optimization-lab/
│   │   ├── SKILL.md           the skill (frontmatter + main loop)
│   │   ├── WORKFLOW.md        step-by-step, file by file
│   │   ├── PATTERNS.md        proven query rewrites (probe-then-hydrate, etc.)
│   │   └── EXPERIMENTS.md     experiment log discipline
│   └── agent-eval-lab/
│       ├── SKILL.md           onboarding evals + agent live evals
│       ├── WORKFLOW.md        judge issues -> source files -> change -> re-run
│       ├── PATTERNS.md        proven prompt / extraction fixes
│       └── EXPERIMENTS.md     flakiness protocol for LLM non-determinism
├── README.md
├── WORKSHOP.md                workshop guide (the talk this repo is built around)
└── LICENSE
```
triggers when: a query is slow, memory-heavy, timing out, causing socket hang ups, or scanning too many rows.
the loop:
- find the hot query (read prod `system.query_log`, read-only)
- reproduce it locally (`npm run clickhouse:up`)
- benchmark the current shape — `read_rows`, `read_bytes`, `memory_usage`, `elapsed_ns`
- make one targeted query-shape change
- benchmark again on the same dataset
- keep only if a bottleneck metric improves and correctness holds
- only after the query is lighter, revisit concurrency
local-only is hard-coded into the skill: production is read-only for diagnosis.
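the keep/discard rule can be sketched as a comparison of two benchmark stat dicts. the metric names below match ClickHouse's `system.query_log` columns; treating a regression in any other bottleneck metric as a discard is an extra assumption of this sketch, not something the skill mandates:

```python
# sketch of the keep/discard decision on ClickHouse benchmark stats.

BOTTLENECK_METRICS = ("read_rows", "read_bytes", "memory_usage", "elapsed_ns")

def keep_change(baseline: dict, candidate: dict, results_match: bool) -> bool:
    """Keep only if a bottleneck metric improves, none regress, and correctness holds."""
    if not results_match:
        return False  # correctness first: same rows back, or the change is discarded
    improved = any(candidate[m] < baseline[m] for m in BOTTLENECK_METRICS)
    regressed = any(candidate[m] > baseline[m] for m in BOTTLENECK_METRICS)
    return improved and not regressed
```

note that a faster-but-wrong rewrite fails immediately: `results_match` is checked before any metric is even looked at.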
triggers when: eval scenarios fail, judge scores drop, onboarding regresses, or you're iterating on system prompts / persona overlays / extraction prompts.
covers two complementary eval systems:
- onboarding evals — scripted user messages, 6 fixed dimensions, CSV reports
- agent live evals — user-simulator LLM, mission-driven success criteria, JSON reports
the judge is immutable. you improve the agent's behavior, not the grading.
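a toy sketch of what "the judge is immutable" means in code: the judge function is frozen, and iteration only ever swaps the agent side. `run_agent` and the scoring checks here are invented for illustration — the real judges live in the eval scripts:

```python
# sketch: the judge is fixed; only the agent variants change.

def judge(transcript: str) -> float:
    """Immutable: scores a transcript against fixed criteria. Never edited."""
    checks = ["greeted the user", "asked a clarifying question", "confirmed next steps"]
    return sum(c in transcript for c in checks) / len(checks)

def pick_best_prompt(prompts, run_agent):
    """Change the agent (the prompt), re-run, keep the higher-scoring variant."""
    scored = [(judge(run_agent(p)), p) for p in prompts]
    return max(scored)  # (best_score, best_prompt)
```

if a scenario keeps failing, the fix is a better prompt or extraction step — editing `judge` to make the score go up is exactly the move the skill forbids.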
both SKILL.md files reference noticed-specific paths (`apps/noticed-agent/scripts/...`, `npm run eval:agent`, etc.). that's the point — skills are most useful when they know your repo's exact layout.
to fork:
- copy the two skill folders into your own plugin or marketplace repo
- rewrite the Repo Defaults section in each `SKILL.md` to point at your benchmark scripts, eval files, and migration paths
- keep the Core Loop, Keep/Discard Rules, and Output Format sections — that discipline is portable
- update `description` in the frontmatter so your repo's queries / evals trigger the skill
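for example, a rewritten `description` in a forked SKILL.md frontmatter might look like this — the field names follow the standard Agent Skills frontmatter, but the wording is purely illustrative:

```yaml
---
name: query-optimization-lab
description: >
  use when a ClickHouse or Postgres query in this repo is slow, OOMing,
  or scanning too many rows. benchmarks locally before touching prod.
---
```

Claude Code matches against `description` when deciding whether to load a skill, so naming your databases and failure modes there is what makes auto-triggering work.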
WORKSHOP.md walks through this fork-and-adapt process step by step.
this repo is structured to be installable from any of the major Claude Code skill / plugin marketplaces:
| marketplace | how to submit |
|---|---|
| Anthropic plugins-official | external plugin via the plugin directory submission form |
| obra/superpowers-marketplace | PR adding an entry to marketplace.json with source.url |
| claudemarketplaces.com | community directory — indexed automatically |
| skillsmp.com | submit via their listing flow |
| awesome-claude-skills | PR adding the repo to the README |
we recommend self-hosting as a personal marketplace first (this repo's .claude-plugin/marketplace.json), then submitting to one of the curated marketplaces above once the skills are battle-tested.
- Anthropic Agent Skills docs
- Anthropic public skills repo
- Claude Code plugin marketplaces guide
- obra/superpowers methodology
MIT — see LICENSE