two production skills from noticed for disciplined, benchmark-driven optimization.
| skill | what it optimizes |
|---|---|
| query-optimization-lab | slow / OOMing database queries (ClickHouse, Postgres) — local benchmark loop |
| agent-eval-lab | low-scoring LLM agent evals — onboarding evals + agent live evals |
both skills share one loop: measure, hypothesize, change one thing, re-measure, keep or discard. the judge is immutable. the benchmark decides.
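that loop is mechanical enough to sketch. a minimal Python illustration of its shape — `benchmark` and `changes` are hypothetical stand-ins for your real scripts, not anything these skills ship:

```python
# sketch of the shared loop: measure, change one thing, re-measure, keep or discard.
# `benchmark` and the entries of `changes` are hypothetical callables.

def optimize(variant, benchmark, changes, metric="elapsed_ns"):
    """Try each change in isolation; keep it only if the chosen metric improves."""
    best = variant
    baseline = benchmark(best)
    for change in changes:
        candidate = change(best)        # change exactly one thing
        result = benchmark(candidate)   # re-measure on the same dataset
        if result[metric] < baseline[metric]:
            best, baseline = candidate, result  # keep
        # else: discard — the benchmark decides
    return best, baseline
```

each candidate is measured against the current best, so a discarded change never contaminates later experiments.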
see WORKSHOP.md for the full guide.
```
# install via Claude Code plugin marketplace
/plugin marketplace add noticedso/noticed-labs
/plugin install noticed-labs@noticed-labs
```

or clone and point Claude Code at the local path:

```
git clone https://github.com/noticedso/noticed-labs ~/code/noticed-labs
/plugin marketplace add ~/code/noticed-labs
/plugin install noticed-labs@noticed-labs
```

once installed, the skills auto-trigger when Claude Code sees a slow query, an OOMing pipeline, a failing eval scenario, or a regression in agent behavior.
```
noticed-labs/
├── .claude-plugin/
│   ├── plugin.json            plugin metadata (Claude Code reads this)
│   └── marketplace.json       makes this repo a one-plugin marketplace
├── skills/
│   ├── query-optimization-lab/
│   │   ├── SKILL.md           the skill (frontmatter + main loop)
│   │   ├── WORKFLOW.md        step-by-step, file by file
│   │   ├── PATTERNS.md        proven query rewrites (probe-then-hydrate, etc.)
│   │   └── EXPERIMENTS.md     experiment log discipline
│   └── agent-eval-lab/
│       ├── SKILL.md           onboarding evals + agent live evals
│       ├── WORKFLOW.md        judge issues -> source files -> change -> re-run
│       ├── PATTERNS.md        proven prompt / extraction fixes
│       └── EXPERIMENTS.md     flakiness protocol for LLM non-determinism
├── README.md
├── WORKSHOP.md                workshop guide (the talk this repo is built around)
└── LICENSE
```
triggers when: a query is slow, memory-heavy, timing out, causing socket hang ups, or scanning too many rows.
the loop:
- find the hot query (read prod `system.query_log`, read-only)
- reproduce it locally (`npm run clickhouse:up`)
- benchmark the current shape — `read_rows`, `read_bytes`, `memory_usage`, `elapsed_ns`
- make one targeted query-shape change
- benchmark again on the same dataset
- keep only if a bottleneck metric improves and correctness holds
- only after the query is lighter, revisit concurrency
local-only is hard-coded into the skill: production is read-only for diagnosis.
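the keep/discard rule can be sketched as a comparison of two benchmark stat dicts. the metric names below match ClickHouse's `system.query_log` columns; treating a regression in any other bottleneck metric as a discard is an extra assumption of this sketch, not something the skill mandates:

```python
# sketch of the keep/discard decision on ClickHouse benchmark stats.

BOTTLENECK_METRICS = ("read_rows", "read_bytes", "memory_usage", "elapsed_ns")

def keep_change(baseline: dict, candidate: dict, results_match: bool) -> bool:
    """Keep only if a bottleneck metric improves, none regress, and correctness holds."""
    if not results_match:
        return False  # correctness first: same rows back, or the change is discarded
    improved = any(candidate[m] < baseline[m] for m in BOTTLENECK_METRICS)
    regressed = any(candidate[m] > baseline[m] for m in BOTTLENECK_METRICS)
    return improved and not regressed
```

note that a faster-but-wrong rewrite fails immediately: `results_match` is checked before any metric is even looked at.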
triggers when: eval scenarios fail, judge scores drop, onboarding regresses, or you're iterating on system prompts / persona overlays / extraction prompts.
covers two complementary eval systems:
- onboarding evals — scripted user messages, 6 fixed dimensions, CSV reports
- agent live evals — user-simulator LLM, mission-driven success criteria, JSON reports
the judge is immutable. you improve the agent's behavior, not the grading.
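a toy sketch of what "the judge is immutable" means in code: the judge function is frozen, and iteration only ever swaps the agent side. `run_agent` and the scoring checks here are invented for illustration — the real judges live in the eval scripts:

```python
# sketch: the judge is fixed; only the agent variants change.

def judge(transcript: str) -> float:
    """Immutable: scores a transcript against fixed criteria. Never edited."""
    checks = ["greeted the user", "asked a clarifying question", "confirmed next steps"]
    return sum(c in transcript for c in checks) / len(checks)

def pick_best_prompt(prompts, run_agent):
    """Change the agent (the prompt), re-run, keep the higher-scoring variant."""
    scored = [(judge(run_agent(p)), p) for p in prompts]
    return max(scored)  # (best_score, best_prompt)
```

if a scenario keeps failing, the fix is a better prompt or extraction step — editing `judge` to make the score go up is exactly the move the skill forbids.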
both SKILL.md files reference noticed-specific paths (`apps/noticed-agent/scripts/...`, `npm run eval:agent`, etc.). that's the point — skills are most useful when they know your repo's exact layout.
to fork:
- copy the two skill folders into your own plugin or marketplace repo
- rewrite the Repo Defaults section in each `SKILL.md` to point at your benchmark scripts, eval files, and migration paths
- keep the Core Loop, Keep/Discard Rules, and Output Format sections — that discipline is portable
- update `description` in the frontmatter so your repo's queries / evals trigger the skill
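for example, a rewritten `description` in a forked SKILL.md frontmatter might look like this — the field names follow the standard Agent Skills frontmatter, but the wording is purely illustrative:

```yaml
---
name: query-optimization-lab
description: >
  use when a ClickHouse or Postgres query in this repo is slow, OOMing,
  or scanning too many rows. benchmarks locally before touching prod.
---
```

Claude Code matches against `description` when deciding whether to load a skill, so naming your databases and failure modes there is what makes auto-triggering work.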
WORKSHOP.md walks through this fork-and-adapt process step by step.
this repo is structured to be installable from any of the major Claude Code skill / plugin marketplaces:
| marketplace | how to submit |
|---|---|
| Anthropic plugins-official | external plugin via the plugin directory submission form |
| obra/superpowers-marketplace | PR adding an entry to marketplace.json with source.url |
| claudemarketplaces.com | community directory — indexed automatically |
| skillsmp.com | submit via their listing flow |
| awesome-claude-skills | PR adding the repo to the README |
we recommend self-hosting as a personal marketplace first (this repo's .claude-plugin/marketplace.json), then submitting to one of the curated marketplaces above once the skills are battle-tested.
- Anthropic Agent Skills docs
- Anthropic public skills repo
- Claude Code plugin marketplaces guide
- obra/superpowers methodology
MIT — see LICENSE