fix: disambiguate agentv eval skill triggers from skill-creator

## Objective

Update AgentV's eval-related skill and agent descriptions so they are explicitly disambiguated from Anthropic's skill-creator when both are loaded in the same Claude session. Without this, generic requests like "run evals on my skill" or "optimize my skill" are ambiguous, because both systems claim to handle them.

## Architecture Boundary

external-first (frontmatter changes in `plugins/agentv-dev/skills/` and `plugins/agentv-dev/agents/`)

## The Problem

**Skill-creator description:**
> Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.

**AgentV eval-orchestrator description:**
> Run AgentV evaluations by orchestrating eval subcommands. Use this skill when asked to run evals, evaluate an agent, test prompt quality using agentv, or run Agent Skills evals.json files.

**AgentV optimizer description:**
> Optimize agent prompts through evaluation-driven refinement.

Overlapping trigger keywords: "run evals", "optimize", "test", "benchmark", "skill performance". A request such as "run evals on my skill" or "optimize my agent" matches both systems, and Claude has no signal for which to prefer.
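The overlap can be checked mechanically against the three descriptions quoted above. The following is a minimal sketch, not part of AgentV or skill-creator: the helper names (`keywords_in`, `shared_keywords`) are hypothetical, and matching is naive case-insensitive substring search, which only approximates how Claude actually selects a skill.

```python
# Trigger keywords listed in this issue.
KEYWORDS = ["run evals", "optimize", "test", "benchmark", "skill performance"]

# Descriptions copied from the quotes above.
SKILL_CREATOR = (
    "Create new skills, modify and improve existing skills, and measure skill "
    "performance. Use when users want to create a skill from scratch, edit, or "
    "optimize an existing skill, run evals to test a skill, benchmark skill "
    "performance with variance analysis, or optimize a skill's description for "
    "better triggering accuracy."
)
AGENTV = {
    "eval-orchestrator": (
        "Run AgentV evaluations by orchestrating eval subcommands. Use this skill "
        "when asked to run evals, evaluate an agent, test prompt quality using "
        "agentv, or run Agent Skills evals.json files."
    ),
    "optimizer": "Optimize agent prompts through evaluation-driven refinement.",
}


def keywords_in(description, keywords=KEYWORDS):
    """Return the keywords that occur in the description (case-insensitive)."""
    text = description.lower()
    return [kw for kw in keywords if kw in text]


def shared_keywords(a, b):
    """Return keywords present in both descriptions, in KEYWORDS order."""
    in_b = set(keywords_in(b))
    return [kw for kw in keywords_in(a) if kw in in_b]


# Only the two AgentV descriptions quoted in this issue are checked here.
overlaps = {name: shared_keywords(desc, SKILL_CREATOR) for name, desc in AGENTV.items()}
for name, shared in overlaps.items():
    print(f"{name}: {shared}")
```

Under these assumptions the quoted eval-orchestrator description shares "run evals" and "test" with skill-creator, and the optimizer description shares "optimize". The remaining keywords would need the other AgentV skill descriptions, which are not quoted here, to be checked.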

## Disambiguation Strategy

The natural boundary is:
- **Skill-creator** = creating/improving Claude Code skills (SKILL.md files, `.skill` packages, skill description triggering)
- **AgentV** = evaluating AI agent output quality (EVAL.yaml, evals.json, `agentv eval run`, `agentv compare`)

AgentV's skill descriptions should make this boundary explicit by:
1. Including "AgentV" in trigger phrases (e.g., "Use when running AgentV evaluations" not just "Use when running evals")
2. Referencing AgentV-specific file types (EVAL.yaml, `.eval.yaml`, `agentv` CLI)
3. Adding a DO NOT TRIGGER clause for skill-creator's domain ("Do not use for creating or modifying SKILL.md files, packaging skills, or optimizing skill trigger descriptions")
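The three points above could be turned into a simple lint over a description string. This is a sketch under stated assumptions: `check_description` and the regular expressions are hypothetical, and `REVISED` is one illustrative wording, not the wording this issue mandates (exact wording is left to the implementer).

```python
import re

# Patterns for AgentV-specific artifacts named in this issue (assumed spellings).
ARTIFACT_PATTERNS = (r"eval\.yaml", r"\bagentv (eval|compare)\b")


def check_description(description):
    """Check a skill description against the three disambiguation points."""
    text = description.lower()
    return {
        # 1. "AgentV" is named somewhere in the description (naive check).
        "names_agentv": "agentv" in text,
        # 2. An AgentV-specific file type or CLI command is referenced.
        "references_artifact": any(re.search(p, text) for p in ARTIFACT_PATTERNS),
        # 3. An explicit exclusion clause is present.
        "has_exclusion": re.search(r"do not (use|trigger)", text) is not None,
    }


# Current eval-orchestrator description, as quoted in "The Problem".
CURRENT = (
    "Run AgentV evaluations by orchestrating eval subcommands. Use this skill "
    "when asked to run evals, evaluate an agent, test prompt quality using "
    "agentv, or run Agent Skills evals.json files."
)

# Hypothetical revised wording, for illustration only.
REVISED = (
    "Run AgentV evaluations with the agentv CLI (`agentv eval run`, "
    "`agentv compare`). Use when running AgentV evaluations against EVAL.yaml "
    "or evals.json files. Do not use for creating or modifying SKILL.md files, "
    "packaging skills, or optimizing skill trigger descriptions."
)

current_result = check_description(CURRENT)
revised_result = check_description(REVISED)
print(current_result)
print(revised_result)
```

With these patterns the current description passes only the first check, while the illustrative revision passes all three. A check like this says nothing about how Claude weighs descriptions at runtime, so it complements rather than replaces manual testing with both plugins loaded.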

## Design Latitude

- Choose exact wording for disambiguation
- May add a brief "When to use this vs skill-creator" note in descriptions or leave it implicit via trigger keywords
- Choose whether to add DO NOT TRIGGER patterns (explicit exclusion) or rely on positive trigger keywords being specific enough
- May update all eval-related skills or only the ones with highest conflict risk (eval-orchestrator, optimizer)

## Skills and Agents to Update

High conflict risk (must update):
- `plugins/agentv-dev/skills/agentv-eval-orchestrator/SKILL.md` — "run evals" conflicts directly
- `plugins/agentv-dev/skills/agentv-optimizer/SKILL.md` — "optimize" conflicts directly

Medium conflict risk (should update):
- `plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md` — "create eval files" could be confused with skill-creator's eval authoring
- `plugins/agentv-dev/skills/agentv-trace-analyst/SKILL.md` — "benchmark" keyword overlaps

Low conflict risk (consider):
- Agent descriptions (`eval-judge.md`, `eval-candidate.md`) — only dispatched by skills, not triggered directly

## Acceptance Signals

- A user with both `agentv-dev` and `anthropics/skills` (skill-creator) loaded can say "run evals on my skill" and Claude can distinguish which system to use based on context (presence of EVAL.yaml → AgentV, presence of SKILL.md evals → skill-creator)
- AgentV skill descriptions reference AgentV-specific artifacts (EVAL.yaml, `agentv` CLI commands) as trigger indicators
- No ambiguous trigger keyword matches remain for the high-conflict-risk skills

## Non-Goals

- Changing skill-creator's frontmatter (that's Anthropic's repo)
- Adding a router skill that asks "which eval system?"
- Changing AgentV's skill names
- Modifying skill behavior (only descriptions/frontmatter)

## Related

- [Anthropic skill-creator SKILL.md](https://github.com/anthropics/skills/blob/main/skills/skill-creator/SKILL.md) — the competing trigger description

