diff --git a/AGENTS.md b/AGENTS.md index 12a8831..13819e0 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -14,10 +14,10 @@ gitignored). | Directory | Audience | Purpose | |-----------|----------|---------| | **`.agents/skills/`** (`.claude/skills/`, `.cursor/skills/`) | Agents **developing** this repo | propose, plan-prompts, pr-open, pr-review | -| **`skills/tier-1/`** + **`skills/tier-2/`** (project root) | Agents **using** this tool on their own codebase | /callers, /routes, /explain-feature, /impact-of, etc. | +| **`skills/explore-codebase/`** (project root) | Agents **using** this tool on their own codebase | /explore-codebase — complete MCP operating manual | `.agents/` skills are loaded by the agent working *on* java-codebase-rag source -code. `skills/` are shipped to consumers — they instruct an agent to call the +code. `skills/` is shipped to consumers — it instructs an agent to call the MCP tools (`search`, `find`, `describe`, `neighbors`, `resolve`) against an indexed Java codebase. Do not mix the two: never import consumer skills into `.agents/skills/` or vice versa. @@ -55,7 +55,7 @@ when needed. - `docs/CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file map of what to edit when a target tree doesn't match defaults. - `tests/README.md` — testing philosophy. -- **`skills/tier-1/`** + **`skills/tier-2/`** — user-facing skills shipped to java-codebase-rag consumers. Tier 1 are single-intent listings (`/callers`, `/routes`, `/clients`, …); Tier 2 are multi-step workflows (`/explain-feature`, `/impact-of`, `/trace-request-flow`, `/mini-map`). Users opt in per tier. Developer workflow skills live in **`.agents/skills/`**, not here. +- **`skills/explore-codebase/`** — user-facing skill shipped to java-codebase-rag consumers. Single self-contained operating manual for the 5-tool MCP. Developer workflow skills live in **`.agents/skills/`**, not here. - **`propose/`** — design proposes. **In-flight** proposes live in **`propose/active/`**. 
**`propose/completed/`** — landed work and rationale. **List or search this tree** for current filenames; do not rely on enumerated diff --git a/README.md b/README.md index 4ab1d38..e431e0f 100644 --- a/README.md +++ b/README.md @@ -117,9 +117,9 @@ See [`mcp.json.example`](./mcp.json.example) for the same shape in `.mcp.json` ( Pick **one** of two options (not both — they cover the same navigation intents): -1. **[`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md)** (recommended for most) — standalone MCP operating manual. Copy-paste the `BEGIN`/`END` block into your project's `QWEN.md`, `CLAUDE.md`, or `AGENTS.md`. Contains: five-tool reference, `NodeFilter` / edge taxonomy, ontology glossary, recovery playbook, and inline slash-style aliases (`/callers`, `/callees`, `/routes`, etc.) as prompt templates. Self-contained — no external file dependencies. +1. **[`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md)** (recommended for most) — standalone MCP operating manual. Copy-paste the `BEGIN`/`END` block into your project's `QWEN.md`, `CLAUDE.md`, or `AGENTS.md`. Contains: five-tool reference, `NodeFilter` / edge taxonomy, ontology glossary, recovery playbook, and navigation patterns. Self-contained — no external file dependencies. -2. **[`skills/`](./skills/)** (for hosts with skill discovery) — 15 shipped `SKILL.md` files. If your MCP host supports skill discovery (Claude Code, Qwen Code, Cursor), the same navigation intents are available as discoverable `/` commands. Tier 1 = deterministic MCP chains (`/callers`, `/callees`, `/routes`, `/controllers`, `/clients`, `/producers`, `/handlers`, `/who-hits-route`, `/implements`, `/injects`, `/nl`). Tier 2 = bounded workflows (`/explain-feature`, `/impact-of`, `/trace-request-flow`, `/mini-map`). See [`skills/README.md`](./skills/README.md) for the full index. +2. **[`/explore-codebase`](./skills/explore-codebase/SKILL.md)** (for hosts with skill discovery) — single self-contained skill with the complete operating manual. 
If your MCP host supports skill discovery (Claude Code, Qwen Code, Cursor), load `/explore-codebase` to get the full tool reference, edge taxonomy, decision tree, and recovery playbook in one shot. Also: **[`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md)** — 7-phase agent-driven verification you run after indexing your real project. @@ -139,7 +139,7 @@ Full schemas, `NodeFilter` / `EdgeFilter` semantics, and the hints contract live ### Three-layer architecture -Layer 1 (storage) → Layer 2 (5 MCP tools) → Layer 3 (skills). Navigation skills in [`skills/`](./skills/) wrap the MCP tools into deterministic chains (Tier 1) and bounded workflows (Tier 2). See the [architecture diagram in `skills/README.md`](./skills/README.md#three-layer-architecture). +Layer 1 (storage) → Layer 2 (5 MCP tools) → Layer 3 (skill). The [`/explore-codebase`](./skills/explore-codebase/SKILL.md) skill provides the full operating manual for Layer 2. See the [architecture diagram in `skills/README.md`](./skills/README.md#three-layer-architecture). --- @@ -182,7 +182,7 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) | Environment variables, project YAML, graph ontology, brownfield overrides, ignore patterns. | | [`docs/JAVA-CODEBASE-RAG-CLI.md`](./docs/JAVA-CODEBASE-RAG-CLI.md) | CLI operator playbook: workflows, exit codes, env alignment. | | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. | -| [`skills/`](./skills/) | 15 navigation and workflow skills for hosts with skill discovery (alternative to copy-pasting AGENT-GUIDE). See [`skills/README.md`](./skills/README.md). | +| [`skills/`](./skills/) | Single `/explore-codebase` skill — complete MCP operating manual for hosts with skill discovery (alternative to copy-pasting AGENT-GUIDE). See [`skills/README.md`](./skills/README.md). 
| | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. | | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. | | [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). | diff --git a/docs/paper/paper.tex b/docs/paper/paper.tex index df71efa..aa1e31d 100644 --- a/docs/paper/paper.tex +++ b/docs/paper/paper.tex @@ -107,7 +107,7 @@ \section{Inspirations} \paragraph{Model Context Protocol.} The MCP standard \cite{anthropic2024mcp} fixed the impedance mismatch between agents and tools: a single transport, a single way to declare schemas, and a single way to bind tools to hosts (Claude Code, Cursor, Qwen Code, and others). Without MCP this report would describe a Claude-Code-only system. With MCP, a single Python server reaches every host the user already prefers. The standard does one thing well and stops. -\paragraph{The agent-skills layer.} Anthropic's agent skills \cite{anthropic2025skills} provided the missing piece between raw tool calls and agent reasoning: a skill is a slash-invokable, declaratively-described chain of tool calls that encodes a recurring intent ("trace this request flow", "show me callers of this method"). Skills are how a small fixed MCP surface grows into hundreds of usable agent intents without growing the tool count. We describe the planned skills layer briefly in \S\ref{sec:future} and defer its specification to a separate document; empirical testing showed that a comprehensive prose guide (mirrored as \texttt{docs/AGENT-GUIDE.md}) is sufficient for current weak-model performance, so the skills layer is not yet on the critical path. 
+\paragraph{The agent-skills layer.} Anthropic's agent skills \cite{anthropic2025skills} provided the missing piece between raw tool calls and agent reasoning: a skill is a slash-invokable, declaratively-described chain of tool calls that encodes a recurring intent ("trace this request flow", "show me callers of this method"). Skills are how a small fixed MCP surface grows into hundreds of usable agent intents without growing the tool count. Empirical testing showed that a single comprehensive skill (\texttt{/explore-codebase}) loaded at query time outperforms a large set of narrow per-intent skills, because the agent retains the full decision tree and recovery context rather than operating from a sliced subset. \paragraph{What we are not.} We do not claim novelty over GraphRAG, LightRAG, or LSP-backed tooling. We claim that a particular synthesis --- minimal MCP surface, typed property graph, three-primitive navigation model, agent-shaped affordances --- is the right shape for code intelligence at the agentic-development layer. The synthesis is the contribution. @@ -194,7 +194,7 @@ \subsection{Layer 3: reason (the agent)} Layer 3 is whatever MCP-compatible host the developer prefers --- Claude Code, Qwen Code, Cursor, or another runtime. The host loads the java-codebase-rag MCP server, sees the five tools, and the agent reasons over them. There is no logic in this layer that is specific to java-codebase-rag; the entire affordance is the five tools and a prose agent guide (\texttt{docs/AGENT-GUIDE.md}) that documents the canonical workflows --- forced reasoning preamble, decision tree, edge taxonomy, worked examples. -A planned addition (deferred) is a thin skills layer that turns recurring intents (\texttt{/callers}, \texttt{/routes}, \texttt{/explain-feature}) into one-line slash invocations that compile to MCP-call chains. 
Empirical testing on the target codebase showed that the prose guide alone is sufficient for current weak-model accuracy, so the skills layer is not yet implemented. +A single skill (\texttt{/explore-codebase}) wraps the full operating manual --- edge taxonomy, \texttt{NodeFilter} reference, decision tree, recovery playbook --- into one loadable prompt. Empirical testing on the target codebase showed that a comprehensive prose guide loaded as one skill outperforms a large set of narrow per-intent skills; the agent retains the full decision tree and recovery context rather than operating from a sliced subset. % ============================================================================= \section{Agent workflow} @@ -248,7 +248,7 @@ \section{Future work} \label{sec:future} {\sloppy -Three threads are open and prioritised. \textbf{(1) Real-codebase evaluation.} Testing on a large legacy Java microservice estate is in progress; once stable, we expect to publish accuracy numbers (intent $\to$ correct-tool-chain rate, end-to-end answer correctness against human labels) for the five-tool surface against weak (Qwen) and strong (Claude Sonnet 4.5) hosts. \textbf{(2) Skills layer.} A 13-skill set --- 10 single-call navigation skills (\texttt{/callers}, \texttt{/callees}, \texttt{/routes}, \texttt{/controllers}, \ldots) and 3 multi-step workflow skills (\texttt{/explain-feature}, \texttt{/impact-of}, \texttt{/trace-request-flow}) --- is designed and on hold until the prose-guide approach shows insufficient. \textbf{(3) Tier-2 incremental rebuilds.} Today the index rebuilds the affected modules; we want commit-level incremental rebuilds for sub-second index updates on large monorepos. +Three threads are open and prioritised. 
\textbf{(1) Real-codebase evaluation.} Testing on a large legacy Java microservice estate is in progress; once stable, we expect to publish accuracy numbers (intent $\to$ correct-tool-chain rate, end-to-end answer correctness against human labels) for the five-tool surface against weak (Qwen) and strong (Claude Sonnet 4.5) hosts. \textbf{(2) Lightweight skills.} A lightweight \texttt{/search-codebase} skill (search + find + shallow neighbors + host glob) is planned as a low-context-cost alternative to the comprehensive \texttt{/explore-codebase} skill for quick lookups. \textbf{(3) Tier-2 incremental rebuilds.} Today the index rebuilds the affected modules; we want commit-level incremental rebuilds for sub-second index updates on large monorepos. \par} We deliberately list \emph{no} item that would grow the MCP tool count without proof that the existing five tools cannot accommodate a real intent. diff --git a/skills/README.md b/skills/README.md index ebce504..123ef62 100644 --- a/skills/README.md +++ b/skills/README.md @@ -1,31 +1,41 @@ -# skills/ — Layer 3 navigation and workflow skills +# skills/ — RAG navigation skill for the java-codebase-rag MCP -High-level intents over the 5-tool MCP (`search` / `find` / `describe` / `neighbors` / `resolve`). Skills are agent-side prompt scaffolding — they are **not** a second MCP API and **not** CLI subcommands. +A single self-contained skill (`explore-codebase`) that provides the complete operating manual for the 5-tool MCP (`search` / `find` / `describe` / `neighbors` / `resolve`). Skills are agent-side prompt scaffolding — they are **not** a second MCP API and **not** CLI subcommands. -## Pick the tier you need +## When to use -Skills are organized by tier — load only what you use. +Load this skill when your agent needs to explore an indexed Java codebase: locate symbols, trace call chains, find HTTP/messaging routes, walk cross-service boundaries, or answer any structural question. 
+ +## Layout ``` skills/ - tier-1/ ← Navigation. 11 single-purpose skills. - tier-2/ ← Workflow. 4 multi-step skills that compose Tier 1 with bounds. + README.md ← this file + explore-codebase/SKILL.md ← complete MCP operating manual (standalone) ``` -- **Just want to list controllers/routes/clients?** Tier 1 is enough — `skills/tier-1/controllers`, `skills/tier-1/routes`, etc. -- **Need to trace a request, explain a feature, or analyze blast radius?** Tier 2 — `skills/tier-2/trace-request-flow`, etc. -- **Don't want skills at all?** Copy the block in `docs/AGENT-GUIDE.md` between `` and `` into your project's `AGENTS.md` / `CLAUDE.md`. Skills and the guide are **alternatives**, not complements — pick one. +## What's inside `explore-codebase` + +The skill is a single comprehensive prompt that includes: + +- **Five-tool reference** — `search`, `find`, `describe`, `neighbors`, `resolve` with full argument shapes +- **Node kinds** — Symbol, Route, Client, Producer +- **Edge taxonomy** — stored edges, composed dot-keys, direction semantics +- **NodeFilter reference** — all filter keys by node kind, strict frame rules +- **Decision tree** — "user asks X → start with tool Y → follow up with Z" +- **Recovery playbook** — common failure modes and fixes +- **Navigation patterns** — 12 common intent-to-tool-chain mappings +- **Ontology glossary** — roles, capabilities, symbol kinds, frameworks, match types +- **Worked example** — end-to-end feature exploration ## Three-layer architecture ``` ┌──────────────────────────────────────────────────────────────┐ │ Layer 3 — High-level intents (what the user actually thinks) │ -│ /trace-request-flow, /callees, /controllers, /routes, │ -│ /impact-of, /mini-map │ +│ "who calls X", "trace this route", "explain feature Y" │ │ ───────────────────────────────────────────────────────── │ -│ Implementation: SKILL.md files in skills/tier-1/ and │ -│ skills/tier-2/. 
│ +│ Implementation: explore-codebase SKILL.md │ ├──────────────────────────────────────────────────────────────┤ │ Layer 2 — Composable primitives (the MCP API) │ │ search, find, describe, neighbors, resolve │ @@ -35,86 +45,14 @@ skills/ └──────────────────────────────────────────────────────────────┘ ``` -## Tier 1 — Navigation (deterministic MCP chains) - -11 single-purpose skills. Each one is one MCP call (sometimes preceded by a `resolve`). - -| Skill | Purpose | One-shot tool chain | -| ----- | ------- | ------------------- | -| [`/nl`](tier-1/nl/SKILL.md) | Natural-language search into the graph | `search` → `describe` | -| [`/controllers`](tier-1/controllers/SKILL.md) | List controller classes | `find(kind="symbol", role="CONTROLLER")` | -| [`/routes`](tier-1/routes/SKILL.md) | List HTTP and messaging routes | `find(kind="route")` | -| [`/clients`](tier-1/clients/SKILL.md) | List outbound HTTP clients | `find(kind="client")` | -| [`/producers`](tier-1/producers/SKILL.md) | List outbound async producers | `find(kind="producer")` | -| [`/callers`](tier-1/callers/SKILL.md) | Who calls this method (in-process CALLS) | `resolve` → `neighbors(in, CALLS)` | -| [`/callees`](tier-1/callees/SKILL.md) | What this method calls (in-process CALLS) | `resolve` → `neighbors(out, CALLS)` | -| [`/handlers`](tier-1/handlers/SKILL.md) | Method that handles a route | `resolve` → `neighbors(in, EXPOSES)` | -| [`/who-hits-route`](tier-1/who-hits-route/SKILL.md) | All inbound paths to a route | `resolve` → `neighbors(in, [HTTP_CALLS, ASYNC_CALLS, EXPOSES])` | -| [`/implements`](tier-1/implements/SKILL.md) | Concrete classes implementing an interface | `resolve` → `neighbors(in, IMPLEMENTS)` | -| [`/injects`](tier-1/injects/SKILL.md) | Where a type is injected | `resolve` → `neighbors(in, INJECTS)` | - -## Tier 2 — Workflow (bounded multi-step) - -4 multi-step skills. Each one composes Tier 1 calls with explicit depth, recursion, and stop conditions. 
- -| Skill | Purpose | Shape | -| ----- | ------- | ----- | -| [`/explain-feature`](tier-2/explain-feature/SKILL.md) | Understand how a feature works end-to-end | `search` → `describe` → bounded `neighbors` walks | -| [`/impact-of`](tier-2/impact-of/SKILL.md) | What breaks if a symbol changes | `resolve` → `describe` → recursive inbound `neighbors` ≤ depth 2 | -| [`/trace-request-flow`](tier-2/trace-request-flow/SKILL.md) | Follow a request from entry to persistence | `resolve(route)` → handler → forward CALLS walk ≤ depth 4 + boundary hops | -| [`/mini-map`](tier-2/mini-map/SKILL.md) | Noise-filtered call map for a hot method | `resolve` → `edge_filter`'d CALLS + skill heuristics ≤ depth 4 | - -## Layout - -``` -skills/ - README.md ← this file - tier-1/ - nl/SKILL.md - controllers/SKILL.md - routes/SKILL.md - clients/SKILL.md - producers/SKILL.md - callers/SKILL.md - callees/SKILL.md - handlers/SKILL.md - who-hits-route/SKILL.md - implements/SKILL.md - injects/SKILL.md - tier-2/ - explain-feature/SKILL.md - impact-of/SKILL.md - trace-request-flow/SKILL.md - mini-map/SKILL.md -``` - -## SKILL.md structure - -Every SKILL.md is self-sufficient — load one skill, get a single working scaffolded prompt: - -1. **Frontmatter** (`name` + `description`) — used by Claude Code / Cursor / Qwen Code for auto-discovery. -2. **When to use** — concrete triggers and when to prefer a different skill. -3. **Tools used** — exactly which of `search` / `find` / `describe` / `neighbors` / `resolve` this skill calls. -4. **Reasoning preamble** — the mandatory `Q-class: ` line before each MCP call, with the taxonomy defined inline. -5. **Argument contract** — what the skill takes. -6. **Steps** — exact MCP calls with parameters. -7. **Recovery / stop conditions / recursion limit** (Tier 2: required; Tier 1: short). -8. **Worked example** — end-to-end on the `tests/bank-chat-system` fixture. -9. **Do not / Out of scope** — guardrails and pointers to neighboring skills. -10. 
**Going deeper** — pointer to `docs/AGENT-GUIDE.md` for the full reference. - -## Versioning - -Skills are versioned lockstep with the MCP. When `NodeFilter` keys, `edge_filter` axes, `edge_types`, or `kind` values change, skills are updated in the same PR. The static validator (`tests/test_agent_skills_static.py`) checks every SKILL.md against the live MCP allowlists. - ## Relationship to `docs/AGENT-GUIDE.md` -Skills and `docs/AGENT-GUIDE.md` are **alternatives**. Pick one: +`explore-codebase` and `docs/AGENT-GUIDE.md` are **alternatives** covering the same ground. Pick one: -- **Skills** — load on demand by name. Lower context cost per query. Best for hosts that natively support skills (Claude Code, Cursor, Qwen Code). -- **AGENT-GUIDE block** — paste once into your project's `AGENTS.md` / `CLAUDE.md`. Always-on. Best for hosts without skill loading, or when you want one persistent guide for everything. +- **`explore-codebase` skill** — loaded on demand by hosts with skill discovery (Claude Code, Qwen Code, Cursor). One skill to rule them all. +- **AGENT-GUIDE block** — paste the `BEGIN`/`END` copy-paste block into your project's `AGENTS.md` / `CLAUDE.md`. Always-on. Best for hosts without skill loading. -Do not mix the two — duplicate context confuses tool selection. The static validator (`TestAgentGuideConsistency`) verifies the AGENT-GUIDE copy-paste block does not reference `skills/`. +Do not mix the two — duplicate context confuses tool selection. ## Relationship to developer skills diff --git a/skills/explore-codebase/SKILL.md b/skills/explore-codebase/SKILL.md new file mode 100644 index 0000000..57e1faa --- /dev/null +++ b/skills/explore-codebase/SKILL.md @@ -0,0 +1,321 @@ +--- +name: explore-codebase +description: Complete operating manual for the java-codebase-rag MCP tools (search, find, describe, neighbors, resolve). 
Use this skill whenever you need to explore a Java codebase — locate symbols, trace call chains, find routes, walk cross-service boundaries, or answer any "where is X", "who calls Y", "what does Z depend on" question. Self-contained: includes edge taxonomy, NodeFilter reference, decision tree, argument shapes, recovery playbook, and navigation patterns. No external files needed. +--- + +# /explore-codebase — Codebase navigation via the java-codebase-rag MCP + +## When to use + +Any time you need to understand structure in an indexed Java codebase: locating symbols, tracing call chains, finding HTTP/messaging routes, walking cross-service boundaries, or answering questions like "where is X", "who calls Y", "what depends on Z". + +## Tools + +`search`, `find`, `describe`, `neighbors`, `resolve`. + +## Node kinds + +`Symbol` (types and methods), `Route` (HTTP and messaging entry points), `Client` (outbound HTTP call sites), `Producer` (outbound async call sites). + +## Indexed content + +Java production sources plus SQL and YAML. Use `search` `table`: `java`, `sql`, `yaml`, or `all`. + +## What this MCP is not + +- **Test files, build files, CI/deploy** — read those files directly in the repo. +- **Reflection and dynamic dispatch** — `CALLS` is static analysis only; the resolved set is a **lower bound**. +- **Proof of absence** — an empty result may mean the project was not indexed, the wrong `table`, or a filter that matches nothing. +- **Git history** — use `git log` / `git blame` for "who changed" / "when". + +When MCP disagrees with the open file, **the file wins**; treat the mismatch as a likely stale or incomplete index. 
+ +## Brownfield annotations on methods + +If a method has any of these (including plural containers **`@CodebaseHttpRoutes`**, **`@CodebaseAsyncRoutes`**, **`@CodebaseHttpClients`**, **`@CodebaseProducers`**), that annotation is the **only** source for the facets it declares — framework inference on the **same** method is **not merged** for that axis: + +| Annotation | Declares | Framework rows bypassed (examples) | +| ---------- | -------- | ---------------------------------- | +| `@CodebaseHttpRoute` | inbound HTTP path / verb | Spring MVC/WebFlux mapping annotations | +| `@CodebaseAsyncRoute` | inbound async topic / route | `@KafkaListener`, `@RabbitListener`, ... | +| `@CodebaseHttpClient` | outbound HTTP client call site | `@FeignClient` method mappings, RestTemplate-style inference | +| `@CodebaseProducer` | outbound async producer call site | `KafkaTemplate` / `StreamBridge` producer inference | + +Trust the indexed brownfield row over a framework-only reading of the source. + +## Workflow (locate -> inspect -> walk) + +1. **Locate** — `resolve` for identifier-shaped strings; `search` for natural language or code fragments; `find` for structured `NodeFilter` discovery. +2. **Inspect** — `describe(id)` for the full record and `edge_summary` (per-label `in`/`out` counts). +3. **Walk** — `neighbors` in a loop with explicit **`direction`** and **`edge_types`**. Multi-hop traces are **your** reasoning, not a separate tool. + +## Forced reasoning preamble (every tool call) + +Before each MCP call, output one short line: + +``` +Q-class: +Pick: Why: <≤8 words> +``` + +Then use real JSON shapes (see below). If the call fails or returns nothing useful, use the **Recovery playbook** — do not thrash. + +## Edge taxonomy + +Use these strings **verbatim** in `neighbors(..., edge_types=[...])`. 
+ +### Stored edges (one hop) + +| Group | Edge types | Semantics | +| ----- | ---------- | --------- | +| Type wiring | `EXTENDS`, `IMPLEMENTS`, `INJECTS` | `in` = who depends on this type; `out` = what this type depends on | +| Containment | `DECLARES`, `DECLARES_CLIENT`, `DECLARES_PRODUCER` | `in` = owner; `out` = owned member, client, or producer | +| Method overrides | `OVERRIDES` | Subtype **method** -> supertype **declaration** (same `signature`, one `IMPLEMENTS`/`EXTENDS` hop) | +| Method calls | `CALLS` | `in` = callers; `out` = callees (method Symbol -> method Symbol only) | +| Service boundary | `EXPOSES` | method Symbol -> Route (handler exposes route) | +| Cross-service | `HTTP_CALLS`, `ASYNC_CALLS` | `HTTP_CALLS`: Client -> Route; `ASYNC_CALLS`: Producer -> Route | + +### Composed edges — type Symbol origin (`direction="out"` only) + +| Edge type | Meaning | +| --------- | ------- | +| `DECLARES.DECLARES_CLIENT` | Members' HTTP clients in one hop | +| `DECLARES.DECLARES_PRODUCER` | Members' async producers in one hop | +| `DECLARES.EXPOSES` | Members' exposed routes in one hop | + +### Composed edges — non-static method Symbol origin (`direction="out"` only) + +| Edge type | Meaning | +| --------- | ------- | +| `OVERRIDDEN_BY` | Concrete overrider methods (stored `[:OVERRIDES]` dispatch hop) | +| `OVERRIDDEN_BY.DECLARES_CLIENT` | Clients declared on overriders (`via_id` = overrider method) | +| `OVERRIDDEN_BY.DECLARES_PRODUCER` | Producers on overriders | +| `OVERRIDDEN_BY.EXPOSES` | Routes exposed by overriders | + +Do not mix `DECLARES.*` and `OVERRIDDEN_BY.*` in one `edge_types` list on a single origin id — the handler rejects the whole request (only one axis applies per node). + +**Pagination:** default `neighbors` `limit=25` slices the merged flat + composed edge list. When `edge_summary` shows a large `out` count for a composed key, raise `limit` (and use `offset`) or issue separate calls per key. 
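The axis rule and the pagination note above can be sketched as a small client-side helper. This is a minimal sketch under stated assumptions, not the server's implementation: `fetch_page` is a hypothetical stand-in for whatever function your host uses to invoke the `neighbors` tool, and treating a short page as the end of the result set is our assumption rather than documented behavior.

```python
from typing import Callable, Dict, List

Edge = Dict[str, object]
# Hypothetical signature: fetch_page(edge_types, limit, offset) -> list of edge rows.
FetchPage = Callable[[List[str], int, int], List[Edge]]


def check_single_axis(edge_types: List[str]) -> None:
    """Fail early if DECLARES.* and OVERRIDDEN_BY.* dot-keys share one list."""
    has_declares = any(t.startswith("DECLARES.") for t in edge_types)
    has_overridden = any(t.startswith("OVERRIDDEN_BY.") for t in edge_types)
    if has_declares and has_overridden:
        raise ValueError(
            "DECLARES.* and OVERRIDDEN_BY.* cannot be mixed on one origin id"
        )


def fetch_all(fetch_page: FetchPage, edge_types: List[str], limit: int = 25) -> List[Edge]:
    """Collect every edge row by advancing `offset` one page at a time."""
    check_single_axis(edge_types)
    edges: List[Edge] = []
    offset = 0
    while True:
        page = fetch_page(edge_types, limit, offset)
        edges.extend(page)
        # Assumption: a page shorter than `limit` means nothing is left.
        if len(page) < limit:
            return edges
        offset += limit
```

Issuing one `fetch_all` call per composed key keeps each key's rows from being crowded out by the others in the merged list.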
+ +## Argument shapes + +### JSON, not stringified JSON + +| Param | Right | Wrong | +| ----- | ----- | ----- | +| `edge_types` | `["CALLS"]` | `"CALLS"` or `"[\"CALLS\"]"` | +| `exclude_roles` | `["DTO","OTHER"]` | stringified array | +| `filter` | `{"role":"CONTROLLER"}` | nested string JSON | +| `ids` (batch) | `["sym:...","sym:..."]` | comma-joined string | + +Omit keys you do not need. Empty string `""` is often a **real filter** that matches nothing. + +### Node ids + +| Kind | Prefixes | +| ---- | -------- | +| Symbol | `sym:` | +| Route | `route:` or `r:` | +| Client | `client:` or `c:` | +| Producer | `producer:` or `p:` | + +Use exact ids from `search.symbol_id`, `find`, `describe`, or `neighbors.other.id`. + +### Method / type identity (Symbol FQNs) + +``` +<package>.<Type>[.<Nested>]#<method>(<ParamType>,<ParamType>,...) +``` + +Simple types in parentheses; generics erased (`List<String>` -> `List`). No spaces after commas. No-arg: `()`. Constructor: `#<init>(...)`. + +### `neighbors` — required every time + +- `direction`: `"in"` or `"out"` (no default). +- `edge_types`: non-empty list from the taxonomy above. + +Optional `filter` applies to each **other** endpoint; populated fields must match that neighbor's kind (strict frame). + +**Batching:** multiple `ids` expand first; `limit`/`offset` slice the **merged** edge list — raise `limit` when batching. + +**Mixed flat + composed `edge_types`:** flat edges are listed before composed edges, then pagination applies. A small `limit` with e.g. `["DECLARES","DECLARES.DECLARES_CLIENT"]` may return only member Symbols and no Clients — use the dot-key alone to list terminals. + +## Shared `NodeFilter` (`find`, `search.filter`, `neighbors.filter`) + +For **`find`**, `filter` is required — `{}` means no predicates (all nodes of that kind, subject to pagination). 
+ +| Keys | Applies to | +| ---- | ---------- | +| `microservice`, `module`, `source_layer` | All kinds (`source_layer` mainly **client** / **producer**) | +| `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` | **symbol** | +| `http_method`, `path_prefix`, `framework` | **route** | +| `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** | +| `producer_kind`, `topic_prefix` | **producer** | + +`http_method` filters HTTP verbs on **routes** (declared method) and on **clients** (outbound call method). Not applicable to **symbol** rows. + +**Strict frame:** one populated field -> one stored attribute for that kind. Unknown keys or inapplicable populated fields -> `success=false` with a teaching `message`. No wildcards in `fqn_prefix`, `path_prefix`, or `target_path_prefix` (`*` / `?` rejected) — use `search(query=...)` for ranked text instead. `search.query` is opaque text, not a DSL. + +## Identifier resolution (`resolve`) + +**Input:** FQN or suffix, `sym:`/`route:`/`client:`/`producer:` id, `METHOD /path`, route path template, client `target_service`, `target_service` + path prefix, or producer topic. + +**`hint_kind`:** optional `symbol` | `route` | `client` | `producer`. When omitted, generators run across **all four** kinds (narrow with `hint_kind` when you know the kind). + +| `status` | Action | +| -------- | ------ | +| `one` | `describe(id=node.id)` | +| `many` | pick from `candidates` (`reason`, `score`, `NodeRef`), then `describe` | +| `none` | fall back to `search(query=...)` for NL/fuzzy discovery | + +Prefer **`resolve` -> `describe(id=...)`** over **`describe(fqn=...)`** when an FQN may collide (`describe(fqn=...)` returns the first row). + +**`microservice`** — service where the node lives. **`target_service`** (clients only) — remote service being called. 
**`source_layer`** (clients/producers) — which extraction layer produced the row (`builtin`, `layer_a_meta`, `layer_b_ann`, `layer_c_source`, `layer_b_fqn`, ...). **`role`** (symbols only) — architectural stereotype (`CONTROLLER`, `SERVICE`, ...). + +## Decision tree + +| User asks... | First step | Typical follow-up | +| ------------ | ---------- | ----------------- | +| Identifier-shaped string | `resolve` (+ optional `hint_kind`) | `describe` -> `neighbors` | +| Fuzzy / NL "where is X" | `search` | `describe` -> `neighbors` | +| All controllers in service S | `find(kind="symbol", filter={"microservice":"S","role":"CONTROLLER"})` | `neighbors` `CALLS` / `EXPOSES` | +| Interfaces in service S | `find(..., filter={"microservice":"S","symbol_kind":"interface"})` | `neighbors` / `describe` | +| HTTP / messaging entry points | `find(kind="route", filter={...})` | `describe` | +| Outbound HTTP clients | `find(kind="client", filter={...})` | `neighbors(..., "out", ["HTTP_CALLS"])` from client id | +| Outbound async producers | `find(kind="producer", filter={...})` | `neighbors(..., "out", ["ASYNC_CALLS"])` from producer id | +| Who calls method M? | id via `resolve` / `find` / `search` | `neighbors(ids, "in", ["CALLS"])` | +| What does M call? | same | `neighbors(ids, "out", ["CALLS"])` | +| Who hits this route? | route id | `neighbors(ids, "in", ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"])` | +| Handler for route | route id | `neighbors(ids, "in", ["EXPOSES"])` | +| Who implements interface T? | type symbol id | `neighbors(ids, "in", ["IMPLEMENTS"])` | +| Where is T injected | type symbol id | `neighbors(ids, "in", ["INJECTS"])` | +| Impact / "what breaks if I change X"? | no magic tool | loop `neighbors` `in` with `CALLS`, `INJECTS`, ... until bounded | + +**Rules of thumb:** + +1. **Structure beats vector** for exact questions — use `resolve` / `find` + `neighbors`, not `search`, for "who calls ...". +2. 
**Vector beats structure** for fuzzy discovery — `search` first, then pivot to `describe` / `neighbors`. + +## Tool reference + +### `search` + +Ranked chunk retrieval. Args: `query`, `table` (`java`|`sql`|`yaml`|`all`, default `java`), `hybrid` (bool), `limit` (default 5), `offset`, `path_contains`, optional `filter` (symbol-applicable `NodeFilter` only). + +### `find` + +Exact listing for one kind. Args: `kind` (`symbol`|`route`|`client`|`producer`), **`filter`** (required object), `limit` (default 25), `offset`. Returns `NodeRef` rows (`id`, `kind`, `fqn`, `microservice`, `module`, `role` on symbols, `symbol_kind` on symbols). + +### `describe` + +Full node + `edge_summary`. Args: `id` (any kind) or `fqn` (symbol only; `id` wins). + +- **Stored keys** — counts for edges that exist in the graph. +- **Type symbols** (`class`, `interface`, `enum`, `record`, `annotation`) may add composed keys `DECLARES.DECLARES_CLIENT`, `DECLARES.DECLARES_PRODUCER`, `DECLARES.EXPOSES` — navigable via `neighbors` with those dot-keys (`out` only). +- **Method symbols** may add virtual keys `OVERRIDDEN_BY`, `OVERRIDDEN_BY.DECLARES_*`, `OVERRIDDEN_BY.EXPOSES` (navigable via `neighbors` on non-static method origins, `out` only), plus an **`OVERRIDES`** row merging stored `[:OVERRIDES]` incident counts with the rollup dispatch-up count (`max` per direction). + +Composed counts are **edge rows**, not distinct methods; `count > 0` means "there is something to walk". + +### `resolve` + +Identifier lookup; three statuses above. Args: `identifier`, optional `hint_kind`. + +### `neighbors` + +One hop. Args: `ids` (string or array), **`direction`**, **`edge_types`**, `limit` (default 25), `offset`, optional `filter` on the other node, optional **`edge_filter`** (`edge_types` must be exactly `['CALLS']` — no composed dot-keys or second stored label; fail-loud otherwise). + +Returns **edges** with `attrs` (`confidence`, `strategy`, `match`, ... on cross-service edges) and **`other`** node. 
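
The `neighbors` argument rules above can be checked before the call goes out. This is a minimal sketch, assuming the caller assembles tool arguments as a plain dict; `build_neighbors_args` is a hypothetical helper and not part of the MCP. It mirrors only the rules stated here: `direction` and `edge_types` are required, and `edge_filter` is accepted only when `edge_types` is exactly `["CALLS"]`.

```python
from typing import Optional, Sequence, Union


def build_neighbors_args(
    ids: Union[str, Sequence[str]],
    direction: str,
    edge_types: Sequence[str],
    limit: int = 25,
    offset: int = 0,
    edge_filter: Optional[dict] = None,
) -> dict:
    """Assemble a `neighbors` argument dict (hypothetical client-side helper)."""
    if not direction:
        raise ValueError("neighbors requires an explicit direction")
    if not edge_types:
        raise ValueError("neighbors requires explicit edge_types")
    edge_types = list(edge_types)
    # Documented rule: edge_filter only works with exactly ['CALLS'].
    if edge_filter is not None and edge_types != ["CALLS"]:
        raise ValueError("edge_filter requires edge_types == ['CALLS']")
    args = {
        # `ids` may be a string or an array; normalize to a list here.
        "ids": [ids] if isinstance(ids, str) else list(ids),
        "direction": direction,
        "edge_types": edge_types,
        "limit": limit,
        "offset": offset,
    }
    if edge_filter is not None:
        args["edge_filter"] = dict(edge_filter)
    return args
```

The server remains the authority on validation; a pre-check like this only saves a round trip on the two most common mistakes.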
+ +**Cross-service edges** (`HTTP_CALLS`, `ASYNC_CALLS`): read `attrs.confidence` and `attrs.match` — low confidence or `unresolved`/`phantom`/`ambiguous` means treat as a resolver signal, not ground truth. + +**`CALLS` edges:** source-ordered (`call_site_line`, `call_site_byte`). `attrs.resolved=false` on `CALLS` rows means known-receiver-external (JDK/Spring) callees. **`include_unresolved=True`** (CALLS + `direction=out` only) interleaves unresolved sites with resolved `CALLS`. **`dedup_calls=True`** collapses identical `(origin, callee)` `CALLS` to one row with `call_site_lines`. Optional **`edge_filter`** projects before pagination: `min_confidence`; `include_strategies` / `exclude_strategies` (mutually exclusive); `callee_declaring_role`, `callee_declaring_roles`, `exclude_callee_declaring_roles` (`["OTHER"]` also drops known-external rows). + +## Ontology glossary + +**Roles (`filter.role` / `exclude_roles`):** `CONTROLLER`, `SERVICE`, `REPOSITORY`, `COMPONENT`, `CONFIG`, `ENTITY`, `CLIENT`, `MAPPER`, `DTO`, `OTHER`. + +**Capabilities (`filter.capability`):** `MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`. + +**Symbol kinds (`symbol_kind` / `symbol_kinds`):** `class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`. + +**Route `framework` (examples):** `spring_mvc`, `webflux`, `kafka`, `rabbitmq`, `jms`, `stream`, `codebase_async_route`, ... + +**Client kinds:** `feign_method`, `rest_template`, `web_client`. + +**Producer kinds:** `kafka_send`, `stream_bridge_send`. + +**HTTP call `attrs.match` / async `attrs.match`:** `cross_service`, `intra_service`, `ambiguous`, `phantom`, `unresolved`. 
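
The guidance on `attrs.confidence` and `attrs.match` can be sketched as a small triage step. This assumes each returned edge is a dict with an `attrs` dict and that `confidence` is numeric; the function name and the `0.7` threshold are illustrative assumptions, not values defined by the MCP.

```python
# Match values the guide says to treat as resolver signals, not ground truth.
WEAK_MATCHES = {"unresolved", "phantom", "ambiguous"}


def split_cross_service_edges(edges, min_confidence=0.7):
    """Partition edges into (trusted, signals) by attrs.confidence and attrs.match.

    Hypothetical helper: the edge shape and the threshold are assumptions.
    """
    trusted, signals = [], []
    for edge in edges:
        attrs = edge.get("attrs", {})
        confidence = attrs.get("confidence")
        weak_match = attrs.get("match") in WEAK_MATCHES
        # A missing confidence is treated as low rather than guessed.
        low_confidence = confidence is None or confidence < min_confidence
        (signals if weak_match or low_confidence else trusted).append(edge)
    return trusted, signals
```

Edges in `signals` are still worth reporting, but labeled as unconfirmed rather than stated as fact.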
+ +## Recovery playbook + +| Symptom | Likely cause | Fix | +| ------- | ------------ | --- | +| `neighbors` validation error | Missing `direction` or `edge_types` | Add both explicitly | +| Empty `neighbors` | Wrong edge type or direction | Read `describe.edge_summary`; `EXPOSES` is Symbol->Route; `OVERRIDES` is method<->method only; `HTTP_CALLS` starts from **Client** ids | +| Cannot find symbol | Wrong id or empty index | `resolve` / `search`; try `find` with `fqn_prefix` | +| `find` returns too much | Broad filter | Add `microservice`, `fqn_prefix`, `path_prefix`, `topic_prefix`, ... | +| Route not found | Path mismatch | `find(kind="route", filter={"path_prefix":...})` | +| Empty `search` | Wrong `table`, no index, or chunk miss | Try `table="all"`; `find` with `fqn_prefix`; read source files directly | +| Empty results across several tools | Index missing, stale, or wrong project | You cannot rebuild the index via MCP — ask the operator; meanwhile use open files / `rg` | +| Result vs open file disagree | Stale or partial index | Trust the file; say index may be stale | +| Mixed composed families on one id | `DECLARES.*` + `OVERRIDDEN_BY.*` together | Split calls — type keys need a type id; override keys need a method id | +| Override dot-key on type / DECLARES on method | Wrong Symbol origin for axis | Read `describe.edge_summary`; use the axis that matches the node kind | + +After two failed attempts on the same intent, stop and report tool name, args, and response snippet. 
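
The stop rule above (two failed attempts on the same intent, then report) can be sketched as a small counter. `AttemptGuard` is a hypothetical helper, assuming the agent loop can label each call with an intent string; nothing here is provided by the MCP itself.

```python
from collections import defaultdict


class AttemptGuard:
    """Track failed tool calls per intent and signal when to stop (illustrative)."""

    def __init__(self, max_failures=2):
        self.max_failures = max_failures
        self._failures = defaultdict(list)

    def record_failure(self, intent, tool, args, response_snippet):
        # Keep what the playbook asks to report: tool name, args, response snippet.
        self._failures[intent].append(
            {"tool": tool, "args": args, "response": response_snippet}
        )

    def should_stop(self, intent):
        return len(self._failures.get(intent, [])) >= self.max_failures

    def report(self, intent):
        attempts = self._failures.get(intent, [])
        lines = [f"Stopped after {len(attempts)} failed attempts: {intent}"]
        for attempt in attempts:
            lines.append(
                f"- {attempt['tool']}({attempt['args']}) -> {attempt['response']}"
            )
        return "\n".join(lines)
```

Keying on intent rather than on tool name matters: switching from `resolve` to `search` for the same question still counts toward the limit.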
+ +## Common navigation patterns + +| Intent | Tool chain | +| ------ | ---------- | +| Natural-language "find X" | `search(query=..., limit=8)` -> `describe(id=top_hit.symbol_id)` | +| List controllers in service S | `find(kind="symbol", filter={"microservice":"S","role":"CONTROLLER"})` | +| List routes in service S | `find(kind="route", filter={"microservice":"S"})` | +| List clients in service S | `find(kind="client", filter={"microservice":"S"}, limit=100)` | +| List producers in service S | `find(kind="producer", filter={"microservice":"S"}, limit=100)` | +| Who calls method M | `resolve` -> `neighbors(ids, "in", ["CALLS"])` | +| What does M call | `resolve` -> `neighbors(ids, "out", ["CALLS"])` | +| Handler for route R | `neighbors(route_id, "in", ["EXPOSES"])` | +| All inbound to route R | `neighbors(route_id, "in", ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"])` | +| Implementors of interface T | `neighbors(type_id, "in", ["IMPLEMENTS"])` | +| Where is T injected | `neighbors(type_id, "in", ["INJECTS"])` | +| Impact of changing X | `resolve` -> `describe` -> bounded `neighbors(ids, "in", ["CALLS","INJECTS","IMPLEMENTS","EXTENDS"])`, depth ≤ 2 | + +## Canonical workflow: "explain feature X" + +1. `search` with a short query; pick 1-3 hits with strong `symbol_id` / role fit. +2. `describe` on the chosen id; read `edge_summary`. +3. Walk with `neighbors` using **small** `edge_types` sets (e.g. `CALLS` out, or `EXPOSES` / cross-service edges for boundaries). +4. Stop when you can answer; do not prefetch unrelated subgraphs. + +## Worked example + +User: "how does operator assignment work?"
+``` +Q-class: semantic Pick: search Why: NL feature name +-> search(query="operator assignment", limit=8) + -> sym:com.bank.chat.assign.service.OperatorAssignmentService (interface, SERVICE) + -> sym:com.bank.chat.assign.api.AssignController (CONTROLLER) + +Q-class: inspect Pick: describe Why: edge_summary on interface +-> describe(id="sym:...OperatorAssignmentService") + -> edge_summary { IMPLEMENTS.in: 2, INJECTS.in: 3, CALLS.in: 4 } + +Q-class: walk Pick: neighbors Why: find concrete implementors +-> neighbors(ids="sym:...OperatorAssignmentService", direction="in", edge_types=["IMPLEMENTS"]) + -> RoundRobin..., Weighted... (2 strategies) + +Q-class: walk Pick: neighbors Why: trace outbound calls from controller +-> neighbors(ids="sym:...AssignController", direction="out", edge_types=["CALLS"]) + -> AssignController -> OperatorAssignmentService#assign -> OperatorRepository#save + +Synthesize: "Operator assignment has two strategies (RoundRobin, Weighted) +behind an interface. Triggered via AssignController. Persists via OperatorRepository..." +``` + +## Do not + +- Do not answer from training data or general Java knowledge. +- Do not read source files when MCP can answer. +- Do not skip MCP calls and guess. +- Do not fabricate ids — always obtain them from `search` / `find` / `resolve`. +- Do not walk all edge types at once — small `edge_types` sets per call. +- Do not use this MCP when the answer is already in the open file, or for third-party library trivia from training data alone. Prefer the smallest call that answers the question. diff --git a/skills/tier-1/callees/SKILL.md b/skills/tier-1/callees/SKILL.md deleted file mode 100644 index 1d79404..0000000 --- a/skills/tier-1/callees/SKILL.md +++ /dev/null @@ -1,82 +0,0 @@ ---- -name: callees -description: Show what a method symbol calls (in-process CALLS). Use when the user asks "what does X call", "callees of X", "what does X invoke", or "what methods does Y use".
Argument is a sym: id, or an identifier resolved via `resolve`. For noisy method bodies prefer /mini-map; for outbound HTTP/async see optional step 3 below. ---- - -# /callees — In-process callees of a method symbol - -## When to use - -The user wants the **direct in-process callees** of a method. CALLS edges only — outbound HTTP/async are reached via `DECLARES_CLIENT` / `DECLARES_PRODUCER` then `HTTP_CALLS` / `ASYNC_CALLS` (see optional step 3). - -If the method is a hot SERVICE/COMPONENT with > ~30 raw callees, prefer `/mini-map`. - -## Tools used - -`resolve`, `neighbors`. Optional second `neighbors` chain for HTTP/async. `search` only as a recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: out-edges of CALLS on method -``` - -**Q-class taxonomy:** `semantic` (`search`), `structured` (`find`/`resolve`), `inspect` (`describe`), **`walk`** (`neighbors`). - -## Argument contract - -Single positional argument: a method **symbol** id (`sym:...` preferred) OR an identifier-shaped string → `resolve(identifier=..., hint_kind="symbol")`. - -This skill is for **method symbols**. For inbound traffic to a route use `/who-hits-route`. - -## Steps - -1. **Resolve.** If the argument starts with `sym:`, use it directly. Else `resolve(identifier=, hint_kind="symbol")` (`one`/`many`/`none` handling per `/callers`). -2. **In-process callees.** Call `neighbors(ids=, direction="out", edge_types=["CALLS"])`. - Group by callee `fqn` + `microservice`. Mark rows where `attrs.resolved=false` (known-external receivers — JDK/Spring/etc.). -3. **Optional — outbound HTTP/async** (only when the user asks about cross-service calls): - - HTTP: `neighbors(ids=, direction="out", edge_types=["DECLARES_CLIENT"])` → for each Client id, `neighbors(ids=, direction="out", edge_types=["HTTP_CALLS"])`. 
- - Async: `neighbors(ids=, direction="out", edge_types=["DECLARES_PRODUCER"])` → for each Producer id, `neighbors(ids=, direction="out", edge_types=["ASYNC_CALLS"])`. - -## Noise hint - -After ontology 15, true receiver-failure call sites are **not** on `CALLS` — they are `UnresolvedCallSite` nodes (reachable via `include_unresolved=True` on the same `neighbors` call). `attrs.resolved=false` on a `CALLS` row means the callee is known but external (JDK/Spring/Lombok/etc.). - -For noisy bodies, prefer one of: -- `/mini-map ` — handles accessor/JDK filtering + DELEGATES/PERSISTS/READS labels. -- Pre-filter inline: `neighbors(ids=, direction="out", edge_types=["CALLS"], edge_filter={callee_declaring_role:"SERVICE"})`. - -## Recovery - -- Empty result but `describe.edge_summary` shows `CALLS.out > 0`: re-resolve — wrong overload. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/callees ChatController#joinOperator(JoinOperatorRequest)` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="ChatController#joinOperator", hint_kind="symbol") - → sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest) -Q-class: walk Pick: neighbors Why: out-edges of CALLS on method -→ neighbors(ids="sym:...", direction="out", edge_types=["CALLS"]) - → 6 in-process callees (service + repository) - → (optional) neighbors(out, ["DECLARES_CLIENT"]) → 1 client → HTTP_CALLS → 1 route -``` - -## Do not - -- Do not pass `HTTP_CALLS` or `ASYNC_CALLS` to `neighbors` on a bare method `sym:` — those edges live on Client/Producer nodes (decision #13 in the locked agent-skills propose). -- Do not fabricate `sym:` ids. -- Do not read source files when MCP can answer. - -## Out of scope - -- Recursive callees beyond depth 1 (use `/trace-request-flow` or `/mini-map`). -- Noise-filtered call maps (use `/mini-map`; fall back here only if the map is too thin). 
-- Filtering by microservice (compose with `/controllers` + `/callees` per result). - -## Going deeper - -The full edge-filter axes (`callee_declaring_role`, `min_confidence`, `dedup_calls`, `include_unresolved`) and the `DECLARES_CLIENT → HTTP_CALLS` rationale live in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/callees`. diff --git a/skills/tier-1/callers/SKILL.md b/skills/tier-1/callers/SKILL.md deleted file mode 100644 index 24f5b04..0000000 --- a/skills/tier-1/callers/SKILL.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -name: callers -description: Show in-process callers of a method symbol via CALLS edges. Use when the user asks "who calls X", "callers of X", "what invokes X method", or "find usages of method Y". Argument is a sym: id or an identifier resolved via `resolve`. For inbound HTTP/async traffic to a route use /who-hits-route; for recursive impact analysis use /impact-of. ---- - -# /callers — In-process callers of a method symbol - -## When to use - -The user wants the **direct in-process callers** of a method. CALLS edges are method-Symbol → method-Symbol within the indexed graph (no cross-service HTTP/async — those are different edges). - -## Tools used - -`resolve` (when argument isn't already a `sym:` id) + `neighbors`. `search` only as a recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: in-edges of CALLS on method -``` - -If you call `resolve` first, that one uses `Q-class: structured` (id-shaped input). - -**Q-class taxonomy:** `semantic` (`search`), `structured` (`find` / `resolve` for id-shaped strings), `inspect` (`describe`), **`walk`** (`neighbors`). - -## Argument contract - -Single positional argument: a method **symbol** id (`sym:...` preferred) OR an identifier-shaped string (FQN fragment, `Class#method`, signature). - -This skill is for **method symbols**. For inbound traffic to an HTTP/async **route**, use `/who-hits-route`. - -## Steps - -1. 
**Resolve.** If the argument starts with `sym:`, use it directly. - Else call `resolve(identifier=, hint_kind="symbol")`. - - `status="one"` → use `node.id`. - - `status="many"` → list `candidates` and stop for user pick. - - `status="none"` → call `search(query=, limit=5)`; if still empty, stop and report. -2. **Walk inbound CALLS.** Call `neighbors(ids=, direction="in", edge_types=["CALLS"])`. -3. **Render.** Group by caller `fqn` + `microservice`. Show count when one caller has multiple call sites (the edge rows include `call_site_line`). - -## Recovery - -- `neighbors` returns empty but `describe()` shows `CALLS.in > 0`: re-run with the exact `id` from `describe`; the symbol you resolved may be the wrong overload. -- Resolved id is for a **type** Symbol, not a method: use `/implements` or `/injects` instead. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/callers ChatController#joinOperator(JoinOperatorRequest)` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="ChatController#joinOperator", hint_kind="symbol") - → status=one id=sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest) -Q-class: walk Pick: neighbors Why: in-edges of CALLS on method -→ neighbors(ids="sym:...", direction="in", edge_types=["CALLS"]) - → 3 callers in chat-core, 1 in chat-assign -``` - -## Do not - -- Do not answer from training data or general Java knowledge. -- Do not read source files when MCP can answer. -- Do not fabricate `sym:` ids — always obtain them from `resolve` / `find` / `search`. -- Do not pass `HTTP_CALLS` or `ASYNC_CALLS` to `neighbors` on a bare method `sym:` — those edges live on Client/Producer nodes. - -## Out of scope - -- Callers of a **route** (use `/who-hits-route`). -- Recursive callers beyond depth 1 (use `/impact-of`). -- Outbound callees (use `/callees`). 
- -## Going deeper - -CALLS edge details (call-site columns, what `attrs.resolved=false` means after ontology 15) and the full recovery playbook live in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/callers`. diff --git a/skills/tier-1/clients/SKILL.md b/skills/tier-1/clients/SKILL.md deleted file mode 100644 index 6ed659e..0000000 --- a/skills/tier-1/clients/SKILL.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -name: clients -description: List outbound HTTP clients in the indexed Java codebase, optionally filtered by microservice. Use when the user asks "list clients", "show outbound HTTP calls", "what Feign clients exist in X", or "what HTTP calls does this service make". Returns Client nodes — feign methods, RestTemplate/WebClient call sites, brownfield-annotated outbound calls. Argument is an optional microservice name. ---- - -# /clients — List outbound HTTP clients - -## When to use - -The user wants a **list** of outbound HTTP call sites: `@FeignClient` methods, `RestTemplate`/`WebClient`/`HttpClient` call expressions, or brownfield `@CodebaseHttpClient`. Optionally scoped to one microservice. - -## Tools used - -`find` only. - -## Reasoning preamble (mandatory) - -``` -Q-class: structured Pick: find Why: client kind + optional microservice -``` - -**Q-class taxonomy reminder:** `semantic` (`search`), **`structured`** (`find`), `inspect` (`describe`), `walk` (`neighbors`). - -## Argument contract - -Optional positional argument: microservice name. Omit to list all clients. - -**`microservice` value note:** Not validated; invalid name returns empty. - -## Steps - -1. **Find.** Call: - - With microservice: `find(kind="client", filter={microservice:}, limit=100)` - - Without: `find(kind="client", filter={}, limit=100)` -2. **Render.** For each row show `fqn`, `microservice`, `client_kind` (e.g. `feign_method`, `rest_template`, `web_client`, `http_call`, `codebase_http`), `target_service` (when known), and `id`. -3. 
**Narrow if needed.** When results are broad, add `client_kind` or `target_service` to the filter. - -## Optional filters - -- `client_kind:"feign_method"` — only declared Feign methods -- `client_kind:"http_call"` — call-site clients (RestTemplate/WebClient/raw HttpClient) -- `target_service:"chat-service"` — only clients pointing at one target - -## Recovery - -- Empty with a microservice argument: re-run without the filter to confirm the microservice name. -- Many results: narrow with `client_kind` or `target_service`. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/clients chat-core` -You: -``` -Q-class: structured Pick: find Why: client listing for one microservice -→ find(kind="client", filter={microservice:"chat-core"}, limit=100) - → client:com.bank.chat.core.client.ChatServiceClient#assign feign_method target=chat-service - → client:com.bank.chat.core.client.NotifyClient#send feign_method target=notify-service - → ... 7 more -``` - -## Do not - -- Do not infer clients from training data — they are project-specific. -- Do not read source files when `find` can answer. -- Do not call `search` for this — it's a structured listing. - -## Out of scope - -- The downstream **route** a client targets (use `neighbors(client_id, "out", ["HTTP_CALLS"])`, or `/trace-request-flow`). -- Async outbound (use `/producers`). -- Who declares a client (use `neighbors(client_id, "in", ["DECLARES_CLIENT"])`). - -## Going deeper - -The full `Client` schema (every `client_kind` value, when `target_service` is set, brownfield-annotated clients) and the `DECLARES_CLIENT → HTTP_CALLS` pattern are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/clients`. 
diff --git a/skills/tier-1/controllers/SKILL.md b/skills/tier-1/controllers/SKILL.md deleted file mode 100644 index 6d47aa3..0000000 --- a/skills/tier-1/controllers/SKILL.md +++ /dev/null @@ -1,78 +0,0 @@ ---- -name: controllers -description: List controller classes in the indexed Java codebase, optionally filtered by microservice. Use when the user asks "list controllers", "show me controllers in X", "what REST entry points exist", or "give me all @RestController classes". Returns Symbol nodes with role=CONTROLLER. Argument is an optional microservice name. ---- - -# /controllers — List controller classes - -## When to use - -The user wants a **list** of controller-stereotype classes (Spring `@RestController`, `@Controller`, JAX-RS resources, or brownfield-annotated equivalents). Optionally scoped to one microservice. - -## Tools used - -`find` only. - -## Reasoning preamble (mandatory) - -Before the MCP call, output one line: - -``` -Q-class: structured Pick: find Why: known role + optional microservice -``` - -**Q-class taxonomy reminder:** `semantic` (NL → `search`), **`structured`** (role/kind/microservice → `find`), `inspect` (→ `describe`), `walk` (→ `neighbors`). - -## Argument contract - -Optional positional argument: microservice name. Omit to list all controllers across all microservices. - -**`microservice` value note:** Not validated by the MCP. An invalid name silently returns an empty list. If results are empty, verify the name with `find(kind="microservice", filter={})` first. - -## Steps - -1. **Find.** Call: - - With microservice: `find(kind="symbol", filter={role:"CONTROLLER", microservice:})` - - Without: `find(kind="symbol", filter={role:"CONTROLLER"})` -2. **Render.** For each row show `fqn`, `microservice`, and `id`. Group by microservice when no filter was given. - -## Recovery - -- Empty result with a microservice argument: re-run without the microservice filter; if non-empty, the microservice name is wrong. 
Look up the canonical names via `find(kind="microservice", filter={})`. -- Empty result without a microservice argument: likely no `CONTROLLER` role assigned. Try `find(kind="symbol", filter={symbol_kind:"class", fqn_prefix:""})` and inspect with `describe` to see `role`. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/controllers chat-core` -You: -``` -Q-class: structured Pick: find Why: role+microservice listing -→ find(kind="symbol", filter={role:"CONTROLLER", microservice:"chat-core"}) - → sym:com.bank.chat.core.api.ChatController microservice=chat-core - → sym:com.bank.chat.core.api.OperatorController microservice=chat-core -``` - -User: `/controllers` -You: -``` -Q-class: structured Pick: find Why: role listing, no scope -→ find(kind="symbol", filter={role:"CONTROLLER"}) - → grouped by microservice: chat-core (2), chat-assign (1), ... -``` - -## Do not - -- Do not answer from training data — controllers vary by project. -- Do not read source files when `find` can answer the question. -- Do not call `search` for this — it's a structured listing, not a fuzzy query. - -## Out of scope - -- HTTP route enumeration (use `/routes`). -- The handler method for a specific route (use `/handlers`). -- Listing controllers + their routes together (compose `/controllers` then `/routes` per service). - -## Going deeper - -Full `NodeFilter` reference (all valid keys for `find`, including `role`, `symbol_kind`, `framework`, `fqn_prefix`) and the role taxonomy live in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/controllers`. diff --git a/skills/tier-1/handlers/SKILL.md b/skills/tier-1/handlers/SKILL.md deleted file mode 100644 index 4845fc8..0000000 --- a/skills/tier-1/handlers/SKILL.md +++ /dev/null @@ -1,67 +0,0 @@ ---- -name: handlers -description: Show the method that handles an HTTP or messaging route via the EXPOSES edge. 
Use when the user asks "what handles X route", "handler for POST /foo", "which method handles this endpoint", or "find the controller method for path Y". Argument is a route: id or a route identifier (path, METHOD /path) resolved via `resolve`. ---- - -# /handlers — Handler method for a route - -## When to use - -The user has a **route** (path, METHOD /path, async topic, or `route:` id) and wants the method `sym:` that handles it. The edge is `Symbol —EXPOSES→ Route`, so the handler is the *in-neighbor* of the route. - -## Tools used - -`resolve` (when argument isn't already `route:`) + `neighbors`. `find` only as recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: in-edges of EXPOSES on route -``` - -If `resolve` runs first, that one is `Q-class: structured`. - -## Argument contract - -Single positional argument: a **route** id (`route:...` preferred) OR an identifier-shaped string (path like `/chat/join` or `METHOD /path`). - -## Steps - -1. **Resolve.** If argument starts with `route:` or `r:`, use directly. Else `resolve(identifier=, hint_kind="route")`: - - `status="one"` → use `node.id`. - - `status="many"` → list candidates, stop. - - `status="none"` → `find(kind="route", filter={path_prefix:})`; if still empty, stop and report. -2. **Walk EXPOSES inbound.** Call `neighbors(ids=, direction="in", edge_types=["EXPOSES"])`. -3. **Render.** Show the handler method `fqn` + `microservice` + (parent class via `describe` if useful). - -## Recovery - -- Multiple EXPOSES neighbors (rare): possible duplicate mapping across frameworks. Report all; let the user disambiguate. -- After two failed attempts on the same intent, stop and report. 
- -## Worked example - -User: `/handlers POST /chat/join` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped route arg -→ resolve(identifier="POST /chat/join", hint_kind="route") - → status=one id=route:POST /chat/join -Q-class: walk Pick: neighbors Why: in-edges of EXPOSES on route -→ neighbors(ids="route:POST /chat/join", direction="in", edge_types=["EXPOSES"]) - → sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest) microservice=chat-core -``` - -## Do not - -- Do not fabricate `route:` ids — always obtain them from `resolve` or `find`. -- Do not read source files when MCP can answer. - -## Out of scope - -- All inbound paths to a route — cross-service `HTTP_CALLS` + async `ASYNC_CALLS` + handler `EXPOSES` (use `/who-hits-route`). -- Following the request through the call graph (use `/trace-request-flow`). - -## Going deeper - -`EXPOSES` semantics (always Symbol→Route, one-to-one in well-formed projects) and the full `find(kind="route")` filter set are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/handlers`. diff --git a/skills/tier-1/implements/SKILL.md b/skills/tier-1/implements/SKILL.md deleted file mode 100644 index c5ea50d..0000000 --- a/skills/tier-1/implements/SKILL.md +++ /dev/null @@ -1,69 +0,0 @@ ---- -name: implements -description: Show concrete classes that implement an interface or abstract type. Use when the user asks "what implements X", "implementations of X", "concrete types for interface X", or "subclasses of X". Argument is a type sym: id or identifier resolved via `resolve`. Uses the IMPLEMENTS edge (also follow EXTENDS for class hierarchy). ---- - -# /implements — Concrete implementors of a type - -## When to use - -The user has an **interface** (or abstract class) Symbol and wants its concrete implementors. The edge is `Concrete —IMPLEMENTS→ Interface`, so implementors are the *in-neighbors*. - -For class-hierarchy parents/children use `EXTENDS` instead — see optional step 3. 
- -## Tools used - -`resolve` + `neighbors`. `search` only as a recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: in-edges of IMPLEMENTS on type -``` - -## Argument contract - -Single positional argument: a **type** Symbol id (`sym:...` preferred) OR an identifier-shaped string (FQN, simple class name). - -## Steps - -1. **Resolve.** If argument starts with `sym:`, use directly. Else `resolve(identifier=, hint_kind="symbol")`: - - `one` → use `node.id`. - - `many` → list candidates, stop. - - `none` → `search(query=, limit=5)`; if still empty, stop and report. -2. **Walk IMPLEMENTS inbound.** Call `neighbors(ids=, direction="in", edge_types=["IMPLEMENTS"])`. Each row is a concrete implementor. -3. **Optional — also follow EXTENDS** (when the user asks for the *full* subtype tree): `neighbors(ids=, direction="in", edge_types=["IMPLEMENTS","EXTENDS"])`. -4. **Render.** Show each implementor's `fqn` + `microservice`. Note if any are themselves interfaces (rare — multi-level interfaces). - -## Recovery - -- Empty result but the type is clearly an interface used in the codebase: confirm the resolved id is the **type** Symbol, not a method on it. `describe()` should show `symbol_kind: "interface"` or `"class"`. -- After two failed attempts on the same intent, stop and report. 
- -## Worked example - -User: `/implements OperatorAssignmentService` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="OperatorAssignmentService", hint_kind="symbol") - → status=one id=sym:com.bank.chat.assign.service.OperatorAssignmentService (interface) -Q-class: walk Pick: neighbors Why: in-edges of IMPLEMENTS on type -→ neighbors(ids="sym:...", direction="in", edge_types=["IMPLEMENTS"]) - → sym:com.bank.chat.assign.service.RoundRobinOperatorAssignmentService (chat-assign) - → sym:com.bank.chat.assign.service.WeightedOperatorAssignmentService (chat-assign) -``` - -## Do not - -- Do not enumerate implementors from training data — they are project-specific. -- Do not fabricate `sym:` ids. - -## Out of scope - -- Where the type is injected (use `/injects`). -- Per-method overrides — `IMPLEMENTS` is **type→type**; the per-method counterpart is `OVERRIDES` on a method id (composed key `OVERRIDDEN_BY.IMPLEMENTS`/`OVERRIDDEN_BY.EXTENDS` walks both directions). - -## Going deeper - -The `IMPLEMENTS` vs `EXTENDS` vs `OVERRIDES` distinction and the composed `OVERRIDDEN_BY.*` navigation are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/implements`. diff --git a/skills/tier-1/injects/SKILL.md b/skills/tier-1/injects/SKILL.md deleted file mode 100644 index 399ab8e..0000000 --- a/skills/tier-1/injects/SKILL.md +++ /dev/null @@ -1,70 +0,0 @@ ---- -name: injects -description: Show where a type is injected via dependency injection. Use when the user asks "where is X injected", "who injects X", "what depends on X via DI", or "find DI consumers of bean X". Argument is a type sym: id or identifier resolved via `resolve`. Uses the INJECTS edge (covers constructor, field, and setter injection). 
---- - -# /injects — Where a type is injected via DI - -## When to use - -The user has a **type** Symbol (interface, abstract class, or concrete bean) and wants the call sites where it's injected via DI — Spring constructor / field / setter injection, Lombok `@RequiredArgsConstructor`, brownfield equivalents. - -The edge is `Consumer —INJECTS→ Type`, so consumers are the *in-neighbors*. - -## Tools used - -`resolve` + `neighbors`. `search` only as a recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: in-edges of INJECTS on type -``` - -## Argument contract - -Single positional argument: a **type** Symbol id (`sym:...` preferred) OR an identifier-shaped string (FQN, simple class name). - -## Steps - -1. **Resolve.** If argument starts with `sym:`, use directly. Else `resolve(identifier=, hint_kind="symbol")`: - - `one` → use `node.id`. - - `many` → list candidates, stop. - - `none` → `search(query=, limit=5)`; if still empty, stop and report. -2. **Walk INJECTS inbound.** Call `neighbors(ids=, direction="in", edge_types=["INJECTS"])`. -3. **Render.** For each row show consumer `fqn` + `microservice` + edge `attrs.mechanism` (e.g. `constructor`, `field`, `setter`, `lombok_required_args`) + `attrs.field_or_param` (the field or parameter name). - -## Recovery - -- Empty result but the type is clearly a bean: confirm resolved id is the type Symbol (`describe()` → `symbol_kind:"class"|"interface"`). -- For interface types, also check `/implements ` — if no implementor is annotated as a Spring bean, the type may not be DI-managed at all. -- After two failed attempts on the same intent, stop and report. 
- -## Worked example - -User: `/injects OperatorAssignmentService` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="OperatorAssignmentService", hint_kind="symbol") - → sym:com.bank.chat.assign.service.OperatorAssignmentService -Q-class: walk Pick: neighbors Why: in-edges of INJECTS on type -→ neighbors(ids="sym:...", direction="in", edge_types=["INJECTS"]) - → sym:com.bank.chat.assign.api.AssignController constructor field_or_param=service - → sym:com.bank.chat.assign.scheduler.HealthCheck field field_or_param=service -``` - -## Do not - -- Do not fabricate `sym:` ids. -- Do not infer DI usage from training data — it's project-specific. - -## Out of scope - -- Concrete implementors of the type (use `/implements`). -- Who **calls methods on** an injected dependency (use `/callers` on the method id, not the type). -- Reverse direction — what does a class inject? Use `neighbors(, "out", ["INJECTS"])` directly. - -## Going deeper - -`INJECTS` `attrs` schema (mechanism, field_or_param, qualifier when present) and DI-framework coverage (Spring, Lombok, brownfield) are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/injects`. diff --git a/skills/tier-1/nl/SKILL.md b/skills/tier-1/nl/SKILL.md deleted file mode 100644 index 5b2d276..0000000 --- a/skills/tier-1/nl/SKILL.md +++ /dev/null @@ -1,84 +0,0 @@ ---- -name: nl -description: Natural-language search into the java-codebase-rag graph index. Use when the user asks a fuzzy question like "find authentication code", "where is X handled", "show me code that does Y", or any concept search that doesn't start with a sym:/route:/client:/producer: id or a recognizable FQN. Argument is free-form text. Composes search → describe → optional neighbors. ---- - -# /nl — Natural-language search into the graph - -## When to use - -The user's request is **conceptual**, not identifier-shaped. 
Examples: - -- "find authentication code" -- "where do we handle operator assignment?" -- "show me anything about chat escalation" - -If the user gives a `sym:` / `route:` / `client:` / `producer:` id or a clear FQN, prefer `/callers`, `/handlers`, `/describe` (via `resolve`), etc. - -## Tools used - -`search`, `describe`, `neighbors` (rarely). - -## Reasoning preamble (mandatory) - -Before **each** MCP call, output one line: - -``` -Q-class: -Pick: Why: <≤8 words> -``` - -**Q-class taxonomy:** - -- **semantic** — fuzzy NL → `search` -- **structured** — known role/kind/microservice/FQN-prefix listing → `find` -- **inspect** — get the full record + edge summary of a known id → `describe` -- **walk** — follow edges from a known id → `neighbors` - -For `/nl` the first call is always `semantic` → `search`. - -## Argument contract - -Single positional argument: free-form text describing what to find. - -## Steps - -1. **Search.** Call `search(query=, limit=8)`. Each row has `symbol_id`, `microservice`, `symbol_kind`, `role`. Review for strong fit (role aligns with what the user wants, FQN looks plausible). -2. **Inspect top hit.** When the top result has a `symbol_id`, call `describe(id=)` to get the full record and `edge_summary` (per-label `in`/`out` counts). -3. **Stop or walk.** If `describe` answers the question, stop. Otherwise pick the most relevant edge type from `edge_summary` and call `neighbors(ids=, direction=, edge_types=[])`. Single hop only — for deeper traces hand off to `/explain-feature` or `/trace-request-flow`. - -## Recovery - -- `search` returns empty: try `search(query=, table="all")`, then fall back to reading source files. If a known FQN fragment exists in ``, try `find(kind="symbol", filter={fqn_prefix:})`. -- After two failed attempts on the same intent, stop and report the tool name, args, and response snippet. 
- -## Worked example - -User: `/nl operator assignment` -You: -``` -Q-class: semantic Pick: search Why: NL query, no id -→ search(query="operator assignment", limit=8) - → top hit: sym:com.bank.chat.assign.service.OperatorAssignmentService (role: SERVICE) -Q-class: inspect Pick: describe Why: get edge_summary -→ describe(id="sym:com.bank.chat.assign.service.OperatorAssignmentService") - → edge_summary: { CALLS: {in: 4, out: 12}, INJECTS: {in: 3, out: 1}, IMPLEMENTS: {in: 2} } -→ stop: caller has enough to ask "/implements OperatorAssignmentService" next -``` - -## Do not - -- Do not answer from training data or general Java knowledge. -- Do not read source files when MCP tools can provide the answer. -- Do not fabricate `symbol_id` values — they always come from `search` / `find` / `resolve`. -- Do not walk deeper than one hop in this skill — hand off to a Tier 2 skill. - -## Out of scope - -- Structured listing by role or kind (use `/controllers`, `/routes`, `/clients`, `/producers`). -- Identifier-shaped input where `resolve` would be more precise. -- Multi-hop traces (use `/explain-feature`, `/trace-request-flow`, `/impact-of`). - -## Going deeper - -The full operating manual (NodeFilter keys, edge taxonomy, recovery playbook, navigation patterns) lives in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/nl` — no need to read the guide first. diff --git a/skills/tier-1/producers/SKILL.md b/skills/tier-1/producers/SKILL.md deleted file mode 100644 index 3e52b46..0000000 --- a/skills/tier-1/producers/SKILL.md +++ /dev/null @@ -1,67 +0,0 @@ ---- -name: producers -description: List outbound async producers in the indexed Java codebase, optionally filtered by microservice. Use when the user asks "list producers", "show outbound async calls", "what Kafka producers are in X", or "list message senders". Returns Producer nodes — KafkaTemplate / StreamBridge call sites and brownfield-annotated producers. Argument is an optional microservice name. 
---- - -# /producers — List outbound async producers - -## When to use - -The user wants a **list** of outbound async call sites: `KafkaTemplate.send`, `StreamBridge.send`, or brownfield `@CodebaseProducer`. Optionally scoped to one microservice. Symmetric counterpart to `/clients` for async. - -## Tools used - -`find` only. - -## Reasoning preamble (mandatory) - -``` -Q-class: structured Pick: find Why: producer kind + optional microservice -``` - -**Q-class taxonomy reminder:** `semantic` (`search`), **`structured`** (`find`), `inspect` (`describe`), `walk` (`neighbors`). - -## Argument contract - -Optional positional argument: microservice name. Omit to list all producers. - -**`microservice` value note:** Not validated; invalid name returns empty. - -## Steps - -1. **Find.** Call: - - With microservice: `find(kind="producer", filter={microservice:}, limit=100)` - - Without: `find(kind="producer", filter={}, limit=100)` -2. **Render.** For each row show `fqn`, `microservice`, `producer_kind` (e.g. `kafka_send`, `stream_bridge`, `codebase_producer`), `topic_prefix` (when known), and `id`. - -## Recovery - -- Empty with a microservice argument: re-run without the filter to confirm the microservice name. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/producers chat-core` -You: -``` -Q-class: structured Pick: find Why: producer listing for one microservice -→ find(kind="producer", filter={microservice:"chat-core"}, limit=100) - → producer:com.bank.chat.core.publisher.ChatEventPublisher kafka_send topic_prefix=chat-events - → producer:com.bank.chat.core.publisher.AuditPublisher kafka_send topic_prefix=audit -``` - -## Do not - -- Do not infer producers from training data — they are project-specific. -- Do not read source files when `find` can answer. -- Do not call `search` for this. - -## Out of scope - -- Outbound HTTP (use `/clients`). 
-- The downstream async **route** a producer targets (use `neighbors(producer_id, "out", ["ASYNC_CALLS"])`). -- Who declares a producer (use `neighbors(producer_id, "in", ["DECLARES_PRODUCER"])`). - -## Going deeper - -The full `Producer` schema (every `producer_kind` value, `topic_prefix` semantics, brownfield-annotated producers) and the `DECLARES_PRODUCER → ASYNC_CALLS` pattern are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/producers`. diff --git a/skills/tier-1/routes/SKILL.md b/skills/tier-1/routes/SKILL.md deleted file mode 100644 index 33e7187..0000000 --- a/skills/tier-1/routes/SKILL.md +++ /dev/null @@ -1,78 +0,0 @@ ---- -name: routes -description: List HTTP and messaging routes in the indexed Java codebase, optionally filtered by microservice. Use when the user asks "list routes", "show me endpoints", "list REST APIs", "what HTTP routes are in X", or "list Kafka listeners". Returns Route nodes (both HTTP and async). Argument is an optional microservice name. ---- - -# /routes — List HTTP and messaging routes - -## When to use - -The user wants a **list** of routes: HTTP endpoints (`@GetMapping` etc., or brownfield `@CodebaseHttpRoute`) and async inbound topics (`@KafkaListener`, `@RabbitListener`, or brownfield `@CodebaseAsyncRoute`). Optionally scoped to one microservice. - -## Tools used - -`find` only. - -## Reasoning preamble (mandatory) - -``` -Q-class: structured Pick: find Why: route kind + optional microservice -``` - -**Q-class taxonomy reminder:** `semantic` (NL → `search`), **`structured`** (`find`), `inspect` (`describe`), `walk` (`neighbors`). - -## Argument contract - -Optional positional argument: microservice name. Omit to list all routes. - -**`microservice` value note:** Not validated by the MCP — invalid name returns an empty list. Verify with `find(kind="microservice", filter={})` if results look wrong. - -## Steps - -1. 
**Find.** Call: - - With microservice: `find(kind="route", filter={microservice:})` - - Without: `find(kind="route", filter={})` -2. **Render.** For each row show `fqn` (HTTP method + path, or async topic), `microservice`, `framework` (e.g. `spring_mvc`, `spring_kafka`, `codebase_http`), and `id`. Group by microservice if no filter. - -## Optional filters to narrow - -- `path_prefix:"/chat"` — HTTP routes under a path -- `topic_prefix:"chat-events"` — async topics under a prefix -- `framework:"spring_mvc"` — only HTTP routes -- `framework:"spring_kafka"` — only Kafka listeners - -These compose with `microservice`. Brownfield-annotated routes have `framework` values starting with `codebase_`. - -## Recovery - -- Empty with a microservice argument: re-run without the filter; if non-empty, the name is wrong. -- Empty without any filter: confirm the index is built (`describe(id="meta:index")` if available, else ask the user to re-run `java-codebase-rag init`). -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/routes chat-assign` -You: -``` -Q-class: structured Pick: find Why: route listing for one microservice -→ find(kind="route", filter={microservice:"chat-assign"}) - → route:POST /chat/assign microservice=chat-assign framework=spring_mvc - → route:GET /chat/status microservice=chat-assign framework=spring_mvc - → route:chat-events.assigned microservice=chat-assign framework=spring_kafka -``` - -## Do not - -- Do not enumerate routes from training-data Spring conventions — the route set is per-project. -- Do not read source files when `find` can answer. -- Do not call `search` for this — it's a structured listing. - -## Out of scope - -- The handler method for a route (use `/handlers`). -- All inbound paths to a route (use `/who-hits-route`). -- Following a route end-to-end (use `/trace-request-flow`). 
- -## Going deeper - -Full route schema (HTTP vs async, framework values, brownfield-annotated routes) is in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/routes`. diff --git a/skills/tier-1/who-hits-route/SKILL.md b/skills/tier-1/who-hits-route/SKILL.md deleted file mode 100644 index f2b9eb8..0000000 --- a/skills/tier-1/who-hits-route/SKILL.md +++ /dev/null @@ -1,75 +0,0 @@ ---- -name: who-hits-route -description: Show all inbound paths to a route — cross-service HTTP/async callers plus the local handler. Use when the user asks "who calls this endpoint", "who hits this route", "what services call POST /foo", or "all inbound to route X". Argument is a route: id or route identifier. Combines HTTP_CALLS, ASYNC_CALLS, and EXPOSES in one neighbors call. ---- - -# /who-hits-route — All inbound paths to a route - -## When to use - -The user has a **route** and wants every inbound edge: cross-service HTTP callers (`HTTP_CALLS` from Client nodes), async producers (`ASYNC_CALLS` from Producer nodes), and the local handler method (`EXPOSES` from a Symbol). - -This is the *cross-service* counterpart to `/callers` (which only handles in-process method-to-method). - -## Tools used - -`resolve` (when argument isn't already `route:`) + `neighbors`. `find` only as recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: walk Pick: neighbors Why: all inbound edges of a route -``` - -## Argument contract - -Single positional argument: a **route** id (`route:...` preferred) OR an identifier-shaped string (`/chat/join`, `POST /chat/join`). - -## Steps - -1. **Resolve.** If argument starts with `route:` or `r:`, use directly. Else `resolve(identifier=, hint_kind="route")`: - - `one` → use `node.id`. - - `many` → list candidates, stop. - - `none` → `find(kind="route", filter={path_prefix:})`; if still empty, stop and report. -2. **Walk all inbound.** Call `neighbors(ids=, direction="in", edge_types=["HTTP_CALLS","ASYNC_CALLS","EXPOSES"])`. -3. 
**Render grouped by edge type:** - - `EXPOSES` → handler method Symbol (always exactly one in well-formed projects). - - `HTTP_CALLS` → Client nodes. Each row carries `attrs.match` (path/method match strength) and `attrs.confidence`. - - `ASYNC_CALLS` → Producer nodes. Same `attrs.match` / `attrs.confidence` columns. - -## Recovery - -- Empty result on a known route: the route exists but has no callers indexed. Confirm with `describe()` (`edge_summary` will show the same). -- Low `attrs.confidence` rows: those are probabilistic matches (path-template inference). Report but flag. -- After two failed attempts on the same intent, stop and report. - -## Worked example - -User: `/who-hits-route POST /chat/join` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped route arg -→ resolve(identifier="POST /chat/join", hint_kind="route") - → status=one id=route:POST /chat/join -Q-class: walk Pick: neighbors Why: all inbound edges of route -→ neighbors(ids="route:POST /chat/join", direction="in", - edge_types=["HTTP_CALLS","ASYNC_CALLS","EXPOSES"]) - → EXPOSES : sym:com.bank.chat.core.api.ChatController#joinOperator (chat-core) - → HTTP_CALLS : client:com.bank.gateway.client.ChatClient#join (gateway) match=exact confidence=1.0 - → ASYNC_CALLS: (none) -``` - -## Do not - -- Do not pass `HTTP_CALLS`/`ASYNC_CALLS` on a bare method `sym:` — those edges originate at Client/Producer nodes, never methods. -- Do not fabricate `route:` ids. - -## Out of scope - -- In-process callers of a method (use `/callers`). -- Following the request forward from the handler (use `/trace-request-flow`). -- The handler alone (use `/handlers`). - -## Going deeper - -`HTTP_CALLS` / `ASYNC_CALLS` schema (always Client→Route / Producer→Route, never one-hop from method) and the `attrs.match` semantics are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/who-hits-route`. 
diff --git a/skills/tier-2/explain-feature/SKILL.md b/skills/tier-2/explain-feature/SKILL.md deleted file mode 100644 index 5227ae0..0000000 --- a/skills/tier-2/explain-feature/SKILL.md +++ /dev/null @@ -1,97 +0,0 @@ ---- -name: explain-feature -description: Understand how a feature works end-to-end by locating entry points and tracing call chains with bounded depth. Use when the user asks "how does X work", "explain feature X", "walk me through Y", or "show me the flow of Z". Argument is free-form feature/concept text. Composes search → describe → bounded neighbors walks. ---- - -# /explain-feature — Understand a feature end-to-end - -## When to use - -The user wants a **narrative explanation** of how something works: entry points, the call chain to persistence/async boundaries, and any cross-service hops. The input is fuzzy (a feature name, not an id). - -For a more focused single-method noise-filtered map use `/mini-map`. For a known route end-to-end use `/trace-request-flow`. - -## Tools used - -`search`, `describe`, `neighbors`. No `find` — the input is fuzzy. - -## Reasoning preamble (mandatory) - -Before **each** MCP call: - -``` -Q-class: -Pick: Why: <≤8 words> -``` - -**Q-class taxonomy:** `semantic` (`search`), `structured` (`find`/`resolve`), `inspect` (`describe`), `walk` (`neighbors`). - -This skill typically uses `semantic → inspect → walk → walk`. - -## Argument contract - -Single positional argument: free-form text describing the feature or concept to explain. - -## Steps - -1. **Locate entry points.** Call `search(query=, limit=8)`. Pick top 1–3 hits with strong `symbol_id` fit (role aligns with what the user wants — `CONTROLLER` for HTTP features, `SERVICE` for business logic, etc.). -2. **Inspect each hit.** Call `describe(id=)` for each. Read `edge_summary` (per-label `in`/`out` counts) and `role` to choose which edges to walk next. -3. 
**Walk with bounded `neighbors`.** Use *small* `edge_types` sets per call: - - **Methods:** `neighbors(ids=, direction="out", edge_types=["CALLS"])` for in-process flow. - - **At controller/handler boundaries:** `neighbors(ids=, direction="in", edge_types=["EXPOSES"])` to find the handler from a route, or vice versa. - - **At outbound HTTP/async:** `neighbors(out, ["DECLARES_CLIENT"])` then `neighbors(out, ["HTTP_CALLS"])`; or `neighbors(out, ["DECLARES_PRODUCER"])` then `neighbors(out, ["ASYNC_CALLS"])`. - - **Type wiring:** `neighbors(in, ["IMPLEMENTS"])` to see who realizes an interface; `neighbors(in, ["INJECTS"])` to see who depends on a type. -4. **Render.** Synthesize a narrative: entry points → key methods → data flow → cross-service boundaries. Cite each claim with the `sym:` / `route:` id you walked. - -## Stop conditions - -- Maximum 3 hops from any entry point. -- Stop as soon as you can answer the user's question. -- Do not pre-fetch unrelated subgraphs. -- After two empty/failed `neighbors` calls on the same node, stop walking that branch. - -## Recursion limit - -- Depth ≤ 3 from each entry point. -- Maximum 10 `neighbors` calls total across the whole skill invocation. -- If the explanation needs more, hand off to `/trace-request-flow` (route-anchored, depth 4) or `/impact-of` (reverse direction). 
- -## Worked example - -User: `/explain-feature operator assignment` -You: -``` -Q-class: semantic Pick: search Why: NL feature name -→ search(query="operator assignment", limit=8) - → sym:com.bank.chat.assign.service.OperatorAssignmentService (interface, SERVICE) - → sym:com.bank.chat.assign.api.AssignController (CONTROLLER) -Q-class: inspect Pick: describe Why: edge_summary on interface -→ describe(id="sym:...OperatorAssignmentService") - → edge_summary { IMPLEMENTS.in: 2, INJECTS.in: 3, CALLS.in: 4 } -Q-class: walk Pick: neighbors Why: find concrete implementors -→ neighbors(ids="sym:...OperatorAssignmentService", direction="in", edge_types=["IMPLEMENTS"]) - → RoundRobin..., Weighted... (2 strategies) -Q-class: walk Pick: neighbors Why: trace inbound from controller -→ neighbors(ids="sym:...AssignController", direction="out", edge_types=["CALLS"]) - → AssignController → OperatorAssignmentService#assign → OperatorRepository#save -Synthesize: "Operator assignment has two strategies (RoundRobin, Weighted) -behind an interface. Triggered via AssignController. Persists via OperatorRepository..." -``` - -## Do not - -- Do not answer from training data or general Java knowledge. -- Do not read source files when MCP can answer. -- Do not skip MCP calls and guess. -- Do not fabricate ids — always obtain them from `search` / `find` / `resolve`. -- Do not walk all edge types at once — small `edge_types` sets per call. - -## Out of scope - -- Exact impact analysis (use `/impact-of`). -- Route-anchored end-to-end trace (use `/trace-request-flow`). -- Noise-filtered single-method call map (use `/mini-map`). - -## Going deeper - -Edge taxonomy, the locate→inspect→walk workflow, and the recovery playbook are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/explain-feature`. 
diff --git a/skills/tier-2/impact-of/SKILL.md b/skills/tier-2/impact-of/SKILL.md deleted file mode 100644 index ba634f2..0000000 --- a/skills/tier-2/impact-of/SKILL.md +++ /dev/null @@ -1,95 +0,0 @@ ---- -name: impact-of -description: Analyze what breaks if a symbol changes by walking inbound edges recursively with bounded depth. Use when the user asks "what breaks if I change X", "impact of changing X", "who depends on X", or "blast radius of modifying Y". Argument is a sym: id or identifier resolved via `resolve`. Covers CALLS, INJECTS, IMPLEMENTS, EXTENDS plus route/client impact when applicable. ---- - -# /impact-of — What breaks if this changes - -## When to use - -The user wants the **blast radius** of changing a symbol: who calls it, who injects it, who implements it, and (for methods on routes / methods declaring clients) what crosses service boundaries. - -This is the reverse direction of `/trace-request-flow`. For a *forward* request walk use that skill instead. - -## Tools used - -`resolve`, `describe`, `neighbors`. `search` only as a recovery fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: -Pick: Why: <≤8 words> -``` - -This skill typically uses `structured → inspect → walk → walk ...`. - -## Argument contract - -Single positional argument: a Symbol id (`sym:...` preferred) OR an identifier-shaped string (FQN, simple name) → `resolve(identifier=..., hint_kind="symbol")`. - -## Steps - -1. **Resolve.** If argument starts with `sym:`, use directly. Else `resolve(identifier=, hint_kind="symbol")` (`one`/`many`/`none` handling per Tier 1 skills). -2. **Inspect.** Call `describe(id=)`. Read `edge_summary` and `role`. -3. **Recursive inbound walk (depth ≤ 2).** Call `neighbors(ids=, direction="in", edge_types=["CALLS","INJECTS","IMPLEMENTS","EXTENDS"])`. For each inbound neighbor that is a **method** symbol, repeat the same call on that id (one more hop only). -4. 
**Route/client impact** (when applicable): - - If the symbol is a method that handles a route: `neighbors(, "out", ["EXPOSES"])` to find the route, then `neighbors(, "in", ["HTTP_CALLS","ASYNC_CALLS"])` for callers outside the codebase. Each row is a Client or Producer in another service. - - If the symbol declares clients: `neighbors(, "out", ["DECLARES_CLIENT"])`, then `neighbors(, "out", ["HTTP_CALLS"])` for affected downstream services. -5. **Render impact list.** Deduplicate. Group as: - - **Direct (depth 1):** callers, injectors, implementors, extenders. - - **Transitive (depth 2):** callers-of-callers, injectors-of-callers. - - **Cross-service:** route / client impact when applicable. - - Cite each entry by `sym:` / `route:` / `client:` id. - -## Stop conditions - -- Depth limit reached (≤ 2 hops on the inbound walk). -- No more inbound edges to follow. -- Cycle detected (node already in impact set). -- After two empty/failed `neighbors` calls in a row, stop walking that branch. - -## Recursion limit - -- Depth ≤ 2 from the target symbol on the inbound walk. -- Maximum 8 `neighbors` calls total. Route/client cross-service hops count toward this budget. - -## Worked example - -User: `/impact-of ChatRepository` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="ChatRepository", hint_kind="symbol") - → sym:com.bank.chat.core.repository.ChatRepository -Q-class: inspect Pick: describe Why: edge_summary -→ describe(id="sym:...") - → role: REPOSITORY edge_summary { CALLS.in: 7, INJECTS.in: 2 } -Q-class: walk Pick: neighbors Why: depth-1 inbound -→ neighbors(ids="sym:...", direction="in", edge_types=["CALLS","INJECTS","IMPLEMENTS","EXTENDS"]) - → callers: ChatService#save, ChatService#findById, ... 
- → injectors: ChatService (constructor) -Q-class: walk Pick: neighbors Why: depth-2 from ChatService -→ neighbors(ids="sym:...ChatService", direction="in", edge_types=["CALLS","INJECTS","IMPLEMENTS","EXTENDS"]) - → callers: ChatController#joinOperator, ... -Render impact: - Direct (depth 1): ChatService#save, ChatService#findById (CALLS); ChatService (INJECTS) - Transitive (depth 2): ChatController#joinOperator (CALLS via ChatService) -``` - -## Do not - -- Do not answer from training data. -- Do not read source files when MCP can answer. -- Do not fabricate ids. -- Do not walk **outbound** here — that's `/trace-request-flow` / `/callees` / `/explain-feature`. - -## Out of scope - -- Exact line-level change impact (use `git diff` + source reading after this analysis). -- Noise-filtered call maps (use `/mini-map`). -- Forward request flow (use `/trace-request-flow`). - -## Going deeper - -Edge semantics for `CALLS` / `INJECTS` / `IMPLEMENTS` / `EXTENDS`, plus the `HTTP_CALLS` cross-service caller pattern, are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/impact-of`. diff --git a/skills/tier-2/mini-map/SKILL.md b/skills/tier-2/mini-map/SKILL.md deleted file mode 100644 index f9a8ab4..0000000 --- a/skills/tier-2/mini-map/SKILL.md +++ /dev/null @@ -1,142 +0,0 @@ ---- -name: mini-map -description: Noise-filtered call map for a method. Shows delegation, persistence, and publish seams without entity accessor or JDK noise. Use when /callees returns too many rows, the user asks "map what X does", "simplify the call graph for X", "what does X actually do", or any time a hot SERVICE/COMPONENT method needs a clean readout. Argument is a sym: id or identifier, with optional depth (default 2, max 4). ---- - -# /mini-map — Noise-filtered call map for a method - -## When to use - -The user has a **hot method** (typically a SERVICE/COMPONENT) where raw `/callees` returns 30+ rows mixed with entity accessors and JDK calls. 
`/mini-map` composes MCP `edge_filter` axes with a small set of skill-side heuristics to produce a clean DELEGATES / PERSISTS / READS / PUBLISHES readout. - -For a single one-hop callee listing use `/callees`. For a feature-level walk use `/explain-feature`. For impact analysis use `/impact-of`. - -## Tools used - -`resolve`, `neighbors` (with `edge_filter`). `describe` and `search` only as recovery fallbacks. - -## Reasoning preamble (mandatory) - -``` -Q-class: -Pick: Why: <≤8 words> -``` - -This skill is mostly `structured → walk → walk ...` (one `walk` per hop, possibly multiple per hop for filtered passes). - -## Argument contract - -- **Required:** seed id — `sym:` id or identifier-shaped string → `resolve(identifier=..., hint_kind="symbol")`. -- **Optional:** `depth` (default 2, max 4) — recursion depth on `DELEGATES` and `PUBLISHES` targets. -- **Optional:** `microservice` — scope filter to apply on each recursion (uses `microservice` on `find` lookups; on `neighbors` it's an out-of-band display filter — apply when rendering). - -## Steps - -### Step 1 — Resolve - -If the argument starts with `sym:`, use it as the id. Otherwise call `resolve(identifier=, hint_kind="symbol")`. On `status="one"`, use `node.id`; on `many`, list `candidates` and stop; on `none`, call `search(query=, limit=5)` and stop if still empty. - -### Step 2 — Fetch ordered CALLS - -Call `neighbors(ids=, direction="out", edge_types=["CALLS"])`. - -Rows are source-ordered (`call_site_line`, `call_site_byte`). After ontology 15, true receiver-failure sites are **not** on `CALLS` — they are `UnresolvedCallSite` nodes. `attrs.resolved=false` on remaining `CALLS` rows means known-receiver-external (JDK/Spring) callees, not receiver failure. - -### Step 3 — Optional MCP pre-filter - -When the raw `CALLS` set is large (e.g. 
> 30 rows), prefer MCP-side filtering over hand-rolled rules: - -- **Skeleton pass** (delegation hops): `neighbors(direction="out", edge_types=["CALLS"], edge_filter={callee_declaring_role:"SERVICE"})`. -- **Trim JDK/low-signal:** `neighbors(direction="out", edge_types=["CALLS"], edge_filter={min_confidence:0.5})` and/or `edge_filter={exclude_callee_declaring_roles:["OTHER"]}` (blunt — also drops known-external rows; document in output). -- **Collapse identical callees:** `neighbors(direction="out", edge_types=["CALLS"], dedup_calls=True)`. -- **Full transcript with unresolved sites:** `neighbors(direction="out", edge_types=["CALLS"], include_unresolved=True)` — use **only when not using `edge_filter`** on the same call (mutual exclusivity). - -### Step 4 — Skill heuristics - -What `callee_declaring_role` cannot do (accessor noise + semantic labels): - -1. **Skip entity accessors.** Callee simple name matches `get*` / `set*` / `is*` / `` on types matching `*Entity`, `*Request`, `*Response`, `*Event`, `*DTO`, or parent `role=DTO`. -2. **Skip JDK/library** when step 3 did not run: callee `fqn` prefix `java.`, `javax.`, `org.slf4j.`, `lombok.`. -3. **Classify remainder** (use `attrs.callee_declaring_role` when present, else callee parent `role` from `describe`): - - `REPOSITORY` / `MAPPER` → `PERSISTS` (`save*`/`delete*`) or `READS` (`find*`/`get*`). - - `SERVICE` or listener/scheduled capabilities → `DELEGATES`. - - `CLIENT` or publisher component → `PUBLISHES`. - - Else → `CALLS`. -4. **Deduplicate for display.** Same callee FQN → one line with `(×N)`. - -### Step 5 — Recurse - -On `DELEGATES` and `PUBLISHES` targets, repeat steps 2–4 up to `depth` (default 2, max 4). - -### Step 6 — Render output - -``` -() - DELEGATES → …Service#method - PERSISTS → …Repository#save (×2) - READS → …Repository#findById - PUBLISHES → …Publisher#publish (×1) - [filtered ~N edges: ~A accessors, ~B JDK/OTHER, ~C deduped] -``` - -The `[filtered ...]` line is transparency. 
Offer raw `/callees` or `neighbors` with a documented `edge_filter` if the map looks too thin (< 3 signal lines). - -## Stop conditions - -- Depth limit reached. -- No `DELEGATES` or `PUBLISHES` targets to recurse on. -- Cycle detected (callee already in map). -- After two empty filtered `neighbors` calls on the same node, fall back to raw `/callees` for that node. - -## Recursion limit - -- Default depth 2, max 4. -- When running without subagent: default depth 1, cap total raw edges examined per hop. - -## Subagent preference - -This skill is designed for subagent invocation. The subagent runs the MCP + heuristic pipeline per hop in its own context and returns a compact map. The main agent drills in with file reads after. - -**Graceful degradation:** on hosts without subagents (or tight context budgets), run in the main agent with depth default 1. If the map has fewer than 3 signal lines after filtering, fall back to raw `/callees` (optionally with `edge_filter`) and note that stereotype roles may be `OTHER`. - -## Worked example - -User: `/mini-map ClientMessageProcessor#process` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped argument -→ resolve(identifier="ClientMessageProcessor#process", hint_kind="symbol") - → sym:com.bank.chat.core.processor.ClientMessageProcessor#process(ProcessingContext, InternalEvent) -Q-class: walk Pick: neighbors Why: raw CALLS scan -→ neighbors(direction="out", edge_types=["CALLS"]) - → ~49 rows (post-ontology-15 re-index) -Q-class: walk Pick: neighbors Why: skeleton pass via SERVICE role -→ neighbors(direction="out", edge_types=["CALLS"], edge_filter={callee_declaring_role:"SERVICE"}) - → skeleton: 8 rows -Skill classify → ~8–12 signal rows total. 
-Output: - ClientMessageProcessor#process(ProcessingContext, InternalEvent) - DELEGATES → SplitResolverService#resolveSplit - DELEGATES → DistributionTriggerPublisher#trigger - PERSISTS → ChatRepository#save (×2) - READS → ChatRepository#findById - [filtered ~37 edges: ~22 accessors, ~10 JDK/OTHER, ~5 deduped] -``` - -## Do not - -- Do not answer from training data or general Java knowledge. -- Do not read source files when MCP can answer. -- Do not skip MCP calls and guess. -- Do not fabricate `sym:` ids — always obtain them from `resolve` / `find` / `search`. -- Do not bypass the MCP `edge_filter` when applicable — these heuristics **compose** with `edge_filter`, they don't replace it. - -## Out of scope - -- Cross-service tracing (use `/trace-request-flow`). -- Impact analysis (use `/impact-of`). -- Replacing MCP `edge_filter` — this skill **composes** MCP filters; heuristics cover accessor noise and semantic labels only. - -## Going deeper - -The full `edge_filter` axis list, the role taxonomy (every `callee_declaring_role` value), and the rationale behind the post-ontology-15 `UnresolvedCallSite` split are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/mini-map`. diff --git a/skills/tier-2/trace-request-flow/SKILL.md b/skills/tier-2/trace-request-flow/SKILL.md deleted file mode 100644 index d52a568..0000000 --- a/skills/tier-2/trace-request-flow/SKILL.md +++ /dev/null @@ -1,107 +0,0 @@ ---- -name: trace-request-flow -description: Follow a request from an HTTP/async route entry point through the in-process call chain to persistence or async boundaries, with cross-service hops at clients/producers. Use when the user asks "trace POST /foo", "follow the request for X", "what happens when X is called", or "end-to-end flow of route Y". Argument is a route id, METHOD /path string, or path fragment. 
---- - -# /trace-request-flow — Follow a request end-to-end - -## When to use - -The user has a **route** (or path) and wants a forward trace: route → handler method → service calls → repository / outbound client / producer. This is the *forward* counterpart to `/impact-of`. - -For a feature explanation that isn't anchored to a route use `/explain-feature`. For a single noisy method use `/mini-map`. - -## Tools used - -`resolve`, `find` (route fallback), `neighbors`. `describe` optional. `search` only as a deep-fallback. - -## Reasoning preamble (mandatory) - -``` -Q-class: -Pick: Why: <≤8 words> -``` - -Typical sequence: `structured → walk → walk → walk` (with the occasional `inspect`). - -## Argument contract - -Single positional argument: a route identifier — `route:` id, `METHOD /path`, or path fragment → `resolve(identifier=..., hint_kind="route")` then `find(kind="route", filter={path_prefix:...})` on `none`. - -## Steps - -1. **Resolve route.** If argument starts with `route:` or `r:`, use directly. Else `resolve(identifier=, hint_kind="route")`: - - `one` → use `node.id`. - - `many` → list candidates, stop. - - `none` → `find(kind="route", filter={path_prefix:})`; if still empty, stop and report. -2. **Handler method.** `neighbors(ids=, direction="in", edge_types=["EXPOSES"])` → exactly one handler `sym:` in well-formed projects. -3. **Walk call chain (depth ≤ 4 on methods).** `neighbors(ids=, direction="out", edge_types=["CALLS"])`. For each callee: - - **SERVICE / COMPONENT** → likely to delegate further; recurse one more hop. - - **REPOSITORY / MAPPER** → classify as persistence, **stop** that branch. - - **CLIENT-declaring** (parent has `DECLARES_CLIENT.out > 0` on `describe`) → go to step 4 for that method. - - **PRODUCER-declaring** → same, for producers. -4. **Cross-service boundaries.** At methods with outbound clients: `neighbors(, "out", ["DECLARES_CLIENT"])`, then for each Client id `neighbors(, "out", ["HTTP_CALLS"])`. 
At methods with async producers: `neighbors(, "out", ["DECLARES_PRODUCER"])`, then for each Producer id `neighbors(, "out", ["ASYNC_CALLS"])`. -5. **Render ordered sequence.** - ``` - Route → Handler → Service → ... → Repository / Client → DownstreamRoute / Producer → DownstreamTopic - ``` - with edge type annotations at every boundary. - -## Stop conditions - -- Depth limit reached (≤ 4 hops from the handler). -- No more `CALLS` edges to follow. -- All branches terminated at REPOSITORY / MAPPER / CLIENT / PRODUCER endpoints. -- Cycle detected (method already in trace). -- After two empty `neighbors` calls in a row on the same branch, stop that branch. - -## Recursion limit - -- Depth ≤ 4 from the handler method on the CALLS walk. -- Maximum 10 `neighbors` calls total. Cross-service hops in step 4 count toward this budget. - -## Worked example - -User: `/trace-request-flow POST /chat/join` -You: -``` -Q-class: structured Pick: resolve Why: identifier-shaped route arg -→ resolve(identifier="POST /chat/join", hint_kind="route") - → route:POST /chat/join -Q-class: walk Pick: neighbors Why: route → handler -→ neighbors(ids="route:POST /chat/join", direction="in", edge_types=["EXPOSES"]) - → sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest) -Q-class: walk Pick: neighbors Why: depth-1 CALLS from handler -→ neighbors(ids="sym:...ChatController#joinOperator(...)", direction="out", edge_types=["CALLS"]) - → ChatService#join (SERVICE — recurse) -Q-class: walk Pick: neighbors Why: depth-2 from ChatService#join -→ neighbors(ids="sym:...ChatService#join(...)", direction="out", edge_types=["CALLS"]) - → ChatRepository#save (REPOSITORY — persistence, stop branch) - → ChatEventPublisher#publishJoined (PRODUCER-declaring — step 4 hop) -Q-class: walk Pick: neighbors Why: producer fan-out -→ neighbors(ids="sym:...ChatService#join(...)", direction="out", edge_types=["DECLARES_PRODUCER"]) - → producer:ChatEventPublisher -→ 
neighbors(ids="producer:ChatEventPublisher", direction="out", edge_types=["ASYNC_CALLS"]) - → route:chat-events.joined (chat-assign) -Render: - POST /chat/join → ChatController#joinOperator → ChatService#join → ChatRepository#save (persists) - └→ producer:ChatEventPublisher - → chat-events.joined (chat-assign) -``` - -## Do not - -- Do not pass `HTTP_CALLS`/`ASYNC_CALLS` on bare method `sym:` — those edges originate at Client/Producer nodes. -- Do not fabricate ids. -- Do not walk **inbound** here — that's `/impact-of`. -- Do not walk all edge types at once — single `edge_types=["CALLS"]` per call (or single boundary edge in step 4). - -## Out of scope - -- Reverse blast radius (use `/impact-of`). -- Noise-filtered single-method map (use `/mini-map`). -- Feature explanation without a route anchor (use `/explain-feature`). - -## Going deeper - -The full forward-trace workflow, the role taxonomy used in the classification heuristic (SERVICE / REPOSITORY / CLIENT / PRODUCER), and the cross-service edge schema are in `docs/AGENT-GUIDE.md`. This skill is self-sufficient for `/trace-request-flow`. diff --git a/tests/test_agent_skills_static.py b/tests/test_agent_skills_static.py index da996f6..2aa96c6 100644 --- a/tests/test_agent_skills_static.py +++ b/tests/test_agent_skills_static.py @@ -3,18 +3,11 @@ Imports allowlists from production code (mcp_v2, java_ontology) — not hand-maintained lists. Validates: - frontmatter (name + description present) - - MCP tool names referenced in skill bodies + - MCP tool names referenced in skill body - find kind values - direction values - edge_types values - - Tier 2 body structure (stop conditions, recursion limit) - -Known gap (intentional — see AGENT-SKILLS-AND-COMMANDS-PROPOSE §11): - - edge_filter parameters (callee_declaring_role, min_confidence, - exclude_callee_declaring_roles, dedup_calls, include_unresolved) - referenced in /mini-map are NOT validated against mcp_v2 parameter - definitions. 
The static validator does not parse edge_filter dicts. - On re-index, manually verify /mini-map against the MCP surface. + - worked example section present """ from __future__ import annotations @@ -45,29 +38,8 @@ # --------------------------------------------------------------------------- SKILLS_DIR = Path(__file__).resolve().parent.parent / "skills" -TIER1_DIR = SKILLS_DIR / "tier-1" -TIER2_DIR = SKILLS_DIR / "tier-2" - -TIER1_NAMES = [ - "nl", "controllers", "routes", "clients", "producers", - "callers", "callees", "handlers", "who-hits-route", - "implements", "injects", -] - -TIER2_NAMES = [ - "explain-feature", "impact-of", "trace-request-flow", "mini-map", -] - -ALL_SKILL_NAMES = TIER1_NAMES + TIER2_NAMES - - -def _skill_dir(name: str) -> Path: - """Return the tier directory for a skill name.""" - if name in TIER1_NAMES: - return TIER1_DIR / name - if name in TIER2_NAMES: - return TIER2_DIR / name - raise ValueError(f"Unknown skill name: {name}") +SKILL_NAME = "explore-codebase" +SKILL_PATH = SKILLS_DIR / SKILL_NAME / "SKILL.md" def _parse_frontmatter(text: str) -> dict[str, str]: @@ -83,14 +55,19 @@ def _parse_frontmatter(text: str) -> dict[str, str]: return result +def _read_skill() -> tuple[dict[str, str], str]: + """Read the explore-codebase SKILL.md and return (frontmatter, body).""" + text = SKILL_PATH.read_text(encoding="utf-8") + fm = _parse_frontmatter(text) + body = re.sub(r"^---\n.*?\n---\n*", "", text, count=1, flags=re.DOTALL) + return fm, body + + def _extract_tool_refs(body: str) -> set[str]: """Extract tool names referenced in MCP call patterns.""" - # Match patterns like `search(...)`, `find(kind=...)`, `describe(id=...)`, - # `neighbors({ids:`, `resolve(identifier=`, also backtick-wrapped names. refs: set[str] = set() for m in re.finditer(r"`(search|find|describe|neighbors|resolve)\b", body): refs.add(m.group(1)) - # Also catch patterns like search(query=...) find(kind=...) 
without backticks for m in re.finditer(r"\b(search|find|describe|neighbors|resolve)\s*[\(\{]", body): refs.add(m.group(1)) return refs @@ -117,13 +94,11 @@ def _extract_direction_refs(body: str) -> set[str]: def _extract_edge_type_refs(body: str) -> set[str]: """Extract edge_types values referenced in skill body.""" refs: set[str] = set() - # Match edge_types lists: ["CALLS"] or ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"] for m in re.finditer(r'edge_types\s*:\s*\[([^\]]+)\]', body): inner = m.group(1) for val in re.findall(r'"(\w[\w.]*)"', inner): if val in _ALL_EDGE_TYPES: refs.add(val) - # Also match quoted edge names in backticked patterns for m in re.finditer(r'\["(\w[\w.]*)"', body): val = m.group(1) if val in _ALL_EDGE_TYPES: @@ -131,190 +106,125 @@ def _extract_edge_type_refs(body: str) -> set[str]: return refs -def _read_skill(name: str) -> tuple[dict[str, str], str]: - """Read a skill's SKILL.md and return (frontmatter, body).""" - path = _skill_dir(name) / "SKILL.md" - text = path.read_text(encoding="utf-8") - fm = _parse_frontmatter(text) - # Body is everything after the closing --- - body = re.sub(r"^---\n.*?\n---\n*", "", text, count=1, flags=re.DOTALL) - return fm, body - - -# --------------------------------------------------------------------------- -# Parametrized test ids -# --------------------------------------------------------------------------- - -@pytest.fixture(params=ALL_SKILL_NAMES, ids=lambda n: f"skill:{n}") -def skill_name(request): - return request.param - - # --------------------------------------------------------------------------- # Tests # --------------------------------------------------------------------------- class TestSkillFrontmatter: - """Every SKILL.md must have valid frontmatter.""" - - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_frontmatter_has_name_and_description(self, name: str): - fm, _ = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) - assert "name" in fm, f"{rel}/SKILL.md 
missing frontmatter 'name'" - assert fm["name"] == name, f"{rel}/SKILL.md: name={fm['name']!r}, expected {name!r}" - assert "description" in fm, f"{rel}/SKILL.md missing frontmatter 'description'" + """SKILL.md must have valid frontmatter.""" + + def test_skill_file_exists(self): + assert SKILL_PATH.is_file(), f"Missing {SKILL_PATH}" + + def test_frontmatter_has_name_and_description(self): + fm, _ = _read_skill() + assert "name" in fm, "SKILL.md missing frontmatter 'name'" + assert fm["name"] == SKILL_NAME, f"name={fm['name']!r}, expected {SKILL_NAME!r}" + assert "description" in fm, "SKILL.md missing frontmatter 'description'" assert len(fm["description"]) >= 20, ( - f"{rel}/SKILL.md description too short ({len(fm['description'])} chars)" + f"description too short ({len(fm['description'])} chars)" ) - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_skill_file_exists(self, name: str): - path = _skill_dir(name) / "SKILL.md" - assert path.is_file(), f"Missing {path.relative_to(SKILLS_DIR.parent)}" - class TestMCPToolReferences: - """Tool names in skill bodies must be valid MCP navigation tools.""" + """Tool names in skill body must be valid MCP navigation tools.""" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_tool_refs_are_valid(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) + def test_tool_refs_are_valid(self): + _, body = _read_skill() refs = _extract_tool_refs(body) invalid = refs - _VALID_TOOLS - assert not invalid, f"{rel}/SKILL.md references invalid tools: {invalid}" + assert not invalid, f"SKILL.md references invalid tools: {invalid}" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_skill_references_at_least_one_tool(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) + def test_skill_references_all_five_tools(self): + _, body = _read_skill() refs = _extract_tool_refs(body) - assert refs, f"{rel}/SKILL.md 
references no MCP tools" + missing = _VALID_TOOLS - refs + assert not missing, f"SKILL.md does not reference all 5 tools, missing: {missing}" class TestKindAndEdgeReferences: """Kind, direction, and edge_type values must match production allowlists.""" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_kind_refs_are_valid(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) + def test_kind_refs_are_valid(self): + _, body = _read_skill() refs = _extract_kind_refs(body) invalid = refs - _VALID_KINDS - assert not invalid, f"{rel}/SKILL.md references invalid find kinds: {invalid}" + assert not invalid, f"SKILL.md references invalid find kinds: {invalid}" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_direction_refs_are_valid(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) + def test_direction_refs_are_valid(self): + _, body = _read_skill() refs = _extract_direction_refs(body) invalid = refs - _VALID_DIRECTIONS - assert not invalid, f"{rel}/SKILL.md references invalid directions: {invalid}" + assert not invalid, f"SKILL.md references invalid directions: {invalid}" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_edge_type_refs_are_valid(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) + def test_edge_type_refs_are_valid(self): + _, body = _read_skill() refs = _extract_edge_type_refs(body) invalid = refs - _ALL_EDGE_TYPES - assert not invalid, f"{rel}/SKILL.md references invalid edge_types: {invalid}" + assert not invalid, f"SKILL.md references invalid edge_types: {invalid}" +class TestBodyStructure: + """Skill body must contain key sections.""" -class TestTier2BodyStructure: - """Tier 2 skills must have stop conditions and recursion limits.""" + def test_has_worked_example(self): + _, body = _read_skill() + assert "## Worked example" in body, "SKILL.md missing '## 
Worked example'" - @pytest.mark.parametrize("name", TIER2_NAMES) - def test_has_stop_conditions(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) - assert "## Stop conditions" in body, f"{rel}/SKILL.md missing '## Stop conditions'" + def test_has_decision_tree(self): + _, body = _read_skill() + assert "## Decision tree" in body, "SKILL.md missing '## Decision tree'" - @pytest.mark.parametrize("name", TIER2_NAMES) - def test_has_recursion_limit(self, name: str): - _, body = _read_skill(name) - rel = _skill_dir(name).relative_to(SKILLS_DIR.parent) - assert "## Recursion limit" in body, f"{rel}/SKILL.md missing '## Recursion limit'" + def test_has_recovery_playbook(self): + _, body = _read_skill() + assert "## Recovery playbook" in body, "SKILL.md missing '## Recovery playbook'" - def test_mini_map_has_classification_rules(self): - _, body = _read_skill("mini-map") - assert "### Step 4 — Skill heuristics" in body or "Classification" in body, ( - "skills/tier-2/mini-map/SKILL.md missing classification rules" - ) + def test_has_edge_taxonomy(self): + _, body = _read_skill() + assert "## Edge taxonomy" in body, "SKILL.md missing '## Edge taxonomy'" - def test_mini_map_has_output_shape(self): - _, body = _read_skill("mini-map") - assert "PERSISTS" in body and "DELEGATES" in body, ( - "skills/tier-2/mini-map/SKILL.md missing output shape (PERSISTS/DELEGATES labels)" - ) + def test_has_navigation_patterns(self): + _, body = _read_skill() + assert "## Common navigation patterns" in body, "SKILL.md missing '## Common navigation patterns'" + def test_has_reasoning_preamble(self): + _, body = _read_skill() + assert "## Forced reasoning preamble" in body, "SKILL.md missing '## Forced reasoning preamble'" -class TestWorkedExamples: - """Every skill must have a worked example section.""" - @pytest.mark.parametrize("name", ALL_SKILL_NAMES) - def test_has_worked_example(self, name: str): - _, body = _read_skill(name) - rel = 
_skill_dir(name).relative_to(SKILLS_DIR.parent) - assert "## Worked example" in body, f"{rel}/SKILL.md missing '## Worked example'" +class TestDirectoryIntegrity: + """skills/ must have expected structure.""" + def test_skill_dir_exists(self): + assert (SKILLS_DIR / SKILL_NAME).is_dir(), f"skills/{SKILL_NAME}/ missing" -class TestDirectoryIntegrity: - """skills/ must split into tier-1/ and tier-2/ with the expected skills.""" - - def test_tier_dirs_exist(self): - assert TIER1_DIR.is_dir(), "skills/tier-1/ missing" - assert TIER2_DIR.is_dir(), "skills/tier-2/ missing" - - def test_tier1_no_extra_dirs(self): - actual = {p.name for p in TIER1_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists()} - expected = set(TIER1_NAMES) - extra = actual - expected - assert not extra, f"Unexpected skills under skills/tier-1/: {extra}" - - def test_tier1_no_missing_dirs(self): - actual = {p.name for p in TIER1_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists()} - expected = set(TIER1_NAMES) - missing = expected - actual - assert not missing, f"Missing skills under skills/tier-1/: {missing}" - - def test_tier2_no_extra_dirs(self): - actual = {p.name for p in TIER2_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists()} - expected = set(TIER2_NAMES) - extra = actual - expected - assert not extra, f"Unexpected skills under skills/tier-2/: {extra}" - - def test_tier2_no_missing_dirs(self): - actual = {p.name for p in TIER2_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists()} - expected = set(TIER2_NAMES) - missing = expected - actual - assert not missing, f"Missing skills under skills/tier-2/: {missing}" - - def test_no_skills_at_root(self): - """Skills must live under tier-1/ or tier-2/, not at the root of skills/.""" - root_skill_dirs = { - p.name for p in SKILLS_DIR.iterdir() - if p.is_dir() and p.name not in ("tier-1", "tier-2") and (p / "SKILL.md").exists() - } - assert not root_skill_dirs, ( - f"Found skills at skills/ root (must be moved into tier-1/ or 
tier-2/): {root_skill_dirs}" - ) + def test_no_tier_dirs(self): + """Old tier-1/ and tier-2/ directories must not exist.""" + for tier in ("tier-1", "tier-2"): + assert not (SKILLS_DIR / tier).is_dir(), f"Old skills/{tier}/ still exists — remove it" def test_readme_exists(self): assert (SKILLS_DIR / "README.md").is_file(), "skills/README.md missing" + def test_no_other_skill_dirs(self): + """Only explore-codebase/ should exist as a skill directory.""" + skill_dirs = { + p.name for p in SKILLS_DIR.iterdir() + if p.is_dir() and (p / "SKILL.md").exists() + } + assert skill_dirs == {SKILL_NAME}, ( + f"Expected only skills/{SKILL_NAME}/, found: {skill_dirs}" + ) + class TestAgentGuideConsistency: """AGENT-GUIDE.md copy-paste block must be self-contained.""" def test_guide_has_navigation_patterns_table(self): - """The copy-paste block must include a navigation patterns section - (it's standalone — no external file references work in a consumer project).""" + """The copy-paste block must include a navigation patterns section.""" guide = Path(__file__).resolve().parent.parent / "docs" / "AGENT-GUIDE.md" text = guide.read_text(encoding="utf-8") - # Extract the copy-paste block (marker on its own line) begin = text.find("") end = text.find("") assert begin != -1 and end != -1, "AGENT-GUIDE.md missing BEGIN/END markers" @@ -322,7 +232,6 @@ def test_guide_has_navigation_patterns_table(self): assert "### Common navigation patterns" in block, ( "AGENT-GUIDE.md copy-paste block missing '### Common navigation patterns'" ) - # Verify key patterns are present for pattern in ["CALLS", "EXPOSES", "IMPLEMENTS", "INJECTS"]: assert pattern in block, f"AGENT-GUIDE.md copy-paste block missing {pattern} pattern" @@ -340,26 +249,3 @@ def test_guide_copy_block_does_not_reference_skills_dir(self): "this path won't resolve in a consumer project. " "Keep skills/ references outside the copy-paste block." 
) - - def test_guide_copy_block_has_no_slash_command_aliases(self): - """The copy-paste block must not contain slash-command alias bullets - like `/nl ` → ... — these imply commands that don't exist - and will mislead the agent. Incidental mentions (e.g. cross-references - in prose) are fine.""" - guide = Path(__file__).resolve().parent.parent / "docs" / "AGENT-GUIDE.md" - text = guide.read_text(encoding="utf-8") - begin = text.find("") - end = text.find("") - block = text[begin:end] - # Match alias definition lines: - `/skillname ...` → tool(...) - skill_names_pattern = "|".join(re.escape(n) for n in ALL_SKILL_NAMES) - alias_pattern = re.compile( - rf"^- `/(?:{skill_names_pattern})\s", - re.MULTILINE, - ) - matches = alias_pattern.findall(block) - assert not matches, ( - f"AGENT-GUIDE.md copy-paste block contains slash-command alias bullets: " - f"{alias_pattern.findall(block)}. " - "These are not real commands and will mislead the agent." - )
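Reviewer note: the rewritten validator in `tests/test_agent_skills_static.py` now strips the frontmatter, collects tool names with two regexes, and requires all five tools to be referenced. The sketch below replays that logic on a made-up skill text so the behaviour is easy to check by hand. The two regexes are copied from the diff; the helper names `split_body` and `extract_tool_refs`, the `SAMPLE` text, and the hard-coded tool set are assumptions for illustration only (the real test imports its allowlist from production code).

```python
from __future__ import annotations

import re

# Regexes mirror the ones shown in the diff; everything else is illustrative.
FRONTMATTER_RE = re.compile(r"^---\n.*?\n---\n*", flags=re.DOTALL)
TOOLS = "search|find|describe|neighbors|resolve"
BACKTICK_RE = re.compile(r"`(" + TOOLS + r")\b")
CALL_RE = re.compile(r"\b(" + TOOLS + r")\s*[\(\{]")

# Assumed to match the five MCP tools named in AGENTS.md.
ASSUMED_VALID_TOOLS = {"search", "find", "describe", "neighbors", "resolve"}


def split_body(text: str) -> str:
    """Drop a leading frontmatter block, as _read_skill does in the diff."""
    return FRONTMATTER_RE.sub("", text, count=1)


def extract_tool_refs(body: str) -> set[str]:
    """Collect tool names that appear backticked or in call position."""
    refs: set[str] = set()
    for m in BACKTICK_RE.finditer(body):
        refs.add(m.group(1))
    for m in CALL_RE.finditer(body):
        refs.add(m.group(1))
    return refs


# Hypothetical skill text, not taken from the repository.
SAMPLE = (
    "---\n"
    "name: explore-codebase\n"
    "description: hypothetical sample used only for this sketch\n"
    "---\n"
    "\n"
    "## Worked example\n"
    "\n"
    "Call `resolve` first, then neighbors(ids=..., direction=\"out\").\n"
)

body = split_body(SAMPLE)
refs = extract_tool_refs(body)
missing = ASSUMED_VALID_TOOLS - refs

print(sorted(refs))     # tools the sample mentions
print(sorted(missing))  # tools test_skill_references_all_five_tools would flag
```

On this sample only `resolve` and `neighbors` are detected, so a check shaped like `test_skill_references_all_five_tools` would fail and report the other three. That is the intended tightening relative to the old "at least one tool" assertion.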