|
| 1 | +--- |
| 2 | +{ |
| 3 | + title: 'Agent Tooling', |
| 4 | + description: 'Internal guidelines: best practices for designing metadata, static checks, CLI tools, skills, apps, and MCP servers for agents.', |
| 5 | + layout: 'docs.11ty.js' |
| 6 | +} |
| 7 | +--- |
| 8 | + |
| 9 | +# {{title}} |
| 10 | + |
| 11 | +Agent tooling is not a separate product surface. It adds progressively higher-level adapters over the same metadata, validation rules, and package APIs that humans use. Build the slow deterministic layer first. Add model-facing tools only after the lower layer can answer the question or reject the invalid state. |
| 12 | + |
| 13 | +## Policy Compiler |
| 14 | + |
| 15 | +Turn repeated guidance into facts, gates, commands, and agent tools. |
| 16 | + |
| 17 | +<section id="tooling-policy-compiler" nve-layout="column gap:sm"> |
| 18 | + <svg width="100%" height="250" viewBox="0 100 1120 250"> |
| 19 | + <defs> |
| 20 | + <marker id="tooling-arrow-metadata" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="5" markerHeight="5" orient="auto-start-reverse"> |
| 21 | + <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor" fill-opacity="0.45"></path> |
| 22 | + </marker> |
| 23 | + <marker id="tooling-arrow-static-tools" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="5" markerHeight="5" orient="auto-start-reverse"> |
| 24 | + <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor" fill-opacity="0.45"></path> |
| 25 | + </marker> |
| 26 | + <marker id="tooling-arrow-cli-skills" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="5" markerHeight="5" orient="auto-start-reverse"> |
| 27 | + <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor" fill-opacity="0.45"></path> |
| 28 | + </marker> |
| 29 | + <marker id="tooling-arrow-mcp" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="5" markerHeight="5" orient="auto-start-reverse"> |
| 30 | + <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor" fill-opacity="0.45"></path> |
| 31 | + </marker> |
| 32 | + </defs> |
| 33 | + <text x="8" y="124" fill="currentColor" fill-opacity="0.55" font-size="11">most effort / durable</text> |
| 34 | + <text x="1112" y="124" text-anchor="end" fill="currentColor" fill-opacity="0.55" font-size="11">fast / contextual</text> |
| 35 | + <line x1="8" y1="132" x2="1112" y2="132" stroke="currentColor" stroke-opacity="0.18" stroke-width="1"></line> |
| 36 | + <path data-tooling-connection data-tooling-from="metadata" data-tooling-to="static-tools" d="M 352 202 L 372 202" stroke="currentColor" stroke-opacity="0.35" stroke-width="1.5" marker-end="url(#tooling-arrow-static-tools)"></path> |
| 37 | + <path data-tooling-connection data-tooling-from="static-tools" data-tooling-to="cli-skills" d="M 660 202 L 680 202" stroke="currentColor" stroke-opacity="0.35" stroke-width="1.5" marker-end="url(#tooling-arrow-cli-skills)"></path> |
| 38 | + <path data-tooling-connection data-tooling-from="cli-skills" data-tooling-to="mcp" d="M 920 202 L 940 202" stroke="currentColor" stroke-opacity="0.35" stroke-width="1.5" marker-end="url(#tooling-arrow-mcp)"></path> |
| 39 | + <g data-tooling-layer="metadata" style="cursor:pointer"> |
| 40 | + <rect x="8" y="154" width="344" height="96" rx="8" fill="var(--nve-ref-color-green-grass-1000)" fill-opacity="0.08" stroke="var(--nve-ref-color-green-grass-1000)" stroke-width="1"></rect> |
| 41 | + <text x="180" y="194" text-anchor="middle" fill="var(--nve-ref-color-green-grass-1000)" font-size="17" font-weight="700">Metadata</text> |
| 42 | + <text x="180" y="220" text-anchor="middle" fill="var(--nve-ref-color-green-grass-1000)" font-size="11">Source Of Truth</text> |
| 43 | + </g> |
| 44 | + <g data-tooling-layer="static-tools" style="cursor:pointer"> |
| 45 | + <rect x="372" y="154" width="288" height="96" rx="8" fill="var(--nve-ref-color-yellow-amber-1100)" fill-opacity="0.08" stroke="var(--nve-ref-color-yellow-amber-1100)" stroke-width="1"></rect> |
| 46 | + <text x="516" y="194" text-anchor="middle" fill="var(--nve-ref-color-yellow-amber-1100)" font-size="16" font-weight="700">Static Tools</text> |
| 47 | + <text x="516" y="218" text-anchor="middle" fill="var(--nve-ref-color-yellow-amber-1100)" font-size="11">Lint, Types, Tests</text> |
| 48 | + </g> |
| 49 | + <g data-tooling-layer="cli-skills" style="cursor:pointer"> |
| 50 | + <rect x="680" y="154" width="240" height="96" rx="8" fill="var(--nve-ref-color-blue-cobalt-1000)" fill-opacity="0.08" stroke="var(--nve-ref-color-blue-cobalt-1000)" stroke-width="1"></rect> |
| 51 | + <text x="800" y="194" text-anchor="middle" fill="var(--nve-ref-color-blue-cobalt-1000)" font-size="15" font-weight="700">CLI / Skills</text> |
| 52 | + <text x="800" y="216" text-anchor="middle" fill="var(--nve-ref-color-blue-cobalt-1000)" font-size="10">Commands, Context</text> |
| 53 | + </g> |
| 54 | + <g data-tooling-layer="mcp" style="cursor:pointer"> |
| 55 | + <rect x="940" y="154" width="172" height="96" rx="8" fill="var(--nve-ref-color-purple-violet-1000)" fill-opacity="0.08" stroke="var(--nve-ref-color-purple-violet-1000)" stroke-width="1"></rect> |
| 56 | + <text x="1026" y="195" text-anchor="middle" fill="var(--nve-ref-color-purple-violet-1000)" font-size="14" font-weight="700">MCP / Apps</text> |
| 57 | + <text x="1026" y="214" text-anchor="middle" fill="var(--nve-ref-color-purple-violet-1000)" font-size="9">Agent Interfaces</text> |
| 58 | + </g> |
| 59 | + <line x1="180" y1="306" x2="1026" y2="306" stroke="currentColor" stroke-opacity="0.2" stroke-width="2"></line> |
| 60 | + <circle data-tooling-step="metadata" cx="180" cy="306" r="6" fill="var(--nve-ref-color-green-grass-1000)"></circle> |
| 61 | + <circle data-tooling-step="static-tools" cx="516" cy="306" r="6" fill="var(--nve-ref-color-yellow-amber-1100)"></circle> |
| 62 | + <circle data-tooling-step="cli-skills" cx="800" cy="306" r="6" fill="var(--nve-ref-color-blue-cobalt-1000)"></circle> |
| 63 | + <circle data-tooling-step="mcp" cx="1026" cy="306" r="6" fill="var(--nve-ref-color-purple-violet-1000)"></circle> |
| 64 | + <text x="180" y="334" text-anchor="middle" fill="currentColor" fill-opacity="0.62" font-size="10">facts</text> |
| 65 | + <text x="516" y="334" text-anchor="middle" fill="currentColor" fill-opacity="0.62" font-size="10">gates</text> |
| 66 | + <text x="800" y="334" text-anchor="middle" fill="currentColor" fill-opacity="0.62" font-size="10">paths</text> |
| 67 | + <text x="1026" y="334" text-anchor="middle" fill="currentColor" fill-opacity="0.62" font-size="10">tools</text> |
| 68 | + </svg> |
| 69 | + <section id="tooling-detail-card" nve-layout="column gap:lg"> |
| 70 | + <div nve-layout="column gap:sm"> |
| 71 | + <h3 id="tooling-detail-title" nve-text="heading xl semibold"></h3> |
| 72 | + <p id="tooling-detail-subtitle" nve-text="label lg muted"></p> |
| 73 | + </div> |
| 74 | + <div nve-layout="column gap:lg"> |
| 75 | + <p id="tooling-detail-desc" nve-text="body"></p> |
| 76 | + <ul id="tooling-detail-items" nve-text="list" nve-layout="column gap:sm pad:md"></ul> |
| 77 | + <span id="tooling-detail-rule-text" nve-text="body muted"></span> |
| 78 | + </div> |
| 79 | + </section> |
| 80 | +</section> |
| 81 | + |
| 82 | +<script type="module"> |
| 83 | +const toolingRoot = document.getElementById('tooling-policy-compiler'); |
| 84 | +const toolingLayers = [ |
| 85 | + { |
| 86 | + id: 'metadata', |
| 87 | + label: 'Metadata', |
| 88 | + sublabel: 'Source Of Truth', |
| 89 | + color: 'var(--nve-ref-color-green-grass-1000)', |
| 90 | + description: 'Turns the rule into durable facts that every higher layer can query. Tools should read metadata, not repeat facts in prompts.', |
| 91 | + items: [ |
| 92 | + { text: 'Generated API manifests' }, |
| 93 | + { text: 'Reference metadata' }, |
| 94 | + { text: 'Entrypoints and package data' }, |
| 95 | + { text: 'One fact feeding many surfaces' }, |
| 96 | + ], |
| 97 | + rule: 'If the fact describes the system, encode it once as structured data.', |
| 98 | + }, |
| 99 | + { |
| 100 | + id: 'static-tools', |
| 101 | + label: 'Static Tools', |
| 102 | + sublabel: 'Deterministic Rejection', |
| 103 | + color: 'var(--nve-ref-color-yellow-amber-1100)', |
| 104 | + description: 'Compiles facts into rules that fail before runtime. This layer should catch every invalid state a parser can see.', |
| 105 | + items: [ |
| 106 | + { text: 'Type checking and JSON schema' }, |
| 107 | + { text: 'Lint and static analysis' }, |
| 108 | + { text: 'Unit and integration tests' }, |
| 109 | + { text: 'CI and build checks' }, |
| 110 | + ], |
| 111 | + rule: 'If a parser can catch it, fail before runtime.', |
| 112 | + }, |
| 113 | + { |
| 114 | + id: 'cli-skills', |
| 115 | + label: 'CLI and Skills', |
| 116 | + sublabel: 'Commands and Workflow Context', |
| 117 | + color: 'var(--nve-ref-color-blue-cobalt-1000)', |
| 118 | + description: 'Adapts deterministic behavior into repeatable terminal commands and focused workflow context. Humans prove the path first.', |
| 119 | + items: [ |
| 120 | + { text: 'Discovery and validation commands' }, |
| 121 | + { text: 'Project setup and scaffolding' }, |
| 122 | + { text: 'Prototype creation and validation' }, |
| 123 | + { text: 'Workflow order and local policy' }, |
| 124 | + ], |
| 125 | + rule: 'Expose deterministic behavior in the CLI. Put sequencing judgment in skills.', |
| 126 | + }, |
| 127 | + { |
| 128 | + id: 'mcp', |
| 129 | + label: 'MCP and Apps', |
| 130 | + sublabel: 'Agent Interfaces', |
| 131 | + color: 'var(--nve-ref-color-purple-violet-1000)', |
| 132 | + description: 'Exposes stable services to agents through narrow schemas, structured outputs, and explicit side-effect annotations.', |
| 133 | + items: [ |
| 134 | + { text: 'Tool schemas derived from services and CLIs' }, |
| 135 | + { text: 'Prompts for common workflows' }, |
| 136 | + { text: 'Distilled structured outputs' }, |
| 137 | + { text: 'Agent discovery and invocation' }, |
| 138 | + ], |
| 139 | + rule: 'Mirror the service layer. Do not make MCP the source of truth.', |
| 140 | + }, |
| 141 | +]; |
| 142 | + |
| 143 | +if (toolingRoot) { |
| 144 | + const layerGroups = toolingRoot.querySelectorAll('g[data-tooling-layer]'); |
| 145 | + const connections = toolingRoot.querySelectorAll('path[data-tooling-connection]'); |
| 146 | + const steps = toolingRoot.querySelectorAll('circle[data-tooling-step]'); |
| 147 | + const detailTitle = toolingRoot.querySelector('#tooling-detail-title'); |
| 148 | + const detailSubtitle = toolingRoot.querySelector('#tooling-detail-subtitle'); |
| 149 | + const detailDesc = toolingRoot.querySelector('#tooling-detail-desc'); |
| 150 | + const detailItems = toolingRoot.querySelector('#tooling-detail-items'); |
| 151 | + const detailRuleText = toolingRoot.querySelector('#tooling-detail-rule-text'); |
| 152 | + let locked = false; |
| 153 | + |
| 154 | + const setToolingLayer = layerId => { |
| 155 | + const layer = toolingLayers.find(candidate => candidate.id === layerId); |
| 156 | + if (!layer) { |
| 157 | + return; |
| 158 | + } |
| 159 | + |
| 160 | + layerGroups.forEach(group => { |
| 161 | + const id = group.getAttribute('data-tooling-layer'); |
| 162 | + const candidate = toolingLayers.find(item => item.id === id); |
| 163 | + const isActive = id === layerId; |
| 164 | + const rect = group.querySelector('rect'); |
| 165 | + const texts = group.querySelectorAll('text'); |
| 166 | + |
| 167 | + group.setAttribute('opacity', isActive ? '1' : '0.48'); |
| 168 | + rect.setAttribute('stroke-width', isActive ? '2.5' : '1'); |
| 169 | + rect.setAttribute('fill-opacity', isActive ? '0.16' : '0.08'); |
| 170 | + texts.forEach(text => { |
| 171 | + text.setAttribute('fill', isActive && candidate ? candidate.color : 'currentColor'); |
| 172 | + }); |
| 173 | + }); |
| 174 | + |
| 175 | + connections.forEach(connection => { |
| 176 | + const isActive = |
| 177 | + connection.getAttribute('data-tooling-from') === layerId || |
| 178 | + connection.getAttribute('data-tooling-to') === layerId; |
| 179 | + |
| 180 | + connection.setAttribute('stroke-opacity', isActive ? '0.55' : '0.2'); |
| 181 | + connection.setAttribute('stroke-width', isActive ? '2' : '1.5'); |
| 182 | + }); |
| 183 | + |
| 184 | + steps.forEach(step => { |
| 185 | + step.setAttribute('r', step.getAttribute('data-tooling-step') === layerId ? '8' : '6'); |
| 186 | + step.setAttribute('opacity', step.getAttribute('data-tooling-step') === layerId ? '1' : '0.45'); |
| 187 | + }); |
| 188 | + |
| 189 | + detailTitle.style.setProperty('color', layer.color, 'important'); |
| 190 | + detailTitle.textContent = layer.label; |
| 191 | + detailSubtitle.textContent = layer.sublabel; |
| 192 | + detailDesc.textContent = layer.description; |
| 193 | + detailRuleText.textContent = layer.rule; |
| 194 | + detailItems.replaceChildren(); |
| 195 | + layer.items.forEach(item => { |
| 196 | + const listItem = document.createElement('li'); |
| 197 | + listItem.setAttribute('nve-text', 'body sm'); |
| 198 | + listItem.textContent = item.text; |
| 199 | + detailItems.appendChild(listItem); |
| 200 | + }); |
| 201 | + }; |
| 202 | + |
| 203 | + setToolingLayer('metadata'); |
| 204 | + |
| 205 | + layerGroups.forEach(group => { |
| 206 | + group.addEventListener('click', () => { |
| 207 | + locked = true; |
| 208 | + setToolingLayer(group.getAttribute('data-tooling-layer')); |
| 209 | + }); |
| 210 | + group.addEventListener('mouseenter', () => { |
| 211 | + if (!locked) { |
| 212 | + setToolingLayer(group.getAttribute('data-tooling-layer')); |
| 213 | + } |
| 214 | + }); |
| 215 | + }); |
| 216 | + |
| 217 | + toolingRoot.querySelector('svg').addEventListener('mouseleave', () => { |
| 218 | + locked = false; |
| 219 | + }); |
| 220 | +} |
| 221 | +</script> |
| 222 | + |
| 223 | +## Build Inside Out |
| 224 | + |
| 225 | +- **Metadata** is the durable contract. Generate facts once, then let docs, lint, CLI, skills, apps, and MCP consume them. If a fact exists only in a prompt, README, or tool description, the harness does not own it. |
| 226 | +- **Static tools** are the response when agents repeat a mistake. Reach for the earliest layer that can catch it: type or schema, lint rule, test, CLI validation, then MCP tool. If CI cannot enforce a rule that a parser can see, the harness is incomplete. |
| 227 | +- **CLI and skills** adapt the deterministic layer for human and agent workflows. The CLI proves a capability without chat context. Skills should carry workflow order and project policy, not duplicate API catalogs. |
| 228 | +- **MCP and MCP Apps** expose existing services to agents through narrow schemas, structured outputs, and explicit side-effect annotations. They should mirror the service layer, not own domain logic. |
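
The "generate facts once, let every layer consume them" idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual manifest format: the `manifest` shape, the `example-*` tag names, and both helper functions are hypothetical.

```javascript
// Hypothetical manifest. In a real project this would be generated
// by a build step, not written by hand.
const manifest = {
  components: [
    { tag: 'example-button', attributes: ['disabled', 'size'] },
    { tag: 'example-badge', attributes: ['status'] },
  ],
};

// Static-check consumer: report attributes the manifest does not declare.
function findUnknownAttributes(manifest, tag, usedAttributes) {
  const component = manifest.components.find(entry => entry.tag === tag);
  if (!component) {
    return [`unknown element <${tag}>`];
  }
  return usedAttributes
    .filter(name => !component.attributes.includes(name))
    .map(name => `unknown attribute "${name}" on <${tag}>`);
}

// CLI-style consumer: list what exists, reading the same manifest.
function listComponents(manifest) {
  return manifest.components.map(
    entry => `${entry.tag} (${entry.attributes.join(', ')})`
  );
}

console.log(findUnknownAttributes(manifest, 'example-button', ['disabled', 'colour']));
console.log(listComponents(manifest));
```

Both consumers read the same structure, so correcting a fact in the manifest corrects the lint-style check and the listing together. Neither function carries its own copy of the attribute list.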
| 229 | + |
| 230 | +## Layering Rules |
| 231 | + |
| 232 | +{% dodont %} |
| 233 | + |
| 234 | +<div> |
| 235 | + |
| 236 | +- **Start with metadata.** Add or fix the generated fact before building consumers. |
| 237 | +- **Fail statically.** Prefer lint, types, and tests for any rule a parser can verify. |
| 238 | +- **Prove with CLI.** Make the command usable by humans before agents call it. |
| 239 | +- **Guide with skills.** Put workflow order, repository policy, and validation habits in skills. |
| 240 | +- **Expose through MCP last.** Mirror existing services with focused schemas and annotations. |
| 241 | + |
| 242 | +</div> |
| 243 | +<div> |
| 244 | + |
| 245 | +- **Do not hide facts in prompts.** Prompts are runtime hints, not durable data. |
| 246 | +- **Do not make MCP the source of truth.** Treat it as an adapter over services. |
| 247 | +- **Do not ship model-only validation.** If CI cannot enforce it, the harness is incomplete. |
| 248 | +- **Do not return raw dumps.** Distill context before it reaches the agent. |
| 249 | + |
| 250 | +</div> |
| 251 | + |
| 252 | +{% enddodont %} |
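
As one possible reading of "do not return raw dumps," the sketch below distills a verbose result list before it would reach an agent. The `rawResults` shape, the rule names, and `distillLintResults` are assumptions made for illustration and do not reflect the real lint output.

```javascript
// Hypothetical raw results, loosely shaped like what a linter might emit.
const rawResults = [
  { file: 'src/a.html', rule: 'no-unknown-attribute', severity: 'error', line: 4, source: '<long source text>' },
  { file: 'src/a.html', rule: 'no-unknown-attribute', severity: 'error', line: 9, source: '<long source text>' },
  { file: 'src/b.html', rule: 'prefer-slot', severity: 'warning', line: 2, source: '<long source text>' },
];

// Distill: keep errors only, drop bulky fields, group by rule,
// and cap how many locations are returned per rule.
function distillLintResults(results, { maxLocations = 5 } = {}) {
  const byRule = new Map();
  for (const result of results) {
    if (result.severity !== 'error') continue;
    const locations = byRule.get(result.rule) ?? [];
    locations.push(`${result.file}:${result.line}`);
    byRule.set(result.rule, locations);
  }
  return [...byRule].map(([rule, locations]) => ({
    rule,
    count: locations.length,
    locations: locations.slice(0, maxLocations),
  }));
}

console.log(JSON.stringify(distillLintResults(rawResults), null, 2));
```

The output keeps the full count even when the location list is capped, so the agent can tell the list was truncated without receiving every entry.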
| 253 | + |
| 254 | +## Decision Checklist |
| 255 | + |
| 256 | +Before creating a new agent-facing tool, answer these questions: |
| 257 | + |
| 258 | +- What metadata does this tool need, and where is that metadata generated? |
| 259 | +- Which invalid states can type checking, JSON Schema, linting, or tests reject first? |
| 260 | +- Can a human use the same capability through the CLI without chat context? |
| 261 | +- Does the tool have a bounded input schema and a structured output schema? |
| 262 | +- Is the result distilled for the task, or does it push context cleanup onto the model? |
| 263 | +- Is this capability general enough for MCP, or is it only workflow context for a skill? |
| 264 | +- What test fails if the tool disappears, changes shape, or starts returning stale data? |
| 265 | + |
| 266 | +If the answer starts with "tell the model to remember," stop. Build the harness layer that makes remembering unnecessary. |
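
The "bounded input schema" question can be made concrete with a small sketch. The descriptor below is hypothetical: its field names (`readOnly`, `input`, `maxLength`, `min`, `max`) are illustrative and are not taken from the MCP SDK or from JSON Schema. A real tool would more likely declare and validate its inputs with a schema library.

```javascript
// Hypothetical tool descriptor with explicit bounds on every input.
const searchComponentsTool = {
  name: 'search_components',
  readOnly: true, // illustrative side-effect annotation
  input: {
    query: { type: 'string', maxLength: 64 },
    limit: { type: 'integer', min: 1, max: 20 },
  },
};

// Reject unexpected fields, missing fields, wrong types, and out-of-range values.
function validateInput(tool, args) {
  const errors = [];
  for (const key of Object.keys(args)) {
    if (!(key in tool.input)) errors.push(`unexpected field: ${key}`);
  }
  for (const [key, rule] of Object.entries(tool.input)) {
    const value = args[key];
    if (value === undefined) {
      errors.push(`missing field: ${key}`);
      continue;
    }
    if (rule.type === 'string') {
      if (typeof value !== 'string') errors.push(`${key} must be a string`);
      else if (value.length > rule.maxLength) errors.push(`${key} exceeds ${rule.maxLength} characters`);
    }
    if (rule.type === 'integer') {
      if (!Number.isInteger(value)) errors.push(`${key} must be an integer`);
      else if (value < rule.min || value > rule.max) errors.push(`${key} must be between ${rule.min} and ${rule.max}`);
    }
  }
  return errors;
}

console.log(validateInput(searchComponentsTool, { query: 'button', limit: 500 }));
```

A contract test over the descriptor and `validateInput` is one way to answer the last checklist question: if the tool is removed or its input shape changes, the test fails before an agent ever calls it.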
| 267 | + |
| 268 | +## Related Docs |
| 269 | + |
| 270 | +- [Agent Harness](/docs/internal/guidelines/agent-harness/) |
| 271 | +- [Documentation Guidelines](/docs/internal/guidelines/documentation/) |
| 272 | +- [Examples Guidelines](/docs/internal/guidelines/examples/) |
| 273 | +- [CLI](/docs/cli/) |
| 274 | +- [MCP](/docs/mcp/) |
| 275 | +- [Lint](/docs/lint/) |