Module 21: Skills, Teaching the AI Your Playbook
📖 This page is generated from `modules/21-skills-teaching-the-ai-your-playbook/README.md`. Edit the source, not the wiki; edits here are overwritten on the next sync. Run the hands-on labs from the repo, linked inline.
⬅ Previous: Module 20: MCP Servers, Giving the AI Hands
Stop re-explaining your own procedures. A skill is a repeatable workflow written down once, committed, and invoked on demand, so the AI does the thing your way, the same way, every time, without you narrating the steps again.
- Module 2: you commit, read diffs, and treat the repo as durable memory. Skills live in that repo and are versioned exactly like code.
- Module 3: markdown-as-versioned-text, and the `CHANGELOG.md` convention this module's lab writes to.
- Module 4: the AI lives in your editor/CLI and reads your files directly. A skill is a file it loads; a browser chat can't pick one up automatically.
- Module 5, the one this builds on directly. You committed an always-on instructions file that tells the AI how the project works in general. This module is its structured big sibling: the same write-it-down-and-commit instinct, but for specific repeatable procedures invoked on demand.
- Module 13: what a real test is (and why "it didn't crash" isn't one). The lab's procedure includes writing one.
- Helpful, not required: Module 20 (MCP). A skill's steps can call the real tools an MCP server exposes, which is where a playbook reaches beyond editing files into live systems.
By the end of this module you can:
- Explain the difference between an always-on instructions file (Module 5) and a skill, and say when each is the right tool.
- Write a skill: a structured, named, invokable playbook for a recurring task, built from the tool-agnostic essentials (when-to-use, inputs, ordered steps, done-criteria) in whatever format your tool expects.
- Have the AI execute a skill end to end and verify it followed every step.
- Keep skills in version control so a procedure is shareable, reviewable, and recoverable like any other artifact.
- Recognize when a one-off prompt has earned promotion into a durable skill, and when it hasn't.
You've written the Module 5 instructions file, and it's working. The AI knows your layout, your test command, your off-limits files. But there's a class of knowledge it doesn't cover: multi-step procedures you run again and again.
"Add a new CLI command" is the canonical example. Done properly it's never one edit. It's: put the
logic in the right file, wire the CLI, write a test that actually checks the behavior, run the tests,
smoke-test the command, add a changelog line, commit it as one clean change. The AI can do every step.
But left to a bare prompt ("add a clear command") it'll usually give you the code and forget the
test, or skip the changelog, or commit tasks.json along for the ride. So you spell out the seven
steps. It works. Next week you add another command and you spell out the same seven steps again.
That re-narration is the exact pain Module 1 named, one level up: not re-explaining the project each session, but re-explaining the procedure each time you run it. A skill is where that procedure stops being something you retype and becomes something the repo carries.
A skill is a named, structured, invokable set of instructions for one repeatable procedure, stored as a file in the repo and loaded on demand when that procedure is the task at hand.
Strip the vendor branding and every skill has the same four parts:
- A name and a "when to use it." So both you and the AI know which playbook applies and, just as importantly, when it doesn't.
- Inputs. The few things the procedure needs to be told (here: the command name and what it does).
- Ordered steps. The actual procedure: the commands, the files, the checks, in sequence, with the non-negotiables marked ("run the tests before claiming success," "don't stage `tasks.json`").
- Done-criteria. How the AI (and you) know it's actually finished, not just "produced something."
That's it. A skill is a checklist precise enough that an agent can execute it and you can verify it did.
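The four parts can be sketched as a small data structure that renders to a markdown playbook. This is only an illustration: the `Skill` class, its field names, the section headings, and the abbreviated steps are assumptions made for this sketch, not the course's starter file and not any tool's required format.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """The four tool-agnostic parts of a skill (hypothetical structure)."""
    name: str
    when_to_use: str
    inputs: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)
    done_criteria: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the skill as a plain markdown playbook."""
        lines = [f"# Skill: {self.name}", "", "## When to use", self.when_to_use, ""]
        lines += ["## Inputs"] + [f"- {item}" for item in self.inputs] + [""]
        lines += ["## Steps"] + [f"{n}. {step}" for n, step in enumerate(self.steps, 1)] + [""]
        lines += ["## Done when"] + [f"- {item}" for item in self.done_criteria]
        return "\n".join(lines) + "\n"


# Abbreviated example; a real skill would spell out every step in full.
add_command = Skill(
    name="add-command",
    when_to_use="Adding a new tasks-app CLI command. Not for bug fixes or refactors.",
    inputs=["command name", "what the command does"],
    steps=[
        "Put the logic in tasks.py and wire the command into cli.py.",
        "Add a test to test_tasks.py that asserts the end state.",
        "Run python3 -m unittest; do not claim success until it is green.",
        "Add a CHANGELOG.md line.",
        "Commit code, test, and changelog together; do not stage tasks.json.",
    ],
    done_criteria=["Tests are green.", "The commit does not contain tasks.json."],
)

print(add_command.to_markdown())
```

In practice you would write the markdown file by hand; the structure is what matters, since every section answers a question the agent would otherwise have to guess at.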
This is the distinction to lock in, because the two are siblings and easy to conflate:
| | Committed instructions file (Module 5) | Skill (this module) |
|---|---|---|
| Scope | How the project works, in general | How to do one specific procedure |
| When it loads | Always on: read every session | On demand: invoked when relevant |
| Shape | Ambient briefing: conventions, commands, don't-touch list | A playbook: when-to-use, inputs, ordered steps, done-criteria |
| Analogy | The standing house rules posted on the wall | A labeled recipe card you pull out when you cook that dish |
They're complementary. The instructions file is the right home for facts true all the time ("tests
run with python3 -m unittest"). A skill is the right home for a procedure you run sometimes ("here
is exactly how we add a command"). Module 5 even told you this was coming: start with the always-on
file; graduate a procedure into a skill when it earns its own page.
Module 5 warned that bloat kills an instructions file: a 300-line always-on briefing gets read the way you read a terms-of-service. So you can't solve the re-narration problem by stuffing every procedure into the always-on file; you'd drown the signal that makes it work.
A skill solves that. Because a skill loads only when its procedure is the task, you can write it in full detail, every step and every guardrail, without taxing every unrelated session. Ten skills cost the AI nothing on a session that invokes none of them. This is progressive disclosure: keep the always-on context lean, and pull in the deep procedure exactly when it's needed. It's the same reason you don't tape every recipe you own to the kitchen wall.
This is what makes a skill more than a snippet in a notes app, and it's why this module sits where it does in the course. A skill is a file in the repo, so everything you already learned about versioned text applies to it directly:
- Recoverable and historied (Module 2). A skill has a `git log`. You can see when a step was added and why, and `git restore` a botched edit. The procedure is a checkpoint like any other.
- Shareable (Modules 8 & 11). Push the repo and the whole team, plus every agent that later operates on it, inherits the same playbook. Nobody runs their own private version of "how we add a command." It's the Module 5 anti-drift argument, applied to procedures.
- Reviewable (Module 10). Changing how the AI performs a procedure arrives as a diff in a PR. Tightening "add a test" into "add a test that asserts the end state, not just no-crash" is a reviewable change to your team's workflow, not an invisible tweak in one person's setup.
A prompt you keep in your head dies with the session. A skill in the repo is durable, shared capability. That's the upgrade: from one-off prompting to a versioned, reviewable asset.
"Skills" is one name for this. Tools also call them custom commands, slash commands, recipes, prompts,
playbooks, or modes, and they load them differently: some auto-discover a dedicated folder, some need
you to point at a file, some let your always-on instructions file say "when asked to add a command,
follow add-command.md." The durable pattern is the same in all of them: a named, invokable file
of structured steps for a repeatable procedure, kept in the repo. Learn the pattern; map it onto
whatever your tool calls it. As with everything in this course, the model and the tool are swappable;
the playbook you wrote is the part that lasts.
A skill's steps aren't limited to editing files. They can drive the test runner, the CLI, Git, and, once you have Module 20's MCP servers wired up, the real systems behind them (open the issue, hit the staging API, query the database). A skill is where you encode "use these hands, in this order, to get this outcome." The deeper your toolchain, the more a written playbook is worth, because there are more steps to get wrong, and more value in getting them right every time.
On paper this is just "write a runbook." The AI-specific twist is what changes the stakes:
- The AI will execute the playbook, not just read it. A runbook for a human is a reminder; a skill for an agent is something it performs. The precision pays off immediately: vague step, vague result; imperative step ("run `python3 -m unittest`; do not claim success until it's green"), reliable result.
- The AI is confidently incomplete without one. Asked to "add a command," it'll happily stop at the code and skip the test, the changelog, the clean commit, and sound finished doing it. The skill is how you make complete the default instead of a thing you have to keep catching.
- The skill outlives the model. Swap models next quarter and the playbook carries over unchanged. You encoded the procedure, not the prompt that happened to coax it out of this month's model. The workflow is the durable skill and the model is the swappable part: here, literally.
Starting point (this lab is skip-friendly). You do not need to have done the earlier labs. To begin from a clean, known state, copy this module's snapshot into a fresh `tasks-app` and make the first commit:
mkdir -p ~/ai-workflow-course/tasks-app
cp -r ~/ai-workflow-course/modules/21-skills-teaching-the-ai-your-playbook/lab/start/. ~/ai-workflow-course/tasks-app/
cd ~/ai-workflow-course/tasks-app && git init -b main && git add -A && git commit -m "start: module 21"
Already carrying your `tasks-app` from earlier modules? Keep using it and ignore this box.
Lab language: markdown (the skill file) plus shell and Python (the `tasks-app`). You'll write a skill, then have your editor-integrated AI (Module 4) execute it.
You'll write a skill for the procedure from Key concepts (adding a new tasks-app command end to end: code + test + changelog + clean commit), and then watch the AI run it on a command it's never seen, producing all four parts without you listing the steps.
You'll need:
- Your agentic coding tool from Module 4, and knowledge of how it loads a procedure (a skills/commands folder it auto-discovers, or simply pointing it at a file by name; check its docs).
- A Python 3.10+ `tasks-app`. Use the snapshot in this module's `lab/tasks-app/` (it has `add`, `list`, `done`, `count`, a `test_tasks.py`, and a `CHANGELOG.md`), or carry forward your own from earlier modules. It should already be a Git repo from earlier modules; if you're starting fresh, ask Claude Code (`claude` in the project; sub your own agent) to initialize it and commit a baseline, then confirm with `git log` that the first commit landed.
- Copy this module's starter skill, `lab/add-command-skill.md`, into your `tasks-app` repo wherever your tool expects procedures. If your tool auto-discovers a folder, put it there under a clear name (e.g. `add-command.md`). If it doesn't, just drop it at the repo root and invoke it by name.
cd ~/ai-workflow-course/tasks-app
cp ~/ai-workflow-course/modules/21-skills-teaching-the-ai-your-playbook/lab/add-command-skill.md add-command.md
- Read it. The whole file is short on purpose: when-to-use, inputs, seven ordered steps, and done-criteria. Confirm every project fact in it matches your app (test command, file names, the off-limits `tasks.json`). A skill with wrong facts misdirects the AI worse than no skill.
- Commit it. This is the point: the procedure now lives in version control. Ask Claude Code (sub your own agent) to commit the new skill file with a message like "Add skill: add a tasks-app command end to end," then verify it landed:
git log --oneline -1 # the skill commit, by name
- Start a fresh AI session in your editor and invoke the skill the way your tool does it: its slash command / skill name, or plainly: "Follow `add-command.md` to add a `clear` command that removes all tasks." Crucially, don't list the steps yourself. The skill is supposed to supply them.
- Watch it perform the procedure. A correctly-followed skill will, without you saying any of it:
  - add `clear()` to `tasks.py` and wire a `clear` branch into `cli.py` (logic in the right file);
  - add a real test to `test_tasks.py` that asserts the list is empty afterward (not just "no crash");
  - run `python3 -m unittest` and show it green;
  - smoke-test `python3 cli.py clear` and show the output;
  - add a `CHANGELOG.md` line;
  - stage code + test + changelog into one commit, without `tasks.json`.
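The difference between a "no crash" test and a real one is easiest to see side by side. This sketch runs against an in-memory stand-in, because the lab's actual `tasks.py` API is not shown here: the `TaskList` class and its methods are assumptions, so adapt the assertion to whatever your `clear()` really operates on.

```python
import unittest


class TaskList:
    """In-memory stand-in for the lab's tasks.py (hypothetical API)."""

    def __init__(self):
        self.tasks = []

    def add(self, title):
        self.tasks.append({"title": title, "done": False})

    def clear(self):
        self.tasks = []


class ClearTests(unittest.TestCase):
    def test_clear_no_crash(self):
        # Weak: this passes even if clear() does nothing at all.
        task_list = TaskList()
        task_list.add("x")
        task_list.clear()

    def test_clear_empties_list(self):
        # Real: asserts the end state the command promises.
        task_list = TaskList()
        task_list.add("x")
        task_list.add("y")
        task_list.clear()
        self.assertEqual(task_list.tasks, [])


# Run the suite programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClearTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If `clear()` were replaced with `pass`, the first test would still go green and only the second would fail, which is why the skill's wording should demand the second kind.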
- Don't take the AI's word for it. Check against the skill's own done-criteria:
python3 -m unittest # green, and a clear-related test is present
python3 cli.py add "x" && python3 cli.py clear && python3 cli.py list # -> (no tasks yet)
git show --stat HEAD # one commit: tasks.py, cli.py, test_tasks.py, CHANGELOG.md; no tasks.json
If a step was skipped, that's the lab working: it shows you exactly where your wording was too soft. Tighten that line, have Claude Code (sub your own agent) commit the skill edit while you verify the diff, and run it again on a second command (`high <index>` to flag a task, say). A skill you improve once and reuse forever is the deliverable, not the one `clear` command.
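The commit-contents part of that check can also be written as a hard check rather than an eyeball pass. This is a minimal sketch under stated assumptions: the helper names `check_commit_files` and `files_in_head` are hypothetical, and the required and forbidden file sets simply mirror the file names listed in this lab.

```python
import subprocess

# File names taken from this lab; adjust them if your app differs.
REQUIRED = {"tasks.py", "cli.py", "test_tasks.py", "CHANGELOG.md"}
FORBIDDEN = {"tasks.json"}


def check_commit_files(files: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the commit passes."""
    present = set(files)
    problems = [f"missing from commit: {name}" for name in sorted(REQUIRED - present)]
    problems += [f"must not be committed: {name}" for name in sorted(FORBIDDEN & present)]
    return problems


def files_in_head(repo: str = ".") -> list[str]:
    """List the files touched by the most recent commit in a Git repo."""
    out = subprocess.run(
        ["git", "show", "--name-only", "--format=", "HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


# Demonstrate the pure check on two hand-written file lists.
print(check_commit_files(["tasks.py", "cli.py", "test_tasks.py", "CHANGELOG.md"]))
print(check_commit_files(["tasks.py", "cli.py", "tasks.json"]))
```

Inside the repo you would call `check_commit_files(files_in_head())` right after the AI's commit. Keeping the check separate from the Git call makes it easy to reuse later as a CI step.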
- Look at what you built:
git log --oneline add-command.md # the procedure's own history
git log -p -- add-command.md # full patch history: the file's creation, plus the Part C tighten if you made one
(`git log -p` surfaces the skill's own patches no matter what you committed after tightening it, unlike `git diff HEAD~1`, which would be empty here because the most recent commit added the second command, not a change to the skill.) Each entry in that history is a change to how your team adds commands: readable, attributable, revertable. In a team repo (Modules 8, 11) it reaches everyone on `git pull`; behind review (Module 10) it lands as a PR someone approves. You've turned a procedure you used to narrate into a versioned capability.
- A skill is guidance, not enforcement; same caveat as Module 5. It strongly biases the AI; it doesn't bind it. The agent can still skip a step, especially a soft one, especially late in a long session. The steps that can't be skipped are the ones backed by CI (Module 14): the test the skill tells it to write only gates anything once a pipeline runs it on every push. Write the done-criteria as hard checks, and let CI be the backstop.
- Skills rot. A playbook that says "tests run with X" after you've moved to Y will confidently march the AI off a cliff. Skills are code-adjacent: review them, update them, delete the ones you no longer run. Committing them (so changes are visible) is what makes that maintainable.
- Don't skillify everything. A skill earns its place when a procedure is repeated, multi-step, and gets done wrong without one. A one-off task doesn't need a playbook, and a pile of near-duplicate skills is its own kind of bloat: now you're maintaining ten files and the AI has to pick the right one. Promote a prompt to a skill the third time you've typed it, not the first.
- Overlap with the always-on file causes drift. If a fact lives in both your Module 5 instructions file and a skill, you'll eventually update one and not the other. Keep general facts in the always-on file and reference them from skills; don't duplicate them.
- A skill is not a security boundary. "Don't stage `tasks.json`" is a convention, not a permission. An installed third-party skill is untrusted code that runs against your repo; vetting, permissions, and prompt-injection defense are Module 22's job, immediately next, for exactly this reason.
You're done when:
- Your `tasks-app` repo has a committed skill file for "add a command," with `git log` showing the commit that added it.
- You've invoked that skill and watched a fresh AI session produce all four parts (code, a real test, a changelog entry, and one clean commit) without you listing the steps that session.
- You've verified it against the skill's done-criteria (tests green, command works, the commit contains the right files and not `tasks.json`) rather than trusting the AI's summary.
- You can state, in one sentence, when to put knowledge in the always-on instructions file (Module 5) versus a skill: general facts go in the file that's always read; a specific repeatable procedure goes in a playbook invoked on demand.
When adding the next command is "invoke the skill" instead of "re-explain the seven steps," the playbook is doing its job. Module 22 comes next, and not by accident: Unit 4 just gave the AI hands (MCP servers and skills), and the very next thing is securing them, because an installed skill or server is untrusted code running in your environment.
This is expansion-zone material; the concept is durable but tool specifics drift. Re-check at build time:
- Skill terminology and mechanics. Confirm how mainstream agentic tools name and load skills (skills / custom commands / slash commands / recipes / prompts), whether they auto-discover a folder or need an explicit pointer, and any required file format/frontmatter, without pinning the lesson to one vendor. Update the "Naming the pattern" paragraph if the common vocabulary has shifted.
- No vendor leaked in. Verify the module still names the pattern, not one implementation, and that the example skill format stays generic (when-to-use / inputs / steps / done-criteria).
- Dependency chain intact. Confirm Module 20 (MCP) and Module 22 (securing servers/skills) are still numbered as referenced, and that nothing here leans on a tool introduced after Module 20.
- Lab still runs. `python3 -m unittest` is green in `lab/tasks-app/`, and the `clear`-command walkthrough still matches the starter files (`add`/`list`/`done`/`count`, `test_tasks.py`, `CHANGELOG.md`).
Continue to: Module 22: Securing Third-Party MCP Servers and Skills ➡
Generated from the ai-workflow-course repo • the model is the cheap, swappable part; the workflow is the durable skill.