Vibe Coding

A small, opinionated playbook for building software with AI agents — without losing track of what you're building, why, or whether it actually works.

This folder is a guide repo, not an app. Copy what you need into your real projects.

Scoped to GitHub Copilot + Claude. No cross-tool support to maintain.

What you came here to fix: AI agents add code without deleting any. Features quietly disappear between "plan" and "shipped". Each round of edits introduces new bugs. This playbook is the antidote.

Why this exists

AI is great at generating code. AI is terrible at remembering why, what was already promised, and what to delete. Your job: maintain a small set of artifacts that link a stable behavior ID → a test → a line of code, and enforce that link mechanically. Do that, and "missing features later" stops happening.

That link is the traceability spine. The rest of this folder is just tools to build it.

What's in this folder

| Path | What it is |
| --- | --- |
| `README.md` (this file) | Human-facing guide |
| `AGENT.md` | Single entry point + behavioral rules for the agent |
| `STYLE.md` | Formatter / linter / naming pointer |
| `.github/copilot-instructions.md` | Pointer Copilot reads automatically |
| `.agent/workflows/README.md` + `00 … 09 *.md` | Sequenced loop (one file per phase) |
| `.agent/workflows/templates/*.template.md` | Skeletons for spine artifacts |
| `.agent/skills/README.md` + `<name>/SKILL.md` | On-demand techniques (8 skills) |
| `articles/INDEX.md` + `articles/*.md` | Persistent lessons |
| `tools/audit.sh` | Mechanical traceability audit |
| `tools/hooks/` | commit-msg + pre-push hooks |

You read this README. Copilot reads AGENT.md and dispatches to the right workflow.

Workflows vs skills. Workflows = the sequenced loop (spec → test → implement → prune → audit). Skills = on-demand techniques you reach for inside a workflow.

The mental model in 30 seconds

        IDEAS  →  PRODUCT  →  REQUIREMENTS  →  ARCHITECTURE
   (raw notes)  (3 pillars) (IDed behaviors)  (layer rules)
                                  │
                                  ▼
                          FEATURE  F-NNN
                                  │
              ┌───────────────────┼─────────────────────┐
              ▼                   ▼                     ▼
        write tests  →  implement  →  prune  →  trace-audit  →  ship
        (fail first)   (one slice)   (delete)   (mechanical)

Things that stay constant:

Every behavior has a stable ID (R-014, SEL-08). Never reused.
Every test name carries that ID. Greppable.
Every commit message carries the feature ID (F-014: …).
Every real decision gets a 1-page ADR. Append-only.
Every slice ends with a deletion pass. AI doesn't delete unless asked.
Every "done" claim has a Reality Check with a fingerprint. A passing test isn't proof; a transcript of the actual run is.
Every fix begins with a reproduction and a hypothesis. Second failed attempt triggers a written retro before attempt #3.
Lessons go into articles/. That folder is the agent's memory across sessions.
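
"Greppable" is the point of putting IDs in test names: a script can recover coverage without reading any test body. A minimal sketch, assuming Python tests whose names embed the behavior ID with underscores in place of the hyphen (`test_R_014_...`). That naming convention, the helper `behavior_ids_in`, and the sample test names are illustrative assumptions, not something this playbook prescribes.

```python
import re

# Assumed convention (hypothetical): the ID follows "test_" with "_" instead
# of "-", e.g. test_R_014_rejects_empty_cart covers behavior R-014.
ID_IN_TEST_NAME = re.compile(r"^test_([A-Z]+)_(\d+[a-z]?)_")


def behavior_ids_in(test_names):
    """Map each behavior ID to the test names that carry it."""
    covered = {}
    for name in test_names:
        match = ID_IN_TEST_NAME.match(name)
        if match is None:
            continue  # a test without an ID is invisible to any audit
        behavior_id = f"{match.group(1)}-{match.group(2)}"
        covered.setdefault(behavior_id, []).append(name)
    return covered


names = [
    "test_R_014_rejects_empty_cart",
    "test_SEL_08_keeps_selection_on_reload",
    "test_helper_without_id",
]
print(behavior_ids_in(names))
```

The third name carries no ID, so it drops out of the mapping. That is the failure mode to watch for in review.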

How to use this: four scenarios

Scenario A — Brand new project

00-bootstrap — empty spine.
0a-capture-ideas once or twice as the idea matures.
01-product when shape emerges. One page max.
02-requirements — 5–15 Must-haves is enough to start.
03-architecture only after writing your first feature spec.
Then jump to scenario C.

Don't spend more than a day or two here. The spine fills incrementally.

Scenario B — Existing sprawling plan

00-bootstrap in detect-existing mode.
Move existing plan into canonical paths (PRODUCT.md, REQUIREMENTS.md).
Migrate open questions into docs/open-questions.md with Q-NNN ids.
Pick one vertical slice and jump to scenario C.

Scenario C — Building the next feature

The loop you'll be in most of the time.

04-feature-spec — open docs/features/F-NNN-<slug>.md. Link requirement IDs. Write a concrete contract. List every test. List what's out of scope.
Pause. Run the Consistency Invariant Check. If anything fails, do not write code.
05-test-first — failing tests, one per acceptance criterion. Highest-leverage review point.
06-implement-slice — make tests green, one commit per checklist item. Each item ends with a Reality Check transcript.
07-refactor-prune — deletion pass. Non-negotiable.
08-adr — only if a real decision was made.
09-review-traceability — mechanical audit.
Append CHANGELOG.md line. Move on.

A feature that fits this loop is one invariant pass + one Reality Check. If yours doesn't fit, split.

Scenario D — A bug surfaces

Always run 06b-fix-bug. No exceptions.

Reproduce first. Failing regression test that captures the user's repro verbatim.
Diagnose, don't guess. One-sentence hypothesis. Locate root cause, not symptom.
Predict before patching.
Smallest patch. Re-run the suite + a Reality Check.
Compare to prediction. Match → done. Mismatch → diagnosis was wrong; don't patch the patch.
Second failed attempt = mandatory written retro. Four questions before attempt #3.
Raise the bar. Add the class of test that would have caught this.
If a trigger fired, run 0b-write-article.
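
The predict → compare → retro rule above is easy to state and easy to skip, so it helps to see it as data. A rough sketch, assuming each attempt is recorded with its hypothesis, its prediction, and the observed result. The `FixAttempt` record, the function names, and the sample strings are hypothetical, and comparing prediction to observation by string equality is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass
class FixAttempt:
    hypothesis: str  # one-sentence root-cause claim
    prediction: str  # written down BEFORE the patch is applied
    observed: str    # what the suite and the Reality Check showed afterwards


def verdict(attempt):
    """Match -> done. Mismatch -> the diagnosis was wrong, so rediagnose."""
    if attempt.observed.strip() == attempt.prediction.strip():
        return "done"
    return "rediagnose"


def retro_required(attempts):
    """True once two attempts have failed, i.e. before attempt #3."""
    failed = sum(1 for a in attempts if verdict(a) == "rediagnose")
    return failed >= 2


history = [
    FixAttempt("cache key ignores locale", "regression test passes", "regression test fails"),
    FixAttempt("stale cache after deploy", "regression test passes", "regression test fails"),
]
print(retro_required(history))  # True
```

Note that a mismatch never leads to "patch again" directly: the only exits are done, rediagnose, or a written retro.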

Consistency invariant check

Run in your head before every test-writing session. All four must be ✅:

Contract is concrete. Public surface (signatures, flags, routes, schema) has no TBDs.
Every behavior in that contract has a requirement ID with acceptance criteria.
Every architectural choice the slice relies on is documented.
Every open question (Q-*) the feature depends on is decided.

If you do nothing else from this playbook, do this.
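
If running the check in your head stops being reliable, the four conditions can be written down as a gate. A minimal sketch, assuming you list the gaps by hand. The `SliceState` record, its field names, the sample route, and the plain "TBD" substring test are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SliceState:
    contract_text: str
    # Behaviors lacking a requirement ID or acceptance criteria.
    behaviors_missing_requirement: list = field(default_factory=list)
    # Architectural choices the slice relies on that are not written down.
    undocumented_arch_choices: list = field(default_factory=list)
    # Open questions (Q-*) the feature depends on that are still undecided.
    undecided_questions: list = field(default_factory=list)


def invariant_failures(state):
    """Return the invariants that fail; an empty list means all four pass."""
    failures = []
    if "TBD" in state.contract_text:
        failures.append("1: contract still has TBDs")
    if state.behaviors_missing_requirement:
        failures.append("2: behaviors without requirement ID or acceptance criteria")
    if state.undocumented_arch_choices:
        failures.append("3: undocumented architectural choices")
    if state.undecided_questions:
        failures.append("4: undecided open questions")
    return failures


state = SliceState(
    contract_text="POST /cart/checkout returns 409 when the cart is empty",
    undecided_questions=["Q-003"],
)
problems = invariant_failures(state)
print("write tests" if not problems else f"stop: {problems}")
```

Here the contract is concrete but Q-003 is undecided, so the gate says stop.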

Best practices

On planning

One page for PRODUCT.md. If longer, you have requirements masquerading as vision.
Pillars are veto criteria, not decoration.
Capture ideas in IDEAS.md first; promote later.

On requirements

Given/When/Then (G/W/T) for actions; invariants for properties; matrices for combinatorial rules; state machines for stateful flows.
One source of truth per topic.
Stable IDs forever. Retire, don't reuse. Suffix letters (R-12a) for closely-related rows.
Park, don't argue. Anything not Must/Should goes to docs/backlog.md.
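
"Retire, don't reuse" means new IDs are allocated from everything ever issued, not from what is currently active. A small sketch of that rule, assuming IDs look like `R-014` and that retired IDs are kept in a list somewhere. The function name, the default three-digit width, and the sample IDs are assumptions for illustration.

```python
import re


def next_requirement_id(ever_issued, prefix="R", width=3):
    """Return the next unused ID. `ever_issued` must include retired IDs."""
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)[a-z]?$")
    numbers = [int(m.group(1)) for m in map(pattern.match, ever_issued) if m]
    return f"{prefix}-{max(numbers, default=0) + 1:0{width}d}"


active = ["R-001", "R-003"]
retired = ["R-002", "R-004"]
print(next_requirement_id(active + retired))  # R-005
```

R-002 looks free if you only read the active list. Allocating from the combined list is what keeps old commits and tests that mention R-002 unambiguous.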

On the agent

Tell the agent which workflow to follow by path when you want determinism.
Force the deletion pass after every implementation.
Refuse silent test edits. Either the slice broke a contract or the contract changed (write an ADR).
Refuse silent layer-boundary violations. Either the rule changes (with an ADR) or the design changes.

On reviewing AI code

Review the tests, not the implementation. Wrong test → wrong shipped behavior.
Read the diff with the question "what should be deleted that wasn't?"
Watch for unused exports and one-implementation interfaces.
Demand the Reality Check transcript.
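
A Reality Check transcript is only useful if it comes from an actual run. A rough sketch of capturing one, assuming the fingerprint is a short SHA-256 digest of the transcript text. This README does not define the fingerprint format, so that choice, the `reality_check` helper, and the transcript layout are all assumptions.

```python
import hashlib
import subprocess
import sys


def reality_check(command):
    """Run the real command and return (transcript, fingerprint)."""
    result = subprocess.run(command, capture_output=True, text=True)
    transcript = (
        f"$ {' '.join(command)}\n"
        f"{result.stdout}{result.stderr}"
        f"exit code: {result.returncode}\n"
    )
    # Assumed fingerprint: first 12 hex characters of the SHA-256 digest.
    fingerprint = hashlib.sha256(transcript.encode("utf-8")).hexdigest()[:12]
    return transcript, fingerprint


# Stand-in for the real app: a one-line Python program.
transcript, fingerprint = reality_check([sys.executable, "-c", "print('hello')"])
print(transcript)
print("fingerprint:", fingerprint)
```

Because the digest covers the command, the output, and the exit code, an edited transcript no longer matches its fingerprint.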

On commit hygiene

Every feature commit message starts with the feature ID. One enforcer (the commit-msg hook in tools/hooks/) makes traceability automatic.
One commit per checklist item.
Use chore: / docs: for non-feature work.

When you feel things slipping

"Where did feature X go?" Run 09-review-traceability. Orphan-requirement list is your answer.
"Codebase feels bloated." You skipped 07-refactor-prune. Do an explicit deletion pass on the last three slices.
"Tests pass but the app is broken." You skipped the Reality Check.
"AI keeps fixing the same bug." You're not running 06b-fix-bug. Force the reproduce → diagnose → predict order.
"Agent doesn't remember what we decided." Missing ADRs and articles.
"Plan keeps growing." Move everything not Must/Should into docs/backlog.md.
"Implementation diverges from spec." Tests aren't carrying behavior IDs. Add them.
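
The orphan-requirement list mentioned above is a set difference: IDs that appear in the requirements but in no test. A minimal sketch of that idea, not a description of what `tools/audit.sh` actually does. The function name, the `prefixes` parameter, and the sample requirement text are hypothetical.

```python
import re


def orphan_requirements(requirements_text, covered_ids, prefixes=("R", "SEL")):
    """Return requirement IDs that no test covers, sorted for stable output."""
    alternatives = "|".join(re.escape(p) for p in prefixes)
    id_pattern = re.compile(rf"\b(?:{alternatives})-\d+[a-z]?\b")
    required = set(id_pattern.findall(requirements_text))
    return sorted(required - set(covered_ids))


requirements_text = """
R-014 Checkout rejects an empty cart.
R-015 Cart keeps its items after reload.
SEL-08 Selection survives reload. Depends on Q-003.
"""
covered_ids = {"R-014", "SEL-08"}  # however you extract IDs from test names
print(orphan_requirements(requirements_text, covered_ids))  # ['R-015']
```

Q-003 is ignored because only requirement prefixes are scanned. R-015 has no test, so it is the feature that "went missing".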

What this playbook is not

Not Agile. No sprints, no points.
Not waterfall. The flow is non-linear; edit any artifact at any time.
Not test-driven dogma. Tests are the contract for AI code review.
Not an excuse to plan forever. Day 3 with no test = stalling.

Lite mode (smallest viable spine)

Don't want the full framework? The minimum that still pays off:

00-bootstrap — empty spine.
04-feature-spec — one F-NNN.md per slice with a concrete contract + test list + out-of-scope.
06-implement-slice — Reality Check transcripts (with fingerprints) for every "done".
07-refactor-prune — deletion pass after every slice.
CHANGELOG.md — one line per merged feature.

Skip until they hurt: 01-product, 02-requirements, 03-architecture, 08-adr, 09-review-traceability, articles.

CI hookup (one paragraph)

tools/audit.sh is the mechanical version of 09-review-traceability. Wire it as a blocking PR check. Copy tools/hooks/commit-msg to .git/hooks/commit-msg; it rejects messages that do not start with F-NNN: / fix(F-NNN): / chore: / docs: / test: / refactor:. Copy tools/hooks/pre-push to .git/hooks/pre-push to run the audit locally. Git only runs hooks that are executable, so check the file mode after copying. That's it.
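
The accepted commit prefixes listed above fit in one regular expression. A sketch of the rule in Python, assuming NNN stands for one or more digits. The shipped hook in tools/hooks/commit-msg may be written differently, and `is_accepted` is a hypothetical name.

```python
import re

# Accepted prefixes: F-NNN: / fix(F-NNN): / chore: / docs: / test: / refactor:
# Assumption: NNN is one or more digits.
ACCEPTED_PREFIX = re.compile(r"^(?:F-\d+|fix\(F-\d+\)|chore|docs|test|refactor):")


def is_accepted(message):
    """Check only the first line, which is where the prefix must appear."""
    lines = message.splitlines()
    return bool(lines) and ACCEPTED_PREFIX.match(lines[0]) is not None


for subject in ["F-014: add cart total", "fix(F-014): handle empty cart", "add cart total"]:
    print(f"{subject!r}: {'ok' if is_accepted(subject) else 'rejected'}")
```

Anything without a recognized prefix is rejected, including conventional prefixes the list does not name, such as `feat:`.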
