TDD-style prompt engineering for AI-assisted development.
Flight is a methodology and toolset that reduces AI code generation mistakes by front-loading constraints. Instead of fixing AI mistakes after the fact, Flight ensures the rules are known before a single line is written.
AI code generation has a fundamental problem: it often produces code that looks correct but violates project standards, framework conventions, and established engineering patterns.
The traditional fix is linting and code review after generation. This fails because:
- Remediation is expensive - Fixing generated code takes longer than writing it right
- Context is lost - By the time you're fixing, Claude has forgotten why it wrote what it wrote
- Patterns repeat - Claude makes the same mistakes across sessions
- Standards drift - Each conversation starts from zero knowledge of your rules
Flight inverts this. The rules come first. The code comes second.
┌─────────────────────────────────────────────────────────────────────────────┐
│ /flight-prd │
│ INPUT: Rough idea ("document collection via SMS") │
│ OUTPUT: PRD.md + tasks/*.md + known-landmines.md │
│ TOOLS: Web Search, Firecrawl, Context7 (temporal research automatic) │
│ USE: Starting from scratch, need to understand problem space │
│ FLAGS: --no-research (skip temporal research for fast iteration) │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ /flight-prime │
│ INPUT: Task description, PRD.md, or tasks/001-*.md │
│ OUTPUT: PRIME.md │
│ TOOLS: Domain Files, Codebase Scan, Web Search, Context7, Firecrawl │
│ USE: Have a clear task, need to gather implementation details │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ /flight-compile │
│ INPUT: PRIME.md │
│ OUTPUT: PROMPT.md │
│ TOOLS: None - pure synthesis │
│ USE: After prime, before implementation │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ EXECUTE │
│ INPUT: PROMPT.md │
│ OUTPUT: Code files │
│ HOW: Claude implements following the compiled prompt │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ /flight-validate │
│ INPUT: Generated code files │
│ OUTPUT: PASS/FAIL with specific violations │
│ TOOLS: .validate.sh scripts (deterministic grep-based checks) │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
┌─────────┴─────────┐
▼ ▼
PASS FAIL
│ │
▼ ▼
Done /flight-tighten
│
▼
Loop back to /flight-compile
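The diagram above can be read as a small state machine. The sketch below is only an illustration of that reading: the stage names mirror the commands in the diagram, while `NEXT`, `next_stage`, and `trace` are hypothetical helpers that are not part of Flight.

```python
# Hypothetical model of the workflow diagram; Flight itself runs these
# stages as Claude Code commands, not as Python functions.
NEXT = {
    "flight-prd": "flight-prime",
    "flight-prime": "flight-compile",
    "flight-compile": "execute",
    "execute": "flight-validate",
    "flight-tighten": "flight-compile",  # a failed run loops back to compile
}


def next_stage(stage, passed=None):
    """Return the stage that follows `stage`; validation branches on `passed`."""
    if stage == "flight-validate":
        if passed is None:
            raise ValueError("validation needs a pass/fail result")
        return "done" if passed else "flight-tighten"
    return NEXT[stage]


def trace(results):
    """Walk from flight-prd, consuming one validation result per validate step."""
    results = iter(results)
    stage, path = "flight-prd", ["flight-prd"]
    while stage != "done":
        passed = next(results) if stage == "flight-validate" else None
        stage = next_stage(stage, passed)
        path.append(stage)
    return path
```

Calling `trace([False, True])` walks the loop once through `flight-tighten` before finishing, which is the FAIL branch of the diagram.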
| Starting Point | First Command |
|---|---|
| Vague idea ("build SMS thing") | /flight-prd |
| Clear task description | /flight-prime |
| PRD.md or tasks/*.md file | /flight-prime |
| Have domain.md, need validator | /flight-create-validator |
| Have code, need to check it | /flight-validate |
| New project, need domain detection | /flight-scan |
| Existing project, adding new deps | /flight-research |
| Stale project (>3 months old) | /flight-research --quick |
curl -fsSL https://raw.githubusercontent.com/mojoatomic/flight/main/install.sh | bash
This creates:
your-project/
├── .flight/
│   ├── FLIGHT.md              # Core methodology
│   └── domains/               # Domain rules + validators
│       ├── bash.md / .validate.sh
│       ├── javascript.md / .validate.sh
│       ├── react.md / .validate.sh
│       └── ...
├── .claude/skills/            # Flight skills (SKILL.md format)
│   ├── flight-prd/SKILL.md
│   ├── flight-prime/SKILL.md
│   ├── flight-compile/SKILL.md
│   ├── flight-validate/SKILL.md
│   ├── flight-tighten/SKILL.md
│   ├── flight-scan/SKILL.md
│   └── flight-research/SKILL.md
└── CLAUDE.md                  # Project instructions
./update.sh
Or fetch the latest update script directly:
curl -fsSL https://raw.githubusercontent.com/mojoatomic/flight/main/update.sh | bash
Updates:
- `.claude/skills/*` (all Flight skills)
- `.flight/FLIGHT.md` (core methodology)
- `.flight/validate-all.sh`
- `.flight/domains/*` (all stock domains)
- `.flight/examples/`, `exercises/`, `templates/`
Preserves:
- `CLAUDE.md` (your project description)
- `PROMPT.md`, `PRIME.md` (your working files)
- `.flight/known-landmines.md` (your project's temporal data)
- Custom domains (any `.md`/`.sh` not in the Flight repo)
Purpose: Transform a rough product idea into atomic tasks with temporal validation.
Claude Code excels at atomic, well-defined tasks. It struggles with large, multi-step projects. This command breaks big ideas into executable units AND validates that your tech stack choices are current.
Output:
- `PRD.md` - Product vision (human reference)
- `MILESTONES.md` - Phases with exit criteria
- `tasks/*.md` - Atomic task files with pinned versions
- `.flight/known-landmines.md` - Temporal issues discovered (if any)
Tools Used:
- Web Search → Competitors, market research, recent articles
- Firecrawl → Deep crawl competitor sites, extract features
- Context7 → Current docs for considered tech stack
- Temporal Research (automatic) → Breaking changes, version issues, landmines
Flags:
- `--no-research` - Skip temporal research for fast iteration or when versions are known
Example:
/flight-prd Simple document collection - user gets SMS link, uploads docs, stored encrypted
Purpose: Gather all context needed to implement a task.
Input: Task description, PRD.md, or task file (tasks/001-*.md)
Output: PRIME.md with:
- Relevant domain constraints (NEVER/MUST/SHOULD)
- Existing patterns from codebase
- External API documentation
- Files to create/modify
Tools Used:
- Domain Files → Project constraints (always first)
- Codebase Scan → Existing patterns, configs
- Web Search → Current API docs, recent changes
- Context7 → Library/framework documentation
- Firecrawl → Deep dive specific doc sites
Example:
/flight-prime tasks/003-auth-basic.md
Purpose: Transform research into an atomic, executable prompt.
Input: PRIME.md
Output: PROMPT.md containing:
- Single, focused task
- Extracted NEVER/MUST constraints
- Acceptance criteria
- Definition of done
Why atomic? Large tasks fail. Small, well-specified tasks succeed.
Purpose: Check generated code against domain rules.
Runs relevant .validate.sh scripts. Returns pass/fail with specific violations.
Example:
/flight-validate src/auth/*.ts
═══════════════════════════════════════════
TypeScript Domain Validation
═══════════════════════════════════════════
## NEVER Rules
✅ N1: No explicit any without justification
✅ N2: No @ts-ignore without explanation
## MUST Rules
✅ M1: Explicit return types on exports
RESULT: PASS
═══════════════════════════════════════════
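To make the pass/fail mechanics concrete, here is a minimal sketch of a line-by-line pattern check in the spirit of the grep-based validators. The rule IDs and titles echo the sample output above, but the regular expressions and the `validate` and `report` functions are illustrative assumptions, not the patterns Flight's TypeScript validator actually uses.

```python
import re

# Hypothetical rules: IDs and titles come from the sample output,
# the regexes are illustrative guesses.
NEVER_RULES = [
    # `: any` with no trailing comment on the same line to justify it
    ("N1", "No explicit any without justification", re.compile(r":\s*any\b(?!.*//)")),
    # `@ts-ignore` with nothing after it on the line
    ("N2", "No @ts-ignore without explanation", re.compile(r"@ts-ignore\s*$")),
]


def validate(source):
    """Return (passed, violations) for one file's text, checked line by line."""
    violations = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule_id, title, pattern in NEVER_RULES:
            if pattern.search(line):
                violations.append((rule_id, lineno, title))
    return (not violations, violations)


def report(source):
    """Format violations plus a final RESULT line, loosely like the output above."""
    passed, violations = validate(source)
    lines = [f"{rule_id} line {lineno}: {title}" for rule_id, lineno, title in violations]
    lines.append("RESULT: " + ("PASS" if passed else "FAIL"))
    return "\n".join(lines)
```

Because each check is a plain pattern match over lines, the same input always yields the same verdict, which is the property the real validators rely on.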
Purpose: Auto-detect project domains and generate validation config.
Scans your project for file types and frameworks, then creates .flight/flight.json with enabled domains.
Example:
/flight-scan
# Creates .flight/flight.json:
{
  "enabled_domains": ["code-hygiene", "typescript", "react", "nextjs"]
}
After scanning, .flight/validate-all.sh uses this config to run only relevant validators.
See Also: Validation Runner Documentation for complete usage, exclusions system, and CI/CD integration.
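The scan step can be pictured as a mapping from file markers to domain names. The following is a rough sketch under stated assumptions: the marker tables and the `detect_domains` and `build_config` functions are hypothetical, and the real scan may rely on different signals.

```python
import json
from pathlib import PurePosixPath

# Hypothetical marker tables; the domain names are taken from the domain list,
# the extension and filename markers are assumptions.
EXTENSION_DOMAINS = {".ts": "typescript", ".tsx": "react", ".py": "python", ".sh": "bash"}
FILENAME_DOMAINS = {"next.config.js": "nextjs", "Dockerfile": "docker"}


def detect_domains(paths):
    """Map project file paths to domain names; code-hygiene is always enabled."""
    domains = ["code-hygiene"]
    for path in paths:
        p = PurePosixPath(path)
        for domain in (EXTENSION_DOMAINS.get(p.suffix), FILENAME_DOMAINS.get(p.name)):
            if domain and domain not in domains:
                domains.append(domain)
    return domains


def build_config(paths):
    """Return JSON text shaped like the .flight/flight.json example."""
    return json.dumps({"enabled_domains": detect_domains(paths)}, indent=2)
```

For a project containing `src/app.ts`, `src/Button.tsx`, and `next.config.js`, this sketch yields the same `enabled_domains` list as the example config.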
Purpose: Analyze validation failures and strengthen rules.
When validation fails, tighten examines why and suggests domain improvements to prevent recurrence. Then loop back to /flight-compile.
Purpose: Generate validator script and test files from a domain contract.
Input: Domain .md file
Output:
- `domain.validate.sh` - Executable validator matching NEVER/MUST/SHOULD rules
- `tests/domain.bad.ext` - Test file that must fail validation
- `tests/domain.good.ext` - Test file that must pass validation
Example:
/flight-create-validator .flight/domains/my-domain.md
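The bad/good test files act as a self-test for the validator: it must reject the bad fixture and accept the good one. A minimal sketch of that idea follows, with `make_validator` and `self_test` as hypothetical names and a single forbidden pattern standing in for a full domain.

```python
import re


def make_validator(pattern):
    """Return a validator that passes (True) only if no line matches the forbidden pattern."""
    regex = re.compile(pattern)
    return lambda text: not any(regex.search(line) for line in text.splitlines())


def self_test(validator, bad_text, good_text):
    """A validator is trustworthy only if it rejects the bad fixture and accepts the good one."""
    problems = []
    if validator(bad_text):
        problems.append("bad fixture passed validation")
    if not validator(good_text):
        problems.append("good fixture failed validation")
    return problems
```

An empty list from `self_test` means the validator behaves as the contract requires on both fixtures; any entry points at a pattern that is too loose or too strict.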
Purpose: Temporal research for dependencies - validates versions, discovers breaking changes.
Note: For new projects, /flight-prd runs this automatically. Use standalone for:
- Existing projects adding new dependencies
- Re-checking stale projects (>3 months old)
- After using `/flight-prd --no-research`
Output:
- Version recommendations with reasons
- `.flight/known-landmines.md` - Issues discovered with re-verification dates
- Updated task files (if they exist) with pinned versions
Flags:
- `--quick` - Check landmines staleness only, no new searches
- `--deep` - All deps, add Firecrawl deep dives
- `--include-all` - Include tooling deps (eslint, prettier, @types)
Example:
/flight-research express@5 mongodb@7
/flight-research --quick
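A staleness check of the kind `--quick` describes could look roughly like the sketch below. The entry format (a name plus a last-verified date) and the 90-day threshold are assumptions made for illustration; the actual layout of `known-landmines.md` is not specified here.

```python
from datetime import date, timedelta

# Assumption: "3 months" is approximated as 90 days.
STALE_AFTER = timedelta(days=90)


def stale_landmines(entries, today):
    """Return names of entries whose last verification is older than STALE_AFTER.

    `entries` is a hypothetical list of (name, last_verified_date) pairs.
    """
    return [name for name, verified_on in entries if today - verified_on > STALE_AFTER]
```

Entries returned by this function would be the ones to re-verify before trusting their pinned versions.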
| Tool | What It Does | When to Use |
|---|---|---|
| Web Search | General search, news, articles | Competitors, market research, recent changes |
| Context7 | Library/framework docs (curated, versioned) | Getting current API docs for Next.js, React, Supabase, etc. |
| Firecrawl | Deep web scraping, crawl entire sites | Extracting features from competitor sites, scraping doc sites |
| Domain Files | Project-specific rules | Always - before any code generation |
| Codebase Scan | Find existing patterns | Understanding project conventions |
MCP Tool Installation:
- Context7: https://github.com/upstash/context7
- Firecrawl: https://github.com/mendableai/firecrawl
Commands work without MCP tools (using web search fallback) but produce better results with them.
Domains are the heart of Flight. Each domain captures:
- NEVER rules - Hard constraints that fail validation
- MUST rules - Required patterns that fail validation
- SHOULD rules - Best practices that warn but don't fail
- GUIDANCE - Patterns too complex for grep, documented for Claude
The code-hygiene domain applies to all code in every language. It catches AI-generated code smells that transcend syntax:
- Generic variable names - `data`, `result`, `temp`, `item`, `value`, `obj` → use descriptive names
- Redundant conditionals - `if (x) return true; else return false;` → `return x;`
- Meaningless prefixes - `myVar`, `theUser`, `aResult` → drop the noise
- Magic number calculations - `1024 * 1024` → `BYTES_PER_MB`
- Boolean parameter blindness - `process(true, false)` → use named options
This domain runs automatically on every validation. You don't need to explicitly load it.
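Two of the smells listed above lend themselves to simple pattern checks. The sketch below is an assumption-laden illustration: the regexes target JS-style declarations only, and `hygiene_findings` is a hypothetical helper, not the code-hygiene validator itself.

```python
import re

GENERIC_NAMES = {"data", "result", "temp", "item", "value", "obj"}
# Matches JS-style declarations such as `const data = ...`;
# other languages would need their own patterns.
DECLARATION = re.compile(r"\b(?:const|let|var)\s+([A-Za-z_$][\w$]*)")
REDUNDANT_IF = re.compile(r"if\s*\(.+\)\s*return\s+true;\s*else\s+return\s+false;")


def hygiene_findings(line):
    """Return a list of smell descriptions found on one line of code."""
    findings = []
    for name in DECLARATION.findall(line):
        if name in GENERIC_NAMES:
            findings.append(f"generic variable name: {name}")
    if REDUNDANT_IF.search(line):
        findings.append("redundant conditional")
    return findings
```

Both checks look only at the text of a line, so they apply regardless of whether the surrounding code compiles or passes a linter.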
| Domain | Focus | Key Rules |
|---|---|---|
| `api` | REST/HTTP APIs | Resource URIs, status codes, versioning |
| `bash` | Shell scripts | Strict mode, quoting, error handling |
| `code-hygiene` | Universal | Naming, redundant logic, semantic clarity |
| `clerk` | Clerk auth (TS/Next.js) | clerkMiddleware, organizations, multi-tenant patterns |
| `docker` | Container config | Multi-stage builds, non-root users, layer caching |
| `embedded-c-p10` | Safety-critical C | NASA Power of 10 rules |
| `go` | Go source files | Error handling, defer patterns, concurrency |
| `javascript` | JS files | No `var`, no `==`, no `console.log` |
| `kubernetes` | K8s manifests | Resource limits, probes, security contexts |
| `nextjs` | Next.js App Router | Server/client boundaries, loading states |
| `prisma` | Prisma ORM (TS/Next.js) | Multi-tenant queries, N+1 prevention, error handling, singleton |
| `python` | Python files | No bare except, type hints, logging |
| `react` | React components | No inline objects in JSX, proper hooks |
| `rp2040-pico` | RP2040 embedded | Spinlocks, watchdog, static allocation |
| `rust` | Rust files | Error handling, unsafe blocks, ownership |
| `scaffold` | Project setup | create-vite, npm init patterns |
| `sms-twilio` | SMS/Twilio | Message validation, error handling, opt-out |
| `sql` | Database queries | No `SELECT *`, parameterized queries, RLS |
| `supabase` | Supabase (TS/Next.js) | @supabase/ssr, auth patterns, realtime cleanup |
| `testing` | Unit tests | Isolation, naming, assertion patterns |
| `typescript` | TypeScript files | No unjustified `any`, explicit returns |
| `webhooks` | Webhook handlers | Idempotency, signature verification, timeouts |
| `yaml` | YAML config | Quoting, anchors, multiline strings |
The .md file is a contract. The .validate.sh file enforces it.
Rules in NEVER/MUST sections must have corresponding validator checks. If a rule can't be mechanically checked, it belongs in GUIDANCE, not NEVER.
This keeps the contract honest. No aspirational rules that aren't enforced.
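The "no aspirational rules" principle can itself be checked mechanically. The sketch below assumes each rule is available as a dict with a `severity` field and an optional `check` field; that shape and the `unenforced_rules` function are hypothetical, chosen only to illustrate the idea.

```python
def unenforced_rules(rules):
    """Return the IDs of NEVER/MUST rules that have no validator check attached.

    `rules` is a hypothetical mapping of rule ID to a dict with a
    "severity" key and an optional "check" key.
    """
    return sorted(
        rule_id
        for rule_id, rule in rules.items()
        if rule["severity"] in ("NEVER", "MUST") and not rule.get("check")
    )
```

A non-empty result would mean the contract promises something the validator does not enforce, so the rule should either gain a check or move to GUIDANCE.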
Flight domains are defined in .flight YAML files and compiled to .md (documentation) + .validate.sh (executable validator). This single-source approach eliminates drift between specs and validators.
# One-time setup
python3 -m venv .venv
.venv/bin/pip install pyyaml
1. Create .flight/domains/my-domain.flight
2. Compile: .flight/bin/flight-domain-compile my-domain.flight
3. Test: .flight/validate-all.sh
domain: my-domain
version: "1.0"
description: What this domain covers
file_patterns:
  - "**/*.ts"
rules:
  N1:
    title: Rule title
    severity: NEVER        # NEVER, MUST, SHOULD, or GUIDANCE
    mechanical: true       # Generate validator check
    description: Why this rule exists
    check:
      type: grep           # grep, script, presence, file_exists
      pattern: "bad-pattern"
    examples:
      bad:
        - "code that violates"
      good:
        - "code that passes"
# Compile single domain
.flight/bin/flight-domain-compile my-domain.flight
# Compile all domains
.flight/bin/flight-domain-compile --all
# Syntax check only
.flight/bin/flight-domain-compile --check my-domain.flight
| Level | Validator | Blocks |
|---|---|---|
| NEVER | check() | Yes |
| MUST | check() | Yes |
| SHOULD | warn() | No |
| GUIDANCE | None | No |
See .flight/FLIGHT.md for complete YAML reference and check types.
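The severity table translates directly into exit behavior: NEVER and MUST findings block, SHOULD findings only warn. A minimal sketch of that mapping follows; `BLOCKING` and `summarize` are hypothetical names, and the generated validators may structure this differently.

```python
# Mirrors the Level / Blocks columns of the table above.
BLOCKING = {"NEVER": True, "MUST": True, "SHOULD": False, "GUIDANCE": False}


def summarize(findings):
    """Split (severity, message) findings into errors and warnings.

    Returns (exit_code, errors, warnings); exit_code is 1 only when a
    blocking finding is present.
    """
    errors = [msg for severity, msg in findings if BLOCKING[severity]]
    warnings = [msg for severity, msg in findings if severity == "SHOULD"]
    return (1 if errors else 0, errors, warnings)
```

A run with only SHOULD findings therefore still exits 0, which keeps best-practice hints from failing a build.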
Linters catch syntax and style. Flight catches semantic mistakes:
- Standard linters typically won't flag `const data = fetchUser()` - it's valid JS
- Flight flags it because `data` is a meaningless name
- Standard linters typically won't flag `if (x) return true; else return false;`
- Flight flags it because it should be `return x;`
Linters and Flight are complementary. Linters handle formatting. Flight handles intent.
Prompts decay over long conversations. Claude forgets instructions. Context windows fill up.
Flight commands re-inject rules at decision points. Every /flight-prime loads fresh domain knowledge. Every /flight-validate checks against ground truth.
- Deterministic - Same input, same output, every time
- Fast - Grep is faster than AST parsing for most checks
- Transparent - Read the script, understand the check
- No dependencies - Validators need only Bash and grep, which ship with most Unix-like systems and CI runners
- Composable - Chain validators, filter output, integrate with CI
If it follows the invariants, it's correct.
- Invariants beat intelligence
- Prompts are compiled, not written
- Validation is executable, not interpreted
- Failures strengthen the system
# .github/workflows/flight.yml
- name: Flight Validation
  run: |
    for validator in .flight/domains/*.validate.sh; do
      bash "$validator" "src/**/*.ts" || exit 1
    done
# .pre-commit-config.yaml
- repo: local
  hooks:
    - id: flight-validate
      name: Flight Domain Validation
      entry: bash -c 'for v in .flight/domains/*.validate.sh; do bash "$v" || exit 1; done'
      language: system
Flight commands are automatically available in Claude Code when .claude/skills/ exists in your project.
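The CI and pre-commit snippets above both use the same fail-fast loop: run each validator in turn and stop at the first non-zero exit. For environments where a Python entry point is more convenient, that loop might be sketched as follows. `run_validators` is a hypothetical helper, and the command lists passed to it are whatever invocations your project uses.

```python
import subprocess
import sys


def run_validators(commands):
    """Run each command (an argv list) in order, stopping at the first failure.

    Returns the index of the first failing command, or None if all passed.
    """
    for index, argv in enumerate(commands):
        if subprocess.run(argv).returncode != 0:
            return index
    return None


if __name__ == "__main__":
    # Each command-line argument is treated as a validator script path (assumption).
    failed = run_validators([["bash", path] for path in sys.argv[1:]])
    sys.exit(0 if failed is None else 1)
```

Stopping at the first failure matches the `|| exit 1` behavior of the shell loops, so later validators do not run once one has failed.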
Q: Does this replace code review?
No. Flight catches mechanical violations. Humans catch architectural mistakes, business logic errors, and "this works but is the wrong approach."
Q: What if a rule has false positives?
Refine the grep pattern or move the rule to GUIDANCE. The validator should only fail on actual violations.
Q: Can I disable rules?
Yes. Remove them from the domain file, or add exclusion patterns to the validator.
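As one way to picture exclusion patterns, the sketch below filters a file list with shell-style globs before validation. The pattern syntax and the `apply_exclusions` function are assumptions for illustration; the exclusions system described in the Validation Runner Documentation may work differently.

```python
from fnmatch import fnmatch


def apply_exclusions(files, patterns):
    """Drop every file path that matches at least one shell-style exclusion pattern."""
    return [path for path in files if not any(fnmatch(path, p) for p in patterns)]
```

Filtering paths up front keeps the rule itself unchanged, so generated or vendored code can be skipped without weakening the check for everything else.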
Q: How do I know which domains apply?
/flight-prime auto-detects based on file extensions and framework markers. You can also specify explicitly.
Apache 2.0