Stop re-explaining your project to AI every session. Battle-tested .md context templates, a cost calculator that shows you the damage, and engineering guides extracted from 5 months of running 10-agent AI operations in production.
No framework. No dependencies. Just Markdown files you own.
If you use AI seriously — for trading bots, content production, customer-facing agents, or any multi-step workflow — you've felt this:
- You re-paste the same project context at the start of every session
- Your API costs scale linearly with how often you explain yourself
- Your "good prompts" die when the conversation ends
- Moving from Claude to GPT to Gemini means starting from zero
- Your multi-agent setup has agents "remembering" slightly different things, and nothing tells you where they diverged
These aren't prompt problems. They're context problems.
This kit solves them the boring way: with persistent Markdown files you load at the start of every session, version-control with Git, and carry across any model.
| What's inside | Description |
|---|---|
| 📁 Templates | Drop-in .md starters for solo operators, multi-agent swarms, domain specialists, creators, and small businesses |
| 💰 Cost Calculator | See your actual monthly savings from persistent context + Prompt Caching. Open it in a browser; no install. |
| 📘 Engineering Guides | Prompt Caching setup, Git workflow, multi-agent sync, model portability, context routing |
| 🧪 Production Examples | Anonymized context files from real running systems: WildEconForce, WILD_SNIPER, Allcare |
Anthropic discounts cached prefix tokens by roughly 90% (OpenAI's automatic prompt caching discounts them by about 50%). If your context (persona, rules, domain knowledge) is loaded as a stable prefix at the top of every request, you pay full price once and near nothing for every session after.
Illustrative numbers for a 10-agent operation running ~40 sessions/day with ~4K tokens of context each:
| Setup | Est. Monthly API Cost |
|---|---|
| Re-explain context every session (typical) | ~$1,200 |
| .md context file, no caching | ~$900 |
| .md context + Prompt Caching enabled | ~$240 |
That's a 5x swing on the same work. Most teams don't know about it because the UX of Claude and ChatGPT hides it — caching is API-layer only.
The cost calculator in this repo lets you plug in your own usage and see the real number.
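For a back-of-envelope version of that math, here is a minimal Python sketch of the core calculation. The $3-per-million-token price is an assumed figure, and it counts only the context-prefix input tokens, so the absolute numbers come out smaller than the illustrative table above:

```python
def monthly_context_cost(sessions_per_day, context_tokens, usd_per_mtok,
                         cache_discount=0.0, days=30):
    """Monthly spend on the context prefix alone (input tokens only)."""
    total_tokens = sessions_per_day * days * context_tokens
    return total_tokens / 1_000_000 * usd_per_mtok * (1 - cache_discount)

# Assumed pricing: $3 per million input tokens.
# ~40 sessions/day across the operation, ~4K tokens of context each.
no_cache = monthly_context_cost(sessions_per_day=40, context_tokens=4_000,
                                usd_per_mtok=3.0)
cached = monthly_context_cost(sessions_per_day=40, context_tokens=4_000,
                              usd_per_mtok=3.0, cache_discount=0.9)
print(f"no cache: ${no_cache:.2f}/mo   cached: ${cached:.2f}/mo")
# no cache: $14.40/mo   cached: $1.44/mo
```

Same shape of savings at any scale: the cached line is always one-tenth of the uncached one.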
```bash
git clone https://github.com/USERNAME/context-engineering-kit.git
cd context-engineering-kit

# 1. Pick a template
cp templates/solo-operator.md ~/my-context.md

# 2. Fill in your persona, rules, domain knowledge
$EDITOR ~/my-context.md

# 3. Load it in whatever AI tool you use:
#    - Claude Code      → rename to CLAUDE.md in project root (auto-loads)
#    - Cursor           → save as .cursorrules
#    - ChatGPT          → paste into Custom Instructions
#    - Direct API calls → send as a cached system prompt prefix
```

That's the whole setup. The rest of this repo is depth.
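For the direct-API route, a cached-prefix request in the Anthropic Messages API shape can be sketched like this. The model name is illustrative, and actually sending the payload requires the `anthropic` SDK and an API key; the sketch only builds the request:

```python
def build_cached_request(context_md: str, user_msg: str,
                         model: str = "claude-sonnet-4-20250514"):
    """Build a Messages API payload whose system prompt is marked as a
    cacheable prefix via cache_control (the stable part goes first)."""
    return {
        "model": model,
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": context_md,  # your persistent .md context
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_msg}],
    }

payload = build_cached_request("# Persona\nYou are ...", "Summarize today's plan.")
# Send with: anthropic.Anthropic().messages.create(**payload)
```

Because the system block is byte-identical across sessions, every request after the first reads it from cache instead of paying full input price.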
```
context-engineering-kit/
├── README.md                      # You are here
│
├── templates/                     # Drop-in starters
│   ├── solo-operator.md           # Individuals, freelancers
│   ├── multi-agent-shared.md      # Shared context for agent swarms
│   ├── domain-specialist.md       # Vertical expertise (trading, legal, medical)
│   ├── creator.md                 # Content + writing workflows
│   └── small-business.md          # Local service businesses
│
├── examples/                      # Real production context files
│   ├── wildeconforce-efa-lite.md  # Financial analysis framework
│   ├── wild-sniper-trading.md     # Trading bot domain context
│   └── allcare-service-ops.md     # Service business + customer tone
│
├── guides/
│   ├── 01-prompt-caching.md       # The 90% discount most teams miss
│   ├── 02-git-workflow.md         # Version-control your AI behavior
│   ├── 03-multi-agent-sync.md     # Keep 10 agents on the same page
│   ├── 04-model-portability.md    # Escape vendor lock-in
│   └── 05-context-router.md       # Load different contexts per task
│
├── cost-calculator/
│   ├── index.html                 # Open in browser, no install
│   └── README.md
│
└── LICENSE
```
Every pattern in this repo comes from a live system. Nothing here is theoretical.
**WildEconForce.** Korean-language macro + capital flow analysis publishing to X/Twitter (1B+ views). Every post is generated by an agent that loads the EFA (Economic Framework Analysis) System — a custom analytical framework — as persistent context before writing.
What the context file does: defines the analytical lens, voice, taboo claims, and structural templates. The agent never re-learns "how we analyze" because it's always loaded.
What it unlocked: consistent voice across 10+ agents and hundreds of posts, A/B testable framework versions (efa_v1.md vs efa_v2.md), and no drift when the underlying model changes.
→ See examples/wildeconforce-efa-lite.md
**WILD_SNIPER.** Live trading bot (currently v3.8, targeting v4.0 with integrated macro + chart signals). The context file encodes risk rules, signal filters, R:R targets, and trading philosophy.
What it unlocked: strategy iterations are versioned as .md diffs, not tribal knowledge. Caching makes experiment runs cheap. Reproducibility means a "bad trade" can be traced back to the exact context version that produced it.
→ See examples/wild-sniper-trading.md
**Allcare.** A neighborhood-scale service business. Customer-facing AI handles pricing questions, booking flow, and tone across the website, KakaoTalk channel, and Instagram DMs.
What the context file does: pre-loads pricing, neighborhood logistics, service boundaries, and a house tone guide.
What it unlocked: one file replaces a 20-page internal "how we talk to customers" doc and keeps three channels in sync. Onboarding a new agent = pointing it at the file.
→ See examples/allcare-service-ops.md
Most conversations stop at "it saves tokens." There's more going on.
1. **Direct token reduction.** You stop re-explaining yourself every session.
2. **Indirect token reduction (the bigger one).** Without context, the model burns turns on clarification questions. Three- and four-turn loops collapse into one turn. In practice this is 5–10x the impact of (1).
3. **Prompt cache exploitation.** Stable prefix → ~90% discount on reuse. This is the single biggest cost lever most teams overlook.
4. **Reproducibility.** Same context + same input = same output. You can now A/B test your system, not just prompts.
5. **Multi-agent consistency.** One shared context file means every agent in a swarm shares the same world model. Eliminates silent drift between agents — the most expensive bug in multi-agent systems.
6. **Model portability.** A .md file moves between Claude, GPT, Gemini, and Llama. Custom GPTs and Claude Projects trap your context inside one vendor. A text file doesn't.
7. **Version control.** Git-diffable, branchable, rollback-able. Context becomes software engineering instead of folklore.
8. **Knowledge asset compounding.** Five months of .md files is an IP asset. You can license it, teach from it, or use it as fine-tuning data. Chat history cannot do this.
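The multi-agent consistency point (5) is mostly filesystem plumbing. A minimal Python sketch of the symlink pattern, with hypothetical agent directories — every agent's `CLAUDE.md` points at one shared file, so there is exactly one world model to edit:

```python
import os
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Single source of truth for the whole swarm.
shared = root / "shared-context.md"
shared.write_text("# Core Rules\n- Never quote internal pricing.\n")

# Each agent's working directory gets a symlink, not a copy.
agents = ("agent-01", "agent-02", "agent-03")
for name in agents:
    agent_dir = root / name
    agent_dir.mkdir()
    os.symlink(shared, agent_dir / "CLAUDE.md")

# Every agent now reads identical context; edit one file, all agents update.
texts = {(root / name / "CLAUDE.md").read_text() for name in agents}
print(len(texts))  # 1
```

A copy-based setup drifts the moment someone edits one agent's file and forgets the others; the symlink (or a Git submodule, per guide 03) makes that failure mode impossible.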
Each template is a working .md file with six blocks:
- Persona — who the AI is being
- Core Rules — non-negotiables and taboo moves
- Knowledge Base — domain facts the AI should treat as ground truth
- Examples — what good output looks like
- Anti-examples — what to avoid, and why
- Recovery Layer — how to detect and fix drift mid-conversation
Pick a template, fill in the blocks, save the file. Twenty minutes to a working context.
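As a rough illustration (not a copy of any file in templates/), a filled-in skeleton with one example line per block might look like:

```markdown
# Persona
You are a senior macro analyst writing for retail traders.

# Core Rules
- Never give individualized financial advice.

# Knowledge Base
- House view: liquidity cycles lead price by 4–8 weeks.

# Examples
> "DXY rolling over while M2 inflects up is a risk-on tailwind. Chart below."

# Anti-examples
- Hype language ("to the moon!") breaks the analyst persona.

# Recovery Layer
If output drifts from the persona, re-read Core Rules and restate the task.
```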
Each guide is ~5 minutes to read, with copy-paste setups:
- Prompt Caching Setup — The 90% discount most teams miss (Claude + GPT API examples)
- Git Workflow for Context — Version, branch, diff, and roll back your AI behavior
- Multi-Agent Sync — Shared context across an agent swarm, with symlink and submodule patterns
- Model Portability — Move the same context between Claude, GPT, Gemini, and local models
- Context Router — Load different contexts per task type (the layer above persistent context)
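The router layer can start as a few lines of Python. A minimal sketch with a hypothetical keyword table — the real routing rules (and file names) are whatever fits your operation:

```python
# Hypothetical routing table: task keyword -> context file to load.
ROUTES = {
    "trade": "examples/wild-sniper-trading.md",
    "post": "examples/wildeconforce-efa-lite.md",
    "customer": "examples/allcare-service-ops.md",
}
DEFAULT_CONTEXT = "templates/solo-operator.md"

def route_context(task: str) -> str:
    """Pick which context file to load for a task, by keyword match."""
    task_lower = task.lower()
    for keyword, path in ROUTES.items():
        if keyword in task_lower:
            return path
    return DEFAULT_CONTEXT

print(route_context("Draft a customer reply about pricing"))
# examples/allcare-service-ops.md
```

Load the returned file as the cached system prefix and you get per-task context without per-task re-explaining.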
Skip it for:
- One-off, throwaway questions
- Public chatbots where no stable persona exists
- Tasks where the user's prompt already is the whole context
This kit is for repeated, structured work where you'd otherwise re-explain yourself every time.
The term started surfacing in 2025 and has been gaining traction through 2026 as people realized prompt engineering was the smaller half of the problem. Prompting is about wording. Context engineering is about what gets loaded, when, and how persistently.
This repo is a practical implementation of that idea — no theory, no new framework, just Markdown files treated like infrastructure.
PRs welcome for:
- Domain-specific template packs (legal, medical, SaaS ops, research, etc.)
- Translations (Korean, Japanese, Chinese especially — the term is still being localized)
- Additional tool integrations (Cursor, Windsurf, Continue, etc.)
- Case study writeups from your own production systems
See CONTRIBUTING.md.
Maintained by Jack — operator of WildEconForce (Korean macro + capital flow analysis, 1B+ views), the EFA System, the WILD_SNIPER trading bot, VERICUM (C2PA content authenticity marketplace), and Allcare (local service business in Jamsil, Seoul).
Patterns in this repo are extracted from systems that run every day, not from whiteboards.
MIT — use it, fork it, build templates on top and sell them. Attribution appreciated, not required.
If this saves you a few hundred dollars in API bills or a few hours of re-explaining your project, a ⭐ helps others find it.