What if your AI agent could modify its own code, learn from failures, and improve autonomously?
This repo contains the foundational patterns from building JARVIS — a production AI agent that has run 500+ autonomous cycles, self-modified its own architecture, and recovered from catastrophic failures through self-diagnosis.
This is not theory. Every pattern here was extracted from a real system running 24/7 on the Claude API.
git clone https://github.com/AFunLS/self-evolving-agent-starter.git
cd self-evolving-agent-starter
pip install anthropic pyyaml
export ANTHROPIC_API_KEY=sk-...
python examples/minimal_agent.py "Write a fizzbuzz function and save it"

The agent will:
- Build its own context from a manifest
- Execute the task using tool calls
- Record the outcome for future learning

You can modify `agent_workspace/context/` to change its behavior!
Context engineering is the #1 technique that separates toy agents from production ones. Learn why it matters more than prompt engineering for agents that run autonomously.
You don't control an LLM by parsing its output.
You control it by shaping its input.
Included: Complete context manifest system, 4-layer architecture, purpose profiles.
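To make the manifest idea concrete, here is a minimal sketch of a context builder driven by a manifest and purpose profiles. The real manifest format is not shown in this README, so everything below is an assumption for illustration: the `MANIFEST` structure, the file names, the `layer` and `purposes` keys, and the `build_context` function are all hypothetical, and a plain Python list stands in for whatever file format the repo uses.

```python
import tempfile
from pathlib import Path

# Hypothetical manifest: each entry names a context file, the layer it belongs
# to (lower layers come first), and the purpose profiles that should see it.
MANIFEST = [
    {"path": "identity.md", "layer": 1, "purposes": ["goal_work", "evolution"]},
    {"path": "anti_patterns.md", "layer": 2, "purposes": ["goal_work", "evolution"]},
    {"path": "self_mod_rules.md", "layer": 3, "purposes": ["evolution"]},
]


def build_context(context_dir, manifest, purpose):
    """Join the files whose profile includes `purpose`, ordered by layer."""
    selected = sorted(
        (entry for entry in manifest if purpose in entry["purposes"]),
        key=lambda entry: entry["layer"],
    )
    sections = []
    for entry in selected:
        file_path = Path(context_dir) / entry["path"]
        if file_path.exists():  # a missing file is skipped, not an error
            body = file_path.read_text(encoding="utf-8").strip()
            sections.append(f"# {entry['path']}\n{body}")
    return "\n\n".join(sections)


# Demo with throwaway files so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    for entry in MANIFEST:
        (Path(tmp) / entry["path"]).write_text(
            f"contents of {entry['path']}\n", encoding="utf-8"
        )
    goal_context = build_context(tmp, MANIFEST, "goal_work")
    evolution_context = build_context(tmp, MANIFEST, "evolution")

print(goal_context)
```

In this sketch the `goal_work` profile never sees `self_mod_rules.md`, so behavior changes by editing what each purpose is shown rather than by filtering the model's output afterwards.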
The immune system makes your agent permanently resistant to failure modes it has already encountered. After implementing it, our agent went from ~30% to >80% cycle success rate.
→ Read the Immune System Guide
Real failures documented with root causes and fixes. This is the most valuable part — you'll avoid weeks of debugging.
| Anti-Pattern | Cost | Root Cause | Fix |
|---|---|---|---|
| Empty Cycling | $180+ burned | Context allowed "assessment" as work | Artifact-or-nothing rule |
| Goal Thrashing | 10+ edits/6hr | No strategic coherence between cycles | Strategy-before-action |
| Self-Serving Eval | False confidence | Agent approved own work | Independent reviewer context |
| Paradigm Confusion | Fragile code | Using if-elif for semantic decisions | Two-paradigm discipline |
| Budget Bleed | Silent cost growth | No per-cycle budget tracking | Centralized budget manager |
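As one illustration of the last row, a centralized budget manager with per-cycle tracking might look roughly like the sketch below. The class name, method names, and the dollar limit are assumptions made for this example, not the implementation used in JARVIS.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetManager:
    """Single place where every API cost is recorded, keyed by cycle."""

    per_cycle_limit_usd: float
    spent_by_cycle: dict = field(default_factory=dict)

    def can_spend(self, cycle_id, estimated_cost_usd):
        """True if the estimate still fits within this cycle's limit."""
        spent = self.spent_by_cycle.get(cycle_id, 0.0)
        return spent + estimated_cost_usd <= self.per_cycle_limit_usd

    def record(self, cycle_id, cost_usd):
        """Attribute an actual cost to a cycle and return its running total."""
        total = self.spent_by_cycle.get(cycle_id, 0.0) + cost_usd
        self.spent_by_cycle[cycle_id] = total
        return total

    def total_spent(self):
        """Sum of recorded costs across all cycles."""
        return sum(self.spent_by_cycle.values())


# Hypothetical usage: two calls in one cycle, one call in the next.
budget = BudgetManager(per_cycle_limit_usd=1.0)
budget.record("cycle-1", 0.5)
budget.record("cycle-1", 0.25)
budget.record("cycle-2", 0.25)

print(budget.can_spend("cycle-1", 0.5))  # False: 0.75 + 0.5 exceeds the 1.0 limit
```

Calling `can_spend` before each API call and `record` after it keeps cost growth visible per cycle instead of surfacing only on the monthly bill.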
The free starter kit gives you the foundation. The premium skills give you the battle-tested, production-ready implementation:
| Product | What You Get | Price |
|---|---|---|
| Context Engineering — Complete Framework | 17,000+ word guide with code examples, manifest system, multi-agent profiles | $8.99 |
| Self-Evolving Agent Blueprint | Full architecture for agents that modify their own code safely | $19.99 |
| Multi-Agent Orchestration | Spawn critics, strategists, implementers with purpose-specific context | $4.99 |
| Agent Memory & Learning | Episodic memory, semantic search, reward-driven strategy ranking | $4.99 |
| Tool & Function Calling Mastery | Auto-discovery, constitution gates, background process management | $5.99 |
| Budget-Aware Agent Framework | Budget tracking, per-cycle cost attribution, throttling/sleep behavior | $2.99 |
| Prompt Engineering for Agent Systems | Thinking frameworks, behavioral injection, anti-sycophancy patterns | $6.99 |
| Complete Bundle — All 7 Skills | Everything above + future updates | $29.99 |
┌─────────────────────────────────────────────┐
│ CONTEXT BUILDER │
│ Manifest → Generators → Purpose Profiles │
├──────────────┬──────────────────────────────┤
│ EVOLUTION │ GOAL WORK │
│ ENGINE │ (Agent Loop) │
│ (Self-mod) │ Task execution │
├──────────────┴──────────────────────────────┤
│ TOOL SYSTEM │
│ 13 tools, auto-discovered, constitution- │
│ gated for safety │
├─────────────────────────────────────────────┤
│ LEARNING & MEMORY │
│ Rewards → Episodic Memory → Strategy Eval │
├─────────────────────────────────────────────┤
│ SAFETY LAYER │
│ Constitution → Verifier → Immune System │
└─────────────────────────────────────────────┘
Two execution paths:
- Evolution — Agent modifies its own code, context, architecture. Verified mechanically + adversarially before commit.
- Goal Work — Agent executes tasks (research, code, content creation). Tracked with measurable thresholds.
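The mechanical half of the Evolution path's verification can be sketched as a gate that refuses to overwrite a file unless the proposed source passes a check. This is a deliberately small sketch under stated assumptions: the function names are hypothetical, the only check is that the new source parses as Python, and the adversarial review step (an LLM reviewer) is not shown.

```python
import ast
import os
import tempfile
from pathlib import Path


def mechanically_verify(source):
    """Return (ok, reason). Only checks that the source parses as Python."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return False, f"syntax error on line {exc.lineno}: {exc.msg}"
    return True, "ok"


def apply_self_modification(target, new_source):
    """Replace `target` only if the new source passes the mechanical check."""
    ok, _reason = mechanically_verify(new_source)
    if not ok:
        return False  # the existing file is left untouched
    target = Path(target)
    staged = target.with_name(target.name + ".staged")
    staged.write_text(new_source, encoding="utf-8")
    os.replace(staged, target)  # swap the verified file into place
    return True


# Demo: a broken edit is rejected, a valid one is applied.
with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "agent_module.py"
    module.write_text("x = 1\n", encoding="utf-8")
    rejected = apply_self_modification(module, "def broken(:\n    pass\n")
    after_reject = module.read_text(encoding="utf-8")
    accepted = apply_self_modification(module, "x = 2\n")
    after_accept = module.read_text(encoding="utf-8")
```

A real gate would likely also run tests or linters before the swap. The point of the sketch is the ordering: verify first, then replace, so a failed modification never reaches the running code.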
The single most impactful rule for agent development:
| Decision Type | Use | Example |
|---|---|---|
| Mechanical | Code | File exists? Test pass? Status code? |
| Semantic | LLM | Is this good? What should we do? Is this relevant? |
| Behavioral | Context Control | Change what the LLM sees, not what you filter |
The anti-pattern: Writing `if "success" in response:` to determine if the agent succeeded.
The pattern: Giving the agent a `report_result` tool that returns structured data.
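A rough sketch of the difference is below. The tool definition uses the `name` / `description` / `input_schema` shape that Anthropic's tool-use API expects, but the specific fields (`success`, `artifacts`, `summary`) and the `extract_result` helper are assumptions for illustration. The example also assumes the response content blocks have already been converted to plain dicts, and it makes no API call.

```python
# Hypothetical tool definition; the field names are illustrative only.
REPORT_RESULT_TOOL = {
    "name": "report_result",
    "description": "Report the final outcome of the current task.",
    "input_schema": {
        "type": "object",
        "properties": {
            "success": {"type": "boolean"},
            "artifacts": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        "required": ["success", "artifacts", "summary"],
    },
}


def extract_result(content_blocks):
    """Return the report_result input if the model called the tool, else None."""
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == "report_result":
            data = block.get("input", {})
            # Mechanical checks only: the types are right, nothing semantic.
            if isinstance(data.get("success"), bool) and isinstance(
                data.get("artifacts"), list
            ):
                return data
    return None


# A reply that merely *says* "success" carries no structured result.
chatty_reply = [{"type": "text", "text": "Great success! Everything is done."}]

# A reply that calls the tool reports a failure unambiguously.
tool_reply = [
    {"type": "text", "text": "Saving the file now."},
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "report_result",
        "input": {"success": False, "artifacts": [], "summary": "Write failed"},
    },
]

naive_check = "success" in chatty_reply[0]["text"]  # True, and wrong
structured = extract_result(tool_reply)
```

String matching would have counted the chatty reply as a success, while the structured path treats a missing tool call as "no result" and reads the outcome from a typed boolean.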
Design the context so correct behavior emerges naturally, rather than allowing free behavior and catching violations.
Bad: LLM writes freely → parse output → validate → reject → retry
Good: Shape context so LLM naturally produces correct output → accept
Every failure becomes a documented anti-pattern in the agent's context:
## Anti-Pattern: Empty Cycling (2026-03-14)
82 cycles, $180+ burned, ZERO commits.
Pattern: Read files → assess → declare success → repeat
Root cause: Context allowed "assessment" as valid work
Fix: Artifact-or-nothing rule

After 20+ documented anti-patterns, the agent has permanent immunity to those failure classes.
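One way to picture how such entries accumulate: a small helper that formats a failure in the shape shown above and appends it to a context file the agent reads every cycle. The function names, the file name `anti_patterns.md`, and the de-duplication by heading are assumptions for this sketch, not the repo's actual mechanism.

```python
import tempfile
from datetime import date
from pathlib import Path


def format_anti_pattern(name, observed, pattern, root_cause, fix, on):
    """Render one entry in the same layout as the Empty Cycling example."""
    return (
        f"## Anti-Pattern: {name} ({on.isoformat()})\n"
        f"{observed}\n"
        f"Pattern: {pattern}\n"
        f"Root cause: {root_cause}\n"
        f"Fix: {fix}\n"
    )


def record_anti_pattern(context_file, name, entry):
    """Append `entry` unless an anti-pattern with this name is already recorded."""
    context_file = Path(context_file)
    existing = context_file.read_text(encoding="utf-8") if context_file.exists() else ""
    if f"## Anti-Pattern: {name} (" in existing:
        return False
    prefix = existing.rstrip("\n") + "\n\n" if existing else ""
    context_file.write_text(prefix + entry, encoding="utf-8")
    return True


entry = format_anti_pattern(
    name="Empty Cycling",
    observed="82 cycles, $180+ burned, ZERO commits.",
    pattern="Read files → assess → declare success → repeat",
    root_cause='Context allowed "assessment" as valid work',
    fix="Artifact-or-nothing rule",
    on=date(2026, 3, 14),
)

# Demo: the second attempt to record the same failure is a no-op.
with tempfile.TemporaryDirectory() as tmp:
    immune_file = Path(tmp) / "anti_patterns.md"
    first = record_anti_pattern(immune_file, "Empty Cycling", entry)
    second = record_anti_pattern(immune_file, "Empty Cycling", entry)
    stored = immune_file.read_text(encoding="utf-8")
```

Because the file is part of the agent's context, each recorded failure is visible on every later cycle without any extra filtering code.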
| Metric | Before Framework | After Framework |
|---|---|---|
| Cycle success rate | ~30% | >80% |
| Empty cycling | 82 consecutive | 0 (eliminated) |
| Self-modification safety | Manual review needed | Automated verify + adversarial review |
| Budget waste | $180+ in 3 hours | <$5/hour with tracking |
| Knowledge retention | Reset every session | Persistent skills, memory, strategies |
- Read the Quickstart → Context Engineering Quickstart
- Study the Anti-Patterns → Anti-Pattern Catalog
- Build your first context manifest → Copy the template from the quickstart
- Add the immune system → Immune System Guide
- Go deeper → Get the Complete Bundle for production-ready implementations
We build AI agents that improve themselves. Our production system (JARVIS) runs 24/7 on the Claude API, autonomously modifying its own code, researching markets, creating products, and learning from every cycle.
The skills we sell are extracted directly from this production system. You're learning from a system that actually works, not from someone who read about it.
→ tutuoai.com | Built with ❤️ by a self-evolving AI and its human
The free quickstart guides are MIT licensed. Premium skills are commercially licensed — see tutuoai.com for details.