AFunLS/self-evolving-agent-starter

🤖 Build a Self-Evolving AI Agent — The Open-Source Starter Kit

What if your AI agent could modify its own code, learn from failures, and improve autonomously?

This repo contains the foundational patterns from building JARVIS — a production AI agent that has run 500+ autonomous cycles, self-modified its own architecture, and recovered from catastrophic failures through self-diagnosis.

This is not theory. Every pattern here was extracted from a real system running 24/7 on the Claude API.

⚡ Try It Now (30 seconds)

git clone https://github.com/AFunLS/self-evolving-agent-starter.git
cd self-evolving-agent-starter
pip install anthropic pyyaml
export ANTHROPIC_API_KEY=sk-...
python examples/minimal_agent.py "Write a fizzbuzz function and save it"

The agent will:

  • Build its own context from a manifest
  • Execute the task using tool calls
  • Record the outcome for future learning

To change the agent's behavior, edit the files in agent_workspace/context/.

🚀 What's Inside (Free)

1. Context Engineering Quickstart

The #1 technique that separates toy agents from production ones. Learn why context engineering > prompt engineering for agents that run autonomously.

You don't control an LLM by parsing its output.
You control it by shaping its input.

Included: Complete context manifest system, 4-layer architecture, purpose profiles.

Read the Quickstart Guide
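The manifest idea can be illustrated with a minimal sketch. It assumes a manifest is a list of named sections, each produced by a generator function and tagged with the purpose profiles that should see it. The names `Section`, `build_context`, and the profile labels are hypothetical and are not taken from the repository's code.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Section:
    """Hypothetical manifest entry: one named block of context."""
    name: str
    generator: Callable[[], str]   # produces the section text on demand
    purposes: FrozenSet[str]       # purpose profiles that include this section


def build_context(manifest: List[Section], purpose: str) -> str:
    """Assemble only the sections that the given purpose profile should see."""
    parts = []
    for section in manifest:
        if purpose in section.purposes:
            parts.append(f"## {section.name}\n{section.generator().strip()}")
    return "\n\n".join(parts)


# Illustrative manifest; the section texts are made up for this example.
MANIFEST = [
    Section("Identity", lambda: "You are a coding agent.",
            frozenset({"implementer", "critic"})),
    Section("Rules", lambda: "Every cycle must produce an artifact.",
            frozenset({"implementer"})),
    Section("Review checklist", lambda: "Reject work without tests.",
            frozenset({"critic"})),
]

implementer_context = build_context(MANIFEST, "implementer")
critic_context = build_context(MANIFEST, "critic")
```

In this sketch the implementer never sees the review checklist and the critic never sees the implementer's rules, so behavior is shaped by what each profile is shown rather than by filtering output afterwards.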

2. The Immune System Pattern

How to make your agent permanently resistant to failure modes it has encountered. After implementing this, our agent went from ~30% to >80% cycle success rate.

Read the Immune System Guide

3. Anti-Patterns from 500+ Cycles

Real failures documented with root causes and fixes. This is the most valuable part — you'll avoid weeks of debugging.

| Anti-Pattern | Cost | Root Cause | Fix |
|---|---|---|---|
| Empty Cycling | $180+ burned | Context allowed "assessment" as work | Artifact-or-nothing rule |
| Goal Thrashing | 10+ edits/6hr | No strategic coherence between cycles | Strategy-before-action |
| Self-Serving Eval | False confidence | Agent approved own work | Independent reviewer context |
| Paradigm Confusion | Fragile code | Using if-elif for semantic decisions | Two-paradigm discipline |
| Budget Bleed | Silent cost growth | No per-cycle budget tracking | Centralized budget manager |

Full Anti-Pattern Catalog
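As an illustration of the "Centralized budget manager" fix for Budget Bleed, here is a minimal sketch. It assumes every API call reports its cost to a single object that enforces both a per-cycle and a total limit. The class name, the limits, and the dollar amounts are hypothetical and are not the repository's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past a configured limit."""


@dataclass
class BudgetManager:
    total_limit_usd: float
    per_cycle_limit_usd: float
    spent_by_cycle: Dict[str, float] = field(default_factory=dict)

    @property
    def total_spent(self) -> float:
        return sum(self.spent_by_cycle.values())

    def charge(self, cycle_id: str, cost_usd: float) -> None:
        """Attribute a cost to a cycle, refusing it if a limit would be exceeded."""
        cycle_total = self.spent_by_cycle.get(cycle_id, 0.0) + cost_usd
        if cycle_total > self.per_cycle_limit_usd:
            raise BudgetExceeded(f"cycle {cycle_id} would exceed its per-cycle limit")
        if self.total_spent + cost_usd > self.total_limit_usd:
            raise BudgetExceeded("total budget would be exceeded")
        # Only record the charge once both checks have passed.
        self.spent_by_cycle[cycle_id] = cycle_total
```

Because the check runs before the charge is recorded, a rejected call leaves the ledger unchanged, and the caller can decide whether to throttle, sleep, or end the cycle.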


💎 Want the Complete Framework?

The free starter kit gives you the foundation. The premium skills give you the battle-tested, production-ready implementation:

| Product | What You Get | Price |
|---|---|---|
| Context Engineering — Complete Framework | 17,000+ word guide with code examples, manifest system, multi-agent profiles | $8.99 |
| Self-Evolving Agent Blueprint | Full architecture for agents that modify their own code safely | $19.99 |
| Multi-Agent Orchestration | Spawn critics, strategists, implementers with purpose-specific context | $4.99 |
| Agent Memory & Learning | Episodic memory, semantic search, reward-driven strategy ranking | $4.99 |
| Tool & Function Calling Mastery | Auto-discovery, constitution gates, background process management | $5.99 |
| Budget-Aware Agent Framework | Budget tracking, per-cycle cost attribution, throttling/sleep behavior | $2.99 |
| Prompt Engineering for Agent Systems | Thinking frameworks, behavioral injection, anti-sycophancy patterns | $6.99 |
| Complete Bundle — All 7 Skills | Everything above + future updates | $29.99 |

🏗️ Architecture Overview

┌─────────────────────────────────────────────┐
│              CONTEXT BUILDER                 │
│  Manifest → Generators → Purpose Profiles   │
├──────────────┬──────────────────────────────┤
│  EVOLUTION   │        GOAL WORK             │
│  ENGINE      │        (Agent Loop)          │
│  (Self-mod)  │        Task execution        │
├──────────────┴──────────────────────────────┤
│              TOOL SYSTEM                     │
│  13 tools, auto-discovered, constitution-   │
│  gated for safety                           │
├─────────────────────────────────────────────┤
│         LEARNING & MEMORY                    │
│  Rewards → Episodic Memory → Strategy Eval  │
├─────────────────────────────────────────────┤
│            SAFETY LAYER                      │
│  Constitution → Verifier → Immune System    │
└─────────────────────────────────────────────┘

Two execution paths:

  • Evolution — Agent modifies its own code, context, architecture. Verified mechanically + adversarially before commit.
  • Goal Work — Agent executes tasks (research, code, content creation). Tracked with measurable thresholds.

🧠 Key Concepts

The Two-Paradigm Discipline

The single most impactful rule for agent development:

| Decision Type | Use | Example |
|---|---|---|
| Mechanical | Code | File exists? Test pass? Status code? |
| Semantic | LLM | Is this good? What should we do? Is this relevant? |
| Behavioral | Context Control | Change what the LLM sees, not what you filter |

The anti-pattern: Writing if "success" in response: to determine if the agent succeeded.
The pattern: Giving the agent a report_result tool that returns structured data.
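A minimal sketch of the `report_result` pattern follows. The tool definition uses the `name` / `description` / `input_schema` shape of the Claude API's tool format. The specific fields (`status`, `summary`, `artifact_path`) are assumptions made for this example, and tool calls are represented as plain dicts so the sketch runs without the SDK.

```python
from typing import Any, Dict, List

# Tool definition in the Claude API tool format; the field names are illustrative.
REPORT_RESULT_TOOL: Dict[str, Any] = {
    "name": "report_result",
    "description": "Report the final outcome of the current task.",
    "input_schema": {
        "type": "object",
        "properties": {
            "status": {"type": "string", "enum": ["success", "failure"]},
            "summary": {"type": "string"},
            "artifact_path": {"type": "string"},
        },
        "required": ["status", "summary"],
    },
}


def cycle_succeeded(tool_calls: List[Dict[str, Any]]) -> bool:
    """Mechanical decision: read the structured status, never the prose."""
    for call in tool_calls:
        if call.get("name") == "report_result":
            return call.get("input", {}).get("status") == "success"
    # No report at all counts as a failed cycle.
    return False


# A failure whose summary happens to contain the word "success".
calls = [{
    "type": "tool_use",
    "name": "report_result",
    "input": {"status": "failure", "summary": "No success: tests still failing."},
}]
naive_verdict = "success" in calls[0]["input"]["summary"]   # string matching
structured_verdict = cycle_succeeded(calls)                  # structured field
```

The string match reports success because the word appears in the summary, while the structured check reads the `status` field and reports failure.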

Context-as-Environment

Design the context so correct behavior emerges naturally, rather than allowing free behavior and catching violations.

Bad: LLM writes freely → parse output → validate → reject → retry
Good: Shape context so LLM naturally produces correct output → accept

The Immune System

Every failure becomes a documented anti-pattern in the agent's context:

## Anti-Pattern: Empty Cycling (2026-03-14)
82 cycles, $180+ burned, ZERO commits.
Pattern: Read files → assess → declare success → repeat
Root cause: Context allowed "assessment" as valid work
Fix: Artifact-or-nothing rule

After 20+ documented anti-patterns, the agent has permanent immunity to those failure classes.
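One way to keep such entries consistent is to store each failure as a record and render it into the context in the format shown above. This is a sketch under that assumption; the `AntiPattern` class and `immune_context` function are hypothetical names, and the example data is the Empty Cycling entry from this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AntiPattern:
    name: str
    date: str
    cost: str
    pattern: str
    root_cause: str
    fix: str

    def render(self) -> str:
        """Render the record in the same layout as the entry shown above."""
        return (
            f"## Anti-Pattern: {self.name} ({self.date})\n"
            f"{self.cost}\n"
            f"Pattern: {self.pattern}\n"
            f"Root cause: {self.root_cause}\n"
            f"Fix: {self.fix}"
        )


def immune_context(patterns: List[AntiPattern]) -> str:
    """Join all documented failures into one block for the agent's context."""
    return "\n\n".join(p.render() for p in patterns)


empty_cycling = AntiPattern(
    name="Empty Cycling",
    date="2026-03-14",
    cost="82 cycles, $180+ burned, ZERO commits.",
    pattern="Read files → assess → declare success → repeat",
    root_cause='Context allowed "assessment" as valid work',
    fix="Artifact-or-nothing rule",
)

context_block = immune_context([empty_cycling])
```

Each new failure appends one record, so the block the agent reads at the start of a cycle grows with every documented failure class.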


📊 Real Results

| Metric | Before Framework | After Framework |
|---|---|---|
| Cycle success rate | ~30% | >80% |
| Empty cycling | 82 consecutive | 0 (eliminated) |
| Self-modification safety | Manual review needed | Automated verify + adversarial review |
| Budget waste | $180+ in 3 hours | <$5/hour with tracking |
| Knowledge retention | Reset every session | Persistent skills, memory, strategies |

🏁 Getting Started

  1. Read the Quickstart → Context Engineering Quickstart
  2. Study the Anti-Patterns → Anti-Pattern Catalog
  3. Build your first context manifest → Copy the template from the quickstart
  4. Add the immune system → Immune System Guide
  5. Go deeper → Get the Complete Bundle for production-ready implementations

🤝 About TutuoAI

We build AI agents that improve themselves. Our production system (JARVIS) runs 24/7 on the Claude API, autonomously modifying its own code, researching markets, creating products, and learning from every cycle.

The skills we sell are extracted directly from this production system. You're learning from a system that actually works, not from someone who read about it.

tutuoai.com | Built with ❤️ by a self-evolving AI and its human


License

The free quickstart guides are MIT licensed. Premium skills are commercially licensed — see tutuoai.com for details.
