coreycottrell/agentmind

AgentMind — Intelligence Routing Layer for AiCIV

The fleet doesn't need to know who answered. It needs to know the answer was good enough.


What This Is

AgentMind is the model routing layer for AiCIV civilizations. It sits between your agent fleet and multiple inference backends, routing each request to the cheapest model that can handle it.

60 agents all hitting Opus = ~$8,100/month. 60 agents through AgentMind (70% bulk / 20% standard / 10% frontier) = ~$340/month.

That's 24x cost reduction with zero quality degradation on the tasks that matter.

But cost savings are just the start. AgentMind is where the Hyperagent self-improvement loop meets the protocol layer — every inference call produces an Envelope, feeds the skill auditor, and makes the entire system smarter over time.


The Three Tiers

| Tier | Models | Cost | Use Cases |
|------|--------|------|-----------|
| T1 — Bulk | Llama 3.3 70B (Groq), Mixtral (Together), Qwen 3 (Fireworks) | $0.05–0.30/Mtok | Health checks, message routing, file triage, slot extraction, memory search, heartbeats |
| T2 — Standard | Claude Haiku 4.5, Claude Sonnet 4.6 | $0.80–3.00/Mtok | Code generation, document analysis, agent dialogue, skill execution, research synthesis |
| T3 — Frontier | Claude Opus 4.6 | $15/Mtok | Architecture decisions, legal analysis, constitutional amendments, deep research, novel problem solving |

The insight: 70% of agent calls are simple classification, routing, and triage. These don't need $15/Mtok frontier reasoning. They need good enough at cheap enough.


How It Connects to Everything

AgentMind + Hyperagent Skills = Self-Improving Intelligence Routing

This repo includes 5 self-improvement skills inspired by Meta's Hyperagents paper (arxiv 2603.19461). Together with AgentMind, they form a complete self-improving intelligence system:

                    ┌──────────────────────┐
                    │     AgentMind        │
                    │  (routes requests    │
                    │   to cheapest model) │
                    └──────────┬───────────┘
                               │
                    Envelope on every call
                    (tier, cost, latency, agent, skill)
                               │
              ┌────────────────┼────────────────┐
              │                │                │
              ▼                ▼                ▼
┌────────────────────┐ ┌─────────────────┐ ┌──────────────────┐
│ skill-effectiveness│ │ self-improving  │ │ cross-domain     │
│ -auditor           │ │ -delegation     │ │ -transfer        │
│                    │ │                 │ │                  │
│ "Which skills are  │ │ "Are we routing │ │ "Did a routing   │
│  burning T3 calls  │ │  to the right   │ │  improvement in  │
│  when T1 would     │ │  team lead?"    │ │  fleet transfer  │
│  suffice?"         │ │                 │ │  to research?"   │
└─────────┬──────────┘ └────────┬────────┘ └────────┬─────────┘
          │                     │                    │
          ▼                     ▼                    ▼
┌──────────────────┐ ┌─────────────────┐
│ hyperagent       │ │ meta-curriculum │
│ -archive         │ │ -evolution      │
│                  │ │                 │
│ "Keep variants   │ │ "Is our nightly │
│  of what worked  │ │  training       │
│  AND what didn't"│ │  teaching the   │
│                  │ │  right things?" │
└──────────────────┘ └─────────────────┘

The loop: AgentMind routes calls → produces Envelopes → Envelopes feed the auditor → auditor identifies misrouted skills → delegation improves → curriculum evolves → cross-domain transfers propagate improvements → archive keeps all variants → AgentMind config updates → better routing → repeat.

The meta-layer: Each skill can improve itself. The curriculum evolution skill checks whether its own adjustments are improving brief quality. The delegation skill checks whether its pattern extraction is actually reducing misroutes. This is the Hyperagents paper's core insight implemented at civilization scale.


Repository Structure

agentmind/
├── README.md                    # This file
├── SPEC.md                      # Full AgentMind specification (v0.1.0-draft)
│                                 # Architecture, API, tiers, NATS, budgets, Envelopes
│
├── skills/                      # Hyperagent self-improvement skills
│   ├── meta-curriculum-evolution.md    # Training rewrites its own curriculum
│   ├── self-improving-delegation.md    # CEO Rule routing learns from mistakes
│   ├── skill-effectiveness-auditor.md  # Fitness scoring for all skills (A-F tiers)
│   ├── hyperagent-archive.md           # Evolutionary DAG of skill variants
│   └── cross-domain-transfer.md        # Meta-improvements propagate across verticals
│
├── server.py                    # [TODO] FastAPI service
├── classifier.py                # [TODO] Tier classifier
├── providers/                   # [TODO] Backend adapters
│   ├── groq.py
│   ├── anthropic.py
│   ├── together.py
│   ├── fireworks.py
│   └── local.py                 # Ollama adapter
├── envelope.py                  # [TODO] APS Envelope production
├── budget.py                    # [TODO] Cost tracking + throttling
├── config.yaml                  # [TODO] Provider registry + skill-tier mapping
├── Dockerfile                   # [TODO]
├── docker-compose.yml           # [TODO]
└── tests/                       # [TODO]

Exploration Steps (How to Start)

Step 1: Read the Spec

cat SPEC.md

The full architecture: tiers, routing, auth, NATS, budgets, Envelopes, API endpoints. This is the blueprint.

Step 2: Read the Hyperagent Skills

for f in skills/*.md; do echo "=== $f ==="; head -30 "$f"; echo; done

These define the self-improvement loop that sits on top of AgentMind.

Step 3: Get API Keys for T1 Providers

# Sign up (all have free tiers):
# - Groq: console.groq.com (free tier: 30 req/min)
# - Together: api.together.xyz (free $25 credit)
# - Fireworks: fireworks.ai (free tier available)

# Add to .env:
echo "GROQ_API_KEY=gsk_..." >> .env
echo "TOGETHER_API_KEY=..." >> .env
echo "FIREWORKS_API_KEY=..." >> .env

Step 4: Build the Classifier First (Smallest Useful Piece)

The classifier is just a function: (messages, metadata, tools) → tier. Start with rule-based (skill mapping from SPEC.md Section 5.2). No ML needed for v1.

# classifier.py — the core routing decision
# SKILL_TIERS, ROLE_MINIMUMS, and max_tier() come from the skill-tier
# mapping in config.yaml (SPEC.md Section 5.2).
def classify(messages: list, metadata: dict, tools: list) -> str:
    """Returns 'T1', 'T2', or 'T3'."""

    # Explicit override takes priority
    if metadata.get("tier_override"):
        return metadata["tier_override"]

    # Tool use present → T2 minimum
    if tools:
        return max_tier("T2", metadata.get("tier_hint", "T2"))

    # Skill mapping from config
    skill = metadata.get("skill", "")
    tier = SKILL_TIERS.get(skill)
    if tier:
        return tier

    # Per-role minimum tier
    role = metadata.get("agent", "")
    role_min = ROLE_MINIMUMS.get(role)
    if role_min:
        return role_min

    # Default to T1 (cheapest)
    return metadata.get("tier_hint", "T1")
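The helper tables the classifier references live in config, not in classifier.py. A minimal sketch of their shape — the entries below are illustrative assumptions, not values from SPEC.md:

```python
# Illustrative skill→tier and role→minimum-tier tables; the real values
# belong in config.yaml (SPEC.md Section 5.2). Entries here are examples only.
SKILL_TIERS = {
    "health-check": "T1",
    "message-routing": "T1",
    "code-generation": "T2",
    "legal-analysis": "T3",
}

ROLE_MINIMUMS = {
    "ceo": "T2",  # example: leadership agents never drop below T2
}

_TIER_ORDER = {"T1": 1, "T2": 2, "T3": 3}

def max_tier(a: str, b: str) -> str:
    """Return the more capable (more expensive) of two tiers."""
    return a if _TIER_ORDER[a] >= _TIER_ORDER[b] else b
```

Keeping the tables in config means routing policy changes without a redeploy — the self-improvement loop can rewrite config.yaml rather than code.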

Step 5: Build One Backend Adapter

Start with Groq (fastest, free tier, OpenAI-compatible):

# providers/groq.py
import os

import httpx

GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # loaded from .env

async def complete(messages, max_tokens=2048, temperature=0.7):
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.groq.com/openai/v1/chat/completions",
            headers={"Authorization": f"Bearer {GROQ_API_KEY}"},
            json={
                "model": "llama-3.3-70b-versatile",
                "messages": messages,
                "max_tokens": max_tokens,
                "temperature": temperature,
            },
            timeout=30,
        )
        resp.raise_for_status()  # surface 4xx/5xx instead of returning error JSON
        return resp.json()

Step 6: Wire Classifier + Backend into a Single Endpoint

# server.py — minimal viable AgentMind
@app.post("/api/v1/completions")
async def completions(request: CompletionRequest, actor=Depends(get_current_actor)):
    tier = classify(request.messages, request.metadata, request.tools)
    backend = select_backend(tier)
    response = await backend.complete(request.messages, request.max_tokens)
    envelope = produce_envelope(tier, backend, response, actor)  # audit trail (Step 8)
    return {"content": response["choices"][0]["message"]["content"], "tier": tier, ...}
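The endpoint calls select_backend, which isn't defined above. One possible shape — the registry layout and adapter names are assumptions for illustration, not from SPEC.md:

```python
# Hypothetical tier→backend registry with ordered fallback.
# Adapter names are placeholders; real adapters live in providers/.
BACKENDS = {
    "T1": ["groq", "together", "fireworks"],
    "T2": ["anthropic-haiku", "anthropic-sonnet"],
    "T3": ["anthropic-opus"],
}

def select_backend(tier: str, unavailable: frozenset[str] = frozenset()) -> str:
    """Pick the first available backend for a tier, falling back in order."""
    for name in BACKENDS[tier]:
        if name not in unavailable:
            return name
    raise RuntimeError(f"no backend available for tier {tier}")
```

An ordered list per tier gives you provider failover for free: mark a backend unavailable after a timeout and the next call routes around it.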

Step 7: Add Anthropic Backend (T2/T3)

Anthropic format differs from OpenAI. The translation is straightforward — see SPEC.md Section 6.2.
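A sketch of the common part of that translation, assuming OpenAI-style input — the authoritative field mapping is SPEC.md Section 6.2; the key difference is that Anthropic's Messages API takes the system prompt as a separate top-level field:

```python
def to_anthropic(messages: list[dict]) -> tuple[str, list[dict]]:
    """Split OpenAI-style messages into (system_prompt, chat_messages)
    for Anthropic's Messages API, which takes `system` separately."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), chat
```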

Step 8: Add Envelope Production

Every call → Envelope → audit trail. This is the first APS service to actually implement Envelopes (the spec has required them since v0.1, but no service has implemented them yet).
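The Envelope fields named in the diagram earlier (tier, cost, latency, agent, skill) suggest a minimal shape. A sketch only — the authoritative schema is in SPEC.md:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Envelope:
    """One audit record per inference call. Fields follow the diagram
    above; the full schema lives in SPEC.md."""
    tier: str          # "T1" / "T2" / "T3"
    backend: str       # which provider served the call
    agent: str         # calling agent id
    skill: str         # skill that triggered the call
    cost_usd: float    # token counts × provider pricing
    latency_ms: float
    ts: float = field(default_factory=time.time)
```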

Step 9: Add Budget Controls

SQLite counter per (civ_id, date, tier). Throttle T3→T2 at 95% of the daily budget; hard stop at 100%.
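A minimal sketch of that counter using SQLite upserts — the table and column names here are assumptions, not from SPEC.md:

```python
import sqlite3

def init_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS spend
                    (civ_id TEXT, date TEXT, tier TEXT, usd REAL,
                     PRIMARY KEY (civ_id, date, tier))""")
    return conn

def record_spend(conn, civ_id: str, date: str, tier: str, usd: float) -> None:
    # Upsert: accumulate cost per (civ_id, date, tier)
    conn.execute("""INSERT INTO spend VALUES (?, ?, ?, ?)
                    ON CONFLICT(civ_id, date, tier)
                    DO UPDATE SET usd = usd + excluded.usd""",
                 (civ_id, date, tier, usd))

def effective_tier(conn, civ_id: str, date: str,
                   tier: str, daily_budget: float) -> str:
    """Throttle T3→T2 at 95% of budget; hard stop at 100%."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(usd), 0) FROM spend WHERE civ_id=? AND date=?",
        (civ_id, date)).fetchone()
    if total >= daily_budget:
        raise RuntimeError("daily budget exhausted")
    if tier == "T3" and total >= 0.95 * daily_budget:
        return "T2"
    return tier
```

The router calls effective_tier after classify, so a budget squeeze degrades gracefully (frontier calls drop to T2) before it hard-stops.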

Step 10: Deploy to Fleet

Docker container, JWT auth via AgentAUTH, NATS subscriber for fleet-wide routing.


The Economics

| Scenario | Monthly Cost | Notes |
|----------|--------------|-------|
| All Opus (current) | ~$8,100 | 60 agents × 100 calls/day × Opus pricing |
| All Sonnet | ~$1,620 | Better but still expensive |
| AgentMind Tiered | ~$340 | 70% T1 / 20% T2 / 10% T3 |
| AgentMind + Local T0 | ~$200 | Add Ollama for truly free bulk inference |
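The all-Opus figure can be sanity-checked. A rough sketch — token volume per call is an assumption, and the tiered table figures additionally depend on the per-tier token mix, so they aren't reproduced here:

```python
CALLS_PER_MONTH = 60 * 100 * 30    # 60 agents × 100 calls/day × 30 days
TOKENS_PER_CALL = 3_000            # assumed blended in+out tokens per call
MTOK = CALLS_PER_MONTH * TOKENS_PER_CALL / 1e6  # 540 Mtok/month

all_opus = MTOK * 15.0             # $15/Mtok → $8,100/month

# Blended rate under the 70/20/10 split (mid-range tier prices assumed)
blended = 0.70 * 0.15 + 0.20 * 1.90 + 0.10 * 15.0   # $/Mtok
tiered = MTOK * blended
assert tiered < all_opus
```

Note how the blended rate is dominated by the 10% of T3 calls — which is exactly why the auditor hunts for skills burning T3 when T1 would suffice.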

For client civilizations: Each civ configures their own provider keys and budgets. AgentMind is self-hostable (docker-compose up). Protocol, not platform.

For our own inference stack: Phase 4 adds Ollama as a T0 backend. When we run our own GPUs, the T1 tier becomes free. The 70% of calls that are bulk routing/triage cost literally nothing.


Connection to Meta's Hyperagents Paper

Meta's Hyperagents (arxiv 2603.19461) showed that self-referential agents — where the improvement mechanism can improve itself — develop persistent memory, performance tracking, and cross-domain transfer autonomously.

We took this further:

  • Persistent memory → we already have this (memory system, scratchpads, agent learnings)
  • Performance tracking → skill-effectiveness-auditor (fitness scores for all 142+ skills)
  • Cross-domain transfer → cross-domain-transfer (propagate improvements across 11 verticals)
  • Evolutionary archive → hyperagent-archive (keep ALL variants, failures are stepping stones)
  • Self-improving routing → self-improving-delegation (CEO Rule learns from its own mistakes)
  • Self-improving curriculum → meta-curriculum-evolution (nightly training rewrites itself)

AgentMind is where these skills get REAL DATA. Every Envelope from AgentMind feeds the auditor, which feeds the archive, which feeds the transfer system, which feeds the curriculum, which feeds the routing — a complete self-improving intelligence loop.

The paper's biggest finding: meta-improvements transfer across domains with zero customization. Our cross-domain-transfer skill implements this. When the research vertical discovers a better prompt pattern, it propagates to all 11 verticals automatically.


What's Next

  1. This week: Build Phase 1 (classifier + Groq backend + Anthropic backend + single endpoint)
  2. Next week: NATS integration, Envelope production, budget controls
  3. Week 3-4: Fleet deployment, HUB graph integration, AGO dashboard
  4. When ready: Local inference (Ollama T0), client self-hosting

Related Resources

  • SPEC.md — Full architectural specification
  • Hyperagents paper — arxiv.org/abs/2603.19461
  • APS Protocol — projects/aiciv-hub/PROTOCOL.md (the protocol AgentMind extends)
  • AgentAUTH — projects/agentauth/ (JWT auth that AgentMind uses)
  • Berman/OpenClaw teardown — memories/knowledge/competitive/berman-openclaw-teardown-20260324.md

"The fleet doesn't need to know who answered. It needs to know the answer was good enough."

AgentMind v0.1.0-draft — authored by Corey Cottrell & A-C-Gee, March 2026
