A persistent agent team plugin for Claude Code -- 12 roles, 57 skills, that learn and improve over time. v5.8.0.
Built by xdimension.ai
Each role accumulates knowledge across sessions: strategies that work get reinforced, strategies that fail get retired. Trust levels gate permissions so agents earn autonomy through clean performance.
- Claude Code CLI
- Python 3.6+ (required by hooks)
- bash 3.2+ (macOS stock works)
- git (for knowledge tracking)
# Add the marketplace (adjust path to where you cloned the repo):
claude plugin marketplace add /path/to/ainous-team
# Install the plugin:
claude plugin install ainous-teamAfter installing the plugin, Claude Code will prompt you to initialize on the next session start. You can also start it manually at any time:
/team-initThis interactive command walks you through choosing your operating mode and then scaffolds ~/.claude/ainous-roles/ with starter files for all 12 roles. The setup is idempotent — running it again won't overwrite existing data.
Mode options:
- Coordinator-as-default (recommended): Claude automatically plans, delegates to role agents, and synthesizes results. You just type tasks naturally.
- Agent mode: You manually invoke roles with
@coordinator,@developer, etc.
Alternative — manual setup via shell:
# Coordinator-as-default mode (recommended)
bash "$(claude plugin path ainous-team)/scripts/setup.sh"
# Agent mode
bash "$(claude plugin path ainous-team)/scripts/setup.sh" --agentmodePost-commit journal reminder for coordinators: run bash scripts/install-post-commit-journal-reminder.sh once per repo to enable automatic commit-time prompts.
Before any release, run bash scripts/verify-role-infrastructure.sh to confirm all roles have complete 4-file scaffold (playbook, growth, journal, learnings) + agent definition + capability card.
By default, setup configures Claude Code to be the coordinator -- it will automatically plan, delegate to role agents, and synthesize results. You just give it tasks naturally:
# Claude automatically plans, spawns roles, and synthesizes:
implement user auth for the API # spawns researcher + architect + developer + tester
fix the login bug # spawns developer + tester
scan src/ for vulnerabilities # spawns securityTo disable later, remove the "You ARE the Coordinator" section from ~/.claude/CLAUDE.md.
If you prefer to invoke roles explicitly:
@coordinator implement user auth for the API # full team orchestration
@developer fix the login bug # direct role invocation
@security scan src/ for vulnerabilities # security audit
/team-status # dashboard
/team-history # session history
/team-alerts # health checks
/team-retro # periodic team reviewUser --> @coordinator --> plans task --> spawns role teammates --> synthesizes results
| | |
@authority contracts + verify gates
(approval gate) playbook injection (loop on failure)
- User invokes
@coordinatorwith a task - Coordinator selects topology -- fast-fix for bugfixes, security-first for auth, full-pipeline for features, etc. Then assesses parallelizability and chooses spawn mode (Agent for quick tasks, tmux for long-running).
- Coordinator spawns teammates -- each gets their playbook + project context + team knowledge + execution contract. Can spawn multiple instances of the same role for independent tasks.
- Verification gates between phases -- tests must pass, findings must be actionable, contracts must be fulfilled
- Teammates work -- message @authority for approvals, @security for scans
- Coordinator synthesizes -- presents unified result, asks for rating
- On stop -- coordinator runs team retrospective (cross-role interaction analysis) and detects user corrections (implicit feedback from manual edits)
The coordinator has Write/Bash for journal writes but delegates all implementation to role agents. It has no Edit tool. It must delegate to @developer for code, @writer for docs, @tester for tests.
ainous-team plugin
|-- 12 agents -- coordinator, developer, architect, code-quality, tester,
| researcher, writer, security, authority, consolidator, retriever, signal
|-- 57 skills -- 3 orchestration + 54 domain-expertise (see Skills Vault below)
|-- 5 commands -- /team-status, /team-history, /team-alerts, /team-retro, /team-signal
|-- 2 hooks -- SessionStart (context injection), PreToolUse (enforcement)
|-- enforcement -- script-based Write/Edit/Bash gating (fail-closed, allowlist-based)
\-- runtime charter -- shared execution semantics injected into every role spawn
The team implements patterns from recent harness engineering research:
| Pattern | Source | How It's Used |
|---|---|---|
| Execution contracts | NLAH | Every teammate spawn defines required outputs, completion conditions, and scope |
| Verification gates | NLAH + Anthropic | Between phases: plan→implement→test→review, with loops on failure |
| Counterfactual diagnosis | Meta-Harness | Consolidator compares sessions with/without strategies to isolate causal impact |
| Rich diagnostic context | Meta-Harness | Consolidator reads execution traces, not just summaries, for playbook evolution |
| Assumption staleness | Anthropic | Strategies get checked: "would the model do this naturally now?" |
| Generator-evaluator loops | Anthropic | Developer ↔ tester/code-quality iterate until tests pass (max 3 rounds) |
| Shared team memory | Multi-Agent Memory Architecture | Append-only team-knowledge.md avoids cache coherence; consolidator deduplicates |
| Voice of the user | PAHF | Implicit feedback from user corrections weighted 3x vs self-scores |
| Team retrospectives | MAR | Coordinator reflects on cross-role interaction (not self-reflection, which degenerates) |
| Multi-instance roles | A-HMAD | Same role spawned N times with heterogeneous strategy emphasis |
| Hybrid spawn modes | Anthropic | Agent tool for quick tasks, tmux panes for long-running visible work |
| Named topologies | DyLAN + Puppeteer | Coordinator selects topology per task (fast-fix, security-first, etc.); consolidator evolves from retros |
| 3-tier memory lifecycle | MemOS + EvoSC | Hot (playbook + recent 5) / warm (full journals) / cold (archived sessions); hard caps on all stores |
| Self-triggered consolidation | EvoSC + JiuwenClaw | Triple trigger: Stop hook + 2-day critical + overflow at 20 entries |
| Exploration force | Meta-Harness | Consolidator injects [experimental] strategies with maturity-decaying rate |
| Soft enforcement | Original | Main session gets NOTE when writing directly in coordinator-as-default mode |
| Failure taxonomy | NLAH + Anthropic | 7 named failure modes with prescribed recovery actions |
| Skills vault | Original + gstack + community | 57 skills across 10 domains, assigned at spawn, invoked autonomously by roles |
| Session event log | Anthropic Managed Agents | 7 event types in task-history.jsonl; enables crash recovery |
| Knowledge lint | Karpathy LLM Wiki | Consolidator detects contradictions and orphans across knowledge stores |
| Structured retrieval tags | MemPalace | Journal entries tagged by task-type and area; retriever pre-filters |
| Strategy source tagging | OEL/ERL | [from-failure] vs [from-success] attribution; heuristic format enforcement |
templates/phase-definitions.md -- structured phase metadata that topologies compose. Each phase defines entry/exit criteria (artifact-based, mechanically verifiable), roles, skills, and context instructions.
| Phase | Entry | Exit | Roles |
|---|---|---|---|
| research | Task scope defined | researcher-findings.md exists | researcher |
| design | Findings exist OR scope clear | architect-design.md exists | architect |
| implement | Design exists OR simple fix | Code changes, tests pass | developer |
| test | Implementation exists | tester-results.md exists | tester |
| review | Tests pass | All CRITICAL resolved | security, code-quality (parallel) |
| docs | Review passed | Documentation updated | writer |
Topologies compose phases: full-pipeline: [research, design, implement, test, review, docs], fast-fix: [implement, test], etc. The consolidator evolves phase definitions from retro data.
agents-instructions/runtime-charter.md -- shared execution semantics injected into every role spawn. Defines:
- Execution contract fields -- required output, completion condition, permission scope, budget, verification criterion
- State conventions -- journal format (compiled truth + timeline), task history (JSONL event log), structured learnings (JSONL with utility scores), evidence artifacts
- Child lifecycle -- spawn, init, execute, verify, journal, learnings, stop
- Behavioral guards -- analysis paralysis (5+ reads without action → STOP), revision stall detection, context degradation tiers (PEAK/GOOD/DEGRADING/POOR), deviation rules (auto-fix bugs, STOP for architecture)
- Skill self-exclusion -- skills include "when NOT to use" guidance
- Mechanical verification gates --
test -fbefore accepting completion, not just self-reported status - Evidence artifacts -- structured findings for analytical roles (security, code-quality, researcher, architect)
- Acceptance-gated retry -- generalized retry pattern for any phase, not just developer-tester
7 named failure modes with prescribed recovery actions:
| Mode | Recovery |
|---|---|
| missing-artifact | Re-run phase that should have produced it |
| verifier-failure | Developer-tester loop (max 3) |
| tool-error | Retry with adjusted parameters |
| timeout | Split task, reduce scope |
| wrong-path | Re-route to correct role or approach |
| contract-partial | Retry with failure context, narrower scope |
| quality-reject | Fix findings, re-review |
57 skills across 10 domains that the coordinator assigns to roles at spawn time. Roles invoke them autonomously during execution.
Orchestration skills (3):
| Skill | Description |
|---|---|
| team-implement | End-to-end feature pipeline: research, design, code, test, review |
| team-review | Multi-angle review pipeline: security + quality + architecture |
| team-review-periodic | Periodic team health review: 1-on-1s, dynamics retro, coordinator self-assessment |
Domain-expertise skills (48):
| Category | Skills |
|---|---|
| Engineering Core | tdd, debug, design, verify, review-response, refactor, perf, api-design, migrate, code-review-ext, devops, release-gate |
| Testing & Quality | test-strategy, observability, a11y |
| Data & Security | data-model, security-scan, threat-model |
| Strategic/Leadership | premise-check, ideate, auto-decide, prioritize, estimate, negotiate |
| Team Operations | strategy-evolution, workflow-auto, skill-creator, post-mortem, onboard, retro-metrics |
| Research & Analysis | deep-research, knowledge-structure, source-validate, competitive-intel |
| Writing & Content | scqa, content-repurpose, tone-enforce, summarize, copywriting, docs, present |
| Visual & Design | diagram, infographic, flowchart, ui-layout |
| Video & Media | video-script, video-edit, caption-format |
Review ordering is two-stage: spec compliance before quality (catches structural misses before style nits).
task-history.jsonl records 9 event types: spawn, skill-invoked, completed, failed, retried, gate-passed, gate-failed, phase-transition, routing-decision. Each entry is timestamped ISO-8601 with role, phase, and detail fields.
skill-invoked events carry a source field distinguishing three emission paths: coordinator-spawn (coordinator assigned the skill), role-self-report (role instruction-based emission), and hook-auto (mechanically observed by the hooks/skill-telemetry PostToolUse hook — added in v4.14.0). Events also carry session_id (v4.14.0) for precise session-scoped aggregation in hooks/session-end.
spawn events carry a mode field (v4.15.0) with value "agent" or "tmux" indicating which spawn mechanism was used. The field is additive — existing readers without knowledge of mode continue to work.
On crash recovery, the coordinator reads the event log to determine where the session was interrupted and resumes from the last successful phase-transition gate.
Knowledge is split into three layers:
| Layer | Location | What | Persistence |
|---|---|---|---|
| Universal | ~/.claude/ainous-roles/*/ |
Playbooks, growth.json, trust | Across all projects |
| Shared | ~/.claude/ainous-roles/team-knowledge.md |
Facts discovered by 2+ roles | Across all projects |
| Project | .claude/ainous-roles/*/ |
Journals, memory, entities, traces | Per repository |
Playbooks evolve through consolidation. Growth metrics track performance. Trust scores determine permission levels. Shared team knowledge is append-only — the consolidator deduplicates periodically.
Phase 1: Orient --> scan what changed, build focus list (don't read everything)
Phase 2: Gather --> pull from task-history, traces, corrections, learnings.jsonl
Phase 3: Consolidate --> WAL-safe writes: temporary first, verify, then promote
Phase 4: Prune --> compact journals, enforce caps, update indices, verify integrity
Triple gate activation: time (>=24h or >=5 sessions) + volume (>=3 entries) + lock (no concurrent, 1-hour stale timeout).
- Quick capture fires after every session via Stop hooks (~5 seconds). Captures journal entries, structured learnings (JSONL with utility scores), execution traces, and user corrections.
- Deep consolidation auto-triggered (not cron-dependent) -- reads execution traces + learnings.jsonl, performs counterfactual diagnosis, evolves topologies and phase definitions from retros, applies utility-weighted strategy selection (MemRL-inspired), aggressive pruning (5-session inactivity challenge), analyzes user corrections (weighted 3x), promotes shared facts, enforces caps (50 sessions, 30 strategies), injects
[experimental]strategies, runs knowledge lint.
| Level | Score | Permissions |
|---|---|---|
| Intern | < 50 | Read-only, suggestions only |
| Junior | 50-74 | Baseline per role (default) |
| Senior | 75-89 | Expanded baseline + notify (not approve) |
| Principal | 90+ | Domain autonomy (requires user promotion) |
Trust score: +2/session, +1/approval, -5/denial, -15/violation, -3/user-override. Capped 0-100.
PreToolUse hook (script-based, not prompt-based, FAIL-CLOSED):
Write/Edit --> check role baseline against authority-book.md
Bash --> allowlist of safe read-only commands; rejects subshells ($(), `, <()), pipes to rm/mv/cp/xargs
Read/Grep/Glob --> instant allow (no enforcement)
Unknown role / main session --> allow (+ soft warning in coordinator-as-default mode)
Python crash / parse error --> BLOCK (fail closed)
Protected paths (always denied):
~/.claude/.session-role -- role identity marker, prevents impersonation
~/.claude/.session-role-* -- per-pane markers for tmux parallel mode
Trust-aware: Intern roles are blocked from all writes. Senior roles get expanded baselines. Unknown trust levels are treated as Intern. Per-pane role markers ($TMUX_PANE) prevent race conditions in tmux parallel mode.
| Role | Color | Domain | Tools | Baseline Write Access |
|---|---|---|---|---|
| coordinator | -- | Orchestration, planning, synthesis | Read, Write, Grep, Glob, Bash, Agent | Task plans, journals only |
| developer | green | Features, bugfixes, refactoring | Read, Write, Edit, Grep, Glob, Bash | src/, lib/, app/, pkg/ |
| architect | cyan | System design, trade-offs | Read, Write, Edit, Grep, Glob, Bash | Design docs, specs |
| code-quality | yellow | Reviews, bugs, standards | Read, Grep, Glob, Bash, Agent | Read-only |
| tester | magenta | Tests, coverage, edge cases | Read, Write, Edit, Grep, Glob, Bash | Test files only |
| researcher | green | Exploration, investigation | Read, Grep, Glob, Bash, WebSearch, WebFetch | Research notes |
| writer | cyan | Documentation, READMEs | Read, Write, Edit, Grep, Glob | Docs, *.md files |
| security | yellow | Vulnerabilities, threats | Read, Write, Edit, Grep, Glob, Bash, Agent | Security reports |
| authority | -- | Approvals, policy | Read, Write, Edit, Grep, Glob, Bash, Agent | Authority-book, decisions |
| consolidator | -- | Knowledge distillation | Read, Write, Edit, Grep, Glob, Bash | Playbooks, growth.json |
| signal | cyan | External intelligence | Read, Write, Grep, Glob, Bash, WebSearch, WebFetch | Team-knowledge, signal journals |
| retriever | -- | Context filtering | Read, Grep, Glob, Agent | Read-only |
Colors use a 4-color mosaic: green (builder/explorer), cyan (design/docs), yellow (reviewers), magenta (tester). Infrastructure roles (coordinator, authority, consolidator, retriever) have no color. Signal uses cyan (information/exploration).
~/.claude/ainous-roles/authority/authority-book.md -- the canonical permission matrix.
- Within baseline: auto-approved, no authority check needed
- Outside baseline: agent must message @authority for approval
- Escalate to user: push, destructive git, package installs (no role can approve)
~/.claude/ainous-roles/authority/decisions.md -- structured audit trail of every non-baseline approval (AUTH-001, AUTH-002, ...). Overly broad patterns (*, **/*) are rejected by the enforcement script.
ainous-team/ <-- the plugin
|-- .claude-plugin/plugin.json <-- manifest
|-- agents/ <-- 11 slim agent definitions
|-- agents-instructions/ <-- full instructions (referenced by agents)
| \-- runtime-charter.md <-- shared execution semantics (injected into all spawns)
|-- commands/ <-- /team-status, /team-history, /team-alerts, /team-retro
|-- skills/ <-- 15 skills (3 orchestration + 12 domain)
|-- hooks/ <-- session-start, authority-enforce.sh
|-- scripts/
| |-- team-status.sh <-- dashboard script
| \-- setup.sh <-- first-run bootstrapper
|-- templates/ <-- starter files for new installs
|-- docs/ <-- design specs and research (for contributors)
|-- CLAUDE.md <-- project rules
\-- LICENSE <-- MIT
~/.claude/ainous-roles/ <-- user data (created by setup.sh)
|-- team-knowledge.md <-- shared facts (all roles read/write, consolidator deduplicates)
|-- user-corrections.md <-- implicit feedback from user edits (coordinator captures)
|-- coordinator/playbook.md <-- evolved strategies
|-- coordinator/growth.json <-- performance + trust scores
|-- coordinator/retros.md <-- team retrospectives (cross-role interaction analysis)
|-- authority/authority-book.md <-- permission matrix
|-- authority/decisions.md <-- approval audit trail
|-- consolidator/cross-role-insights.md <-- patterns across 3+ roles
\-- ... (11 roles × playbook + growth.json)
<project>/.claude/ainous-roles/ <-- project-specific data (created at runtime)
|-- team-knowledge.md <-- project-specific shared facts
|-- team-sync/ <-- coordination dir for tmux-mode spawns
| |-- state/task-history.jsonl <-- session event log (7 event types)
| |-- state/ <-- phase completion status
| |-- artifacts/ <-- role deliverables (findings, reports)
| \-- index.md <-- knowledge index (topic-organized catalog)
|-- researcher/journal.md <-- session notes for THIS project
|-- researcher/memory.md <-- entities + patterns for THIS codebase
\-- ... (per-role journals + memory)
v5.8.0 ships two coordinated security changes. First, the Clinejection defense (scope-reduction-on-taint): when a session is tainted by a WebFetch/WebSearch call, the enforcement hook now applies a reduced-capability profile — Bash is restricted to a read-only allowlist and Write/Edit are restricted to the role's own paths and findings artifacts. This closes the gap where a poisoned agent retained full capabilities after fetching adversarial content. Second, Layer-2 contract-implied authorization was retired after 8 weeks of zero adoption: the scope field was hardcoded empty on every spawn event, making the enforcement block dead code. Deleting dead code reduces surface area; all authorization now flows through Layer-1 (project baselines) and Layer-3 (hardcoded baselines + decisions.md).
v5.7.0 closed a protocol-level bypass in the write-proxy nonce lifecycle. Three key fixes: (1) the per-session nonce moved from task-history persistence to a 0600 file, eliminating cross-session nonce bleed; (2) Bash credential-read gating was widened with a shell-metachar anchor and variable-indirection defense; (3) a teammate-lifecycle-reaper hook was added for crash-safe team-mode state cleanup. These changes harden the enforcement layer without changing the observable API.
Sixteen releases across four capability areas shipped since v5.1.x.
Every persistent-memory write now carries a five-field provenance block (role, session, source, discovered, verified). The provenance validator in hooks/authority-enforce.sh rejects writes that omit or partially fill the block — fail-closed, not warn-only. v5.4.0 added M-3 parity: spawn events are auto-emitted to task-history.jsonl at the coordinator level so no spawn goes unlogged even when a role skips the self-report.
- v5.2.0 — provenance validator: five-field block required on all six persistent-memory surfaces
- v5.4.0 — spawn-event auto-emission; M-3 parity between coordinator and role self-reports
Introduced a taint-flag mechanism that tracks whether a knowledge claim originates from an external-unsanitized source (signal scan, user correction). Tainted facts cannot flow directly into playbooks without an explicit approval step. A defensive session_id fix in v5.6.2 prevents cross-session taint bleed when multiple consolidation runs share the same day timestamp.
- v5.3.0 — taint-flag mechanism;
external-unsanitizedsource type added to provenance - v5.3.1 — taint propagation across upstream_chain; consolidator promotion-review gate
- v5.6.1 — tiered blocking read/apply flow (external-blocking, cross-role-waiting, awaiting-review)
- v5.6.2 — defensive
session_idfix; cross-session taint bleed closed
Team-mode (Agent(team_name=..., name=...)) is now a first-class spawn path with its own execution policy. A reproducible upstream crash (H.toolUseContext.getAppState) fires when a teammate's Write triggers the approval prompt. The policy response: teammates must not call Write/Edit/NotebookEdit; they return content via SendMessage for coordinator recovery-write. Subsequent releases refined state reaper cleanup, the ainous-team:<role>(<description>) naming convention, and journal-discipline enforcement at session end.
- v5.4.1 — team-mode execution policy; crash-safe Write prohibition; coordinator recovery-write pattern (runtime-charter §15)
- v5.6.0 — state reaper: cleans stale
.claude/ainous-roles/team-sync/state/files after team sessions end - v5.6.5 — naming convention: teammates spawned as
ainous-team:<role>(<description>)for pane-header clarity - v5.6.6 — journal-discipline closure: execution-focused roles must append journal entry before going idle
A structured write-proxy envelope lets background teammates persist content even when the coordinator session is not alive at finish time. The envelope format (<!-- WRITE-PROXY-ENVELOPE v1 --> with YAML frontmatter) includes the intended destination path, provenance block, and an HMAC computed against a per-session nonce. v5.6.4 shipped a canonical helper script. v5.6.7 added a pre-release gate script that verifies full role infrastructure before any version bump is committed.
- v5.5.0 — write-proxy envelope v1; SendMessage-based durable fallback (runtime-charter §15.1)
- v5.5.1 — three-tier identity resolution for write-proxy nonce (env var → coordinator message → random)
- v5.6.3 — HMAC field added to envelope; tamper-evident envelope body
- v5.6.4 — canonical HMAC helper:
scripts/compute-envelope-hmac.sh - v5.6.7 — release-gate script:
scripts/verify-role-infrastructure.sh
Three utility scripts shipped this wave and are available in scripts/:
compute-envelope-hmac.sh— computes the SHA-256 HMAC for a write-proxy envelope body, keyed by the per-session nonceinstall-post-commit-journal-reminder.sh— installs a git post-commit hook that prompts the coordinator to append a journal entry after each commitverify-role-infrastructure.sh— pre-release gate that confirms every role has its complete four-file scaffold (playbook, growth, journal, learnings) plus agent definition and capability card
The team is not a collection of agents — it is itself an agent at a higher layer of abstraction.
Cell → Role (individual capability, one function)
Organ → Role cluster (researcher+architect = "understanding")
Organism → Team (11 roles, one coherent output)
Embed, don't repeat. Each layer wraps the lower layer's capability as a black box. The coordinator doesn't know HOW the developer writes code — only WHEN to invoke it. If two roles produce overlapping output, the consolidator detects this and recommends specialization.
Dynamic topology, not fixed pipeline. The coordinator learns which team shapes work for which tasks. A bugfix skips the architect. A security-sensitive feature starts with a security scan. The topology emerges from the task, guided by retro data.
Exploration force. Without experimentation, the system converges to a local optimum. The consolidator occasionally injects [experimental] strategies. The coordinator occasionally tries unusual role combinations. Exploration rate decays as the system matures — high early, low when stable.
Minimal stable complexity. The goal is the leanest team that reliably serves your actual work patterns. Unused roles retire, redundant strategies merge, unstable topologies get abandoned. Like biological homeostasis — converge toward what works, shed what doesn't.
Full design doc: docs/design/2026-03-31-team-as-organism.md
Built on patterns from:
| Source | Pattern Used |
|---|---|
| Meta-Harness (Lee et al., 2026) | Counterfactual diagnosis, rich diagnostic context for playbook evolution |
| Natural-Language Agent Harnesses (2026) | Execution contracts, verification gates, file-backed state |
| Anthropic Harness Engineering (2026) | Generator-evaluator loops, assumption staleness detection |
| Hyperagents (Meta, 2026) | Self-improvement principle -- system improves how it improves |
| Supermemory ASMR | Agentic retrieval > vector search; 3-parallel retriever |
| ATF (CSA, 2026) | Trust progression (Intern to Principal) with earned autonomy |
| AutoResearch (Karpathy) | Tight loop + single metric per role |
| CrewAI | Role-based teams, scoped memory |
| Claude Code Agent Teams | Shared task list, mailbox, teammate hooks |
| Cofounder GIC | Sleep-time compute for background consolidation |
| Multi-Agent Memory Architecture (2026) | 3-layer hierarchy, append-only shared memory avoids coherence problems |
| PAHF (2026) | Implicit feedback from user corrections > explicit ratings |
| MAR: Multi-Agent Reflexion (2025) | Multi-persona reflection avoids thought degeneration |
| A-HMAD (ACL, 2025) | Heterogeneous agents outperform homogeneous in multi-agent search |
| EvoSC (Feb 2026) | Self-consolidation with contrastive learning; FIFO queue for experience management |
| DyLAN | Agent Importance Score for dynamic team selection |
| MemOS (Jul 2025) | 3-tier memory lifecycle (working/long-term/cold archive) |
| NeurIPS 2025 Puppeteer | RL-based dynamic orchestration learns task routing |
| JiuwenClaw (Mar 2026) | Execution-to-Learning Closed Loop for self-evolving agents |
| MemPalace | Structured tags on memory entries for retrieval filtering; temporal validity on facts |
| Karpathy LLM Wiki | Knowledge lint pass (contradiction + orphan detection); topic-organized index |
| Anthropic Managed Agents (2026) | Session event log for crash recovery; "harnesses encode assumptions that go stale" |
| OEL/ERL | Strategy source tagging ([from-failure] vs [from-success]); heuristic format enforcement (When X, do Y, because Z) |
Design specs and research notes are in docs/. The architecture is documented there for anyone who wants to understand the design decisions.
MIT