kagents is the project brand. The implementation lives in the claude-teams-operator repository and ships under the claude.amcheste.io/v1alpha1 API group. Documentation site: kagents.dev (under construction; see the v0.7.0 milestone).
Claude Code Agent Teams let multiple Claude Code instances collaborate. A lead coordinates work via a shared task list while teammates communicate through peer-to-peer mailboxes. Natively this runs on a single machine using tmux. This operator lifts that pattern into Kubernetes so you can run large-scale agent teams on your cluster.
The operator supports two distinct use cases, selected by which field you set in the AgentTeam spec:
| Mode | Use when | Key field |
|---|---|---|
| Coding | Agents work on a git repository | spec.repository |
| Cowork | Agents produce documents, reports, emails, analysis | spec.workspace |
Both modes share the same coordination protocol (shared PVCs, mailboxes, task lists) and all Cowork extensions (Skills, MCP servers, approval gates).
- Native Agent Teams protocol. Preserves Anthropic's file-based mailbox and task list format over ReadWriteMany PVCs; no protocol translation
- Per-teammate git worktrees. Each coding agent works on an isolated branch to prevent merge conflicts
- Cowork mode. Mount ConfigMap/PVC inputs and collect outputs without requiring a git repo
- Skills as CRD fields. Mount Claude Code skills from ConfigMaps into each agent's .claude/skills/
- MCP servers per agent. Configure Model Context Protocol connections per teammate
- Approval gates. Pause spawning specific teammates until a human applies an annotation
- Budget enforcement. Terminate the team if estimated API cost exceeds a configured limit
- Timeout enforcement. Terminate the team after a configurable wall-clock duration
- dependsOn ordering. Spawn teammates only after their declared dependencies complete
- Reusable templates. Define team patterns with AgentTeamTemplate, instantiate with AgentTeamRun
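
Budget enforcement, as listed above, compares an estimated cost against a configured limit. The comparison can be sketched in shell as follows. This is an illustration under stated assumptions, not the operator's code: `over_budget` is a hypothetical helper, both values are assumed to be decimal strings such as "30.00", and "exceeds" is read as strictly greater than.

```bash
# Hypothetical helper: succeeds (exit 0) when COST is strictly greater
# than LIMIT. An empty LIMIT is treated as "no limit" (assumption that
# mirrors an unset budget).
over_budget() {
  local cost="$1" limit="$2"
  [ -n "$limit" ] || return 1
  # awk handles the decimal comparison; "+ 0" forces numeric context.
  awk -v c="$cost" -v l="$limit" 'BEGIN { exit !((c + 0) > (l + 0)) }'
}
```

For example, `over_budget 1.42 30.00` fails (the team is under budget), while `over_budget 30.01 30.00` succeeds.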
- Kubernetes 1.28+
- ReadWriteMany PVC support (NFS, EFS, or a compatible CSI driver; see ARCHITECTURE.md § Storage Requirements for options)
- Claude Code CLI access (Max subscription or API key)
- Opus 4.6 model access (required for Agent Teams)
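
A pre-flight check for the Kubernetes 1.28+ requirement might look like the sketch below. It assumes `kubectl` and `jq` are installed; `version_at_least` and `check_cluster_version` are hypothetical helper names. It does not verify ReadWriteMany support, which depends on your storage backend.

```bash
# Hypothetical helper: succeeds when HAVE_MAJOR.HAVE_MINOR is at least
# NEED_MAJOR.NEED_MINOR. Some clusters report a minor such as "28+",
# so non-digit characters are stripped before comparing.
version_at_least() {
  local have_major have_minor
  have_major=$(printf '%s' "$1" | tr -cd '0-9')
  have_minor=$(printf '%s' "$2" | tr -cd '0-9')
  [ -n "$have_major" ] && [ -n "$have_minor" ] || return 2
  if [ "$have_major" -gt "$3" ]; then return 0; fi
  if [ "$have_major" -lt "$3" ]; then return 1; fi
  [ "$have_minor" -ge "$4" ]
}

# Reads the server version from the current kubectl context and checks
# it against the 1.28 minimum stated above.
check_cluster_version() {
  local major minor
  major=$(kubectl version -o json | jq -r '.serverVersion.major') || return 2
  minor=$(kubectl version -o json | jq -r '.serverVersion.minor') || return 2
  version_at_least "$major" "$minor" 1 28
}
```

Running `check_cluster_version && echo "cluster version OK"` before installing gives an early signal if the cluster is too old.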
# 1. Install the operator (CRDs + controller + RBAC)
helm install claude-teams-operator \
oci://ghcr.io/amcheste/charts/claude-teams-operator \
--namespace claude-teams-system --create-namespace
# 2. Create an API key secret in the namespace where your teams will run
kubectl create namespace dev-agents
kubectl create secret generic anthropic-api-key \
--namespace dev-agents \
--from-literal=ANTHROPIC_API_KEY=sk-ant-...
# 3. Apply a sample team
kubectl apply -n dev-agents -f \
https://raw.githubusercontent.com/amcheste/claude-teams-operator/main/config/samples/auth-refactor-team.yaml
# 4. Watch the team progress
kubectl get agentteams -n dev-agents -w
kubectl describe agentteam auth-refactor -n dev-agents
For contributors and anyone who wants to run the full stack from source:
# 1. Create a Kind cluster with NFS provisioner
make kind-create
# 2. Build and load images into Kind
make docker-build docker-build-runner kind-load
# 3. Install CRDs and deploy the operator
make install deploy
# 4. Create your API key secret
kubectl create secret generic anthropic-api-key \
--namespace dev-agents \
--from-literal=ANTHROPIC_API_KEY=sk-ant-...
# 5. Apply a sample team
kubectl apply -f config/samples/auth-refactor-team.yaml
See CONTRIBUTING.md for the full dev loop (testing, linting, manifest regeneration).
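
Once a sample team is applied by either path, a script can poll it instead of watching interactively. The sketch below is one way to do that. It assumes the phase is exposed at `.status.phase` and that `Completed` is the success phase (both inferred from the sample output in this README, not from the CRD schema); `wait_for_team` is a hypothetical helper, and the default deadline simply mirrors the documented 4h lifecycle default.

```bash
# Hypothetical helper: poll an AgentTeam until it reports "Completed"
# or a client-side deadline passes.
# Usage: wait_for_team TEAM NAMESPACE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_team() {
  local team="$1" namespace="$2"
  local timeout_s="${3:-14400}" interval_s="${4:-15}"
  local waited=0 phase
  while [ "$waited" -le "$timeout_s" ]; do
    # Assumption: the CRD exposes the team phase at .status.phase.
    phase=$(kubectl get agentteam "$team" -n "$namespace" \
      -o jsonpath='{.status.phase}') || return 2
    echo "phase=${phase:-unknown} after ${waited}s"
    if [ "$phase" = "Completed" ]; then return 0; fi
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
  echo "gave up after ${timeout_s}s" >&2
  return 1
}
```

For example, `wait_for_team auth-refactor dev-agents 7200 30` checks every 30 seconds for up to two hours. Other terminal phases are not documented here, so the deadline is the only exit for a team that never completes.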
apiVersion: claude.amcheste.io/v1alpha1
kind: AgentTeam
metadata:
  name: auth-refactor
  namespace: dev-agents
spec:
  repository:
    url: "git@github.com:acme/backend.git"
    branch: "main"
    credentialsSecret: "git-credentials"
  auth:
    apiKeySecret: "anthropic-api-key"
  lead:
    model: "opus"
    prompt: |
      Coordinate the migration from JWT to OAuth2.
      Assign backend-api, frontend-auth, and test-coverage to their tracks.
      Validate integration when all tracks complete.
  teammates:
    - name: "backend-api"
      model: "sonnet"
      prompt: "Implement OAuth2 endpoints. Remove JWT middleware."
      scope:
        includePaths: ["src/api/auth/", "src/middleware/"]
    - name: "test-coverage"
      model: "sonnet"
      prompt: "Write comprehensive tests for the OAuth2 migration."
      dependsOn: ["backend-api"]
  lifecycle:
    timeout: "2h"
    budgetLimit: "30.00"
    onComplete: "create-pr"
    pullRequest:
      targetBranch: "main"
      titleTemplate: "feat(auth): migrate from JWT to OAuth2"

apiVersion: claude.amcheste.io/v1alpha1
kind: AgentTeam
metadata:
  name: q3-report
  namespace: cowork-agents
spec:
  workspace:
    inputs:
      - configMap: "quarterly-data"
        mountPath: "/workspace/data"
    output:
      mountPath: "/workspace/output"
      size: "5Gi"
  auth:
    apiKeySecret: "anthropic-api-key"
  lead:
    model: "opus"
    prompt: "Coordinate the Q3 business report. Assign research, writing, and design."
  teammates:
    - name: "researcher"
      model: "sonnet"
      prompt: "Analyse the data in /workspace/data and produce a findings summary."
      skills:
        - name: "data-analysis"
          source:
            configMap: "data-analysis-skill"
    - name: "email-drafter"
      model: "sonnet"
      prompt: "Draft follow-up emails for all Q3 prospects."
      mcpServers:
        - name: "gmail"
          url: "https://gmail.mcp.example.com/mcp"
  lifecycle:
    timeout: "3h"
    budgetLimit: "15.00"
    approvalGates:
      - event: "spawn-email-drafter"
        channel: "webhook"
        webhookUrl: "https://hooks.example.com/approvals"
Grant approval after reviewing the researcher's output:
kubectl annotate agentteam q3-report \
"approved.claude.amcheste.io/spawn-email-drafter=true" \
-n cowork-agents
AgentTeam is the primary resource. It defines the full team, its workspace, lifecycle, and observability config.
| Field | Type | Description |
|---|---|---|
| spec.repository | RepositorySpec | Git repo config (coding mode). Optional when spec.workspace is set. |
| spec.workspace | WorkspaceSpec | Input/output volumes (Cowork mode). Optional when spec.repository is set. |
| spec.auth | AuthSpec | API key or OAuth secret reference. |
| spec.lead | LeadSpec | Lead agent model, prompt, skills, and MCP servers. |
| spec.teammates | []TeammateSpec | Worker agents with optional dependsOn, scope, skills, mcpServers. |
| spec.lifecycle.timeout | string | Max duration, e.g. "4h". Defaults to "4h". |
| spec.lifecycle.budgetLimit | string | Max USD spend, e.g. "10.00". No limit if unset. |
| spec.lifecycle.onComplete | string | One of create-pr, push-branch, notify, or none. |
| spec.lifecycle.approvalGates | []ApprovalGateSpec | Human-in-the-loop gates before spawning a teammate. |
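
The approval annotation used in the Cowork example follows the pattern `approved.claude.amcheste.io/<event>=true`, where `<event>` matches the gate's event field. A small helper sketch, assuming that pattern holds for every gate; `approval_annotation` and `approve_gate` are hypothetical names, not part of the operator:

```bash
# Annotation prefix copied from the examples in this README.
APPROVAL_PREFIX="approved.claude.amcheste.io"

# Hypothetical helper: build the key=value annotation for a gate event.
approval_annotation() {
  printf '%s/%s=true' "$APPROVAL_PREFIX" "$1"
}

# Hypothetical helper: grant approval for one gate on one team.
# Usage: approve_gate TEAM EVENT [NAMESPACE]
approve_gate() {
  local team="$1" event="$2" namespace="${3:-default}"
  kubectl annotate agentteam "$team" \
    "$(approval_annotation "$event")" -n "$namespace"
}
```

With these defined, `approve_gate q3-report spawn-email-drafter cowork-agents` issues the same `kubectl annotate` call shown in the Cowork example.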
AgentTeamTemplate is a reusable team pattern. It does not run on its own; instantiate it with AgentTeamRun.
AgentTeamRun instantiates an AgentTeamTemplate against a specific repo or workspace.
apiVersion: claude.amcheste.io/v1alpha1
kind: AgentTeamRun
metadata:
  name: q4-security-review
spec:
  templateRef:
    name: fullstack-review
  repository:
    url: "git@github.com:acme/platform.git"
    branch: "release/4.0"
    credentialsSecret: "git-credentials"
  auth:
    apiKeySecret: "anthropic-api-key"
  lead:
    model: "opus"
    prompt: "Run a full security, performance, and test quality review."
Watch team progress. The Ready column reports running+completed/total teammates, so 2/3 means two of three workers are up (or have finished) while one is still spawning or blocked on a dependency:
kubectl get agentteams -A
# NAME            PHASE       READY   TASKS DONE   COST    AGE
# auth-refactor   Running     2/3     7            $1.42   14m
# q3-report       Completed   2/2     12           $3.80   2h
Inspect details, including operator events emitted at every phase transition:
kubectl describe agentteam auth-refactor -n dev-agents
# Status:
#   Phase:           Running
#   Ready:           2/3
#   Estimated Cost:  1.42
#   Lead:
#     Pod Name:  auth-refactor-lead
#     Phase:     Running
#   Teammates:
#     - Name: backend-api, Phase: Running
#     - Name: test-coverage, Phase: Waiting (dependsOn: backend-api)
# Events:
#   Normal  Initializing  5m  agentteam-controller  Provisioned PVCs and launched init Job
#   Normal  Running       4m  agentteam-controller  All agent pods started
Approval gates block a teammate from being spawned until a human applies an annotation.
# Inspect which teammates are waiting
kubectl get agentteam my-team -o jsonpath='{.status.teammates[*].pendingApproval}'
# Grant approval
kubectl annotate agentteam my-team \
"approved.claude.amcheste.io/spawn-email-drafter=true"
If channel: webhook is set, the operator POSTs a JSON payload to webhookUrl when the gate is triggered, allowing an external system to present the approval to a human and then apply the annotation.
This README is the entry point. For deeper dives, every topic lives in a dedicated in-repo document:
| Document | Read when you want to… |
|---|---|
| ARCHITECTURE.md | Understand how the operator models Agent Teams: phase state machine, PVC layout, RWX storage backends, coordination protocol, key design tradeoffs. |
| TESTING.md | See the test strategy (unit / integration / acceptance / E2E), how to run each suite, and what each one actually verifies. |
| CONTRIBUTING.md | Set up a dev environment, run the full build/test loop, follow the branch + PR workflow, and walk through "How to add a new reconciler feature." |
| docs/helm-values.md | Tune the Helm chart. Every value documented with defaults and production override recipes. |
| SECURITY.md | Report a vulnerability or review the project's security policy. |
| KUBECON.md | See the talk framing and "interesting problems" log. Useful context for why specific architectural choices were made. |
Common Makefile targets (full loop in CONTRIBUTING.md):
make build # Build operator binary
make test # Run all tests
make lint # Run golangci-lint
make manifests # Regenerate CRD manifests
make generate # Regenerate deepcopy methods
Licensed under Apache 2.0.
