Kelos

The Kubernetes-native framework for orchestrating autonomous AI coding agents.

Quick Start · Examples · Reference · YAML Manifests

Point Kelos at a GitHub issue and get a PR back — fully autonomous, running in Kubernetes. Each agent runs in an isolated, ephemeral Pod with a freshly cloned git workspace. Fan out across repositories, chain tasks into pipelines, and react to events automatically.

Supports Claude Code, OpenAI Codex, Google Gemini, OpenCode, and custom agent images.

Demo

# Run multiple tasks in parallel across your repo
$ kelos run -p "Fix the bug described in issue #42 and open a PR" --name fix-42
$ kelos run -p "Add unit tests for the auth module" --name add-tests
$ kelos run -p "Update API docs for v2 endpoints" --name update-docs

# Watch all tasks progress simultaneously
$ kelos get tasks
NAME          TYPE          PHASE     BRANCH                   WORKSPACE   AGENT CONFIG   DURATION   AGE
fix-42        claude-code   Running   kelos-task-fix-42        my-repo     my-config      2m         2m
add-tests     claude-code   Running   kelos-task-add-tests     my-repo     my-config      1m         1m
update-docs   claude-code   Running   kelos-task-update-docs   my-repo     my-config      45s        45s

See Autonomous self-development pipeline for a full end-to-end example.

Why Kelos?

AI coding agents are evolving from interactive CLI tools into autonomous background workers. Kelos provides the infrastructure to manage this transition at scale.

Orchestration, not just execution — Don't just run an agent; manage its entire lifecycle. Chain tasks with dependsOn and pass results (branch names, PR URLs, token usage) between pipeline stages. Use TaskSpawner to build event-driven workers that react to GitHub issues, PRs, or schedules.
Host-isolated autonomy — Each task runs in an isolated, ephemeral Pod with a freshly cloned git workspace. Agents have no access to your host machine — use scoped tokens and branch protection to control repository access.
Standardized interface — Plug in any agent (Claude, Codex, Gemini, OpenCode, or your own) using a simple container interface. Kelos handles credential injection, workspace management, and Kubernetes plumbing.
Scalable parallelism — Fan out agents across multiple repositories. Kubernetes handles scheduling, resource management, and queueing — scale is limited by your cluster capacity and API provider quotas.
Observable & CI-native — Every agent run is a first-class Kubernetes resource with deterministic outputs (branch names, PR URLs, commit SHAs, token usage) captured into status. Monitor via kubectl, manage via the kelos CLI or declarative YAML (GitOps-ready), and integrate with ArgoCD or GitHub Actions.

Quick Start

Get running in about 5 minutes; most of that time goes to gathering credentials.

Prerequisites

Kubernetes cluster (1.28+)

Don't have a cluster? Create one locally with kind

Install kind (requires Docker)
Create a cluster:
```
kind create cluster
```

This creates a single-node cluster and configures your kubeconfig automatically.

1. Install the CLI

curl -fsSL https://raw.githubusercontent.com/kelos-dev/kelos/main/hack/install.sh | bash

Alternative: install from source

go install github.com/kelos-dev/kelos/cmd/kelos@latest

2. Install Kelos

kelos install

This installs the Kelos controller and CRDs into the kelos-system namespace.

Verify the installation:

kubectl get pods -n kelos-system
kubectl get crds | grep kelos.dev

3. Initialize Your Config

kelos init

Edit ~/.kelos/config.yaml:

oauthToken: <your-oauth-token>
workspace:
  repo: https://github.com/your-org/your-repo.git
  ref: main
  token: <github-token>  # optional, for private repos and pushing changes

How to get your credentials

Claude OAuth token (recommended for Claude Code): Run claude auth login locally, then copy the token from ~/.claude/credentials.json.

Anthropic API key (alternative for Claude Code): Create one at console.anthropic.com. Set apiKey instead of oauthToken in your config.

Codex OAuth credentials (for OpenAI Codex): Run codex auth login locally, then reference the auth file in your config:

oauthToken: "@~/.codex/auth.json"
type: codex

Or set apiKey with an OpenAI API key instead.

GitHub token (for pushing branches and creating PRs): Create a Personal Access Token with repo scope (and workflow if your repo uses GitHub Actions).

Warning: Without a workspace, nothing the agent writes outlives its ephemeral pod: any files it creates are lost when the pod terminates. Always configure a workspace to get persistent results.

4. Run Your First Task

$ kelos run -p "Add a hello world program in Python"
task/task-r8x2q created

$ kelos logs task-r8x2q -f

The task name (e.g. task-r8x2q) is auto-generated. Use --name to set a custom name, or -w to automatically watch task logs.

The agent clones your repo, makes changes, and can push a branch or open a PR.

Tip: If something goes wrong, check the controller logs with kubectl logs deployment/kelos-controller-manager -n kelos-system.

Using kubectl and YAML instead of the CLI

Create a Workspace resource to define a git repository:

apiVersion: kelos.dev/v1alpha1
kind: Workspace
metadata:
  name: my-workspace
spec:
  repo: https://github.com/your-org/your-repo.git
  ref: main

Then reference it from a Task:

apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: hello-world
spec:
  type: claude-code
  prompt: "Create a hello world program in Python"
  credentials:
    type: oauth
    secretRef:
      name: claude-oauth-token
  workspaceRef:
    name: my-workspace

kubectl apply -f workspace.yaml
kubectl apply -f task.yaml
kubectl get tasks -w

Using an API key instead of OAuth

Set apiKey instead of oauthToken in ~/.kelos/config.yaml:

apiKey: <your-api-key>

Or pass --secret to kelos run with a pre-created secret (api-key is the default credential type), or set spec.credentials.type: api-key in YAML.

How It Works

Kelos orchestrates the flow from external events to autonomous execution:

  Triggers (GitHub, Cron) ──┐
                            │
  Manual (CLI, YAML) ───────┼──▶  TaskSpawner  ──▶  Tasks  ──▶  Isolated Pods
                            │          │              │             │
  API (CI/CD, Webhooks) ────┘          └─(Lifecycle)──┴─(Execution)─┴─(Success/Fail)

You define what needs to be done, and Kelos handles the "how" — from cloning the right repo and injecting credentials to running the agent and capturing its outputs (branch names, commit SHAs, PR URLs, and token usage).

Core Primitives

Kelos is built on four resources:

Tasks — Ephemeral units of work that wrap an AI agent run.
Workspaces — Persistent or ephemeral environments (git repos) where agents operate.
AgentConfigs — Reusable bundles of agent instructions (AGENTS.md, CLAUDE.md), plugins (skills and agents), and MCP servers.
TaskSpawners — Orchestration engines that react to external triggers (GitHub, Cron) to automatically manage agent lifecycles.

TaskSpawner — Automatic Task Creation from External Sources

TaskSpawner watches external sources (e.g., GitHub Issues) and automatically creates Tasks for each discovered item.

                    polls         new issues
 TaskSpawner ─────────────▶ GitHub Issues
      │        ◀─────────────
      │
      ├──creates──▶ Task: fix-bugs-1
      └──creates──▶ Task: fix-bugs-2

Examples

Create PRs automatically

Add a token to your workspace config:

workspace:
  repo: https://github.com/your-org/repo.git
  ref: main
  token: <your-github-token>

kelos run -p "Fix the bug described in issue #42 and open a PR with the fix"

The gh CLI and GITHUB_TOKEN are available inside the agent container, so the agent can push branches and create PRs autonomously.

Auto-fix GitHub issues with TaskSpawner

Create a TaskSpawner to automatically turn GitHub issues into agent tasks:

apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: fix-bugs
spec:
  when:
    githubIssues:
      labels: [bug]
      state: open
  taskTemplate:
    type: claude-code
    workspaceRef:
      name: my-workspace
    credentials:
      type: oauth
      secretRef:
        name: claude-oauth-token
    promptTemplate: "Fix: {{.Title}}\n{{.Body}}"
  pollInterval: 5m

kubectl apply -f taskspawner.yaml

TaskSpawner polls for new issues matching your filters and creates a Task for each one.

Chain tasks with dependencies

Use dependsOn to chain tasks into pipelines. A task in Waiting phase stays paused until all its dependencies succeed:

kelos run -p "Scaffold a new user service" --name scaffold --branch feature/user-service
kelos run -p "Write tests for the user service" --name write-tests --depends-on scaffold --branch feature/user-service

Tasks sharing the same branch are serialized automatically — only one runs at a time.
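
The two scheduling rules above (wait for dependencies, one running task per branch) can be expressed as a single readiness check. The Go sketch below is illustrative only: the `Task` struct, the `canStart` function, and the `Succeeded` phase name are assumptions, not Kelos's actual types.

```go
package main

import "fmt"

// Task is a hypothetical, trimmed-down view of a Kelos Task.
// Only the "Waiting" and "Running" phases are named in this document;
// "Succeeded" is an assumed name for the success phase.
type Task struct {
	Name      string
	Branch    string
	DependsOn []string
	Phase     string
}

// canStart reports whether a waiting task may run: every dependency must have
// succeeded, and no other task may be running on the same branch.
func canStart(t Task, all map[string]Task) bool {
	for _, dep := range t.DependsOn {
		if d, ok := all[dep]; !ok || d.Phase != "Succeeded" {
			return false
		}
	}
	for name, other := range all {
		sameBranch := t.Branch != "" && other.Branch == t.Branch
		if name != t.Name && sameBranch && other.Phase == "Running" {
			return false
		}
	}
	return true
}

func main() {
	tasks := map[string]Task{
		"scaffold": {Name: "scaffold", Branch: "feature/user-service", Phase: "Running"},
		"write-tests": {
			Name:      "write-tests",
			Branch:    "feature/user-service",
			DependsOn: []string{"scaffold"},
			Phase:     "Waiting",
		},
	}
	fmt.Println(canStart(tasks["write-tests"], tasks)) // false: scaffold has not succeeded

	s := tasks["scaffold"]
	s.Phase = "Succeeded"
	tasks["scaffold"] = s
	fmt.Println(canStart(tasks["write-tests"], tasks)) // true
}
```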

YAML equivalent

apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: scaffold
spec:
  type: claude-code
  prompt: "Scaffold a new user service with CRUD endpoints"
  credentials:
    type: oauth
    secretRef:
      name: claude-oauth-token
  workspaceRef:
    name: my-workspace
  branch: feature/user-service
---
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: write-tests
spec:
  type: claude-code
  prompt: "Write comprehensive tests for the user service"
  credentials:
    type: oauth
    secretRef:
      name: claude-oauth-token
  workspaceRef:
    name: my-workspace
  branch: feature/user-service
  dependsOn: [scaffold]

Downstream tasks can reference upstream results in their prompt using {{.Deps}}:

apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: open-pr
spec:
  type: claude-code
  prompt: |
    Open a PR for branch {{index .Deps "write-tests" "Results" "branch"}}.
  credentials:
    type: oauth
    secretRef:
      name: claude-oauth-token
  workspaceRef:
    name: my-workspace
  branch: feature/user-service
  dependsOn: [write-tests]

The .Deps map is keyed by dependency Task name. Each entry has Results (key-value map with branch, commit, pr, etc.) and Outputs (raw output lines). See examples/07-task-pipeline for a full three-stage pipeline.

Inject agent instructions and MCP servers

Use AgentConfig to bundle project-wide instructions, plugins, and MCP servers:

apiVersion: kelos.dev/v1alpha1
kind: AgentConfig
metadata:
  name: my-config
spec:
  agentsMD: |
    # Project Rules
    Follow TDD. Always write tests first.
  mcpServers:
    - name: github
      type: http
      url: https://api.githubcopilot.com/mcp/
      headers:
        Authorization: "Bearer <token>"

kelos run -p "Fix the bug" --agent-config my-config

agentsMD is written to ~/.claude/CLAUDE.md (user-level, additive with the repo's own instructions).
plugins are mounted as plugin directories and passed via --plugin-dir.
mcpServers are written to the agent's native MCP configuration. Supports stdio, http, and sse transport types.

See the full AgentConfig spec for plugins, skills, and agents configuration.

Autonomous self-development pipeline

This is a real-world TaskSpawner that picks up every open issue, investigates it, opens (or updates) a PR, self-reviews, and ensures CI passes — fully autonomously. When the agent can't make progress, it labels the issue kelos/needs-input and stops. Remove the label to re-queue it.

 ┌────────────────────────────────────────────────────────────────┐
 │                        Feedback Loop                           │
 │                                                                │
 │  ┌─────────────┐  polls  ┌────────────────┐                    │
 │  │ TaskSpawner │───────▶ │ GitHub Issues  │                    │
 │  └──────┬──────┘         │ (open, no      │                    │
 │         │                │  needs-input)  │                    │
 │         │ creates        └────────────────┘                    │
 │         ▼                                                      │
 │  ┌─────────────┐  runs   ┌─────────────┐  opens PR   ┌───────┐ │
 │  │    Task     │───────▶ │    Agent    │────────────▶│ Human │ │
 │  └─────────────┘  in Pod │   (Claude)  │  or labels  │Review │ │
 │                          └─────────────┘  needs-input└───┬───┘ │
 │                                                          │     │
 │                                           removes label ─┘     │
 │                                           (re-queues issue)    │
 └────────────────────────────────────────────────────────────────┘

See self-development/kelos-workers.yaml for the full manifest and the self-development/ README for setup instructions.

The key pattern is excludeLabels: [kelos/needs-input] — this creates a feedback loop where the agent works autonomously until it needs human input, then pauses. Removing the label re-queues the issue on the next poll.
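
The label filter behind this loop can be sketched in Go as below. The `matches` function is a hypothetical helper, and treating `labels` as "all must be present" is an assumption; only the exclusion behavior is described in this document.

```go
package main

import "fmt"

// matches reports whether an issue should be picked up: it must carry every
// required label (an assumed semantic) and none of the excluded ones.
func matches(issueLabels, required, excluded []string) bool {
	have := make(map[string]bool, len(issueLabels))
	for _, l := range issueLabels {
		have[l] = true
	}
	for _, l := range required {
		if !have[l] {
			return false
		}
	}
	for _, l := range excluded {
		if have[l] {
			return false
		}
	}
	return true
}

func main() {
	exclude := []string{"kelos/needs-input"}

	// A fresh issue is eligible on the next poll.
	fmt.Println(matches([]string{"bug"}, nil, exclude)) // true

	// Once the agent adds the label, the issue is skipped until a human removes it.
	fmt.Println(matches([]string{"bug", "kelos/needs-input"}, nil, exclude)) // false
}
```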

Browse all ready-to-apply YAML manifests in the examples/ directory.

Orchestration Patterns

Autonomous Self-Development — Build a feedback loop where agents pick up issues, write code, self-review, and fix CI flakes until the task is complete. See the self-development pipeline.
Event-Driven Bug Fixing — Automatically spawn agents to investigate and fix bugs as soon as they are labeled in GitHub. See Auto-fix GitHub issues.
Fleet-Wide Refactoring — Orchestrate a "fan-out" where dozens of agents apply the same refactoring pattern across a fleet of microservices in parallel.
Hands-Free CI/CD — Embed agents as first-class steps in your deployment pipelines to generate documentation or perform automated migrations.
AI Worker Pools — Maintain a pool of specialized agents (e.g., "The Security Fixer") that developers can trigger via simple Kubernetes resources.

Reference

| Resource | Key Fields | Full Spec |
| --- | --- | --- |
| Task | `type`, `prompt`, `credentials`, `workspaceRef`, `dependsOn`, `branch` | Reference |
| Workspace | `repo`, `ref`, `secretRef`, `files` | Reference |
| AgentConfig | `agentsMD`, `plugins`, `mcpServers` | Reference |
| TaskSpawner | `when`, `taskTemplate`, `pollInterval`, `maxConcurrency` | Reference |

CLI Reference

| Command | Description |
| --- | --- |
| `kelos install` | Install Kelos CRDs and controller into the cluster |
| `kelos uninstall` | Uninstall Kelos from the cluster |
| `kelos init` | Initialize `~/.kelos/config.yaml` |
| `kelos run` | Create and run a new Task |
| `kelos get <resource> [name]` | List resources or view a specific resource (`tasks`, `taskspawners`, `workspaces`) |
| `kelos delete <resource> <name>` | Delete a resource |
| `kelos logs <task-name> [-f]` | View or stream logs from a task |
| `kelos suspend taskspawner <name>` | Pause a TaskSpawner |
| `kelos resume taskspawner <name>` | Resume a paused TaskSpawner |

See full CLI reference for all flags and options.

Security Considerations

Kelos runs agents in isolated, ephemeral Pods with no access to your host machine, SSH keys, or other processes. The risk surface is limited to what the injected credentials allow.

What agents CAN do: Push branches, create PRs, and call the GitHub API using the injected GITHUB_TOKEN.

What agents CANNOT do: Access your host, read other pods, or use any credentials beyond what you explicitly inject. Repository access is bounded by the injected token, so a token scoped to a single repository cannot reach others.

Best practices:

Scope your GitHub tokens. Use fine-grained Personal Access Tokens restricted to specific repositories instead of broad repo-scoped classic tokens.
Enable branch protection. Require PR reviews before merging to main. Agents can push branches and open PRs, but protected branches prevent direct pushes to your default branch.
Use maxConcurrency and maxTotalTasks. Limit how many tasks a TaskSpawner can create to prevent runaway agent activity.
Use podOverrides.activeDeadlineSeconds. Set a timeout to prevent tasks from running indefinitely.
Audit via Kubernetes. Every agent run is a first-class Kubernetes resource — use kubectl get tasks and cluster audit logs to track what was created and by whom.

About --dangerously-skip-permissions: Claude Code uses this flag for non-interactive operation. The flag disables interactive approval prompts, which is necessary for autonomous execution. Despite the name, the risk is contained: agents run inside ephemeral containers with no host access, so what the agent can do is bounded by the pod and the credentials you inject.

Kelos uses standard Kubernetes RBAC — use namespace isolation to separate teams. Each TaskSpawner automatically creates a scoped ServiceAccount and RoleBinding.

Cost and Limits

Running AI agents costs real money. Here's how to stay in control:

Model costs vary significantly. Opus is the most capable but most expensive model. Use spec.model (or model in config) to choose cheaper models like Sonnet for routine tasks and reserve Opus for complex work. Check the API pricing page for current rates.

Use maxConcurrency to cap spend. Without it, a TaskSpawner can create unlimited concurrent tasks. If 100 issues match your filter on first poll, that's 100 simultaneous agent runs. Always set a limit:

spec:
  maxConcurrency: 3      # max 3 tasks running at once
  maxTotalTasks: 50       # stop after 50 total tasks

Use podOverrides.activeDeadlineSeconds to limit runtime. Set a timeout per task to prevent agents from running indefinitely:

spec:
  podOverrides:
    activeDeadlineSeconds: 3600  # kill after 1 hour

Or via the CLI:

kelos run -p "Fix the bug" --timeout 30m

Use suspend for emergencies. If costs are spiraling, pause a spawner immediately:

kelos suspend taskspawner my-spawner
# ... investigate ...
kelos resume taskspawner my-spawner

Rate limits. API providers enforce concurrency and token limits. If a task hits a rate limit mid-execution, it will likely fail. Use maxConcurrency to stay within your provider's limits.

FAQ

What agents does Kelos support?

Kelos supports Claude Code, OpenAI Codex, Google Gemini, and OpenCode out of the box. You can also bring your own agent image using the container interface.

Can I use Kelos without Kubernetes?

No. Kelos is built on Kubernetes Custom Resources and requires a Kubernetes cluster. For local development, use kind (kind create cluster) to create a single-node cluster on your machine.

Is it safe to give agents repo access?

Agents run in isolated, ephemeral Pods with no host access. Their capabilities are limited to what you inject — typically a scoped GitHub token. Use fine-grained PATs, branch protection, and maxConcurrency to control the blast radius. See Security Considerations.

How much does it cost to run?

Costs depend on the model and task complexity. Check the API pricing page for current rates. Use maxConcurrency, timeouts, and model selection to stay in budget. See Cost and Limits.

Uninstall

kelos uninstall

Development

Build, test, and iterate with make:

make update             # generate code, CRDs, fmt, tidy
make verify             # generate + vet + tidy-diff check
make test               # unit tests
make test-integration   # integration tests (envtest)
make test-e2e           # e2e tests (requires cluster)
make build              # build binary
make image              # build docker image

Contributing

Fork the repo and create a feature branch.
Make your changes and run make verify to ensure everything passes.
Open a pull request with a clear description of the change.

For significant changes, please open an issue first to discuss the approach.

We welcome contributions of all kinds — see good first issues for places to start.

License

Apache License 2.0
