
CX — Continuous eXecution

CI validates. CD ships. CX builds.

CX is an autonomous AI coding pipeline. It continuously picks up issues, implements them, runs tests, reviews the output, and prepares PRs for merge — without waiting for a human to context-switch into the problem.

The key insight: the bottleneck isn't the models. It's the process around them. CX is a development process designed for agents, not adapted from one designed for humans.

→ Read the full reasoning in CX.md


How It Works

A deterministic orchestrator runs on a timer. Each tick:

  1. Cleanup — Remove stale worktrees, release completed jobs
  2. Check capacity — How many agents are running? Room for more?
  3. Triage open PRs — Merge conflicts → rebaser, CI failures → fixer, pending review → reviewer
  4. Discover eligible issues — Unblocked, well-specified, no open PR
  5. Dispatch — Fan out work to agents in parallel

No AI tokens are spent on routing. State machines and GitHub labels handle all orchestration logic.
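The tick above can be sketched as a pure scheduling function. This is an illustrative sketch, not CX's actual internals — the `PR` fields and `plan_tick` name are hypothetical — but it shows the point: routing is deterministic, with no model calls.

```python
# Illustrative sketch of one scheduling pass; names and fields are
# hypothetical, not CX's actual internals.
from dataclasses import dataclass


@dataclass
class PR:
    number: int
    merge_conflict: bool = False
    ci_failed: bool = False
    awaiting_review: bool = False


def plan_tick(open_prs: list[PR], eligible_issues: list[int],
              running: int, max_concurrency: int = 3) -> list[tuple[str, int]]:
    """One deterministic tick: decide which agents to dispatch, no model calls."""
    slots = max_concurrency - running                  # 2. check capacity
    plan: list[tuple[str, int]] = []
    for pr in open_prs:                                # 3. triage open PRs first
        if slots <= 0:
            break
        if pr.merge_conflict:
            plan.append(("rebaser", pr.number))
        elif pr.ci_failed:
            plan.append(("fixer", pr.number))
        elif pr.awaiting_review:
            plan.append(("reviewer", pr.number))
        else:
            continue                                   # nothing to do for this PR
        slots -= 1
    for issue in eligible_issues:                      # 4. discover + 5. dispatch
        if slots <= 0:
            break
        plan.append(("implementer", issue))
        slots -= 1
    return plan
```

Because the function is deterministic, the same repo state always produces the same dispatch plan — PR triage always wins remaining slots before new issues are picked up.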

Agents

Agent        Model                         Job
Implementer  Cheap (e.g. minimax)          Read issue → write code → run tests → open PR
Fixer        Cheap                         Fix CI failures or review feedback → push
Reviewer     Capable (e.g. claude-sonnet)  Read diff → security, logic, alignment → approve or request changes
Rebaser      Cheap                         Resolve merge conflicts → push

Issue Lifecycle

Issue created
    │
    ▼
Backlog (classified: ready / blocked / excluded)
    │
    ▼ (unblocked, no open PR)
Implementer → PR created
    │
    ▼
CI runs ──────────── fail ──▶ Fixer ──▶ CI runs again
    │
    ▼ pass
Reviewer ────────── changes ──▶ Fixer ──▶ back to review
    │
    ▼ approved
ready-for-merge label → human merges

The DAG Replaces the Sprint

Tasks form a directed acyclic graph with explicit Blocked by #N dependencies. The system only picks up tasks where all blockers are closed. New tasks are added at the top while the swarm consumes from the bottom — no sprints, no planning ceremonies.
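Under this rule, task selection reduces to a filter over the dependency graph. A minimal sketch, assuming the `Blocked by #N` lines have already been parsed into a `deps` mapping (hypothetical shape, not CX's actual data model):

```python
# Hypothetical sketch: `deps` maps issue number -> set of blocker issue
# numbers (as parsed from "Blocked by #N" lines); `closed` is the set of
# closed issue numbers.
def ready_tasks(deps: dict[int, set[int]], closed: set[int]) -> list[int]:
    """Issues that are still open and whose every blocker is closed."""
    return [n for n, blockers in deps.items()
            if n not in closed and blockers <= closed]
```

Closing a blocker unlocks its dependents on the next tick — nothing needs replanning, which is why no sprint boundary is required.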


Quick Start

Prerequisites

  • Docker and Docker Compose
  • A GitHub personal access token with repo scope
  • An OpenRouter API key (all agent models are served via OpenRouter)

Setup

git clone https://github.com/agereaude/cx
cd cx

# Configure
cp .env.example .env
# Edit .env and fill in CX_REPO, OPENROUTER_API_KEY, GH_TOKEN

# Start
docker compose up

Configuration

All configuration is via environment variables. Copy .env.example and fill in values:

# Which repo to manage
CX_REPO=owner/repo

# How many agents run in parallel
CX_MAX_CONCURRENCY=3

# Seconds between orchestrator ticks
CX_TICK_INTERVAL=300

# Only pick up issues with the 'CX' label (opt-in mode)
CX_REQUIRE_LABEL=false

# Models (all via OpenRouter)
CX_IMPLEMENTER_MODEL=minimax/minimax-m2.5
CX_REVIEWER_MODEL=anthropic/claude-sonnet-4-5

# API keys
OPENROUTER_API_KEY=sk-or-...
GH_TOKEN=ghp_...
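A minimal sketch of reading these variables from the environment. The real settings layer lives in backend/cx/config.py; the defaults used here are just the example values from .env.example, not necessarily CX's actual defaults.

```python
# Minimal sketch of reading the variables above. The real settings layer
# lives in backend/cx/config.py; the defaults here are the example values
# from .env.example, shown for illustration.
import os


def load_config() -> dict:
    return {
        "repo": os.environ["CX_REPO"],  # required, e.g. "owner/repo"
        "max_concurrency": int(os.environ.get("CX_MAX_CONCURRENCY", "3")),
        "tick_interval": int(os.environ.get("CX_TICK_INTERVAL", "300")),
        "require_label": os.environ.get("CX_REQUIRE_LABEL", "false").lower() == "true",
    }
```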

Preparing Issues for CX

CX executes tasks autonomously, so issues must be specific enough that an agent can implement them without asking questions. If a competent junior developer would need to ask clarifying questions, the ticket isn't ready.

Pre-CX Checklist

  • Is it a subtask (not an epic)?
  • Are requirements specific and complete?
  • Are acceptance criteria testable?
  • Are dependencies declared with Blocked by #N?
  • Is there a single, decided approach (no "Options" sections)?
  • Could a junior dev implement this without asking questions?

If any answer is "no" — add needs-grooming label and don't mark it CX yet.
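The checklist lends itself to a cheap automated pre-screen. A hypothetical heuristic (not part of CX) that flags tickets missing the sections a ready spec should have, or still carrying an "Options" section:

```python
# Hypothetical pre-screen (not part of CX): flag tickets that are missing
# the sections a ready spec should have, or that still list open options.
REQUIRED_SECTIONS = ("## Requirements", "## Acceptance Criteria")


def needs_grooming(body: str) -> bool:
    missing_sections = not all(h in body for h in REQUIRED_SECTIONS)
    undecided = "## Options" in body   # a decided approach has no options left
    return missing_sections or undecided
```

A heuristic like this can only catch structural gaps — whether the requirements are actually complete still needs a human pass.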

Labels

Label            Meaning
CX               Eligible for autonomous implementation
draft            Not ready — WIP spec
needs-grooming   Requires decomposition or clarification
blocked          Waiting on a decision or dependency
ready-for-merge  Approved PR, ready for a human to merge
epic             Parent ticket with subtasks (not directly worked on)

Dependency Syntax

Dependencies must appear in the issue body for CX to parse them:

## Dependencies

Blocked by #101
Blocked by #102

Also set the native GitHub relationship (sidebar → "Tracked by" / blocked-by) for UI visualization.

Rule: Backend always blocks frontend. The API must exist before the UI can consume it.
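The dependency lines above are machine-parseable with a single pattern. An illustrative extraction (the regex is an assumption, not CX's actual parser): one `Blocked by #N` per line in the issue body.

```python
# Illustrative parser for the syntax above (not CX's actual implementation):
# matches one "Blocked by #N" per line in the issue body.
import re

BLOCKED_BY = re.compile(r"^Blocked by #(\d+)\s*$", re.MULTILINE)


def parse_blockers(issue_body: str) -> set[int]:
    return {int(n) for n in BLOCKED_BY.findall(issue_body)}
```

Anchoring the pattern to whole lines keeps prose like "this was blocked by #3 last week" from being misread as a dependency.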

Good vs Bad Tickets

# ✅ Good

Title: Add /login POST endpoint

## Context
We need JWT authentication. This is the first auth subtask.

## Requirements
- Accept email and password fields
- Validate against users table
- Return signed JWT using JWT_SECRET env var
- Return 401 for invalid credentials

## Acceptance Criteria
- [ ] POST /login returns 200 + token on valid credentials
- [ ] POST /login returns 401 on invalid credentials
- [ ] Tests cover both cases

## Dependencies
(none)

# ❌ Bad

Title: Add user authentication

Make the app work with users.

See skills/ISSUES.md for full issue management guidelines.


Dashboard

The Next.js dashboard shows:

  • Backlog — All issues classified by state (ready, blocked, in progress, needs fix, etc.)
  • Runs — Live and historical agent runs with step-by-step logs
  • Stats — Token usage and cost breakdown by model and agent type
  • Archive — Merged issues with cumulative cost tracking

Architecture

cx/
├── backend/
│   ├── cx/
│   │   ├── api/          # FastAPI routes
│   │   ├── graph/        # LangGraph orchestrator + agents
│   │   │   ├── orchestrator.py
│   │   │   ├── agents/   # implementer, fixer, reviewer, rebaser
│   │   │   └── nodes/    # workflow steps
│   │   ├── services/     # worktree, concurrency, token tracking
│   │   ├── tools/        # GitHub CLI wrappers
│   │   ├── models.py     # SQLAlchemy models
│   │   ├── backlog.py    # Issue classification
│   │   └── config.py     # Settings
│   └── pyproject.toml
│
├── frontend/
│   └── src/
│       ├── app/          # Next.js app router
│       └── components/   # Dashboard UI
│
├── docker-compose.yml
├── .env.example
├── CX.md                 # Philosophy and design decisions
└── skills/               # Process docs and agent skill templates
    ├── ISSUES.md         # Issue management guidelines
    └── REVIEW-PRS.md     # PR review criteria

Tech Stack

Backend: Python 3.11+, FastAPI, LangGraph, LangChain, SQLite (aiosqlite), Alembic

Frontend: Next.js 15, React 19, Tailwind CSS, SWR, Recharts, XYFlow


Next Steps

Post-Deployment Validation

The current lifecycle ends at merge. The natural extension is closing the loop after deployment.

Once a PR is merged and deployment is confirmed (e.g. via a deployment webhook or CI/CD status event), a Validator agent is spawned with read-only access to:

  • Production logs — to detect errors, exceptions, or anomalous patterns related to the change
  • Production database — to verify data integrity, expected schema state, or feature-specific records

The validator checks that the issue's acceptance criteria are met in the live environment — not just in tests.

PR merged → deployment confirmed
    │
    ▼
Validator agent spawned
    │
    ├── reads production logs
    ├── reads production DB (read-only)
    └── evaluates against original issue acceptance criteria
         │
         ├── ✅ all good → issue closed, loop complete
         │
         └── ❌ problem found
                  │
                  ▼
             New issue created (label: bug, CX)
             Body: references original issue, describes observed failure
                  │
                  ▼
             Loop starts again → Implementer picks it up

This turns CX from a development pipeline into a full delivery loop: code is not done when it ships — it is done when it works in production.

Validator Agent Design

Concern             Approach
Access scope        Read-only credentials; no write access to production systems
Log access          Tail/query logs scoped to the deployment window and relevant service
DB access           Read replica or snapshot — never the primary write path
Bug report quality  Validator includes log excerpts, query results, and a reference to the original issue so the new ticket is immediately actionable
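Assuming read-only log access, the first validator check could be as simple as scanning the deployment window for error-like lines. A hypothetical sketch — the log format and severity markers are assumptions, since the Validator agent is not yet built:

```python
# Hypothetical read-only check: scan log lines captured during the
# deployment window for error-like entries. The (timestamp, line) log
# format and severity markers are assumptions, not CX's actual interface.
import re
from datetime import datetime

ERROR_LINE = re.compile(r"\b(ERROR|CRITICAL|Traceback)\b")


def scan_deployment_window(log_lines: list[tuple[datetime, str]],
                           start: datetime, end: datetime) -> list[tuple[datetime, str]]:
    """Return (timestamp, line) pairs inside [start, end] that look like errors."""
    return [(ts, line) for ts, line in log_lines
            if start <= ts <= end and ERROR_LINE.search(line)]
```

Any findings would be quoted verbatim in the bug issue the validator files, which is what makes the new ticket immediately actionable.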

License

MIT — see LICENSE
