Autonomous CI/CD Repair • Zero Configuration • Real-Time Observability
- The Problem
- The Solution
- Key Features
- Architecture
- Quick Start
- Configuration
- API Reference
- The Cognitive Core
- Security
- Neural Dashboard
- Deployment
- Tech Stack
- Contributing
- License
Every developer knows the pain: you push code, grab a coffee, and return to a red CI badge. The error log is cryptic. The stack trace points to a file you didn't touch. You spend 30 minutes deciphering what went wrong.
Traditional CI/CD tools are blind. They execute scripts and report exit codes—but they don't understand the errors they produce.
| Impact | Reality |
|---|---|
| Developer Time | Average 23 minutes per failed build investigation |
| Lost Productivity | $100K+ annually for mid-size teams |
| Context Switching | Breaks flow state, delays feature delivery |
| Repeated Failures | Same error patterns occur across projects |
TALOS (The Autonomous Lifecycle Operations System) is not a pipeline. It's a digital organism that:
- Observes — Watches your GitHub repos via webhooks for build failures
- Reasons — Uses Gemini 3 to analyze logs, build dependency graphs, and identify the root cause
- Heals — Generates fixes in an isolated sandbox, verifies them, and opens a Pull Request
- Shows — Streams its "thought process" to a real-time dashboard, so you see exactly what it's doing
"TALOS doesn't just tell you what's broken—it fixes it for you."
Without TALOS: Build Failed → Read Logs → Debug → Fix → Push → Wait → Hope
With TALOS: Build Failed → TALOS Fixes It → Review PR → Merge
| Feature | Description |
|---|---|
| Patient Zero Detection | Uses dependency graph analysis to distinguish where the error manifests from where the bug originates |
| Verification Loop | Tests fixes in isolated E2B Firecracker microVMs before proposing them |
| Real-Time Neural Dashboard | Watch the agent think and act via Server-Sent Events (SSE) streaming |
| Multi-Attempt Reasoning | If a fix fails verification, TALOS learns from the error and tries again (up to 3 attempts) |
| Zero-Config Security | GitHub App model with fine-grained permissions, HMAC webhook verification |
| Polyglot Support | Node.js, Python, TypeScript, Rust—auto-detected, no setup needed |
| Duplicate PR Prevention | Automatically detects existing TALOS PRs to avoid spam |
| Allow Retry | Dashboard button to bypass duplicate detection when you want a fresh fix attempt |
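The duplicate-PR check and the Allow Retry override can be pictured as a filter over a repository's open pull requests. The sketch below is only an illustration: the `talos/` branch prefix, the `should_open_pr` name, and the `allow_retry` flag are assumptions, not names taken from the TALOS codebase. The input mirrors the `state` and `head.ref` fields found on GitHub pull request objects.

```python
from typing import Iterable, Mapping

# Hypothetical branch prefix; the naming scheme TALOS really uses is not documented here.
TALOS_BRANCH_PREFIX = "talos/"


def should_open_pr(
    pulls: Iterable[Mapping],
    allow_retry: bool = False,
    prefix: str = TALOS_BRANCH_PREFIX,
) -> bool:
    """Return True when a new fix PR may be opened for the repository."""
    if allow_retry:
        # The dashboard override skips duplicate detection entirely.
        return True
    for pr in pulls:
        branch = pr.get("head", {}).get("ref", "")
        if pr.get("state") == "open" and branch.startswith(prefix):
            # An earlier fix PR is still open, so stay quiet.
            return False
    return True
```

With one open `talos/fix-123` pull request, `should_open_pr(pulls)` returns `False`, and passing `allow_retry=True` returns `True`.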
┌─────────────────────────────────────────────────────────────────────────┐
│                           TALOS ARCHITECTURE                            │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│    GitHub                       FastAPI Backend             Frontend    │
│    ───────                      ──────────────              ─────────   │
│                                                                         │
│  ┌─────────┐      Webhook       ┌─────────────┐           ┌──────────┐  │
│  │workflow │───────────────────▶│   Nervous   │   SSE     │  Neural  │  │
│  │  _run   │                    │   System    │──────────▶│Dashboard │  │
│  │ failure │                    └──────┬──────┘           └──────────┘  │
│  └─────────┘                           │                                │
│                                        ▼                                │
│  ┌─────────┐                  ┌─────────────────┐                       │
│  │Supabase │◀─────────────────│    Event Bus    │                       │
│  │   DB    │   Persistence    │     (Redis)     │                       │
│  └─────────┘                  └────────┬────────┘                       │
│                                        │                                │
│                                        ▼                                │
│                               ┌─────────────────┐                       │
│                               │    Gemini 3     │ ◀── ReAct Loop        │
│                               │     (Brain)     │     Chain of Thought  │
│                               │   Flash + Pro   │     Dual-Model System │
│                               └────────┬────────┘                       │
│                                        │                                │
│                                        ▼                                │
│                               ┌─────────────────┐                       │
│                               │   E2B Sandbox   │ ◀── Firecracker μVM   │
│                               │     (Hands)     │     ~150ms boot time  │
│                               └────────┬────────┘                       │
│                                        │                                │
│                                        ▼                                │
│  ┌─────────┐                  ┌─────────────────┐                       │
│  │  Pull   │◀─────────────────│     GitHub      │                       │
│  │ Request │    PR Created    │       API       │                       │
│  └─────────┘                  └─────────────────┘                       │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
TALOS implements the OODA (Observe-Orient-Decide-Act) cognitive loop, a decision-making framework used by fighter pilots:
| Phase | Component | Action |
|---|---|---|
| Observe | Webhook Handler | Receives workflow_run failure events from GitHub |
| Orient | Repomix + Perception | Assembles code context, normalizes logs, extracts stack trace DNA |
| Decide | Gemini 3 + ReAct | Reasons about root cause, generates targeted fix |
| Act | E2B Sandbox | Applies fix, runs verification, creates PR |
┌──────────┐ ┌─────────┐ ┌──────────┐ ┌─────────┐ ┌────────┐ ┌──────────┐
│ 1.Detect │──▶│2.Analyze│──▶│3.Diagnose│──▶│ 4. Fix │──▶│5.Verify│──▶│ 6.Deploy │
│ │ │ │ │ │ │ │ │ │ │ │
│ Webhook │ │ Logs │ │ Gemini │ │ Patch │ │Sandbox │ │ PR │
│ Received │ │ Parsed │ │ Reasons │ │ Applied │ │ Tests │ │ Created │
└──────────┘ └─────────┘ └──────────┘ └─────────┘ └────────┘ └──────────┘
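The Fix and Verify stages, combined with the "up to 3 attempts" retry rule from the feature table, amount to a bounded loop that feeds each failed verification back into the next proposal. The following is a minimal sketch under stated assumptions: `heal`, `propose_fix`, and `verify` are hypothetical names, and the two callables stand in for the Gemini call and the sandbox test run respectively.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # matches the "up to 3 attempts" limit stated in the feature table


def heal(
    logs: str,
    propose_fix: Callable[[str, list[str]], str],
    verify: Callable[[str], tuple[bool, str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> Optional[str]:
    """Return the first patch that passes verification, or None if all attempts fail."""
    feedback: list[str] = []  # verification output from earlier failed attempts
    for _ in range(max_attempts):
        patch = propose_fix(logs, feedback)  # stand-in for the model call
        passed, output = verify(patch)       # stand-in for the sandbox run
        if passed:
            return patch
        feedback.append(output)              # the next attempt sees why this one failed
    return None
```

Only a patch that passes `verify` is returned, which corresponds to the rule that a PR is opened after verification succeeds rather than before.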
- Python 3.11+ with pip
- Node.js 18+ with npm
- Redis (for real-time event streaming)
- GitHub Account (to install the app)
- API Keys: Gemini, E2B, Supabase
Install the TALOS GitHub App on the repositories you want it to watch, then clone the project:
git clone https://github.com/timix648/TALOS.git
cd TALOS
# Create virtual environment
cd api
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\Activate.ps1
# Install dependencies
pip install -e .
# Create .env file (see Configuration section)
cp .env.example .env
# Edit .env with your keys
# Start Redis (in a separate terminal)
redis-server
# Start the API server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# In a new terminal, from the repository root
cd web
npm install
# Create .env.local file
cp .env.example .env.local
# Edit .env.local with your settings
# Start the development server
npm run dev
For GitHub to send webhooks to your local machine:
# Using Cloudflared (recommended)
cloudflared tunnel --url http://localhost:8000
# Or using ngrok
ngrok http 8000
Update your GitHub App webhook URL to the tunnel URL followed by /webhook, the path of the webhook receiver.
Push a bug to a watched repo. TALOS will:
- Detect the failure via webhook
- Clone the repo into an isolated sandbox
- Analyze the error with Gemini 3
- Generate and verify a fix
- Open a Pull Request
Create api/.env:
# --- API CONFIG ---
ENV=development
PORT=8000
# --- GITHUB APP SECRETS ---
GITHUB_APP_ID=123456
GITHUB_WEBHOOK_SECRET=your_webhook_secret
GITHUB_PRIVATE_KEY_PATH=/path/to/your-private-key.pem
# --- GEMINI BRAIN ---
# Comma-separated keys for automatic rotation on rate limits
GEMINI_API_KEYS=key1,key2,key3
# --- E2B SANDBOX ---
E2B_API_KEY=your_e2b_key
# --- REDIS (Required for SSE) ---
REDIS_URL=redis://localhost:6379
# --- SUPABASE MEMORY ---
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your_anon_key
Create web/.env.local:
# Backend API URL
NEXT_PUBLIC_API_URL=http://localhost:8000
# GitHub OAuth (for dashboard login)
GITHUB_CLIENT_ID=your_client_id
GITHUB_CLIENT_SECRET=your_client_secret
# GitHub App URL
NEXT_PUBLIC_GITHUB_APP_URL=https://github.com/apps/your-app-name
| Service | How to Get |
|---|---|
| GitHub App | Create a GitHub App subscribed to workflow_run events, with Actions (read), Contents (write), Pull requests (write), and Metadata (read) permissions |
| Gemini API | Google AI Studio |
| E2B | E2B Dashboard |
| Supabase | Supabase Dashboard |
| Redis | Local: brew install redis / apt install redis-server |
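The comment on `GEMINI_API_KEYS` describes automatic rotation when a key hits its rate limit. One way this could work is sketched below; `load_keys` and `KeyRotator` are hypothetical names, and the real rotation policy in TALOS may differ (for example, it might add back-off or track per-key cooldowns).

```python
import os


def load_keys(env_var: str = "GEMINI_API_KEYS") -> list[str]:
    """Split the comma-separated variable into a clean list of keys."""
    raw = os.environ.get(env_var, "")
    keys = [key.strip() for key in raw.split(",") if key.strip()]
    if not keys:
        raise RuntimeError(f"{env_var} is empty or unset")
    return keys


class KeyRotator:
    """Round-robin over the configured keys (illustrative, not the TALOS class)."""

    def __init__(self, keys: list[str]) -> None:
        self._keys = list(keys)
        self._index = 0

    @property
    def current(self) -> str:
        return self._keys[self._index]

    def rotate(self) -> str:
        """Advance to the next key, wrapping around; call this on a rate-limit error."""
        self._index = (self._index + 1) % len(self._keys)
        return self.current
```

A caller would use `rotator.current` for each request and call `rotator.rotate()` before retrying when the API reports a rate limit.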
| Endpoint | Method | Description |
|---|---|---|
| / | GET | Health check - returns TALOS status and capabilities |
| /webhook | POST | GitHub webhook receiver (HMAC verified) |
| /debug/auth | GET | Test GitHub App authentication |
| Endpoint | Method | Description |
|---|---|---|
| /events/stream/{run_id} | GET | SSE stream for real-time updates |
| /events/history/{run_id} | GET | Get past events for a completed run |
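A client of `/events/stream/{run_id}` receives the standard Server-Sent Events wire format: `event:` and `data:` fields, comment lines starting with `:`, and a blank line that ends each event. The parser below is a minimal sketch of that format only; `parse_sse` is a hypothetical helper, and the payload carried in `data` is an assumption because the event schema is not documented in this README.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line dispatches the event collected so far.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

The function accepts any iterable of lines, so it can be fed from a streaming HTTP response or from a saved transcript when replaying a run.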
| Endpoint | Method | Description |
|---|---|---|
| /runs/ | GET | List recent healing runs (paginated) |
| /runs/{run_id} | GET | Get details for a specific run |
| /runs/{run_id} | DELETE | Delete a run from history |
| /runs/{run_id}/allow-retry | POST | Allow TALOS to create a new fix PR |
| /runs/stats | GET | Aggregate statistics |
| /runs/latest/active | GET | Get the currently running heal (if any) |
| Endpoint | Method | Description |
|---|---|---|
| /installations/{id}/repos | GET | List repos for an installation |
| /installations/{id}/repos/{repo}/watch | POST | Enable TALOS for a repo |
| /installations/{id}/repos/{repo}/unwatch | POST | Disable TALOS for a repo |
| Endpoint | Method | Description |
|---|---|---|
| /stats | GET | Global healing statistics |
TALOS uses two Gemini 3 models for different cognitive tasks:
| Model | Role | Use Case |
|---|---|---|
| Gemini 3 Flash | Fast Responder | Log analysis, quick fixes, syntax errors |
| Gemini 3 Pro | Deep Thinker | Complex logic bugs, multi-file fixes |
TALOS automatically classifies errors for targeted fixes:
| Category | Examples | Fix Strategy |
|---|---|---|
| Syntax Errors | Missing brackets, typos, malformed imports | Direct pattern-based fix |
| Type Errors | Wrong types, null references | Type coercion, null checks |
| Logic Bugs | Off-by-one, wrong conditionals | Semantic analysis, test-driven |
| Config Issues | Missing deps, wrong scripts | Package.json/requirements fix |
| Import Errors | Module not found, circular deps | Dependency graph analysis |
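A first-pass version of this classification could be done with ordered pattern rules before any model is involved. The sketch below is an assumption about how such a pre-filter might look, not the TALOS classifier: the `RULES` table, the category labels, and the specific patterns are illustrative and cover only a few common Python and Node.js messages.

```python
import re

# Ordered rules: the first pattern that matches decides the category.
RULES = [
    ("import", re.compile(r"ModuleNotFoundError|Cannot find module|ImportError|circular import", re.I)),
    ("syntax", re.compile(r"SyntaxError|Unexpected token|IndentationError", re.I)),
    ("type", re.compile(r"TypeError|is not assignable to type|NoneType", re.I)),
    ("config", re.compile(r"Missing script|No matching distribution|ERESOLVE", re.I)),
    ("logic", re.compile(r"AssertionError|expected .+ (to equal|to be|but got)", re.I)),
]


def classify(log: str) -> str:
    """Return the first matching category label, or "unknown"."""
    for category, pattern in RULES:
        if pattern.search(log):
            return category
    return "unknown"
```

Logs that fall through to `"unknown"` would be left for the model to categorize.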
Not all errors originate where they crash. TALOS uses triangulation:
                    Stack Trace
                         │
                         ▼
              ┌─────────────────────┐
              │  Where did it CRASH │ ← "Crash Site"
              │   (symptom file)    │
              └──────────┬──────────┘
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
   ┌───────────┐   ┌───────────┐   ┌───────────┐
   │ Dependency│   │  Git Diff │   │  Import   │
   │   Graph   │   │  (recent) │   │  Analysis │
   └─────┬─────┘   └─────┬─────┘   └─────┬─────┘
         │               │               │
         └───────────────┼───────────────┘
                         ▼
              ┌─────────────────────┐
              │    PATIENT ZERO     │ ← Actual bug location
              │   (root cause)      │
              └─────────────────────┘
Logic:
- Parse stack trace to find "crash site"
- Build dependency graph (who imports what)
- Correlate with git diff (what changed recently)
- Triangulate: If crash file wasn't modified, but a caller was → Caller is Patient Zero
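The triangulation rule can be sketched as a breadth-first walk from the crash site through the files that import it, stopping at the nearest file that appears in the recent diff. This is an illustrative reading of the steps above, not the TALOS implementation: `find_patient_zero` and its arguments are hypothetical, and falling back to the crash site when no changed caller exists is an assumption.

```python
from collections import deque


def find_patient_zero(
    crash_site: str,
    imports: dict[str, set[str]],
    changed: set[str],
) -> str:
    """imports maps a file to the files it imports; changed holds recently modified files."""
    if crash_site in changed:
        return crash_site  # the symptom file itself was edited

    # Invert the import graph so we can walk from a file to its callers.
    importers: dict[str, set[str]] = {}
    for source, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(source)

    seen = {crash_site}
    queue = deque([crash_site])
    while queue:
        current = queue.popleft()
        for caller in sorted(importers.get(current, ())):
            if caller in seen:
                continue
            if caller in changed:
                return caller  # nearest modified caller is the suspect
            seen.add(caller)
            queue.append(caller)
    return crash_site  # assumption: no changed caller found, keep the crash site
```

For example, if `app.py` imports `service.py`, which imports `utils.py`, and only `app.py` changed, a crash in `utils.py` resolves to `app.py`.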
Raw CI logs are noisy. TALOS cleans them:
Raw Log                              Normalized Log
─────────────────────────────────────────────────────────
[0m[91mError: Cannot find...      →  Error: Cannot find module 'xyz'
npm WARN deprecated...               (stripped)
████████████░░░░░░░ 60%              (stripped)
[2026-01-29T12:30:45Z] FAIL       →  FAIL tests/app.test.js
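The clean-up shown above can be approximated with a few regular expressions: strip ANSI color codes, drop leading timestamps, and discard deprecation warnings and progress bars. The patterns and the `normalize` function are a sketch based on the four sample lines, not the actual TALOS normalizer, which likely handles many more cases.

```python
import re

ANSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")  # color and cursor escape sequences
TIMESTAMP = re.compile(r"^\[?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\]?\s*")
NOISE = re.compile(r"^npm WARN\b|[█░]{3,}")  # deprecation warnings and progress bars


def normalize(raw: str) -> str:
    """Return the log with escape codes, timestamps, and noise lines removed."""
    kept = []
    for line in raw.splitlines():
        line = ANSI.sub("", line)
        line = TIMESTAMP.sub("", line).rstrip()
        if not line or NOISE.search(line):
            continue
        kept.append(line)
    return "\n".join(kept)
```

Applied to the four raw lines in the example, this keeps only the `Error: Cannot find module 'xyz'` line and the `FAIL tests/app.test.js` line.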
TALOS is designed with security-first principles:
| Feature | Implementation |
|---|---|
| HMAC Webhook Verification | Constant-time signature comparison prevents timing attacks |
| Short-Lived Tokens | GitHub Installation Access Tokens expire in 1 hour |
| Isolated Execution | E2B Firecracker microVMs boot in ~150ms, fully sandboxed |
| Minimal Permissions | Only requests necessary scopes, repo-specific access |
| No Credential Storage | Tokens generated on-demand, never persisted |
| Private Key Protection | .gitignore blocks all .pem and .env files |
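GitHub signs each webhook delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal sketch of the constant-time check follows, using only the standard library; `verify_signature` is a hypothetical helper name, and `secret` corresponds to the `GITHUB_WEBHOOK_SECRET` setting.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # compare_digest runs in constant time, so timing does not reveal
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(expected, signature_header.encode("utf-8"))
```

The check must run on the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes and invalidate the signature.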
| Scope | Access | Why |
|---|---|---|
| actions | Read | Fetch workflow run logs |
| contents | Write | Create branches, push fixes |
| pull_requests | Write | Open PRs with fixes |
| metadata | Read | Repository information |
The real-time dashboard streams TALOS's thought process:
| Color | Phase | Events |
|---|---|---|
| 🔵 Blue | Initialization | mission_start, cloning |
| 🟡 Yellow | Perception | scouting, reading_code |
| 🟢 Cyan | Cognition | thinking, analyzing, diagnosing |
| 🟢 Green | Action | applying_fix, verifying, creating_pr, success |
| 🔴 Red | Error | failure, error_log |
| 🟠 Orange | Retry | retry |
| 🟣 Purple | Meta | thought_stream, mission_end |
| 🩷 Pink | Code | code_diff |
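On the client side, the table above reduces to a lookup from event type to phase and color. The mapping below restates the table as data; the `phase_for` helper and the gray `Unknown` fallback for unlisted event types are assumptions added for illustration.

```python
# Phase -> (color, event types), transcribed from the table above.
PHASES = {
    "Initialization": ("blue", ["mission_start", "cloning"]),
    "Perception": ("yellow", ["scouting", "reading_code"]),
    "Cognition": ("cyan", ["thinking", "analyzing", "diagnosing"]),
    "Action": ("green", ["applying_fix", "verifying", "creating_pr", "success"]),
    "Error": ("red", ["failure", "error_log"]),
    "Retry": ("orange", ["retry"]),
    "Meta": ("purple", ["thought_stream", "mission_end"]),
    "Code": ("pink", ["code_diff"]),
}

# Flatten into event type -> (phase, color) for constant-time lookup.
EVENT_TO_PHASE = {
    event: (phase, color)
    for phase, (color, events) in PHASES.items()
    for event in events
}


def phase_for(event_type: str) -> tuple[str, str]:
    """Return (phase, color) for an event type; the fallback is hypothetical."""
    return EVENT_TO_PHASE.get(event_type, ("Unknown", "gray"))
```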
- Stats Overview: Protected repos, success rate, total heals
- Run History: Click any past run to replay its timeline
- Live Indicator: Pulses when a run is in progress
- Auto-Scroll: Follows live runs, pauses for historical viewing
- Expandable Events: Click to see metadata and code diffs
- Allow Retry Button: Request a new fix attempt
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f api
# Stop services
docker-compose down
Services:
| Service | Port | Description |
|---|---|---|
| talos-api | 8000 | FastAPI backend |
| talos-web | 3000 | Next.js dashboard |
| talos-redis | 6379 | Event bus (internal) |
Run the schema in your Supabase SQL Editor:
-- Located in: api/db/schema.sql
-- Creates:
-- • installations (GitHub App installs)
-- • watched_repos (monitored repositories)
-- • healing_runs (repair history)
-- • healing_events (timeline events)
- Set ENV=production in .env
- Configure proper CORS origins (not *)
- Use managed Redis (e.g., Upstash, Redis Cloud)
- Set up SSL/TLS termination
- Enable Supabase Row Level Security
- Rotate API keys regularly
- Set up monitoring/alerting
| Technology | Purpose |
|---|---|
| FastAPI | Async Python web framework |
| Python 3.11+ | Runtime with modern typing |
| Pydantic | Data validation and settings |
| Redis | Pub/Sub for SSE streaming |
| Supabase | PostgreSQL database |
| httpx | Async HTTP client |
| Technology | Purpose |
|---|---|
| Gemini 3 Flash | Fast reasoning, log analysis |
| Gemini 3 Pro | Deep reasoning, complex fixes |
| E2B | Firecracker microVM sandboxes |
| Repomix | Codebase context assembly |
| Technology | Purpose |
|---|---|
| Next.js 15 | React framework with App Router |
| React 19 | UI components |
| Tailwind CSS | Styling |
| Framer Motion | Animations |
| Lucide Icons | Icon library |
| Technology | Purpose |
|---|---|
| Docker | Containerization |
| Cloudflared | Secure tunneling |
| GitHub Apps | OAuth + Webhooks |
We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
# Clone your fork
git clone https://github.com/YOUR_USERNAME/TALOS.git
# Install pre-commit hooks (optional)
pip install pre-commit
pre-commit install
# Run tests
# Subshells keep each command relative to the repository root
(cd api && pytest)
(cd web && npm test)
| Criteria | How TALOS Excels |
|---|---|
| The Code | Clean monorepo architecture, strict typing, modular design |
| The Demo | Real-time Neural Dashboard with SSE streaming |
| The Philosophy | "Species" that lives in the ecosystem, not a tool you invoke |
| Gemini Integration | Dual-model system, ReAct loop, Chain of Thought reasoning |
| Security | GitHub Apps, HMAC verification, isolated sandboxes |
| Innovation | Patient Zero algorithm, Visual Regression (planned) |
| Traditional Tools | TALOS |
|---|---|
| Report errors | Fix errors |
| Exit codes | Understanding |
| After-the-fact | Real-time |
| Manual intervention | Autonomous |
| Black box | Transparent (Neural Dashboard) |