DeepOps is a hackathon project for a live incident-response workflow:
- ingest production issues
- persist a canonical incident record
- diagnose likely root cause
- generate a fix
- gate deployment through policy and human approval
- escalate by phone when the blast radius is too high
- deploy and track the result
- trace the full loop for optimization
The current repo now contains a real FastAPI backend, Aerospike-backed incident storage, sponsor integration wrappers, approval and escalation flows, and the shared incident schema used across backend and frontend work.
Core pipeline:
ingest -> stored -> diagnosing -> fixing -> gating -> deploying -> resolved
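The stage order above can be modeled as a small state machine. This is a minimal sketch, not the repo's orchestrator: the `Stage` enum and the `next_stage` and `can_advance` helpers are hypothetical names, and it assumes only single forward steps are legal (the guided-replan loop is not modeled).

```python
from enum import Enum
from typing import Optional


class Stage(str, Enum):
    """Pipeline stages, declared in the order listed in this README."""
    INGEST = "ingest"
    STORED = "stored"
    DIAGNOSING = "diagnosing"
    FIXING = "fixing"
    GATING = "gating"
    DEPLOYING = "deploying"
    RESOLVED = "resolved"


# Enum definition order doubles as the pipeline order.
ORDER = list(Stage)


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None once resolved."""
    idx = ORDER.index(current)
    return ORDER[idx + 1] if idx + 1 < len(ORDER) else None


def can_advance(current: Stage, target: Stage) -> bool:
    """Assumption: no skipped stages and no rollbacks."""
    return next_stage(current) is target
```

A transition check such as `can_advance(Stage.GATING, Stage.DEPLOYING)` could guard status updates before they are written to the incident record.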
Primary integrations:
- Airbyte: issue ingestion trigger path
- Aerospike: incident persistence
- Macroscope: diagnosis path
- Kiro: fix generation path
- Auth0: approval and approval-context routing
- Bland: phone escalation for high-risk incidents
- TrueFoundry: deployment target
- Overmind and Overclaw: tracing and optimization
Repo layout:
- agent/: shared contracts, orchestrator, tracing, detector, severity, diagnosis/fix runtime
- agents/: Overclaw-facing agent entrypoints
- server/: FastAPI backend, API routes, services, sponsor integration wrappers, tests
- infra/aerospike/: local Aerospike config and Docker compose
- docs/: architecture, guide, schema, alignment docs, task packs
- data/: datasets and fixtures used by agent optimization work
- tests/: agent-side tests
Important docs:
- docs/implementation-alignment.md
- docs/incident.schema.json
- docs/incident-example.json
- docs/ayush/person-a-policy.md
- docs/ayush/remaining-work-instructions.md
The FastAPI backend entrypoint is:
server/main.py
Main routes:
- GET /api/health
- GET /api/incidents
- GET /api/incidents/{incident_id}
- POST /api/incidents
- POST /api/ingest/demo-app
- POST /api/ingest/airbyte-sync
- GET /api/incidents/stream
- POST /api/agent/run-once
- POST /api/approval/{incident_id}/decision
- POST /api/webhooks/bland
- POST /api/webhooks/truefoundry
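The GET routes above can be exercised from Python with only the standard library. This is a rough sketch: `api_url` and `get_json` are hypothetical helpers, the base URL assumes the host and port from the uvicorn command in this README, and no particular response shape is assumed beyond the body being JSON.

```python
import json
from urllib.request import urlopen

# Assumption: the backend is running locally on the README's host and port.
BASE_URL = "http://127.0.0.1:8000"


def api_url(path: str, base_url: str = BASE_URL) -> str:
    """Join a base URL and a route path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """Fetch a route and parse the response body as JSON."""
    with urlopen(api_url(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `get_json("/api/health")` and `get_json("/api/incidents")` would return whatever JSON those routes emit.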
Use Python 3.14+ in the current workspace environment.
Install the required runtime packages as needed:
python -m pip install fastapi uvicorn pytest aerospike
Create a local .env file. The repo already ignores it.
Minimum useful keys for the full live demo path:
OPENAI_API_KEY=
OVERMIND_API_KEY=
TRUEFOUNDRY_API_KEY=
BLAND_API_KEY=
BLAND_PHONE_NUMBER=
BLAND_WEBHOOK_URL=
AUTH0_DOMAIN=
AUTH0_CLIENT_ID=
AUTH0_CLIENT_SECRET=
AUTH0_REDIRECT_URI=
AUTH0_MANAGEMENT_AUDIENCE=
AEROSPIKE_HOST=127.0.0.1
AEROSPIKE_PORT=3000
AEROSPIKE_NAMESPACE=deepops
AEROSPIKE_SET=incidents
DEEPOPS_ALLOW_IN_MEMORY_STORE=false
Optional depending on how live you want the demo:
AUTH0_ORGANIZATION_ID=
AUTH0_APPROVAL_CONNECTION=
AIRBYTE_API_URL=
AIRBYTE_API_KEY=
DEEPOPS_DEMO_APP_BASE_URL=
MACROSCOPE_API_KEY=
ANTHROPIC_API_KEY=
The repo includes a real local Aerospike setup under infra/aerospike/.
Start it with:
docker compose -f infra/aerospike/docker-compose.yml up -d
This config brings up:
- host 127.0.0.1
- port 3000
- namespace deepops
- set incidents
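The AEROSPIKE_* keys from the .env file map onto these values. A minimal sketch of reading them follows, assuming the defaults listed above; `aerospike_settings` and `incident_key` are hypothetical helpers, not functions from the repo. The `hosts` config shape and the `(namespace, set, primary_key)` key tuple follow the Aerospike Python client's conventions.

```python
import os
from typing import Mapping, Tuple


def aerospike_settings(env: Mapping[str, str] = os.environ) -> Tuple[dict, str, str]:
    """Build a client config plus namespace and set name from environment keys."""
    host = env.get("AEROSPIKE_HOST", "127.0.0.1")
    port = int(env.get("AEROSPIKE_PORT", "3000"))
    namespace = env.get("AEROSPIKE_NAMESPACE", "deepops")
    set_name = env.get("AEROSPIKE_SET", "incidents")
    # The Aerospike Python client expects hosts as a list of (host, port) tuples.
    config = {"hosts": [(host, port)]}
    return config, namespace, set_name


def incident_key(incident_id: str, env: Mapping[str, str] = os.environ) -> tuple:
    """Record key tuple: (namespace, set, primary key)."""
    _, namespace, set_name = aerospike_settings(env)
    return (namespace, set_name, incident_id)
```

With the aerospike package installed, the returned config would typically be passed to `aerospike.client(config)`; how the repo's storage service actually builds its client is not shown here.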
Run the backend with:
uvicorn server.main:app --host 127.0.0.1 --port 8000 --reload
Health check:
curl http://127.0.0.1:8000/api/health
Run the full current suite with:
python -m pytest -q tests server/tests
The repo includes an Overclaw workspace and registered agent config under .overclaw/.
Useful commands:
overclaw agent list
overclaw setup deepops-person-a --fast --policy docs/ayush/person-a-policy.md
overclaw optimize deepops-person-a
overclaw setup and overclaw optimize require a valid model API key in the environment.
The demo is designed around three flows:
- autonomous self-healing
- human approval and guided replan
- phone escalation for high-risk incidents
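One way to picture how an incident is routed into one of these three flows is a small gate function. This is an illustrative sketch only: the `GateInput` fields, the 0-100 blast-radius score, and both thresholds are assumptions made for the example, not the repo's actual policy.

```python
from dataclasses import dataclass


@dataclass
class GateInput:
    blast_radius: int          # hypothetical 0-100 impact score
    policy_allows_auto: bool   # hypothetical result of a policy check


def choose_flow(
    gate: GateInput,
    approval_threshold: int = 30,     # assumed cutoff for human approval
    escalation_threshold: int = 70,   # assumed cutoff for phone escalation
) -> str:
    """Pick one of the three demo flows for an incident."""
    # The highest-risk incidents go straight to phone escalation.
    if gate.blast_radius >= escalation_threshold:
        return "phone_escalation"
    # Policy denial or medium risk requires a human decision.
    if not gate.policy_allows_auto or gate.blast_radius >= approval_threshold:
        return "human_approval"
    # Low risk and policy-approved: deploy the fix autonomously.
    return "autonomous"
```

Checking escalation first means a high blast radius always triggers the phone flow, regardless of what the policy check returns.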
The strongest backend path is already in the repo. The remaining work is mostly final integration hardening, frontend wiring, and live sponsor rehearsal.