Agent Forge is a single-server agentic AI platform in which agents can request the creation and deployment of other agents, but only through strict policy gates. It is designed to be minimal yet production-minded, with a clear separation between orchestration, policy, durable workflows, deployment, and runtime isolation.
This repository gives you a working prototype with:
- LLM backend: Ollama only
- Orchestration: LangGraph (Python)
- Durable workflows: Temporal (Python SDK)
- Policy gate: Open Policy Agent (OPA)
- RAG: Qdrant vector DB + simple ingestion
- Tracing/Evals: OpenTelemetry (Phoenix default)
- Sandbox: local fallback runner with strict allowlist (Firecracker stub)
- Users submit objectives to the Gateway.
- The Orchestrator (LangGraph) creates a plan and can request new agents via the Deploy Controller.
- The Deploy Controller is the only component allowed to deploy agents. It calls OPA to authorize, then starts a Temporal workflow to build/evaluate/deploy.
- Agents are deployed locally on the same server, either:
  - as systemd user services (preferred on Linux), or
  - as managed subprocesses (fallback).
- Agents run as HTTP services on specific ports and can serve UIs or API endpoints.
- Registry keeps track of running instances, ports, and status.
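The registry mentioned in the last item can be pictured as a small JSON file keyed by agent instance ID. The sketch below is illustrative only: the `FileRegistry` class, its field names (`port`, `status`), and the file location are assumptions, not the repository's actual schema.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class FileRegistry:
    """Hypothetical file-backed registry: one JSON object keyed by agent instance ID."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def _save(self, data: dict) -> None:
        # Write to a temp file, then rename, so readers never see a half-written file.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
        tmp.replace(self.path)

    def upsert(self, agent_instance_id: str, port: int, status: str) -> None:
        data = self._load()
        data[agent_instance_id] = {"port": port, "status": status}
        self._save(data)

    def get(self, agent_instance_id: str) -> Optional[dict]:
        return self._load().get(agent_instance_id)


if __name__ == "__main__":
    registry = FileRegistry(Path(tempfile.mkdtemp()) / "registry.json")
    registry.upsert("doc-rag-1", port=9010, status="running")
    print(registry.get("doc-rag-1"))
```

A single JSON file is only reasonable under the single-server, single-writer assumption; concurrent writers would need locking or a real database.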
User / Agent
    |
    v
Gateway (FastAPI) ----> Orchestrator (LangGraph) ----> Deploy Controller (FastAPI)
    |                        |                              |
    |                        |                              +--> Temporal Workflows/Activities
    |                        |                                        |
    |                        |                                        +--> Sandbox Runner (local fallback)
    |                        |                                        +--> Systemd user service or subprocess
    |                        |
    |                        +--> RAG Service (Qdrant)
    |
    +--> Telemetry (OpenTelemetry -> Phoenix OTLP HTTP)
- Gateway (FastAPI)
  - User API
  - UI at /ui
  - Submits objectives and proxies requests to the orchestrator/deploy-controller
- Orchestrator (LangGraph)
  - Graph: objective → plan → decide_spawn → delegate → merge → decide_deploy → final
  - Can request spawn/deploy through Deploy Controller only
- Deploy Controller (FastAPI)
  - Only component allowed to deploy agents
  - Calls OPA
  - Starts Temporal workflows
  - Manages systemd/subprocess deployments
- Temporal Worker
  - Executes spawn/deploy workflows and activities
- RAG Service (FastAPI)
  - /ingest and /query
  - Uses Qdrant for vector search
- Sandbox Runner
  - Local fallback: allowlisted subprocess, timeouts, no network by default (best-effort)
  - Firecracker stub (interface only)
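As a rough illustration of the Sandbox Runner's local fallback, the following sketch runs a command only if its executable is on an allowlist and enforces a timeout. The function name `run_allowlisted` and the `CommandNotAllowed` exception are hypothetical, and the sketch does not attempt network isolation.

```python
import os
import subprocess
import sys
from typing import Iterable, List


class CommandNotAllowed(Exception):
    """Raised when the executable is not on the allowlist."""


def run_allowlisted(
    cmd: List[str], allowlist: Iterable[str], timeout_s: float = 5.0
) -> subprocess.CompletedProcess:
    # Match on the executable's base name so absolute paths still match.
    # This is a weak check: a stricter runner would pin full paths.
    exe = os.path.basename(cmd[0])
    if exe not in set(allowlist):
        raise CommandNotAllowed(exe)
    # shell=False (the default) avoids shell injection;
    # subprocess.TimeoutExpired is raised if the command exceeds timeout_s.
    return subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout_s, check=False
    )


if __name__ == "__main__":
    allowed = {os.path.basename(sys.executable)}
    result = run_allowlisted([sys.executable, "-c", "print('ok')"], allowed)
    print(result.returncode, result.stdout.strip())
```

An allowlist plus a timeout limits what runs and for how long, but it is not a security boundary by itself. That gap is what the Firecracker stub is meant to close.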
- Ollama: LLM backend
- LangGraph: Orchestration graph
- Temporal: Durable workflows
- OPA: Policy enforcement
- Qdrant: Vector search
- OpenTelemetry + Phoenix: tracing
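To make the Orchestrator's node order (objective → plan → decide_spawn → delegate → merge → decide_deploy → final) concrete, here is a dependency-free sketch that runs the same sequence as plain Python functions over a shared state dict. It is not LangGraph code, and every node body is a placeholder assumption; only the node names come from this README.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def plan(state: State) -> State:
    # Placeholder: a real planner would call the LLM.
    return {**state, "plan": [f"step for: {state['objective']}"]}


def decide_spawn(state: State) -> State:
    # Placeholder rule: spawn a helper only if the objective mentions "agent".
    return {**state, "spawn": "agent" in str(state["objective"]).lower()}


def delegate(state: State) -> State:
    results = ["delegated" if state["spawn"] else "handled locally"]
    return {**state, "results": results}


def merge(state: State) -> State:
    return {**state, "merged": "; ".join(state["results"])}


def decide_deploy(state: State) -> State:
    # Placeholder: deploy only when something was spawned.
    return {**state, "deploy": bool(state["spawn"])}


def final(state: State) -> State:
    return {**state, "done": True}


NODES: List[Callable[[State], State]] = [
    plan, decide_spawn, delegate, merge, decide_deploy, final,
]


def run(objective: str) -> State:
    state: State = {"objective": objective}
    for node in NODES:
        state = node(state)
    return state


if __name__ == "__main__":
    print(run("Build an ops triage agent"))
```

In the real graph, decide_spawn and decide_deploy would presumably be conditional edges rather than flags in the state. The linear loop here only shows the order in which the nodes see the state.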
/README.md
/pyproject.toml
/config/
  config.example.yaml
  config.schema.json
/scripts/
  start_all.sh
  start_temporal_worker.sh
  deps_instructions.md
/services/
  /gateway/
  /orchestrator/
  /deploy_controller/
  /rag/
  /sandbox_runner/
/libs/
  /schemas/
  /policy/
  /telemetry/
  /common/
  /temporal/
  /llm/
/agents/
  /templates/
  /examples/
| Service | Port |
|---|---|
| Gateway (UI) | 8000 (/ui) |
| Orchestrator | 8001 |
| Deploy Controller | 8002 |
| RAG Service | 8003 |
| Ollama | 11434 |
| OPA | 8181 |
| Qdrant | 6333 |
| Temporal | 7233 |
| Phoenix | 6006 |
| Doc‑RAG Agent | 9010 |
| Ops‑Triage Agent | 9011 |
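A quick way to see which of these services are up is to try a TCP connection to each port. The sketch below uses only the standard library; the `SERVICE_PORTS` mapping copies the table above, and the helper name `is_listening` is an assumption for illustration.

```python
import socket
from typing import Dict

# Ports copied from the table above.
SERVICE_PORTS: Dict[str, int] = {
    "gateway": 8000,
    "orchestrator": 8001,
    "deploy_controller": 8002,
    "rag": 8003,
    "ollama": 11434,
    "opa": 8181,
    "qdrant": 6333,
    "temporal": 7233,
    "phoenix": 6006,
    "doc_rag_agent": 9010,
    "ops_triage_agent": 9011,
}


def is_listening(host: str, port: int, timeout_s: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICE_PORTS.items():
        state = "up" if is_listening("127.0.0.1", port) else "down"
        print(f"{name:18} {port:5} {state}")
```

A successful connection only shows that something is listening on the port. The /health endpoints shown later are the better check for the agents themselves.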
- Python 3.10+ (3.11 recommended)
- uv for dependency management
- Local binaries for: Ollama, OPA, Qdrant, Temporal
cp config/config.example.yaml config/config.yaml
make install
ollama pull mistral
ollama serve
./opa run --server --addr 127.0.0.1:8181 ./libs/policy
./qdrant --storage-path ./data/qdrant
./temporalite start --ip 127.0.0.1 --port 7233
python -m phoenix.server.main --host 127.0.0.1 --port 6006
python -m tools.run_all
This starts:
- gateway
- orchestrator
- deploy_controller
- rag
- temporal worker
- Main UI: http://127.0.0.1:8000/ui
- Doc‑RAG agent UI: http://127.0.0.1:9010/
- Ops‑Triage agent UI: http://127.0.0.1:9011/
- Gateway or agent submits SpawnRequest to Deploy Controller.
- Deploy Controller sends request to OPA.
- If allowed → Temporal spawn_workflow creates a registry entry and a capability token.
- Response contains agent_instance_id.
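The OPA step in the spawn flow above could look roughly like this. OPA's Data API accepts `POST /v1/data/<path>` with an `{"input": ...}` body and returns `{"result": ...}`. The policy path `agentforge/allow` and the shape of the spawn request are assumptions; check libs/policy for the real package and rule names.

```python
import json
import urllib.request
from typing import Any, Dict

OPA_URL = "http://127.0.0.1:8181"
# Assumption: the policy lives in package "agentforge" and exposes an "allow" rule.
POLICY_PATH = "agentforge/allow"


def build_opa_request(spawn_request: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the spawn request as OPA input and target the Data API."""
    body = json.dumps({"input": spawn_request}).encode("utf-8")
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/{POLICY_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(opa_response: Dict[str, Any]) -> bool:
    # OPA omits "result" when the rule is undefined; treat that as a deny.
    return opa_response.get("result") is True


def authorize(spawn_request: Dict[str, Any], timeout_s: float = 3.0) -> bool:
    """Ask a running OPA server whether the spawn request is allowed."""
    request = build_opa_request(spawn_request)
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return is_allowed(json.load(resp))
```

Treating a missing or non-boolean result as a deny keeps the gate fail-closed: a typo in the policy path blocks spawns rather than silently allowing them.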
- Deploy Controller receives DeployRequest.
- OPA authorizes.
- Temporal deploy_workflow builds artifact and runs eval in sandbox.
- If eval passes → systemd/subprocess deployment.
- Registry updates with running status + port.
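The deploy steps above form a gated pipeline: each stage runs only if the previous one passed. The sketch below shows that control flow in plain Python with injected callables. It is not Temporal code, and the names `run_deploy` and `DeployOutcome` are hypothetical. In the actual system each callable would presumably be a Temporal activity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DeployOutcome:
    status: str  # "denied", "eval_failed", or "running"
    steps: List[str] = field(default_factory=list)
    port: Optional[int] = None


def run_deploy(
    request: Dict[str, object],
    authorize: Callable[[Dict[str, object]], bool],
    build: Callable[[Dict[str, object]], str],
    evaluate: Callable[[str], bool],
    launch: Callable[[str], int],
) -> DeployOutcome:
    outcome = DeployOutcome(status="denied")
    # Gate 1: policy decision (OPA in the real system).
    if not authorize(request):
        return outcome
    outcome.steps.append("authorized")
    artifact = build(request)
    outcome.steps.append("built")
    # Gate 2: the sandboxed eval must pass before anything is launched.
    if not evaluate(artifact):
        outcome.status = "eval_failed"
        return outcome
    outcome.steps.append("evaluated")
    outcome.port = launch(artifact)
    outcome.status = "running"
    outcome.steps.append("deployed")
    return outcome


if __name__ == "__main__":
    result = run_deploy(
        {"agent": "ops-triage"},
        authorize=lambda r: True,
        build=lambda r: "artifact",
        evaluate=lambda a: True,
        launch=lambda a: 9011,
    )
    print(result)
```

Passing the stages in as callables keeps the ordering logic testable without OPA, Temporal, or systemd running.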
curl -s -X POST http://127.0.0.1:8000/objective \
-H 'Content-Type: application/json' \
-d '{"objective":"Build an ops triage agent","request_id":"11111111-1111-1111-1111-111111111111"}'
curl -s -X POST http://127.0.0.1:8002/spawn \
-H 'Content-Type: application/json' \
-d @agents/examples/spawn_request.json
curl -s -X POST http://127.0.0.1:8002/deploy \
-H 'Content-Type: application/json' \
-d @agents/examples/deploy_request.json
curl -s http://127.0.0.1:8002/registry
curl -s http://127.0.0.1:8002/workflow_status/<WORKFLOW_ID>
curl -s -X POST http://127.0.0.1:8002/agent/stop/<AGENT_INSTANCE_ID>
Health:
curl -s http://127.0.0.1:9010/health
Query:
curl -s -X POST http://127.0.0.1:9010/query \
-H "content-type: application/json" \
-d '{"query":"What is this platform?"}'
Health:
curl -s http://127.0.0.1:9011/health
Triage:
curl -s -X POST http://127.0.0.1:9011/triage \
-H "content-type: application/json" \
-d '{"issue":"Deploy failed with timeout","severity":"high","context":{"service":"gateway"}}'
curl -s -X POST http://127.0.0.1:8003/ingest \
-H "content-type: application/json" \
-d '{"texts":["Agent Forge is a single-server agentic platform.","It uses Ollama, Temporal, OPA, and Qdrant."]}'
python - <<'PY'
# Ingest this README into the RAG Service as a single text.
import pathlib

import requests

text = pathlib.Path("README.md").read_text(encoding="utf-8")
resp = requests.post("http://127.0.0.1:8003/ingest", json={"texts": [text]}, timeout=30)
print(resp.status_code, resp.text)
PY
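Sending a whole file as one text, as in the script above, may give coarse retrieval results. One possible refinement is to split the text into smaller chunks before posting. The chunking strategy and the helper names `chunk_text` and `build_ingest_payload` are assumptions; only the `{"texts": [...]}` payload shape comes from the examples above.

```python
from typing import Dict, List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Split on blank lines, then pack paragraphs into chunks of at most max_chars."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single oversized paragraph is hard-split at max_chars.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks


def build_ingest_payload(text: str, max_chars: int = 500) -> Dict[str, List[str]]:
    # Same {"texts": [...]} shape as the /ingest examples above.
    return {"texts": chunk_text(text, max_chars)}


if __name__ == "__main__":
    sample = "Agent Forge is a single-server agentic platform.\n\nIt uses Ollama, Temporal, OPA, and Qdrant."
    print(build_ingest_payload(sample, max_chars=60))
```

The resulting dict can be passed as the `json=` argument of the `requests.post` call to /ingest.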
make install
make fmt
make lint
make test
make run
make worker
make check-ollama
- Restart systemd user services after code changes:
python - <<'PY'
import subprocess
subprocess.run(["systemctl","--user","restart","agent-<AGENT_ID>.service"])
PY
Check service:
python - <<'PY'
import subprocess
subprocess.run(["systemctl","--user","status","agent-<AGENT_ID>.service"])
PY
Make sure you have ingested documents and that /query returns payloads.
To change what is authorized, update the policy in libs/policy/policy.rego and restart OPA.
- Single server, trusted operator.
- File-based registry is acceptable for MVP.
- Sandbox runner uses allowlist + timeouts; OS‑level network isolation is TODO.
- Systemd user services are preferred; subprocess supervisor is fallback.
- Phoenix is default OTEL collector (OTLP HTTP).
MIT (or add your license)