⚡ Competitive multi-agent framework (PPO vs. deterministic) where the best-scoring model wins
🏆 Human feedback loops for evaluation and selection
📡 Event streaming and replay via NATS JetStream
🛡️ Guardrails for operational safety
Core Principle: Multiple agent models compete on the same tasks. The model with highest human ratings handles production traffic, regardless of whether it's ML-based or rule-based.
```mermaid
graph TD
    A[Environment: ERP/Market/Drone] --> B[NATS JetStream]
    B --> C[Model Competition Arena]
    C --> D1[PPO Agent PyTorch]
    C --> D2[Deterministic Rules]
    C --> D3[Future Models...]
    D1 --> E[Best Model Selection]
    D2 --> E
    D3 --> E
    E --> F[Risk Manager Validation]
    F --> G[Action Execution FastAPI]
    G --> B
    H[Human Dashboard] --> I[Feedback & Ratings]
    I --> B
    B --> J[SQLite Metadata]
    J --> K[Model Registry Hyperparams]
    B --> L[Parquet Export]
    L --> M[DuckDB Analytics]
    K --> N[Training Pipeline]
    N --> D1
```
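Conceptually, the arena's best-model selection reduces to a rating comparison across heterogeneous agents. A minimal sketch (agent names and the 1–5 rating scale are illustrative assumptions):

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ArenaEntry:
    """One competing model (ML or rule-based) and its human ratings."""
    name: str
    ratings: list = field(default_factory=list)

    def score(self) -> float:
        # No ratings yet -> lowest possible score, so the model never wins by default
        return mean(self.ratings) if self.ratings else float("-inf")

def select_production_model(entries: list) -> ArenaEntry:
    """Route production traffic to the best human-rated model,
    regardless of whether it is ML-based or rule-based."""
    return max(entries, key=lambda e: e.score())

ppo = ArenaEntry("ppo_agent", ratings=[4, 5, 3])
rules = ArenaEntry("deterministic_rules", ratings=[4, 4, 5])
winner = select_production_model([ppo, rules])
```

Here the deterministic agent wins (mean 4.33 vs. 4.0), illustrating that rule-based models stay first-class citizens in the arena.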
Key Components:
- NATS JetStream: Event streaming, replay, integration for training/monitoring
- SQLite: Lightweight registry for runs, results, deployments, guardrails
- FastAPI: Operational API for actions, status, deployments, guardrails
- Streamlit: Dashboard for runs, models, deployments, and feedback
- Wrap ERP, trading, or drone data as Python API clients.
- Start with a dummy data generator (`asyncio` loop emitting JSON events).
- Input validation + normalization with Pydantic.
- Transform into Polars/Pandas DataFrames.
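The environment-to-perception pipeline above can be sketched end-to-end; the `MarketEvent` schema and its field names are assumptions, not the project's actual models:

```python
import asyncio
import json
import random

import pandas as pd
from pydantic import BaseModel

class MarketEvent(BaseModel):
    """Schema for one raw environment event (field names are assumptions)."""
    symbol: str
    price: float
    qty: int

async def dummy_event_stream(n: int):
    """Dummy asyncio generator emitting JSON-encoded events."""
    for _ in range(n):
        yield json.dumps({
            "symbol": random.choice(["AAA", "BBB"]),
            "price": round(random.uniform(10.0, 100.0), 2),
            "qty": random.randint(1, 50),
        })
        await asyncio.sleep(0)  # yield control; a real loop would sleep between events

async def collect(n: int) -> pd.DataFrame:
    """Validate each raw event with Pydantic, then normalize into a DataFrame."""
    rows = []
    async for raw in dummy_event_stream(n):
        event = MarketEvent(**json.loads(raw))  # raises ValidationError on bad input
        rows.append(event.dict())
    return pd.DataFrame(rows)

df = asyncio.run(collect(5))
```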
- First version: simple rule-based agent (`if/else` strategies).
- Upgrade later: PyTorch (RL or LLM-based reasoning).
- Planning: OR-tools or networkx.
- Defines “how to execute” strategies.
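The first-version rule-based agent can be as simple as a function over the observation dict; the thresholds and field names below are illustrative assumptions:

```python
def rule_based_agent(obs: dict) -> str:
    """Deterministic baseline: plain if/else strategy.
    Thresholds (5% bands, reorder point) are hypothetical."""
    if obs["inventory"] < obs["reorder_point"]:
        return "reorder"
    if obs["price"] > obs["fair_value"] * 1.05:
        return "sell"
    if obs["price"] < obs["fair_value"] * 0.95:
        return "buy"
    return "hold"
```

Because it is a pure function, the same agent doubles as a critic/fallback when ML models enter the arena later.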
- Configurable guardrails in YAML.
- Pure Python checks (limits, thresholds).
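A sketch of YAML-configured guardrails with pure-Python checks; the limit keys (`max_order_qty`, `max_notional`) are hypothetical, not the project's actual `config.yaml` schema:

```python
import yaml  # PyYAML

# Hypothetical guardrail config, as it might appear in core/config.yaml
CONFIG_YAML = """
guardrails:
  max_order_qty: 100
  max_notional: 10000.0
"""

def check_guardrails(action: dict, config: dict) -> list:
    """Pure-Python limit checks; returns the list of violated limits (empty = pass)."""
    g = config["guardrails"]
    violations = []
    if action["qty"] > g["max_order_qty"]:
        violations.append("max_order_qty")
    if action["qty"] * action["price"] > g["max_notional"]:
        violations.append("max_notional")
    return violations

config = yaml.safe_load(CONFIG_YAML)
```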
- FastAPI endpoints for:
  - `POST /act` → execute actions in the ERP/trading API
  - `GET /status` → query results
- NATS JetStream for event-driven communication and replay.
- Parquet (PyArrow) for structured logs.
- JSON lines for raw debug traces.
- SQLite as lightweight registry (deployments, arena runs/results, guardrails).
- Streamlit app with simplified views:
- Runs (Training, Benchmark)
- Models (browse/evaluate)
- Deployments & Guardrails
- Monitor & Feedback (actions + ratings)
- Feedback currently stored in UI state; integrate SQLite persistence next.
- Simple retraining loop in Python:
- Load feedback from DuckDB
- Fine-tune or retrain models
- Register new version
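The three retraining steps above can be sketched end-to-end. This uses stdlib `sqlite3` as a stand-in for the DuckDB feedback store, and `retrain` is a placeholder for actual fine-tuning; table and column names are assumptions:

```python
import sqlite3

def load_feedback(conn) -> list:
    """Step 1: pull (model, rating) pairs from the feedback store."""
    return conn.execute("SELECT model, rating FROM feedback").fetchall()

def retrain(model: str, ratings: list) -> str:
    """Step 2: placeholder for fine-tuning/retraining; returns a new version tag."""
    return f"{model}-v{len(ratings) + 1}"

def register(conn, version: str) -> None:
    """Step 3: record the new version in the model registry."""
    conn.execute("INSERT INTO registry(version) VALUES (?)", (version,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback(model TEXT, rating INTEGER)")
conn.execute("CREATE TABLE registry(version TEXT)")
conn.executemany("INSERT INTO feedback VALUES (?, ?)",
                 [("ppo_agent", 4), ("ppo_agent", 5)])

rows = load_feedback(conn)
ratings = [r for m, r in rows if m == "ppo_agent"]
new_version = retrain("ppo_agent", ratings)
register(conn, new_version)
```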
- Core Services: FastAPI, asyncio
- Data: Pandas, SQLite
- Message Bus: NATS JetStream
- ML: PyTorch + Gymnasium; Stable-Baselines3 optional
- Dashboard: Streamlit
```bash
# Create virtual environment (Windows)
python -m venv .venv

# Install dependencies into the venv using uv (recommended)
uv pip install -r requirements.txt -p .\.venv\Scripts\python.exe

# Start the FastAPI server
uv run -p .\.venv\Scripts\python.exe -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000

# Launch the Streamlit dashboard
uv run -p .\.venv\Scripts\python.exe -m streamlit run ui\dashboard_simple.py
```

- v1 (Current): Rule-based prototype + human feedback collection + benchmark registry (SQLite)
- v2: PPO agents (PyTorch), train on feedback, deterministic critic for reward shaping
- v3: NATS JetStream integration for full event streaming and replay
- v4: Persist decisions/feedback in SQLite; improve dashboard analytics
- v5: React dashboard (replace Streamlit)
Alpha-stage with a clear direction:
- Maintain deterministic and ML agents in parallel (baseline, critic, fallback)
- Human-in-the-loop feedback drives training and selection
- Prefer sparse methods for efficiency where applicable (PPO/SAC variants)
See also: docs/focus_breakthrough.md.
```text
api/
└── main.py              # FastAPI entrypoint with /act and /status endpoints
core/
├── environment.py       # Dummy ERP/trading/drone data generator (asyncio)
├── perception.py        # Pydantic validation + Pandas/Polars normalization
├── reasoning.py         # Rule-based agent (if/else strategies)
├── tactical.py          # OR-tools / networkx planning
├── risk_manager.py      # Guardrails from YAML configs
└── config.yaml          # Risk thresholds, limits
bus/
└── nats_client.py       # NATS JetStream client
memory/
└── registry.py          # SQLite registry (arena runs/results, deployments, guardrails)
ui/
└── dashboard_simple.py  # Streamlit app (simplified dashboard)
learning/
└── (agents/trainers)    # PPO/SAC trainers and evaluation utilities
tests/
├── test_api.py
├── test_reasoning.py
├── test_risk_manager.py
└── test_end_to_end.py
docs/
├── architecture.mmd
└── trade_prompt.txt
requirements.txt         # Dependencies
README.md                # Already provided
```
- ERP optimization (packing, routing, scheduling)
- Trading decision support (simulate, validate, human override)
- Robotics/drones (path planning with human corrections)
MIT (you own it, hack it, ship it).
Run the Interactive Brokers login simulation to verify credentials, JetStream wiring, and SQLite persistence end-to-end:

```bash
uv run -p .\.venv\Scripts\python.exe scripts\ibkr_login_cli.py --prefix IBKR --latency 0.1
```

- Provide credentials via the `IBKR_USERNAME`, `IBKR_PASSWORD`, and optional `IBKR_ACCOUNT` environment variables (or adjust the `--prefix`).
- Override the database path with `--db-path` or `AGENTA_DB_PATH`.
- Point at an existing NATS cluster using `--nats nats://localhost:4222`.
The CLI prints the login session, the JetStream acknowledgement, and the JSON payload emitted to the `ENVIRONMENT_EVENTS` stream.