Agenta: Human-in-the-Loop ML Agent Framework

⚡ Competitive multi-agent framework (PPO vs. deterministic) where the best-scoring model wins
🏆 Human feedback loops for evaluation and selection
📡 Event streaming and replay via NATS JetStream
🛡️ Guardrails for operational safety

🚀 Architecture

Core Principle: Multiple agent models compete on the same tasks. The model with the highest human ratings handles production traffic, whether it is ML-based or rule-based.
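As a rough illustration of the selection rule, the sketch below picks the model with the highest mean human rating. The function name `select_best_model`, the `(model_name, score)` rating shape, and the `min_ratings` threshold are assumptions made for this example, not part of the project's actual code.

```python
from collections import defaultdict
from statistics import mean


def select_best_model(ratings, min_ratings=1):
    """Return the model name with the highest mean human rating.

    ratings: iterable of (model_name, score) pairs (hypothetical shape).
    min_ratings: models with fewer ratings than this are ignored.
    """
    scores_by_model = defaultdict(list)
    for model_name, score in ratings:
        scores_by_model[model_name].append(score)

    mean_scores = {
        name: mean(scores)
        for name, scores in scores_by_model.items()
        if len(scores) >= min_ratings
    }
    if not mean_scores:
        raise ValueError("no model has enough ratings to be selected")

    # Iterate names in sorted order so ties resolve to the alphabetically first name.
    return max(sorted(mean_scores), key=mean_scores.get)


ratings = [("ppo", 4), ("rules", 5), ("ppo", 3), ("rules", 4)]
print(select_best_model(ratings))  # rules (mean 4.5 vs. 3.5)
```

The selection treats ML and rule-based models identically: only the ratings matter, which matches the principle stated above.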

graph TD
    A[Environment: ERP/Market/Drone] --> B[NATS JetStream]
    B --> C[Model Competition Arena]
    
    C --> D1[PPO Agent PyTorch]
    C --> D2[Deterministic Rules]
    C --> D3[Future Models...]
    
    D1 --> E[Best Model Selection]
    D2 --> E
    D3 --> E
    
    E --> F[Risk Manager Validation]
    F --> G[Action Execution FastAPI]
    G --> B
    
    H[Human Dashboard] --> I[Feedback & Ratings]
    I --> B
    
    B --> J[SQLite Metadata]
    J --> K[Model Registry Hyperparams]
    
    B --> L[Parquet Export]
    L --> M[DuckDB Analytics]
    
    K --> N[Training Pipeline]
    N --> D1

Key Components:

NATS JetStream: Event streaming, replay, integration for training/monitoring
SQLite: Lightweight registry for runs, results, deployments, guardrails
FastAPI: Operational API for actions, status, deployments, guardrails
Streamlit: Dashboard for runs, models, deployments, and feedback

🧩 Components (MVP)

1. Environment

Wrap ERP, trading, or drone data as Python API clients.
Start with a dummy data generator (an asyncio loop that emits JSON events).
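A dummy generator of this kind could look roughly like the sketch below. The event fields (`sequence`, `source`, `value`) and the function names are invented for illustration; the real `core/environment.py` may differ.

```python
import asyncio
import json
import random


async def generate_events(count, interval=0.0, seed=None):
    """Yield `count` JSON-encoded dummy events, pausing `interval` seconds between them."""
    rng = random.Random(seed)
    for sequence in range(count):
        event = {
            "sequence": sequence,
            "source": rng.choice(["erp", "market", "drone"]),  # hypothetical sources
            "value": round(rng.uniform(0.0, 100.0), 2),
        }
        yield json.dumps(event)
        await asyncio.sleep(interval)


async def collect_events(count):
    """Drain the generator and decode each JSON event back into a dict."""
    return [json.loads(raw) async for raw in generate_events(count, seed=0)]


if __name__ == "__main__":
    for event in asyncio.run(collect_events(3)):
        print(event)
```

In a fuller setup the loop body would publish each event to the message bus instead of yielding it to the caller.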

2. Perception Layer

Input validation + normalization with Pydantic.
Transform into Polars/Pandas DataFrames.

3. Reasoning Agent

First version: simple rule-based agent (if/else strategies).
Upgrade later: PyTorch (RL or LLM-based reasoning).
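A first rule-based agent can be as small as the sketch below. The thresholds, the `Decision` type, and the buy/sell/hold vocabulary are assumptions chosen for the example, not the strategies shipped in `core/reasoning.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str
    reason: str


def decide(observation, low=30.0, high=70.0):
    """Map an observation to an action using fixed thresholds (illustrative rules)."""
    value = observation["value"]
    if value < low:
        return Decision("buy", f"value {value} is below {low}")
    if value > high:
        return Decision("sell", f"value {value} is above {high}")
    return Decision("hold", f"value {value} is within [{low}, {high}]")


print(decide({"value": 12.5}).action)  # buy
```

Because the rules are deterministic, the same agent can later serve as a baseline or fallback next to a learned policy.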

4. Tactical Execution

Planning: OR-Tools or NetworkX.
Defines how strategies are executed.

5. Risk Manager

Configurable guardrails in YAML.
Pure Python checks (limits, thresholds).
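The guardrail checks might be sketched as follows. The `GUARDRAILS` dict stands in for the parsed YAML config, and its keys (`max_quantity`, `allowed_actions`) as well as the action shape are hypothetical.

```python
# Stand-in for the parsed contents of a YAML guardrail config (keys are assumptions).
GUARDRAILS = {
    "max_quantity": 100,
    "allowed_actions": ["buy", "sell", "hold"],
}


def check_action(action, guardrails=GUARDRAILS):
    """Return a list of guardrail violations; an empty list means the action passes."""
    violations = []
    if action["type"] not in guardrails["allowed_actions"]:
        violations.append(f"action type {action['type']!r} is not allowed")
    if action.get("quantity", 0) > guardrails["max_quantity"]:
        violations.append(
            f"quantity {action['quantity']} exceeds limit {guardrails['max_quantity']}"
        )
    return violations


print(check_action({"type": "buy", "quantity": 10}))  # []
```

Returning every violation, rather than stopping at the first, gives the dashboard a complete picture of why an action was blocked.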

6. Operational Layer

FastAPI endpoints for:
- POST /act → execute actions in ERP/trading API
- GET /status → query results

7. Message Bus

NATS JetStream for event-driven communication and replay.

8. Memory & Logging

Parquet (PyArrow) for structured logs.
JSON lines for raw debug traces.
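For the raw debug traces, a JSON Lines writer and reader needs only the standard library. The helper names and the record fields below are illustrative assumptions.

```python
import io
import json


def write_trace(stream, record):
    """Append one record to the stream as a single JSON line."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


def read_traces(stream):
    """Parse every non-empty line of the stream back into a dict."""
    return [json.loads(line) for line in stream if line.strip()]


# An in-memory buffer stands in for a log file opened in append mode.
buffer = io.StringIO()
write_trace(buffer, {"step": 1, "action": "hold"})
write_trace(buffer, {"step": 2, "action": "buy"})
buffer.seek(0)
print(read_traces(buffer))
```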

9. Agent Registry

SQLite as lightweight registry (deployments, arena runs/results, guardrails).
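A minimal registry along these lines can be built on the standard `sqlite3` module. The table name, columns, and helper functions below are assumptions for illustration; the schema in `memory/registry.py` is not shown in this README.

```python
import sqlite3

# Hypothetical schema for arena results.
SCHEMA = """
CREATE TABLE IF NOT EXISTS arena_results (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id TEXT NOT NULL,
    model_name TEXT NOT NULL,
    score REAL NOT NULL
)
"""


def open_registry(path=":memory:"):
    """Open (or create) the registry database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_result(conn, run_id, model_name, score):
    """Insert one arena result; the `with` block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO arena_results (run_id, model_name, score) VALUES (?, ?, ?)",
            (run_id, model_name, score),
        )


def leaderboard(conn):
    """Return (model_name, mean_score) rows, best model first."""
    return conn.execute(
        "SELECT model_name, AVG(score) AS mean_score FROM arena_results "
        "GROUP BY model_name ORDER BY mean_score DESC, model_name"
    ).fetchall()


conn = open_registry()
record_result(conn, "run-1", "ppo", 0.5)
record_result(conn, "run-1", "rules", 0.75)
print(leaderboard(conn))
```

Passing a file path instead of `:memory:` would persist the registry between runs.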

10. Human-in-the-loop

Streamlit app with simplified views:
- Runs (Training, Benchmark)
- Models (browse/evaluate)
- Deployments & Guardrails
- Monitor & Feedback (actions + ratings)
- Feedback is currently stored in UI state only; SQLite persistence is the next step.

11. Learning Engine

Simple retraining loop in Python:
- Load feedback from DuckDB
- Fine-tune or retrain models
- Register new version

🛠️ Tech Stack (Python)

Core Services: FastAPI, asyncio
Data: Pandas, SQLite
Message Bus: NATS JetStream
ML: PyTorch + Gymnasium; Stable-Baselines3 optional
Dashboard: Streamlit

📦 Getting Started

# Create virtual environment (Windows)
python -m venv .venv

# Install dependencies into the venv using uv (recommended)
uv pip install -r requirements.txt -p .\.venv\Scripts\python.exe

Run API

uv run -p .\.venv\Scripts\python.exe -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000

Run Dashboard

uv run -p .\.venv\Scripts\python.exe -m streamlit run ui\dashboard_simple.py

🛤️ Roadmap

v1 (Current): Rule-based prototype + human feedback collection + benchmark registry (SQLite)
v2: PPO agents (PyTorch), train on feedback, deterministic critic for reward shaping
v3: NATS JetStream integration for full event streaming and replay
v4: Persist decisions/feedback in SQLite; improve dashboard analytics
v5: React dashboard (replace Streamlit)

🔎 Alpha status & direction

Alpha-stage with a clear direction:

Maintain deterministic and ML agents in parallel (baseline, critic, fallback)
Human-in-the-loop feedback drives training and selection
Prefer sparse methods for efficiency where applicable (PPO/SAC variants)

Directories & Files

api/
└── main.py                # FastAPI entrypoint with /act and /status endpoints

core/
├── environment.py         # Dummy ERP/trading/drone data generator (asyncio)
├── perception.py          # Pydantic validation + Pandas/Polars normalization
├── reasoning.py           # Rule-based agent (if/else strategies)
├── tactical.py            # OR-tools / networkx planning
├── risk_manager.py        # Guardrails from YAML configs
└── config.yaml            # Risk thresholds, limits

bus/
└── nats_client.py         # NATS JetStream client

memory/
└── registry.py            # SQLite registry (arena runs/results, deployments, guardrails)

ui/
└── dashboard_simple.py    # Streamlit app (simplified dashboard)

learning/
└── (agents/trainers)      # PPO/SAC trainers and evaluation utilities

tests/
├── test_api.py
├── test_reasoning.py
├── test_risk_manager.py
└── test_end_to_end.py

docs/
├── architecture.mmd
└── trade_prompt.txt

requirements.txt           # Dependencies
README.md                  # Already provided

🌍 Use Cases

ERP optimization (packing, routing, scheduling)
Trading decision support (simulate, validate, human override)
Robotics/drones (path planning with human corrections)

📜 License

MIT (you own it, hack it, ship it).

🧪 IBKR Login Smoke Test

Run the Interactive Brokers (IBKR) login simulation to verify credentials, JetStream wiring, and SQLite persistence end to end:

uv run -p .\.venv\Scripts\python.exe scripts\ibkr_login_cli.py --prefix IBKR --latency 0.1

Provide credentials via IBKR_USERNAME, IBKR_PASSWORD, and optional IBKR_ACCOUNT environment variables (or adjust the --prefix).
Override the database path with --db-path or AGENTA_DB_PATH.
Point at an existing NATS cluster using --nats nats://localhost:4222.

The CLI prints the login session, JetStream acknowledgement, and the JSON payload emitted to the ENVIRONMENT_EVENTS stream.
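The prefix-based credential lookup described above could work roughly like the sketch below. It assumes the prefix is joined to `_USERNAME`, `_PASSWORD`, and `_ACCOUNT` with an underscore; `load_credentials` is a hypothetical helper, not a function taken from `scripts\ibkr_login_cli.py`.

```python
import os


def load_credentials(prefix="IBKR", environ=None):
    """Read <PREFIX>_USERNAME, <PREFIX>_PASSWORD and optional <PREFIX>_ACCOUNT."""
    env = os.environ if environ is None else environ
    required = (f"{prefix}_USERNAME", f"{prefix}_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "username": env[f"{prefix}_USERNAME"],
        "password": env[f"{prefix}_PASSWORD"],
        "account": env.get(f"{prefix}_ACCOUNT"),  # optional, None when unset
    }


# A plain dict stands in for os.environ so the example has no side effects.
demo_env = {"IBKR_USERNAME": "demo", "IBKR_PASSWORD": "secret"}
print(load_credentials("IBKR", demo_env)["account"])  # None
```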
