Autonomous, trust-aware cyber defense system with human-in-the-loop oversight
CyberGuard uses LangGraph agents to make real-time security decisions, combining drift detection, digital twin simulation, and explainable AI to protect critical infrastructure.
- 🤖 Agentic AI (ReAct Loop): Autonomous LangGraph agent implementing the ReAct pattern (Reasoning + Acting)
  - Observe: Ingest incoming events and classify threats
  - Think: Reason about the optimal action based on trust, zone, and attack patterns
  - Act: Execute the decision or request human approval
- 📊 Drift Detection: Model trust based on PSI (Population Stability Index) scores
- 🔄 Digital Twin: Safe simulation environment for testing defensive actions
- 👤 Human-in-the-Loop: All destructive actions require human approval with full context
- 🔍 Explainable AI: Full decision context with policy rules, trust metrics, and reasoning chains
- 🎨 Real-time Dashboard: Streamlit UI with live event tracking and approval panels
- 🔄 Auto-Retraining: LLM-driven model retraining when drift is detected
```bash
# Install dependencies
uv sync

# Run Streamlit dashboard
uv run streamlit run app/streamlit_app.py
```

Access at: http://localhost:8501
```bash
# Build and run
docker-compose up -d
```

Access the dashboard at http://localhost:8080.

See DEPLOYMENT.md for detailed deployment options.
```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Data Streamer  │────►│ LangGraph Agent  │────►│  Digital Twin   │
│  (CICIDS2017)   │     │  (ReAct Loop)    │     │  (Simulator)    │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         ▲                        │                       │
         │                        │ CSV window            │
         │                        ▼                       │
         │               ┌─────────────────┐              │
         │               │ DriftCatcher API│              │
         │               │ (Drift/Retrain) │              │
         │               └─────────────────┘              │
         │                        │ trust + PSI           │
         │                        ▼                       │
         └────────────────────────┴───────────────────────┘
                                  │
                     ┌────────────┴────────────┐
                     │                         │
                     ▼                         ▼
          ┌────────────────────┐    ┌──────────────────┐
          │ Streamlit Dashboard│◄───│  Human Operator  │
          │    (8080/8501)     │    │ (Approve/Reject) │
          └────────────────────┘    └──────────────────┘
```
- LangGraph Agent - Autonomous decision engine with trust-aware thresholds
- Digital Twin - Network simulation for safe action testing
- DriftCatcher API - ML drift detection with PSI scoring
- Streamlit Dashboard - Real-time monitoring and approval interface
- Human-in-the-Loop - Approval workflow for destructive actions
See ARCHITECTURE.md for detailed architecture documentation.
```
CSV Event → Parse & Enrich → Window by Zone → LangGraph Agent
    Observe (REASONING: classify attack, analyze threat)
        ↓
    Think (REASONING: evaluate trust, zone criticality, attack patterns)
        ↓
    Act (ACTION: execute defensive measure or request approval)
        ↓
    Observe (REASONING: analyze results) → Loop continues
```
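The parse-and-window step can be sketched in a few lines. This is a minimal illustration; the `zone` field name and the window size of 100 are assumptions, not the actual streaming module:

```python
from collections import defaultdict

def window_by_zone(events, window_size=100):
    """Group parsed CSV events into fixed-size windows per zone.

    `events` is an iterable of dicts; the "zone" key is a hypothetical
    field name -- the real streaming code may use different names.
    """
    buffers = defaultdict(list)
    for event in events:
        zone = event.get("zone", "unknown")
        buffers[zone].append(event)
        if len(buffers[zone]) >= window_size:
            # Emit a full window for the agent, then start a fresh buffer.
            yield zone, buffers.pop(zone)

# Example: 250 events in one zone yield two full windows; 50 stay buffered.
events = [{"zone": "dmz", "label": "BENIGN"}] * 250
windows = list(window_by_zone(events, window_size=100))
```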
ReAct Pattern Implementation:
- Reasoning Steps: Agent reasons about event context, model trust, historical patterns
- Acting Steps: Agent takes defensive actions (quarantine, isolate, escalate)
- Continuous Loop: Each action's outcome feeds back into next observation cycle
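Reduced to plain Python, the loop looks roughly like this. It is an illustrative sketch of the pattern, not the actual LangGraph node definitions; the state keys and stub classifier are hypothetical:

```python
def observe(state):
    # REASONING: classify the next event (stubbed classifier).
    event = state["events"].pop(0)
    state["threat"] = "attack" if event.get("label") != "BENIGN" else "benign"
    return state

def think(state):
    # REASONING: choose an action from threat class and model trust.
    if state["threat"] == "benign":
        state["action"] = "monitor"
    elif state["trust"] == "LOW":
        state["action"] = "escalate"   # never destructive on low trust
    else:
        state["action"] = "quarantine"
    return state

def act(state):
    # ACTION: record the decision (a real agent would execute or escalate).
    state["history"].append(state["action"])
    return state

def react_loop(events, trust="HIGH"):
    state = {"events": list(events), "trust": trust, "history": []}
    while state["events"]:             # each outcome feeds the next observe
        state = act(think(observe(state)))
    return state["history"]
```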
| Trust Level | PSI Range | Behavior |
|---|---|---|
| HIGH | < 0.1 | Normal thresholds, fast response |
| MEDIUM | 0.1-0.25 | Higher thresholds, more cautious |
| LOW | ≥ 0.25 | No destructive actions, escalate only |
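The PSI bands in the table reduce to a simple threshold function; a sketch (the 0.1 and 0.25 cut-offs come from the table above, the function name is hypothetical):

```python
def trust_from_psi(psi: float) -> str:
    """Map a Population Stability Index score to a trust level."""
    if psi < 0.1:
        return "HIGH"      # distribution stable: normal thresholds
    if psi < 0.25:
        return "MEDIUM"    # moderate drift: more cautious
    return "LOW"           # heavy drift: escalate only
```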
```
Agent decides destructive action
    → Mark needs_approval=True
    → Add to pending_approvals
    → Display in UI with full context
    → Human approves/rejects
    → Execute or block action
```
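The workflow above might be sketched like this (illustrative only; the `Decision` type, the action set, and the module-level queue are assumptions, not the project's actual data model):

```python
from dataclasses import dataclass

DESTRUCTIVE = {"quarantine", "isolate"}  # hypothetical action set

@dataclass
class Decision:
    host: str
    action: str
    reason: str                # full context shown to the operator
    needs_approval: bool = False

pending_approvals: list = []

def submit(decision: Decision) -> str:
    """Gate destructive actions behind human approval."""
    if decision.action in DESTRUCTIVE:
        decision.needs_approval = True
        pending_approvals.append(decision)   # surfaces in the approval panel
        return "pending"
    return "executed"                        # safe actions run immediately
```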
- ⚠️ Pending Approvals - Actions awaiting human review
- 🤖 Model Retraining - Active retraining jobs and history
- 🔍 Decision Inspector - Last significant action with full context
- 📈 Model Trust & Drift - Trust levels and PSI scores per zone
- 🚨 Recent Attacks - Last 50 attack events
- 🖥️ Host Status - Current state of all hosts
- 📋 LangGraph Trace - Agent execution steps
Quarantine thresholds:
- High Trust: 3+ attacks → quarantine
- Medium Trust: 5+ attacks → quarantine
- Low Trust: 7+ attacks → escalate only

Isolation thresholds:
- High Trust: 5+ attacks → isolate
- Medium Trust: 8+ attacks → isolate (requires approval)
- Low Trust: Only escalate/monitor, no destructive actions
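Read together, the thresholds above suggest a decision function along these lines. This is an illustrative reading of the policy, not the agent's actual code; note that destructive actions still pass through the human approval workflow regardless of which branch fires:

```python
# Minimum attack counts per trust level, mirroring the lists above.
QUARANTINE_MIN = {"HIGH": 3, "MEDIUM": 5}
ISOLATE_MIN = {"HIGH": 5, "MEDIUM": 8}

def choose_action(trust: str, attack_count: int) -> str:
    """Pick a defensive action from trust level and window attack count."""
    if trust == "LOW":                       # no destructive actions allowed
        return "escalate" if attack_count >= 7 else "monitor"
    if attack_count >= ISOLATE_MIN[trust]:   # heavier response first
        return "isolate"
    if attack_count >= QUARANTINE_MIN[trust]:
        return "quarantine"
    return "monitor"
```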
When drift is detected (PSI ≥ 0.1):
- Digital Twin Agent analyzes drift severity (LLM-driven)
- Recommendation: retrain, monitor, or escalate
- If retraining recommended → trigger model retraining
- On success → reset zone trust to HIGH
- Display job status in UI
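A stripped-down sketch of that flow (illustrative; the real system uses an LLM-driven severity analysis rather than a fixed rule, and the function names are hypothetical):

```python
def drift_response(psi: float, retrain_ok: bool = True) -> str:
    """Decide the response to a drift check: retrain on confirmed drift,
    escalate if retraining is unavailable, otherwise keep monitoring."""
    if psi < 0.1:
        return "monitor"                 # no actionable drift
    return "retrain" if retrain_ok else "escalate"

def on_retrain_success(zone_trust: dict, zone: str) -> None:
    zone_trust[zone] = "HIGH"            # reset trust after successful retrain
```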
```
CyberGuard/
├── agents/              # LangGraph agent (observe, think, act nodes)
├── app/                 # Streamlit UI and background runner
├── digital_twin/        # Network simulator and topology
├── streaming/           # Event streamer and windowing
├── tools/               # DriftCatcher API client
├── config/              # Configuration files
├── data/                # Network traffic CSV files
├── Dockerfile           # Container image
├── docker-compose.yml   # Docker Compose setup
├── ARCHITECTURE.md      # Detailed architecture docs
├── DEPLOYMENT.md        # Deployment guide
└── APPROVAL_WORKFLOW.md # Human-in-the-loop workflow
```
| Variable | Default | Description |
|---|---|---|
| `DRIFTCATCHER_API_URL` | `http://localhost:8000` | DriftCatcher API endpoint |
| `STREAMLIT_SERVER_PORT` | `8080` (Docker) / `8501` (local) | Dashboard port |
- `config/settings.yaml` - Window size, thresholds
- `digital_twin/topology.yaml` - Network zones and hosts
- `pyproject.toml` - Python dependencies
- langgraph - Agent orchestration
- langchain - LLM integration
- streamlit - Dashboard UI
- pandas - Data processing
- requests - DriftCatcher API client
- pyyaml - Configuration management
```bash
docker-compose up -d
```

Or run the container directly:

```bash
docker run -d \
  -p 8080:8080 \
  -e DRIFTCATCHER_API_URL=http://192.168.1.100:8000 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/runtime:/app/runtime \
  cyberguard:latest
```

- Docker Swarm: See DEPLOYMENT.md#docker-swarm
- Kubernetes: See DEPLOYMENT.md#kubernetes
- Cloud Platforms: AWS ECS, Google Cloud Run, Azure Container Instances
- Latency: ~10-50ms per event (observe → think → act)
- Throughput: 1000+ events/sec
- Drift Check: ~1-2 seconds per window (100 events)
- Memory: ~200MB baseline, +10MB per 10K events
- Approval Required: All destructive actions need human approval
- Trust Degradation: System becomes conservative as drift increases
- Digital Twin Testing: Actions tested in simulation first
- Audit Trail: Full decision history with reasoning
- Explainability: Every action has visible policy rule
- ARCHITECTURE.md - Detailed system architecture
- DEPLOYMENT.md - Deployment guide (Docker, K8s, Cloud)
- APPROVAL_WORKFLOW.md - Human-in-the-loop workflow
```bash
# Install dependencies
uv sync

# Run tests (if available)
uv run pytest

# Format code
uv run black .

# Type check
uv run mypy .
```

- Multi-model ensemble for drift detection
- Automated retraining without approval for low-risk zones
- Real-time PCAP integration (live network traffic)
- Advanced actions (firewall rules, VLAN isolation, honeypots)
- Threat intelligence integration (MISP, STIX/TAXII)
- Policy engine with user-defined rules
- Multi-tenancy support
[Your License Here]
[Your Team/Contributors Here]
- CICIDS2017 Dataset for network traffic data
- LangGraph for agent orchestration
- Streamlit for rapid dashboard development