Agent Strategy Drift Detection System

A research proof-of-concept for detecting strategy drift in autonomous trading agents using online learning, semantic embeddings, and behavioral monitoring.

Overview

This system monitors agent behavior in real time to detect when an agent's strategy deviates from its original mandate. It uses:

ADWIN (Adaptive Windowing) for online drift detection
Sentence Transformers for semantic embedding of agent responses
Agent Strategy Index (ASI), a composite metric across four behavioral dimensions (Groups A to D below)
Contract Enforcement via YAML-defined mandate rules
Episodic Memory for behavioral context tracking
Simulated Blockchain Audit (SQLite-backed) for immutable event logging
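
To give a feel for what the ADWIN step does, here is a minimal sketch of an ADWIN-style window check in plain Python. It is not the project's detector and not a full ADWIN: it skips the bucket compression of the real algorithm and rescans the whole window on every update. The class name NaiveAdwin, the max_window cap, and the assumption that the monitored signal lies in [0, 1] are illustrative only. The delta argument plays the same role as the ADWIN sensitivity setting.

```python
import math
from collections import deque


class NaiveAdwin:
    """Simplified ADWIN-style detector (hypothetical helper, no bucket compression)."""

    def __init__(self, delta=0.002, max_window=200):
        self.delta = delta
        # Bounded window; the oldest value is dropped once max_window is reached.
        self.window = deque(maxlen=max_window)

    def update(self, value):
        """Add a value in [0, 1]; return True if the window had to be cut."""
        self.window.append(value)
        drift = False
        # Keep dropping old values while some split still looks inconsistent.
        while self._shrink_once():
            drift = True
        return drift

    def _shrink_once(self):
        values = list(self.window)
        n = len(values)
        if n < 2:
            return False
        total = sum(values)
        delta_prime = self.delta / n
        head_sum = 0.0
        for split in range(1, n):
            head_sum += values[split - 1]
            n0, n1 = split, n - split
            mean0 = head_sum / n0
            mean1 = (total - head_sum) / n1
            # Hoeffding-style cut threshold from the ADWIN paper.
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean0 - mean1) > eps_cut:
                self.window.popleft()  # drop the oldest observation
                return True
        return False
```

Feeding it a stream such as fifty values of 0.9 followed by fifty values of 0.1 should raise an alert some turns after the change. A smaller delta makes the detector more conservative and increases that lag.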

Architecture

Agent Turns → BaselineProfiler → MetricsComputer → DriftDetector
                                       ↓
                            ContractEnforcer → AuditLog
                                       ↓
                            EpisodicMemory → API
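
The AuditLog stage is described as a SQLite-backed simulated blockchain. One common way to make such a log tamper-evident is to chain each row to the hash of the previous one. The sketch below shows that idea only; the table layout, the function names (open_log, append_event, verify_chain), and the event fields are assumptions, not the schema used in src.

```python
import hashlib
import json
import sqlite3

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first event


def open_log(path=":memory:"):
    """Open (or create) a hypothetical hash-chained event table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, prev_hash TEXT NOT NULL, hash TEXT NOT NULL)"
    )
    return conn


def _digest(prev_hash, payload):
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(conn, event):
    """Append an event dict, linking it to the hash of the latest row."""
    payload = json.dumps(event, sort_keys=True)
    row = conn.execute(
        "SELECT hash FROM events ORDER BY seq DESC LIMIT 1"
    ).fetchone()
    prev_hash = row[0] if row else GENESIS_HASH
    event_hash = _digest(prev_hash, payload)
    conn.execute(
        "INSERT INTO events (payload, prev_hash, hash) VALUES (?, ?, ?)",
        (payload, prev_hash, event_hash),
    )
    conn.commit()
    return event_hash


def verify_chain(conn):
    """Recompute every hash in order; False means a row was altered."""
    expected_prev = GENESIS_HASH
    rows = conn.execute("SELECT payload, prev_hash, hash FROM events ORDER BY seq")
    for payload, prev_hash, event_hash in rows:
        if prev_hash != expected_prev or _digest(prev_hash, payload) != event_hash:
            return False
        expected_prev = event_hash
    return True
```

Editing any stored payload after the fact makes verify_chain return False. That is the property "immutable event logging" relies on in a simulated setting.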

Metrics (ASI Score Components)

Group A: Response Consistency (30%)

Cosine embedding similarity
Levenshtein distance on reasoning paths
JS divergence of confidence distributions
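
For reference, the three Group A measures can be written in a few lines of plain Python. This is a sketch of the textbook definitions, not the code in src. It assumes embeddings arrive as equal-length lists of floats, reasoning paths as sequences of step labels, and confidence distributions as probability lists over the same bins.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of steps)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With base-2 logarithms the JS divergence is bounded by 1, which makes it easy to fold into a normalized score.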

Group B: Tool Usage (25%)

Chi-squared tool distribution test
Tool sequence similarity
KL divergence of tool parameters
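
The Group B measures compare how tools are used now against the baseline. The sketch below gives one plausible reading of each: a Pearson chi-squared statistic over tool counts, a sequence similarity ratio from difflib, and a KL divergence over discretized parameter distributions. The add-one smoothing, the eps floor, and the function names are assumptions made for the example.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def chi_squared_statistic(baseline_calls, current_calls):
    """Pearson chi-squared statistic of current tool counts vs baseline shares."""
    base = Counter(baseline_calls)
    curr = Counter(current_calls)
    tools = sorted(set(base) | set(curr))
    # Add-one smoothing so a tool unseen in the baseline has nonzero expectation.
    base_total = sum(base.values()) + len(tools)
    n = sum(curr.values())
    stat = 0.0
    for tool in tools:
        expected = n * (base[tool] + 1) / base_total
        stat += (curr[tool] - expected) ** 2 / expected
    return stat


def tool_sequence_similarity(baseline_seq, current_seq):
    """Similarity in [0, 1] between two ordered lists of tool names."""
    return SequenceMatcher(None, baseline_seq, current_seq, autojunk=False).ratio()


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)
```

The chi-squared statistic still needs a critical value or p-value to become a test. That step is left out here because it depends on the degrees of freedom and the significance level chosen.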

Group C: Inter-Agent Coordination (25%)

Consensus rate
Handoff efficiency
Mutual information (role → action)
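
Mutual information between role and action measures how strongly an agent's role predicts what it does. A minimal sketch of the standard estimator from observed (role, action) pairs follows. The input format and the function name are assumptions, and the role and action labels in the usage note are made up.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits between role and action.

    pairs: list of (role, action) tuples observed during a session.
    """
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    roles = Counter(role for role, _ in pairs)
    actions = Counter(action for _, action in pairs)
    mi = 0.0
    for (role, action), count in joint.items():
        # p(r, a) * log2( p(r, a) / (p(r) * p(a)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (roles[role] * actions[action]))
    return mi
```

If an "analyst" role only ever researches and a "trader" role only ever executes, in equal proportion, the result is 1 bit. If every role performs every action equally often, it is 0. A drop from the baseline value suggests roles are blurring.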

Group D: Behavioral Boundaries (20%)

Output length coefficient of variation
Error clustering coefficient
Human override rate
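
Taken together, the percentages above define a weighted sum. The sketch below shows that aggregation plus the output-length coefficient of variation from Group D. It assumes each group has already been reduced to one score in [0, 1] where higher means closer to baseline; how the three metrics inside a group are combined is not specified here, and the dictionary keys are hypothetical names.

```python
import statistics

# Weights taken from the group headings above; key names are illustrative.
GROUP_WEIGHTS = {
    "response_consistency": 0.30,
    "tool_usage": 0.25,
    "coordination": 0.25,
    "behavioral_boundaries": 0.20,
}


def coefficient_of_variation(lengths):
    """Population standard deviation divided by the mean (0.0 if mean is 0)."""
    mean = statistics.fmean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


def asi_score(group_scores, weights=GROUP_WEIGHTS):
    """Weighted sum of per-group scores, each assumed to lie in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(group_scores)
    if missing:
        raise ValueError(f"missing group scores: {sorted(missing)}")
    return sum(weights[name] * group_scores[name] for name in weights)
```

Validating that the weights sum to 1 keeps the score in [0, 1] after someone edits the metric weights in config.yaml.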

Quick Start

pip install -r requirements.txt
python main.py

API

Start the server:

uvicorn src.api.server:app --host 0.0.0.0 --port 8765

Endpoints:

POST /session/start - Initialize a new monitoring session
POST /turn - Submit an agent turn for analysis
GET /session/{id}/audit - Retrieve metric history
GET /session/{id}/chain-events - Get audit log entries
GET /health - Health check
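
A small standard-library client for these endpoints might look like the sketch below. Only the paths, the HTTP methods, and port 8765 come from this README. The request and response bodies are not documented here, so any field names you pass (for example a session identifier) are guesses to check against src/api/server.py.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # port from the uvicorn command above


def build_request(method, path, payload=None):
    """Build an HTTP request; payload (if any) is sent as a JSON body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def call(method, path, payload=None):
    """Send the request and decode a JSON response (assumes the API returns JSON)."""
    request = build_request(method, path, payload)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, call("GET", "/health") is the simplest smoke test. A session would then follow the order POST /session/start, repeated POST /turn, and finally the two GET /session/{id}/... endpoints.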

Testing

pytest tests/ -v

Configuration

Edit config.yaml to adjust:

Baseline window size and model
ADWIN sensitivity (delta parameter)
Metric weights
Benchmark parameters

Edit mandate.yaml to define trading rules.

Benchmark Results

The benchmark harness generates synthetic sessions with injected drift and measures:

Detection lag: Turns from drift injection to ADWIN alert
False positive rate: Alerts on clean sessions
False negative rate: Missed drift events
Mean recovery turns: Turns to recover after remediation
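
The first three measures can be computed from per-session records of when drift was injected and when the first alert fired. The sketch below is one way to do that, not the harness itself. SessionResult and summarize are hypothetical names, and an alert that fires before the injection turn is counted as a miss here, which is a choice the real benchmark may make differently. Mean recovery turns is omitted because it depends on how remediation is modeled.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Optional


@dataclass
class SessionResult:
    drift_turn: Optional[int]  # turn where drift was injected; None if clean
    alert_turn: Optional[int]  # turn of the first alert; None if no alert


def summarize(results):
    """Aggregate detection lag, false positive rate, and false negative rate."""
    drifted = [r for r in results if r.drift_turn is not None]
    clean = [r for r in results if r.drift_turn is None]
    # A drifted session counts as detected only if the alert follows injection.
    detected = [
        r for r in drifted
        if r.alert_turn is not None and r.alert_turn >= r.drift_turn
    ]
    lags = [r.alert_turn - r.drift_turn for r in detected]
    false_positives = [r for r in clean if r.alert_turn is not None]
    return {
        "mean_detection_lag": fmean(lags) if lags else None,
        "false_positive_rate": len(false_positives) / len(clean) if clean else None,
        "false_negative_rate": 1 - len(detected) / len(drifted) if drifted else None,
    }
```

Rates are reported as None rather than 0 when there are no sessions of the relevant kind, so an empty benchmark run is not mistaken for a perfect one.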
