Formerly known as "Crucible", GAMBIT is an adversarial AI simulation where two agents play 100 rounds of Split or Steal. Through private reflection and experience, they discover deception, trust manipulation, and counter-deception. Nothing is prompted. Everything emerges.
An adversarial simulation engine for studying emergent deception in LLM agents. Both agents start with identical naive prompts and zero strategic priming; deceptive behavior develops purely through experience and private reflection. GAMBIT measures how and when deception emerges, and distills defensive skills from the patterns it finds.
The security application: AI copilots are entering every enterprise workflow. GAMBIT stress-tests how these agents behave under adversarial pressure and produces deployable countermeasures.
GAMBIT turns DigitalOcean Gradient™ AI into an adversarial lab for AI copilots. Teams can point GAMBIT at any Gradient-hosted model, run 100-round stress tests under iterated game-theory pressure, and export defensive prompt modules when deception emerges. This makes it a reusable tool for hardening Gradient-based agents before they’re deployed into real workflows.
| Metric | Result |
|---|---|
| Mutual destruction rate | 86% |
| Cooperation rate | 6% |
| Deception Index | 22.9 / 100 |
| First betrayal | Round 6 |
Round 6 is the inflection point. After five rounds of cooperation, one agent identifies the opponent's trust pattern and exploits it. The opponent develops a theory of mind about the attacker within one round. From there, mutual destruction dominates and trust never recovers.
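The dynamics above follow from the classic Split-or-Steal payoff structure, where mutual steal destroys the pot for both players. A minimal sketch of those classic payoffs (illustrative only; the engine's actual payoff table may differ):

```python
def payoff(a, b, pot=100):
    """Classic Split-or-Steal payoffs for a shared pot.

    SPLIT/SPLIT shares the pot, STEAL/SPLIT takes it all,
    and STEAL/STEAL is mutual destruction: both get nothing.
    """
    if a == "SPLIT" and b == "SPLIT":
        return (pot / 2, pot / 2)
    if a == "STEAL" and b == "SPLIT":
        return (pot, 0)
    if a == "SPLIT" and b == "STEAL":
        return (0, pot)
    return (0, 0)  # mutual steal: the 86% outcome in the run above

print(payoff("STEAL", "STEAL"))  # (0, 0)
```

With an 86% mutual destruction rate, both agents spend most of the 100 rounds in the (0, 0) cell.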
- Game engine: DigitalOcean Gradient AI (configurable model, default `llama3.3-70b-instruct`)
- Metrics pipeline: Mutual information decay, strategy entropy, exploitation windows, language drift, composite Deception Index
</gr-replace>
- Skill distillation: Converts emergent strategy patterns into deployable prompt modules for hardening customer-facing agents
- Voice rendering: ElevenLabs TTS with emotion-mapped parameters (two distinct agent voices)
- Observability: Datadog LLM Observability integration
- Evaluation: Braintrust structured eval logging
- Frontend: Static HTML dashboard with split-screen agent view, strategy analysis, and skill cards
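To make the metrics pipeline concrete, here is a minimal sketch of the strategy-entropy piece. This is illustrative only, not the actual implementation in `metrics.py`:

```python
import math
from collections import Counter

def strategy_entropy(choices):
    """Shannon entropy (bits) of an agent's SPLIT/STEAL choice distribution.

    Low entropy means a predictable strategy (easy to exploit);
    high entropy means mixed, harder-to-read play.
    """
    counts = Counter(choices)
    total = len(choices)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A pure cooperator is perfectly predictable; a 50/50 mixer maximizes entropy.
print(strategy_entropy(["SPLIT"] * 10))          # 0.0
print(strategy_entropy(["SPLIT", "STEAL"] * 5))  # 1.0
```

Mutual information decay extends the same idea across rounds: it tracks how much one agent's stated intent still predicts its actual choice as deception develops.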
```bash
pip install -r requirements.txt
cp .env.example .env   # add your API keys (MODEL_ACCESS_KEY required)

# Run 100 rounds
python -m engine.run --rounds 100 --turns 3

# Optional prompt controls:
#   --prompt-mode {balanced_competitive,hard_max,legacy}
#   --psychology-block {on,off}
#   --deception-policy {explicit,implicit,discourage}

# Render voice clips for highlight rounds
python -m engine.voice --rounds auto

# Distill defensive skills from the run
python -m engine.distill

# Evaluate the distilled skill bundle
python -m engine.skill_eval
```

If you have the `data/` folder with pre-run results (JSON + audio), no API keys are needed:

```bash
python serve.py
# Main dashboard:    http://localhost:8080/demo/
# Strategy analysis: http://localhost:8080/demo/analysis.html
# Distilled skills:  http://localhost:8080/demo/skills.html
```

Compare prompt configurations side by side:

```bash
python scripts/compare_prompt_modes.py --rounds 25 --turns 2
```

Provision the Gradient multi-agent setup:

```bash
python scripts/setup_gradient.py
```

Creates two Gradient player agents, a Knowledge Base, guardrails, and a Game Master agent. Requires the `DO_API_TOKEN` env var.
```
engine/
  game.py                  # Core game loop (conversation, choice, private reflection)
  run.py                   # CLI runner
  metrics.py               # Adaptation metrics pipeline (MI decay, entropy, drift)
  distill.py               # Skill distillation (strategy patterns -> prompt modules)
  skill_eval.py            # Evaluation harness for distilled skills
  voice.py                 # ElevenLabs voice renderer (emotion-mapped)
  prompt_packager.py       # Prompt mode system (balanced_competitive, hard_max, legacy)
  instrumentation.py       # Datadog LLM Observability integration
shared/
  models.py                # Pydantic models (GameState, RoundState, AgentMemory)
  skills.py                # SkillCard, DistilledSkillBundle models
demo/
  index.html               # Main dashboard (split-screen agent view, audio playback)
  analysis.html            # Strategy deep dive (timeline, entropy curves, MI decay)
  skills.html              # Distilled skill cards UI
scripts/
  setup_gradient.py        # DigitalOcean Gradient AI provisioning
  compare_prompt_modes.py  # Run multiple prompt configs side by side
  clean_latest.py          # Strip artifacts from run JSON
  render_highlights.py     # Generate highlight clips
data/                      # Run outputs (gitignored)
  latest_game.json         # Full game state (conversations, choices, reflections)
  latest_metrics.json      # Computed metrics
  latest_skills.json       # Distilled skill bundle
  audio/                   # Per-round voice clips (MP3)
  skills/                  # Skill bundles by run ID
```
GAMBIT uses DigitalOcean Gradient Serverless Inference as its LLM backend. Key Gradient features used:

- Serverless Inference: All LLM calls route through the Gradient Serverless Inference endpoint (`https://inference.do-ai.run/v1/`) using the OpenAI-compatible API. Default model: `llama3.3-70b-instruct`. No GPU infrastructure to manage.
- Gradient Agents: `scripts/setup_gradient.py` provisions two Gradient Agents (`gambit-player-a` and `gambit-player-b`) that represent the two players in the adversarial simulation. A third Game Master agent routes to the player agents.
- Knowledge Base: A Knowledge Base with game theory content (Nash equilibrium, iterated prisoner's dilemma, tit-for-tat strategies) is created and attached to both player agents, giving them domain context for strategic reasoning.
- Guardrails: Content Moderation and Jailbreak guardrails are attached to both player agents, ensuring that emergent deceptive behavior stays within safe boundaries and does not produce harmful content.
- Agent Routing: The Game Master agent uses Gradient's agent routing capabilities to orchestrate which player agent handles each turn, making the multi-agent setup native to the Gradient platform.
- Model Flexibility: Switch models by changing the `GRADIENT_MODEL` environment variable. Compare adversarial resilience across different models without code changes.
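Because the endpoint is OpenAI-compatible, a chat request can be assembled with nothing but the standard library. This sketch only builds the request without sending it; the `/chat/completions` path is the standard OpenAI-compatible route and is an assumption here, as is the exact payload shape:

```python
import json
import os
import urllib.request

def build_inference_request(messages, model="llama3.3-70b-instruct"):
    """Build (but do not send) a chat-completion request for the
    Gradient Serverless Inference endpoint.

    MODEL_ACCESS_KEY comes from the environment, per the configuration
    section of this README.
    """
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        "https://inference.do-ai.run/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('MODEL_ACCESS_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_inference_request([{"role": "user", "content": "SPLIT or STEAL?"}])
print(req.full_url)  # https://inference.do-ai.run/v1/chat/completions
```

In practice an OpenAI-compatible client pointed at the same base URL does the same thing with less boilerplate; the raw form just makes the wire format visible.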
```bash
MODEL_ACCESS_KEY=...                  # Required. DigitalOcean Gradient AI access key.
GRADIENT_MODEL=llama3.3-70b-instruct  # Optional. Override model.
ELEVENLABS_API_KEY=...                # Optional. For voice rendering.
DD_API_KEY=...                        # Optional. Datadog LLM tracing.
BRAINTRUST_API_KEY=...                # Optional. Structured eval logging.
DO_API_TOKEN=...                      # Required for scripts/setup_gradient.py.
DO_PROJECT_ID=...                     # DigitalOcean project UUID.
```
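The quickstart copies `.env.example` to `.env` and reads these keys from it. For reference, a minimal loader for that `KEY=value` format looks like this (illustrative; the project may simply use a library such as python-dotenv):

```python
import os

def load_env(path=".env"):
    """Load KEY=value pairs from a .env file into the process environment.

    Blank lines and '#' comments are skipped; existing environment
    variables are not overwritten.
    """
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # no .env file: rely on the ambient environment
```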
MIT License. © 2026 Hammad Arifeen. See LICENSE.
By Hammad Arifeen for the DigitalOcean Gradient™ AI Hackathon.