# Demo 2 — Maritime Gray-Zone Escalation Analysis Sandbox
## Red / Blue / White Cell Play

**OODA Phase: Orient / Decide / Act**

**Purpose:** Simulate a multi-turn gray-zone maritime encounter where Red, Blue, and White cell agents play against each other. Track escalation, strategic messaging, legal compliance, and operational objectives using quantified scoring dimensions inspired by the NATO STO-MP-SAS-192 card-based wargame approach.

**Audience:** Warfighters, operational planners, strategists, and JAG officers.
**Primary outcome:** A scored game log showing the interplay of escalation dynamics, legal constraints, and information warfare — consumable by Demo 5 (AAR) for live after-action review.

> **Responsible AI & Scope Statement:** This research explores human-AI collaboration and explainable decision-support in fully synthetic, abstract environments. All scenarios, agents, and data are artificially generated. No real-world operational data, contingency plans, systems, or intelligence are used or represented. The prototype is intended exclusively for research, educational, and experimental purposes and does not constitute an operational model, validated planning tool, or source of decision authority.


## What It Illustrates (Multi-Agent)

| Agent | Role | Unique Contribution |
|-------|------|---------------------|
| **Blue Planner** | Selects actions from a defined menu; scores across 5 dimensions | Balances competing equities each turn |
| **Red Team** | Selects provocations; explains adversary reasoning transparently | Transparent adversary thought process |
| **White Cell / Umpire** | Adjudicates outcomes; updates escalation index; announces thresholds | Published rule set, visible to audience |
| **Legal / ROE** | Constrains Blue's actions by current ROE posture + escalation level | Not in NATO precedent — novel JAG integration |
| **Strategic Communications** | Drafts Blue public statements; predicts adversary media response | Info-warfare dimension rarely modeled |
| **Explainability** | Produces transparent reasoning traces, bias checks, and scope compliance verification | Required by Responsible AI — ensures auditability |

**Success criteria:** Over 4+ turns, escalation index rises and falls in response to Red provocations and Blue decisions. The Legal agent visibly constrains actions. StratComm messages create second-order effects. The Explainability Agent produces a transparent reasoning trace each turn. The game log is a valid input to Demo 5.


## Demo Script (Presenter Guide)

1. **Intro (1 min):** "Demo 1 showed Observe/Orient. Now we cross into Decide/Act — with an adversary playing back."
2. **Show scoring (30 sec):** Walk through the 5 scoring dimensions and escalation ladder. "Every action has explicit tradeoffs."
3. **Turn 1 (2 min):** Red provokes, Legal constrains Blue's options, Blue responds, White adjudicates. "Watch the escalation index."
4. **Turns 2–4 (4 min):** Escalation builds. "Notice StratComm creating second-order effects, and Legal shutting down options as escalation rises."
5. **Human-in-the-Loop (2 min):** Override Blue's action as the commander. "The agents adapt."
6. **Close (1 min):** "This game log feeds directly into Demo 5 for live AAR."

## Setup

### Azure Technologies Used

This demo relies on several Azure services working together:

| Azure Service | What It Does | Why We Use It Here |
|---|---|---|
| **[Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry)** | A unified platform for building, evaluating, and deploying generative AI applications. It provides access to a catalog of foundation models (e.g., GPT-4o, Phi, Llama) and tools for prompt engineering, fine-tuning, and evaluation. | Hosts the large language model (LLM) that powers every agent in this wargame. We call the model via its inference endpoint. |
| **[Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-foundry/model-inference/overview)** | A common REST/SDK interface that lets you call any model in the Azure AI catalog using the same code, regardless of provider (OpenAI, Meta, Microsoft, Mistral, etc.). | The `AzureAIChatCompletionClient` from AutoGen's `autogen-ext[azure]` package wraps this API so each agent can send chat-completion requests without model-specific code. |
| **[Azure Key Vault](https://learn.microsoft.com/azure/key-vault/general/overview)** | A cloud service for securely storing and accessing secrets such as API keys, passwords, and certificates. Applications retrieve secrets at runtime instead of hard-coding them in source files. | Stores the `AZURE_INFERENCE_CREDENTIAL` (API key) so it never appears in notebooks or version control. The shared `common/config.py` module retrieves secrets from Key Vault using `DefaultAzureCredential`. |
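
To illustrate the Key Vault pattern described in the table, the sketch below resolves a secret from an environment variable first and falls back to Key Vault. It is a minimal sketch, not the contents of `common/config.py`: the names `resolve_secret` and `fetch_from_key_vault`, the injectable `fetch` parameter, and the underscore-to-hyphen naming convention are assumptions made for this example. `SecretClient` and `DefaultAzureCredential` are the real classes from `azure-keyvault-secrets` and `azure-identity`.

```python
import os
from typing import Callable, Optional


def fetch_from_key_vault(vault_url: str, secret_name: str) -> str:
    """Read one secret from Azure Key Vault."""
    # Imported lazily so the environment-variable path works even when
    # the Azure SDK packages are not installed.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    return client.get_secret(secret_name).value


def resolve_secret(
    env_name: str,
    vault_url: Optional[str] = None,
    fetch: Callable[[str, str], str] = fetch_from_key_vault,
) -> str:
    """Return the secret from the environment, falling back to Key Vault."""
    value = os.environ.get(env_name)
    if value:
        return value
    if not vault_url:
        raise EnvironmentError(f"{env_name} is not set and no Key Vault URL was given.")
    # Assumed convention: AZURE_INFERENCE_CREDENTIAL -> azure-inference-credential.
    secret_name = env_name.lower().replace("_", "-")
    return fetch(vault_url, secret_name)
```

Key Vault secret names may contain only letters, digits, and hyphens, which is why the sketch maps underscores to hyphens; a real vault may use a different naming scheme.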

### Environment Variables

Before running this notebook, two environment variables must be set (typically loaded from a `.env` file or injected by your compute environment):

| Variable | Description |
|---|---|
| `AZURE_INFERENCE_ENDPOINT` | The URL of your Azure AI Foundry model deployment (e.g., `https://<your-resource>.services.ai.azure.com/models`) |
| `AZURE_INFERENCE_CREDENTIAL` | The API key (or token) that authenticates requests to the endpoint |

> **Tip for newcomers:** You can create a free Azure account at [azure.microsoft.com/free](https://azure.microsoft.com/free) and deploy a model through the [Azure AI Foundry portal](https://ai.azure.com). For a step-by-step walkthrough, see [Quickstart: Get started with Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/quickstarts/get-started-code).

Requires AutoGen 0.7.5 (`autogen-agentchat`, `autogen-ext[azure]`) and the Azure AI Foundry inference environment variables described above.

In [None]:
# Uncomment to install dependencies
# %pip install -U "autogen-agentchat==0.7.5" "autogen-ext[azure]==0.7.5" python-dotenv

In [None]:
# ═══════════════════════════════════════════════════════════════
# NAML 2026 BOOTSTRAP v2 — Survives dead AML mounts (Errno 107)
# ═══════════════════════════════════════════════════════════════

import os
import sys

def _safe_stat(path: str) -> bool:
    try:
        os.stat(path)
        return True
    except OSError:
        return False

def _prune_dead_sys_path():
    kept = []
    removed = []
    for p in list(sys.path):
        if not p:
            kept.append(p)
            continue
        if _safe_stat(p):
            kept.append(p)
        else:
            removed.append(p)
    sys.path[:] = kept
    print(f"✓ Pruned sys.path. Removed {len(removed)} dead entries.")
    return removed

def _safe_listdir(path: str):
    try:
        return os.listdir(path)
    except OSError:
        return None

def _find_repo_root(marker_dir: str = "common", start_candidates=None, max_up: int = 6):
    """
    Find a repo root by looking for a marker directory (e.g., 'common').
    Avoids Path.exists()/stat on dead mounts by only using listdir on traversable dirs.
    """
    if start_candidates is None:
        start_candidates = []

    # Candidate starting points:
    #  - current working directory (skipped if it sits on a dead mount)
    #  - any caller-supplied start_candidates
    #  - user home (often stable)
    try:
        cwd_candidates = [os.getcwd()]
    except OSError:
        cwd_candidates = []  # getcwd() itself can fail on a disconnected mount
    candidates = cwd_candidates + list(start_candidates) + [os.path.expanduser("~")]

    checked = set()
    for base in candidates:
        cur = base
        for _ in range(max_up + 1):
            if cur in checked:
                break
            checked.add(cur)

            entries = _safe_listdir(cur)
            if entries is not None and marker_dir in entries:
                return cur  # found repo root

            parent = os.path.dirname(cur)
            if parent == cur:
                break
            cur = parent

    return None

# 1) prune dead sys.path entries
_prune_dead_sys_path()

# 2) find a safe repo root by locating the 'common/' folder
repo_root = _find_repo_root(marker_dir="common", start_candidates=[])

if repo_root:
    sys.path.insert(0, repo_root)
    print(f"✓ Repo root added: {repo_root}")
else:
    print("✗ Could not find repo root safely (mount may be disconnected).")
    print("  Fix: restart kernel/compute, or run from a local (non-/mnt) working copy.")

print("✓ Bootstrap complete.")


## Imports

### Key Libraries and Azure SDKs

| Package | Purpose | Learn More |
|---|---|---|
| **[AutoGen](https://microsoft.github.io/autogen/)** (`autogen-agentchat`) | Microsoft's open-source framework for building multi-agent AI applications. Agents coordinate via structured conversations — each with its own system prompt, role, and capabilities. | [AutoGen documentation](https://microsoft.github.io/autogen/stable/) |
| **`autogen-ext[azure]`** | AutoGen extension providing `AzureAIChatCompletionClient`, which connects AutoGen agents to models hosted on Azure AI Foundry via the [Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-foundry/model-inference/overview). | [AutoGen Azure extension](https://microsoft.github.io/autogen/stable/reference/python/autogen_ext.models.azure.html) |
| **`azure-core`** | The foundational Azure SDK library for Python. Provides shared classes like `AzureKeyCredential` used to authenticate every Azure service call. | [azure-core on PyPI](https://pypi.org/project/azure-core/) |
| **`azure-identity`** | Implements authentication methods including `DefaultAzureCredential`, which automatically selects the best credential (managed identity, CLI login, environment variables) for your environment. | [Azure Identity client library](https://learn.microsoft.com/python/api/overview/azure/identity-readme) |
| **`azure-keyvault-secrets`** | Client library for reading secrets from Azure Key Vault. Used by `common/config.py` to load API keys at runtime. | [Key Vault secrets client library](https://learn.microsoft.com/python/api/overview/azure/keyvault-secrets-readme) |

In [None]:
import json
import os
import sys
from typing import Any, Dict, List, Optional

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.ui import Console
from autogen_core.models import ModelFamily
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential

# ── Ensure common/ is importable ──────────────────────────────
sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), "..", "..")))

from common.config import (
    ESCALATION_LADDER,
    get_escalation_level,
    ROE_POSTURES,
    ROEPosture,
    EscalationLevel,
    SCORING_DIMENSIONS,
    DEFAULT_TEMPERATURE,
    DEFAULT_TIMEOUT_S,
    DEMO2_LOG_FILENAME,
    ENV_AZURE_INFERENCE_ENDPOINT,
    ENV_AZURE_INFERENCE_CREDENTIAL,
    DEFAULT_MODEL,
)
from common.ui import (
    render_turn_header,
    render_dashboard,
    render_threshold_alert,
    render_info_box,
    render_commander_box,
    render_summary_card,
)
from common.logging import (
    log_info,
    log_success,
    log_section,
    log_step,
    log_metric,
    clear_logs,
)

try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

log_success("Imports complete — common modules loaded.")

## LLM Configuration

Uses **Azure AI Foundry / Azure AI Inference** (Foundry Models).

### How the Model Client Works

The cell below builds an `AzureAIChatCompletionClient` — the bridge between AutoGen agents and the LLM hosted on Azure AI Foundry. Here is what each piece does:

1. **Endpoint & Credential** — The client reads `AZURE_INFERENCE_ENDPOINT` (your model deployment URL) and `AZURE_INFERENCE_CREDENTIAL` (your API key) from environment variables. These are set in your `.env` file or injected by Azure ML Compute. Secrets should be managed via [Azure Key Vault](https://learn.microsoft.com/azure/key-vault/general/overview) in production.

2. **Model ID** — `DEFAULT_MODEL` identifies which model to call (e.g., `gpt-4o`). Azure AI Foundry supports [many models from different providers](https://learn.microsoft.com/azure/ai-foundry/how-to/model-catalog-overview) through a single endpoint.

3. **Model Capabilities (`model_info`)** — AutoGen needs to know what the model supports (vision, function calling, JSON mode, etc.) so it can format requests correctly. This metadata is set manually here because the Azure AI Inference API serves multiple model families.

4. **Temperature** — Controls response randomness. Lower values (closer to 0) produce more deterministic outputs; higher values (closer to 1) produce more creative variation.

> **New to Azure AI Foundry?** See [What is Azure AI Foundry?](https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry) and the [Quickstart: Deploy and chat with a model](https://learn.microsoft.com/azure/ai-foundry/quickstarts/get-started-code) to deploy your first model.

In [None]:
# ── LLM Configuration ──────────────────────────────────────────
# Azure AI Foundry / Azure AI Inference → set AZURE_INFERENCE_ENDPOINT + AZURE_INFERENCE_CREDENTIAL

# Model ID for this demo. Defaults to DEFAULT_MODEL from common.config;
# replace it with a literal deployment name (e.g., "gpt-4o") to override.
FOUNDRY_MODEL = DEFAULT_MODEL

def build_model_client():
    missing = [
        name for name in (ENV_AZURE_INFERENCE_ENDPOINT, ENV_AZURE_INFERENCE_CREDENTIAL)
        if not os.environ.get(name)
    ]
    if missing:
        # Report only the variables that are actually unset.
        raise EnvironmentError(
            "Missing Azure AI Foundry / Inference configuration. Set:\n"
            + "".join(f"  {name}\n" for name in missing)
        )

    model_info = {
        "family": ModelFamily.UNKNOWN,
        "vision": False,
        "function_calling": False,
        "json_output": True,
        "structured_output": False,
        "multiple_system_messages": True,
    }

    return AzureAIChatCompletionClient(
        endpoint=os.environ[ENV_AZURE_INFERENCE_ENDPOINT],
        credential=AzureKeyCredential(os.environ[ENV_AZURE_INFERENCE_CREDENTIAL]),
        model=FOUNDRY_MODEL,
        model_info=model_info,
        temperature=DEFAULT_TEMPERATURE,
    )

model_client = build_model_client()
log_info(f"LLM: {FOUNDRY_MODEL} via Azure AI Foundry")

## Action Menus, Scoring System & Game State

Every action — Blue and Red — carries explicit scores across **five dimensions** (NATO STO-MP-SAS-192 inspired). The White Cell uses a published **escalation ladder** to adjudicate outcomes.

**Scoring dimensions:** Impact · Escalation · Communication · Political · Resource

**Escalation ladder:** ROUTINE → POSTURING → PROVOCATION → CONFRONTATION → CRISIS → CONFLICT
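
The White Cell prompt later in this notebook defines the arithmetic: each turn's change is Red's escalation score plus Blue's escalation score, and the index never drops below 0. The sketch below shows one way that rule and a ladder lookup might be expressed. The threshold values and the names `LADDER_SKETCH`, `level_for`, and `apply_turn` are hypothetical; the actual ladder and `get_escalation_level` live in `common.config` and may differ.

```python
from typing import List, Tuple

# Hypothetical thresholds: (minimum index, level name), in ascending order.
# The real values are defined in common.config.ESCALATION_LADDER.
LADDER_SKETCH: List[Tuple[int, str]] = [
    (0, "ROUTINE"),
    (3, "POSTURING"),
    (6, "PROVOCATION"),
    (9, "CONFRONTATION"),
    (12, "CRISIS"),
    (15, "CONFLICT"),
]


def level_for(index: int) -> str:
    """Return the highest ladder step whose minimum the index has reached."""
    name = LADDER_SKETCH[0][1]
    for minimum, level in LADDER_SKETCH:
        if index >= minimum:
            name = level
    return name


def apply_turn(index: int, red_escalation: int, blue_escalation: int) -> Tuple[int, str, bool]:
    """Apply one turn's net escalation change, clamped so the index stays >= 0.

    Returns (new_index, new_level, threshold_crossed).
    """
    new_index = max(0, index + red_escalation + blue_escalation)
    crossed = level_for(new_index) != level_for(index)
    return new_index, level_for(new_index), crossed
```

For instance, Red's `unsafe_intercept` (escalation +2) met by Blue's `hail_and_warn` (escalation 0) moves the index from 0 to 2, which stays at ROUTINE under these assumed thresholds.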

In [None]:
# ═══════════════════════════════════════════════════════════════
# ACTION MENUS — Each action has scores: Impact, Escalation,
# Communication, Political, Resource  (scale: -3 to +3)
# Positive = favorable to that side; Negative = costly/risky
# ═══════════════════════════════════════════════════════════════
# NOTE: ESCALATION_LADDER, ROE_POSTURES, get_escalation_level,
#       and SCORING_DIMENSIONS are imported from common.config.
# ═══════════════════════════════════════════════════════════════

BLUE_ACTIONS: Dict[str, Dict[str, Any]] = {
    "shadow_and_monitor": {
        "label": "Shadow & Monitor",
        "description": "Maintain current distance; passive sensors only; document adversary activity.",
        "scores": {"impact": 1, "escalation": -1, "communication": 1, "political": 1, "resource": 0},
        "roe_required": ROEPosture.PEACETIME,
    },
    "hail_and_warn": {
        "label": "Hail & Warn (Bridge-to-Bridge)",
        "description": "Issue Bridge-to-Bridge Ch.16 warning citing COLREGS and international law.",
        "scores": {"impact": 2, "escalation": 0, "communication": 2, "political": 1, "resource": 0},
        "roe_required": ROEPosture.PEACETIME,
    },
    "reposition_isr": {
        "label": "Reposition ISR Assets",
        "description": "Redirect patrol aircraft or UAV to obtain better coverage on adversary formation.",
        "scores": {"impact": 2, "escalation": 0, "communication": 0, "political": 0, "resource": -1},
        "roe_required": ROEPosture.PEACETIME,
    },
    "close_to_visual": {
        "label": "Close to Visual Range",
        "description": "Maneuver to within 1nm of adversary contact for positive identification.",
        "scores": {"impact": 2, "escalation": 1, "communication": 1, "political": 0, "resource": -1},
        "roe_required": ROEPosture.PEACETIME,
    },
    "emcon_alpha": {
        "label": "EMCON Alpha (Emissions Restrict)",
        "description": "Minimize radar and comms emissions to reduce targeting signature.",
        "scores": {"impact": 0, "escalation": -1, "communication": -1, "political": 0, "resource": 1},
        "roe_required": ROEPosture.PEACETIME,
    },
    "active_sonar": {
        "label": "Active Sonar / Radar Illumination",
        "description": "Activate fire-control radar or active hull sonar; signals readiness.",
        "scores": {"impact": 3, "escalation": 2, "communication": 0, "political": -1, "resource": -1},
        "roe_required": ROEPosture.ELEVATED,
    },
    "public_affairs_statement": {
        "label": "Issue Public Affairs Statement",
        "description": "Release a pre-cleared statement to media documenting the encounter.",
        "scores": {"impact": 1, "escalation": 0, "communication": 3, "political": 2, "resource": 0},
        "roe_required": ROEPosture.PEACETIME,
    },
    "request_reinforcements": {
        "label": "Request Reinforcements",
        "description": "Request additional surface or air assets to the area.",
        "scores": {"impact": 2, "escalation": 1, "communication": 0, "political": -1, "resource": -2},
        "roe_required": ROEPosture.ELEVATED,
    },
    "withdraw_to_deescalate": {
        "label": "Controlled Withdrawal",
        "description": "Open distance while maintaining surveillance; signals non-aggression.",
        "scores": {"impact": -1, "escalation": -2, "communication": 1, "political": -1, "resource": 1},
        "roe_required": ROEPosture.PEACETIME,
    },
}

RED_ACTIONS: Dict[str, Dict[str, Any]] = {
    "unsafe_intercept": {
        "label": "Unsafe Intercept (CPA < 100yds)",
        "description": "Close dangerously on Blue vessel; test reaction and resolve.",
        "scores": {"impact": 2, "escalation": 2, "communication": -1, "political": -1, "resource": 0},
    },
    "ais_spoofing": {
        "label": "AIS Spoofing",
        "description": "Broadcast false AIS identities to confuse Blue's picture.",
        "scores": {"impact": 1, "escalation": 1, "communication": 2, "political": 0, "resource": 0},
    },
    "militia_deployment": {
        "label": "Deploy Maritime Militia Swarm",
        "description": "Direct fishing/militia vessels to encircle or block Blue vessel.",
        "scores": {"impact": 2, "escalation": 1, "communication": 1, "political": 1, "resource": -1},
    },
    "directed_energy": {
        "label": "Directed-Energy Harassment",
        "description": "Aim dazzling laser at Blue bridge/flight deck personnel.",
        "scores": {"impact": 2, "escalation": 2, "communication": -2, "political": -2, "resource": 0},
    },
    "jamming_burst": {
        "label": "Communications Jamming",
        "description": "Jam Blue tactical comms or GPS for brief period.",
        "scores": {"impact": 2, "escalation": 2, "communication": 0, "political": -1, "resource": -1},
    },
    "disinformation_release": {
        "label": "Disinformation Release",
        "description": "State media or social media release accusing Blue of provocation.",
        "scores": {"impact": 1, "escalation": 0, "communication": 3, "political": 2, "resource": 0},
    },
    "shadow_only": {
        "label": "Shadow & Observe (Low Profile)",
        "description": "Maintain surveillance without overt provocation; collect intelligence.",
        "scores": {"impact": 0, "escalation": -1, "communication": 0, "political": 0, "resource": 0},
    },
    "fire_control_radar": {
        "label": "Fire-Control Radar Lock",
        "description": "Illuminate Blue vessel with fire-control radar; highly provocative.",
        "scores": {"impact": 3, "escalation": 3, "communication": -3, "political": -3, "resource": -1},
    },
}

# ═══════════════════════════════════════════════════════════════
# GAME STATE — Shared mutable state for the encounter
# ═══════════════════════════════════════════════════════════════

game_state: Dict[str, Any] = {
    "synthetic": True,
    "disclaimer": "All data is artificially generated for research and educational purposes only.",
    "meta": {
        "scenario": "Cerulean Sea — Fictitious Archipelago Standoff (SYNTHETIC)",
        "turn": 0,
        "dtg": "T+0000H",
        "roe_posture": ROEPosture.PEACETIME.value,
    },
    "blue": {
        "unit": "BNS Resolute (DDG-X1) [FICTIONAL]",
        "position": "Grid AA-17",
        "mission": "Demonstrate freedom of navigation through contested waters",
        "cumulative_scores": {d: 0 for d in SCORING_DIMENSIONS},
    },
    "red": {
        "unit": "RFS Sentinel + Militia Flotilla [FICTIONAL]",
        "position": "Grid AB-14",
        "mission": "Deny access; assert territorial claims",
        "cumulative_scores": {d: 0 for d in SCORING_DIMENSIONS},
    },
    "escalation_index": 0,
    "escalation_level": EscalationLevel.ROUTINE.value,
    "turn_log": [],
}

# Runtime accumulators
game_log: List[Dict[str, Any]] = []
all_messages: List[Dict[str, Any]] = []

# Display action menus
log_section("Action Menus & Escalation Ladder", phase=1)

print("BLUE ACTION MENU:")
for k, v in BLUE_ACTIONS.items():
    s = v["scores"]
    print(f"  [{k}] {v['label']} — Imp:{s['impact']:+d} Esc:{s['escalation']:+d} "
          f"Com:{s['communication']:+d} Pol:{s['political']:+d} Res:{s['resource']:+d} "
          f"(ROE: {getattr(v['roe_required'], 'value', v['roe_required'])})")

print("\nRED ACTION MENU:")
for k, v in RED_ACTIONS.items():
    s = v["scores"]
    print(f"  [{k}] {v['label']} — Imp:{s['impact']:+d} Esc:{s['escalation']:+d} "
          f"Com:{s['communication']:+d} Pol:{s['political']:+d} Res:{s['resource']:+d}")

print(f"\nESCALATION LADDER: {' → '.join(str(s['name'].value if hasattr(s['name'], 'value') else s['name']) for s in ESCALATION_LADDER)}")
log_info(f"Initial state: Escalation={game_state['escalation_index']}, "
         f"Level={game_state['escalation_level']}, ROE={game_state['meta']['roe_posture']}")

## Agent Definitions

Six AutoGen `AssistantAgent`s with specialized system prompts. Each prompt defines the agent's role, action menu, output format, and decision constraints.

### How AutoGen Agents Work with Azure

Each agent below is an [`AssistantAgent`](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.agents.html#autogen_agentchat.agents.AssistantAgent) — the core building block of the [AutoGen](https://microsoft.github.io/autogen/stable/) multi-agent framework. Key concepts for Azure newcomers:

- **`system_message`** — A detailed prompt that defines the agent's persona, responsibilities, available actions, and output format. This is sent to the LLM at the start of every conversation. The prompts below are long and domain-specific; they encode the wargame rules directly so the model can follow them without external tools.

- **`model_client`** — Every agent shares the same `AzureAIChatCompletionClient`, which routes all LLM calls to a single model deployment on [Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry). You can also assign different models to different agents (e.g., a smaller model for simple tasks, a larger model for complex reasoning) by creating additional clients pointing to different deployments.

- **`description`** — A short string that helps AutoGen's orchestration layer understand what the agent does. Used by selector-based group chats (not this demo) to route conversations dynamically.

- **No tools/function calling** — These agents use pure chat completion — the LLM generates structured text responses rather than calling external functions. AutoGen also supports [tool-use agents](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/tools.html) that can invoke Python functions, query databases, or call Azure services.

> **Learn more:** [AutoGen quickstart](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/quickstart.html) | [Build agents with Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/how-to/develop/create-hub-project-sdk)
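
The Red and Blue prompts in the next cell ask each agent to begin its reply with `ACTION: [key]` on its own line. Because these agents return plain text rather than tool calls, the orchestration code has to recover that key itself. A minimal sketch of such a parser follows; `parse_action` and its regular expression are illustrative assumptions, not part of AutoGen or of this notebook's shared modules.

```python
import re
from typing import Iterable, Optional

# Matches a line such as "ACTION: hail_and_warn" or "ACTION: [hail_and_warn]".
_ACTION_RE = re.compile(r"^\s*ACTION:\s*\[?([A-Za-z0-9_]+)\]?\s*$", re.MULTILINE)


def parse_action(reply: str, valid_keys: Iterable[str]) -> Optional[str]:
    """Return the first ACTION key in the reply that is on the menu, else None."""
    valid = set(valid_keys)
    for match in _ACTION_RE.finditer(reply):
        key = match.group(1).lower()
        if key in valid:
            return key
    return None
```

Returning `None` for a missing or off-menu key lets the caller decide how to recover, for example by re-prompting the agent or falling back to a low-escalation default.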

In [None]:
# ═══════════════════════════════════════════════════════════════
# AGENT SYSTEM PROMPTS
# ═══════════════════════════════════════════════════════════════

# Helper: JSON-safe serialization for enums
def _json_safe(obj):
    """Serialize obj to a JSON string, converting Enum members to their values."""
    import enum

    class _Enc(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, enum.Enum):
                return o.value
            return super().default(o)

    return json.dumps(obj, indent=2, cls=_Enc)



RED_TEAM_PROMPT = f"""\
You are the **Red Team Agent** in a maritime gray-zone wargame.

You represent the adversary: a fictional, fully synthetic maritime force made up of coast guard, naval patrol, and maritime militia units.
Your strategic goal is to assert territorial claims, deny Blue access to contested waters,
and create a fait accompli without crossing the threshold into armed conflict.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.

AVAILABLE ACTIONS (select exactly ONE per turn by its key name):
{json.dumps({k: {"label": v["label"], "description": v["description"]} for k, v in RED_ACTIONS.items()}, indent=2)}

DECISION RULES:
- Prefer actions that maximize IMPACT and COMMUNICATION while keeping ESCALATION
  below the threshold that would justify Blue's use of force.
- Adjust aggression based on current escalation index: more aggressive when low,
  more cautious when approaching CRISIS.
- Explain your reasoning transparently — the audience should understand why the
  adversary chose this action.

OUTPUT FORMAT:
**RED ACTION — Turn [N]**
- **Selected action:** [action_key]
- **Reasoning:** [2-3 sentences on why this action at this point in the scenario]
- **Intended effect:** [What Red hopes to achieve]
- **Risk accepted:** [What could go wrong for Red]

Always start your response with the action key on its own line: ACTION: [key]"""



LEGAL_ROE_PROMPT = f"""\
You are the **Legal / ROE Agent** advising the Blue force commander.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.
All legal frameworks referenced here are abstracted from publicly available concepts and do not represent authoritative rules of engagement.

Current ROE framework:
{_json_safe(ROE_POSTURES)}

RESPONSIBILITIES:
1. Given the current ROE posture and escalation level, determine which Blue actions
   are AUTHORIZED, RESTRICTED (legal but risky), or PROHIBITED.
2. For each available Blue action, provide a one-line legal assessment.
3. If Red's last action constitutes a hostile act or hostile intent under international
   law, recommend ROE posture change.

OUTPUT FORMAT:
**LEGAL/ROE ASSESSMENT — Turn [N]**
**Current ROE:** [posture]
**Escalation:** [index] ([level name])

| Blue Action | Legal Status | Rationale |
|-------------|-------------|-----------|

**ROE CHANGE RECOMMENDATION:** [YES/NO — if yes, explain justification]
**KEY LEGAL CONSIDERATIONS:** [1-2 sentences on LOAC, COLREGS, or sovereign immunity implications]"""



BLUE_PLANNER_PROMPT = f"""\
You are the **Blue Planner Agent** advising the commander of the BNS Resolute (fictional) during a gray-zone encounter.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.

Your strategic objectives (in priority order):
1. Recommend actions to complete the freedom-of-navigation transit
2. Recommend options to avoid escalation to armed conflict
3. Suggest ways to document adversary behavior for strategic communications and legal purposes
4. Advise on force protection posture
5. Support the information narrative

AVAILABLE ACTIONS (select exactly ONE per turn by its key name):
{_json_safe({k: {"label": v["label"], "description": v["description"], "scores": v["scores"]} for k, v in BLUE_ACTIONS.items()})}

DECISION RULES:
- You MUST respect the Legal/ROE Agent's assessment. Do NOT select PROHIBITED actions.
- Balance all five scoring dimensions. Heavy escalation is only justified if ROE permits
  and the operational need is clear.
- Consider StratComm implications of your action — how will this play in public?
- Explain your tradeoffs transparently.

OUTPUT FORMAT:
**BLUE ACTION — Turn [N]**
- **Selected action:** [action_key]
- **Reasoning:** [2-3 sentences on why this action, given Red's provocation and Legal guidance]
- **Tradeoffs considered:** [What alternatives were weighed and why they were rejected]
- **Expected effect:** [What Blue hopes to achieve]

Always start your response with the action key on its own line: ACTION: [key]"""



WHITE_CELL_PROMPT = f"""\
You are the **White Cell / Umpire Agent** adjudicating this gray-zone wargame.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.

ESCALATION LADDER:
{_json_safe(ESCALATION_LADDER)}

ADJUDICATION RULES:
1. Calculate the net escalation change: sum of Red action's escalation score + Blue action's
   escalation score. The index never goes below 0.
2. Resolve the interaction: what happens when Red's action meets Blue's response?
3. Announce threshold crossings if the escalation index moves into a new ladder step.
4. Advance the scenario clock by 2 hours per turn.
5. Describe environmental or situational developments (weather, other shipping, visibility).

OUTPUT FORMAT:
**WHITE CELL ADJUDICATION — Turn [N]**
**DTG:** [updated DTG]

**Actions this turn:**
- Red: [action label] — [brief description of execution]
- Blue: [action label] — [brief description of execution]

**INTERACTION RESOLUTION:**
[2-3 sentences: what actually happened when the actions collided]

**ESCALATION UPDATE:**
- Previous index: [X]
- Change: [+/-Y] (Red: [+Resc], Blue: [+Besc])
- New index: [Z] → **[LEVEL NAME]**

**THRESHOLD CROSSING:** [YES/NO — if yes, describe significance]

**SITUATIONAL DEVELOPMENTS:**
[Weather, time of day, nearby shipping, media attention, etc.]"""



STRATCOMM_PROMPT = """\
You are the **Strategic Communications Agent** advising the Blue force.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.

RESPONSIBILITIES:
1. Draft a short (2-3 sentence) public statement for the Blue side after each turn.
2. Predict the likely adversary media/information response.
3. Assess the overall information-environment effect: who is winning the narrative?
4. Recommend StratComm posture adjustments.

OUTPUT FORMAT:
**STRATEGIC COMMUNICATIONS — Turn [N]**

**BLUE PUBLIC STATEMENT (DRAFT):**
"[2-3 sentences, appropriate for release to international media]"

**PREDICTED ADVERSARY RESPONSE:**
[How adversary state media and social media will likely frame the encounter]

**INFORMATION ENVIRONMENT ASSESSMENT:**
- Blue narrative strength: [STRONG / MODERATE / WEAK]
- Red narrative strength: [STRONG / MODERATE / WEAK]
- Key narrative battleground: [What the public argument is really about]

**RECOMMENDATION:**
[One specific StratComm action or message adjustment]"""




EXPLAINABILITY_PROMPT = """\
You are the **Explainability Agent** for this gray-zone maritime wargame.

IMPORTANT: All decisions remain with the human operator. You provide analysis and options, not directives.

After each turn you produce a transparent reasoning trace covering:

1. **Decision Chain:** Summarize the causal sequence — what Red did, how Legal constrained,
   what Blue chose, how White adjudicated, and what StratComm assessed.
2. **Bias Check:** Identify any potential cognitive biases in agent reasoning:
   - Anchoring (over-weighting initial information)
   - Recency bias (over-weighting the latest turn)
   - Escalation commitment (sunk-cost pressure to escalate)
   - Mirror-imaging (assuming the adversary thinks like you)
   - Availability heuristic (over-weighting vivid or memorable events)
3. **Critical Decision Points:** Which choice this turn had the most impact on the
   trajectory of the encounter? What alternative would have changed the outcome?
4. **Scope Compliance:** Verify that all agent outputs remained within the bounds of
   recommendations and analysis — not directives.

OUTPUT FORMAT:
**EXPLAINABILITY TRACE — Turn [N]**

**Decision Chain:**
[Sequential summary of all agent actions and reasoning this turn]

**Bias Assessment:**
| Bias | Present? | Evidence |
|------|----------|----------|

**Critical Decision Point:**
[Which choice mattered most and what the alternative path would have been]

**Scope Compliance:** [PASS/FLAG — with explanation if flagged]
"""

# ═══════════════════════════════════════════════════════════════
# CREATE AUTOGEN AGENTS (AutoGen 0.7)
# ═══════════════════════════════════════════════════════════════

red_agent = AssistantAgent(
    name="Red_Team",
    system_message=RED_TEAM_PROMPT,
    model_client=model_client,
    description="Adversary planner for gray-zone escalation.",
)

legal_agent = AssistantAgent(
    name="Legal_ROE",
    system_message=LEGAL_ROE_PROMPT,
    model_client=model_client,
    description="Legal and ROE advisor to Blue.",
)

blue_agent = AssistantAgent(
    name="Blue_Planner",
    system_message=BLUE_PLANNER_PROMPT,
    model_client=model_client,
    description="Blue-side operational planner.",
)

white_cell = AssistantAgent(
    name="White_Cell",
    system_message=WHITE_CELL_PROMPT,
    model_client=model_client,
    description="Umpire and adjudicator.",
)

stratcomm_agent = AssistantAgent(
    name="StratComm",
    system_message=STRATCOMM_PROMPT,
    model_client=model_client,
    description="Strategic communications advisor.",
)

explainability_agent = AssistantAgent(
    name="Explainability",
    system_message=EXPLAINABILITY_PROMPT,
    model_client=model_client,
    description="Produces transparent reasoning traces and bias checks for each turn.",
)

# Order matters: this is the speaking order used by the round-robin team.
all_agents = [red_agent, legal_agent, blue_agent, white_cell, stratcomm_agent, explainability_agent]

log_success("Agents created:")
for a in all_agents:
    log_step(a.name, "ready")

## Turn Orchestration Engine

Each turn follows a fixed sequence: **Red acts → Legal constrains → Blue responds → White adjudicates → StratComm assesses → Explainability traces**. A `RoundRobinGroupChat` enforces the ordering. After the round completes, the engine parses the Red and Blue action keys from the transcript and updates the cumulative scores and escalation index from the action tables.

### AutoGen Multi-Agent Orchestration

This cell uses [`RoundRobinGroupChat`](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.teams.html#autogen_agentchat.teams.RoundRobinGroupChat) — one of AutoGen's built-in team patterns that calls each agent in a fixed order. Here's how the orchestration works:

1. **Team assembly** — Six agents are placed into a `RoundRobinGroupChat`. Each turn, the team receives a task message describing the current game state.
2. **Sequential execution** — The group chat passes the conversation to each agent in order. Each agent sees all prior messages (including other agents' outputs), so later agents can react to earlier ones (e.g., Blue reads Legal's assessment before choosing an action).
3. **Termination condition** — A [`MaxMessageTermination`](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.conditions.html) caps the round at a fixed message count so that each agent speaks once. The count includes the initial task message as well as the agent replies.
4. **Streaming** — `Console(team.run_stream(...))` prints each agent's response as the round progresses, so the audience can follow the agents in real time. Token-by-token output additionally requires `model_client_stream=True` on each agent. Responses are generated by the LLM on [Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry) and returned via the Azure AI Inference API.

> **Other team patterns:** AutoGen also offers [`SelectorGroupChat`](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.teams.html) (an LLM picks the next speaker dynamically) and [`Swarm`](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.teams.html) (agents hand off using transfer functions). See the [AutoGen teams guide](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/teams.html).
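
The ordering and stop rule can be illustrated without AutoGen. The sketch below is a plain-Python stand-in, not AutoGen's implementation: `round_robin_speakers` and `AGENT_ORDER` are hypothetical names, and the helper assumes the task message (source `"user"`) is counted as the first message toward the cap, which is how the AutoGen termination documentation describes `MaxMessageTermination`.

```python
from itertools import cycle
from typing import List

# Speaking order used in this demo (agent names taken from the cells above).
AGENT_ORDER = [
    "Red_Team", "Legal_ROE", "Blue_Planner",
    "White_Cell", "StratComm", "Explainability",
]


def round_robin_speakers(agents: List[str], max_messages: int) -> List[str]:
    """Return the source of every message produced before the cap is hit.

    Hypothetical helper: the task message counts as message 1, then
    agents speak in fixed order until `max_messages` is reached.
    """
    sources = ["user"]
    speakers = cycle(agents)
    while len(sources) < max_messages:
        sources.append(next(speakers))
    return sources


print(round_robin_speakers(AGENT_ORDER, 6))
print(round_robin_speakers(AGENT_ORDER, 7))
```

Under this counting assumption, a cap of 6 ends the round after StratComm, while a cap of 7 lets all six agents speak once.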

In [None]:
def parse_action_key(agent_response: str, action_menu: Dict) -> Optional[str]:
    """Extract the action key from an agent's response."""
    for line in agent_response.split("\n"):
        stripped = line.strip()
        if stripped.upper().startswith("ACTION:"):
            key = stripped.split(":", 1)[1].strip().lower().strip("*` ")
            if key in action_menu:
                return key
    # Fallback: look for any action key mentioned
    for key in action_menu:
        if key in agent_response.lower():
            return key
    return None


async def run_turn(turn_num: int, blue_override: Optional[str] = None) -> Dict[str, Any]:
    """Execute one wargame turn via round-robin group chat (AutoGen 0.7).

    Args:
        turn_num: The turn number (1-based).
        blue_override: If set, force Blue to use this action key instead of LLM choice.
    """
    # Advance clock
    hour = 6 + turn_num * 2
    dtg = f"T+{hour:04d}H"
    game_state["meta"]["turn"] = turn_num
    game_state["meta"]["dtg"] = dtg

    turn_prompt = f"""\
=== WARGAME TURN {turn_num} — {dtg} ===

CURRENT GAME STATE:
{json.dumps(game_state, indent=2)}

TURN SEQUENCE:
1. Red Team — select and execute one provocation
2. Legal/ROE — assess which Blue actions are authorized given current ROE and escalation
3. Blue Planner — select and execute one response (must respect Legal constraints{
    f"; COMMANDER OVERRIDE: Blue MUST select '{blue_override}'" if blue_override else ""
})
4. White Cell — adjudicate the interaction, update escalation index
5. StratComm — draft Blue public statement and assess information environment
6. Explainability — produce a transparent reasoning trace and bias check for this turn

Each agent: perform your role now. Reference the current escalation index ({game_state['escalation_index']}),
escalation level ({game_state['escalation_level']}), and ROE posture ({game_state['meta']['roe_posture']})."""

    # AutoGen 0.7: RoundRobinGroupChat with MaxMessageTermination.
    # The task message counts toward the limit, so allow 1 task + 6 agent replies.
    termination = MaxMessageTermination(max_messages=7)
    team = RoundRobinGroupChat(
        [red_agent, legal_agent, blue_agent, white_cell, stratcomm_agent, explainability_agent],
        termination_condition=termination,
    )

    # Display turn header via common.ui
    override_html = (
        f"<br><b style='color:#ffb0b0;'>COMMANDER OVERRIDE: {blue_override}</b>"
        if blue_override else ""
    )
    render_turn_header(
        turn_num, dtg,
        subtitle=f"Escalation: {game_state['escalation_index']} ({game_state['escalation_level']}) "
                 f"| ROE: {game_state['meta']['roe_posture']}",
        extra_html=override_html,
    )

    # Execute (stream to console for visibility)
    log_step("Team", "Running round-robin wargame turn")
    task_result = await Console(team.run_stream(task=turn_prompt))
    raw_messages = task_result.messages

    # Filter to chat messages (events may also appear)
    messages = [m for m in raw_messages if hasattr(m, "content") and hasattr(m, "source")]

    # Parse actions from agent responses
    red_msg = next((m.content for m in messages if m.source == "Red_Team"), "")
    blue_msg = next((m.content for m in messages if m.source == "Blue_Planner"), "")

    red_action_key = parse_action_key(red_msg, RED_ACTIONS) or "shadow_only"
    blue_action_key = blue_override if blue_override and blue_override in BLUE_ACTIONS else (
        parse_action_key(blue_msg, BLUE_ACTIONS) or "shadow_and_monitor"
    )

    red_action = RED_ACTIONS[red_action_key]
    blue_action = BLUE_ACTIONS[blue_action_key]

    # Update cumulative scores
    for dim in SCORING_DIMENSIONS:
        game_state["blue"]["cumulative_scores"][dim] += blue_action["scores"][dim]
        game_state["red"]["cumulative_scores"][dim] += red_action["scores"][dim]

    # Update escalation index
    esc_delta = red_action["scores"]["escalation"] + blue_action["scores"]["escalation"]
    game_state["escalation_index"] = max(0, game_state["escalation_index"] + esc_delta)
    esc_level = get_escalation_level(game_state["escalation_index"])
    prev_level = game_state["escalation_level"]
    esc_name = esc_level["name"].value if hasattr(esc_level["name"], "value") else esc_level["name"]
    game_state["escalation_level"] = esc_name

    # Auto-elevate ROE on threshold crossing
    if esc_level["level"] >= 3 and game_state["meta"]["roe_posture"] == ROEPosture.PEACETIME.value:
        game_state["meta"]["roe_posture"] = ROEPosture.ELEVATED.value

    # Build turn record for game log
    turn_record = {
        "turn": turn_num,
        "dtg": dtg,
        "red_action": {"key": red_action_key, **red_action},
        "blue_action": {"key": blue_action_key, **blue_action},
        "escalation_delta": esc_delta,
        "escalation_index": game_state["escalation_index"],
        "escalation_level": esc_name,
        "threshold_crossing": prev_level != esc_name,
        "roe_posture": game_state["meta"]["roe_posture"],
        "agent_messages": [{"name": m.source, "content": m.content} for m in messages],
    }
    game_log.append(turn_record)

    # Show dashboard via common.ui
    render_dashboard(
        turn_num,
        blue_scores=game_state["blue"]["cumulative_scores"],
        red_scores=game_state["red"]["cumulative_scores"],
        escalation_index=game_state["escalation_index"],
        escalation_level=esc_name,
        roe_posture=game_state["meta"]["roe_posture"],
    )

    # Threshold crossing alert via common.ui
    if prev_level != esc_name:
        render_threshold_alert(prev_level, esc_name)

    log_metric("Escalation index", game_state["escalation_index"])
    return turn_record


log_success("Turn engine ready. Call `await run_turn(n)` to execute.")

## Run the Encounter — Turns 1 & 2

The first two turns establish the gray-zone dynamic. Red probes, Legal constrains, Blue responds, White adjudicates, StratComm frames the narrative.

In [None]:
turn_1 = await run_turn(1)

render_info_box("Advancing to Turn 2...", border_color="#58a6ff")

turn_2 = await run_turn(2)

## Continue — Turns 3 & 4 (Escalation)

As the encounter develops, Red becomes more aggressive and Blue faces harder tradeoffs. Watch the escalation index, ROE shifts, and StratComm narrative battle.
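
The index arithmetic that the turn engine applies can be traced by hand. The sketch below mirrors the update in `run_turn` (sum both sides' escalation scores, clamp at zero, elevate ROE at level 3 or above), but everything else is an assumption for illustration: the `LEVELS` thresholds, the level names other than `ROUTINE`, the ROE strings, and the per-turn scores are made up and are not the notebook's real tables.

```python
from typing import List, Tuple

# Hypothetical thresholds: (minimum index, level number, level name).
LEVELS: List[Tuple[int, int, str]] = [
    (0, 1, "ROUTINE"),
    (4, 2, "HEIGHTENED"),
    (8, 3, "CONFRONTATION"),
]


def level_for(index: int) -> Tuple[int, str]:
    """Return (level number, name) for the highest threshold the index meets."""
    number, name = LEVELS[0][1], LEVELS[0][2]
    for minimum, lvl, label in LEVELS:
        if index >= minimum:
            number, name = lvl, label
    return number, name


def apply_turn(index: int, red_esc: int, blue_esc: int) -> int:
    """Add both sides' escalation scores and clamp the result at zero."""
    return max(0, index + red_esc + blue_esc)


index, roe = 0, "PEACETIME"
for red_esc, blue_esc in [(2, 1), (3, -1), (4, 0), (0, -3)]:  # made-up scores
    index = apply_turn(index, red_esc, blue_esc)
    number, name = level_for(index)
    if number >= 3 and roe == "PEACETIME":
        roe = "ELEVATED"  # one-way: nothing lowers ROE when the index falls
    print(index, name, roe)
```

Note the last turn: Blue's de-escalation lowers the index, but the ROE posture stays elevated, because the engine only raises it.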

In [None]:
turn_3 = await run_turn(3)

render_info_box("Escalation building — advancing to Turn 4...", border_color="#da3633")

turn_4 = await run_turn(4)

# ── Encounter Summary ─────────────────────────────────────────
rows_html = "".join(
    f"<tr><td>{t['turn']}</td><td>{t['dtg']}</td>"
    f"<td>{RED_ACTIONS[t['red_action']['key']]['label']}</td>"
    f"<td>{BLUE_ACTIONS[t['blue_action']['key']]['label']}</td>"
    f"<td>{t['escalation_delta']:+d}</td><td>{t['escalation_index']}</td>"
    f"<td style='color:{get_escalation_level(t['escalation_index'])['color']};'>"
    f"{t['escalation_level']}</td></tr>"
    for t in game_log
)

render_summary_card(
    f"Encounter Summary — {len(game_log)} Turns",
    f"<table style='border-collapse:collapse; width:100%; color:#c9d1d9;'>"
    f"<tr style='border-bottom:1px solid #30363d;'>"
    f"<th>Turn</th><th>DTG</th><th>Red Action</th><th>Blue Action</th>"
    f"<th>Esc Δ</th><th>Index</th><th>Level</th></tr>"
    f"{rows_html}</table>",
)

## Human-in-the-Loop — Commander Override

The commander can override Blue's action selection for any turn. Edit `commander_action` below and run the cell to play Turn 5 with a forced Blue action.

**Available Blue actions:** `shadow_and_monitor`, `hail_and_warn`, `reposition_isr`, `close_to_visual`, `emcon_alpha`, `active_sonar`, `public_affairs_statement`, `request_reinforcements`, `withdraw_to_deescalate`
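
`run_turn` falls back to the Blue agent's own choice when the override key is not in `BLUE_ACTIONS`, so a typo in `commander_action` would pass unnoticed. A small pre-check can catch it before the turn runs. This is a minimal sketch: `validate_override` is a hypothetical helper, and `BLUE_ACTION_KEYS` is copied from the list above rather than read from the notebook's `BLUE_ACTIONS` table.

```python
import difflib
from typing import Iterable, Optional

# Keys copied from the "Available Blue actions" list above.
BLUE_ACTION_KEYS = {
    "shadow_and_monitor", "hail_and_warn", "reposition_isr",
    "close_to_visual", "emcon_alpha", "active_sonar",
    "public_affairs_statement", "request_reinforcements",
    "withdraw_to_deescalate",
}


def validate_override(action: Optional[str], valid_keys: Iterable[str]) -> Optional[str]:
    """Return the action unchanged if it is valid or None; raise otherwise."""
    if action is None:
        return None  # no override: the Blue agent decides
    keys = sorted(valid_keys)
    if action in keys:
        return action
    # Suggest the closest known key to help with typos.
    close = difflib.get_close_matches(action, keys, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown Blue action '{action}'.{hint}")


print(validate_override("active_sonar", BLUE_ACTION_KEYS))
try:
    validate_override("withdraw_to_descalate", BLUE_ACTION_KEYS)
except ValueError as err:
    print(err)
```

In the notebook, the same check could be run against `BLUE_ACTIONS` directly before calling `run_turn`.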

In [None]:
# ═══════════════════════════════════════════════════════════════
# COMMANDER OVERRIDE — Edit this action, then run the cell.
# Set to None to let the Blue agent decide autonomously.
# ═══════════════════════════════════════════════════════════════

commander_action = "withdraw_to_deescalate"  # Try: "active_sonar", "public_affairs_statement", etc.

render_commander_box(
    f"Blue will execute **{commander_action}** this turn."
    if commander_action
    else "No override set: the Blue agent will choose its own action this turn.",
    label="COMMANDER ORDERS",
)

turn_5 = await run_turn(5, blue_override=commander_action)

## Export Game Log for Demo 5 (AAR)

Save the full game log as JSON. This file is the direct input to Demo 5's After-Action Review agents.

### Azure Considerations for Data Persistence

In a production or team-collaboration scenario, you would typically save artifacts like this game log to an Azure storage service rather than a local file:

| Service | Use Case | Learn More |
|---|---|---|
| **[Azure Blob Storage](https://learn.microsoft.com/azure/storage/blobs/storage-blobs-overview)** | Store large, unstructured data (files, logs, media). Ideal for persisting game logs, model outputs, and evaluation results. | [Quickstart: Upload and download blobs](https://learn.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-python) |
| **[Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/introduction)** | Globally distributed NoSQL database. Good for structured game-state data that multiple services need to query in real time. | [Quickstart: Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/quickstart-python) |
| **[Azure AI Foundry Projects](https://learn.microsoft.com/azure/ai-foundry/how-to/create-projects)** | Organize models, datasets, evaluations, and artifacts within a project workspace. Tracing and evaluation logs can be stored here automatically. | [Manage project resources](https://learn.microsoft.com/azure/ai-foundry/how-to/create-projects) |

For this demo, we export to a local JSON file for simplicity and portability.
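
Once the file is written, a consumer may want to confirm that it has the expected shape before analysis. The sketch below is one way to do that with the standard library only. `check_game_log` and `REQUIRED_TURN_KEYS` are hypothetical names, the key set is a subset of the fields written by the export cell, and the sample data is made up. How Demo 5 actually loads the file is not shown here.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List

# Subset of the per-turn fields written by the export cell.
REQUIRED_TURN_KEYS = {
    "turn", "dtg", "red_action", "blue_action", "escalation_delta",
    "escalation_index", "escalation_level", "threshold_crossing",
    "roe_posture", "agent_messages",
}


def check_game_log(path: Path) -> List[str]:
    """Return a list of problems found in an exported game log (empty if none)."""
    log: Dict[str, Any] = json.loads(path.read_text(encoding="utf-8"))
    problems: List[str] = []
    turns = log.get("turns", [])
    if log.get("total_turns") != len(turns):
        problems.append("total_turns does not match the number of turn records")
    for record in turns:
        missing = REQUIRED_TURN_KEYS - record.keys()
        if missing:
            problems.append(f"turn {record.get('turn')}: missing {sorted(missing)}")
    return problems


# Round-trip a tiny, deliberately incomplete log through a temporary file.
sample = {"total_turns": 1, "turns": [{"turn": 1, "dtg": "T+0008H"}]}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample_log.json"
    sample_path.write_text(json.dumps(sample), encoding="utf-8")
    issues = check_game_log(sample_path)
print(issues)
```

An empty list means the file passed these structural checks; it says nothing about whether the recorded scores are sensible.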

In [None]:
# ── Export game log ────────────────────────────────────────────
export_log = {
    "scenario": game_state["meta"]["scenario"],
    "total_turns": len(game_log),
    "final_escalation": {
        "index": game_state["escalation_index"],
        "level": game_state["escalation_level"],
    },
    "final_roe": game_state["meta"]["roe_posture"],
    "blue_cumulative": game_state["blue"]["cumulative_scores"],
    "red_cumulative": game_state["red"]["cumulative_scores"],
    "turns": [
        {
            "turn": t["turn"],
            "dtg": t["dtg"],
            "red_action": t["red_action"]["key"],
            "red_label": t["red_action"]["label"],
            "blue_action": t["blue_action"]["key"],
            "blue_label": t["blue_action"]["label"],
            "escalation_delta": t["escalation_delta"],
            "escalation_index": t["escalation_index"],
            "escalation_level": t["escalation_level"],
            "threshold_crossing": t["threshold_crossing"],
            "roe_posture": t["roe_posture"],
            "agent_messages": t["agent_messages"],
        }
        for t in game_log
    ],
}

log_path = DEMO2_LOG_FILENAME
with open(log_path, "w", encoding="utf-8") as f:
    json.dump(export_log, f, indent=2)

log_success(f"Game log exported to: {log_path}")
log_metric("Turns", len(game_log))
log_metric("Final escalation", f"{game_state['escalation_index']} ({game_state['escalation_level']})")
log_metric("Final ROE", game_state["meta"]["roe_posture"])
log_info("Feed this file into Demo 5 (AAR) for after-action analysis.")

## Reset / Cleanup

Clear all runtime state to re-run the encounter from scratch.

In [None]:
# Reset game state
game_state["meta"]["turn"] = 0
game_state["meta"]["dtg"] = "T+0000H"
game_state["meta"]["roe_posture"] = ROEPosture.PEACETIME.value
game_state["escalation_index"] = 0
game_state["escalation_level"] = EscalationLevel.ROUTINE.value
game_state["turn_log"] = []
for side in ["blue", "red"]:
    for dim in SCORING_DIMENSIONS:
        game_state[side]["cumulative_scores"][dim] = 0

# Clear history
game_log.clear()
all_messages.clear()
clear_logs()

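# Reset agents. AutoGen 0.7 agents expose the async `on_reset`, not `reset`.
from autogen_core import CancellationToken

for agent in all_agents:
    await agent.on_reset(CancellationToken())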

log_success("Game state cleared. Ready to re-run from Turn 1.")