# Circular vDAG Tutorial: A Deep Dive into AIOSv1 Policies

### Introduction and Overview

Welcome! This tutorial demonstrates how to build and run a sophisticated, circular vDAG using AIOSv1. We'll construct a multi-turn debate system featuring two debaters (A and B) and a judge. By the end of this tutorial, you'll be able to build your own self-correcting, debating, or collaborative systems.

**In this tutorial, you will learn:**
- How to implement a circular workflow using post-processing routing policies.
- How to use a pre-processing policy to create and inject a "running summary" for long-term context.
- How to deploy and manage the vDAG on a Kubernetes cluster.

This directory contains local copies of all the policy and model inference code, and the vDAG itself is defined inline, making this a self-contained example.

### Tutorial Overview
1.  **What are Policies and Blocks in AIOSv1?**: Understand the core components that separate logic from inference.
2.  **Implementation Deep Dive**: See the high-level architecture of our three-agent debate system.
3.  **Code and Policies: A Closer Look**: Examine the local Python scripts that define our agent behaviors.
4.  **Circular vDAG spec**: Review the declarative JSON that defines the vDAG structure.
5.  **The Debate Flow and History**: Trace the data packets as they move between agents.
6.  **Knobs to Tune for More Control**: Learn how to change the debate's behavior by adjusting policy parameters.
7.  **Deploying and Managing the vDAG**: Use API calls to register and deploy the vDAG on a cluster.
8.  **Running and Inspecting the Debate**: Trigger the debate with an inference call and see the results.
9.  **Observing logs in K8s**: Monitor the real-time interaction between agents via Kubernetes logs.
10. **Troubleshooting**: Get tips for debugging common issues in a circular vDAG.
11. **Cleanup**: Learn how to properly remove the vDAG and its controller from the cluster.


### 1.  What are Policies in AIOSv1?
A policy is dynamically loadable, executable Python code that is used in many places and use cases across the AIOS system. Because policies are dynamic, they let developers implement custom functionality throughout AIOS. Below are two kinds of policies used in AIOS that are relevant to this tutorial.
- **Preprocessing Policies**: Modify incoming requests. In this tutorial, we use one to manage a conversation summary.
- **Postprocessing Policies**: Modify outgoing responses. Here, we use them to route the conversation between debaters and the judge.

> 📖 **Further Reading**: [AIOSv1 Policies System Overview](https://github.com/OpenCyberspace/OpenOS.AI-Documentation/blob/main/policies-system/policies-system.md)

### What is a "Block" in AIOSv1?
A block is the core AIOSv1 component responsible for instantiating, serving, scaling, and managing AI inference or any general computational workload defined using the AIOSv1 Instance SDK. In this tutorial, a "block" executes your model code (e.g., running a Llama.cpp model) and has several policies associated with it. It is intentionally kept simple and focused on inference; the complex logic is offloaded to the policies that wrap it. In addition to the other policies described in the link below, a `block` can be composed of the following:
1.  **Preprocessing Policy (Optional)**: Acts on the request before it hits your model.
2.  **Inference Code**: The actual model execution.
3.  **Postprocessing Policy (Optional)**: Acts on the response from your model.

> 📖 **Further Reading**: [What is a Block?](https://github.com/OpenCyberspace/OpenOS.AI-Documentation/blob/main/block/block.md)
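
To make the three-part composition concrete, here is a minimal sketch of how a request could pass through a block. It is not the AIOSv1 Instance SDK: `run_block`, `add_summary`, and `echo_model` are hypothetical names used only for illustration, and packets are modeled as plain dictionaries.

```python
from typing import Callable, Optional

Packet = dict


def run_block(
    packet: Packet,
    infer: Callable[[Packet], Packet],
    preprocess: Optional[Callable[[Packet], Packet]] = None,
    postprocess: Optional[Callable[[Packet], Packet]] = None,
) -> Packet:
    """Apply the optional preprocessing policy, the inference code,
    then the optional postprocessing policy, in that order."""
    if preprocess is not None:
        packet = preprocess(packet)
    packet = infer(packet)
    if postprocess is not None:
        packet = postprocess(packet)
    return packet


# Hypothetical stand-ins for a preprocessing policy and a model call.
def add_summary(packet: Packet) -> Packet:
    return {**packet, "running_summary": "(summary)"}


def echo_model(packet: Packet) -> Packet:
    return {**packet, "reply": f"Argument on: {packet['topic']}"}


result = run_block(
    {"topic": "Is remote work more productive?"},
    infer=echo_model,
    preprocess=add_summary,
)
print(result)
```

Keeping the inference function unaware of the policies is the design point: either policy can be swapped or omitted without touching the model code.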

### 2. Implementation Deep Dive

Here is a high-level overview of the components in our circular debate system.

- **Preprocessing (Summarizer Policy)**
  - Builds a per-session `recent_turns` window, summarizing the conversation periodically.
  - This summary is then injected back into the prompt, giving the models long-term context.
  - The summarization has its own knobs, like cadence (`summarize_every_n_messages`) and token thresholds (`min_tokens_for_summarization`).

- **Debater Inference Code (`main_debate_simple.py`)**
  - A simple inference wrapper with a fixed role (A or B) that uses AIOSv1's LLM SDK.
  - It constructs a prompt from the topic and the opponent's last turn.
  - If a `running_summary` is available (from the preprocessor), it's injected as a system message.

- **Debater Postprocessing (Router Policy)**
  - This is where the core routing logic lives.
  - It decides whether to send the response to the opponent or escalate to the judge based on a set of rules.
  - For example, it uses `judge_interval_rounds` to escalate the debate to the judge every N rounds for a periodic review. It also uses `max_consec_by_same_role` as a safeguard to prevent one debater from dominating the conversation.

- **Judge Inference Code (`main_judge_capped.py`)**
  - Builds a prompt for the judge using the topic and the full recent history.
  - The judge's job is to assess the state of the debate and decide whether it should continue or end.

- **Judge Postprocessing (Router Policy)**
  - Parses the judge's decision from the model's output. A `CONTINUE` decision routes the conversation back to one of the debaters, maintaining the circular flow. A `FINAL_JUDGMENT` decision terminates the vDAG execution.
  - It also enforces global caps like `max_rounds` to ensure the debate eventually concludes, preventing infinite loops.

- **Stopping Conditions**
  - The debate ends when the Judge returns `FINAL_JUDGMENT` or when a hard cap (like `max_rounds`) is reached.
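
The debater routing rules described above can be sketched as a small pure function. This is an illustration, not the registered policy code: the function name `next_hop` and its arguments are hypothetical, and the sketch assumes that hitting `max_consec_by_same_role` escalates to the judge, which the policy may handle differently.

```python
def next_hop(
    current_role: str,
    rounds_completed: int,
    consecutive_by_role: int,
    judge_interval_rounds: int = 4,
    max_consec_by_same_role: int = 3,
) -> str:
    """Return 'A', 'B', or 'judge' for the packet's next destination."""
    opponent = "B" if current_role == "A" else "A"

    # Assumption: a debater that has spoken too many times in a row
    # triggers a judge review instead of another turn.
    if consecutive_by_role >= max_consec_by_same_role:
        return "judge"

    # Periodic review: escalate every N completed rounds.
    if rounds_completed > 0 and rounds_completed % judge_interval_rounds == 0:
        return "judge"

    # Default: hand the turn to the opponent.
    return opponent


for rounds in range(1, 6):
    print(rounds, next_hop("A", rounds_completed=rounds, consecutive_by_role=1))
```

Because the function has no side effects, each rule can be unit-tested in isolation before the logic is packaged as a policy.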

### 3. Code and Policies: A Closer Look

This tutorial is self-contained. All the code for the model inferences and policies is located in this directory for easy reference. The policies are assumed to be pre-registered with AIOS.

#### Local Code and Policies
- **Model Inference Code**:
  - [Debater (`main_debate_simple.py`)](./model_inference/main_debate_simple.py)
  - [Judge (`main_judge_capped.py`)](./model_inference/main_judge_capped.py)
- **Policy Code**:
  - [Preprocessing Summarizer](./policies/preprocessing_policy_for_summarization/)
  - [Postprocessing Debater Router](./policies/postprocessing_policy_router_debater/)
  - [Postprocessing Judge Router](./policies/postprocessing_policy_router_judge/)


#### A Note on the Inference Code and Policies

It's important to understand how the inference code and policies collaborate.

**Inference Code:** The Python scripts (`main_debate_simple.py` and `main_judge_capped.py`) are intentionally minimal. Their main job is to:
1. Receive a request.
2. Format a prompt based on the input data.
3. Call the LLM for inference.
4. Return the raw output.

They use specific system messages to guide the behavior of the models:
- **Judge System Message**: 
  ```
  You are an impartial debate judge. You will be given TOPIC, ROUNDS, and the latest turns from A and B. Decide strictly by clarity, relevance to the topic/instruction, coherence, and factual plausibility. Return exactly: DECISION: CONTINUE_A|CONTINUE_B|FINAL_JUDGMENT If FINAL_JUDGMENT, also return:  WINNER: A|B|DRAW  REASON: <very short reason>  Output only the specified fields with no extra text.
  ```
- **Debater System Message**:
  ```
  You are a debate participant debater-A or role A in a router-orchestrated exchange. Provide a concise, on-topic argument that advances your side, directly addressing the latest message and also based on your past arguments. Do not declare a winner, do not judge, and do not ask who speaks next. Be specific, factual where possible, and avoid meta-comments or system/control tokens. Keep the response short and self-contained in 100 words.
  ```
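
The judge's system message fixes an output format (`DECISION`, and for a final judgment `WINNER` and `REASON`), so the judge router has to extract those fields from free text. The sketch below shows one way this parsing could look. It is an assumption about the approach, not the registered policy code, and `parse_judge_text` is a hypothetical name. It takes the last `DECISION:` in the text, since models sometimes emit reasoning before the required fields.

```python
import re

DECISION_RE = re.compile(r"DECISION:\s*(CONTINUE_A|CONTINUE_B|FINAL_JUDGMENT)")
WINNER_RE = re.compile(r"WINNER:\s*(A|B|DRAW)\b")
REASON_RE = re.compile(r"REASON:\s*(.+)")


def parse_judge_text(judge_text: str) -> dict:
    """Extract decision, winner, and reason; missing fields are None."""
    result = {"decision": None, "winner": None, "reason": None}

    decisions = DECISION_RE.findall(judge_text)
    if not decisions:
        return result

    # Use the last match so that any earlier reasoning text is ignored.
    result["decision"] = decisions[-1]

    if result["decision"] == "FINAL_JUDGMENT":
        winners = WINNER_RE.findall(judge_text)
        reasons = REASON_RE.findall(judge_text)
        result["winner"] = winners[-1] if winners else None
        result["reason"] = reasons[-1].strip() if reasons else None
    return result


print(parse_judge_text("Some reasoning first... DECISION: CONTINUE_A"))
print(parse_judge_text("DECISION: FINAL_JUDGMENT\nWINNER: B\nREASON: clearer evidence"))
```

A caller would treat `decision is None` as a parse failure and decide separately whether to retry or finalize.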

**Policies Calling LLMs:** Policies can also execute their own inference calls. For instance, the **preprocessing summarizer** calls an LLM to generate a summary. This is typically done by using a utility client within the policy code to call an "external" LLM service. This "external" service could even be another AIOS Block optimized for summarization. This powerful feature allows you to build complex, multi-model workflows where policies act as intelligent agents, preparing and routing data between different specialized models. You can even use the same models for summarization and review via REST or gRPC calls.

### 4. Circular vDAG specification

In [80]:

# This Python dictionary defines the entire circular vDAG.
# Notice how each node is a "model_inference" and specifies its pre- and post-processing policies.
# The "graph" structure below defines the static connections between the nodes for the initial request. 
# However, the circular behavior (e.g., Debater -> Debater -> Judge -> Debater) is not defined here. 
# Instead, it is dynamically managed by the post-processing router policies, which decide where to send the packet next based on the rules we've defined.

circular_vdag_spec = {
  "parser_version": "Parser/V1",
  "body": {
    "spec": {
      "values": {
        "vdagName": "llm-circular-vdag-demo-17",
        "vdagVersion": { "version": "1.0.0", "release-tag": "stable" },
        "discoveryTags": ["vdag-llm",  "circular-vdag"],
        "controller": {},
        "nodes": [
          {
            "spec": {
              "values": {
                "nodeLabel": "debater-A",
                "nodeType": "block",
                "manualBlockId": "llama4-scout-17b-block-circular",
                "preprocessingPolicyRule": {"policyRuleURI": "preprocessing_policy_for_summarization:0.0.1-stable"},
                "postprocessingPolicyRule": {"policyRuleURI": "postprocessing_policy_router_debater:0.0.1-stable"},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [{ "name": "input_0", "reference": "input_0" }],
                  "outputs": [{ "name": "output_0", "reference": "output_0" }]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "debater-B",
                "nodeType": "block",
                "manualBlockId": "magistral-small-2506-llama-cpp-block-circular",
                "preprocessingPolicyRule": {"policyRuleURI": "preprocessing_policy_for_summarization:0.0.1-stable"},
                "postprocessingPolicyRule": {"policyRuleURI": "postprocessing_policy_router_debater:0.0.1-stable"},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [{ "name": "input_0", "reference": "input_0" }],
                  "outputs": [{ "name": "output_0", "reference": "output_0" }]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "judge-llm",
                "nodeType": "block",
                "manualBlockId": "deepseek-r1-distill-70b-block-circular",
                "preprocessingPolicyRule": {"policyRuleURI": "preprocessing_policy_for_summarization:0.0.1-stable"},
                "postprocessingPolicyRule": {
                  "policyRuleURI": "postprocessing_policy_router_judge:0.0.1-stable"
                },
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [{ "name": "input_0", "reference": "input_0" }],
                  "outputs": [{ "name": "output_0", "reference": "output_0" }]
                }
              ]
            }
          }
        ],
        "graph": {
                "input": [
                    {
                        "nodeLabel": "debater-A",
                        "inputNames": [
                            "input_0"
                        ]
                    }
                ],
                "connections": [
                    {
                        "nodeLabel": "debater-B",
                        "inputs": [
                            {
                                "nodeLabel": "debater-A",
                                "outputNames": [
                                    "output_0"
                                ]
                            }
                        ]
                    },
                    {
                        "nodeLabel": "judge-llm",
                        "inputs": [
                            {
                                "nodeLabel": "debater-B",
                                "outputNames": [
                                    "output_0"
                                ]
                            }
                        ]
                    }
                ],
                "output": [
                    {
                        "nodeLabel": "judge-llm",
                        "outputNames": [
                            "output_0"
                        ]
                    }
                ]
            }
      }
    }
  }
}
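
Before submitting a spec like the one above, it can help to check locally that every `nodeLabel` referenced in `graph` is actually defined under `nodes`. The helper below is a client-side convenience sketch, not part of AIOS; `undefined_graph_labels` and `example_spec` are hypothetical names, and the example spec is deliberately tiny and contains a mistake.

```python
def undefined_graph_labels(spec: dict) -> set:
    """Return node labels used in the graph but not defined in `nodes`."""
    values = spec["body"]["spec"]["values"]
    defined = {node["spec"]["values"]["nodeLabel"] for node in values["nodes"]}

    graph = values["graph"]
    used = set()
    for entry in graph.get("input", []) + graph.get("output", []):
        used.add(entry["nodeLabel"])
    for connection in graph.get("connections", []):
        used.add(connection["nodeLabel"])
        for source in connection.get("inputs", []):
            used.add(source["nodeLabel"])
    return used - defined


# A deliberately broken miniature spec: "debater-B" is wired in the graph
# but never declared as a node.
example_spec = {
    "body": {"spec": {"values": {
        "nodes": [
            {"spec": {"values": {"nodeLabel": "debater-A"}}},
            {"spec": {"values": {"nodeLabel": "judge-llm"}}},
        ],
        "graph": {
            "input": [{"nodeLabel": "debater-A", "inputNames": ["input_0"]}],
            "connections": [
                {"nodeLabel": "judge-llm",
                 "inputs": [{"nodeLabel": "debater-B", "outputNames": ["output_0"]}]},
            ],
            "output": [{"nodeLabel": "judge-llm", "outputNames": ["output_0"]}],
        },
    }}}
}

missing = undefined_graph_labels(example_spec)
print(missing)
```

Calling the same function on `circular_vdag_spec` should return an empty set if the labels are consistent.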

### 5. The Debate Flow and History

The data packets flowing between the model_inferences have a canonical schema enforced by the policies. This ensures that each component gets the information it needs in a predictable format.

#### Canonical Packet Examples

**Debater model_inference → Debater Router**
The model output is simple. The router will use this to construct the next packet.
```json
{
  "reply": "Opening argument...",
  "prev_role": "A",
  "topic": "Is remote work more productive?",
  "session_id": "debate_207",
  "router_meta": {
    "running_summary": "...",
    "recent_turns": [{"role":"A","reply":"Opening argument..."}]
  }
}
```

**Debater Router → Opponent**
The router transforms the packet into the canonical format for the next debater.
```json
{
  "prev_turn_text": "Opening argument...",
  "prev_turn_role": "A",
  "receiver_role": "B",
  "topic": "Is remote work more productive?",
  "session_id": "debate_207",
  "router_meta": {"recent_turns": [{"role":"A","text":"Opening argument..."}]}
}
```

**Debater Router → Judge (on escalation)**
When it's time for the judge to review, the packet contains the history needed to make a decision.
```json
{
  "topic": "Is remote work more productive?",
  "session_id": "debate_207",
  "router_meta": {"router_counts": {"A": 2, "B": 2}, "recent_turns": [/* ... */]}
}
```

**Judge model_inference → Judge Router**
The judge outputs a decision, which the judge's router will interpret.
```json
{
  "judge_text": "... DECISION: CONTINUE_A",
  "opponent_last": "...",
  "bump_round": true,
  "topic": "Is remote work more productive?",
  "session_id": "debate_207",
  "router_meta": {"router_counts": {"A": 2, "B": 2}, "recent_turns": [/* ... */]}
}
```

**Judge Router → A/B (on continue)**
If the judge decides to continue, the router sends a packet to the appropriate debater.
```json
{
  "prev_turn_text": "Opponent last reply...",
  "prev_turn_role": "B",
  "receiver_role": "A",
  "topic": "Is remote work more productive?",
  "session_id": "debate_207"
}
```
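
As an illustration of the first two packet shapes above, the sketch below converts a debater's output into the canonical packet for the opponent. It mirrors only the fields shown in the examples. `to_opponent_packet` is a hypothetical name, and the real router policy may carry additional metadata.

```python
def to_opponent_packet(debater_output: dict) -> dict:
    """Build the opponent-bound packet from a debater's raw output."""
    role = debater_output["prev_role"]
    receiver = "B" if role == "A" else "A"

    # Normalize history entries to the {"role", "text"} shape, accepting
    # either "text" or "reply" as the source field.
    turns = debater_output.get("router_meta", {}).get("recent_turns", [])
    recent = [
        {"role": t.get("role"), "text": t.get("text", t.get("reply", ""))}
        for t in turns
    ]

    return {
        "prev_turn_text": debater_output["reply"],
        "prev_turn_role": role,
        "receiver_role": receiver,
        "topic": debater_output["topic"],
        "session_id": debater_output["session_id"],
        "router_meta": {"recent_turns": recent},
    }


debater_output = {
    "reply": "Opening argument...",
    "prev_role": "A",
    "topic": "Is remote work more productive?",
    "session_id": "debate_207",
    "router_meta": {
        "running_summary": "...",
        "recent_turns": [{"role": "A", "reply": "Opening argument..."}],
    },
}
packet = to_opponent_packet(debater_output)
print(packet)
```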

### 6. Knobs to Tune for More Control

The behavior of the debate is controlled by parameters compiled into the policies. You can adjust these to change the dynamics of the conversation.

- **Preprocessing Summarizer (`preprocessing_policy_for_summarization:0.0.1-stable`)**
  - `summarize_every_n_messages` (default 3): Controls the frequency of summarization. A lower number means more frequent summaries, which provides better context but increases computational overhead.
  - `history_max_messages` (default 3): Defines the size of the sliding window of conversation turns used for the summary. A larger window provides more context to the summarizer model.
  - `min_tokens_for_summarization` (default 300): A gate to prevent summarizing very short exchanges, saving resources.
  - `include_last_summary_in_prompt` (true): Enables chained summaries, where the previous summary is included in the prompt for the next one, creating a continuous thread of context.

- **Debater Router (`postprocessing_policy_router_debater:0.0.1-stable`)**
  - `review_threshold` (default 0.5): An optional quality gate. If the model output includes a confidence score, the router can use this threshold to decide whether to accept the response or retry. (Note: This is not used in the current simple debater).
  - `judge_interval_rounds` (default 4): Sets the cadence for periodic review by the judge. After this many rounds, the debate is automatically escalated to the judge.
  - `max_consec_by_same_role` (default 3): A safety measure that caps how many consecutive arguments one debater can make, ensuring a balanced conversation.

- **Judge Router (`postprocessing_policy_router_judge:0.0.1-stable`)**
  - `max_rounds` (default 20): A hard cap on the total number of rounds in the debate. This acts as a failsafe to prevent infinite loops and control costs.
  - `judge_continue_cap` (default 5): Limits the number of times the judge can return a `CONTINUE` decision. After this cap is reached, the judge is forced to make a `FINAL_JUDGMENT`, ensuring the debate concludes.

Tip: Watch the pod logs to see the exact prompts being constructed, especially when summaries are used.
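
To see how the summarizer knobs interact, the sketch below combines the cadence gate, the token gate, and the sliding window. It is a simplified model under stated assumptions: `should_summarize` and `rough_token_count` are hypothetical names, tokens are approximated by whitespace-separated words, and the real policy may count messages and tokens differently.

```python
from collections import deque


def should_summarize(
    message_count: int,
    window_tokens: int,
    summarize_every_n_messages: int = 3,
    min_tokens_for_summarization: int = 300,
) -> bool:
    """Fire only on the configured cadence and above the token threshold."""
    on_cadence = (
        message_count > 0
        and message_count % summarize_every_n_messages == 0
    )
    return on_cadence and window_tokens >= min_tokens_for_summarization


def rough_token_count(turns) -> int:
    # Assumption: word count stands in for a real tokenizer.
    return sum(len(turn["text"].split()) for turn in turns)


# history_max_messages = 3: older turns fall out of the window automatically.
window = deque(maxlen=3)
for i in range(1, 7):
    window.append({"role": "A" if i % 2 else "B", "text": "word " * 120})
    fired = should_summarize(i, rough_token_count(window))
    print(f"message {i}: window={len(window)} summarize={fired}")
```

Lowering `summarize_every_n_messages` without also lowering `min_tokens_for_summarization` may have little effect, because both gates must pass.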

### 7. Deploying and Managing the vDAG

The following commands show how to register the vDAG, create a controller, and run inference.
#### Note: Adjust the IP addresses to match your cluster's endpoints.

### a. Create the vDAG with the AIOS createvDAG Endpoint

In [None]:

import requests
createvDAG_URL = "http://MANAGEMENTMASTER:30501/api/createvDAG"
response = requests.post(createvDAG_URL, json=circular_vdag_spec)
print(f"Parser Response Status: {response.status_code}")
print('Parser Response Body:', response.json())

Parser Response Status: 200
Parser Response Body: {'result': {'task_id': 'a572ebf5-2220-43ed-87be-7df0dd397e37', 'vdagURI': 'llm-circular-vdag-demo-17:1.0.0-stable'}, 'success': True, 'task_id': ''}


### b. Verify the vDAG is registered

In [None]:

!curl -X GET http://MANAGEMENTMASTER:30103/vdag/llm-circular-vdag-demo-17:1.0.0-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3226  100  3226    0     0   499k      0 --:--:-- --:--:-- --:--:--  525k
{
   "data" : {
      "assignment_info" : {
         "debater-A" : "llama4-scout-17b-block-circular",
         "debater-B" : "magistral-small-2506-llama-cpp-block-circular",
         "judge-llm" : "deepseek-r1-distill-70b-block-circular"
      },
      "compiled_graph_data" : {
         "head" : "llama4-scout-17b-block-circular",
         "rev_mapping" : {
            "deepseek-r1-distill-70b-block-circular" : "judge-llm",
            "llama4-scout-17b-block-circular" : "debater-A",
            "magistral-small-2506-llama-cpp-block-circular" : "debater-B"
         },
         "t2_graph" : {
            "deepseek-r1-distill-70b-block-circular" : [],
            "llama4-scout-17b-block-circular" : [
               "magistral-small-2506-llama-cpp-block-cir

In [None]:
# Check the health of each block used by the vDAG.
!curl -X GET http://MANAGEMENTMASTER:30201/block/health/llama4-scout-17b-block-circular
!curl -X GET http://MANAGEMENTMASTER:30201/block/health/magistral-small-2506-llama-cpp-block-circular
!curl -X GET http://MANAGEMENTMASTER:30201/block/health/deepseek-r1-distill-70b-block-circular

### c. Create a vDAG Controller to deploy the pipeline


This command tells AIOS to use the blocks that are already running for our vDAG. For more information, see the [vDAG controller documentation](https://docs.aigr.id/vdag-controller/vdag-controller/).

In [None]:

%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "create_controller",
    "payload": {
      "vdag_controller_id": "llm-circular-vdag-demo-17", 
      "vdag_uri": "llm-circular-vdag-demo-17:1.0.0-stable",
      "config": {
        "policy_execution_mode": "local",
        "replicas": 1
      },
      "search_tags": []
    }
  }'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   354  100    58  100   296    557   2844 --:--:-- --:--:-- --:--:--  3403


{"data":"Controller created successfully","success":true}


### d. Get the controller details

In [None]:

!curl -X GET http://MANAGEMENTMASTER:30103/vdag-controller/llm-circular-vdag-demo-17 | json_pp

### 8. Running and Inspecting the Debate

Once the controller is active, you can start a debate by sending a request to the first model_inference in the chain (`debater-A`).

1.  **Trigger the first turn**: Post to the `v1/infer` endpoint with a `session_id` and `topic`.
2.  **Observe the logs**: The most valuable insights come from watching the logs of the pods.
    - Look for the summarizer's output when it fires.
    - Inspect the debater and judge prompts, which are logged in detail.

In [None]:
import requests
import json

# Example: Start a debate on remote work
# We make the request directly in Python to avoid shell escaping issues.
INFERENCE_URL = "http://CLUSTER1MASTER:32647/v1/infer"
debate_payload = {
    "model": "llama4-scout-17b-block-circular",
    "session_id": "debate_220",
    "seq_no": 1,
    "data": {
        "mode": "chat",
        "message": "Please begin with your opening.",
        # "session_id": "debate_212",
        # "topic": "Debate for atleast 20 rounds,Propose a chain-snatching action recognition system from traffic CCTV, integrating detector+tracker+temporal model; target ≥0.85 F1 with FPPI ≤0.05/hr; include dataset plan and domain adaptation.?"
        "topic": "Debate for atleast 20 rounds,Design an end-to-end real-time computer vision pipeline for multi-camera loitering and chain-snatching detection in urban CCTV. Constraints: 1080p@15–25 FPS, ≤300 ms alert latency, ≥90% recall at FPPI ≤0.1/hr/camera, 200 cameras on mixed Jetson Orin/T4 edge nodes, intermittent connectivity. Specify: detection (persons/riders/hand–object), MOT + re-ID, dwell-time estimation, action recognition, geo-fencing, alerting; model choices (e.g., RT-DETR/YOLOv8 vs lightweight MobileNet-SSD), trackers (ByteTrack/OC-SORT), re-ID (FastReID), temporal models; data/labeling plan and hard-negative mining; robustness (night/rain/occlusion/domain shift); privacy (on-device blur/redaction) and bias checks; monitoring/drift/A/B; throughput and GPU/CPU/power budgets; accuracy–latency trade-offs and fallback modes.Propose an evaluation plan and benchmarks (mAP, IDF1, FPPI, E2E latency) plus synthetic stress tests (night/rain/crowds) for the above pipelines; include A/B and rollout strategy."
    },
    "graph": {},
    "selection_query": {}
}

debate_output = None
try:
    # Long timeout: a multi-round debate can take several minutes to complete.
    response = requests.post(INFERENCE_URL, json=debate_payload, timeout=600)
    response.raise_for_status()
    debate_output = response.json()
    print("Inference request successful. The raw JSON output is stored in 'debate_output'.")
except json.JSONDecodeError:
    # Handled before RequestException: in recent versions of requests, the error
    # raised by response.json() subclasses both exception types.
    print("Failed to parse JSON from response. Raw text:")
    print(response.text)
except requests.exceptions.RequestException as e:
    print(f"Failed to get response from inference service: {e}")

Inference request successful. The raw JSON output is stored in 'debate_output'.


In [100]:
print(debate_output)

{'data': {'bump_round': True, 'judge_text': "Alright, so I'm trying to design an end-to-end real-time computer vision pipeline for detecting loitering and chain-snatching incidents using urban CCTV cameras. The constraints are pretty tight: 1080p resolution at 15–25 FPS, maximum alert latency of 300 ms, and a high recall rate of at least 90% with a low false positive rate (FPPI ≤0.1/hr/camera). Plus, I have to handle 200 cameras on a mix of Jetson Orin and T4 edge nodes, which means I need to be mindful of computational resources and potential connectivity issues.\n\nFirst, I need to break down the problem into manageable components. The pipeline should include detection of persons, riders, and hand-object interactions, multi-object tracking (MOT) with re-identification (re-ID), dwell-time estimation, action recognition, geo-fencing, and alerting. Each of these components will require careful selection of models and algorithms to meet the performance and latency constraints.\n\nFor det

In [101]:
import json
import re

# The 'debate_output' variable now holds the Python dictionary from the request.
try:
    # Check if the request was successful and debate_output is a dictionary
    if debate_output and isinstance(debate_output, dict):
        # Extract the main data object from the response
        data = debate_output.get("data", {})
        
        # The debate history is in router_meta.recent_turns
        recent_turns = data.get("router_meta", {}).get("recent_turns", [])
        topic = data.get("topic", "N/A")

        print("="*50)
        print("              DEBATE REPLAY")
        print("="*50)
        print(f"Topic: {topic}\n")

        if not recent_turns:
            print("No turns found in the output. The debate may have ended immediately or an error occurred.")
            print("\nRaw Output:")
            print(json.dumps(debate_output, indent=2))
        else:
            for i, turn in enumerate(recent_turns):
                role = turn.get("role", "Unknown")
                reply = turn.get("reply", turn.get("text", "No content"))
                print(f"--- Turn {i+1}: Role '{role}' ---")
                print(reply)
                print("-" * (22 + len(role)))
        
        # Display the final judgment if available
        judge_text = data.get("judge_text", "")
        if "FINAL_JUDGMENT" in judge_text:
            # Use regex to extract the relevant parts of the judge's decision
            winner_match = re.search(r"WINNER: (A|B|DRAW)", judge_text)
            reason_match = re.search(r"REASON: (.*)", judge_text, re.DOTALL)
            
            winner = winner_match.group(1) if winner_match else "Not specified"
            reason = reason_match.group(1).strip() if reason_match else "Not specified"

            print("\n" + "="*50)
            print("              FINAL JUDGMENT")
            print("="*50)
            print(f"Decision: FINAL_JUDGMENT")
            print(f"Winner: {winner}")
            print(f"Reason: {reason}")
            print("="*50)
    else:
        print("Debate output not available or is in an incorrect format.")
        if debate_output:
            print("\nRaw Output Received:")
            print(json.dumps(debate_output, indent=2))

except Exception as e:
    print(f"An error occurred while processing the debate output: {e}")
    if debate_output:
        print("\nRaw Output Received:")
        print(json.dumps(debate_output, indent=2))

              DEBATE REPLAY
Topic: Debate for atleast 20 rounds,Design an end-to-end real-time computer vision pipeline for multi-camera loitering and chain-snatching detection in urban CCTV. Constraints: 1080p@15–25 FPS, ≤300 ms alert latency, ≥90% recall at FPPI ≤0.1/hr/camera, 200 cameras on mixed Jetson Orin/T4 edge nodes, intermittent connectivity. Specify: detection (persons/riders/hand–object), MOT + re-ID, dwell-time estimation, action recognition, geo-fencing, alerting; model choices (e.g., RT-DETR/YOLOv8 vs lightweight MobileNet-SSD), trackers (ByteTrack/OC-SORT), re-ID (FastReID), temporal models; data/labeling plan and hard-negative mining; robustness (night/rain/occlusion/domain shift); privacy (on-device blur/redaction) and bias checks; monitoring/drift/A/B; throughput and GPU/CPU/power budgets; accuracy–latency trade-offs and fallback modes.Propose an evaluation plan and benchmarks (mAP, IDF1, FPPI, E2E latency) plus synthetic stress tests (night/rain/crowds) for the abo

### 9. Observability: Kubernetes Dashboard and Logs

- **Kubernetes Dashboard** (if enabled):
  - Open: https://CLUSTER1MASTER:32319/


### 10. Troubleshooting


- **Missing summaries**: Check the `summarize_every_n_messages` and `min_tokens_for_summarization` gates. Short conversations won't trigger summaries.
- **Unexpected judge finalization**: The `max_rounds` or `judge_continue_cap` was likely hit. The judge router has a "force-finalize" handshake to ensure termination.
- **Schema mismatches**: Ensure all components are using the canonical fields. Legacy fields are ignored.
- **Duplicate turns in summarizer window**: This can happen if the role or text differs slightly. The current deduplication is role-aware and whitespace-normalized.
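
For the last item, the following sketch shows what role-aware, whitespace-normalized deduplication means in practice. It is an illustration rather than the policy's actual code: `dedupe_turns` is a hypothetical name, and the comparison here is case-sensitive, which may or may not match the policy.

```python
def dedupe_turns(turns: list) -> list:
    """Drop repeated turns, comparing (role, whitespace-normalized text)."""
    seen = set()
    unique = []
    for turn in turns:
        text = turn.get("text", turn.get("reply", ""))
        # Collapse runs of whitespace and trim the ends before comparing.
        key = (turn.get("role"), " ".join(text.split()))
        if key not in seen:
            seen.add(key)
            unique.append(turn)
    return unique


turns = [
    {"role": "A", "text": "Remote work  helps."},
    {"role": "A", "text": "Remote work helps.\n"},  # same after normalization
    {"role": "B", "text": "Remote work helps."},    # same text, different role
]
deduped = dedupe_turns(turns)
print(deduped)
```

Under this scheme, two turns that differ by a single word or by their role are both kept, which is why near-duplicates can still appear in the window.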

### 11. Cleanup

Always remove controllers and vDAGs that are no longer in use to free up cluster resources.

### a. Remove the controller

In [None]:
%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "remove_controller",
    "payload": {
      "vdag_controller_id": "llm-circular-vdag-demo-17"
    }
  }'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   177  100    58  100   119    705   1447 --:--:-- --:--:-- --:--:--  2158


{"data":"Controller removed successfully","success":true}


### b. Delete the vDAG definition

In [None]:
%%bash
curl -X DELETE http://MANAGEMENTMASTER:30103/vdag/llm-circular-vdag-demo-17:1.0.0-stable

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    51  100    51    0     0   5858      0 --:--:-- --:--:-- --:--:--  6375


{"data":{"message":"vDAG deleted"},"success":true}


### 12. Circular vDAG Summary

Here is a visual summary of the circular vDAG we have built in this tutorial. This diagram shows how the Debaters and the Judge interact, with the policies routing the conversation between them in a loop until a final judgment is reached.

![Circular vDAG Summary](./circular_vdag.png)