## 🧠 vDAGs — Building Scalable, Distributed AI Workflows with Blocks

- Author: Shridhar Kini ([Profile](https://www.linkedin.com/in/shridhar-kini-79911249/?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=android_app))
- To Run Securely: first run `jupyter notebook password` to set a password for notebook access
- To Run: `jupyter notebook --allow-root  --port 9999 --ip=0.0.0.0`
- To Clear Outputs: Use `jupyter nbconvert --clear-output --inplace vdag-demo.ipynb`

In modern AI systems, building powerful applications is no longer about deploying a single large model. Instead, it is about connecting many smaller, reusable, and scalable components, each handling a part of the task. This is where **vDAGs**, or **virtual Directed Acyclic Graphs**, come in: an abstraction for designing **distributed AI workflows**.

A **vDAG** represents a **virtual workflow composed of interconnected “blocks”**, where each block serves a specific AI or computational function. These blocks are created and deployed by developers on their own clusters using the **AIOSv1 Instance SDK**, and can scale independently based on demand.

What makes vDAGs powerful is that they allow you to build **end-to-end applications** using these distributed blocks, orchestrating them across a graph structure that can span **within or across multiple clusters**.

---

## 🔍 What Is a Block?

Before diving deeper into vDAGs, let’s clarify what a **block** is.

A **block** is the core serving component in the AIOSv1 ecosystem. It represents a self-contained unit responsible for:

* Instantiating and serving AI models or general-purpose computation
* Scaling based on load
* Being managed dynamically across a distributed cluster environment

Blocks are deployed by users on any cluster that meets the resource requirements. Once deployed, they can serve:

* As **nodes in one or more vDAGs**
* Or as **standalone inference endpoints** outside any vDAG

You can read more about the block [here](https://docs.aigr.id/block/block/).

---

## 🔗 What Is a vDAG?

A **vDAG** (virtual Directed Acyclic Graph) is a **workflow composed of blocks**, where each node in the graph is a block (or even another vDAG). It defines how data flows through a sequence of operations — such as preprocessing, model inference, post-processing, and so on — executed across the network of blocks.

The key word here is *virtual*. vDAGs don’t physically contain the logic — they refer to existing blocks deployed on clusters. Think of a vDAG as a **blueprint or routing plan** for how a particular task should be processed by different blocks across the network.
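To make the "routing plan" idea concrete, here is a minimal sketch in plain Python. It models a vDAG as an adjacency map from each node to its downstream nodes and derives a valid execution order from it. The node names are the block labels used later in this notebook; the `ROUTING_PLAN` map and the `execution_order` helper are hypothetical illustrations, not part of the AIOS SDK.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical adjacency map: node label -> list of downstream node labels.
ROUTING_PLAN = {
    "gemma3-27b-block": ["llama4-scout-17b-block"],
    "llama4-scout-17b-block": ["magistral-small-2506-llama-cpp-block"],
    "magistral-small-2506-llama-cpp-block": [],
}


def execution_order(plan):
    """Return the nodes in an order where every node follows its upstream nodes."""
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {node: set() for node in plan}
    for node, downstream in plan.items():
        for child in downstream:
            predecessors.setdefault(child, set()).add(node)
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(predecessors).static_order())


print(execution_order(ROUTING_PLAN))
```

The map only names blocks; it holds no model logic itself, which is the sense in which the graph is virtual.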

---

## ⚙️ Key Features of vDAG

* ✅ **Modular Composition**: Each node is a block that can be reused in multiple workflows or used standalone.
* ✅ **Nested Graphs**: Nodes in a vDAG can themselves reference other vDAGs (subgraphs).
* ✅ **Cross-Cluster Execution**: Nodes can reside on different clusters depending on where blocks are deployed.
* ✅ **Assignment Policies**: During vDAG creation, a policy can select the most suitable block from a pool of candidates for a given node.
* ✅ **Custom Behavior**: Each node supports **pre-processing** and **post-processing policies**, which run before or after the core block function — allowing for transformations, validation, routing logic, etc.
* ✅ **Flexible Patterns**: Supports fan-in/fan-out logic like multiple producers → single consumer, ensembles, and branching.

## vDAGs End-to-End Workflow

This notebook demonstrates how to build and interact with **vDAGs (virtual Directed Acyclic Graphs)** — a powerful abstraction for building scalable, distributed AI pipelines.

We'll walk through:

1. ✅ Creating a vDAG with multiple LLM blocks
2. 🛠️ Deploying a Controller that manages routing, health, and policy execution
3. 🚀 Submitting real inference requests using multimodal inputs
4. 🧹 Cleaning up the controller after use

By the end of this notebook, you will have a working vDAG pipeline capable of handling image + text analysis for surveillance-style use cases.


## 🧱 Step 1: Register the vDAG

In this step, we define a virtual DAG that outlines how input data should flow through a series of AI blocks.

Each block in the vDAG performs a specific role:

- `gemma3-27b-block`: Analyzes the input image and extracts semantic scene information (objects, interactions, environment).
- `llama4-scout-17b-block`: Processes the extracted attributes to detect high-level events like theft or accidents.
- `magistral-small-2506-llama-cpp-block`: Acts as a decision-making module to determine if the event requires escalation (e.g., alerts, notifications).



### 🔄 Flow:

![image](./timeline-1_rescaled.jpg)

To learn more about Functions, see [here](https://github.com/OpenCyberspace/OpenOS.AI-Documentation/blob/main/policies-system/policies-system.md).

You can read more about the vDAG spec ([here](https://docs.aigr.id/parser/vdag/)).


**The Python script below registers the vDAG. Once the vDAG is created, it is assigned a unique vDAG URI, a string built from `vdagName` and `vdagVersion` as follows: `<vdagName>:<vdagVersion.version>-<vdagVersion.release-tag>`.**

**Pre-requisites:**
- The user should know how to deploy blocks on the AIOS cluster.
- The user should know how to register a policy.


In [1]:
import requests

PARSER_URL = "http://MANAGEMENTMASTER:30501/api/createvDAG"

data = {
  "parser_version": "Parser/V1",
  "body": {
    "spec": {
      "values": {
        "vdagName": "llm-analyzer-aug6",
        "vdagVersion": {
          "version": "0.0.6",
          "release-tag": "stable"
        },
        "discoveryTags": [
          "vdag-llm",
          "llm-vdag"
        ],
        "controller": {},
        "nodes": [
          {
            "spec": {
              "values": {
                "nodeLabel": "gemma3-27b-block",
                "nodeType": "block",
                "manualBlockId": "gemma3-27b-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "llama4-scout-17b-block",
                "nodeType": "block",
                "manualBlockId": "llama4-scout-17b-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "magistral-small-2506-llama-cpp-block",
                "nodeType": "block",
                "manualBlockId": "magistral-small-2506-llama-cpp-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {
                  "policyRuleURI": "post_processor_for_job_caller:0.0.1-stable"
                },
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          }
        ],
        "graph": {
          "input": [
            {
              "nodeLabel": "gemma3-27b-block",
              "inputNames": [
                "input_0"
              ]
            }
          ],
          "output": [
            {
              "nodeLabel": "magistral-small-2506-llama-cpp-block",
              "outputNames": [
                "output_0"
              ]
            }
          ],
          "connections": [
            {
              "nodeLabel": "llama4-scout-17b-block",
              "inputs": [
                {
                  "nodeLabel": "gemma3-27b-block",
                  "outputNames": [
                    "output_0"
                  ]
                }
              ]
            },
            {
              "nodeLabel": "magistral-small-2506-llama-cpp-block",
              "inputs": [
                {
                  "nodeLabel": "llama4-scout-17b-block",
                  "outputNames": [
                    "output_0"
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}

response = requests.post(PARSER_URL, json=data)
print(response.status_code)
print('api response', response.json())

200
api response {'result': {'task_id': '3c5887c7-6544-43d1-92af-797a325af7c2', 'vdagURI': 'llm-analyzer-aug6:0.0.6-stable'}, 'success': True, 'task_id': ''}
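The URI rule stated earlier can be checked against this response in code. The sketch below composes the expected URI from the `vdagName` and `vdagVersion` values and compares it with the `vdagURI` returned by the API. Both helper names are hypothetical, and the error handling assumes that a failed registration returns a falsy `success` field, which this notebook does not show.

```python
def expected_vdag_uri(values):
    """Compose the URI as <vdagName>:<version>-<release-tag>."""
    version = values["vdagVersion"]
    return f'{values["vdagName"]}:{version["version"]}-{version["release-tag"]}'


def extract_vdag_uri(api_response):
    """Return the vdagURI from a createvDAG response, or raise if it failed."""
    # Assumption: a failed registration reports a falsy "success" field.
    if not api_response.get("success"):
        raise RuntimeError(f"vDAG registration failed: {api_response}")
    return api_response["result"]["vdagURI"]


# Values copied from the request and the response shown in this notebook.
values = {
    "vdagName": "llm-analyzer-aug6",
    "vdagVersion": {"version": "0.0.6", "release-tag": "stable"},
}
api_response = {
    "result": {
        "task_id": "3c5887c7-6544-43d1-92af-797a325af7c2",
        "vdagURI": "llm-analyzer-aug6:0.0.6-stable",
    },
    "success": True,
    "task_id": "",
}

print(extract_vdag_uri(api_response) == expected_vdag_uri(values))
```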


**You can confirm that the vDAG was created by querying the vDAGs registry with its `vdagURI`:**

In [2]:
%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag/llm-analyzer-aug6:0.0.6-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2777  100  2777    0     0   4856      0 --:--:-- --:--:-- --:--:--  4854


{
   "data" : {
      "assignment_info" : {
         "gemma3-27b-block" : "gemma3-27b-block",
         "llama4-scout-17b-block" : "llama4-scout-17b-block",
         "magistral-small-2506-llama-cpp-block" : "magistral-small-2506-llama-cpp-block"
      },
      "compiled_graph_data" : {
         "head" : "gemma3-27b-block",
         "rev_mapping" : {
            "gemma3-27b-block" : "gemma3-27b-block",
            "llama4-scout-17b-block" : "llama4-scout-17b-block",
            "magistral-small-2506-llama-cpp-block" : "magistral-small-2506-llama-cpp-block"
         },
         "t2_graph" : {
            "gemma3-27b-block" : [
               "llama4-scout-17b-block"
            ],
            "llama4-scout-17b-block" : [
               "magistral-small-2506-llama-cpp-block"
            ],
            "magistral-small-2506-llama-cpp-block" : []
         },
         "t3_graph" : {
            "gemma3-27b-block" : {
               "outputs" : [
                  {
                     "block

## 🧭 Step 2: Deploy a vDAG Controller

The vDAG Controller is the **runtime engine** that orchestrates the flow of data through the vDAG graph.

It handles:
- Task routing between blocks
- Health and quota monitoring
- Quality management: using a policy to capture outputs and verify them (manually or automatically)

You can read about the vDAG controller [here](https://docs.aigr.id/vdag-controller/vdag-controller/).

### Configuration Parameters:
- `vdag_uri`: Which vDAG this controller will serve
- `policy_execution_mode`: Whether to run policies locally or remotely
- `replicas`: How many controller pods to run (for redundancy or scale)

> You can deploy multiple controllers for the same vDAG across clusters for multi-region or HA setups.

The command below deploys a vDAG controller for the vDAG we created with 2 replicas:

In [3]:
%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "create_controller",
    "payload": {
      "vdag_controller_id": "aug6-controller", 
      "vdag_uri": "llm-analyzer-aug6:0.0.6-stable",
      "config": {
        "policy_execution_mode": "local",
        "replicas": 2
      },
      "search_tags": []
    }
  }'


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   336  100    58  100   278     91    437 --:--:-- --:--:-- --:--:--   529


{"data":"Controller created successfully","success":true}
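The same request can be built in Python instead of curl. The sketch below only assembles the URL and JSON body, so it runs without a cluster; the helper name `build_create_controller_request` and its default values are hypothetical, while the field names mirror the curl payload above.

```python
import json


def build_create_controller_request(base_url, cluster_id, controller_id, vdag_uri,
                                    replicas=1, policy_execution_mode="local",
                                    search_tags=None):
    """Return (url, body) for a create_controller call on the given cluster."""
    url = f"{base_url}/vdag-controller/{cluster_id}"
    body = {
        "action": "create_controller",
        "payload": {
            "vdag_controller_id": controller_id,
            "vdag_uri": vdag_uri,
            "config": {
                "policy_execution_mode": policy_execution_mode,
                "replicas": replicas,
            },
            "search_tags": list(search_tags or []),
        },
    }
    return url, body


url, body = build_create_controller_request(
    "http://MANAGEMENTMASTER:30600",
    "gcp-cluster-2",
    "aug6-controller",
    "llm-analyzer-aug6:0.0.6-stable",
    replicas=2,
)
print(url)
print(json.dumps(body, indent=2))
# To send it from a notebook cell: requests.post(url, json=body, timeout=30)
```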


**We can list the controllers available for a given vDAG by specifying its `vdag_uri` in the command below (this example queries a different vDAG, `llm-analyzer:0.0.3-stable`):**

In [1]:
%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag-controllers/by-vdag-uri/llm-analyzer:0.0.3-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1871  100  1871    0     0   3480      0 --:--:-- --:--:-- --:--:--  3484


{
   "data" : [
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:32696",
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:31351",
            "rpc_url" : "CLUSTER1MASTER:30095"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:30095",
         "search_tags" : [
            "objedet",
            "narasimha",
            "prasanna"
         ],
         "vdag_controller_id" : "llm-004",
         "vdag_uri" : "llm-analyzer:0.0.3-stable"
      },
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:32084",
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:30436",
            "rpc_url" : "CLUSTER1MASTER:30666"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:

The details of an individual controller can also be queried by specifying its `vdag_controller_id`:

In [5]:
%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag-controller/aug6-controller | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   396  100   396    0     0    756      0 --:--:-- --:--:-- --:--:--   757


{
   "data" : {
      "cluster_id" : "gcp-cluster-2",
      "config" : {
         "api_url" : "http://CLUSTER1MASTER:31096",
         "policy_execution_mode" : "local",
         "replicas" : 2,
         "rest_url" : "http://CLUSTER1MASTER:32579",
         "rpc_url" : "CLUSTER1MASTER:32693"
      },
      "metadata" : {},
      "public_url" : "CLUSTER1MASTER:32693",
      "search_tags" : [
         "vdag-llm",
         "llm-vdag"
      ],
      "vdag_controller_id" : "aug6-controller",
      "vdag_uri" : "llm-analyzer-aug6:0.0.6-stable"
   },
   "success" : true
}


Every vDAG controller exposes REST and gRPC APIs for submitting inference requests, and a REST service for managing the health checker, quality checker, and quota management policies. Use `config.api_url` to submit inference requests over the REST API and `config.rpc_url` to submit them over the gRPC interface.
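As a small illustration, the two endpoints can be derived from a controller record like the one returned above. The helper name `controller_endpoints` is hypothetical; the `/v1/infer` path is the one used by the REST example in this notebook.

```python
def controller_endpoints(record):
    """Derive the REST inference URL and the gRPC target from a controller record."""
    config = record["config"]
    return {
        # REST inference endpoint: api_url plus the /v1/infer path.
        "rest_infer_url": config["api_url"].rstrip("/") + "/v1/infer",
        # gRPC target in host:port form, usable with grpc.insecure_channel().
        "grpc_target": config["rpc_url"],
    }


# The "data" object from the controller query shown above (trimmed).
record = {
    "cluster_id": "gcp-cluster-2",
    "config": {
        "api_url": "http://CLUSTER1MASTER:31096",
        "policy_execution_mode": "local",
        "replicas": 2,
        "rest_url": "http://CLUSTER1MASTER:32579",
        "rpc_url": "CLUSTER1MASTER:32693",
    },
    "vdag_controller_id": "aug6-controller",
}

print(controller_endpoints(record))
```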

## 🤖 Step 3: Run Inference on the vDAG

We now simulate an **inference task** using a multi-modal input — both **text** and **image**.

### Input Structure:
- `session_id`: Used for tracking and quota enforcement
- `seq_no`: Monotonically increasing number per session
- `data.mode`: Set to `"chat"` to enable conversational behavior
- `messages`: The core input — includes both a text prompt and an image URL
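The fields above can be assembled with a small helper that keeps `seq_no` increasing within a session. This is a sketch under assumptions: the `InferenceSession` class is hypothetical, starting the counter at 1 is an arbitrary choice, and the empty `graph` and `selection_query` fields simply mirror the REST request used in this notebook.

```python
import itertools
import uuid


class InferenceSession:
    """Builds request bodies with a per-session, monotonically increasing seq_no."""

    def __init__(self, session_id=None):
        self.session_id = session_id or str(uuid.uuid4())
        self._seq = itertools.count(1)  # assumption: numbering starts at 1

    def next_request(self, text, image_url=None, **gen_params):
        # A message carries a text prompt and, optionally, an image URL.
        content = [{"type": "text", "text": text}]
        if image_url:
            content.append({"type": "image_url", "image_url": {"url": image_url}})
        return {
            "session_id": self.session_id,
            "seq_no": next(self._seq),
            "data": {
                "mode": "chat",
                "gen_params": gen_params,
                "messages": [{"content": content}],
            },
            "graph": {},
            "selection_query": {},
        }


session = InferenceSession("session1")
first = session.next_request("Describe the scene.", temperature=0.1, top_p=0.95)
second = session.next_request("Describe the scene.", image_url="https://example.com/frame.jpg")
print(first["seq_no"], second["seq_no"])
```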

### What Happens:
1. The request is received by the controller
2. It routes the input to `gemma3-27b-block` for vision analysis
3. Then to `llama4-scout-17b-block` for event detection
4. Finally to `magistral-small-2506-llama-cpp-block` for decision making and alert triggering

The final output will be a structured scene description and alert status.


### Inference using REST API

Inference requests can be submitted to the REST API with the command below (the REST API URL is in the `config.api_url` field of the vDAG controller data):

In [6]:
%%bash
curl -X POST  http://CLUSTER1MASTER:31096/v1/infer \
  -H "Content-Type: application/json" \
  -d '{
  "session_id": "session1",
  "seq_no": 5,
  "data": {
    "mode": "chat",
    "gen_params": {
      "temperature": 0.1,
      "top_p": 0.95,
      "max_tokens": 4096
    },
    "messages": [
      {
        "content": [
          {
            "type": "text",
            "text": "Analyze the following image and generate your objective scene report.?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://akm-img-a-in.tosshub.com/indiatoday/images/story/202311/chain-snatching-caught-on-camera-in-bengaluru-293151697-16x9_0.jpg"
            }
          }
        ]
      }
    ]
  },
  "graph": {},
  "selection_query": {}
}' | json_pp


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6239  100  5549  100   690     36      4  0:02:52  0:02:32  0:00:20  1329


{
   "data" : {
      "reply" : "### **Event Analysis**\n\n**Classification:** `Alertable`\n\n**Summary:** The event involves a person lying on the ground with a man assisting them, possibly due to an accident or incident. The presence of standing water on the road may have contributed to the situation. Further investigation is recommended to determine the exact nature of the event.\n\n---\n\n### **A. Python Policy Script (`alerter.py`)**\n\n```python\nimport logging\nimport requests\n\nclass AIOSv1PolicyRule:\n    def __init__(self, rule_id, settings, parameters):\n        self.rule_id = rule_id\n        self.settings = settings\n        self.parameters = parameters\n\n    def eval(self, parameters, input_data, context):\n        try:\n            summary = input_data.get(\"summary\", \"\")\n            if not summary:\n                return\n\n            destination_url = self.settings.BASE_URI\n            payload = {\n                \"source_rule_id\": self.rule_id,\n           

----
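The REST response above wraps the model output in `data.reply` as Markdown text. The sketch below pulls the classification label out of such a reply. It assumes the reply keeps the `**Classification:**` line format seen in this run; since the text is generated by an LLM, that format may vary, so the hypothetical `parse_reply` helper returns `None` when the label is missing.

```python
import re


def parse_reply(response_json):
    """Extract the reply text and, if present, its classification label."""
    reply = response_json["data"]["reply"]
    # Matches a line such as: **Classification:** `Alertable`
    match = re.search(r"\*\*Classification:\*\*\s*`([^`]+)`", reply)
    return {
        "classification": match.group(1) if match else None,
        "reply": reply,
    }


# Shortened sample in the shape of the REST response shown above.
sample = {
    "data": {
        "reply": "### **Event Analysis**\n\n**Classification:** `Alertable`\n\n**Summary:** ..."
    }
}
print(parse_reply(sample)["classification"])
```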

### Inference using gRPC API

The vDAG controller provides a well-defined gRPC interface and a structured protobuf definition for submitting inference requests:

```proto
syntax = "proto3";

message vDAGFileInfo {
    string metadata = 1; 
    bytes file_data = 2; 
}

// Definition of the message structure
message vDAGInferencePacket {
    string session_id = 3;      
    uint64 seq_no = 4;           
    bytes frame_ptr = 5;       
    string data = 6;             
    double ts = 8;
    repeated vDAGFileInfo files = 9;
}

// Definition of the gRPC service
service vDAGInferenceService {
    rpc infer(vDAGInferencePacket) returns (vDAGInferencePacket);
}
```
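To use this definition from Python, the stubs have to be generated first. The sketch below builds the `grpc_tools.protoc` command for that, which needs the `grpcio-tools` package installed before it can actually run. The file location `proto/vdag_service.proto` is an assumption, chosen so that the generated modules can be imported as `proto.vdag_service_pb2` and `proto.vdag_service_pb2_grpc`; both helper names are hypothetical.

```python
import subprocess
import sys


def protoc_command(proto_file="proto/vdag_service.proto", include_dir=".", out_dir="."):
    """Build the argument list that generates Python message and gRPC stubs."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{include_dir}",              # root used to resolve the proto path
        f"--python_out={out_dir}",       # writes vdag_service_pb2.py
        f"--grpc_python_out={out_dir}",  # writes vdag_service_pb2_grpc.py
        proto_file,
    ]


def generate_stubs(**kwargs):
    """Run protoc; raises CalledProcessError if generation fails."""
    subprocess.run(protoc_command(**kwargs), check=True)


# Print the command instead of running it, so this cell has no side effects.
print(" ".join(protoc_command()))
```

Calling `generate_stubs()` from the directory that contains the `proto/` folder should produce the two modules next to the proto file.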

The proto file above can be compiled and imported in any programming language with gRPC support to integrate the vDAG controller into your application. Here is a sample Python file that submits an inference task and waits for the result using the gRPC API.

In [None]:
import grpc
import time
import logging
from pathlib import Path
from uuid import uuid4
import json
from concurrent.futures import ThreadPoolExecutor

from proto.vdag_service_pb2 import vDAGInferencePacket, vDAGFileInfo
from proto.vdag_service_pb2_grpc import vDAGInferenceServiceStub

# Configure logging
logging.basicConfig(level=logging.INFO)


def load_file_info(file_path: str, metadata: str = "") -> vDAGFileInfo:
    with open(file_path, "rb") as f:
        file_data = f.read()
    return vDAGFileInfo(metadata=metadata, file_data=file_data)


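def send_inference_request(seq_no: int):
    # Connect to the controller's gRPC endpoint: use the `config.rpc_url`
    # value from the controller details (CLUSTER1MASTER:32693 for aug6-controller).
    channel = grpc.insecure_channel("CLUSTER1MASTER:32693")
    stub = vDAGInferenceServiceStub(channel)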

    data = {
        "mode": "chat",
        "gen_params": {
            "temperature": 0.1,
            # "min_p": 0.01,
            # "top_k": 64,
            "top_p": 0.95,
            "max_tokens": 4096  # Set a limit for the response length
        },
        "messages": [{"content": [
            {"type": "text", "text": "Analyze the following image and generate your objective scene report.?"},
            {"type": "image_url",
             "image_url": {"url": "https://akm-img-a-in.tosshub.com/indiatoday/images/story/202311/chain-snatching-caught-on-camera-in-bengaluru-293151697-16x9_0.jpg"}}]}]
    }
    ts = time.time()

    # Optional file
    example_file = Path("example.txt")
    if example_file.exists():
        files = [load_file_info(str(example_file), metadata=f"seq_{seq_no}")]
    else:
        files = []

    session_id = str(uuid4())

    request = vDAGInferencePacket(
        session_id=session_id,
        seq_no=seq_no,
        frame_ptr=b"",
        data=json.dumps(data),
        ts=ts,
        files=files
    )

    try:
        logging.info(f"[seq_no={seq_no}] Sending request")
        st = time.time()
        response = stub.infer(request)
        et = time.time()
        logging.info(
            f"[seq_no={seq_no}] Response: data={response.data}, latency={et - st:.3f}s")
    except grpc.RpcError as e:
        logging.error(
            f"[seq_no={seq_no}] gRPC error: {e.code()} - {e.details()}")


send_inference_request(10)


## 🧹 Step 4: Clean-up

The controller can be removed using the following command:

In [None]:
%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "remove_controller",
    "payload": {
      "vdag_controller_id": "aug6-controller"
    }
  }'

If the vDAG entry is no longer needed, it can be removed using the following command:

In [None]:
%%bash
curl -X DELETE http://MANAGEMENTMASTER:30103/vdag/llm-analyzer-aug6:0.0.6-stable


## Testing the live demo using Streamlit dashboard

We have created a Streamlit dashboard that provides live, real-time interaction through a chat interface. Install Streamlit first:

`pip install streamlit`

Run the Streamlit demonstration:
`streamlit run app_vdag_final_working.py`

 