# Tutorial: Simple vDAG for Workflow Automation with Swappable Policies

In this tutorial we explore an architecture that combines **swappable policies** with a **generic Large Language Model (LLM)** to build workflows easily. We then show how to write custom policies that act as tools, so that the same pattern can drive different workflows using the AIOSv1 policies.

## Tutorial Overview
1. **Task**: Automate the process of screening candidate resumes against a job description using an LLM-powered workflow.
2. **Policies**: Discover how pre-processing policies prepare data for an LLM and post-processing policies orchestrate actions based on the LLM's output.
3. **Policies and Inference Code Overview**: Examine the Python code for the resume-parsing policy, the action-dispatching policy, and the model inference code.
4. **vDAG Spec**: Define the entire vDAG, attaching the specific pre- and post-processing policies to a generic block, all within a single JSON specification.
5. **Running this workflow as vDAG**: Deploy the vDAG and trigger the complete, multi-step recruitment workflow with a single API call.
6. **Conclusion**: Master the use of policies to build modular, scalable, and intelligent automation workflows on the AIOS platform.

## 1. Task

Our goal is to automate the initial screening of resumes and to send each candidate a follow-up email. The pipeline should:
1.  Accept a `.zip` file containing multiple resumes in PDF format.
2.  Extract the text from each resume.
3.  Use an LLM to analyze the resumes against a job description and decide on next steps (e.g., background check, send rejection email).
4.  Execute the resulting tool calls automatically.

## 2. The Architecture: Swappable Policies


### What are Policies in AIOSv1?
A policy is dynamically loadable, executable Python code that is used in many places and use cases across the AIOS system. Because policies are loaded dynamically, developers can use them to implement custom functionality throughout the system.

> 📖 **Further Reading**: [AIOSv1 Policies System Overview](https://github.com/OpenCyberspace/OpenOS.AI-Documentation/blob/main/policies-system/policies-system.md)


The pipeline is designed to be highly modular. Instead of a single, monolithic block, we use a central, generic LLM block that is customized at runtime by the `preprocessingPolicyRule` and `postprocessingPolicyRule` policies attached to it in a vDAG.

A key concept in this architecture is that the tools are policies which are **stateless**, or "fire and die." As detailed in the `policies-system.md` documentation, policies of this kind (Type 3) are loaded for a single execution to perform a specific task and are then terminated. They do not keep any memory or state between requests, which makes the system robust, scalable, and easy to debug.
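
To make the "fire and die" idea concrete, here is a minimal sketch of stateless execution. It is only an illustration, not how AIOS loads policies: the `POLICY_SOURCES` registry, the `demo_counter` policy, and `run_policy_once` are all hypothetical names. Each call loads the policy source into a fresh namespace, runs its `eval` function once, and discards everything afterwards.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory registry; AIOS resolves policy rule URIs through
# its own policy system, which is not reproduced here.
POLICY_SOURCES: Dict[str, str] = {
    "demo_counter:1.0.0-stable": (
        "calls = 0\n"
        "def eval(data):\n"
        "    global calls\n"
        "    calls += 1\n"
        "    return {'calls_seen': calls, 'echo': data}\n"
    ),
}


def run_policy_once(policy_uri: str, data: Dict[str, Any]) -> Dict[str, Any]:
    """Load the policy into a fresh namespace, call eval() once, then drop it."""
    namespace: Dict[str, Any] = {}
    exec(POLICY_SOURCES[policy_uri], namespace)
    policy_eval: Callable[[Dict[str, Any]], Dict[str, Any]] = namespace["eval"]
    return policy_eval(data)


first = run_policy_once("demo_counter:1.0.0-stable", {"n": 1})
second = run_policy_once("demo_counter:1.0.0-stable", {"n": 2})

# Both runs report a count of 1: the module-level counter does not survive
# between executions because each run starts from an empty namespace.
print(first["calls_seen"], second["calls_seen"])
```

Because nothing survives between runs, a tool policy has to receive everything it needs through its input data.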

**Flow Diagram:**
```
[Input: zip file of resumes] -> [vDAG Block level Preprocessing Policy] -> [Generic LLM Model] -> [vDAG Block level Postprocessing Policy] -> [Automated Actions]
```

*   **`Preprocessing Policy`**: Its only job is to handle the input data. In this case, it unzips the file, finds all PDFs, and extracts their text content. It then passes this data as `supplemental_data` to the next block.
*   **`Generic LLM Model`**: This is standard `llama-cpp-python` model code. It is designed to be unaware of the specific task. It receives a prompt and, optionally, `supplemental_data`, and generates a response based on those inputs.
*   **`Postprocessing Policy`**: This policy takes the raw output from the LLM, parses it, and takes action. For our use case, it's responsible for parsing the JSON and executing the requested `tool_calls`.

See the following tutorial if you want to use the pre-processing and post-processing policies provided by the block author instead:
- [Tutorial: Building a Modular Recruitment Automation/Workflow Pipeline From Author's Policies](http://CLUSTER2NODE1:9999/notebooks/recruitment_automation_tutorial.ipynb)

## 3. Policies and Inference Code Overview

### Preprocessing Policy: Getting the Data Ready

This policy handles the initial data extraction. It's a simple Python function that uses standard libraries to process the file.

**File:** `policies/recruitment_automation/preprocessing_policy/code/function.py`

In [None]:
import os
import zipfile
import fitz  # PyMuPDF
import io

def eval(data):
    # Assumes 'file' is a key in the input data, containing the zip file bytes
    zip_file_bytes = data.get('file')
    if not zip_file_bytes:
        return {"error": "No file provided"}

    extracted_texts = []
    with io.BytesIO(zip_file_bytes) as zip_stream:
        with zipfile.ZipFile(zip_stream, 'r') as zip_ref:
            for file_name in zip_ref.namelist():
                if file_name.lower().endswith('.pdf'):
                    with zip_ref.open(file_name) as pdf_file:
                        pdf_bytes = pdf_file.read()
                        with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
                            text = ""
                            for page in doc:
                                text += page.get_text()
                            extracted_texts.append({"file_name": file_name, "content": text})

    # The output of this policy becomes the 'supplemental_data' for the LLM block
    return {"candidates": extracted_texts}
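
One practical detail when filtering archive members by extension: archives created with the macOS Archive Utility usually contain `__MACOSX/._<name>` metadata entries, and those also end in `.pdf` without being real PDFs. The sketch below uses only the standard library and shows one way to skip them before handing files to a PDF parser. The helper name `list_pdf_members` is an assumption for illustration and is not part of the policy above.

```python
import io
import zipfile
from typing import List


def list_pdf_members(zip_bytes: bytes) -> List[str]:
    """Return archive members that look like real PDF files."""
    names: List[str] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as archive:
        for info in archive.infolist():
            base_name = info.filename.rsplit("/", 1)[-1]
            if info.is_dir():
                continue  # directory entries carry no content
            if info.filename.startswith("__MACOSX/") or base_name.startswith("._"):
                continue  # macOS resource-fork metadata, not a document
            if base_name.lower().endswith(".pdf"):
                names.append(info.filename)
    return names


# Build a small in-memory archive to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("resumes/alice.pdf", b"%PDF-1.4 placeholder")
    archive.writestr("__MACOSX/resumes/._alice.pdf", b"metadata")
    archive.writestr("notes.txt", b"not a resume")

pdf_members = list_pdf_members(buffer.getvalue())
print(pdf_members)
```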

### Postprocessing Policy: Taking Action

This policy is responsible for interpreting the LLM's response. A key lesson learned was that LLMs don't always produce perfect JSON. They might wrap it in markdown or add extra text. Therefore, we need a robust way to extract the JSON.

**File:** `policies/recruitment_automation/postprocessing_policy/code/function.py`

In [None]:
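import json
import logging
import re
import time
from typing import Any, Dict

import requests

logger = logging.getLogger(__name__)


def _snippet(value: Any, limit: int = 200) -> str:
    # Assumed helper (not shown in the original excerpt): shortens values for log lines
    text = str(value)
    return text if len(text) <= limit else text[:limit] + "..."


def extract_json_from_response(response_text):
    # Use regex to find JSON wrapped in markdown-style code blocks
    match = re.search(r"```json\n(.*?)\n```", response_text, re.DOTALL)
    if match:
        json_str = match.group(1)
    else:
        # Fallback for cases where there's no markdown wrapping
        json_str = response_text

    try:
        return json.loads(json_str)
    except json.JSONDecodeError:
        # Handle cases where the extracted string is still not valid JSON
        return {"error": "Failed to decode JSON from LLM response", "response": response_text}


class JobClient:
    # Hypothetical wrapper: the original excerpt shows only the method below,
    # not its enclosing class. The default values here are placeholders.
    def __init__(self, aios_url: str, job_timeout: float = 300, job_poll_interval: float = 2):
        self.aios_url = aios_url
        self.job_timeout = job_timeout
        self.job_poll_interval = job_poll_interval

    def submit_and_monitor_job(self, job_name: str, policy_rule_uri: str, params: Dict[str, Any]) -> Dict[str, Any]: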
        """
        Submits a job to an AIOS executor and monitors it until completion.
        """
        # 1. Submit the job
        
        submit_endpoint = f"{self.aios_url}/jobs/submit/executor-001"
        
        submit_payload = {
            "name": job_name,
            "policy_rule_uri": policy_rule_uri,
            "inputs": params,
            "policy_rule_parameters": {},
            "node_selector": {}
        }
        
        logger.info(f"Submitting job '{job_name}' for policy '{policy_rule_uri}' with params: {_snippet(params)}")
        
        try:
            headers = {'Content-Type': 'application/json'}
            response = requests.post(submit_endpoint, json=submit_payload, headers=headers, timeout=30)
            response.raise_for_status()
            job_info = response.json()
            
            job_id = job_info.get("job_id")
            if not job_id:
                raise ValueError(f"Job submission did not return a job_id. Response: {_snippet(job_info)}")
            logger.info(f"Job '{job_id}' submitted for policy '{policy_rule_uri}'.")
        except requests.exceptions.HTTPError as e:
            # A Response object is falsy for 4xx/5xx statuses, so test against None explicitly
            if e.response is not None and 500 <= e.response.status_code < 600:
                logger.error(f"Server error when submitting job. Status: {e.response.status_code}. Response: {e.response.text}")
            raise
        except requests.exceptions.RequestException as e:
            logger.error(f"Failed to submit job for policy '{policy_rule_uri}': {e}")
            raise

        # 2. Poll for job completion
        status_endpoint = f"{self.aios_url}/jobs/{job_id}"
        start_time = time.monotonic()
        
        while True:
            if time.monotonic() - start_time > self.job_timeout:
                raise TimeoutError(f"Job '{job_id}' timed out after {self.job_timeout} seconds.")

            try:
                response = requests.get(status_endpoint, timeout=10)
                if response.status_code == 200:
                    resp_json = response.json()
                    
                    # Check for success at the top level
                    if resp_json.get("success"):
                        data = resp_json.get("data", {})
                        status = data.get("job_status")
                        
                        if status == "completed":
                            logger.info(f"Job '{job_id}' completed successfully. Retrieving result from final poll.")
                            result_data = data.get("job_output_data")
                            if result_data is None:
                                logger.warning(f"Job result for '{job_id}' did not contain 'job_output_data'. Full response: {_snippet(resp_json)}")
                                return resp_json # Fallback to the full body
                            
                            logger.info(f"Result for job '{job_id}': {_snippet(result_data)}")
                            return result_data
                        elif status == "failed":
                            # The detailed result might be in 'job_output_data'
                            details = data.get("job_output_data", "No details provided.")
                            raise RuntimeError(f"Job '{job_id}' failed. Details: {details}")
                        elif status in ["queued", "running"]:
                            logger.debug(f"Job '{job_id}' is '{status}'. Polling again.")
                            time.sleep(self.job_poll_interval)
                        else:
                            logger.warning(f"Job '{job_id}' has unknown status: '{status}'. Retrying...")
                            time.sleep(self.job_poll_interval)
                    else:
                        # Handle cases where 'success' is false or missing
                        error_message = resp_json.get("message", "Unknown error during polling.")
                        logger.warning(f"Polling for job '{job_id}' was not successful: {error_message}. Retrying...")
                        time.sleep(self.job_poll_interval)
                else:
                    logger.warning(f"Polling for job '{job_id}' returned status {response.status_code}. Retrying...")
                    time.sleep(self.job_poll_interval)
                            
            except requests.exceptions.RequestException as e:
                logger.warning(f"Error polling job status for '{job_id}': {e}. Retrying...")
                time.sleep(self.job_poll_interval)
                
def eval(data):
    # 'data' is the raw output from the LLM block
    llm_response_text = data.get('response', '')
    
    # Attempt to parse the response directly
    try:
        parsed_json = json.loads(llm_response_text)
    except json.JSONDecodeError:
        # If direct parsing fails, use our robust extraction function
        parsed_json = extract_json_from_response(llm_response_text)

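    # Valid JSON that is not an object (a list, string, or number) cannot carry 'tool_calls'
    if not isinstance(parsed_json, dict):
        return {"error": "LLM response is not a JSON object.", "response_data": parsed_json}

    if 'error' in parsed_json:
        return parsed_json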

    # The core logic: check for the 'tool_calls' key
    if 'tool_calls' not in parsed_json:
        return {"error": "'tool_calls' key not found in the LLM response.", "response_data": parsed_json}

    # In a real system, you would execute the tool calls here by calling submit_and_monitor_job
    return {"executed_tool_calls": parsed_json['tool_calls']}
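
The extraction helper above handles JSON wrapped in a markdown code fence. When the model surrounds the JSON with plain prose instead, one further fallback is to scan for the first position where a JSON object decodes cleanly. The sketch below uses `json.JSONDecoder.raw_decode` from the standard library. The function name `find_first_json_object` and the `tool` field in the sample are assumptions for illustration; they are not taken from the policy code.

```python
import json
from typing import Any, Optional


def find_first_json_object(text: str) -> Optional[Any]:
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            parsed, _end = decoder.raw_decode(text, start)
            return parsed
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


reply = 'Here is the plan: {"tool_calls": [{"tool": "send-email"}]} Let me know if it works.'
plan = find_first_json_object(reply)
print(plan)

no_json = find_first_json_object("Sorry, I cannot help with that.")
print(no_json)
```

This fallback returns the first object that parses, so it fits best when the model is expected to emit exactly one JSON object per response.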

### Generic Tool Policies: The "Fire and Die" Policies

It's important to note that the `postprocessing_policy` itself doesn't contain the logic for every possible action. Instead, it acts as a dispatcher. When the LLM requests a `background-check` or a `send-email`, the postprocessing policy calls *other* generic, stateless policies to do the actual work.

These tool policies, such as `background_check_policy` and `send_email_policy`, are loaded, executed with the inputs provided by the `postprocessing_policy`, and then terminated. This keeps the entire system modular: new tools can be added without changing the core pipeline, and each tool is a self-contained, stateless unit.
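
The dispatcher role described here can be sketched as a lookup from tool name to policy rule URI. This is a minimal illustration under stated assumptions: the `tool` and `params` fields of a tool call, the version suffixes in `TOOL_POLICIES`, and the `fake_submit` stand-in are hypothetical. In the real policy, the `submit` argument would be a function with the same signature as `submit_and_monitor_job(job_name, policy_rule_uri, params)`.

```python
from typing import Any, Callable, Dict, List

# Hypothetical mapping from tool names to policy rule URIs.
TOOL_POLICIES: Dict[str, str] = {
    "background-check": "background_check_policy:1.0.0-stable",
    "send-email": "send_email_policy:1.0.0-stable",
}

SubmitFn = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def dispatch_tool_calls(tool_calls: List[Dict[str, Any]], submit: SubmitFn) -> List[Dict[str, Any]]:
    """Run each requested tool through its policy, in the order given."""
    results: List[Dict[str, Any]] = []
    for index, call in enumerate(tool_calls):
        tool = call.get("tool")
        policy_uri = TOOL_POLICIES.get(tool)
        if policy_uri is None:
            # Unknown tools are reported instead of aborting the whole plan.
            results.append({"tool": tool, "error": "unknown tool"})
            continue
        output = submit(f"{tool}-{index}", policy_uri, call.get("params", {}))
        results.append({"tool": tool, "output": output})
    return results


def fake_submit(job_name: str, policy_rule_uri: str, params: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a real job submission; it only echoes its arguments.
    return {"job_name": job_name, "policy": policy_rule_uri, "params": params}


results = dispatch_tool_calls(
    [
        {"tool": "background-check", "params": {"candidates": ["alice.pdf"]}},
        {"tool": "unknown-tool"},
    ],
    fake_submit,
)
print(results)
```

Adding a new tool then only means registering one more entry in the mapping and deploying the corresponding policy.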

### Making the LLM Context-Aware

We modify the `on_data` method in the LLM block's code to explicitly format the supplemental data and prepend it to the user's prompt. This ensures the LLM has the necessary context to make an informed decision.

**File:** See `main.py` in [02_Part2_onboard_custom_llama_cpp](https://github.com/OpenCyberspace/AIOS_AI_Blueprints/blob/main/video_tutorial_series/02_Part2_onboard_custom_llama_cpp/02-Model-Integration-Setup.ipynb)

In [None]:
# This is a simplified representation of the key logic in main_more_context.py

class LlamaCppBlock:
    # ... (other class methods)

    def on_data(self, input_data):
        message = input_data.get("message", "")
        supplemental_context = ""

        # Check for our specific supplemental data from the preprocessing policy
        if "supplemental_data" in input_data and "candidates" in input_data["supplemental_data"]:
            candidates = input_data["supplemental_data"]["candidates"]
            
            # **THE CRITICAL FIX**: Format the resume data into a clear context string
            context_parts = ["\n\n--- START OF SUPPLEMENTAL DATA ---"]
            for i, candidate in enumerate(candidates):
                context_parts.append(f"\n--- Resume {i+1}: {candidate.get('file_name', 'Unknown')} ---")
                context_parts.append(candidate.get('content', 'No content'))
            context_parts.append("\n--- END OF SUPPLEMENTAL DATA ---")
            supplemental_context = "\n".join(context_parts)
            
            # Clean up the input data so it's not processed further
            del input_data["supplemental_data"]

        # Prepend the context to the user's original message
        final_message = f"{supplemental_context}\n\nUSER REQUEST:\n{message}"

        # ... (rest of the logic to send 'final_message' to the LLM)
        # self.llm.create_chat_completion(...)
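
To inspect the prompt the model actually receives, the formatting logic from `on_data` can be pulled into a standalone function and run without a model. This is a sketch that mirrors the string layout above; the function name `build_prompt` and the sample candidate are assumptions for illustration.

```python
from typing import Dict, List


def build_prompt(message: str, candidates: List[Dict[str, str]]) -> str:
    """Format resume texts as a context block and put it before the user message."""
    parts = ["\n\n--- START OF SUPPLEMENTAL DATA ---"]
    for i, candidate in enumerate(candidates):
        parts.append(f"\n--- Resume {i + 1}: {candidate.get('file_name', 'Unknown')} ---")
        parts.append(candidate.get("content", "No content"))
    parts.append("\n--- END OF SUPPLEMENTAL DATA ---")
    context = "\n".join(parts)
    return f"{context}\n\nUSER REQUEST:\n{message}"


prompt = build_prompt(
    "Screen for a Senior Computer Vision Engineer.",
    [{"file_name": "alice.pdf", "content": "10 years of OpenCV experience."}],
)
print(prompt)
```

Printing the result makes it easy to confirm that every resume sits between the start and end markers and that the user request comes last.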

In [1]:
!cat allocation-llama4scout_recruiter_vdag.json

{
    "head": {
        "templateUri": "Parser/V1",
        "parameters": {}
    },
    "body": {
        "spec": {
            "values": {
                "mode": "allocate",
                "blockId": "llama4-scout-17b-block",
                "blockComponentURI": "model.llama4-scout-17b:1.0.0-stable",
                "minInstances": 1,
                "maxInstances": 3,
                "blockInitData": {
                    "model_name": "Llama-4-Scout-17B-16E-Instruct-UD-Q8_K_XL/Llama-4-Scout-17B-16E-Instruct-UD-Q8_K_XL-00001-of-00003.gguf",
                    "system_message": "You are an expert recruitment assistant. Your task is to analyze the provided resume context and create a JSON-based execution plan for a multi-stage recruitment pipeline. Your output MUST be a valid JSON object with a single key, \"tool_calls\". This key must contain a list of jobs to be executed in sequence.\n\nIMPORTANT: Group ALL candidates into a SINGLE background check job. Do not create separate back

## 4. vDAG Creation Spec

The entire vDAG is defined by a single allocation JSON file. This file acts as the blueprint, telling the system which model to load, what initial instructions to give it (the system prompt), and which policies to attach for pre-processing and post-processing. This approach makes the pipeline modular, as you can change the model or policies simply by modifying this file.

Below is a simplified version of the allocation spec, highlighting the key components.

In [5]:
recruitment_vdag_spec = {
    "parser_version": "Parser/V1",
    "body": {
        "spec": {
            "values": {
                "vdagName": "recruitment-automation-vdag",
                "vdagVersion": {"version": "1.0.0", "release-tag": "stable"},
                "discoveryTags": ["recruitment", "single-node-vdag"],
                "controller": {},
                "nodes": [
                    {
                        "spec": {
                            "values": {
                                "nodeLabel": "recruitment-node",
                                "nodeType": "block",
                                "manualBlockId": "llama4-scout-17b-block",
                                "preprocessingPolicyRule": {"policyRuleURI": "recruitment_preprocessing:1.0.0-stable"},
                                "postprocessingPolicyRule": {"policyRuleURI": "recruitment_postprocessing:1.0.0-stable"},
                                "modelParameters": {}
                            },
                            "IOMap": [
                                {
                                    "inputs": [{"name": "input_0", "reference": "input_0"}],
                                    "outputs": [{"name": "output_0", "reference": "output_0"}]
                                }
                            ]
                        }
                    }
                ],
                "graph": {
                    "input": [{
                        "nodeLabel": "recruitment-node",
                        "inputNames": ["input_0"]
                    }],
                    "connections": [],
                    "output": [{
                        "nodeLabel": "recruitment-node",
                        "outputNames": ["output_0"]
                    }]
                }
            }
        }
    }
}
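
Before submitting a spec, it can help to check that the `graph` section only refers to node labels that are actually defined under `nodes`. The sketch below is a local sanity check written for this tutorial, not an AIOS validator: the function name `check_vdag_graph` is an assumption, and it only inspects the `input` and `output` entries because the format of `connections` is not shown here.

```python
from typing import Any, Dict, List


def check_vdag_graph(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the graph section (empty if none)."""
    values = spec["body"]["spec"]["values"]
    labels = {node["spec"]["values"]["nodeLabel"] for node in values["nodes"]}
    problems: List[str] = []
    for side in ("input", "output"):
        for entry in values["graph"].get(side, []):
            if entry["nodeLabel"] not in labels:
                problems.append(f"graph.{side} references unknown node '{entry['nodeLabel']}'")
    return problems


# Trimmed-down spec with the same nesting as the one above.
good_spec = {
    "body": {"spec": {"values": {
        "nodes": [{"spec": {"values": {"nodeLabel": "recruitment-node"}}}],
        "graph": {
            "input": [{"nodeLabel": "recruitment-node", "inputNames": ["input_0"]}],
            "connections": [],
            "output": [{"nodeLabel": "recruitment-node", "outputNames": ["output_0"]}],
        },
    }}}
}

# Same spec with a misspelled label in the graph output.
bad_spec = {
    "body": {"spec": {"values": {
        "nodes": good_spec["body"]["spec"]["values"]["nodes"],
        "graph": {
            "input": [{"nodeLabel": "recruitment-node", "inputNames": ["input_0"]}],
            "connections": [],
            "output": [{"nodeLabel": "recruitment-nod", "outputNames": ["output_0"]}],
        },
    }}}
}

good_problems = check_vdag_graph(good_spec)
bad_problems = check_vdag_graph(bad_spec)
print(good_problems)
print(bad_problems)
```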

## 5. Running this workflow as vDAG

### Inference Flow

When an inference request is made, the vDAG Controller orchestrates the following steps:

1.  **Request Reception**: The vDAG Controller receives a request containing a job description and a zip file of resumes.
2.  **Block Selection**: The controller identifies that the request is for the `llama4-scout-17b-block`.
3.  **vDAG Block Pre-processing**: The attached `recruitment_preprocessing` policy is executed. It opens the zip file, extracts the text from all PDF resumes, and formats it as supplemental data.
4.  **Author's Block Pre-processing**: If the block author attached a block-level pre-processing policy, it is executed at this point. In this example there is none.
5.  **LLM Invocation**: The extracted resume text is passed along with the original job description to the `llama4-scout` model. The model analyzes the information based on its system prompt.
6.  **Author's Block Post-processing**: If the block author attached a block-level post-processing policy, it is executed at this point. In this example there is none.
7.  **vDAG Block Post-processing**: The model's raw output (a JSON string) is sent to the `recruitment_postprocessing` policy. This policy parses the JSON, validates it, and extracts the `tool_calls`.
8.  **Response**: The final, structured output is returned to the user.
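
The ordering of the processing stages in steps 3 to 7 can be sketched as a list of functions applied in sequence. Everything in this block is a stand-in written for illustration: the stage functions are stubs, the author-level policies are modeled as an identity function because they are empty in this tutorial, and `run_stages` is a hypothetical helper rather than the vDAG Controller's implementation.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def identity(data: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder for an author policy that is not configured."""
    return data


def vdag_preprocess(data: Dict[str, Any]) -> Dict[str, Any]:
    # Stub: the real policy extracts text from the zipped PDF resumes.
    return {**data, "supplemental_data": {"candidates": []}}


def llm_block(data: Dict[str, Any]) -> Dict[str, Any]:
    # Stub: the real block calls the model and returns its raw text output.
    return {"response": '{"tool_calls": []}'}


def vdag_postprocess(data: Dict[str, Any]) -> Dict[str, Any]:
    # Stub: parse the model output and report the requested tool calls.
    return {"executed_tool_calls": json.loads(data["response"])["tool_calls"]}


STAGES: List[Tuple[str, Stage]] = [
    ("vdag_preprocessing", vdag_preprocess),
    ("author_preprocessing", identity),
    ("llm", llm_block),
    ("author_postprocessing", identity),
    ("vdag_postprocessing", vdag_postprocess),
]


def run_stages(request: Dict[str, Any], stages: List[Tuple[str, Stage]]) -> Tuple[Dict[str, Any], List[str]]:
    """Apply each stage to the previous stage's output and record the order."""
    trace: List[str] = []
    data = request
    for name, stage in stages:
        data = stage(data)
        trace.append(name)
    return data, trace


result, trace = run_stages({"message": "Screen these resumes."}, STAGES)
print(trace)
print(result)
```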

### Pre-processing Policy: `recruitment_preprocessing:1.0.0-stable`
This policy is responsible for handling the input files.

### Post-processing Policy: `recruitment_postprocessing:1.0.0-stable`
This policy parses the LLM's output and performs the requested tool calls.

### a. Create the vDAG with the AIOS createvDAG Endpoint
This command registers the vDAG definition with the AIOS system.

In [6]:
import requests
# NOTE: Adjust the IP address to match your cluster's endpoint.
create_vdag_url = "http://MANAGEMENTMASTER:30501/api/createvDAG" 
response = requests.post(create_vdag_url, json=recruitment_vdag_spec)
print(f"vDAG Creation Response Status: {response.status_code}")
print('Response Body:', response.json())

vDAG Creation Response Status: 200
Response Body: {'result': {'task_id': '80e95f71-a6e2-4837-aabc-15a97a528f77', 'vdagURI': 'recruitment-automation-vdag:1.0.0-stable'}, 'success': True, 'task_id': ''}
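
An HTTP 200 alone does not confirm that the vDAG was registered, so it is worth checking the `success` flag and reading the `vdagURI` from the body. The sketch below works against the response shape printed above; the helper name `vdag_uri_from_response` is an assumption, and other AIOS versions may return a different layout.

```python
from typing import Any, Dict


def vdag_uri_from_response(body: Dict[str, Any]) -> str:
    """Return the registered vDAG URI, or raise if creation did not succeed."""
    if not body.get("success"):
        raise RuntimeError(f"vDAG creation failed: {body}")
    uri = (body.get("result") or {}).get("vdagURI")
    if not uri:
        raise ValueError(f"Response did not include a vdagURI: {body}")
    return uri


# Response body copied from the output above.
sample_body = {
    "result": {
        "task_id": "80e95f71-a6e2-4837-aabc-15a97a528f77",
        "vdagURI": "recruitment-automation-vdag:1.0.0-stable",
    },
    "success": True,
    "task_id": "",
}

vdag_uri = vdag_uri_from_response(sample_body)
print(vdag_uri)
```

The returned URI is the same value used as `vdag_uri` when creating the controller in step c.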


### b. Verify the vDAG is Registered
Use this command to confirm that the vDAG exists in the registry.

In [7]:
# NOTE: Adjust the IP address and vDAG name if you changed it.
!curl -X GET http://MANAGEMENTMASTER:30103/vdag/recruitment-automation-vdag:1.0.0-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1208  100  1208    0     0   217k      0 --:--:-- --:--:-- --:--:--  235k
{
   "data" : {
      "assignment_info" : {
         "recruitment-node" : "llama4-scout-17b-block"
      },
      "compiled_graph_data" : {
         "head" : "llama4-scout-17b-block",
         "rev_mapping" : {
            "llama4-scout-17b-block" : "recruitment-node"
         },
         "t2_graph" : {
            "llama4-scout-17b-block" : []
         },
         "t3_graph" : {
            "llama4-scout-17b-block" : {
               "outputs" : []
            }
         },
         "tail" : [
            "llama4-scout-17b-block"
         ]
      },
      "controller" : {
         "initParameters" : {},
         "initSettings" : {},
         "inputSources" : [],
         "policies" : []
      },
      "discoveryTags" : [
         "recruitment",
       

### c. Create a vDAG Controller
This deploys the vDAG, creating a running instance with an API endpoint.

In [8]:
%%bash
# NOTE: Adjust the IP address, cluster name, and vdag_controller_id if needed.
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "create_controller",
    "payload": {
      "vdag_controller_id": "recruitment-automation-vdag-controller", 
      "vdag_uri": "recruitment-automation-vdag:1.0.0-stable",
      "config": {
        "policy_execution_mode": "local",
        "replicas": 1
      }
    }
  }'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   344  100    58  100   286    567   2799 --:--:-- --:--:-- --:--:--  3372


{"data":"Controller created successfully","success":true}


### d. Get the Controller Details
Check the status of your deployed vDAG controller. The service URLs in the `config` section of the response include `api_url`, which is the base address of the inference endpoint used in the next step.

In [9]:
# NOTE: Adjust the IP address and controller ID if needed.
!curl -X GET http://MANAGEMENTMASTER:30103/vdag-controller/recruitment-automation-vdag-controller | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   440  100   440    0     0  88852      0 --:--:-- --:--:-- --:--:--  107k
{
   "data" : {
      "cluster_id" : "gcp-cluster-2",
      "config" : {
         "api_url" : "http://CLUSTER1MASTER:31871",
         "policy_execution_mode" : "local",
         "replicas" : 1,
         "rest_url" : "http://CLUSTER1MASTER:31533",
         "rpc_url" : "CLUSTER1MASTER:32511"
      },
      "metadata" : {},
      "public_url" : "CLUSTER1MASTER:32511",
      "search_tags" : [
         "recruitment",
         "single-node-vdag"
      ],
      "vdag_controller_id" : "recruitment-automation-vdag-controller",
      "vdag_uri" : "recruitment-automation-vdag:1.0.0-stable"
   },
   "success" : true
}
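
Rather than copying the address by hand, the inference URL can be derived from the controller details. This sketch assumes that the `/v1/infer` path used in the inference step is served from `config.api_url`, which matches the host and port in this tutorial's output; the helper name `inference_url_from_details` is hypothetical.

```python
from typing import Any, Dict


def inference_url_from_details(details: Dict[str, Any], path: str = "/v1/infer") -> str:
    """Join the controller's api_url with the inference path."""
    api_url = details["data"]["config"]["api_url"]
    return api_url.rstrip("/") + path


# Relevant part of the controller details shown above.
controller_details = {
    "data": {
        "cluster_id": "gcp-cluster-2",
        "config": {
            "api_url": "http://CLUSTER1MASTER:31871",
            "policy_execution_mode": "local",
            "replicas": 1,
        },
        "vdag_controller_id": "recruitment-automation-vdag-controller",
    },
    "success": True,
}

inference_url = inference_url_from_details(controller_details)
print(inference_url)
```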


## 6. Inference

To run the pipeline, we send an inference request to the vDAG controller's API endpoint. The request carries the target block ID (in the `model` field), the `message` (job description), and the zipped resumes encoded in Base64.

The following Python code automates this process: it reads the zip file, encodes it, constructs the JSON payload, and sends the request.

In [10]:
import base64
import json
import os
import requests
from datetime import datetime

# --- Configuration ---
zip_file_path =  'resume_compressed.zip'
# NOTE: This URL should match the vDAG controller's service URL from the step above.
INFERENCE_URL = "http://CLUSTER1MASTER:31871/v1/infer" 
message = """We're hiring for a Senior Computer Vision Engineer. Requirements include:
    - 6+ years of hands-on experience in computer vision and deep learning.
    - Production-level experience with frameworks like PyTorch or TensorFlow, and libraries like OpenCV.
    - Master's degree or Ph.D. in Computer Science or a related field.

Please analyze the provided resumes and create a recruitment plan that:
1. Initiates background checks for qualified candidates
2. Sends appropriate follow-up communications"""
# --- End Configuration ---

print(f"🚀 Starting recruitment pipeline at {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("=" * 70)

try:
    with open(zip_file_path, "rb") as f:
        zip_binary_data = f.read()
    base64_encoded_data = base64.b64encode(zip_binary_data).decode('utf-8')
    file_size = len(zip_binary_data)
    print(f"✅ Successfully read {zip_file_path} ({file_size} bytes)")
except FileNotFoundError:
    print(f"❌ ERROR: The file was not found at '{zip_file_path}'.")
    raise

# The payload now targets the vDAG controller, not a specific block_id.
payload = {
    "model": "llama4-scout-17b-block",
    "session_id": "test_135",
    "seq_no": 1,
    "data": {
        "mode": "chat",
        "message": message
    },
    "files": [
        {
            "metadata": {"filename": zip_file_path, "size": file_size},
            "file_data": base64_encoded_data
        }
    ],
    "graph": {},
    "selection_query": {}
}


output = None
try:
    # Long timeout: the LLM call and the follow-up tool jobs can take several minutes
    response = requests.post(INFERENCE_URL, json=payload, timeout=600)
    response.raise_for_status()
    output = response.json()
    print("Inference request successful. The raw JSON output is stored in 'output'.")
except json.JSONDecodeError:
    # Checked first: in recent versions of requests, the error raised by
    # response.json() is also a RequestException and would be caught below
    print("Failed to parse JSON from response. Raw text:")
    print(response.text)
except requests.exceptions.RequestException as e:
    print(f"Failed to get response from inference service: {e}")

🚀 Starting recruitment pipeline at 2025-09-02 08:33:17
✅ Successfully read resume_compressed.zip (250601 bytes)
Inference request successful. The raw JSON output is stored in 'output'.


In [24]:
print(output)
# print(payload)

{'data': {'background_check_processed': True, 'message': 'Recruitment pipeline processing finished.', 'processed_tool_calls': 2}, 'seq_no': 1, 'session_id': 'test_134', 'ts': 1756735787.09144}
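
For a quick check after a run, the response can be condensed into a single line. The sketch below relies only on the fields visible in the output above and also covers the case where `output` is still `None` because the request failed; the helper name `summarize_output` is an assumption for illustration.

```python
from typing import Any, Dict, Optional


def summarize_output(output: Optional[Dict[str, Any]]) -> str:
    """Condense the inference response into one status line."""
    if output is None:
        return "No response was received; check the request errors printed earlier."
    data = output.get("data") or {}
    processed = data.get("processed_tool_calls", 0)
    checked = "yes" if data.get("background_check_processed") else "no"
    return f"session={output.get('session_id')} tool_calls={processed} background_check={checked}"


# Response copied from the output above.
sample_output = {
    "data": {
        "background_check_processed": True,
        "message": "Recruitment pipeline processing finished.",
        "processed_tool_calls": 2,
    },
    "seq_no": 1,
    "session_id": "test_134",
    "ts": 1756735787.09144,
}

summary = summarize_output(sample_output)
print(summary)
```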
