# 🔧 vDAG Policy Demonstration: Quota Management & Quality Checker Record

This notebook demonstrates how to design, implement, and register two key policies in the **vDAG (Virtual Directed Acyclic Graph)** runtime system:

---

## 1️⃣ Quota Checker Policy

The **Quota Checker Policy** helps enforce limits on how often a given `session_id` can use the inference APIs.

This policy:
- Tracks usage per session using a Redis backend
- Enforces per-session or global usage limits
- Supports whitelisting of session IDs
- Can be dynamically managed via its `management()` interface

---

## 2️⃣ Quality Checker Policy

The **Quality Checker Policy** is a passive policy that audits the performance of the vDAG by logging request–response pairs periodically.

This policy:
- Collects structured input/output samples
- Saves them to a remote Redis database for manual or automated review
- Runs asynchronously to avoid impacting runtime performance

---

By the end of this notebook, you will understand how to:
- Define and customize quota enforcement policies
- Set up request auditing for quality assurance
- Register these policies into the vDAG policy registry


## 🚦 Quota Checker Policy for vDAG Inference

The **Quota Checker Policy** is designed to control and limit the usage of inference APIs in a vDAG system.

This policy is invoked **every time a task is submitted**, and it determines whether the request should be allowed based on quota constraints. It supports:

- Per-session quota limits
- Global default quota limits
- Whitelisting of session IDs (which bypass all checks)
- Runtime quota updates and resets
- Redis-based storage for scalable session tracking

The quota checker ensures fair usage, prevents abuse, and enables flexible access control for session-based inference tasks in a distributed AI pipeline.


### 🧠 Sample Quota Policy: Breakdown

Below is a breakdown of the logic implemented in the sample quota checker policy, which is written as an `AIOSv1PolicyRule` class.

### ✅ Features Implemented:
1. **Default and Per-Session Limits**:
   - The policy accepts a `default_limit` (e.g., 1000) used when a specific limit is not set for a session.
   - A dictionary `session_limits` can override limits for specific `session_id`s.

2. **Whitelist**:
   - The `whitelist` parameter contains session IDs that are always allowed, bypassing quota checks.

3. **Quota Check Logic**:
   - When a task is evaluated, the new quota (i.e., current + 1) is compared against the configured limit.
   - If the quota exceeds the allowed limit, the request is denied (`"allowed": false`).

4. **Management Interface**:
   - The `management()` method allows dynamic updates such as:
     - Resetting quota for a session
     - Setting quotas manually
     - Updating limits
     - Modifying the whitelist
     - Clearing all quotas in Redis

5. **Backend Integration**:
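   - Quotas are persisted in a Redis-backed key-value store so they can be tracked across sessions. The sample class does not open the connection itself; it receives the store as `quota_table` in `input_data` (for `eval()`) and in `data` (for `management()`).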

This modular policy enables scalable and dynamic quota control over distributed inference jobs in the vDAG runtime.
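
The decision rule from items 1 to 3 can be sketched as a standalone function. This is a minimal illustration rather than the policy itself: the function name `is_allowed` and the sample session IDs are assumptions made for the example.

```python
def is_allowed(session_id, proposed_quota, default_limit=1000,
               session_limits=None, whitelist=None):
    """Return True if a request may proceed under the quota rules described above."""
    session_limits = session_limits or {}
    whitelist = whitelist or set()

    # Whitelisted sessions bypass every check.
    if session_id in whitelist:
        return True

    # A per-session limit overrides the default.
    limit = session_limits.get(session_id, default_limit)

    # proposed_quota is the usage count after this request (current + 1).
    return proposed_quota <= limit


# Hypothetical sessions, used only for illustration.
limits = {"batch-job": 2}
print(is_allowed("batch-job", 2, session_limits=limits))  # True
print(is_allowed("batch-job", 3, session_limits=limits))  # False
print(is_allowed("vip", 10_000, whitelist={"vip"}))       # True
```

Because the comparison is made on the proposed value, a limit of `N` admits exactly `N` requests for a session.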

Here is the sample policy code:

In [None]:


class AIOSv1PolicyRule:
    def __init__(self, rule_id, settings, parameters):
        self.default_limit = parameters.get("default_limit", 1000)
        self.session_limits = parameters.get("session_limits", {})
        self.whitelist = set(parameters.get("whitelist", []))

    def eval(self, parameters, input_data, context):
        quota_table = input_data["quota_table"]
        session_id = input_data["session_id"]
        quota = input_data["quota"]  # proposed quota (current + 1)

        # Whitelisted session_ids are always allowed
        if session_id in self.whitelist:
            return {"allowed": True}

        # Determine limit (session-specific or default)
        limit = self.session_limits.get(session_id, self.default_limit)

        # Reject if quota exceeds limit
        if quota > limit:
            return {"allowed": False}

        return {"allowed": True}

    def management(self, action: str, data: dict) -> dict:

        try:
            action = action.lower()
            qt = data.get("quota_table")
            sid = data.get("session_id")

            if action == "get_quota":
                return {"status": "ok", "value": qt.get(sid)}

            elif action == "reset_quota":
                qt.reset(sid)
                return {"status": "ok", "message": f"Quota reset for {sid}"}

            elif action == "set_quota":
                value = int(data.get("value", 0))
                qt.remove(sid)
                for _ in range(value):
                    qt.increment(sid)
                return {"status": "ok", "message": f"Quota set to {value} for {sid}"}

            elif action == "update_limit":
                limit = int(data.get("limit"))
                self.session_limits[sid] = limit
                return {"status": "ok", "message": f"Limit updated for {sid} to {limit}"}

            elif action == "update_default_limit":
                self.default_limit = int(data.get("limit"))
                return {"status": "ok", "message": f"Default limit updated to {self.default_limit}"}

            elif action == "add_whitelist":
                self.whitelist.add(sid)
                return {"status": "ok", "message": f"{sid} added to whitelist"}

            elif action == "remove_whitelist":
                self.whitelist.discard(sid)
                return {"status": "ok", "message": f"{sid} removed from whitelist"}

            elif action == "clear_all":
                qt.clean()
                return {"status": "ok", "message": "All quotas cleared"}

            else:
                return {"status": "error", "message": f"Unknown action '{action}'"}

        except Exception as e:
            return {"status": "error", "message": str(e)}
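
The policy calls `get`, `reset`, `remove`, `increment`, and `clean` on the `quota_table` it is given. For local testing without Redis, a dictionary-backed stand-in can be useful. The sketch below is an assumption about what those five methods do, inferred only from how the sample code calls them; the real table's semantics (for example, whether `reset` zeroes or deletes the counter) may differ.

```python
class InMemoryQuotaTable:
    """Hypothetical stand-in for the Redis-backed quota table (testing only)."""

    def __init__(self):
        self._counts = {}

    def get(self, session_id):
        # Current usage count; 0 for sessions never seen (assumed behavior).
        return self._counts.get(session_id, 0)

    def increment(self, session_id):
        self._counts[session_id] = self.get(session_id) + 1
        return self._counts[session_id]

    def reset(self, session_id):
        # Assumed to zero the counter while keeping the entry.
        self._counts[session_id] = 0

    def remove(self, session_id):
        self._counts.pop(session_id, None)

    def clean(self):
        self._counts.clear()


# Mirror what the "set_quota" action does: remove, then increment N times.
qt = InMemoryQuotaTable()
qt.increment("session1")
qt.remove("session1")
for _ in range(3):
    qt.increment("session1")
print(qt.get("session1"))  # 3
```

An instance of this class can be passed as `data["quota_table"]` when exercising `management()` in a unit test.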


The code is already onboarded as a policy with URI: `quota-checker:2.0-stable`

**Process For Onboarding the Policy**:

Create a file named `function.py` and place it in a directory named `code`. You can also place a `requirements.txt` in the same directory:

    code/
    ├── function.py
    └── requirements.txt

Zip the `code` directory with `zip -r mypolicy.zip code`.

Upload the zip file to AIOS Policy storage using the command: `bash upload.sh`
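
The zip step can also be scripted from Python, which is convenient inside a notebook. This sketch uses the standard library's `shutil.make_archive` and assumes the `code` directory sits under the given project directory; the helper names `package_policy` and `list_archive` are illustrative, and the upload itself is still left to `upload.sh`.

```python
import shutil
import zipfile
from pathlib import Path


def package_policy(project_dir=".", archive_name="mypolicy"):
    """Zip the `code` directory under project_dir, like `zip -r mypolicy.zip code`."""
    project_dir = Path(project_dir)
    if not (project_dir / "code" / "function.py").is_file():
        raise FileNotFoundError(f"expected code/function.py under {project_dir}")

    # base_dir="code" keeps the leading "code/" prefix inside the archive.
    archive = shutil.make_archive(
        str(project_dir / archive_name), "zip",
        root_dir=str(project_dir), base_dir="code",
    )
    return Path(archive)


def list_archive(path):
    """Return the entry names in the archive, for a quick sanity check."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()
```

Calling `list_archive(package_policy())` before uploading is a cheap way to confirm that `code/function.py` made it into the archive.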

## 🧪 Quality Checker Policy – Introduction

The **Quality Checker Policy** is designed to support continuous monitoring and auditing of inference tasks in a vDAG (Virtual Directed Acyclic Graph) pipeline.

Unlike enforcement policies (like the Quota Checker), this policy is **non-blocking** and runs in the **background**. Its primary purpose is to:

- Capture and log input–output pairs from inference tasks
- Save structured request and response data to a database (e.g., Redis)
- Enable post-execution review for quality assurance and debugging
- Optionally integrate with external analytics or auditing systems

### 🧠 Sample Quality Checker Policy: Breakdown

The `QualityCheckerPolicy` is a passive monitoring policy designed to collect and store **input-output samples** from vDAG inference tasks. It is not used for allowing or rejecting tasks, but instead for **auditing** and **quality assurance** purposes.

---

### ✅ Features Implemented:

1. **Request–Response Sampling**:
   - The policy extracts the `.data` field from the `vDAGInferencePacket` objects representing the request and response.
   - These `.data` fields are JSON strings and are parsed to extract structured task information.

2. **Audit Record Construction**:
   - A full audit record is created including:
     - `timestamp`
     - `session_id`
     - `seq_no`
     - Request and Response payloads (as parsed JSON)
     - Optional `vdag_id` metadata (not populated by the sample code)

3. **Redis-Backed Storage**:
   - The policy writes audit records to a remote Redis instance, using a key format like:
     ```
     audit:<session_id>:<seq_no>
     ```
   - This allows direct lookup of a sample by session and sequence number; the `timestamp` field inside each record supports time-based filtering.

4. **Background Execution**:
   - The policy is executed **asynchronously** in the background and does **not block** or affect the primary inference path.

5. **Management Interface**:
   - The `management()` method allows external systems or users to:
     - Fetch a specific audit record by session and sequence number
     - Retrieve the latest record for a given session
     - List all audit keys
     - Delete individual records

---

This policy is particularly useful for:
- Validating model outputs
- Running post-hoc quality assurance
- Diagnosing failure cases and edge behavior
- Integrating with external QA pipelines or dashboards

Together with the Quota Checker, it helps monitor and govern the overall **health and fairness** of a distributed inference workflow.
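
The record layout and key format described in items 2 and 3 can be sketched without a Redis connection. The `Packet` dataclass below is a hypothetical stand-in for `vDAGInferencePacket`, modeling only the three attributes the sample policy reads (`session_id`, `seq_no`, `data`); `build_audit_record` is likewise an illustrative name.

```python
import json
import time
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stand-in for vDAGInferencePacket (only the fields used here)."""
    session_id: str
    seq_no: int
    data: str  # JSON-encoded payload


def build_audit_record(request, response, now=None):
    """Return (redis_key, json_value) for one request/response pair."""
    record = {
        "timestamp": time.time() if now is None else now,
        "session_id": request.session_id,
        "seq_no": request.seq_no,
        "request": json.loads(request.data),
        "response": json.loads(response.data),
    }
    key = f"audit:{record['session_id']}:{record['seq_no']}"
    return key, json.dumps(record)


req = Packet("session1", 5, json.dumps({"mode": "chat"}))
resp = Packet("session1", 5, json.dumps({"text": "example output"}))
key, value = build_audit_record(req, resp, now=0.0)
print(key)  # audit:session1:5
```

Since the key depends only on `session_id` and `seq_no`, writing a second record with the same pair overwrites the first.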

Here is the sample policy code:


In [None]:
import json
import time
import redis
from urllib.parse import urlparse


class AIOSv1PolicyRule:
    def __init__(self, rule_id, settings, parameters):
       
        # Parse Redis connection URL
        redis_url = parameters["db_url"]
        parsed = urlparse(redis_url)

        self.redis_client = redis.Redis(
            host=parsed.hostname,
            port=parsed.port or 6379,
            db=int(parsed.path.lstrip("/")) if parsed.path else 0,
            decode_responses=True  # Return str instead of bytes when reading values
        )

    def eval(self, parameters, input_data, context):
        try:
            request_packet = input_data["input_data"]["request"]
            response_packet = input_data["input_data"]["response"]

            # Extract JSON data fields
            request_json = json.loads(request_packet.data)
            response_json = json.loads(response_packet.data)

            # Create audit record
            record = {
                "timestamp": time.time(),
                "session_id": request_packet.session_id,
                "seq_no": request_packet.seq_no,
                "request": request_json,
                "response": response_json
            }

            # Generate Redis key
            key = f"audit:{record['session_id']}:{record['seq_no']}"

            # Save to Redis
            self.redis_client.set(key, json.dumps(record))

        except Exception as e:
            context["last_error"] = str(e)

        return {}

    def management(self, action: str, data: dict) -> dict:
        try:
            if action == "get":
                session_id = data.get("session_id")
                seq_no = data.get("seq_no")
                if not session_id or seq_no is None:
                    return {"status": "error", "message": "Missing session_id or seq_no"}

                key = f"audit:{session_id}:{seq_no}"
                val = self.redis_client.get(key)
                return {"status": "ok", "value": json.loads(val) if val else None}

            elif action == "get_latest":
                session_id = data.get("session_id")
                if not session_id:
                    return {"status": "error", "message": "Missing session_id"}

                pattern = f"audit:{session_id}:*"
                keys = self.redis_client.keys(pattern)

                if not keys:
                    return {"status": "ok", "value": None}

                # Parse seq_nos from keys
                def extract_seq(key):
                    try:
                        return int(key.split(":")[-1])
                    except ValueError:
                        return -1

                latest_key = max(keys, key=extract_seq)
                val = self.redis_client.get(latest_key)
                return {"status": "ok", "key": latest_key, "value": json.loads(val) if val else None}

            elif action == "list_keys":
                pattern = data.get("pattern", "audit:*")
                keys = self.redis_client.keys(pattern)
                return {"status": "ok", "keys": keys}

            elif action == "delete":
                session_id = data.get("session_id")
                seq_no = data.get("seq_no")
                if not session_id or seq_no is None:
                    return {"status": "error", "message": "Missing session_id or seq_no"}

                key = f"audit:{session_id}:{seq_no}"
                deleted = self.redis_client.delete(key)
                return {"status": "ok", "deleted": deleted}

            return {"status": "error", "message": f"Unsupported action '{action}'"}

        except Exception as e:
            return {"status": "error", "message": str(e)}
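
The `get_latest` branch parses the sequence number out of each key before taking the maximum. The short sketch below shows why: comparing the keys as plain strings would rank `audit:s:9` above `audit:s:10`. The key values are made up for the example.

```python
def extract_seq(key):
    """Return the trailing sequence number of an audit key, or -1 if it is not numeric."""
    try:
        return int(key.rsplit(":", 1)[-1])
    except ValueError:
        return -1


keys = ["audit:s:9", "audit:s:10", "audit:s:2"]  # illustrative keys

print(max(keys))                   # audit:s:9  (lexicographic order)
print(max(keys, key=extract_seq))  # audit:s:10 (numeric order)
```

One further consideration: the Redis `KEYS` command walks the entire keyspace, so on a large instance the incremental `SCAN` command (exposed in redis-py as `scan_iter`) is generally the gentler choice.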

## Demo

### 🛠️ Step 1: Create vDAG Controller with Quota & Quality Checker Policies

In this step, we create a vDAG controller named `policies-test-c` using the vDAG URI `llm-analyzer:0.0.3-stable`.  
We attach two policies:
- A **Quota Checker Policy** with a default limit of `1` and a whitelist that includes `session10`
- A **Quality Checker Policy** that writes audit records to a Redis instance

The controller will automatically enforce quotas and record inference audits.


In [None]:
%%bash
# Step 1: Create the vDAG Controller
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "create_controller",
    "payload": {
      "vdag_controller_id": "policies-test-c", 
      "vdag_uri": "llm-analyzer:0.0.3-stable",
      "config": {
        "policy_execution_mode": "local",
        "replicas": 1,
        "custom_data": {
            "quotaChecker": {
                "quotaCheckerPolicyRule": {
                    "policyRuleURI": "quota-checker:2.0-stable",
                    "parameters": {
                        "default_limit": 1,
                        "whitelist": ["session10"]
                    }
                }
            },
            "qualityChecker": {
              "qualityCheckerPolicyRule": {
                "policyRuleURI": "quality-checker:2.0-stable",
                "parameters": {
                  "db_url": "redis://POLICYSTORESERVER:6379/0"
                }
              },
              "framesInterval": 1
            }
        }
      },
      "search_tags": []
    }
  }'
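
The same request can be issued from Python. The sketch below only builds the request with the standard library; nothing is sent until `urllib.request.urlopen` is called on it. The helper name and its arguments are illustrative, and the host placeholders are kept from the curl command above.

```python
import json
import urllib.request


def build_create_controller_request(base_url, cluster_id, controller_id, vdag_uri,
                                    quota_params, quality_db_url):
    """Build (but do not send) the POST request that creates a vDAG controller."""
    payload = {
        "action": "create_controller",
        "payload": {
            "vdag_controller_id": controller_id,
            "vdag_uri": vdag_uri,
            "config": {
                "policy_execution_mode": "local",
                "replicas": 1,
                "custom_data": {
                    "quotaChecker": {
                        "quotaCheckerPolicyRule": {
                            "policyRuleURI": "quota-checker:2.0-stable",
                            "parameters": quota_params,
                        }
                    },
                    "qualityChecker": {
                        "qualityCheckerPolicyRule": {
                            "policyRuleURI": "quality-checker:2.0-stable",
                            "parameters": {"db_url": quality_db_url},
                        },
                        "framesInterval": 1,
                    },
                },
            },
            "search_tags": [],
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/vdag-controller/{cluster_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_create_controller_request(
    "http://MANAGEMENTMASTER:30600", "gcp-cluster-2",
    "policies-test-c", "llm-analyzer:0.0.3-stable",
    quota_params={"default_limit": 1, "whitelist": ["session10"]},
    quality_db_url="redis://POLICYSTORESERVER:6379/0",
)
# To send it: urllib.request.urlopen(req)
```

Keeping the quota parameters as a plain dictionary makes it easy to try different limits or whitelists without editing a JSON string by hand.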


### 🔍 Step 2: Query the vDAG Controller

Once the controller is created, we can verify its status and configuration using a GET request.


In [None]:
%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag-controller/policies-test-c | json_pp

### 💬 Step 3: Submit Inference Request

We now send an inference request with:
- `session_id = "session1"` → this is **not in the whitelist**, so it's subject to the quota limit (1 request max)
- A request payload containing both **text** and **image** inputs for scene analysis

This will trigger:
- The **Quota Checker** to decide whether to allow the request
- The **Quality Checker** to log the input/output in Redis for auditing


In [None]:
%%bash

# Step 3: Inference Request with session1 (subject to quota)

curl -X POST  http://CLUSTER1MASTER:32076/v1/infer \
  -H "Content-Type: application/json" \
  -d '{
  "session_id": "session1",
  "seq_no": 5,
  "data": {
    "mode": "chat",
    "gen_params": {
      "temperature": 0.1,
      "top_p": 0.95,
      "max_tokens": 4096
    },
    "messages": [
      {
        "content": [
          {
            "type": "text",
            "text": "Analyze the following image and generate your objective scene report.?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://akm-img-a-in.tosshub.com/indiatoday/images/story/202311/chain-snatching-caught-on-camera-in-bengaluru-293151697-16x9_0.jpg"
            }
          }
        ]
      }
    ]
  },
  "graph": {},
  "selection_query": {}
}' | json_pp


### 🔍 Retrieve Quality Audit Record

This cell sends a management command to the QualityCheckerPolicy via the `/quality/mgmt` endpoint.  
You can use it to:

- Retrieve a specific audit record by `session_id` and `seq_no` using `get`
- Retrieve the latest audit record for a session using `get_latest`

#### Example: Retrieve Latest Audit Record for a Session


In [None]:
%%bash
curl -X POST http://CLUSTER1MASTER:32638/quality/mgmt \
     -H "Content-Type: application/json" \
     -d '{
           "mgmt_action": "get_latest",
           "mgmt_data": {
             "session_id": "session1"
           }
         }'
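
The management endpoints in this notebook take the same body shape: a `mgmt_action` string plus a `mgmt_data` object. A small helper can build these calls for either policy. This is a sketch; `build_mgmt_request` is a hypothetical name and the base URL is the placeholder from the curl command.

```python
import json
import urllib.request


def build_mgmt_request(base_url, policy, action, data):
    """Build (but do not send) a management call; policy is "quality" or "quota"."""
    body = {"mgmt_action": action, "mgmt_data": data}
    return urllib.request.Request(
        url=f"{base_url}/{policy}/mgmt",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


base = "http://CLUSTER1MASTER:32638"
latest = build_mgmt_request(base, "quality", "get_latest", {"session_id": "session1"})
single = build_mgmt_request(base, "quality", "get",
                            {"session_id": "session1", "seq_no": 5})
# To send one: urllib.request.urlopen(latest)
```

In the sample policy code, `get_latest` reads only `session_id`, while `get` requires both `session_id` and `seq_no`.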

### ✅ Demonstrate Quota Whitelist Behavior

This demo will:
1. Add a `session_id` to the quota policy whitelist using the `/quota/mgmt` API.
2. Simulate a quota check using that `session_id` to confirm it bypasses the quota limits.

When a session is whitelisted, the `eval()` method in the QuotaCheckerPolicy will always return `{"allowed": true}`.


In [5]:
%%bash
curl -X POST http://CLUSTER1MASTER:32638/quota/mgmt \
     -H "Content-Type: application/json" \
     -d '{
           "mgmt_action": "add_whitelist",
           "mgmt_data": {
             "session_id": "session2"
           }
         }'


{"data":{"message":"'NoneType' object has no attribute 'get'","status":"error"},"success":true}
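
Note that the response above reports `"success": true` at the top level while the nested `data.status` is `"error"`, so a client needs to inspect both fields. A hedged sketch of such a check follows; the envelope shape is inferred solely from the single response shown above, and `unwrap_mgmt_response` and `PolicyManagementError` are illustrative names.

```python
import json


class PolicyManagementError(RuntimeError):
    """Raised when the policy's management() call reports an error."""


def unwrap_mgmt_response(raw):
    """Return the inner `data` dict, raising if either layer signals failure."""
    envelope = json.loads(raw)
    if not envelope.get("success", False):
        raise PolicyManagementError(f"request failed: {envelope}")

    data = envelope.get("data") or {}
    if data.get("status") == "error":
        raise PolicyManagementError(data.get("message", "unknown error"))
    return data


raw = ('{"data":{"message":"\'NoneType\' object has no attribute \'get\'",'
       '"status":"error"},"success":true}')
try:
    unwrap_mgmt_response(raw)
except PolicyManagementError as exc:
    print("management call failed:", exc)
```

In the sample policy code, the only `.get` calls reached by the `add_whitelist` action are made on `data`, which suggests the handler received `None` instead of a dictionary. This is an inference from the sample code, not a confirmed diagnosis of the deployed policy.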


## Clean-up

The controller can be removed using the following command:

In [None]:
%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "remove_controller",
    "payload": {
      "vdag_controller_id": "policies-test-c"
    }
  }'