# Vijil Platform Evaluation Flow

In this demo, we use the Vijil Python client to:
- **authenticate** with the platform,
- **register a rate limit and API key** for the agent-under-test,
- **execute an evaluation** of the agent-under-test,
- **check the status** of an evaluation,
- **get the results and Vijil Trust Report PDF** of the evaluation once it is complete,
- **get the recommended Vijil Dome input and output guardrail configuration** based on the results of the evaluation.

This demo walks through the steps of an evaluation in the order they must run, as a reference for integrating with Vijil's platform.

For questions, please contact vele@vijil.ai.

## Step 1: Installation of required libraries

In [None]:
# uuid is part of the Python standard library, so it does not need to be installed
!pip install vijil python-dotenv httpx

In [1]:
from dotenv import load_dotenv

load_dotenv(dotenv_path='rag-agent/.env')

import os
import json
import httpx
import uuid
from vijil import Vijil

VIJIL_BASE_URL = "https://evaluate-api.vijil.ai/v1"
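
The authentication step reads three credentials from the environment. As a minimal sketch, the check below reports every missing variable in one message instead of stopping at the first failed assertion. The helper name `missing_env_vars` is hypothetical and is not part of the Vijil client; the variable names are the ones this notebook uses.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from this notebook's authentication step.
REQUIRED_VARS = ("VIJIL_CLIENT_ID", "VIJIL_CLIENT_SECRET", "VIJIL_CLIENT_TOKEN")


def missing_env_vars(
    names: Iterable[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names that are unset or empty in `environ` (hypothetical helper)."""
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Running this right after `load_dotenv` shows whether the `.env` file was found and fully populated.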


## Step 2: Authentication with Vijil's API

### Step 2.1: Exchange the long-lived token for a short-lived access token

To establish authorization, we exchange the long-lived machine-to-machine credentials for a short-lived access token that is valid for 24 hours. This exchange needs to happen once every 24 hours. Until the token expires, you can use it for all communication with Vijil's API.

In [2]:
assert os.getenv("VIJIL_CLIENT_ID") is not None, "VIJIL_CLIENT_ID is not set"
assert os.getenv("VIJIL_CLIENT_SECRET") is not None, "VIJIL_CLIENT_SECRET is not set"
assert os.getenv("VIJIL_CLIENT_TOKEN") is not None, "VIJIL_CLIENT_TOKEN is not set"

# Get credentials from environment
client_id = os.getenv("VIJIL_CLIENT_ID")
client_secret = os.getenv("VIJIL_CLIENT_SECRET")
client_token = os.getenv("VIJIL_CLIENT_TOKEN")

# Build the request payload
payload = {
    "client_id": client_id,
    "client_secret": client_secret,
    "client_token": client_token
}

# Request an access token from the token endpoint
response = httpx.post(url=f"{VIJIL_BASE_URL}/auth/token", json=payload)
response.raise_for_status()  # Raise error if request failed

# Extract the access token from the response
token_data = response.json()
access_token = token_data['access_token']

token_data

{'access_token': 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6Ijd6XzJmTFF3MVJhcHFNazNkVUhFaCJ9.eyJpc3MiOiJodHRwczovL3RydXN0dmlqaWwudXMuYXV0aDAuY29tLyIsInN1YiI6IjE0ZWtuTXhucjdzZEV1WHJ3MzlXUktxeEtobWpwbG9wQGNsaWVudHMiLCJhdWQiOiJodHRwczovL3Byb2QtYXBpLmNsb3VkLWRldi52aWppbC5haSIsImlhdCI6MTc2MjQyMTU4NCwiZXhwIjoxNzYyNTA3OTg0LCJzY29wZSI6InJlYWQ6ZXZhbHVhdGlvbnMgd3JpdGU6ZXZhbHVhdGlvbnMiLCJndHkiOiJjbGllbnQtY3JlZGVudGlhbHMiLCJhenAiOiIxNGVrbk14bnI3c2RFdVhydzM5V1JLcXhLaG1qcGxvcCIsInBlcm1pc3Npb25zIjpbInJlYWQ6ZXZhbHVhdGlvbnMiLCJ3cml0ZTpldmFsdWF0aW9ucyJdfQ.Z-Zf9NDT7ikp_MXy7ebaXEvFSkFt_ZrTpNKI6EfDBhbFsk-0dKTzhdPZE6nD7-bUKXzZDnWGLLGut4IZVf9N-w50OyEf66L8P0wDX7rodoLwLFtz1ggQZTxfWAQT_xIXN8DgASsGYElccV6IH7ctciAZRu5ZDb3pQBZPvVJqx2Rfj9Wy8oCEPhMSFVqx7YXhEQ5xU66bUgD7_FjsoYlmG6HspuWiVN_k12Y2pdxgKICh7W_lf5Ues-8xF2mPWTZKDiVvzU2KogDfpDWImSu9eDQaWoIaTxynJ4wTfo2g7_GNAZmP4o0FCTIyqxqrxW_-l_yFqT2YQKflfwRPQjMWHw',
 'expires_in': 86400}
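
Because the access token expires after `expires_in` seconds, a long-running integration has to repeat the exchange. The following is a minimal sketch of one way to cache the token and refetch it shortly before expiry. `TokenCache`, `margin_seconds`, and the injected `fetch` callable are hypothetical names introduced for illustration; they are not part of the Vijil client. The sketch assumes only that `fetch` returns a dictionary with `access_token` and `expires_in`, as in the response above.

```python
import time
from typing import Any, Callable, Dict, Optional


class TokenCache:
    """Cache an access token and refetch it shortly before it expires (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        margin_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch            # callable that performs the token exchange
        self._margin = margin_seconds  # refresh this many seconds before expiry
        self._clock = clock            # injectable clock, useful for testing
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a cached token, calling `fetch` only when none is cached or it is about to expire."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            data = self._fetch()
            self._token = data["access_token"]
            self._expires_at = now + float(data["expires_in"])
        return self._token
```

In this notebook, `fetch` could be a small function that wraps the POST to `/auth/token` and returns `response.json()`.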

### Step 2.2: Log in to the platform with the access token and user ID

In [3]:
def login(base_url, user_id, token):
    """Post a LOGIN_COMPLETE event for `user_id` to the platform."""
    payload = {
        "type": "LOGIN_COMPLETE",
        "data": {
            "external_user_id": user_id
        }
    }
    headers = {
        'Authorization': f'Bearer {token}'
    }

    # json= serializes the payload and sets the Content-Type header;
    # passing a pre-serialized string via data= is deprecated in httpx
    response = httpx.post(url=f"{base_url}/events", headers=headers, json=payload)
    response.raise_for_status()

login(base_url=VIJIL_BASE_URL, user_id="14eknMxnr7sdEuXrw39WRKqxKhmjplop@clients", token=access_token)


### Step 2.3: Initialize Vijil client

In [4]:
client = Vijil(
    base_url=VIJIL_BASE_URL,
    api_key=access_token
)

client.agents.list()


[]

## Step 3: Add an API key and rate limit for the agent-under-test

We need to register an API key entry for the agent-under-test on the platform. The entry sets the rate limit that applies when the platform queries the agent. If the agent-under-test itself requires an API key, supply it in the `api_key` field; this demo passes a placeholder value.

In [74]:
agent_url = "https://88b5c6a7c5f2975e5851f311fba51dc995c0736f-8000.dstack-pha-prod7.phala.network/v1"
phala_webinar_api_key = client.api_keys.create(
    name=f"phala_webinar_{uuid.uuid4()}",
    model_hub="custom",
    rate_limit_per_interval=10,
    rate_limit_interval=60,
    api_key="placeholder",
    url=agent_url
)

print(phala_webinar_api_key)

{'id': 'cd2e4e41-0d2c-4e76-a11c-a590a85a562a', 'name': 'phala_webinar_0e1382f2-5a15-4cdb-9a83-78de9fdec163', 'hub': 'custom', 'rate_limit_per_interval': 10, 'rate_limit_interval': 60, 'display_value': 'pl*******er', 'hub_config': None, 'user_id': '3ea9848e-1b36-45a3-926b-ef9ce84b873b', 'team_id': '14eknMxnr7sdEuXrw39WRKqxKhmjplop@clients', 'status': 'active'}
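
The rate limit puts a floor on how long an evaluation can take. The sketch below estimates that floor under two assumptions that this notebook does not confirm: `rate_limit_interval` is measured in seconds, and each response costs exactly one request to the agent. The function name is hypothetical, and the figure of 1446 is the `total_response_count` reported in the Step 5 status output.

```python
def min_evaluation_seconds(
    total_requests: int, requests_per_interval: int, interval_seconds: float
) -> float:
    """Lower bound on run time if the rate limit were the only bottleneck (hypothetical helper)."""
    if requests_per_interval <= 0:
        raise ValueError("requests_per_interval must be positive")
    return total_requests * interval_seconds / requests_per_interval


# 1446 requests at 10 requests per 60 seconds (assumed units)
estimate = min_evaluation_seconds(1446, 10, 60)
print(f"at least {estimate:.0f} seconds, about {estimate / 3600:.1f} hours")
```

Under these assumptions, a limit of 10 requests per 60 seconds gives a floor of 8676 seconds. Actual run time also depends on the agent's response latency, so treat the result as a rough planning number only.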


In [9]:
client.api_keys.list()

[{'id': '52f1c8b1-e441-42a7-85e7-65f5fafafe76',
  'name': 'phala_webinar_7dde3cd4-b7c9-42f5-8500-2820f7edc394',
  'hub': 'custom',
  'rate_limit_per_interval': 200,
  'rate_limit_interval': 60,
  'display_value': 'pl*******er',
  'hub_config': None,
  'user_id': '3ea9848e-1b36-45a3-926b-ef9ce84b873b',
  'team_id': '14eknMxnr7sdEuXrw39WRKqxKhmjplop@clients',
  'status': 'active'},
 {'id': '75983dc1-9985-4b8b-b993-71c01a7319b4',
  'name': 'phala_webinar_26ba759f-856f-450c-9bef-2e06e5ea3c3f',
  'hub': 'custom',
  'rate_limit_per_interval': 50,
  'rate_limit_interval': 60,
  'display_value': 'pl*******er',
  'hub_config': None,
  'user_id': '3ea9848e-1b36-45a3-926b-ef9ce84b873b',
  'team_id': '14eknMxnr7sdEuXrw39WRKqxKhmjplop@clients',
  'status': 'active'},
 {'id': '791a9b79-c258-45fd-bbc5-b20691b8e9c2',
  'name': 'phala_webinar_c68dc339-0b50-4412-98b5-0d2a5eb608ad',
  'hub': 'custom',
  'rate_limit_per_interval': 50,
  'rate_limit_interval': 60,
  'display_value': 'pl*******er',
  'hub_c

## Step 4: Initiate an evaluation

Now we can create an evaluation. Provide the required fields: a unique name for the evaluation, the test suites to run (harnesses), the name of the API key registered in Step 3, and the model URL of the agent.

In [76]:
agent_model_name = "vijil-docs-agent"
agent_api_key_name = phala_webinar_api_key['name']
eval = client.evaluations.create(
    model_hub="custom",
    model_name=agent_model_name,
    model_params={
        "timeout": 120, # response timeout for agent queries
    },
    name=f"phala-eval-{uuid.uuid4()}",
    api_key_name=agent_api_key_name,
    harnesses=["security", "safety"],
    model_url=agent_url
)

## Step 5: Get status of the evaluation

You can query the status of the evaluation with `get_status`:

In [7]:
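# To follow the evaluation created in Step 4, use its ID:
# eval_id = eval['id']
# This run instead reuses the ID of an evaluation that has already completed.
eval_id = "3ac818bf-c7e5-4b55-be6f-4e370704e9e3"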
client.evaluations.get_status(evaluation_id=eval_id)

{'id': '3ac818bf-c7e5-4b55-be6f-4e370704e9e3',
 'name': 'phala-eval-41800326-ce5f-4e6f-8e57-43df9102e999',
 'tags': ['vijil_harness'],
 'status': 'COMPLETED',
 'cause': None,
 'total_test_count': 1446,
 'completed_test_count': 1446,
 'error_test_count': 0,
 'total_response_count': 1446,
 'completed_response_count': 1399,
 'error_response_count': 29,
 'total_generation_time': '12922.000000',
 'average_generation_time': '8.2130013831258645',
 'score': 0.9564480692415475,
 'status_counts': {'probes': {'COMPLETED': 136, 'ERROR': 5},
  'tests': {'GENERATED': 1446},
  'responses': {'SKIP': 18, 'ERROR': 29, 'COMPLETED': 1399}},
 'hub': 'custom',
 'model': 'vijil-docs-agent',
 'url': 'https://88b5c6a7c5f2975e5851f311fba51dc995c0736f-8000.dstack-pha-prod7.phala.network/v1',
 'created_at': 1762416985,
 'created_by': '3ea9848e-1b36-45a3-926b-ef9ce84b873b',
 'completed_at': 1762429911,
 'team_id': '14eknMxnr7sdEuXrw39WRKqxKhmjplop@clients',
 'restart_count': 0,
 'metadata': None,
 'completion_toke
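
Since an evaluation can run for hours, checking the status is usually done in a loop. Below is a minimal polling sketch. `wait_for_evaluation` is a hypothetical helper, not a Vijil client method. Only the `COMPLETED` status appears in this notebook's output; the other terminal status names are assumptions and should be checked against the platform before relying on them.

```python
import time
from typing import Any, Callable, Dict, Iterable

# Assumption: "COMPLETED" is confirmed by the output above; "ERROR" and
# "CANCELLED" are guessed names for other terminal states.
TERMINAL_STATUSES = ("COMPLETED", "ERROR", "CANCELLED")


def wait_for_evaluation(
    get_status: Callable[..., Dict[str, Any]],
    evaluation_id: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600.0,
    terminal_statuses: Iterable[str] = TERMINAL_STATUSES,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `get_status` until the evaluation reaches a terminal status or the timeout passes."""
    terminal = set(terminal_statuses)
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(evaluation_id=evaluation_id)
        if status.get("status") in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Evaluation {evaluation_id} did not finish in time")
        sleep(poll_seconds)
```

With the client from Step 2.3, the call would look like `wait_for_evaluation(client.evaluations.get_status, eval_id)`. Passing the status function in as an argument keeps the helper independent of the client and easy to test.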

You can also cancel an evaluation using the following code:

In [73]:
client.evaluations.cancel(evaluation_id=eval_id)

{'evaluation_id': '3e16fc5d-623f-47ad-93cc-eb193723e976',
 'status': 'CANCELLATION_INITIATED',
 'message': 'Cancellation process has been initiated. This may take a few moments to complete.'}


## Step 6: Download the trust report after the evaluation completes

Once the evaluation is complete, we can download the trust report (PDF) using the following commands.

In [8]:
analysis_report = client.evaluations.report(evaluation_id=eval_id)
analysis_report.generate(save_file="analysis_report.pdf", wait_till_completion=True, format="pdf")

Queuing up your evaluation report.....
Analysing Failed Tests.....
Analysing probe-level failures.....
Parsing and analysing Harnesses........
Evaluation Report Created.....
Report 908a067d-6bec-46bc-8f37-ee99d894d091 for evaluation 3ac818bf-c7e5-4b55-be6f-4e370704e9e3 was saved to analysis_report.pdf
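
Before passing the report on to another system, it can be worth confirming that the saved file really is a PDF. The following is a small sanity check, not part of the Vijil client: it only verifies that the file exists and starts with the standard `%PDF-` header. The function name `looks_like_pdf` is hypothetical.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if `path` is a file whose first bytes are the PDF header (hypothetical helper)."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(5) == b"%PDF-"
```

For the report saved above, the check would be `looks_like_pdf("analysis_report.pdf")`.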


## Step 7: Generate recommended Vijil Dome guardrails based on evaluation results

Now we can get the recommended Vijil Dome configuration (a JSON object) for input and output guardrails.

In [16]:
from typing import Optional, Dict, Any
import httpx

def get_config_from_vijil_evaluation(
    api_token: str,
    evaluation_id: str,
    base_url: Optional[str] = None,
    latency_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """
    Fetch the Dome configuration from a specific evaluation in Vijil Evaluate using the provided API token and evaluation ID.

    Args:
        api_token (str): The API token for authentication.
        evaluation_id (str): The ID of the evaluation whose configuration is to be fetched.
        base_url (Optional[str]): The API base URL. Defaults to VIJIL_BASE_URL.
        latency_threshold (Optional[float]): Optional latency threshold to send with the request.

    Returns:
        dict: The Dome configuration as a dictionary.

    Raises:
        Exception: If the API call fails or the response contains no configuration.
    """
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    base_url = base_url or VIJIL_BASE_URL
    url = f"{base_url}/recommend-dome-config"

    payload: Dict[str, Any] = {"evaluation_id": evaluation_id}
    # Compare against None so that a threshold of 0 is still sent
    if latency_threshold is not None:
        payload["latency_threshold"] = latency_threshold

    try:
        response = httpx.post(url, headers=headers, json=payload)
        response.raise_for_status()
    except httpx.HTTPError as e:
        # Chain the original error so the underlying cause stays visible
        raise Exception(f"Failed to fetch Dome config: {e}") from e

    dome_config = response.json()
    if dome_config is None:
        raise Exception("Dome configuration not found in the response.")
    return dome_config

In [17]:
dome_config = get_config_from_vijil_evaluation(
    api_token=access_token,
    evaluation_id=eval_id
)

dome_config

{'input-guards': ['security-input-guard',
  'moderation-input-guard',
  'privacy-input-guard'],
 'output-guards': ['moderation-output-guard'],
 'input-early-exit': True,
 'security-input-guard': {'type': 'security',
  'early-exit': True,
  'methods': ['prompt-injection-mbert']},
 'moderation-input-guard': {'type': 'moderation',
  'early-exit': True,
  'methods': ['moderation-flashtext']},
 'moderation-output-guard': {'type': 'moderation',
  'early-exit': True,
  'methods': ['moderation-deberta', 'moderation-flashtext']},
 'privacy-input-guard': {'type': 'privacy', 'methods': ['privacy-presidio']}}
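
The returned configuration lists guard names under `input-guards` and `output-guards`, and defines each guard under its own key. As a minimal sketch, the helpers below summarize which detection methods each guard uses and save the configuration to disk for later use. Both function names are hypothetical, and the code relies only on the keys visible in the output above.

```python
import json
from typing import Any, Dict, List


def summarize_guards(config: Dict[str, Any]) -> Dict[str, Dict[str, List[str]]]:
    """Map each guard listed under input-guards/output-guards to its methods (hypothetical helper)."""
    summary: Dict[str, Dict[str, List[str]]] = {}
    for side in ("input-guards", "output-guards"):
        summary[side] = {
            guard: list(config.get(guard, {}).get("methods", []))
            for guard in config.get(side, [])
        }
    return summary


def save_config(config: Dict[str, Any], path: str) -> None:
    """Write the configuration to `path` as indented JSON (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

For example, `summarize_guards(dome_config)` gives a compact view of the recommendation, and `save_config(dome_config, "dome_config.json")` stores it next to the report; the file name here is only an illustration.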