# Cloud-executed evaluations for continuous evaluation during development and production

## Documentation

Evaluate your Generative AI application on the cloud with Azure AI Projects SDK (preview)<br>
https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/cloud-evaluation

Some hints from https://carlos.mendible.com/2025/02/27/custom-evaluators-with-ai-foundry/

## Dependencies

In [1]:
#%%cmd
#pip install azure-identity azure-ai-projects azure-ai-ml azure-ai-evaluation

## Setup

### Common packages

In [2]:
import os
import dotenv
from pathlib import Path

### Global settings

In [3]:
# Global variables
PRIVATE = False
DATA_DIR = Path("data")
TMP_DIR = Path("tmp")

### Load environment variables

In [4]:
# Load environment variables from the .env file (or from private.env if PRIVATE is True),
# overriding any values already set in the process environment
dotenv.load_dotenv('.env' if not PRIVATE else 'private.env', override=True)

True

### Config dictionaries used by Azure AI SDK

In [5]:
# Configuration for Azure AI Foundry project
azure_ai_project = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("AZURE_RESOURCE_GROUP_AI"),
    "project_name": os.environ.get("AZURE_AI_PROJECT_NAME"),
}

# Configuration for Azure OpenAI model
model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": "azure_openai"
}
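
Because `os.environ.get` returns `None` for unset variables, a missing entry in the `.env` file only surfaces later as an SDK error. The following is a minimal sketch of an early check, assuming the variable names used in this notebook; `missing_env_vars` is a hypothetical helper, not part of any Azure SDK.

```python
import os

# Variable names used by the configuration cells in this notebook.
REQUIRED_ENV_VARS = [
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP_AI",
    "AZURE_AI_PROJECT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_AI_PROJECT_CONNECTION_STRING",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Printing instead of raising keeps the notebook usable when only some of the evaluators are needed.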

### Azure credentials

In [6]:
# https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential
from azure.identity import DefaultAzureCredential
credential = DefaultAzureCredential()

### Get AI Foundry project client

In [7]:
from azure.ai.projects import AIProjectClient

# Create an Azure AI Client from a connection string. Available on Azure AI project Overview page.
# https://learn.microsoft.com/en-us/python/api/azure-ai-projects/azure.ai.projects.aiprojectclient?view=azure-python-preview
project_client = AIProjectClient.from_connection_string(
    credential=credential,
    conn_str=os.environ.get("AZURE_AI_PROJECT_CONNECTION_STRING"),
)

## Upload evaluation data

In [8]:
# https://learn.microsoft.com/en-us/python/api/azure-ai-projects/azure.ai.projects.aiprojectclient?view=azure-python-preview#methods
# Upload a file to the Azure AI Foundry project. This method requires azure-ai-ml to be installed.
# Returns a tuple containing the asset id and asset URI of the uploaded file.
data_id, data_url = project_client.upload_file(DATA_DIR / "data.jsonl")
print(f"Uploaded data asset id: {data_id}")
print(f"Uploaded data asset url: {data_url}")

Uploaded data asset id: /subscriptions/c11caebe-ea81-4036-9e58-ccf406d87ead/resourceGroups/sbn-ai-prod-swc/providers/Microsoft.MachineLearningServices/workspaces/10_stable/data/d76cbf36-3bac-49c3-8b1e-843bf4f44438/versions/1
Uploaded data asset url: azureml://subscriptions/c11caebe-ea81-4036-9e58-ccf406d87ead/resourcegroups/sbn-ai-prod-swc/workspaces/10_stable/datastores/workspaceblobstore/paths/LocalUpload/d572131ff174c5ed77e2a11d4b1b945b/data.jsonl
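
A malformed JSONL file is easier to catch locally than in a failed cloud run. The sketch below checks that every non-empty line is a JSON object containing a given set of keys. `find_jsonl_problems` is a hypothetical helper, and the keys to require depend on the evaluators used; only `response` is known from the data mappings in this notebook.

```python
import json
from pathlib import Path


def find_jsonl_problems(path, required_keys):
    """Return (line_number, message) tuples for lines that fail basic checks.

    Hypothetical helper for illustration; it does not call any Azure API.
    """
    problems = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append((line_number, f"invalid JSON: {error.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "not a JSON object"))
                continue
            missing = [key for key in required_keys if key not in record]
            if missing:
                problems.append((line_number, f"missing keys: {missing}"))
    return problems
```

It could be called before the upload, for example as `find_jsonl_problems(DATA_DIR / "data.jsonl", ["response"])`; an empty list means no problems were found.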


## Get ids of built-in evaluators

In [9]:
from azure.ai.evaluation import F1ScoreEvaluator, GroundednessEvaluator, GroundednessProEvaluator, ViolenceEvaluator
print(f'e.g. {F1ScoreEvaluator.id}')

e.g. azureml://registries/azureml/models/F1Score-Evaluator/versions/3


## Get custom evaluator library ids from Azure AI Foundry
Note: these ids can also be looked up in the AI Foundry portal (Evaluation Library).

### Connect to Azure AI Foundry project

In [10]:
from azure.ai.ml import MLClient

# Define ml_client to look up registered custom evaluators
# https://learn.microsoft.com/en-us/python/api/azure-ai-ml/azure.ai.ml.mlclient?view=azure-python
ml_client = MLClient(
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP_AI"],
    workspace_name=os.environ["AZURE_AI_PROJECT_NAME"],
    credential=credential,
)

Overriding of current TracerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented


### Helper to build evaluator library id

In [11]:
from azure.ai.ml.entities import Model

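def get_evaluator_library_id(_evaluator: Model) -> str:
    """Build the library id of a registered custom evaluator."""
    # Look up the workspace to get its location and workspace GUID
    _ws = ml_client.workspaces.get(ml_client.workspace_name)
    # Note: _workspace_id is a private attribute of the SDK entity and may change
    _id = f"azureml://locations/{_ws.location}/workspaces/{_ws._workspace_id}/models/{_evaluator.name}/versions/{_evaluator.version}"
    print(f"{_evaluator.name} library id: {_id}")
    return _id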
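
The id format used by the helper can be separated from the workspace lookup, which makes it easy to test without a live connection. The following is a minimal sketch; `format_evaluator_library_id` is a hypothetical name, and the argument values in the example are placeholders rather than real resources.

```python
def format_evaluator_library_id(location, workspace_id, name, version):
    """Format an evaluator library id from its four parts.

    Hypothetical pure-string helper mirroring the f-string in the cell above.
    """
    return (
        f"azureml://locations/{location}/workspaces/{workspace_id}"
        f"/models/{name}/versions/{version}"
    )


# Placeholder values for illustration only.
example_id = format_evaluator_library_id(
    location="swedencentral",
    workspace_id="00000000-0000-0000-0000-000000000000",
    name="AnswerLenEvaluator",
    version="1",
)
print(example_id)
```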

### Get library ids

In [12]:
_evaluator = ml_client.evaluators.get("AnswerLenEvaluator", label="latest")
answerLenEvaluator_libId = get_evaluator_library_id(_evaluator)

Method evaluators: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


AnswerLenEvaluator library id: azureml://locations/swedencentral/workspaces/e7d65f7e-f370-4509-a296-c4d2c8befcad/models/AnswerLenEvaluator/versions/4


In [13]:
_evaluator = ml_client.evaluators.get("FriendlinessEvaluator", label="latest")
friendlinessEvaluator_libId = get_evaluator_library_id(_evaluator)

FriendlinessEvaluator library id: azureml://locations/swedencentral/workspaces/e7d65f7e-f370-4509-a296-c4d2c8befcad/models/FriendlinessEvaluator/versions/6


## Start evaluation in the cloud

In [18]:
# https://learn.microsoft.com/en-us/python/api/azure-ai-projects/azure.ai.projects.models.evaluation?view=azure-python-preview
from azure.ai.projects.models import Evaluation

# https://learn.microsoft.com/en-us/python/api/azure-ai-projects/azure.ai.projects.models.evaluatorconfiguration?view=azure-python-preview
from azure.ai.projects.models import EvaluatorConfiguration

# https://learn.microsoft.com/en-us/python/api/azure-ai-projects/azure.ai.projects.models.dataset?view=azure-python-preview
from azure.ai.projects.models import Dataset

# Define the evaluation
evaluation = Evaluation(
    display_name="Cloud evaluation",
    description="Evaluation of dataset",
    data=Dataset(id=data_id),
    
    # Note: each evaluator configuration key must follow a naming convention:
    # it must start with a letter and contain only alphanumeric characters
    # and underscores. For example, "f1_score", "f1score", and "f1_evaluator"
    # are accepted, but "f1-score-eval" and "1score" result in errors.
    evaluators={
        "f1_score": EvaluatorConfiguration(
            id=F1ScoreEvaluator.id,
        ),

        "groundedness": EvaluatorConfiguration(
            id=GroundednessEvaluator.id,
            init_params={
                "model_config": model_config
            },
        ),

        "groundedness_pro": EvaluatorConfiguration(
            id=GroundednessProEvaluator.id,
            init_params={
                "azure_ai_project": project_client.scope
            },
        ),

        "violence": EvaluatorConfiguration(
            id=ViolenceEvaluator.id,
            init_params={
                "azure_ai_project": project_client.scope
            },
        ),
        
        "answer_length": EvaluatorConfiguration(
            id=answerLenEvaluator_libId,
            data_mapping={
                "answer": "${data.response}"
            },
        ),
        
        "friendliness": EvaluatorConfiguration(
            id=friendlinessEvaluator_libId,
            init_params={
                "model_config": model_config
            },
            
            data_mapping={
                "response": "${data.response}"
            },
        ),
    },
)

# Submit the evaluation to run in the cloud
evaluation_response = project_client.evaluations.create(
    evaluation=evaluation,
)

# Fetch the evaluation again to read its current status
get_evaluation_response = project_client.evaluations.get(evaluation_response.id)

print("----------------------------------------------------------------")
print("Created evaluation, evaluation ID: ", get_evaluation_response.id)
print("Evaluation status: ", get_evaluation_response.status)
print("AI project URI: ", get_evaluation_response.properties["AiStudioEvaluationUri"])
print("----------------------------------------------------------------")

----------------------------------------------------------------
Created evaluation, evaluation ID:  60483791-47a0-4916-9c51-20624d7ed5ed
Evaluation status:  Starting
AI project URI:  https://ai.azure.com/build/evaluation/60483791-47a0-4916-9c51-20624d7ed5ed?wsid=/subscriptions/c11caebe-ea81-4036-9e58-ccf406d87ead/resourceGroups/sbn-ai-prod-swc/providers/Microsoft.MachineLearningServices/workspaces/10_stable
----------------------------------------------------------------
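
The naming convention for evaluator configuration keys described in the code comment above can be checked locally before submitting. This is a sketch that assumes "letter" and "alphanumeric" mean ASCII characters; `is_valid_evaluator_key` is a hypothetical helper, and the service may apply further rules not covered here.

```python
import re

# Starts with a letter, followed by letters, digits, or underscores only.
# Assumption: ASCII characters only.
EVALUATOR_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_evaluator_key(key):
    """Return True if the key matches the naming convention described above."""
    return EVALUATOR_KEY_PATTERN.fullmatch(key) is not None


# The example keys from the comment in the evaluation cell.
for key in ["f1_score", "f1score", "f1_evaluator", "f1-score-eval", "1score"]:
    print(f"{key!r}: {is_valid_evaluator_key(key)}")
```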

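The evaluation is still in status `Starting` right after creation, so a script that needs the results has to poll until the run finishes. The following is a minimal sketch of such a loop. `wait_for_terminal_status` is a hypothetical helper, and the terminal status names in the commented usage are assumptions that should be checked against the service's actual values.

```python
import time


def wait_for_terminal_status(
    get_status,
    terminal_statuses,
    poll_seconds=30.0,
    timeout_seconds=3600.0,
    sleep=time.sleep,
):
    """Call get_status() repeatedly until it returns a terminal status.

    Raises TimeoutError if no terminal status is seen before the timeout.
    The sleep function is injectable so the loop can be tested without waiting.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Still in status {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


# Possible usage with the client from this notebook (status names are assumed):
# final_status = wait_for_terminal_status(
#     lambda: project_client.evaluations.get(evaluation_response.id).status,
#     terminal_statuses={"Completed", "Failed", "Canceled"},
# )
```

Passing the status lookup as a callable keeps the loop independent of the SDK, so it keeps working if the client API changes during the preview.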