# [Cloud Evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/cloud-evaluation#cloud-evaluation-preview-with-azure-ai-projects-sdk)
While the Azure AI Evaluation SDK client supports running evaluations locally on your own machine, you might want to delegate the job to the cloud. For example, after running local evaluations on small test data to assess your generative AI application prototypes, you move into pre-deployment testing and need to run evaluations on a large dataset. Cloud evaluation frees you from managing local compute infrastructure and lets you integrate evaluations as tests into your CI/CD pipelines. After deployment, you might also want to evaluate your applications continuously for post-deployment monitoring.

In this article, you learn how to run cloud evaluation (preview) in pre-deployment testing on a test dataset. With the Azure AI Projects SDK, evaluation results are automatically logged to your Azure AI project for better observability. This feature supports all Microsoft-curated built-in evaluators as well as your own custom evaluators, which can be stored in the Evaluator library and share the same project-scope RBAC.

## Environment preparation

In [1]:
# !az login

In [2]:
# Constants and Libraries
import os, sys, json  # sys is needed for sys.exit() below
from azure.identity import DefaultAzureCredential, get_bearer_token_provider #requires azure-identity
from pprint import pprint
from dotenv import load_dotenv # requires python-dotenv

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [3]:
# Initialize Azure OpenAI connection

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("MODEL_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
}
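
Because `os.environ.get` returns `None` for unset variables, a missing setting only surfaces later, when an evaluator tries to call the model. A minimal sketch of an early check follows; the helper name `missing_keys` and the placeholder values are hypothetical and not part of any Azure SDK.

```python
def missing_keys(config: dict) -> list[str]:
    """Return the keys whose values are None or empty strings."""
    return [key for key, value in config.items() if not value]


# Placeholder values for illustration only (assumptions, not real settings).
example_config = {
    "azure_endpoint": "https://example.openai.azure.com/",
    "api_key": None,        # simulates an unset environment variable
    "azure_deployment": "my-deployment",
    "api_version": "",      # simulates an empty environment variable
}

print(missing_keys(example_config))
```

In the notebook, calling `missing_keys(model_config)` right after the cell above would list any settings that were not loaded from the `.env` file.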

## [Create an Azure AI Client from a connection string](https://learn.microsoft.com/en-us/azure/ai-services/agents/quickstart?pivots=programming-language-python-azure)
- available on Azure AI project Overview page.
- format: `<HostName>;<AzureSubscriptionId>;<ResourceGroup>;<ProjectName>`
- command: `az ml workspace show -n mmai-swc-hub01-prj01 --resource-group mmai-swc-hub01-grp --query discovery_url`
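
The four-part format above can be checked locally before it is handed to the client. The following is a minimal sketch, assuming the string is exactly four non-empty fields separated by semicolons; `ProjectConnection`, `parse_connection_string`, and the example values are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class ProjectConnection(NamedTuple):
    host_name: str
    subscription_id: str
    resource_group: str
    project_name: str


def parse_connection_string(conn_str: str) -> ProjectConnection:
    """Split '<HostName>;<AzureSubscriptionId>;<ResourceGroup>;<ProjectName>'."""
    parts = [part.strip() for part in conn_str.split(";")]
    if len(parts) != 4 or not all(parts):
        raise ValueError(
            "expected '<HostName>;<AzureSubscriptionId>;<ResourceGroup>;<ProjectName>'"
        )
    return ProjectConnection(*parts)


# Made-up values that only follow the documented shape.
example = parse_connection_string(
    "example.api.azureml.ms;00000000-0000-0000-0000-000000000000;my-rg;my-project"
)
print(example.project_name)
```

A check like this catches a truncated or mis-pasted value in `PROJECT_CONNECTION_STRING` with a readable message instead of a failure inside the SDK.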

In [4]:
from azure.ai.projects import AIProjectClient

project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ.get("PROJECT_CONNECTION_STRING")
)

## Uploading evaluation data
There are two ways to register the data required for cloud evaluations in your Azure AI project:
- From the SDK: upload new data from your local directory to your Azure AI project and get the dataset ID back as a result.
- Given existing datasets uploaded to your Project...

In [5]:
data_id, _ = project_client.upload_file("./synthetic_dataset_cloud.jsonl")
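
Since the upload takes a `.jsonl` file, it can help to confirm locally that every non-blank line is a JSON object before sending it to the project. This is a minimal sketch using only the standard library; the helper name and the `query`/`response` field names in the sample are assumptions for illustration, not a schema required by the service.

```python
import json


def find_invalid_jsonl_lines(lines):
    """Return (line_number, reason) for each non-blank line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped rather than reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
    return problems


# Hypothetical sample rows: one valid, one blank, one array, one truncated.
sample = ['{"query": "q", "response": "r"}', "", "[1, 2]", '{"query": ']
for number, reason in find_invalid_jsonl_lines(sample):
    print(number, reason)
```

To check the real file, pass an open file handle, for example `find_invalid_jsonl_lines(open("./synthetic_dataset_cloud.jsonl", encoding="utf-8"))`.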

## Specifying built-in evaluators from Evaluator library

In [6]:
from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEvaluator

## Specifying custom evaluators
Note: custom evaluators must already be registered, as done in 2.4 Custom Evaluators.

In [7]:
# Define ml_client to look up the registered custom evaluator

from azure.ai.ml import MLClient

ml_client = MLClient(
       subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
       resource_group_name=os.environ["RESOURCE_GROUP_NAME"],
       workspace_name=os.environ["PROJECT_NAME"],
       credential=credential
)

Overriding of current TracerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented


In [8]:
# Specify evaluator name as it appears in the Evaluator library
evaluator_name = "FriendlinessEvaluator"
registered_evaluator = ml_client.evaluators.get(evaluator_name, version=3)
print("Registered evaluator:", registered_evaluator)

Method evaluators: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Registered evaluator: creation_context:
  created_at: '2025-04-20T23:33:23.686456+00:00'
  created_by: Mauro Minella
  created_by_type: User
  last_modified_at: '2025-04-20T23:33:23.686456+00:00'
  last_modified_by: Mauro Minella
  last_modified_by_type: User
description: prompt-based evaluator measuring response friendliness.
id: azureml:/subscriptions/eca2eddb-0f0c-4351-a634-52751499eeea/resourceGroups/mmai-swc-hub01-grp/providers/Microsoft.MachineLearningServices/workspaces/mmai-swc-hub01-prj01/models/FriendlinessEvaluator/versions/3
name: FriendlinessEvaluator
path: azureml://subscriptions/eca2eddb-0f0c-4351-a634-52751499eeea/resourceGroups/mmai-swc-hub01-grp/workspaces/mmai-swc-hub01-prj01/datastores/workspaceblobstore/paths/LocalUpload/e8ccca8c8f07c002cc62dce8635918f0/friendliness_local
properties:
  is-evaluator: 'true'
  is-promptflow: 'true'
stage: Development
tags: {}
type: custom_model
version: '3'



## Create an evaluation

In [9]:
from azure.ai.projects.models import Evaluation, Dataset, EvaluatorConfiguration

evaluation = Evaluation(
    display_name="Cloud evaluation",
    description="Evaluation of dataset",
    data=Dataset(id=data_id),
    evaluators={
        # Note the evaluator configuration key must follow a naming convention
        # the string must start with a letter with only alphanumeric characters 
        # and underscores. Take "f1_score" as example: "f1score" or "f1_evaluator" 
        # will also be acceptable, but "f1-score-eval" or "1score" will result in errors.
        "f1_score": EvaluatorConfiguration(
            id=F1ScoreEvaluator.id,
        ),
        "relevance": EvaluatorConfiguration(
            id=RelevanceEvaluator.id,
            init_params={
                "model_config": model_config
            },
        ),
        "violence": EvaluatorConfiguration(
            id=ViolenceEvaluator.id,
            init_params={
                "azure_ai_project": project_client.scope
            },
        ),
        "friendliness": EvaluatorConfiguration(
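            # Use the registered asset ID rather than the datastore `path`:
            # passing `path` is what produced the "invalid asset" UserError
            # recorded in the output of the create call.
            id=registered_evaluator.id,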
            init_params={
                "model_config": model_config
            }
        )
    },
)
evaluation

{'displayName': 'Cloud evaluation', 'description': 'Evaluation of dataset', 'data': {'type': 'dataset', 'id': '/subscriptions/eca2eddb-0f0c-4351-a634-52751499eeea/resourceGroups/mmai-swc-hub01-grp/providers/Microsoft.MachineLearningServices/workspaces/mmai-swc-hub01-prj01/data/902bf50f-c87c-4320-ab89-9bf64be8be3b/versions/1'}, 'evaluators': {'f1_score': {'id': 'azureml://registries/azureml/models/F1Score-Evaluator/versions/3'}, 'relevance': {'id': 'azureml://registries/azureml/models/Relevance-Evaluator/versions/4', 'initParams': {'model_config': {'azure_endpoint': 'https://mmai-swc-hub01-oais581696736083.openai.azure.com/', 'api_key': '<redacted>', 'azure_deployment': 'gpt-4.1', 'api_version': '2025-03-01-preview'}}}, 'violence': {'id': 'azureml://registries/azureml/models/Violent-Content-Evaluator/versions/3', 'initParams': {'azure_ai_project': {'subscription_id': 'eca2eddb-0f0c-4351-a634-52751499eeea', 'resour
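
The naming convention described in the code comment (a leading letter, then only alphanumeric characters and underscores) can be checked before the evaluation is submitted. The sketch below encodes that comment as a regular expression; the pattern is a reading of the comment, not a rule published by the service, and `is_valid_evaluator_key` is a hypothetical helper.

```python
import re

# A letter first, then any number of letters, digits, or underscores.
_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_evaluator_key(key: str) -> bool:
    """Return True if the key follows the convention described in the comment."""
    return _KEY_PATTERN.fullmatch(key) is not None


for key in ["f1_score", "f1score", "f1_evaluator", "f1-score-eval", "1score"]:
    print(key, is_valid_evaluator_key(key))
```

Running the check over the keys of the `evaluators` dictionary gives a local failure with the offending key name instead of a service-side error.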

In [11]:
# Create evaluation
evaluation_response = project_client.evaluations.create(
    evaluation=evaluation,
)

HttpResponseError: (UserError) Evaluation evaluator ID invalid asset or open AI grader format. Provided evaluator id is azureml://subscriptions/eca2eddb-0f0c-4351-a634-52751499eeea/resourceGroups/mmai-swc-hub01-grp/workspaces/mmai-swc-hub01-prj01/datastores/workspaceblobstore/paths/LocalUpload/e8ccca8c8f07c002cc62dce8635918f0/friendliness_local is invalid
Code: UserError
Message: Evaluation evaluator ID invalid asset or open AI grader format. Provided evaluator id is azureml://subscriptions/eca2eddb-0f0c-4351-a634-52751499eeea/resourceGroups/mmai-swc-hub01-grp/workspaces/mmai-swc-hub01-prj01/datastores/workspaceblobstore/paths/LocalUpload/e8ccca8c8f07c002cc62dce8635918f0/friendliness_local is invalid
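
The rejected ID above points into a datastore (`.../datastores/.../paths/...`), whereas the evaluator IDs printed elsewhere in this notebook end in `/models/<name>/versions/<n>`. The sketch below is a heuristic built only from those observed strings, not from a documented format, so treat the pattern and the helper name `looks_like_versioned_asset_id` as assumptions.

```python
import re

# Heuristic: "azureml:" followed by one or two slashes, and a trailing
# "/models/<name>/versions/<version>" segment, as seen in this notebook's output.
_VERSIONED_ASSET = re.compile(r"^azureml:/{1,2}.+/models/[^/]+/versions/[^/]+$")


def looks_like_versioned_asset_id(evaluator_id: str) -> bool:
    """Return True if the ID has the shape of the versioned model IDs seen above."""
    return _VERSIONED_ASSET.match(evaluator_id) is not None


# Placeholder segments (<sub>, <rg>, <ws>, <hash>) stand in for real values.
candidates = [
    "azureml://registries/azureml/models/F1Score-Evaluator/versions/3",
    "azureml:/subscriptions/<sub>/resourceGroups/<rg>/providers/"
    "Microsoft.MachineLearningServices/workspaces/<ws>/models/FriendlinessEvaluator/versions/3",
    "azureml://subscriptions/<sub>/resourceGroups/<rg>/workspaces/<ws>/"
    "datastores/workspaceblobstore/paths/LocalUpload/<hash>/friendliness_local",
]
for candidate in candidates:
    print(looks_like_versioned_asset_id(candidate))
```

A local check of this kind flags a datastore path before the request is sent, although only the service decides which IDs it actually accepts.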

In [11]:
# Get evaluation
get_evaluation_response = project_client.evaluations.get(evaluation_response.id)

print("----------------------------------------------------------------")
print("Created evaluation, evaluation ID: ", get_evaluation_response.id)
print("Evaluation status: ", get_evaluation_response.status)
print("AI project URI: ", get_evaluation_response.properties["AiStudioEvaluationUri"])
print("----------------------------------------------------------------")

NameError: name 'evaluation_response' is not defined
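
When an evaluation runs as a test in a CI/CD pipeline, the status usually has to be polled until the run finishes. Below is a minimal, SDK-independent sketch of such a loop; the function name, its parameters, and the status strings in the demo are assumptions, so check the statuses your project actually returns.

```python
import time


def wait_for_terminal_status(fetch_status, terminal_statuses,
                             poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or max_polls is reached."""
    status = fetch_status()
    polls = 1
    while status not in terminal_statuses and polls < max_polls:
        sleep(poll_seconds)
        status = fetch_status()
        polls += 1
    return status


# Demo with a fake status source, so the sketch runs without any cloud access.
fake_statuses = iter(["Running", "Running", "Completed"])
final_status = wait_for_terminal_status(
    lambda: next(fake_statuses),
    terminal_statuses={"Completed", "Failed"},  # assumed status names
    poll_seconds=0,
    sleep=lambda seconds: None,
)
print(final_status)
```

In this notebook, `fetch_status` could be `lambda: project_client.evaluations.get(evaluation_response.id).status`, once the create call has succeeded and `evaluation_response` is defined.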