# **Evaluating AI Models in Azure AI Foundry**

## Overview
This notebook demonstrates how to evaluate AI model outputs using Azure AI Foundry's evaluation capabilities. You'll learn how to assess the groundedness and quality of AI-generated responses, ensuring they're factually accurate and aligned with provided context.

## Evaluation Types in Azure AI Foundry

### **Groundedness Evaluation**
Groundedness evaluates whether an AI model's responses are properly supported by the provided context:

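- **GroundednessEvaluator**: Assesses whether the claims in the model's response can be verified against the given context, using the model described by `model_config` as the judge
- **GroundednessProEvaluator**: A service-backed variant that runs against your Azure AI Foundry project; it takes `azure_ai_project` and a credential instead of a model configuration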

### **Other Common Evaluations (Not Covered in This Notebook)**
- **Relevance**: Measures how well the response addresses the user query
- **Coherence**: Evaluates the logical flow and consistency of the response
- **Fluency**: Analyzes the grammatical correctness and readability of the text
- **Toxicity**: Checks for harmful, offensive, or inappropriate content

## Learning Objectives
- Set up evaluation in Azure AI Foundry
- Configure evaluators for assessing groundedness of AI responses
- Run evaluations on sample data

## Prerequisites
- An Azure account with access to Azure AI Foundry
- Azure OpenAI connection configured in your Azure AI Foundry project
- Appropriate environment variables in a `.env` file:
  - `PROJECT_CONNECTION_STRING`: Connection string for your Azure AI Foundry project
  - `OAI_CONNECTION_NAME`: Name of your Azure OpenAI connection
  - `chatModel`: Deployment name of the model used as the evaluation judge (e.g., a GPT-4 deployment)
  - `AZURE_OPENAI_API_VERSION`: API version for Azure OpenAI
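
Before creating the client, it can help to confirm that all four variables listed above are actually set. The following is a minimal sketch, not part of the original notebook; the helper name `missing_env_vars` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = (
    "PROJECT_CONNECTION_STRING",
    "OAI_CONNECTION_NAME",
    "chatModel",
    "AZURE_OPENAI_API_VERSION",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to check other mappings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this check right after `dotenv.load_dotenv(".env")` surfaces a misnamed or empty variable before any call to Azure is made.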

## Workflow
This notebook walks through the complete process of evaluating AI model outputs:
1. Setting up the environment and client
2. Configuring evaluators for groundedness assessment
3. Running evaluations on sample data
4. Uploading and analyzing evaluation results in Azure AI Foundry

Follow along step by step to learn how to evaluate your AI models in Azure AI Foundry!

In [1]:
import os
import dotenv
dotenv.load_dotenv(".env")

True

In [2]:
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

project_client = AIProjectClient.from_connection_string(
    conn_str=os.environ["PROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
)

oai_connection = project_client.connections.get(
    connection_name=os.getenv("OAI_CONNECTION_NAME"),
    include_credentials=True)

**Run evaluation and upload results to AI Foundry**

In [18]:
from azure.ai.evaluation import evaluate
import datetime
from azure.ai.projects.models import EvaluatorConfiguration, ConnectionType
from azure.ai.evaluation import GroundednessEvaluator, GroundednessProEvaluator

def run_eval_on_azure(model_config, data_path):
    """Run both groundedness evaluators on a JSONL file and return the result."""
    now = datetime.datetime.now()
    result = evaluate(
        # Timestamped name so repeated runs are easy to tell apart.
        evaluation_name=f"groundedness-{now.strftime('%Y-%m-%d-%H-%M-%S')}",
        data=data_path,
        evaluators={
            # Judged by the model described in model_config.
            "GroundednessEvaluator": GroundednessEvaluator(model_config=model_config),
            # Runs against the Azure AI project, so it needs the project scope and a credential.
            "GroundednessProEvaluator": GroundednessProEvaluator(azure_ai_project=project_client.scope, credential=DefaultAzureCredential()),
        },
        # Map each evaluator input to a column of the data file.
        evaluator_config={
            "GroundednessProEvaluator": {
                "column_mapping": {
                    "query": "${data.query}",
                    "context": "${data.context}",
                    "response": "${data.response}"
                }
            },
            "GroundednessEvaluator": {
                "column_mapping": {
                    "query": "${data.query}",
                    "context": "${data.context}",
                    "response": "${data.response}"
                }
            }
        },
        # Passing the project scope sends the results to Azure AI Foundry.
        azure_ai_project=project_client.scope,
        # output_path="./myevalresults.json"
    )
    return result
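
The column mapping above expects every row of the data file to carry `query`, `context`, and `response` fields. As a rough sketch (not part of the original notebook), the helpers below write a small JSONL file in that shape and report rows that lack a required column. The function names and the sample row are placeholders for illustration, not the notebook's real data.

```python
import json
from pathlib import Path

# Column names come from the column_mapping above.
REQUIRED_COLUMNS = ("query", "context", "response")

# Placeholder row for illustration only; replace with your own data.
SAMPLE_ROWS = [
    {
        "query": "What is the capital of France?",
        "context": "France is a country in Europe. Its capital is Paris.",
        "response": "The capital of France is Paris.",
    },
]


def write_jsonl(rows, path):
    """Write one JSON object per line, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def rows_missing_columns(path):
    """Return (line_number, missing_columns) for each row lacking a required column."""
    problems = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            row = json.loads(line)
            missing = [c for c in REQUIRED_COLUMNS if c not in row]
            if missing:
                problems.append((line_number, missing))
    return problems
```

Calling `rows_missing_columns("data/eval.jsonl")` before an evaluation run returns an empty list when every row has all three columns.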

In [19]:
data_path = "data/eval.jsonl"

######## get configs ########
model_config = oai_connection.to_evaluator_model_config(
    deployment_name=os.environ.get("chatModel"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION")
)

######## run evaluation ########
result = run_eval_on_azure(
    model_config,
    data_path
)

[2025-04-04 16:52:32 +0200][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_w58dwg92_20250404_165232_024664, log path: C:\Users\povelf\.promptflow\.runs\azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_w58dwg92_20250404_165232_024664\logs.txt
[2025-04-04 16:52:32 +0200][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_u65j6es6_20250404_165232_024664, log path: C:\Users\povelf\.promptflow\.runs\azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_u65j6es6_20250404_165232_024664\logs.txt


Prompt flow service has started...
Prompt flow service has started...
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_w58dwg92_20250404_165232_024664
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=azure_ai_evaluation_evaluators_common_base_eval_asyncevaluatorbase_u65j6es6_20250404_165232_024664


[2025-04-04 16:52:32 +0200][promptflow.core._prompty_utils][ERROR] - Exception occurs: AuthenticationError: Error code: 401 - {'error': {'code': '401', 'message': 'Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.'}}
[2025-04-04 16:52:32 +0200][promptflow.core._prompty_utils][ERROR] - Exception occurs: AuthenticationError: Error code: 401 - {'error': {'code': '401', 'message': 'Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.'}}
[2025-04-04 16:52:32 +0200][promptflow.core._prompty_utils][ERROR] - Exception occurs: AuthenticationError: Error code: 401 - {'error': {'code': '401', 'message': 'Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscripti

2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Finished 1 / 3 lines.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Average execution time for completed lines: 15.9 seconds. Estimated time for incomplete lines: 31.8 seconds.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Finished 2 / 3 lines.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Average execution time for completed lines: 15.9 seconds. Estimated time for incomplete lines: 31.8 seconds.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Finished 2 / 3 lines.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Average execution time for completed lines: 7.96 seconds. Estimated time for incomplete lines: 7.96 seconds.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Finished 3 / 3 lines.
2025-04-04 16:52:48 +0200   32776 execution.bulk     INFO     Average execution time for completed lines: 5.34 seconds. Estimated time for incomplete l
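
Once a run finishes, the aggregate scores can be read from the value returned by `evaluate()`, provided that value is captured. The sketch below is an assumption-laden illustration, not part of the original notebook: it assumes the result behaves like a dict with a `metrics` mapping and an optional `studio_url` entry, and the helper name `summarize_result` is hypothetical.

```python
def summarize_result(result):
    """Format aggregate metrics from an evaluation result as readable text.

    Assumes `result` is dict-like with a "metrics" mapping and, when the run
    was sent to a project, a "studio_url" string. Both keys are treated as
    optional so the helper does not fail on a partial result.
    """
    lines = []
    for name, value in sorted(result.get("metrics", {}).items()):
        lines.append(f"{name}: {value}")
    url = result.get("studio_url")
    if url:
        lines.append(f"Results in Azure AI Foundry: {url}")
    return "\n".join(lines)
```

With the result of the evaluation stored in a variable, `print(summarize_result(result))` lists each metric on its own line, followed by the portal link if one is present.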

**Upload data**