# Azure AI Safety Multi-modal Evaluations
This demo notebook demonstrates quality and safety evaluations for multi-modal (text + image) scenarios.

Azure AI Evaluation provides a comprehensive Python SDK and studio UI experience for running evaluations on your generative AI applications. The notebook is broken up into the following sections:

1. Setup and configuration
2. Multi-modal Content Safety Evaluator
3. Using Evaluate API

## 1. Setup and configuration
First, install the latest version of the evaluation package, `azure-ai-evaluation`.

In [None]:
%pip install azure-ai-evaluation

The following multi-modal evaluators in this sample require an Azure AI Studio project configuration and an Azure credential:

- ProtectedMaterialEvaluator

- ContentSafetyEvaluator (a composite of the following evaluators)

    - ViolenceEvaluator
    - SexualEvaluator
    - SelfHarmEvaluator
    - HateUnfairnessEvaluator

Please fill in the assignments below with the required values to run the rest of this sample. 
Ensure that you have downloaded and installed the Azure CLI and logged in with your Azure credentials using `az login` in your CLI prior to these steps. 

*Important*: We recommend using East US 2 as your AI Hub/AI project region, since it supports all built-in safety evaluators. Only a subset of the service-based safety evaluators is available in other regions; see the supported regions in our [documentation](https://aka.ms/azureaistudiosafetyevalhowto). Configure your project in a supported region to access the safety evaluation service through the evaluation SDK. Your project scope also determines where your evaluation results are logged once the evaluation run finishes.

Set the following environment variables for use in this notebook:

In [None]:
import os

os.environ["AZURE_SUBSCRIPTION_ID"] = ""
os.environ["AZURE_RESOURCE_GROUP"] = ""
os.environ["AZURE_PROJECT_NAME"] = ""

In [None]:
from azure.identity import DefaultAzureCredential

azure_cred = DefaultAzureCredential()
project_scope = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("AZURE_RESOURCE_GROUP"),
    "project_name": os.environ.get("AZURE_PROJECT_NAME"),
}
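
Because the environment variables above start out as empty strings, it can help to check them before calling any evaluator. The following is a minimal sketch; `REQUIRED_VARS` and `missing_project_settings` are hypothetical helper names introduced here for illustration and are not part of the SDK.

```python
import os

# Names of the variables this notebook reads when building project_scope.
REQUIRED_VARS = ("AZURE_SUBSCRIPTION_ID", "AZURE_RESOURCE_GROUP", "AZURE_PROJECT_NAME")


def missing_project_settings(env=None):
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]


missing = missing_project_settings()
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
```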

## 2. Multi-modal Content Safety Evaluator

In [None]:
from pprint import pprint
from azure.ai.evaluation import (
    ViolenceEvaluator,
    SexualEvaluator,
    SelfHarmEvaluator,
    HateUnfairnessEvaluator,
)

violence_evaluator = ViolenceEvaluator(credential=azure_cred, azure_ai_project=project_scope)
sexual_evaluator = SexualEvaluator(credential=azure_cred, azure_ai_project=project_scope)	
self_harm_evaluator = SelfHarmEvaluator(credential=azure_cred, azure_ai_project=project_scope)	
hate_unfair_evaluator = HateUnfairnessEvaluator(credential=azure_cred, azure_ai_project=project_scope)

conversation = {
    "messages": [
        {
            "role": "system",
            "content": [
                {"type": "text", "text": "You are an AI Assistant that can describe images"}
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Can you describe this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"
                    },
                },
            ],
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "The image shows a man with short brown hair smiling, wearing a dark-colored shirt.",
                }
            ],
        },
    ]
}

result = violence_evaluator(conversation=conversation)
pprint(result)
result = sexual_evaluator(conversation=conversation)
pprint(result)
result = self_harm_evaluator(conversation=conversation)
pprint(result)
result = hate_unfair_evaluator(conversation=conversation)
pprint(result)
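
The conversation above uses a chat-style layout in which each message's `content` is a list of typed parts (`text` or `image_url`). As a rough sketch, a small helper can assemble the same three-message structure for other images. `build_image_conversation` is a hypothetical name, not an SDK function, and the URL below is a placeholder.

```python
def build_image_conversation(system_prompt, user_text, image_url, assistant_text):
    """Build a system/user/assistant conversation with one image (hypothetical helper)."""
    return {
        "messages": [
            {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": user_text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            {"role": "assistant", "content": [{"type": "text", "text": assistant_text}]},
        ]
    }


# Placeholder values for illustration only.
example_conversation = build_image_conversation(
    "You are an AI Assistant that can describe images",
    "Can you describe this image?",
    "https://example.com/image.jpg",
    "The image shows a landscape.",
)
print([message["role"] for message in example_conversation["messages"]])
```

The resulting dictionary has the same shape as `conversation`, so it could be passed to an evaluator through the same `conversation=` keyword.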


#### Content Safety Evaluator supports multi-modal images + text
The following code runs all of the individual safety evaluators above together through one composite evaluator, `ContentSafetyEvaluator`.

In [None]:
from pprint import pprint
from azure.ai.evaluation import ContentSafetyEvaluator

evaluator = ContentSafetyEvaluator(credential=azure_cred, azure_ai_project=project_scope)
result = evaluator(conversation=conversation)
pprint(result)

#### Protected Material Multi-modal Evaluator

In [None]:
from pprint import pprint
from azure.ai.evaluation import ProtectedMaterialEvaluator

evaluator = ProtectedMaterialEvaluator(credential=azure_cred, azure_ai_project=project_scope)
result = evaluator(conversation=conversation)
pprint(result)

## 3. Using Evaluate API

In [None]:
import pandas as pd
import pathlib

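# Locate the sample dataset in the "datasets" folder next to this notebook.
data_path = os.path.join(pathlib.Path().resolve(), "datasets")
file_path = os.path.join(data_path, "dataset_messages_image_urls.jsonl")

# Load the JSON Lines file (one record per line) for a quick preview.
df = pd.read_json(file_path, lines=True)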
print(df.head())
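
The dataset is a JSON Lines file, meaning one JSON object per line. The sketch below shows one way such a file could be written and read back. It assumes each row stores its messages under a top-level `conversation` key, mirroring the `conversation=` keyword used earlier; that layout is an assumption and has not been checked against the sample dataset. `write_jsonl`, `read_jsonl`, and the file name are hypothetical.

```python
import json
import os
import tempfile


def write_jsonl(rows, path):
    """Write each row as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Assumed row layout: a "conversation" key holding the chat messages.
row = {
    "conversation": {
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": "Can you describe this image?"}]},
            {"role": "assistant", "content": [{"type": "text", "text": "It shows a landscape."}]},
        ]
    }
}

demo_path = os.path.join(tempfile.mkdtemp(), "demo_messages.jsonl")
write_jsonl([row], demo_path)
print(len(read_jsonl(demo_path)), "row(s) written to", demo_path)
```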


In [None]:
from azure.ai.evaluation import evaluate

content_safety_eval = ContentSafetyEvaluator(
    azure_ai_project=project_scope, credential=azure_cred
)

result = evaluate(
    data=file_path,
    azure_ai_project=project_scope,
    evaluators={"content_safety": content_safety_eval},
)
pprint(result)
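
The full result printed above can be long. As a hedged sketch, the helper below condenses a result dictionary, assuming it exposes aggregate scores under a `metrics` key and per-row output under a `rows` key. `summarize_result` is a hypothetical helper, and the dictionary passed to it here is placeholder data, not real evaluation output.

```python
def summarize_result(result):
    """Return a short text summary of an evaluation result dictionary (hypothetical helper)."""
    metrics = result.get("metrics") or {}
    rows = result.get("rows") or []
    lines = [f"rows evaluated: {len(rows)}"]
    for name in sorted(metrics):
        lines.append(f"{name}: {metrics[name]}")
    return "\n".join(lines)


# Placeholder structure for illustration only; metric names and values are not real.
placeholder_result = {"metrics": {"example_metric": 0.0}, "rows": [{}, {}]}
print(summarize_result(placeholder_result))
```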