# [Generate synthetic and simulated data for evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/simulator-interaction-data)
The **Azure AI Evaluation SDK**'s `Simulator` provides end-to-end synthetic data generation so that developers can test how their application responds to typical user queries when no production data is available. AI developers can use an index-based or text-based query generator and a fully customizable simulator to create robust test datasets for non-adversarial tasks specific to their application. The `Simulator` class generates synthetic conversations and simulates task-based interactions. This capability is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.

In [None]:
# !az login

In [1]:
# Constants and Libraries
import json
import os
import sys  # needed for sys.exit() below
from pprint import pprint
from typing import Any, Dict, List, Optional

from azure.identity import DefaultAzureCredential, get_bearer_token_provider  # requires azure-identity
from dotenv import load_dotenv  # requires python-dotenv
from promptflow.client import load_flow

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [2]:
# Initialize Azure OpenAI connection

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("MODEL_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": "AzureOpenAI",  # needed by promptflow (see promptflow/core/_prompty_utils.py)
}
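
Because every value in `model_config` is read with `os.environ.get`, a missing environment variable silently becomes `None` and only surfaces later, inside the model call. A minimal sketch of an early check follows. The helper name `missing_config_keys` and the sample values are hypothetical and are not part of any SDK.

```python
from typing import Any, Dict, List


def missing_config_keys(config: Dict[str, Any]) -> List[str]:
    """Return the keys whose values are None or empty strings."""
    return [key for key, value in config.items() if value is None or value == ""]


# Hypothetical configuration with placeholder values, for illustration only.
sample_config = {
    "azure_endpoint": "https://example.invalid/",
    "api_key": None,  # simulates an environment variable that was not set
    "azure_deployment": "",  # simulates a variable that was set but left empty
    "api_version": "placeholder-version",
    "type": "AzureOpenAI",
}

missing = missing_config_keys(sample_config)
if missing:
    print(f"Missing configuration values: {', '.join(missing)}")
```

Running the same check on the real `model_config` before calling `load_flow` would make a misconfigured `.env` file fail fast with a readable message.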

## Specify application Prompty
The following `application.prompty` file specifies how a chat application behaves.

In [3]:
%%writefile ./eval_assets/application.prompty
---
name: ApplicationPrompty
description: Chat RAG application
model:
    api: chat
    parameters:
        temperature: 0.0
        top_p: 1.0
        presence_penalty: 0
        frequency_penalty: 0
        response_format:
            type: text
 
inputs:
    context:
        type: string
    query:
        type: string
    conversation_history:
        type: dict
---
system:
You are a helpful assistant and you're helping with the user's query. 
Keep the conversation engaging and interesting.

Keep your conversation grounded in the provided context: 
{{ context }}

Output with a string that continues the conversation, responding to 
the latest message from the user query, in the same language:
{{ query }}

given the conversation history:
{{ conversation_history }}

Overwriting ./eval_assets/application.prompty
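
The body of the Prompty file is a template whose `{{ context }}`, `{{ query }}`, and `{{ conversation_history }}` placeholders are filled from the declared inputs. The sketch below illustrates that substitution with a plain regular expression. It is a deliberate simplification for illustration, not how promptflow renders templates, and `render_placeholders` is a hypothetical helper.

```python
import re
from typing import Any

# Matches {{ name }} with optional whitespace around the name.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_placeholders(template: str, **values: Any) -> str:
    """Replace each {{ name }} with str(values[name]); unknown names are left untouched."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


template = (
    "Keep your conversation grounded in the provided context:\n"
    "{{ context }}\n"
    "Respond to the latest message:\n"
    "{{ query }}\n"
    "given the conversation history:\n"
    "{{ conversation_history }}"
)

rendered = render_placeholders(
    template,
    context="Friendly chat",
    query="What are you doing tonight?",
    conversation_history=[],
)
print(rendered)
```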


In [4]:
prompty_path = "eval_assets/application.prompty"
flow = load_flow(source=prompty_path, model={"configuration": model_config})

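# The query is Italian for "How to make a homemade pizza"; the prompt asks for a reply in the same language.
pprint(flow(context="", query="Come preparare una pizza fatta in casa", conversation_history=[]))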

("Preparare una pizza fatta in casa è un'ottima idea! Inizia con l'impasto: "
 'mescola 500g di farina, 300ml di acqua tiepida, 15g di lievito fresco, un '
 'cucchiaino di zucchero e 10g di sale. Impasta fino a ottenere una '
 'consistenza liscia e elastica, poi lascia lievitare per circa 2 ore. Una '
 "volta lievitato, stendi l'impasto su una teglia e aggiungi i tuoi "
 'ingredienti preferiti, come salsa di pomodoro, mozzarella e basilico. Cuoci '
 'in forno preriscaldato a 220°C per circa 10-15 minuti. Buon appetito!')


In [5]:
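# The sample conversation is in Italian; English translations are given in the comments.
context = "Conversazione amichevole fra due persone"  # "Friendly conversation between two people"

conversation_history = [
    {
        "content": "Cosa fai di bello questa sera?",  # "What are you up to this evening?"
        "role": "user",
        "context": "Domanda amichevole",  # "Friendly question"
    },
    {
        "content": "Vado al cinema",  # "I'm going to the cinema"
        "role": "assistant",
        "context": "Esposizione di un'attività ricreativa divertente",  # "Statement of a fun recreational activity"
    },
    {
        "content": "Ah fantastico! Che cosa vai a vedere?",  # "Oh great! What are you going to see?"
        "role": "user",
        # "Further friendly question showing interest in the other person"
        "context": "Ulteriore domanda amichevole a dimostrare interesse verso l'interlocutore",
    },
    {
        "content": "Lo SQUALO di Steven Spielberg",  # "Jaws by Steven Spielberg"
        "role": "assistant",
        # "Specific information about a fun activity one intends to do"
        "context": "Informazione specifica di un'attività divertente che si intende compiere",
    },
]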

flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

'Wow, un classico! È sempre emozionante vedere Lo SQUALO sul grande schermo. Sei un fan dei film di Spielberg?'

In [11]:
# Print the conversation so far
for i, ch in enumerate(conversation_history):
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')

# Ask the flow for the next turn, using the last message as the query
response = flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

# Append the reply with the role opposite to that of the last speaker
next_role = "user" if conversation_history[-1]["role"] == "assistant" else "assistant"
conversation_history.append({"content": response, "role": next_role, "context": ""})
print(f'Message {len(conversation_history) - 1} - {next_role}: {response}\n')

Message 0 - user: Cosa fai di bello questa sera?

Message 1 - assistant: Vado al cinema

Message 2 - user: Ah fantastico! Che cosa vai a vedere?

Message 3 - assistant: Lo SQUALO di Steven Spielberg

Message 4 - user: Che scelta interessante! È un classico che non perde mai il suo fascino. Sei un fan dei film di Spielberg o è la prima volta che vedi "Lo SQUALO"?

Message 5 - assistant: Sono un grande fan dei film di Spielberg! Ho visto "Lo SQUALO" diverse volte, ma ogni volta è come se fosse la prima. C'è qualcosa di magico nel modo in cui Spielberg riesce a creare suspense e tensione. Tu hai un film preferito di Spielberg?

Message 6 - user: Anche io adoro i film di Spielberg! È difficile scegliere un preferito, ma "E.T. l'extra-terrestre" ha sempre avuto un posto speciale nel mio cuore. La storia di amicizia e avventura è davvero commovente. Qual è il tuo preferito?

Message 7 - assistant: "E.T. è davvero un capolavoro! Spielberg ha un talento unico nel raccontare storie che toccano 
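
The cell above extends the conversation by giving each new message the role opposite to that of the previous speaker. A minimal sketch of that alternation on its own, with no model call, is shown here. `next_role` and `append_turn` are hypothetical helpers, and the fixed strings stand in for the text that `flow(...)` would return.

```python
from typing import Dict, List


def next_role(history: List[Dict[str, str]]) -> str:
    """Return the role that speaks next, assuming strict user/assistant alternation."""
    if not history:
        return "user"  # assumption: an empty conversation starts with the user
    return "user" if history[-1]["role"] == "assistant" else "assistant"


def append_turn(history: List[Dict[str, str]], content: str) -> None:
    """Append a message with the alternating role and an empty context."""
    history.append({"content": content, "role": next_role(history), "context": ""})


history = [
    {"content": "What are you doing tonight?", "role": "user", "context": ""},
    {"content": "I'm going to the cinema", "role": "assistant", "context": ""},
]
append_turn(history, "Great! What are you going to see?")
append_turn(history, "Jaws")  # stand-in text where the notebook would call the flow
print([message["role"] for message in history])
```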

## Specify target callback to simulate against
You can simulate against any application endpoint by specifying a target callback function. The following example assumes the application is an LLM driven by a Prompty file such as `application.prompty`.

In [12]:
async def callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,  # noqa: ANN401
    context: Optional[Dict[str, Any]] = None,
    subfolder: str = "eval_assets",
) -> dict:
    messages_list = messages["messages"]
    latest_message = messages_list[-1] # Get the last message
    query = latest_message["content"]
    context = latest_message.get("context", None) # looks for context, default None
    # Call your endpoint or AI application here
    prompty_path = os.path.join(os.getcwd(), subfolder, "application.prompty") 
    _flow = load_flow(source=prompty_path, model={"configuration": model_config})
    response = _flow(query=query, context=context, conversation_history=messages_list)
    # Format the response to follow the OpenAI chat protocol
    formatted_response = {
        "content": response,
        "role": "assistant",  # the target application always replies as the assistant
        "context": context,
    }
    messages["messages"].append(formatted_response)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context
    }
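
Before pointing the simulator at a callback, it can help to exercise the same message protocol with a stub that makes no model call. The sketch below mirrors the signature and return shape of the callback above. `stub_callback` and the sample payload are hypothetical, and the payload shape is an assumption based on how the callback reads `messages["messages"]`.

```python
import asyncio
from typing import Any, Dict, Optional


async def stub_callback(
    messages: Dict[str, Any],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the real callback: echoes the query instead of calling a model."""
    latest_message = messages["messages"][-1]
    reply = {
        "content": f"You asked: {latest_message['content']}",
        "role": "assistant",
        "context": latest_message.get("context"),
    }
    messages["messages"].append(reply)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }


# Assumed payload shape: a dict holding the running list of chat messages.
payload = {"messages": [{"role": "user", "content": "Who painted the Mona Lisa?", "context": None}]}
result = asyncio.run(stub_callback(payload))  # in a notebook, use `await stub_callback(payload)`
print(result["messages"][-1])
```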

## Generate text or index-based synthetic data as input
First, we prepare the text from which the simulator's input queries are generated.

In [13]:
import wikipedia  # requires the wikipedia package

# Prepare the text to send to the simulator
wiki_search_term = "Leonardo da Vinci"
wiki_title = wikipedia.search(wiki_search_term)[0]
wiki_page = wikipedia.page(wiki_title)
wiki_text = wiki_page.summary[:5000]
print(f"{wiki_text[:100]}...")

Leonardo di ser Piero da Vinci (15 April 1452 – 2 May 1519) was an Italian polymath of the High Rena...
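
Slicing the summary with `[:5000]` can cut the text in the middle of a sentence. One possible refinement is to cut at the last sentence boundary before the limit. This is a sketch only: `truncate_at_sentence` is a hypothetical helper, and treating `". "` as a sentence boundary is a rough assumption.

```python
def truncate_at_sentence(text: str, limit: int) -> str:
    """Cut text to at most `limit` characters, preferring the last sentence end before the limit."""
    if len(text) <= limit:
        return text
    clipped = text[:limit]
    last_period = clipped.rfind(". ")
    # Fall back to the hard cut when no sentence boundary is found.
    return clipped[: last_period + 1] if last_period != -1 else clipped


sample = "Leonardo was a painter. He was also an engineer. He kept many notebooks."
short = truncate_at_sentence(sample, 50)
print(short)
```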


In [14]:
from azure.ai.evaluation.simulator import Simulator
simulator = Simulator(model_config=model_config)

Class Simulator: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


In [15]:
outputs = await simulator(
    num_queries=1,  # Number of queries
    text=wiki_text,
    target=callback,
)

Generating: 100%|████████████████████████████████████████████████| 5/5 [00:22<00:00,  4.53s/message]


In [16]:
outputs

[{'messages': [{'role': 'user',
    'content': 'What was the sale price of Salvator Mundi attributed to Leonardo at auction in 2017?',
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The "Salvator Mundi" attributed to Leonardo da Vinci was sold at auction in 2017 for a staggering $450.3 million, making it the most expensive painting ever sold at auction. It\'s quite a fascinating piece with a mysterious history!',
    'context': 'None'},
   {'role': 'user',
    'content': "That's incredible! Do you know where the painting is currently located or who owns it now?",
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The "Salvator Mundi" is believed to be owned by Saudi Arabia\'s Crown Prince Mohammed bin Salman. There have been reports suggesting that the painting might be housed on his yacht or possibly in storage. There was also speculation about it being displayed at the Louvre Abu Dhabi, but as of now, its exact location remains somewhat of a mystery. I

In [17]:
for i, ch in enumerate(outputs[0]["messages"]):
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')

Message 0 - user: What was the sale price of Salvator Mundi attributed to Leonardo at auction in 2017?

Message 1 - assistant: The "Salvator Mundi" attributed to Leonardo da Vinci was sold at auction in 2017 for a staggering $450.3 million, making it the most expensive painting ever sold at auction. It's quite a fascinating piece with a mysterious history!

Message 2 - user: That's incredible! Do you know where the painting is currently located or who owns it now?

Message 3 - assistant: The "Salvator Mundi" is believed to be owned by Saudi Arabia's Crown Prince Mohammed bin Salman. There have been reports suggesting that the painting might be housed on his yacht or possibly in storage. There was also speculation about it being displayed at the Louvre Abu Dhabi, but as of now, its exact location remains somewhat of a mystery. It's fascinating how such a renowned artwork can have such an elusive presence!

Message 4 - user: It's intriguing how the whereabouts of such a famous painting c
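
Each simulated conversation is a list of alternating user and assistant messages, so it can be flattened into query/response pairs for later evaluation. The sketch below works on plain dictionaries shaped like the output shown above. `to_query_response_pairs` is a hypothetical helper, and the sample conversation is made up for illustration.

```python
import json
from typing import Dict, List


def to_query_response_pairs(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Pair each user message with the assistant message that immediately follows it."""
    pairs = []
    for current, following in zip(messages, messages[1:]):
        if current["role"] == "user" and following["role"] == "assistant":
            pairs.append({"query": current["content"], "response": following["content"]})
    return pairs


# Hypothetical conversation shaped like one element of `outputs`.
sample_conversation = {
    "messages": [
        {"role": "user", "content": "Q1", "context": "None"},
        {"role": "assistant", "content": "A1", "context": "None"},
        {"role": "user", "content": "Q2", "context": "None"},
        {"role": "assistant", "content": "A2", "context": "None"},
        {"role": "user", "content": "Q3", "context": "None"},  # no reply yet, so it is skipped
    ]
}

pairs = to_query_response_pairs(sample_conversation["messages"])
# One JSON object per line (JSONL), keeping non-ASCII characters readable.
jsonl = "\n".join(json.dumps(pair, ensure_ascii=False) for pair in pairs)
print(jsonl)
```

Writing `jsonl` to a file would give one query/response record per line, which is a common input shape for evaluation tooling.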