# [Generate synthetic and simulated data for evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/simulator-interaction-data)
**Azure AI Evaluation SDK's** `Simulator` class generates synthetic data end to end, so developers can test how their application responds to typical user queries when no production data is available. With an index-based or text-based query generator and a fully customizable simulator, AI developers can build test datasets for non-adversarial tasks specific to their application. The class generates synthetic conversations and simulates task-based interactions, which is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.

In [1]:
# !az login

In [2]:
# Constants and Libraries
import os, json, sys
from azure.identity import DefaultAzureCredential, get_bearer_token_provider #requires azure-identity
from pprint import pprint
from dotenv import load_dotenv # requires python-dotenv

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [3]:
# Initialize Azure AI project and Azure OpenAI connection
azure_ai_project = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("RESOURCE_GROUP_NAME"),
    "project_name": os.environ.get("PROJECT_NAME"),
}

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
}
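
Because `os.environ.get` returns `None` for unset variables, a missing setting only surfaces later as a failed request. The following is a minimal sketch of an early check, assuming the variable names used in the cell above; `missing_settings` is a hypothetical helper, not part of the SDK.

```python
import os
from typing import List, Mapping, Sequence

# Variable names taken from the configuration cell above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
)


def missing_settings(env: Mapping[str, str], names: Sequence[str] = REQUIRED_VARS) -> List[str]:
    """Return the names that are unset or empty in `env` (hypothetical helper)."""
    return [name for name in names if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running the check before building `model_config` makes a misconfigured `.env` file visible immediately.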

## Generate text or index-based synthetic data as input
In the first part, we prepare the text from which the simulator generates its input queries.

In [4]:
import asyncio
from azure.identity import DefaultAzureCredential
import wikipedia
import os
from typing import List, Dict, Any, Optional
# Prepare the text to send to the simulator
wiki_search_term = "Leonardo da Vinci"
wiki_title = wikipedia.search(wiki_search_term)[0]
wiki_page = wikipedia.page(wiki_title)
text = wiki_page.summary[:5000]
print(f"{text[:100]}...")

Leonardo di ser Piero da Vinci (15 April 1452 – 2 May 1519) was an Italian polymath of the High Rena...
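
The slice `summary[:5000]` cuts the text at a fixed character count, which can end it mid-sentence. As a sketch only, the hypothetical helper below (not part of the `wikipedia` package or the SDK) trims back to the last full stop when the text exceeds the limit.

```python
def truncate_at_sentence(text: str, max_chars: int = 5000) -> str:
    """Cut `text` to at most `max_chars`, ending at the last full stop if one exists."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    last_stop = clipped.rfind(". ")
    # Fall back to the hard cut when no sentence boundary is found.
    return clipped[: last_stop + 1] if last_stop != -1 else clipped


sample = "First sentence. Second sentence. Third sentence."
print(truncate_at_sentence(sample, max_chars=30))
```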


## Specify application Prompty
The following `application.prompty` file specifies how a chat application behaves.

In [5]:
%%writefile ./eval_assets/application.prompty
---
name: ApplicationPrompty
description: Chat RAG application
model:
    api: chat
    parameters:
        temperature: 0.0
        top_p: 1.0
        presence_penalty: 0
        frequency_penalty: 0
        response_format:
            type: text
 
inputs:
    context:
        type: string
    query:
        type: string
    conversation_history:
        type: dict
---
system:
You are a helpful assistant and you're helping with the user's query. 
Keep the conversation engaging and interesting.

Keep your conversation grounded in the provided context: 
{{ context }}

Output with a string that continues the conversation, responding to 
the latest message from the user query, using the same language:
{{ query }}

given the conversation history:
{{ conversation_history }}

Overwriting ./eval_assets/application.prompty
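
The `{{ context }}`, `{{ query }}`, and `{{ conversation_history }}` placeholders in the Prompty body are filled from the inputs passed when the flow is called. The sketch below only illustrates that substitution with a regular expression; it is not how Prompty renders templates internally, and `render_placeholders` is a hypothetical helper.

```python
import re
from typing import Any, Mapping

# Matches {{ name }} with optional whitespace around the name.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_placeholders(template: str, inputs: Mapping[str, Any]) -> str:
    """Replace each {{ name }} with str(inputs[name]); unknown names are left untouched."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(inputs[name]) if name in inputs else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


template = "Context: {{ context }}\nQuery: {{ query }}"
print(render_placeholders(template, {"context": "Leonardo da Vinci", "query": "When was he born?"}))
```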


In [9]:
from promptflow.client import load_flow

model_config["type"] = "AzureOpenAI"

prompty_path = "eval_assets/application.prompty"
flow = load_flow(source=prompty_path, model={"configuration": model_config})

flow(context="", query="Come preparare una pizza fatta in casa", conversation_history=[])

'Certo! Ecco una ricetta semplice per preparare una pizza fatta in casa:\n\nIngredienti per l’impasto:\n- 500 g di farina 00\n- 300 ml di acqua tiepida\n- 10 g di sale\n- 3 g di lievito di birra secco (oppure 10 g fresco)\n- 1 cucchiaio di olio extravergine d’oliva\n\nProcedimento:\n1. Sciogli il lievito in poca acqua tiepida.\n2. In una ciotola capiente, versa la farina e aggiungi il lievito sciolto.\n3. Inizia a impastare aggiungendo gradualmente l’acqua. Quando l’impasto inizia a formarsi, aggiungi il sale e l’olio.\n4. Continua a impastare fino a ottenere un impasto liscio ed elastico (circa 10 minuti).\n5. Copri la ciotola con un panno e lascia lievitare per almeno 2 ore, fino al raddoppio del volume.\n6. Stendi l’impasto su una teglia leggermente unta o su carta forno.\n7. Condisci con salsa di pomodoro, mozzarella e gli ingredienti che preferisci.\n8. Cuoci in forno preriscaldato a 250°C per circa 10-15 minuti, finché la pizza non è dorata e croccante.\n\nVuoi qualche consiglio 

In [46]:
context = "Conversazione amichevole fra due persone"

query = "Lo SQUALO di Steven Spielberg"

conversation_history = [
    {
        "content": "Cosa fai di bello questa sera?",
        "role": "user",
        "context": "Domanda amichevole",
    },
    {
        "content": "Vado al cinema",
        "role": "assistant",
        "context": "Esposizione di un'attività ricreativa divertente",
    },
    {
        "content": "Ah fantastico! Che cosa vai a vedere?",
        "role": "user",
        "context": "Ulteriore domanda amichevole a dimostrare interesse verso l'interlocutore",
    },
    {
        "content": "Lo SQUALO di Steven Spielberg",
        "role": "assistant",
        "context": "Informazione specifica di un'attività divertente che si intende compiere",
    },
]

flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

'Che bello! È un grande classico, non passa mai di moda. Lo rivedi per nostalgia o è la prima volta che lo guardi al cinema?'

In [53]:
i = 0
for ch in conversation_history:
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')
    i += 1
    
response = flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

conversation_history.append({"content": response, "role": "user" if conversation_history[-1]["role"]=="assistant" else "assistant", "context": ""})
print(f'Message {i} - {conversation_history[-1]["role"]}: {conversation_history[-1]["content"]}\n')

Message 0 - user: Cosa fai di bello questa sera?

Message 1 - assistant: Vado al cinema

Message 2 - user: Ah fantastico! Che cosa vai a vedere?

Message 3 - assistant: Lo SQUALO di Steven Spielberg

Message 4 - user: Che bello! È un grande classico, nonostante gli anni fa ancora paura! Vai da solo o in compagnia?

Message 5 - assistant: Vado con un paio di amici, così se ci spaventiamo almeno possiamo ridere insieme! Tu l’hai mai visto al cinema o solo in TV?

Message 6 - user: L’ho visto solo in TV, purtroppo! Deve essere tutta un’altra esperienza vederlo sul grande schermo, con l’audio che ti fa saltare sulla sedia a ogni nota della colonna sonora. Fammi sapere poi com’è stato, sono curioso di sapere se vi siete spaventati davvero!

Message 7 - assistant: Promesso, ti aggiornerò! Sono sicuro che qualche salto sulla sedia ci scapperà, soprattutto con quella musica inconfondibile che ti mette subito in tensione. Magari poi ci scappa anche una pizza post-film per smaltire la paura! Tu invece, pr
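
When a conversation is extended one turn at a time, each appended message should take the role opposite to the previous one. The sketch below isolates that rule in two hypothetical helpers, `next_role` and `append_turn`, assuming strict user/assistant alternation and that the user opens the conversation.

```python
from typing import Dict, List


def next_role(history: List[Dict[str, str]]) -> str:
    """Return the role that speaks next, assuming strict user/assistant alternation."""
    if not history:
        return "user"  # assumption: the user opens the conversation
    return "user" if history[-1]["role"] == "assistant" else "assistant"


def append_turn(history: List[Dict[str, str]], content: str, context: str = "") -> None:
    """Append `content` to `history` under the role that is due to speak."""
    history.append({"content": content, "role": next_role(history), "context": context})


demo_history: List[Dict[str, str]] = []
for line in ["Hi!", "Hello, how can I help?", "Tell me about sharks."]:
    append_turn(demo_history, line)
print([message["role"] for message in demo_history])
```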

In [12]:
context = text
query = "On what date was Leonardo di ser Piero da Vinci born?"
conversation_history={}
flow(context=context, query=query, conversation_history=conversation_history)

'Leonardo di ser Piero da Vinci was born on 15 April 1452. If you’re curious, his birthplace was in or near the town of Vinci, which is where his surname comes from! Would you like to know more about his early life or perhaps his famous works?'

## Specify target callback to simulate against
You can simulate against any application endpoint by specifying a target callback function. The following example targets an application that is an LLM with a Prompty file, `application.prompty`.

In [9]:
async def callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,  # noqa: ANN401
    context: Optional[Dict[str, Any]] = None,
    subfolder: str = "eval_assets",
) -> dict:
    messages_list = messages["messages"]
    # Get the last message
    latest_message = messages_list[-1]
    query = latest_message["content"]
    context = latest_message.get("context", None) # looks for context, default None
    # Call your endpoint or AI application here
    current_dir = os.getcwd() # os.path.dirname(__file__) doesn't work in an interactive environment
    prompty_path = os.path.join(current_dir, subfolder, "application.prompty")
    _flow = load_flow(source=prompty_path, model={"configuration": model_config})
    response = _flow(query=query, context=context, conversation_history=messages_list)
    # Format the response to follow the OpenAI chat protocol
    formatted_response = {
        "content": response,
        "role": "assistant",
        "context": context,
    }
    messages["messages"].append(formatted_response)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context
    }
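
Before wiring a callback into the simulator, it can help to exercise the message protocol without calling a model. The sketch below uses a hypothetical stand-in, `echo_callback`, which has the same shape as the callback above but echoes the latest message instead of loading the Prompty flow.

```python
import asyncio
from typing import Any, Dict, Optional


async def echo_callback(
    messages: Dict[str, Any],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> dict:
    """Stand-in target: echoes the latest message instead of calling a model."""
    latest = messages["messages"][-1]
    reply = {
        "content": f"Echo: {latest['content']}",
        "role": "assistant",
        "context": latest.get("context"),
    }
    messages["messages"].append(reply)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }


request = {"messages": [{"role": "user", "content": "Hello", "context": None}]}
# asyncio.run works in a plain script; inside a notebook, use `await echo_callback(request)`.
result = asyncio.run(echo_callback(request))
print(result["messages"][-1])
```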

In [10]:
model_config

{'azure_endpoint': 'https://mmai-swc-hub01-oais581696736083.openai.azure.com/',
 'api_key': '<redacted>',
 'azure_deployment': 'gpt-4.1',
 'api_version': '2025-03-01-preview',
 'type': 'AzureOpenAI'}
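
Displaying the configuration dictionary prints the API key along with everything else. A minimal sketch of masking secrets before printing follows; `masked` is a hypothetical helper and the values in `example_config` are placeholders.

```python
from typing import Any, Dict, Sequence


def masked(config: Dict[str, Any], secret_keys: Sequence[str] = ("api_key",)) -> Dict[str, Any]:
    """Return a copy of `config` with non-empty secret values replaced by a fixed mask."""
    return {
        key: ("***" if key in secret_keys and value else value)
        for key, value in config.items()
    }


# Placeholder values for illustration only.
example_config = {
    "azure_endpoint": "https://example.invalid/",
    "api_key": "not-a-real-key",
    "azure_deployment": "my-deployment",
}
print(masked(example_config))
```

The original dictionary is left unchanged, so it can still be passed to the simulator.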

In [11]:
from azure.ai.evaluation.simulator import Simulator
# model_config["type"] = "AzureOpenAI" # "azure_openai" or "AzureOpenAI" 
simulator = Simulator(model_config=model_config)

Class Simulator: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


In [21]:
outputs = await simulator(
    target=callback,
    text="Il Duomo di Milano è una delle cattedrali più iconiche al mondo e il simbolo della città. La sua costruzione iniziò nel 1386 e si protrasse per secoli, con il completamento ufficiale nel 1932. È la chiesa più grande d'Italia e una delle più imponenti a livello mondiale",  # or pass the Wikipedia `text` prepared earlier
    num_queries=1,  # Minimal number of queries
)

Generating: 100%|████████████████████████████████████████████████| 5/5 [00:52<00:00, 10.48s/message]


In [22]:
outputs

[{'messages': [{'role': 'user',
    'content': 'In which year did the construction of Il Duomo di Milano officially complete?',
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The construction of Il Duomo di Milano (Milan Cathedral) officially completed in 1965, when the last gate was inaugurated. While the main structure was largely finished in 1813, work continued for centuries, with various details and decorations added over time. The cathedral is famous for its incredibly long construction period, spanning nearly six centuries!',
    'context': 'None'},
   {'role': 'user',
    'content': "That's fascinating! Are there any unique architectural features or artworks inside the cathedral that I shouldn't miss if I visit?",
    'context': 'None'},
   {'role': 'assistant',
    'content': "Absolutely! Il Duomo di Milano is filled with remarkable features and artworks. Here are a few highlights you shouldn't miss:\n\n1. The Forest of Spires: The cathedral boasts over 135 s

In [23]:
outputs[0]["messages"]

[{'role': 'user',
  'content': 'In which year did the construction of Il Duomo di Milano officially complete?',
  'context': 'None'},
 {'role': 'assistant',
  'content': 'The construction of Il Duomo di Milano (Milan Cathedral) officially completed in 1965, when the last gate was inaugurated. While the main structure was largely finished in 1813, work continued for centuries, with various details and decorations added over time. The cathedral is famous for its incredibly long construction period, spanning nearly six centuries!',
  'context': 'None'},
 {'role': 'user',
  'content': "That's fascinating! Are there any unique architectural features or artworks inside the cathedral that I shouldn't miss if I visit?",
  'context': 'None'},
 {'role': 'assistant',
  'content': "Absolutely! Il Duomo di Milano is filled with remarkable features and artworks. Here are a few highlights you shouldn't miss:\n\n1. The Forest of Spires: The cathedral boasts over 135 spires, each adorned with statues o
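
Each simulated conversation is a dictionary with a `messages` list of alternating user and assistant turns. To feed an evaluator that expects query/response pairs, the turns can be paired up and written as JSON Lines. The sketch below does this with plain Python on placeholder data; `to_query_response_pairs` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Any, Dict, Iterator


def to_query_response_pairs(conversation: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Yield one query/response dict per user message directly followed by an assistant message."""
    msgs = conversation["messages"]
    for current, following in zip(msgs, msgs[1:]):
        if current["role"] == "user" and following["role"] == "assistant":
            yield {"query": current["content"], "response": following["content"]}


# Placeholder conversation with the same shape as the simulator output above.
sample_outputs = [
    {
        "messages": [
            {"role": "user", "content": "Q1", "context": "None"},
            {"role": "assistant", "content": "A1", "context": "None"},
            {"role": "user", "content": "Q2", "context": "None"},
            {"role": "assistant", "content": "A2", "context": "None"},
        ]
    }
]
jsonl_lines = [
    json.dumps(pair)
    for conversation in sample_outputs
    for pair in to_query_response_pairs(conversation)
]
print("\n".join(jsonl_lines))
```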