# [Generate synthetic and simulated data for evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/simulator-interaction-data)
**Azure AI Evaluation SDK's** `Simulator` provides an end-to-end synthetic data generation capability to help developers test their application's response to typical user queries in the absence of production data. AI developers can use an index-based or text-based query generator and a fully customizable simulator to create robust test datasets around non-adversarial tasks specific to their application. The `Simulator` class generates synthetic conversations and simulates task-based interactions. This capability is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.

In [1]:
# !az login

In [2]:
# Constants and Libraries
import json
import os
import sys  # needed for sys.exit() below
from azure.identity import DefaultAzureCredential, get_bearer_token_provider #requires azure-identity
from pprint import pprint
from dotenv import load_dotenv # requires python-dotenv

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [3]:
# Initialize Azure OpenAI connection

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("MODEL_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": "AzureOpenAI" # NEEDED FOR \Lib\site-packages\promptflow\core\_prompty_utils.py
}
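
Because `os.environ.get` returns `None` for an unset variable, a missing setting would only surface later as a connection or authentication error. A minimal sketch of an early check follows. The helper name `missing_settings` and the example values are hypothetical and not part of any SDK.

```python
from typing import Any, Dict, List


def missing_settings(config: Dict[str, Any]) -> List[str]:
    """Return the keys whose values are None or empty strings."""
    return [key for key, value in config.items() if value is None or value == ""]


# Illustrative values only; in the notebook you would pass `model_config`.
example_config = {
    "azure_endpoint": "https://example.invalid",
    "api_key": None,
    "azure_deployment": "",
    "type": "AzureOpenAI",
}
print(missing_settings(example_config))
```

In the notebook, calling `missing_settings(model_config)` right after the cell above and stopping if the list is non-empty would make configuration problems visible before any model call.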

## Specify application Prompty
The following `application.prompty` file specifies how a chat application behaves.

In [4]:
%%writefile ./eval_assets/application.prompty
---
name: ApplicationPrompty
description: Chat RAG application
model:
    api: chat
    parameters:
        temperature: 0.0
        top_p: 1.0
        presence_penalty: 0
        frequency_penalty: 0
        response_format:
            type: text
 
inputs:
    context:
        type: string
    query:
        type: string
    conversation_history:
        type: dict
---
system:
You are a helpful assistant and you're helping with the user's query. 
Keep the conversation engaging and interesting.

Keep your conversation grounded in the provided context: 
{{ context }}

Output a string that continues the conversation, responding to
the latest message from the user query in the same language:
{{ query }}

given the conversation history:
{{ conversation_history }}

Overwriting ./eval_assets/application.prompty


In [5]:
from promptflow.client import load_flow
from pprint import pprint

prompty_path = "eval_assets/application.prompty"
flow = load_flow(source=prompty_path, model={"configuration": model_config})

pprint(flow(context="", query="Come preparare una pizza fatta in casa", conversation_history=[]))

('Certo! Ecco una ricetta semplice per preparare una pizza fatta in casa:\n'
 '\n'
 'Ingredienti per l’impasto:\n'
 '- 500 g di farina 00\n'
 '- 300 ml di acqua tiepida\n'
 '- 10 g di sale\n'
 '- 3 g di lievito di birra secco (oppure 10 g di lievito fresco)\n'
 '- 1 cucchiaio di olio extravergine d’oliva\n'
 '\n'
 'Procedimento:\n'
 '1. Sciogli il lievito nell’acqua tiepida.\n'
 '2. In una ciotola capiente, versa la farina e aggiungi il sale.\n'
 '3. Unisci l’acqua con il lievito e inizia a impastare.\n'
 '4. Aggiungi l’olio e continua a impastare fino a ottenere un impasto liscio '
 'ed elastico.\n'
 '5. Copri la ciotola con un canovaccio e lascia lievitare per almeno 2 ore '
 '(meglio se 4).\n'
 '6. Stendi l’impasto su una teglia leggermente unta o su carta forno.\n'
 '7. Condisci con salsa di pomodoro, mozzarella e gli ingredienti che '
 'preferisci.\n'
 '8. Cuoci in forno preriscaldato a 250°C per circa 10-15 minuti, finché la '
 'pizza non è dorata e croccante.\n'
 '\n'
 'Buon app

In [6]:
context = "Conversazione amichevole fra due persone"

conversation_history = [
    {
        "content": "Cosa fai di bello questa sera?",
        "role": "user",
        "context": "Domanda amichevole",
    },
    {
        "content": "Vado al cinema",
        "role": "assistant",
        "context": "Esposizione di un'attività ricreativa divertente",
    },
    {
        "content": "Ah fantastico! Che cosa vai a vedere?",
        "role": "user",
        "context": "Ulteriore domanda amichevole a dimostrare interesse verso l'interlocutore",
    },
    {
        "content": "Lo SQUALO di Steven Spielberg",
        "role": "assistant",
        "context": "Informazione specifica di un'attività divertente che si intende compiere",
    },
]

flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

'Che bello! È un grande classico, nonostante gli anni fa ancora paura! Vai da solo o con qualcuno?'

In [16]:
for i, ch in enumerate(conversation_history):
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')

response = flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

# Append the reply with the role opposite to that of the last speaker
conversation_history.append({"content": response, "role": "user" if conversation_history[-1]["role"]=="assistant" else "assistant", "context": ""})
print(f'Message {len(conversation_history) - 1} - {conversation_history[-1]["role"]}: {conversation_history[-1]["content"]}\n')

Message 0 - user: Cosa fai di bello questa sera?

Message 1 - assistant: Vado al cinema

Message 2 - user: Ah fantastico! Che cosa vai a vedere?

Message 3 - assistant: Lo SQUALO di Steven Spielberg

Message 4 - user: Che bello! È un grande classico, non passa mai di moda. Lo rivedi per nostalgia o è la prima volta che lo guardi al cinema?

Message 5 - assistant: Lo rivedo per nostalgia! L’ho visto solo in TV, ma mai sul grande schermo, quindi sono super curioso di vivere l’esperienza con l’audio e l’atmosfera del cinema. Tu l’hai mai visto al cinema?

Message 6 - user: No, purtroppo non ho mai avuto l’occasione di vederlo al cinema, solo in TV anche io! Deve essere davvero un’altra cosa con il suono avvolgente e la suspense che si respira in sala. Fammi sapere com’è stata l’esperienza, sono curioso! Magari la prossima volta ci vado anch’io.

Message 7 - assistant: Assolutamente, ti farò sapere com’è andata! Sono sicuro che sarà un’esperienza completamente diversa, magari con qualche s
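
The cell above grows the conversation by one message per run, alternating the role each time. The same alternation can be sketched as a loop that runs offline. Here `next_role`, `extend_conversation`, and `fake_respond` are hypothetical names, and `fake_respond` stands in for the call to `flow(...)` so that no model is needed.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def next_role(history: List[Message]) -> str:
    """Return the role that speaks next, alternating with the last speaker."""
    if not history:
        return "user"
    return "user" if history[-1]["role"] == "assistant" else "assistant"


def extend_conversation(
    history: List[Message],
    respond: Callable[[List[Message]], str],
    turns: int,
) -> List[Message]:
    """Append `turns` messages to `history`, alternating roles each turn."""
    for _ in range(turns):
        reply = respond(history)
        # The role is computed before the new message is appended.
        history.append({"content": reply, "role": next_role(history), "context": ""})
    return history


def fake_respond(history: List[Message]) -> str:
    """Stand-in for the model call: echoes the latest message."""
    return f"reply to: {history[-1]['content']}"


demo = [{"content": "Cosa fai di bello questa sera?", "role": "user", "context": ""}]
extend_conversation(demo, fake_respond, turns=3)
for i, message in enumerate(demo):
    print(f'Message {i} - {message["role"]}: {message["content"]}')
```

To use the real application, `respond` could wrap the flow call from the cell above, passing the latest message as `query` and the whole list as `conversation_history`.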

## Specify target callback to simulate against
You can simulate against any application endpoint by specifying a target callback function. The following example assumes the application is an LLM driven by a Prompty file such as `application.prompty`.

In [8]:
from typing import List, Dict, Any, Optional

async def callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,  # noqa: ANN401
    context: Optional[Dict[str, Any]] = None,
    subfolder: str = "eval_assets",
) -> dict:
    messages_list = messages["messages"]
    # Get the last message
    latest_message = messages_list[-1]
    query = latest_message["content"]
    context = latest_message.get("context", None) # looks for context, default None
    # Call your endpoint or AI application here
    prompty_path = os.path.join(os.getcwd(), subfolder, "application.prompty") 
    _flow = load_flow(source=prompty_path, model={"configuration": model_config})
    response = _flow(query=query, context=context, conversation_history=messages_list)
    # Format the response to follow the OpenAI chat protocol
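    formatted_response = {
        "content": response,
        # Alternate with the role of the latest incoming message
        # (not with the notebook-level conversation_history list)
        "role": "user" if messages_list[-1]["role"] == "assistant" else "assistant",
        "context": context,
    }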
    messages["messages"].append(formatted_response)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context
    }
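
The message shape that the callback receives and returns can be checked without calling a model. The sketch below uses a hypothetical `echo_callback` with the same parameters and return keys as the callback above. It only echoes the latest message, so it says nothing about how the real application behaves.

```python
import asyncio
from typing import Any, Dict, Optional


async def echo_callback(
    messages: Dict[str, Any],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Stand-in target: answers by echoing the latest user message."""
    latest_message = messages["messages"][-1]
    reply = {
        "content": f"You said: {latest_message['content']}",
        "role": "assistant",
        "context": latest_message.get("context"),
    }
    messages["messages"].append(reply)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }


request = {"messages": [{"role": "user", "content": "Hello", "context": None}]}
result = asyncio.run(echo_callback(request))
print(result["messages"][-1])
```

Inside a notebook an event loop is already running, so `await echo_callback(request)` would replace the `asyncio.run(...)` call.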

## Generate text or index-based synthetic data as input
First, prepare the text that the simulator will use to generate queries.

In [9]:
import wikipedia  # requires the wikipedia package
# Prepare the text to send to the simulator
wiki_search_term = "Leonardo da Vinci"
wiki_title = wikipedia.search(wiki_search_term)[0]
wiki_page = wikipedia.page(wiki_title)
wiki_text = wiki_page.summary[:5000]
print(f"{wiki_text[:100]}...")

Leonardo di ser Piero da Vinci (15 April 1452 – 2 May 1519) was an Italian polymath of the High Rena...
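
Slicing the summary with `[:5000]` can cut the text in the middle of a sentence. One possible refinement, sketched below under the assumption that sentences end with a full stop followed by a space, trims back to the last complete sentence. The helper `truncate_at_sentence` is hypothetical.

```python
def truncate_at_sentence(text: str, max_chars: int) -> str:
    """Cut `text` to at most `max_chars`, ending at the last full stop if one exists."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    last_stop = clipped.rfind(". ")
    # Fall back to the hard cut when no sentence boundary is found.
    return clipped[: last_stop + 1] if last_stop != -1 else clipped


sample = "First sentence. Second sentence. Third sentence is long."
print(truncate_at_sentence(sample, 40))
```

In the cell above, `truncate_at_sentence(wiki_page.summary, 5000)` could take the place of the plain slice.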


In [10]:
from azure.ai.evaluation.simulator import Simulator
simulator = Simulator(model_config=model_config)

Class Simulator: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


In [11]:
outputs = await simulator(
    num_queries=1,  # number of queries to generate from the text
    text=wiki_text,
    target=callback,
)

Generating: 100%|████████████████████████████████████████████████| 5/5 [00:52<00:00, 10.51s/message]


In [12]:
outputs

[{'messages': [{'role': 'user',
    'content': 'What is the full name and birth date of the Italian polymath of the High Renaissance known for painting, engineering, and science?',
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The full name of the Italian polymath is Leonardo di ser Piero da Vinci. He was born on April 15, 1452.',
    'context': 'None'},
   {'role': 'user',
    'content': "Can you tell me more about Leonardo da Vinci's most famous works or inventions? I'm interested in learning about his contributions to both art and science.",
    'context': 'None'},
   {'role': 'assistant',
    'content': 'Absolutely! Leonardo da Vinci is celebrated for his remarkable achievements in both art and science.\n\nIn art, his most famous works include:\n\n- The Mona Lisa: Perhaps the most iconic painting in the world, known for its mysterious smile and masterful technique.\n- The Last Supper: A monumental mural depicting Jesus and his disciples, renowned for its composit

In [13]:
for i, ch in enumerate(outputs[0]["messages"]):
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')

Message 0 - user: What is the full name and birth date of the Italian polymath of the High Renaissance known for painting, engineering, and science?

Message 1 - assistant: The full name of the Italian polymath is Leonardo di ser Piero da Vinci. He was born on April 15, 1452.

Message 2 - user: Can you tell me more about Leonardo da Vinci's most famous works or inventions? I'm interested in learning about his contributions to both art and science.

Message 3 - assistant: Absolutely! Leonardo da Vinci is celebrated for his remarkable achievements in both art and science.

In art, his most famous works include:

- The Mona Lisa: Perhaps the most iconic painting in the world, known for its mysterious smile and masterful technique.
- The Last Supper: A monumental mural depicting Jesus and his disciples, renowned for its composition and emotional depth.
- Vitruvian Man: A famous drawing illustrating ideal human proportions, blending art and anatomical science.

In science and engineering, L
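
The simulator output is a list of conversations, each with a `messages` list of `role`, `content`, and `context` entries. A minimal sketch of flattening that structure into one row per user and assistant exchange, serialized as JSON Lines, is shown below. The function name `to_query_response_rows`, the column names `query` and `response`, and the abbreviated sample data are assumptions for illustration.

```python
import json
from typing import Any, Dict, List


def to_query_response_rows(conversations: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Pair each user message with the assistant message that follows it."""
    rows = []
    for conversation in conversations:
        messages = conversation["messages"]
        for current, following in zip(messages, messages[1:]):
            if current["role"] == "user" and following["role"] == "assistant":
                rows.append({"query": current["content"], "response": following["content"]})
    return rows


# Abbreviated, illustrative data in the same shape as `outputs` above.
sample_outputs = [
    {
        "messages": [
            {"role": "user", "content": "Who was Leonardo da Vinci?", "context": "None"},
            {"role": "assistant", "content": "An Italian polymath of the High Renaissance.", "context": "None"},
            {"role": "user", "content": "When was he born?", "context": "None"},
            {"role": "assistant", "content": "On 15 April 1452.", "context": "None"},
        ]
    }
]

rows = to_query_response_rows(sample_outputs)
jsonl = "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)
print(jsonl)
```

Writing `jsonl` to a file would give a dataset with one exchange per line. Whether those column names match what a given evaluator expects should be checked against that evaluator's documentation.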