# [Generate synthetic and simulated data for evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/simulator-interaction-data)
The **Azure AI Evaluation SDK's** `Simulator` provides an end-to-end synthetic data generation capability that helps developers test their application's responses to typical user queries in the absence of production data. AI developers can use an index- or text-based query generator and a fully customizable simulator to create robust test datasets around non-adversarial tasks specific to their application. The `Simulator` class generates synthetic conversations and simulates task-based interactions. This capability is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.

In [1]:
# !az login

In [2]:
# Constants and Libraries
import json
import os
import sys  # needed for sys.exit() below
from pprint import pprint
from typing import Any, Dict, List, Optional

from azure.identity import DefaultAzureCredential, get_bearer_token_provider  # requires azure-identity
from dotenv import load_dotenv  # requires python-dotenv
from promptflow.client import load_flow  # requires promptflow

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [3]:
# Initialize Azure OpenAI connection

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("MODEL_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": "AzureOpenAI" # NEEDED FOR \Lib\site-packages\promptflow\core\_prompty_utils.py
}
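Because `os.environ.get` returns `None` for an unset variable, a missing setting in `model_config` only surfaces later, typically as an authentication or deployment error. The following is a minimal sketch of an early check, assuming the same key names as above. The helper name `missing_config_keys` and the example values are hypothetical and not part of any SDK.

```python
from typing import Any, Dict, List


def missing_config_keys(config: Dict[str, Any]) -> List[str]:
    """Return the keys whose values are None or an empty string."""
    return [key for key, value in config.items() if value in (None, "")]


# Hypothetical placeholder values; in the notebook, pass `model_config` instead.
example_config = {
    "azure_endpoint": "https://example.openai.azure.com",
    "api_key": None,  # simulates an unset AZURE_OPENAI_API_KEY
    "azure_deployment": "",  # simulates an empty MODEL_DEPLOYMENT_NAME
    "api_version": "placeholder-version",
    "type": "AzureOpenAI",
}

missing = missing_config_keys(example_config)
if missing:
    print(f"Missing configuration values: {missing}")
```

Running such a check right after building the configuration keeps the failure close to its cause.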

## Specify application Prompty
The following `application.prompty` file specifies how a chat application behaves.

In [4]:
%%writefile ./eval_assets/application.prompty
---
name: ApplicationPrompty
description: Chat RAG application
model:
    api: chat
    parameters:
        temperature: 0.0
        top_p: 1.0
        presence_penalty: 0
        frequency_penalty: 0
        response_format:
            type: text
 
inputs:
    context:
        type: string
    query:
        type: string
    conversation_history:
        type: dict
---
system:
You are a helpful assistant and you're helping with the user's query. 
Keep the conversation engaging and interesting.

Keep your conversation grounded in the provided context: 
{{ context }}

Output with a string that continues the conversation, responding to 
the latest message from the user query using the same language:
{{ query }}

given the conversation history:
{{ conversation_history }}

Overwriting ./eval_assets/application.prompty


In [5]:
prompty_path = "eval_assets/application.prompty"
flow = load_flow(source=prompty_path, model={"configuration": model_config})

pprint(flow(context="", query="How to make a pizza at home", conversation_history=[]))

("Making pizza at home is a fun and delicious activity! Here's a simple guide "
 'to get you started:\n'
 '\n'
 '1. **Ingredients**:\n'
 '   - **Dough**: You can make your own or buy pre-made dough. For homemade, '
 "you'll need flour, yeast, water, salt, and olive oil.\n"
 '   - **Sauce**: Tomato sauce or pizza sauce.\n'
 '   - **Cheese**: Mozzarella is classic, but feel free to mix in other '
 'cheeses like parmesan or cheddar.\n'
 '   - **Toppings**: Whatever you like! Popular choices include pepperoni, '
 'mushrooms, bell peppers, onions, olives, and basil.\n'
 '\n'
 '2. **Instructions**:\n'
 '   - **Prepare the Dough**: If making from scratch, mix the flour, yeast, '
 'salt, and water until it forms a dough. Knead it for about 10 minutes, then '
 'let it rise for an hour.\n'
 '   - **Preheat the Oven**: Set it to 475°F (245°C).\n'
 '   - **Roll Out the Dough**: Once risen, roll out the dough on a floured '
 'surface to your desired thickness.\n'
 '   - **Add Sauce and Toppings**: 

In [6]:
# English version

context = "Friendly conversation between two people, in English"

conversation_history = [
    {
        "content": "What are you up to this evening?",
        "role": "user",
        "context": "Friendly question",
    },
    {
        "content": "I'm going to the cinema",
        "role": "assistant",
        "context": "Description of a fun recreational activity",
    },
    {
        "content": "Ah fantastic! What are you going to see?",
        "role": "user",
        "context": "Follow-up friendly question showing interest in the other person",
    },
    {
        "content": "JAWS by Steven Spielberg",
        "role": "assistant",
        "context": "Specific information about a fun activity one plans to do",
    },
]

# flow(context=conversation_history[-1]["context"], query=conversation_history[-1]["content"], conversation_history=conversation_history)

In [7]:
for i, ch in enumerate(conversation_history):
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')

last_message = conversation_history[-1]
response = flow(
    context=last_message["context"],
    query=last_message["content"],
    conversation_history=conversation_history,
)

# The reply takes the opposite role of the previous message
reply_role = "user" if last_message["role"] == "assistant" else "assistant"
conversation_history.append({"content": response, "role": reply_role, "context": ""})
print(f'Message {len(conversation_history) - 1} - {reply_role}: {response}\n')

Message 0 - user: What are you up to this evening?

Message 1 - assistant: I'm going to the cinema

Message 2 - user: Ah fantastic! What are you going to see?

Message 3 - assistant: JAWS by Steven Spielberg

Message 4 - user: That sounds thrilling! JAWS is such a classic. Are you a fan of Spielberg's work, or is this your first time watching it?
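
The cell above gives the appended message the opposite role of the previous one. That alternation can be isolated in a small helper. This is only a sketch: `next_role` is a hypothetical name, and an empty history is assumed to start with the user.

```python
from typing import Dict, List


def next_role(history: List[Dict[str, str]]) -> str:
    """Return the role that should speak next in an alternating chat."""
    if not history:
        return "user"  # assumption: the user opens the conversation
    return "user" if history[-1]["role"] == "assistant" else "assistant"


sample_history = [
    {"role": "user", "content": "What are you up to this evening?"},
    {"role": "assistant", "content": "I'm going to the cinema"},
]
print(next_role(sample_history))  # prints "user"
```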



## Specify target callback to simulate against
You can simulate against any application endpoint by specifying a target callback function. The following callback targets an application that is an LLM driven by a Prompty file such as `application.prompty`:

In [8]:
async def callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,  # noqa: ANN401
    context: Optional[Dict[str, Any]] = None,
    subfolder: str = "eval_assets",
) -> dict:
    messages_list = messages["messages"]
    latest_message = messages_list[-1] # Get the last message
    query = latest_message["content"]
    context = latest_message.get("context", None) # looks for context, default None
    # Call your endpoint or AI application here
    prompty_path = os.path.join(os.getcwd(), subfolder, "application.prompty") 
    _flow = load_flow(source=prompty_path, model={"configuration": model_config})
    response = _flow(query=query, context=context, conversation_history=messages_list)
    # Format the response to follow the OpenAI chat protocol
    formatted_response = {
        "content": response,
        "role": "assistant",  # the application under test always replies as the assistant
        "context": context,
    }
    messages["messages"].append(formatted_response)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context
    }
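
To check the callback contract without calling a model, the application call can be replaced by a stub. The sketch below keeps the same message-in, message-out shape as the callback above. The name `stub_callback` and its echo behaviour are hypothetical stand-ins for a real endpoint.

```python
import asyncio
from typing import Any, Dict, Optional


async def stub_callback(
    messages: Dict[str, Any],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Same contract as the callback above, with the model call stubbed out."""
    latest_message = messages["messages"][-1]
    query = latest_message["content"]
    # Stand-in for the real application call (hypothetical echo behaviour).
    response = f"You asked: {query}"
    messages["messages"].append(
        {"content": response, "role": "assistant", "context": context}
    )
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }


# In a notebook, use `result = await stub_callback(...)` instead of asyncio.run.
result = asyncio.run(
    stub_callback({"messages": [{"role": "user", "content": "Who painted the Mona Lisa?"}]})
)
print(result["messages"][-1])
```

The returned dict carries the full message list with the new reply appended, which is the shape the real callback above returns as well.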

## Generate text or index-based synthetic data as input
First, we prepare the source text that the simulator uses to generate its input.

In [9]:
import wikipedia  # requires the wikipedia package
# Prepare the text to send to the simulator
wiki_search_term = "Leonardo da Vinci"
wiki_title = wikipedia.search(wiki_search_term)[0]
wiki_page = wikipedia.page(wiki_title)
wiki_text = wiki_page.summary[:5000]
print(f"{wiki_text[:100]}...")

Leonardo di ser Piero da Vinci (15 April 1452 – 2 May 1519) was an Italian polymath of the High Rena...


In [10]:
from azure.ai.evaluation.simulator import Simulator
simulator = Simulator(model_config=model_config)

Class Simulator: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


In [11]:
outputs = await simulator(
    num_queries = 1,  # Number of queries
    text        = wiki_text,
    target      = callback   
)

Generating: 100%|████████████████████████████████████████████████| 5/5 [00:21<00:00,  4.33s/message]


In [12]:
outputs

[{'messages': [{'role': 'user',
    'content': 'What was the sale price of Salvator Mundi attributed to Leonardo at auction in 2017?',
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The "Salvator Mundi" attributed to Leonardo da Vinci was sold at auction in 2017 for a staggering $450.3 million, making it the most expensive painting ever sold at auction. It\'s quite a fascinating piece with a mysterious history!',
    'context': 'None'},
   {'role': 'user',
    'content': "That's incredible! Do you know where the painting is currently located or who owns it now?",
    'context': 'None'},
   {'role': 'assistant',
    'content': 'The "Salvator Mundi" is believed to be owned by Saudi Arabia\'s Crown Prince Mohammed bin Salman. There have been reports suggesting that the painting might be housed on his yacht, the Serene, or possibly in storage. However, its exact location remains somewhat of a mystery, adding to the intrigue surrounding this masterpiece. Would you like to 
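
The simulator returns a list of conversations, each a dict with a `messages` list. For later evaluation it can be convenient to flatten these into query/response records, one JSON object per line. The following is a sketch that assumes only the `role` and `content` keys visible in the output above. The function name and the `query`/`response` field names are illustrative choices, not SDK requirements.

```python
import json
from typing import Any, Dict, Iterator, List


def to_query_response_pairs(outputs: List[Dict[str, Any]]) -> Iterator[Dict[str, str]]:
    """Yield one {"query", "response"} record per user/assistant exchange."""
    for conversation in outputs:
        messages = conversation["messages"]
        # Pair each message with the one that follows it.
        for current, following in zip(messages, messages[1:]):
            if current["role"] == "user" and following["role"] == "assistant":
                yield {"query": current["content"], "response": following["content"]}


# Hypothetical sample with the same shape as `outputs` above.
sample_outputs = [
    {
        "messages": [
            {"role": "user", "content": "Who was Leonardo?", "context": "None"},
            {"role": "assistant", "content": "An Italian polymath.", "context": "None"},
            {"role": "user", "content": "When was he born?", "context": "None"},
        ]
    }
]

# One JSON object per line (JSONL); a trailing unanswered user message is skipped.
jsonl_text = "\n".join(json.dumps(pair) for pair in to_query_response_pairs(sample_outputs))
print(jsonl_text)
```

The resulting text can be written to a `.jsonl` file with an ordinary `open(..., "w")` call if a file is needed.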

In [13]:
i = 0
for ch in outputs[0]["messages"]:
    print(f'Message {i} - {ch["role"]}: {ch["content"]}\n')
    i += 1

Message 0 - user: What was the sale price of Salvator Mundi attributed to Leonardo at auction in 2017?

Message 1 - assistant: The "Salvator Mundi" attributed to Leonardo da Vinci was sold at auction in 2017 for a staggering $450.3 million, making it the most expensive painting ever sold at auction. It's quite a fascinating piece with a mysterious history!

Message 2 - user: That's incredible! Do you know where the painting is currently located or who owns it now?

Message 3 - assistant: The "Salvator Mundi" is believed to be owned by Saudi Arabia's Crown Prince Mohammed bin Salman. There have been reports suggesting that the painting might be housed on his yacht, the Serene, or possibly in storage. However, its exact location remains somewhat of a mystery, adding to the intrigue surrounding this masterpiece. Would you like to know more about its history or the controversies surrounding it?

Message 4 - user: Yes, I'd love to learn more about the controversies surrounding the 'Salvator