# [Generate synthetic and simulated data for evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/simulator-interaction-data)
The **Azure AI Evaluation SDK's** `Simulator` class generates synthetic data end to end, so developers can test how their application responds to typical user queries when no production data is available. AI developers can combine an index-based or text-based query generator with a fully customizable simulator to build robust test datasets for the non-adversarial tasks specific to their application. The class generates synthetic conversations and simulates task-based interactions, which is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.

In [1]:
# !az login

In [2]:
# Constants and Libraries
import json
import os
import sys  # needed for sys.exit() below
from azure.identity import DefaultAzureCredential, get_bearer_token_provider #requires azure-identity
from pprint import pprint
from dotenv import load_dotenv # requires python-dotenv

if not load_dotenv("./../../config/credentials_my.env"):
    print("Environment variables not loaded, cell execution stopped")
    sys.exit()
os.environ["AZURE_OPENAI_API_VERSION"] = os.environ["OPENAI_API_VERSION"]

credential = DefaultAzureCredential()

In [31]:
# Initialize Azure AI project and Azure OpenAI connection
azure_ai_project = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("RESOURCE_GROUP_NAME"),
    "project_name": os.environ.get("PROJECT_NAME"),
}

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": os.environ.get("GLOBAL_LLM_SERVICE"), # "AzureOpenAI"
}
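
Because both dictionaries are built with `os.environ.get`, a missing variable silently becomes `None` and only fails later, inside the SDK. The following is a minimal sketch of an early check, not part of the SDK: the helper name `missing_env_vars` is hypothetical, and the variable names are the ones used in the cells above.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Variable names taken from the configuration cells above
REQUIRED_VARS = [
    "AZURE_SUBSCRIPTION_ID",
    "RESOURCE_GROUP_NAME",
    "PROJECT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
]


def missing_env_vars(
    names: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Hypothetical helper: return the names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Demonstration against an explicit mapping instead of the real environment
demo_env = {"AZURE_OPENAI_ENDPOINT": "https://example.invalid", "PROJECT_NAME": ""}
print(missing_env_vars(["AZURE_OPENAI_ENDPOINT", "PROJECT_NAME"], demo_env))
```

In the notebook, calling `missing_env_vars(REQUIRED_VARS)` right after `load_dotenv` would list any settings that still need to be filled in.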

In [4]:
from azure.ai.evaluation.simulator import Simulator

## Generate text or index-based synthetic data as input
In this first part, we prepare the text from which the simulator generates its input.

In [11]:
import asyncio
from azure.identity import DefaultAzureCredential
import wikipedia
import os
from typing import List, Dict, Any, Optional
# Prepare the text to send to the simulator
wiki_search_term = "Leonardo da Vinci"
wiki_title = wikipedia.search(wiki_search_term)[0]
wiki_page = wikipedia.page(wiki_title)
text = wiki_page.summary[:5000]
print(f"{text[:100]}...")

Leonardo di ser Piero da Vinci (15 April 1452 – 2 May 1519) was an Italian polymath of the High Rena...
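
The slice `summary[:5000]` in the cell above can cut the text in the middle of a sentence when the summary is longer than the limit. As an optional refinement, here is a minimal sketch that trims to the last complete sentence inside the character budget. The helper name `truncate_at_sentence` is hypothetical, and the sentence detection is deliberately naive (it only looks for `. `, `! `, and `? `).

```python
def truncate_at_sentence(text: str, max_chars: int = 5000) -> str:
    """Return at most max_chars characters, cut at the last sentence end if one exists."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    # Find the last sentence terminator that is followed by a space inside the window
    cut = max(clipped.rfind(". "), clipped.rfind("! "), clipped.rfind("? "))
    if cut == -1:
        return clipped  # no sentence boundary found; fall back to a hard cut
    return clipped[: cut + 1]  # keep the terminator itself


sample = "Leonardo was a painter. He was also an engineer. He kept many notebooks."
print(truncate_at_sentence(sample, max_chars=40))
```

With the variables from the cell above, this would be used as `text = truncate_at_sentence(wiki_page.summary)`.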


## Specify application Prompty
The following `application.prompty` file specifies how a chat application behaves.

In [38]:
%%writefile ./eval_assets/application.prompty
---
name: ApplicationPrompty
description: Chat RAG application
model:
  api: chat
  parameters:
    temperature: 0.0
    top_p: 1.0
    presence_penalty: 0
    frequency_penalty: 0
    response_format:
      type: text
 
inputs:
  conversation_history:
    type: dict
  context:
    type: string
  query:
    type: string
 
---
system:
You are a helpful assistant and you're helping with the user's query. 
Keep the conversation engaging and interesting.

Keep your conversation grounded in the provided context: 
{{ context }}

Output with a string that continues the conversation, responding to 
the latest message from the user query:
{{ query }}

given the conversation history:
{{ conversation_history }}

Overwriting ./eval_assets/application.prompty


In [39]:
from promptflow.client import load_flow

prompty_path = "eval_assets/application.prompty"
flow = load_flow(source=prompty_path, model={"configuration": model_config})

flow(conversation_history={}, context="", query="how to make a pizza")

'Making pizza at home is a fun and delicious project! Here’s a simple way to do it:\n\n1. Make or buy pizza dough. You can find ready-made dough at most grocery stores, or make your own with flour, yeast, water, salt, and a bit of olive oil.\n2. Preheat your oven to its highest setting (usually around 475°F/245°C).\n3. Roll out the dough on a floured surface to your desired thickness.\n4. Place the dough on a baking sheet or pizza stone.\n5. Spread a thin layer of pizza sauce (store-bought or homemade) over the dough.\n6. Add your favorite toppings—cheese, pepperoni, veggies, mushrooms, etc.\n7. Bake for 10-15 minutes, or until the crust is golden and the cheese is bubbly.\n8. Let it cool for a couple of minutes, slice, and enjoy!\n\nWould you like a specific recipe for dough or sauce, or ideas for unique toppings?'

## Specify target callback to simulate against
You can simulate against any application endpoint by specifying a target callback function. The following callback assumes the application is an LLM driven by a Prompty file such as `application.prompty`.

In [15]:
async def callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,  # noqa: ANN401
    context: Optional[Dict[str, Any]] = None,
) -> dict:
    messages_list = messages["messages"]
    # Get the last message
    latest_message = messages_list[-1]
    query = latest_message["content"]
    context = latest_message.get("context", None) # looks for context, default None
    # Call your endpoint or AI application here
    # Reuse `prompty_path` and `model_config` from the earlier cells:
    # `__file__` is not defined inside a notebook, and the flow needs the
    # Azure OpenAI model configuration rather than the project details.
    _flow = load_flow(source=prompty_path, model={"configuration": model_config})
    response = _flow(query=query, context=context, conversation_history=messages_list)
    # Format the response to follow the OpenAI chat protocol
    formatted_response = {
        "content": response,
        "role": "assistant",
        "context": context,
    }
    messages["messages"].append(formatted_response)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context
    }
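
Before pointing the simulator at a real application, it can help to check the message contract locally with a stand-in that calls no model. The sketch below is illustrative only: `echo_callback` and `check_turn` are hypothetical names, and the checks mirror what the callback above does (append one assistant message and return the four keys), not an official schema.

```python
import asyncio
from typing import Any, Dict, Optional


async def echo_callback(
    messages: Dict,
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> dict:
    """Hypothetical stand-in for the real callback: replies without calling a model."""
    latest_message = messages["messages"][-1]
    reply = {
        "content": f"You asked: {latest_message['content']}",
        "role": "assistant",
        "context": latest_message.get("context"),
    }
    messages["messages"].append(reply)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": reply["context"],
    }


def check_turn(result: dict) -> None:
    """Assert the properties that the callback in this section maintains."""
    last = result["messages"][-1]
    assert last["role"] == "assistant"
    assert isinstance(last["content"], str)
    assert {"messages", "stream", "session_state", "context"} <= set(result)


payload = {"messages": [{"role": "user", "content": "Who painted the Mona Lisa?"}]}
result = asyncio.run(echo_callback(payload))
check_turn(result)
print(result["messages"][-1]["content"])
```

Inside a notebook, where an event loop is already running, use `result = await echo_callback(payload)` instead of `asyncio.run(...)`. The same `check_turn` helper can then be applied to the output of the real `callback`.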