---
title: "Generate synthetic dialogues programmatically for any topic"
description: "Learn Any Business Domain Through Conversations"
format: html
date: "06/01/2026"
categories: llm, design, ideation
image: 
---

Recently, I had a niche startup idea: helping elderly people navigate doctor appointments in the public health system in Poland. As a founder, I should find people interested in such a service and interview them to learn about their needs. As a first step, though, I decided to use LLMs to generate synthetic conversations and learn a bit more about the customers and their needs.
This post walks through how I built this dialogue generation. It is all code driven, so you'll need the appropriate API keys to replicate it for your use case (TODO provide links to registration and key generation).

Think of it as synthetic field research: you get insights from a number of conversations. This is a bit of a cheat, and the quality may vary by domain (the output is useful only to the extent that the model already knows something about what you want to discover). And because it's LLM-generated, you can explore scenarios that are rare, uncomfortable to ask about, or haven't happened yet.

In [1]:
%pip install pydantic instructor openai python-dotenv elevenlabs

Note: you may need to restart the kernel to use updated packages.


In [2]:
from dotenv import load_dotenv
load_dotenv()

True
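
Since every later cell depends on keys loaded from `.env`, it can help to fail fast when one is missing. Below is a minimal sketch: `require_env` is a hypothetical helper, and `OPENAI_API_KEY` is an assumption based on the `openai` package being installed (only `ELEVENLABS_API_KEY` is named in this post).

```python
import os

def require_env(*names: str) -> dict[str, str]:
    """Return the requested environment variables, or raise if any is missing or empty."""
    missing = [name for name in names if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}

# Example call (uncomment once your .env file is in place; key names are assumptions):
# keys = require_env("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
```

Calling it right after `load_dotenv()` turns a confusing authentication error several cells later into an immediate, readable one.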

In [3]:
from pydantic import BaseModel, Field
from enum import Enum
from typing import List, Optional
import re
import helper as h

In [4]:
NR_OF_TURNS=15

In [5]:
class DialogueSpeaker(str, Enum):
    CUSTOMER_SERVICE = 'CUSTOMER_SERVICE' # customer service assistant
    CLIENT = 'CLIENT' # the person responsible for the elderly or disabled patient, or the payer

class DialogueTurn(BaseModel):
    speaker: DialogueSpeaker = Field(..., description="Person speaking")
    content: str = Field(..., description="Exact text of what the person said")

class DialogueScenario(BaseModel):
    title: str = Field(..., description="Scenario for the conversation")
    dialogue: List[DialogueTurn] = Field(..., description="Conversation")

In [6]:
SCENARIO_SYSTEM_PROMPT = f"""
# Customer Service Conversation Generator

You are generating fictional but realistic in-person customer discovery interviews in the spirit of the lean startup methodology.

The problem the founder identified is a lack of support for elderly and disabled patients within the Polish public health system.
There are often long queues and wait times, finding the right registration desk and the doctor's room can be
challenging in large clinics, or the patient simply has difficulty remembering and understanding because of dementia and relatives
don't have time to assist him during the scheduled visit. There are also 

Ask open questions and validate assumptions.

Output Requirements: 

 - Return ONLY valid JSON.  
 - No explanations, no commentary, no markdown, no prose outside the JSON.
 - The JSON must strictly conform to the DialogueScenario schema.
 - All fields must be present and correctly typed.
 - The dialogue array must contain at least {NR_OF_TURNS} turns representing a natural conversation.
"""
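
The prompt asks for at least `NR_OF_TURNS` turns, but a model is not guaranteed to comply, so a quick check on the result can be useful. The sketch below is an illustration, not part of the original notebook: `Turn` is a stand-in for `DialogueTurn` so the example runs without pydantic, `dialogue_problems` is a hypothetical name, and the "speakers should alternate" rule is my assumption rather than something the prompt requires.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    # Stand-in for DialogueTurn; any object with .speaker and .content works.
    speaker: str
    content: str

def dialogue_problems(turns, min_turns=15):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if len(turns) < min_turns:
        problems.append(f"only {len(turns)} turns, expected at least {min_turns}")
    # Compare each turn with the next one to spot a speaker talking twice in a row.
    for i, (prev, cur) in enumerate(zip(turns, turns[1:]), start=1):
        if prev.speaker == cur.speaker:
            problems.append(f"turns {i} and {i + 1} have the same speaker ({cur.speaker})")
    for i, turn in enumerate(turns, start=1):
        if not turn.content.strip():
            problems.append(f"turn {i} is empty")
    return problems

demo = [Turn("CLIENT", "Hi"), Turn("CLIENT", "Anyone there?"), Turn("CUSTOMER_SERVICE", "Hello!")]
print(dialogue_problems(demo, min_turns=4))
# prints: ['only 3 turns, expected at least 4', 'turns 1 and 2 have the same speaker (CLIENT)']
```

With the real objects you would pass `scenario.dialogue` and `min_turns=NR_OF_TURNS`, and regenerate the scenario when the list is not empty.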

In [7]:
scenarios_user_messages = {
    "Elderly Patient Unable to Navigate NFZ Clinic Alone": {
        "prompt": "Generate scenario for a customer discovery interview with an adult child describing how their elderly parent gets lost inside large NFZ clinics and cannot find the correct registration desk or doctor's room without assistance."
    },
    "Relative Overwhelmed by NFZ Queue Logistics": {
        "prompt": "Generate scenario for a customer discovery interview with a caretaker who explains the difficulty of managing long NFZ queues with an elderly patient who cannot stand for long periods or understand instructions and gets angry easily."
    },
    "Dementia Patient Missing Appointments": {
        "prompt": "Generate scenario for a customer discovery interview with a family member whose relative with dementia frequently forgets appointment times, required documents, or where to go inside the NFZ clinic."
    },
    "Working Professional Unable to Accompany Parent": {
        "prompt": "Generate scenario for a customer discovery interview with a busy professional who cannot take time off work to accompany their elderly parent to NFZ appointments, leading to missed or chaotic visits."
    },
}

In [8]:
def generate_dialogue_for_scenario(user_prompt, model="gpt-4o-mini"):
    """Generate one DialogueScenario for the given user prompt."""
    messages = [
        {'role': 'system', 'content': SCENARIO_SYSTEM_PROMPT},
        {'role': 'user', 'content': user_prompt}
    ]

    try:
        return h.call_api(messages, DialogueScenario, model=model)
    except Exception as e:
        # Log and re-raise so the caller never receives an undefined result.
        print(f"Dialogue generation failed: {e}")
        raise
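
The `helper` module is not shown in this post. As a rough guess at what `h.call_api` might look like, here is a minimal sketch built on `instructor`, which wraps the OpenAI client so that a call returns a validated pydantic object. Everything here is an assumption for illustration: the real helper may differ, and the `client` and `max_retries` parameters are my additions (the first makes the function testable without network access).

```python
def call_api(messages, response_model, model="gpt-4o-mini", client=None, max_retries=2):
    """Hypothetical stand-in for helper.call_api: return a validated `response_model` instance."""
    if client is None:
        # Imported lazily so the function can also be exercised with a fake client.
        import instructor
        from openai import OpenAI
        client = instructor.from_openai(OpenAI())  # OpenAI() reads OPENAI_API_KEY from the environment
    return client.chat.completions.create(
        model=model,
        messages=messages,
        response_model=response_model,  # instructor parses and validates the reply against this model
        max_retries=max_retries,        # instructor re-asks the model when validation fails
    )
```

Under this sketch, `call_api(messages, DialogueScenario)` returns a `DialogueScenario`, which is why the rest of the notebook can call `.model_dump_json()` on the result directly.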

In [9]:
# for scenario_name, scenario_dict in scenarios_user_messages.items():
#     scenarios_user_messages[scenario_name]['results'] = generate_dialogue_for_scenario(scenario_dict['prompt'])
#     with open(f"data/scenarios_user_messages_{scenario_name}_results.json", "w") as f:
#         f.write(scenarios_user_messages[scenario_name]['results'].model_dump_json())

for scenario_name, scenario_dict in scenarios_user_messages.items():
    with open(f"data/scenarios_user_messages_{scenario_name}_results.json") as f:
        scenarios_user_messages[scenario_name]['results'] = DialogueScenario.model_validate_json(f.read())
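
The cell above swaps between generating and loading by commenting code in and out, and it puts the raw scenario name (spaces included) into the file name. One alternative is a small "load if cached, otherwise generate" helper with a sanitized file name. This is a sketch under assumptions: `slugify`, `load_or_generate`, and the `scenario_<slug>.json` naming are hypothetical and not what the notebook uses.

```python
import json
import re
from pathlib import Path

def slugify(name: str) -> str:
    """Lowercase the name and collapse every run of non-alphanumerics into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

def load_or_generate(name: str, generate, data_dir: str = "data") -> dict:
    """Return cached JSON for `name` if present; otherwise call `generate()` and cache its result."""
    path = Path(data_dir) / f"scenario_{slugify(name)}.json"
    if path.is_file():
        return json.loads(path.read_text(encoding="utf-8"))
    result = generate()  # any zero-argument callable returning a JSON-serializable dict
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result

print(slugify("Elderly Patient Unable to Navigate NFZ Clinic Alone"))
# prints: elderly_patient_unable_to_navigate_nfz_clinic_alone
```

With the pydantic models from this post, `generate` could be `lambda: generate_dialogue_for_scenario(prompt).model_dump(mode="json")`, and the returned dict can be turned back into an object with `DialogueScenario.model_validate(...)`.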

In [10]:
for scenario_name, scenario_dict in scenarios_user_messages.items():
    print(f"# {scenario_name} {scenario_dict['results'].title}")
    for turn in scenario_dict['results'].dialogue:
        print(f"\t{turn.speaker.value.upper()}: {turn.content}")

# Elderly Patient Unable to Navigate NFZ Clinic Alone Interview with an Adult Child of an Elderly Patient
	CLIENT: Hi, I'm here to talk about my experience with my elderly parent visiting the local NFZ clinic.
	CUSTOMER_SERVICE: Of course! Can you tell me about what challenges your parent faces when visiting the clinic?
	CLIENT: Well, the first major issue is that the clinics are huge, and my parent often gets lost trying to find the registration desk.
	CUSTOMER_SERVICE: That sounds frustrating. How does your parent usually navigate through the clinic?
	CLIENT: They usually try to remember where things are, but with their memory issues, it becomes really difficult for them.
	CUSTOMER_SERVICE: I can imagine. Have they ever asked someone for help when they feel lost?
	CLIENT: Yes, sometimes they do, but it can be hard to find someone to ask, or they don’t want to bother others.
	CUSTOMER_SERVICE: What about signs or maps in the clinic? Do they find those helpful?
	CLIENT: Not really. The

## TTS

You can stop there, but it would be nice to actually hear it. For that, the dialogue needs to be augmented with TTS tags:

In [11]:
from elevenlabs import ElevenLabs
from dotenv import load_dotenv
from IPython.display import Audio
import os

In [12]:
client = ElevenLabs(
    api_key = os.getenv("ELEVENLABS_API_KEY")
)

In [13]:
VOICE_ENHACEMENT_FOR_11LABS_SYSTEM_PROPMT = """
Your task:

# Preserve the original meaning and intent of each utterance.

## Add ElevenLabs tags, such as:

- <voice emotion="..."> (e.g., empathetic, calm, encouraging, reflective, tense, relieved)
- <break time="...ms">
- <prosody rate="..." pitch="...">
- <emphasis>

## Enhance the dramatic flow of the conversation by:

- highlighting the client’s emotions,
- adding subtle pauses,
- marking moments of tension, relief, or reflection.

## Do not change the meaning, but you may:

- add natural pauses,
- add effects using tags such as
[laughs], [laughs harder], [starts laughing], [wheezing], [whispers], [sighs], [exhales], [sarcastic], [curious], [excited], [crying], [snorts], [mischievously],
- emphasize emotional transitions.

## Each CUSTOMER_SERVICE and CLIENT utterance must have its own block with tags.

## Do not add narration or commentary — only the dialogue.
"""

VOICE_ENHACEMENT_FOR_11LABS_USER_PROPMT = "Find below dialogues to enhance:"

In [14]:
def enhance_dialog_for_elevenlabs(dialog):
    """Ask the LLM to add TTS tags to every turn of the dialogue."""
    # Start the transcript on its own line, one turn per line.
    user_prompt = VOICE_ENHACEMENT_FOR_11LABS_USER_PROPMT + "\n"
    for turn in dialog:
        user_prompt += f"\t{turn.speaker.value.upper()}: {turn.content}\n"
    messages = [
        {'role': 'system', 'content': VOICE_ENHACEMENT_FOR_11LABS_SYSTEM_PROPMT},
        {'role': 'user', 'content': user_prompt}
    ]

    return h.call_api(messages, DialogueScenario)

for scenario_name, scenario_dict in scenarios_user_messages.items():
    filename = f"data/scenarios_user_messages_{scenario_name}_enhanced_results.json"
    if not os.path.isfile(filename):
        print(f"Processing scenario {scenario_name}")
        
        result = enhance_dialog_for_elevenlabs(scenario_dict['results'].dialogue)
        scenario_dict['results_for_voice'] = result
        
        with open(filename, "w") as f:
            f.write(scenarios_user_messages[scenario_name]['results_for_voice'].model_dump_json())
    else:
        with open(filename) as f:
            scenarios_user_messages[scenario_name]['results_for_voice'] = DialogueScenario.model_validate_json(f.read())
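
The enhancement prompt tells the model not to change the meaning, but nothing verifies that. One cheap check is to strip the markup from an enhanced turn and compare the remaining words with the original turn. The sketch below is illustrative: `strip_tts_tags` and `same_words` are hypothetical helpers, and the regex only assumes the two tag shapes named in the prompt (angle-bracket tags and square-bracket cues).

```python
import re

# Matches XML-style tags (<break time="300ms"/>, </emphasis>) and bracketed cues ([sighs]).
_TAG_RE = re.compile(r"<[^<>]+>|\[[^\[\]]+\]")

def strip_tts_tags(text: str) -> str:
    """Remove TTS markup and collapse the leftover whitespace."""
    return " ".join(_TAG_RE.sub(" ", text).split())

def same_words(original: str, enhanced: str) -> bool:
    """True when the enhanced text contains exactly the original words once tags are removed."""
    return original.split() == strip_tts_tags(enhanced).split()

enhanced = '[sighs] <emphasis>I see.</emphasis><break time="300ms"/> Tell me more.'
print(strip_tts_tags(enhanced))
# prints: I see. Tell me more.
```

Comparing `results` and `results_for_voice` turn by turn with `same_words` flags turns the model rewrote instead of only tagging. Because tags are replaced by a space, a tag placed in the middle of a word would be reported as a difference.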

... and finally generate the voice with ElevenLabs (you could also use another TTS model, such as [VibeVoice](https://huggingface.co/microsoft/VibeVoice-1.5B) from Microsoft):

In [15]:
import io
from IPython.display import Audio
import json
import hashlib

def hash_dict(d: dict) -> str:
    # Convert dict to a canonical JSON string
    encoded = json.dumps(d, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def speak(text: str, voice_id: str, voice_settings: dict) -> bytes:
    audio_stream = client.text_to_speech.convert(
        voice_id=voice_id,
        voice_settings=voice_settings,
        output_format="mp3_44100_128",
        text=text,
        model_id="eleven_flash_v2_5"
    )
    return b"".join(audio_stream)


def speak_dialog(dialog: list[dict], voice_map: dict) -> Audio:
    """
    dialog = [
        {"speaker": "CUSTOMER_SERVICE", "text": "..."},
        {"speaker": "CLIENT", "text": "..."},
    ]

    voice_map = {
        "CUSTOMER_SERVICE": {"voice_id": "voice_id_1", "voice_settings": {}},
        "CLIENT": {"voice_id": "voice_id_2", "voice_settings": {}},
    }
    """

    dialog_hash = hash_dict(dialog)
    dialog_filename = f'data/{dialog_hash}.raw'
    
    if not os.path.isfile(dialog_filename):
        combined = b""
        
        for turn in dialog:
            speaker = turn["speaker"]
            text = turn["text"]
            voice_id = voice_map[speaker]['voice_id']
            voice_settings = voice_map[speaker]['voice_settings']
    
            audio_bytes = speak(text, voice_id, voice_settings)
    
            # Add a short pause between utterances (padding, not encoded MP3 silence)
            silence = b"\x00" * 4000  # ~4 kB of zero bytes, works in most players
    
            combined += audio_bytes + silence
            print(".", end="")
            
        with open(dialog_filename, "wb") as f:
            f.write(combined)
    else:
        print("!!!CACHED!!!", end="")
    with open(dialog_filename, "rb") as f:
        combined = f.read()

    return Audio(combined, autoplay=True)
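
Note that `speak_dialog` keys its cache on the dialogue alone (`hash_dict(dialog)`), so after changing a voice or its settings the previously generated audio is still returned. A possible variant hashes everything that influences the audio. This is a sketch: `audio_cache_key` is a hypothetical name, and the voice ids below are placeholders.

```python
import hashlib
import json

def audio_cache_key(dialog: list, voice_map: dict, model_id: str) -> str:
    """Hash every input that influences the audio, so changing a voice or model gives a new key."""
    payload = {"dialog": dialog, "voice_map": voice_map, "model_id": model_id}
    # sort_keys makes the JSON canonical, so dict ordering does not change the hash.
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

dialog = [{"speaker": "CLIENT", "text": "Hello"}]
calm = {"CLIENT": {"voice_id": "voice_a", "voice_settings": {"stability": 0.5}}}
lively = {"CLIENT": {"voice_id": "voice_a", "voice_settings": {"stability": 0.3}}}

same = audio_cache_key(dialog, calm, "eleven_flash_v2_5") == audio_cache_key(dialog, lively, "eleven_flash_v2_5")
print(same)
# prints: False
```

Using this key for `dialog_filename` would regenerate audio whenever the voices, their settings, or the model change, at the cost of extra API calls.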


In [16]:
def generate_audio_for(scenario):
    dialog = []
    for turn in scenarios_user_messages[scenario]['results_for_voice'].dialogue:
        dialog.append({'speaker': turn.speaker.value.upper(), 'text': turn.content})
    
    voice_map = {
        "CUSTOMER_SERVICE": {
            "voice_id": "N0GCuK2B0qwWozQNTS8F", 
            'voice_settings': {
                "stability": 0.5, 
                "similarity_boost": 0.75,
                "style": 0.35
            }
        },
        "CLIENT": {
            'voice_id': "TxGEqnHWrfWFTfGW9XjX", 
            'voice_settings': {
                "stability": 0.3,
                "similarity_boost": 0.5,
                "style": 0.8,
                "use_speaker_boost": True
            }
        },
    }
    # print(voice_map)
    return speak_dialog(dialog, voice_map)


In [17]:
for dialog_title in list(scenarios_user_messages.keys()):
    print(dialog_title, end="")
    generate_audio_for(dialog_title)
    print("")

Elderly Patient Unable to Navigate NFZ Clinic Alone!!!CACHED!!!
Relative Overwhelmed by NFZ Queue Logistics!!!CACHED!!!
Dementia Patient Missing Appointments!!!CACHED!!!
Working Professional Unable to Accompany Parent!!!CACHED!!!


In [18]:
generate_audio_for(list(scenarios_user_messages.keys())[0])

!!!CACHED!!!

As you can see, this is a relatively simple way to dive into synthetic conversations, and a good start for learning about a new domain that you don't have experience with.