# Synthetic Dialogue Generation with Orchestration


Before we begin, let's first set up our environment...

In [None]:
# Set up the environment depending on whether we are running in Google Colab or Jupyter Notebook
from IPython import get_ipython

if "google.colab" in str(get_ipython()):
    print("Running on CoLab")
    # Downloading only the "output" directory from the repository
    !git init .
    !git remote add -f origin https://github.com/Play-Your-Part/tutorials.git
    !git config core.sparseCheckout true
    !echo "output" >> .git/info/sparse-checkout
    !git pull origin main

    # Installing Ollama
    !curl -fsSL https://ollama.com/install.sh | sh
    # Installing sdialog
    !git clone https://github.com/idiap/sdialog.git
    %cd sdialog
    %pip install -e .
    %cd ..

else:
    print("Running in Jupyter Notebook")
    # Little hack to avoid the "OSError: Background processes not supported." error in Jupyter notebooks
    import os
    get_ipython().system = os.system

> ⚠️ If you're using **Colab**, please, **restart the runtime** once everything above is installed

Let's replace the default sdialog model ("gemma3:27b") with a slightly smaller model that we can run in Colab:

In [None]:
import sdialog

sdialog.config.set_llm("qwen2.5:14b")

And let's first make sure the Ollama server is running:

In [None]:
from typing import List
from sdialog import Turn


# Let's start the ollama server
!OLLAMA_KEEP_ALIVE=-1 ollama serve > /dev/null 2>&1 &
!sleep 10  # Wait a bit for the server to start

## Multi-Agent-based Dialogue Generation with Orchestration

### Introduction

Let's begin by creating the same Bob agent from the last tutorial:

In [None]:
from sdialog.personas import Persona, Agent

bob_persona = Persona(
        name="Bob",
        role="great dad",
        circumstances="Your daughter will talk to you",
        personality="an extremely happy person that likes to help people",
)

bob = Agent(bob_persona)

As we did in the last tutorial, let's talk with Bob a little bit:

In [None]:
bob("Hi dad!")

In [None]:
bob("Dad, my birthday is coming up and I've been thinking about having a Lord of the Rings themed party. What do you think?")

What if, at this point, we wanted to give Bob an instruction to influence his behavior (i.e. change his original trajectory)?

In `sdialog`, all agents have a built-in `.instruct()` method that we can use to instruct an agent "on the fly".

For instance, at this point of the conversation, let's notify Bob that hobbit-sized cupcakes are not allowed in his region so that, when we propose to have them, Bob is aware of this fact:

In [None]:
bob.instruct("hobbit-sized cupcakes are prohibited in your region, better regular ones")

Now, if we continue the conversation by proposing to have hobbit-sized cupcakes, Bob will try to convince us otherwise:

In [None]:
bob("Yay! We could have hobbit-sized cupcakes and maybe some quests around the house!")

It worked! But of course, it wouldn't be practical for us to manually `instruct()` the agents while they talk.

### Simple Orchestration

Instead, it would be desirable to have a separate component that takes care of that. In fact, that is precisely what `sdialog` orchestrators are for!

More precisely, at each turn of a conversation, an orchestrator receives the current dialogue and utterance and returns the desired instruction (if any).

Technically, an orchestrator is any class that inherits from the built-in `BaseOrchestrator` and implements the `instruct(dialog, utterance)` method.

For instance, let's create our own `AngryOrchestrator` which will instruct the agent to "get angry" if either the current turn contains a trigger word or the conversation is too long:

In [None]:
from sdialog.orchestrators import BaseOrchestrator


class AngryOrchestrator(BaseOrchestrator):
    # the class constructor takes either or both trigger conditions: the word or the dialogue length
    def __init__(self, trigger_word: str, trigger_length: int = None):
        self.trigger_word = trigger_word
        self.trigger_length = trigger_length

    # We will instruct() the agent either if...
    def instruct(self, dialog: List[Turn], utterance: str) -> str:
        # the trigger word is in the current utterance or...
        if self.trigger_word in utterance:
            return f"Get angry because you don't like when your dad calls you {self.trigger_word}"

        # If the current dialogue is longer than the trigger length
        if self.trigger_length and len(dialog) >= self.trigger_length:
            return ("Get really angry because you think the conversation is too long! "
                    "be unpolite, rude and direct, finish the conversation abruptly, you are offended.")

Now that we have our first orchestrator, we can instantiate it with "sweet" as the trigger word:

In [None]:
angry_orchestrator = AngryOrchestrator(trigger_word="sweet")

Now that we have our "triggered-by-sweet" orchestrator, let's create our Alice agent again so we can orchestrate her.

In [None]:
alice_persona = Persona(
    name="Alice",
    role="lovely daughter",
    circumstances="Your birthday is getting closer and you are talking with your dad to organize the party. "
                  "You want your party to be themed as Lord of The Rings."
)
alice = Agent(alice_persona, can_finish=True)

Before doing that, let's have Alice talk with Bob in her vanilla version, without orchestration, so we can compare the result after applying the orchestration:

In [None]:
alice.dialog_with(bob, seed=2770339798).print()

OK, so we are now ready to apply the orchestrator to Alice, but how do we do that? Easy! We can simply use the `|` operator as follows:

In [None]:
alice = alice | angry_orchestrator
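
To build some intuition for the `|` syntax, here is a toy sketch of how such a composition could be expressed with Python's `__or__` operator overloading. This is an illustration only, not `sdialog`'s actual implementation: `ToyAgent` and `ToyOrchestrator` are hypothetical names, and the real library may store and apply orchestrators differently.

```python
from typing import Callable, List, Optional


class ToyOrchestrator:
    """Hypothetical stand-in for an orchestrator: wraps a rule that may return an instruction."""

    def __init__(self, rule: Callable[[List[str], str], Optional[str]]):
        self.rule = rule

    def instruct(self, dialog: List[str], utterance: str) -> Optional[str]:
        return self.rule(dialog, utterance)


class ToyAgent:
    """Hypothetical stand-in for an agent that collects orchestrators via `|`."""

    def __init__(self, name: str):
        self.name = name
        self.orchestrators: List[ToyOrchestrator] = []

    def __or__(self, orchestrator: ToyOrchestrator) -> "ToyAgent":
        # `agent | orchestrator` attaches the orchestrator and returns the agent,
        # so several orchestrators can be chained: agent | a | b | c
        self.orchestrators.append(orchestrator)
        return self

    def collect_instructions(self, dialog: List[str], utterance: str) -> List[str]:
        # Ask every attached orchestrator and keep only the non-empty instructions
        results = (o.instruct(dialog, utterance) for o in self.orchestrators)
        return [r for r in results if r]


toy_alice = ToyAgent("Alice")
on_sweet = ToyOrchestrator(lambda dialog, utt: "Get angry" if "sweet" in utt else None)
on_long = ToyOrchestrator(lambda dialog, utt: "Wrap up" if len(dialog) >= 3 else None)
toy_alice = toy_alice | on_sweet | on_long

print(toy_alice.collect_instructions(["hi", "hello"], "Sure, sweetheart!"))
```

In this sketch only the "sweet" rule fires, because the toy dialogue is still shorter than three turns.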

Now that we have a new Alice, which is the composition of the original `alice` with our `angry_orchestrator`, we can make her talk with Bob again:

In [None]:
dialog = alice.dialog_with(bob, seed=2770339798)
dialog.print()

We can now see that, even though the conversation begins exactly as before, in the 7th turn Alice is triggered by Bob calling her "sweetheart" in turn 6. Cool, huh?

> 💡 This means that even though we begin with an original trajectory (fixed by `seed=2770339798`), at a certain point we can create a fork from it (a new trajectory) in a very controlled manner. For instance, this is really handy for all types of A-vs-B trajectory analysis (e.g. good vs. bad in Mechanistic Interpretability).
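
For this kind of A-vs-B analysis, a first practical step is usually to locate the turn at which the two trajectories start to differ. The snippet below is a minimal sketch that works on plain lists of turn texts; the `first_divergence` helper and the example turns are hypothetical and are not part of `sdialog`.

```python
from itertools import zip_longest
from typing import List, Optional


def first_divergence(baseline: List[str], forked: List[str]) -> Optional[int]:
    """Return the 0-based index of the first turn where two dialogues differ, or None if identical."""
    for index, (turn_a, turn_b) in enumerate(zip_longest(baseline, forked)):
        if turn_a != turn_b:
            return index
    return None


# Hypothetical turn texts, made up for illustration (not real model output)
baseline = ["Hi dad!", "Hello sweetheart!", "Can we plan my party?", "Of course!"]
forked = ["Hi dad!", "Hello sweetheart!", "Don't call me that!", "Sorry, Alice."]

print(first_divergence(baseline, forked))  # index of the first differing turn
```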

In case we want to see what happened "under the hood", we can make the orchestration visible by simply setting `orchestration=True` in the `.print()` method as follows:

In [None]:
dialog.print(orchestration=True)

Just for fun, let's now update our orchestrator to also trigger by length:

In [None]:
angry_orchestrator.trigger_length = 6

And let's have Alice talk with Bob once more:

In [None]:
dialog = alice.dialog_with(bob, seed=2770339798)
dialog.print(orchestration=True)

This time we see one extra instruction at the end, triggered by the conversation length, cool, huh? :)

In `sdialog` we can get a JSON representation of our agents simply by using the `json()` method:

In [None]:
alice.json()

Here we can see all the details about our Alice agent, including the model behind it, the persona, and the list of orchestrators influencing her behavior.
In case we want to remove all orchestration from an agent, we can use the `.clear_orchestrators()` method:

In [None]:
alice.clear_orchestrators()

If we look at the agent's details again, we can see that the `"orchestrators"` field is no longer present:

In [None]:
alice.json()

### Persistent Orchestration

In the previous section, we learned how to instruct our agent using an orchestrator object.

However, the instructions given by the orchestrators were not persistent.

Perhaps this was not obvious because it was not clear whether the given instructions were meant to permanently change the agent's behavior or not.

To make it more evident, let's suppose we want the `AngryOrchestrator` to permanently change the agent's "state of mind" to angry. We could then redefine the class with the following instruction:

In [None]:
class AngryOrchestrator(BaseOrchestrator):
    def __init__(self, trigger_word: str):
        self.trigger_word = trigger_word

    def instruct(self, dialog: List[Turn], utterance: str):
        if self.trigger_word in utterance:
            # NOTE: this new instruction implies a permanent change in the agent's behavior
            return (f"You don't like when your dad calls you '{self.trigger_word}', "
                    "change your personality to be completely the opposite of being sweet! be rude and furious from now on")

# Let's create a new instance of the orchestrator using "sweet" as trigger word as before
angry_orchestrator = AngryOrchestrator(trigger_word="sweet")
alice.clear_orchestrators()
alice = alice | angry_orchestrator

# and let's create a dialogue between (angry) alice and bob
dialog = alice.dialog_with(bob, seed=2770339798)
dialog.print(orchestration=True)

We can see that, apart from Alice replying with _"Don't call me 'sweetheart'!"_ in the turn right after Bob calls her _"sweetheart"_, there is no persistent change in Alice's behavior as instructed.

In cases where we want an instruction to permanently affect the agent's behavior, we can simply implement our class by inheriting from `sdialog`'s built-in `BasePersistentOrchestrator` (instead of `BaseOrchestrator`). Let's do it again with the exact same definition as above:

In [None]:
from sdialog.orchestrators import BasePersistentOrchestrator


class AngryPersistentOrchestrator(BasePersistentOrchestrator):
    def __init__(self, trigger_word: str):
        self.trigger_word = trigger_word

    def instruct(self, dialog: List[Turn], utterance: str):
        if self.trigger_word in utterance:
            return (f"You don't like when your dad calls you '{self.trigger_word}', "
                    "change your personality to be completely the opposite of being sweet! be rude and furious from now on")

# Instantiating our new persistent orchestrator and orchestrating Alice with it
angry_persistent_orchestrator = AngryPersistentOrchestrator(trigger_word="sweet")
alice.clear_orchestrators()
alice = alice | angry_persistent_orchestrator

# Generating again a dialogue between Alice and Bob
dialog = alice.dialog_with(bob, seed=2770339798)
alice.clear_orchestrators()
dialog.print(orchestration=True)

We can now see that Alice changed her behavior for the rest of the conversation, as originally intended (the agent kept it up to the very end: _"Amazing? More like tolerable if you do everything right..."_).

Note also that the orchestration messages in yellow now say `[instruct-persistent]` to indicate that this instruction is meant to be persistent, unlike the previous one.
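
To make the difference between the two kinds of instruction more tangible, here is a toy model of what could happen to them over time. It is only a sketch under the assumption that a one-off instruction influences the next turn alone, while a persistent one stays in the agent's context; `ToyMemory` is a hypothetical class, and the way `sdialog` actually manages agent memory may differ.

```python
from typing import List


class ToyMemory:
    """Hypothetical model of how instructions could reach an agent's prompt."""

    def __init__(self) -> None:
        self.persistent: List[str] = []  # instructions kept for every later turn
        self.one_off: List[str] = []     # instructions used for the next turn only

    def instruct(self, text: str, persistent: bool = False) -> None:
        (self.persistent if persistent else self.one_off).append(text)

    def build_context(self) -> List[str]:
        # Persistent instructions are always included; one-off ones are consumed here
        context = self.persistent + self.one_off
        self.one_off = []
        return context


memory = ToyMemory()
memory.instruct("Get angry", persistent=False)
memory.instruct("Be rude from now on", persistent=True)

first_turn = memory.build_context()   # both instructions are visible
second_turn = memory.build_context()  # only the persistent one survives
print(first_turn, second_turn)
```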

### Compositional Orchestration

So far we have learned how to orchestrate agents with persistent and non-persistent instructions using a simple orchestration example, but what happens if we need more complex orchestration?

Of course, we could create a complex orchestrator class with all the logic inside or, better, we could decompose the orchestration into a composition of multiple simpler orchestrators.

For instance, let's suppose we need an orchestration that (1) makes Alice change her mind with a probability of 30%, while at the same time (2) making her get angry as before when Bob calls her "sweet" and (3) forcing Alice to talk for 15 to 20 conversational turns.

To achieve this, we can use some of `sdialog`'s built-in orchestrator classes to model each behavior independently, as follows:

In [None]:
from sdialog.orchestrators import LengthOrchestrator, ChangeMindOrchestrator, SimpleReflexOrchestrator

len_orchestrator = LengthOrchestrator(min=15, max=20)
change_mind_orchestrator = ChangeMindOrchestrator(probability=0.3, reasons=["too boring", "you don't like it"], max_times=1)
angry_orchestrator = SimpleReflexOrchestrator(condition=lambda utt: "sweet" in utt.lower(),
                                              instruction="Get angry because you don't like when your dad calls you sweet")
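
To give an intuition for the `probability` and `max_times` arguments used above, here is a toy sketch of a probability-triggered orchestrator with a cap on how often it can fire. `ToyChangeMind` and its `seed` argument are hypothetical, and this reflects only one plausible reading of those parameters (a random draw at every turn); the built-in `ChangeMindOrchestrator` may behave differently.

```python
import random
from typing import List, Optional


class ToyChangeMind:
    """Hypothetical sketch of a probability-triggered orchestrator with a cap on firings."""

    def __init__(self, probability: float, reasons: List[str],
                 max_times: int = 1, seed: Optional[int] = None):
        self.probability = probability
        self.reasons = reasons
        self.max_times = max_times
        self.times_fired = 0
        self.rng = random.Random(seed)  # private RNG so runs are reproducible

    def instruct(self, dialog: List[str], utterance: str) -> Optional[str]:
        # Stay silent once the cap is reached or when the random draw fails
        if self.times_fired >= self.max_times or self.rng.random() >= self.probability:
            return None
        self.times_fired += 1
        return f"Change your mind because {self.rng.choice(self.reasons)}"


toy_change_mind = ToyChangeMind(probability=0.3, reasons=["too boring", "you don't like it"],
                                max_times=1, seed=0)
fired = [toy_change_mind.instruct([], "") for _ in range(50)]
times_fired = sum(instruction is not None for instruction in fired)
print(times_fired)
```

Even over 50 simulated turns, the cap ensures that the instruction is issued at most `max_times` times.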

And now we can simply orchestrate Alice with the three orchestrators as follows:

In [None]:
alice = alice | len_orchestrator | change_mind_orchestrator | angry_orchestrator

Let's now generate a dialogue again between Alice and Bob:

In [None]:
dialog = alice.dialog_with(bob, seed=2770339798)
dialog.print(orchestration=True)

We can see that we achieved the intended goal: Alice changed her mind, got angry when Bob called her "sweetheart", and at the same time the length of the conversation is between 15 and 20 turns (18 turns), as shown below:

In [None]:
len(dialog)

All built-in orchestrator classes in `sdialog` have a `persistent` argument that specifies whether the returned instructions are persistent or not. By default, this parameter is set to `False` (as we can see above, where the orchestration only contains `[instruct]` items).

For instance, if we were to re-implement the persistent example from the previous section using the built-in `SimpleReflexOrchestrator` class, we could simply use the same instruction and trigger word and set `persistent=True` when creating the object, as follows:

In [None]:
angry_persistent_orchestrator = SimpleReflexOrchestrator(
    condition=lambda utt: "sweet" in utt.lower(),
    instruction="You don't like when your dad calls you 'sweet', "
                "change your personality to be completely the opposite of being sweet! be rude and furious from now on",
    persistent=True  # <== the instruction is persistent!
)

alice.clear_orchestrators()
alice = alice | angry_persistent_orchestrator

dialog = alice.dialog_with(bob, seed=2770339798)
dialog.print(orchestration=True)

## Use Case: Dialogue Generation for STAR Dataset

Before we begin this section, make sure you have the STAR dataset downloaded on your system, inside the `datasets` folder:

In [None]:
# Let's clone the STAR dataset repository
!git clone https://github.com/RasaHQ/STAR.git datasets/STAR

# Let's check that `dialogues` and `tasks` folders are inside `datasets/STAR`
!ls datasets/STAR

As we did with the previous tutorials, let's begin by importing STAR from `sdialog` and pointing it to the right path:

In [None]:
from sdialog.datasets import STAR

# Let's set our STAR dataset path
STAR.set_path("datasets/STAR/")

In the previous tutorial we defined a function, `get_agents_from_dialogue()`, that, given a STAR dialogue ID, returns the system and user agents matching the scenario of the dialogue.

Now we need to do exactly the same thing, but with orchestrated agents. But how exactly should they be orchestrated?

Well, it turns out that STAR dialogues are actually orchestrated too! Let's see the details in the next sub-section.

### Original Orchestration in the Dataset

Let's get the dialogue with id `1` from STAR as we did in the previous tutorials:

In [None]:
TARGET_DIALOG = 1

original_dialog = STAR.get_dialog(TARGET_DIALOG)
original_dialog.print()

This looks just like a regular dialogue between the user and the system.

However, when this dataset was constructed, the people playing the system and user roles were instructed and orchestrated while doing so.

We can see the original orchestration behind this dialogue if we set `orchestration=True` in the `print()` function:

In [None]:
original_dialog.print(orchestration=True)

We can see that the orchestration for the user and the system is different: the user is guided through the conversation by concrete instructions (marked with `(UserGuide)`), while the system requests suggestions for possible next responses (`[request_suggestions]`) and then picks one of the suggested responses (`[pick_suggestion]`).

As described in the [STAR paper](https://arxiv.org/pdf/2010.11853), this difference encourages users to behave more freely and the system to be as deterministic as possible.

In the original dialogue files, dialogues are saved as a list of events containing not only the utterances but also the instructions and actions performed by the user and system. 
For instance, we can open the JSON file of dialogue `1` located in [`datasets/STAR/dialogues/1.json`](datasets/STAR/dialogues/1.json) and get access to this list of events by checking the content of the `"Events"` field.

If we want to get access to the list of events of any given dialogue in `sdialog`, we can simply use the `.events` attribute of our dialogue objects as follows:

In [None]:
original_dialog.events

These events are what `sdialog` uses under the hood to pretty-print the orchestration as part of the dialogue when using `.print(orchestration=True)`. More importantly, when we add orchestrator objects to our agents, all the orchestration will also be saved as events (as in the original STAR dataset).
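
As a rough illustration of how such an event list interleaves utterances with orchestration, the sketch below separates the two kinds of events. The sample events are made up, and the `"Agent"`, `"Action"` and `"Text"` keys are assumptions for illustration; check the actual content of `"Events"` in the JSON file (or of `original_dialog.events`) for the real field names.

```python
from typing import Dict, List, Tuple

# Hypothetical events; the "Agent", "Action" and "Text" keys are assumed for illustration
toy_events: List[Dict[str, str]] = [
    {"Agent": "User", "Action": "utter", "Text": "Hi, I need some help."},
    {"Agent": "Wizard", "Action": "request_suggestions", "Text": ""},
    {"Agent": "Wizard", "Action": "pick_suggestion", "Text": "Could I get your name, please?"},
    {"Agent": "UserGuide", "Action": "instruct", "Text": "Tell the assistant your name."},
    {"Agent": "User", "Action": "utter", "Text": "My name is Alex."},
]


def split_events(events: List[Dict[str, str]]) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Separate plain utterances from orchestration events."""
    utterances = [e for e in events if e["Action"] == "utter"]
    orchestration = [e for e in events if e["Action"] != "utter"]
    return utterances, orchestration


toy_utterances, toy_orchestration = split_events(toy_events)
print(len(toy_utterances), "utterances,", len(toy_orchestration), "orchestration events")
```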

Now that we understand how it works under the hood and how the original dialogues in STAR were actually orchestrated, let's create the orchestrator objects for our system and user agents to try to emulate the same behavior as in the original dataset.

Let's first get the base agents by using the function we created in the previous tutorial:

In [None]:
system, user = STAR.get_agents_from_dialogue(TARGET_DIALOG)

And then we will create the orchestrator objects for them. Let's begin with the orchestration of the system agent first.

### System Agent Orchestration

To simulate the original "suggest and pick responses" orchestration, we can use the built-in `SimpleResponseOrchestrator` class which, given a list of possible responses, instructs the agent to pick responses from this list:

In [None]:
from sdialog.orchestrators import SimpleResponseOrchestrator

For instance, let's load the original responses used to guide the human system in the original dataset:

In [None]:
responses = STAR.get_dialog_responses(TARGET_DIALOG)[0]
responses

We can see here that, in STAR, each response is associated with a certain action (see the `action:response` mapping in the dict). This is because, as we know, the system behavior for each task is described by a flowchart of actions.

In its simplest version, `SimpleResponseOrchestrator` can receive just a list of response utterances (without actions) that the orchestrator will use to suggest possible responses to the agent. For instance, for the responses above, we can do the following:

In [None]:
# just get the list of response utterances (ignoring the action names)
utterances = list(responses.values())

# and let's instantiate our orchestrator with them
response_orchestrator = SimpleResponseOrchestrator(utterances)

Let's now add the orchestration to our system agent and generate the dialogue:

In [None]:
system = system | response_orchestrator

system.talk_with(user, seed=3068607470).print(orchestration=True)

We can see from the orchestration messages that our `SimpleResponseOrchestrator` performs the following tasks:
1. First, it makes the agent internally generate the next response as it normally would ("Lookahead response").
2. It then uses this response to get the top-k most similar responses from the original list.
3. Finally, it instructs the agent to pick the next response from this top-k list.
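
The retrieval step in the list above can be sketched in a few lines. This is a simplified stand-in, not the orchestrator's real code: it uses the standard library's `difflib.SequenceMatcher` as the similarity measure (the actual similarity function used by `SimpleResponseOrchestrator` is not shown in this tutorial), and the `top_k_similar` helper and candidate responses are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def top_k_similar(lookahead: str, candidates: List[str], k: int = 2) -> List[str]:
    """Rank candidate responses by string similarity to the lookahead response."""
    def score(candidate: str) -> float:
        return SequenceMatcher(None, lookahead.lower(), candidate.lower()).ratio()

    return sorted(candidates, key=score, reverse=True)[:k]


# Hypothetical canned responses, made up for illustration
toy_candidates = [
    "Could I get your name, please?",
    "Who is your doctor?",
    "Thank you and goodbye.",
]
toy_lookahead = "Can I have your name please?"

suggestions = top_k_similar(toy_lookahead, toy_candidates, k=2)
print(suggestions)
```

The agent would then be instructed to choose its actual reply from `suggestions` rather than using the lookahead response directly.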

The `SimpleResponseOrchestrator` accepts not only the utterances, but also the actions and the flowchart graph, which it then uses internally to suggest the next response based on the next actions in the graph. For instance, let's get the responses for our target dialogue again, this time along with the corresponding graph:

In [None]:
graphs, responses = STAR.get_dialog_graphs_and_responses(TARGET_DIALOG)

responses = responses[0]
graph = graphs[0]

graph

Let's now create our orchestrator again, but this time also passing the graph:

In [None]:
response_action_orchestrator = SimpleResponseOrchestrator(responses, graph=graph)

Let's replace the previous orchestrator with the new one and generate the dialogue again:

In [None]:
system.clear_orchestrators()
system = system | response_action_orchestrator

system.talk_with(user, seed=3068607470).print(orchestration=True)

Now we can see that the next response is generated based on the previous response, not on the current one. That is, the orchestrator performs the following tasks:
1. It uses the previous response (`Previous response: ...`) to get the list of top-k most similar responses from the given list.
2. It maps the top-k responses to their corresponding action names (`Actions for the response: ...`).
3. For each action in the previous step, it uses the graph to get its next actions.
4. It instructs the agent to pick the next response from the responses associated with the actions from the previous step.
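
The four steps above can be sketched as follows. Everything here is a toy assumption: `toy_responses` and `toy_graph` are made-up data (with the graph written as a simple adjacency list, which may not match the format returned by STAR), `suggest_next` is a hypothetical helper, and string similarity via `difflib` stands in for whatever similarity measure the orchestrator really uses.

```python
from difflib import SequenceMatcher
from typing import Dict, List

# Hypothetical action -> response mapping and action flowchart (not taken from STAR)
toy_responses: Dict[str, str] = {
    "hello": "Hello, how can I help?",
    "ask_name": "Could I get your name, please?",
    "ask_doctor": "Who is your doctor?",
    "goodbye": "Thank you and goodbye.",
}
toy_graph: Dict[str, List[str]] = {
    "hello": ["ask_name"],
    "ask_name": ["ask_doctor"],
    "ask_doctor": ["goodbye"],
}


def suggest_next(previous_response: str, k: int = 1) -> List[str]:
    """Map the previous response to actions, follow the graph, return the next responses."""
    def score(action: str) -> float:
        canned = toy_responses[action].lower()
        return SequenceMatcher(None, previous_response.lower(), canned).ratio()

    # Steps 1-2: actions whose canned response is most similar to the previous response
    matched_actions = sorted(toy_responses, key=score, reverse=True)[:k]
    # Step 3: follow the flowchart edges to collect the next actions
    next_actions = [nxt for action in matched_actions for nxt in toy_graph.get(action, [])]
    # Step 4: the responses tied to those next actions become the suggestions
    return [toy_responses[action] for action in next_actions]


print(suggest_next("Could I get your name?"))
```

Because the suggestions come from the graph's outgoing edges, the agent is nudged to follow the task flowchart instead of repeating the action it has just performed.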


And that's it! We now have a system orchestrator that simulates the original STAR workflow. Let's now move on to the user!

### User Agent Orchestration

Fortunately, the orchestration of the user is much simpler since, originally in STAR, it only involves providing a series of instructions in order, at specific conversational turns.

Let's first get the original list of instructions given to the user in dialogue `1`:

In [None]:
user_instructions = STAR.get_dialog_user_instructions(TARGET_DIALOG)
user_instructions

Here instructions are returned along with the indexes of the turns in which they were provided to the user.

We can make use of the built-in `InstructionListOrchestrator` class to orchestrate the user.

This orchestrator takes a list of instructions as input and returns one instruction at a time in the given order (or, if turn indexes are provided, returns each instruction at the right turn).
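
To illustrate the turn-indexed mode, here is a minimal sketch of an orchestrator that releases each instruction when its turn arrives. `ToyInstructionList` is a hypothetical class, and the `(turn_index, instruction)` pair format is assumed for illustration; it may not match the exact structure returned by `STAR.get_dialog_user_instructions()`.

```python
from typing import Dict, List, Optional, Tuple


class ToyInstructionList:
    """Hypothetical sketch: release each instruction when its turn index is reached."""

    def __init__(self, instructions: List[Tuple[int, str]]):
        # Assumed input format: (turn_index, instruction_text) pairs
        self.by_turn: Dict[int, str] = dict(instructions)

    def instruct(self, dialog: List[str], utterance: str) -> Optional[str]:
        # The number of turns so far tells us which turn is about to be generated
        return self.by_turn.get(len(dialog))


toy_instructions = ToyInstructionList([
    (0, "Greet the assistant."),
    (2, "Ask for a follow-up appointment."),
])

for turn in range(4):
    history = ["..."] * turn  # placeholder turns; only the length matters here
    print(turn, toy_instructions.instruct(history, utterance=""))
```

In this sketch, turns without an associated instruction simply return `None`, so the agent continues unassisted on those turns.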

We can therefore use `InstructionListOrchestrator` with the list of user instructions to orchestrate the user as follows:

In [None]:
from sdialog.orchestrators import InstructionListOrchestrator

instr_list_orchestrator = InstructionListOrchestrator(user_instructions, persistent=True)

user = user | instr_list_orchestrator

Let's now generate the dialogue between our system and user (orchestrated) agents:

In [None]:
dialog = system.talk_with(user, seed=3068607470)
dialog.print(orchestration=True)

As we can see, it is not so different from the original one:

In [None]:
original_dialog.print(orchestration=True)

### Saving our dialogues

Before we finish, as we did in the previous tutorials, let's generate one synthetic dialog for each happy `"doctor_followup"` dialog in STAR and save it to disk for later use.

In [None]:
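import os  # imported here too, since the setup cell only imports it outside Colab

from tqdm.auto import tqdm

# Assumption: MODEL_NAME is not defined earlier in this notebook, so we reuse
# the model name passed to sdialog.config.set_llm() at the beginning
MODEL_NAME = "qwen2.5:14b"

PATH_OUTPUT = "output/STAR/multi-agents+orchestration"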
path_txt = os.path.join(PATH_OUTPUT, "txt")
path_json = os.path.join(PATH_OUTPUT, "json")
os.makedirs(path_txt, exist_ok=True)
os.makedirs(path_json, exist_ok=True)

for dialog in tqdm(STAR.get_dialogs(task_name="doctor_followup", happy=True, multitask=False), desc="Dialog generation"):
    if os.path.exists(os.path.join(path_txt, f"{dialog.dialogId}.txt")):
        continue

    system, user = STAR.get_agents_from_dialogue_with_orchestration(dialog.dialogId, model_name=MODEL_NAME)

    dialog = system.dialog_with(user, id=dialog.dialogId, seed=dialog.dialogId, keep_bar=False)
    dialog.to_file(os.path.join(path_json, f"{dialog.dialogId}.json"))
    dialog.to_file(os.path.join(path_txt, f"{dialog.dialogId}.txt"))

Finally, let's check that the files were generated:

In [None]:
%ls output/STAR/multi-agents+orchestration/

## Exercise: Doctor-Patient Conversations

Can you replicate the previous tutorial's exercise but this time adding orchestration?

1. Define the personas as before.
2. Create the two agents as before.
3. Think of an orchestrator example in this domain, perhaps associated with some attribute of your `scenario` (e.g. something happens in the middle of the conversation, or the doctor or patient realizes they did or said something).
4. Create your custom orchestrator and add it to the doctor and/or patient.
5. Make the two agents talk to each other!

In [None]:
# TODO: do your magic!