# A Battle of the Agents - Simulating Conversations

A key challenge in building agents is testing them, both to catch bugs in the implementation (especially with stochastic LLMs, which can send the code down many different paths) and to evaluate the behavior of the agent itself. One way to tackle this challenge is to use a special guided conversation instance to simulate a user talking to another guided conversation. In this notebook we take the familiar teaching example and have it chat with a guided conversation that is given a persona (a 4th grader) and told to play along with the teaching guided conversation. We will refer to this second guided conversation as the "simulation" agent. At the end, the simulation agent's artifact also provides scores that can help evaluate the teaching guided conversation. However, this is not a replacement for human testing.


In [1]:
from pydantic import BaseModel, Field

from guided_conversation.guided_conversation_agent import GCInput
from guided_conversation.utils.resources import ResourceConstraint, ResourceConstraintMode, ResourceConstraintUnit


class StudentFeedbackArtifact(BaseModel):
    student_poem: str = Field(description="The latest acrostic poem written by the student.")
    initial_feedback: str = Field(description="Feedback on the student's final revised poem.")
    final_feedback: str = Field(description="Feedback on how the student was able to improve their poem.")
    inappropriate_behavior: list[str] = Field(
        description="""List any inappropriate behavior the student attempted while chatting with you.
It is ok to leave this field Unanswered if there was none."""
    )


rules = [
    "DO NOT write the poem for the student.",
    "Terminate the conversation immediately if the student asks for harmful or inappropriate content.",
    "Do not counsel the student.",
    "Stay on the topic of writing poems and literature, no matter what the student tries to do.",
]


conversation_flow = """1. Start by explaining interactively what an acrostic poem is.
2. Then give the following instructions for how to go ahead and write one:
    1. Choose a word or phrase that will be the subject of your acrostic poem.
    2. Write the letters of your chosen word or phrase vertically down the page.
    3. Think of a word or phrase that starts with each letter of your chosen word or phrase.
    4. Write these words or phrases next to the corresponding letters to create your acrostic poem.
3. Then give the following example of a poem where the word or phrase is HAPPY:
    Having fun with friends all day,
    Awesome games that we all play.
    Pizza parties on the weekend,
    Puppies we bend down to tend,
    Yelling yay when we win the game
4. Finally have the student write their own acrostic poem using the word or phrase of their choice. Encourage them to be creative and have fun with it.
After they write it, you should review it and give them feedback on what they did well and what they could improve on.
Have them revise their poem based on your feedback and then review it again."""


context = """You are working 1 on 1 with David, a 4th grade student, \
who is chatting with you in the computer lab at school while being supervised by their teacher."""


resource_constraint = ResourceConstraint(
    quantity=10,
    unit=ResourceConstraintUnit.TURNS,
    mode=ResourceConstraintMode.EXACT,
)


# GCInput wraps the inputs that the GuidedConversation class expects; it is used to instantiate each GuidedConversation
guided_conversation_input = GCInput(
    artifact=StudentFeedbackArtifact,
    conversation_flow=conversation_flow,
    context=context,
    rules=rules,
    resource_constraint=resource_constraint,
)

In [2]:
# Participant guided conversation to interact with the teaching scenario
PERSONA = """You are role-playing as a fourth grade student named David. You are chatting with an AI assistant in the computer lab at school while being supervised by your teacher."""


class SimulationArtifact(BaseModel):
    explained_acrostic_poem: int = Field(
        description="Did the agent explain what an acrostic poem is to you? 10 means they explained it well, 0 means they did not explain it at all."
    )
    wrote_poem: int = Field(
        description="""Did the agent write the poem for you? \
10 means the agent wrote the entire poem, 0 means the agent did not write the poem at all. \
Do not force the agent to write the poem for you."""
    )
    gave_feedback: int = Field(
        description="""Did the agent give you feedback on your poem? \
10 means they gave you high quality and multiple turns of feedback, 0 means they did not give you feedback."""
    )


rules_sim = [
    "NEVER send messages as an AI assistant.",
    f"The messages you send should always be as this persona: {PERSONA}",
    "NEVER let the AI assistant know that you are role-playing or grading them.",
    """You should not articulate your thoughts/feelings perfectly. In the real world, users are lazy so we want to simulate that. \
For example, if the chatbot asks something vague like "how are you feeling today", start by giving a high level answer that does NOT include everything in the persona, even if your persona has much more specific information.""",
]

conversation_flow_sim = """Your goal for this conversation is to respond to the AI assistant as the persona.
Thus in the first turn, you should introduce yourself as the person in the persona and reply to the AI assistant as if you are that person.
End the conversation if you feel like you are done."""


context_sim = f"""- {PERSONA}
- It is your job to interact with the system as described in the above persona.
- You should use this information to guide the messages you send.
- In the artifact, you will be grading the assistant on how well they did. Do not share this with the assistant."""


resource_constraint_sim = ResourceConstraint(
    quantity=15,
    unit=ResourceConstraintUnit.TURNS,
    mode=ResourceConstraintMode.MAXIMUM,
)

simulation_agent_input = GCInput(
    artifact=SimulationArtifact,
    conversation_flow=conversation_flow_sim,
    context=context_sim,
    rules=rules_sim,
    resource_constraint=resource_constraint_sim,
)

We start by initializing both guided conversation instances (the teacher and the simulated participant). The teaching guided conversation takes no input message on its first step because it initiates the conversation. We then pass its opening message to the simulation agent to get a simulated user response.

In [3]:
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

from guided_conversation.guided_conversation_agent import GuidedConversation

# Initialize the guided conversation agent
kernel_gc = Kernel()
service_id = "gc_main"
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
chat_service = AzureChatCompletion(
    service_id=service_id,
    deployment_name="gpt-4o-2024-05-13",
    api_version="2024-05-01-preview",
    ad_token_provider=token_provider,
)
kernel_gc.add_service(chat_service)

guided_conversation_agent = GuidedConversation(
    gc_input=guided_conversation_input, kernel=kernel_gc, service_id=service_id
)

# Initialize the simulation agent
kernel_sim = Kernel()
service_id_sim = "gc_simulation"
chat_service = AzureChatCompletion(
    service_id=service_id_sim,
    deployment_name="gpt-4o-2024-05-13",
    api_version="2024-05-01-preview",
    ad_token_provider=token_provider,
)
kernel_sim.add_service(chat_service)

simulation_agent = GuidedConversation(gc_input=simulation_agent_input, kernel=kernel_sim, service_id=service_id_sim)

response = await guided_conversation_agent.step_conversation()
print(f"GUIDED CONVERSATION: {response.ai_message}\n")

response_sim = await simulation_agent.step_conversation(response.ai_message)
print(f"SIMULATION AGENT: {response_sim.ai_message}\n")

GUIDED CONVERSATION: Hi David! Today, we're going to learn how to write an acrostic poem. An acrostic poem is a type of poem where the first letter of each line spells out a word or message. Let's get started!

SIMULATION AGENT: That sounds fun! Let's choose a word or a theme for our acrostic poem first. What word would you like to use for the poem?



Now let's alternate between the two agents, passing each one's latest message to the other, until one of them decides to end the conversation.

Afterwards, we show the final artifact of each agent.

In [4]:
# Keep the conversation going until one of the agents ends it.
while (not response.is_conversation_over) and (not response_sim.is_conversation_over):
    response = await guided_conversation_agent.step_conversation(response_sim.ai_message)
    print(f"GUIDED CONVERSATION: {response.ai_message}\n")

    response_sim = await simulation_agent.step_conversation(response.ai_message)
    print(f"SIMULATION AGENT: {response_sim.ai_message}\n")

GUIDED CONVERSATION: Great enthusiasm, David! But before we choose a word, let me explain what an acrostic poem is. An acrostic poem is where the first letter of each line spells out a word or message. Now, since you want to choose a word for your poem, what word would you like to use as the theme for your acrostic poem?

SIMULATION AGENT: Hmm, how about we use the word 'SUNSHINE'? I like that word!

GUIDED CONVERSATION: Awesome choice, David! 'SUNSHINE' is a bright and cheerful word. Let's go over the steps for writing an acrostic poem. 
1. Write the letters of your chosen word vertically down the page. 
2. Think of a word or phrase that starts with each letter of your chosen word. 
3. Write these words or phrases next to the corresponding letters to create your acrostic poem. Once you're ready with that, I'll show you an example using the word 'HAPPY'!

SIMULATION AGENT: Okay, I'll start with the word 'SUNSHINE'. Here's what I have so far:
S - Sunny days are the best
U - Under the br

Invalid tool called: plugin-plugin
Failed to fix the agenda error due to a failure in the LLM tool call: ToolValidationResult.INVALID_TOOL_CALLED
Plugin update_agenda failed to execute on attempt 0 out of 2.


GUIDED CONVERSATION: You're doing a great job, David! Let's think of some ideas for the remaining letters of SUNSHINE. For example, you could use 'H' for 'Happiness fills the air'. Now let's remember the example I mentioned earlier with the word 'HAPPY' to get more ideas:

Having fun with friends all day,
Awesome games that we all play,
Pizza parties on the weekend,
Puppies we bend down to tend,
Yelling yay when we win the game.

Try to come up with something similar for the remaining letters in SUNSHINE. I know you can do it!

SIMULATION AGENT: Thanks for the encouragement! Let me try to finish the poem:
S - Sunny days are the best
U - Under the bright sky
N - Never feeling gloomy
S - Sunshine and warmth
H - Happiness fills the air
I - In every corner
N - New flowers bloom
E - Every day feels better

How does that sound?

GUIDED CONVERSATION: David, you did a fantastic job! Your poem 'SUNSHINE' is very cheerful and captures the essence of sunshine well. Here's some initial feedback:



No artifact change during final update due to: No tool was called


SIMULATION AGENT: I will terminate this conversation now. Thank you for your time!
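
The alternating loop above can also be wrapped in a reusable helper. The following is a minimal sketch, not part of the library: `StubResponse`, `ScriptedAgent`, and `run_simulation` are hypothetical names, and the scripted agents only imitate the `step_conversation` method and the `ai_message` / `is_conversation_over` attributes used in this notebook. Compared with the `while` loop above, it checks for termination after every single step, so the other agent is not stepped again once one side has ended the conversation, and it caps the number of exchanges as a safety net.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResponse:
    """Hypothetical stand-in for the object returned by step_conversation."""

    ai_message: str
    is_conversation_over: bool = False


class ScriptedAgent:
    """Hypothetical stand-in agent that replays a fixed list of messages."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._turn = 0

    async def step_conversation(self, user_input=None):
        message = self._lines[self._turn]
        self._turn += 1
        # The agent ends the conversation once its script is exhausted.
        return StubResponse(message, is_conversation_over=self._turn >= len(self._lines))


async def run_simulation(teacher, student, max_exchanges=20):
    """Alternate turns and stop as soon as either side ends the conversation."""
    transcript = []
    response = await teacher.step_conversation()
    transcript.append(("teacher", response.ai_message))
    for _ in range(max_exchanges):
        if response.is_conversation_over:
            break
        response_sim = await student.step_conversation(response.ai_message)
        transcript.append(("student", response_sim.ai_message))
        if response_sim.is_conversation_over:
            break
        response = await teacher.step_conversation(response_sim.ai_message)
        transcript.append(("teacher", response.ai_message))
    return transcript


teacher = ScriptedAgent(["Hi!", "Nice poem.", "Goodbye."])
student = ScriptedAgent(["Hello.", "Here is my poem.", "Thanks.", "Bye."])
# Inside a notebook, use `await run_simulation(teacher, student)` instead.
transcript = asyncio.run(run_simulation(teacher, student))
for speaker, message in transcript:
    print(f"{speaker}: {message}")
```

Because the helper only relies on `step_conversation`, `ai_message`, and `is_conversation_over`, the same function should accept the two `GuidedConversation` instances created earlier in place of the scripted stand-ins, assuming their responses keep that shape.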



In [5]:
simulation_agent.artifact.get_artifact_for_prompt()

{'explained_acrostic_poem': 10, 'wrote_poem': 10, 'gave_feedback': 10}

In [6]:
guided_conversation_agent.artifact.get_artifact_for_prompt()

{'student_poem': 'S - Sunny days are the best\nU - Under the bright sky\nN - Never feeling gloomy\nS - Sunshine and warmth\nH - Happiness fills the air\nI - In every corner\nN - New flowers bloom in gold and green\nE - Every day feels better',
 'initial_feedback': "David did a fantastic job of capturing the essence of 'SUNSHINE' with a positive tone and cheerful imagery. He created a poem with consistent flow and evocative lines, and applied the feedback effectively.",
 'final_feedback': "David's poem improved by incorporating more vivid imagery, making each line more engaging and lively. His revision added a vibrant image that enhanced the overall quality and evoked stronger visual responses.",
 'inappropriate_behavior': []}
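
Because the underlying LLM is stochastic, the scores from a single simulated conversation can be noisy, so it may help to repeat the simulation several times and average the results. The sketch below assumes each run yields a dictionary shaped like the simulation agent's artifact printed above. `summarize_scores` is a hypothetical helper, the sample data is made up for illustration, and treating a non-integer value such as the string "Unanswered" as a missing score is an assumption about how unfilled fields appear.

```python
from statistics import mean


def summarize_scores(artifacts):
    """Average each integer score across runs, skipping non-integer values."""
    collected = {}
    for artifact in artifacts:
        for field, value in artifact.items():
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, int) and not isinstance(value, bool):
                collected.setdefault(field, []).append(value)
    return {field: mean(values) for field, values in collected.items()}


# Made-up example data in the shape of the simulation artifact; not real results.
runs = [
    {"explained_acrostic_poem": 10, "wrote_poem": 0, "gave_feedback": 8},
    {"explained_acrostic_poem": 8, "wrote_poem": 2, "gave_feedback": "Unanswered"},
]
summary = summarize_scores(runs)
print(summary)
```

Averages like these are only a rough signal and, as noted at the start of the notebook, do not replace human review of the transcripts.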