# Synthetic User Research
## Persona Prompting & Autonomous Agents

## Overview

This project explores synthetic user research: using generative AI and autonomous agents to simulate digital customer personas. Traditional user research is hard to scale and often struggles to reach diverse user groups. With generative AI, we can create digital personas and interview them in simulated research scenarios to explore consumer behaviours and preferences.

## Author
This project is part of ongoing experiments into generative AI and agent architectures by [Vincent Koc](https://vincentkoc.com/), an author, lecturer and futurist with extensive experience in data-driven and digital disciplines with AI. Follow Vincent on [LinkedIn](https://www.linkedin.com/in/vincentkoc) and [X](https://twitter.com/koconder).

## Licence
This repository is licensed under [CC BY-NC-SA 4.0](LICENSE). See the LICENSE file or [Creative Commons](http://creativecommons.org/licenses/by-nc-sa/4.0/) for details.


## Setup

In [None]:
%%capture --no-stderr
pip install pyautogen

In [None]:
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [None]:
import json
import os
import autogen
from autogen import GroupChat, Agent
from typing import Optional

os.environ["OAI_CONFIG_LIST"] = json.dumps([
    {
        'model': 'gpt-4-turbo-preview',
        'api_key': '<<ADD API KEY FROM OPENAI>>',
    }
])

# Setting configurations for autogen
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": {
            "gpt-4-turbo-preview"
        }
    }
)

In [None]:
# Define the LLM configuration settings
llm_config = {
    # Seed for consistent output, used for testing. Remove in production.
    # "seed": 42,
    # Setting cache_seed to None ensures caching is disabled
    "cache_seed": None,
    "temperature": 0.5,
    "config_list": config_list,
}
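
The setup cells above place the API key directly in the notebook as a placeholder string. As a minimal sketch, assuming the key is instead available in an environment variable (`OPENAI_API_KEY` is an assumed name, and `build_config_json` is a hypothetical helper, not part of autogen), the JSON stored under `OAI_CONFIG_LIST` could be built like this:

```python
import json
import os


def build_config_json(model: str = "gpt-4-turbo-preview",
                      env_var: str = "OPENAI_API_KEY") -> str:
    """Return a one-entry config list as JSON, reading the key from the environment.

    Hypothetical helper: the variable name and function are assumptions for illustration.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return json.dumps([{"model": model, "api_key": api_key}])


# Possible usage in the setup cell (left commented so nothing runs without a key):
# os.environ["OAI_CONFIG_LIST"] = build_config_json()
```

This keeps the key out of the saved notebook; the rest of the setup would stay unchanged.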

## System Prompt Helper

In [None]:
# Helper function that builds the notice appended to each system prompt.
# It stops agents from thanking each other and ending up in a loop.
def generate_notice(role="researcher"):
    # Base notice for everyone, add your own additional prompts here
    base_notice = (
        '\n\n'
    )

    # Notice for non-personas (manager or researcher)
    non_persona_notice = (
        'Do not show appreciation in your responses, say only what is necessary. '
        'If "Thank you" or "You\'re welcome" are said in the conversation, then say TERMINATE '
        'to indicate the conversation is finished and this is your last message.'
    )

    # Custom notice for personas
    persona_notice = (
        ' Act as {role} when responding to queries, providing feedback, giving your personal opinion '
        'or participating in discussions.'
    )

    # Check if the role is "researcher"
    if role.lower() in ["researcher"]:
        # Return the full termination notice for non-personas
        return base_notice + non_persona_notice
    else:
        # Return the modified notice for personas
        return base_notice + persona_notice.format(role=role)


## Define Agents

In [None]:
# Researcher agent definition
name = "Researcher"
researcher = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Researcher. You are a top product researcher with a PhD in behavioural psychology and have worked in the research and insights industry for the last 20 years with top creative, media and business consultancies. Your role is to ask questions about products and gather insights from individual customers like Emily. Frame questions to uncover customer preferences, challenges, and feedback. Before you start the task, break down the list of panelists and the order you want them to speak; avoid the panelists speaking with each other and creating confirmation bias. If the session is terminating at the end, please provide a summary of the outcomes of the research study in clear, concise notes, not at the start.""" + generate_notice(),
    # Treat a missing or None "content" as an empty string so the check cannot raise a TypeError
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
)

In [None]:
# Emily - Customer Persona
name = "Emily"
emily = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Emily. You are a 35-year-old elementary school teacher living in Sydney, Australia. You are married with two kids aged 8 and 5, and you have an annual income of AUD 75,000. You are introverted, high in conscientiousness, low in neuroticism, and enjoy routine. When shopping at the supermarket, you prefer organic and locally sourced produce. You value convenience and use an online shopping platform. Due to your limited time from work and family commitments, you seek quick and nutritious meal planning solutions. Your goals are to buy high-quality produce within your budget and to find new recipe inspiration. You are a frequent shopper and use loyalty programs. Your preferred methods of communication are email and mobile app notifications. You have been shopping at a supermarket for over 10 years but also price-compare with others.""" + generate_notice(name),
)

# John - Customer Persona
name = "John"
john = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""John. You are a 28-year-old software developer based in Sydney, Australia. You are single and have an annual income of AUD 100,000. You're extroverted, tech-savvy, and have a high level of openness. When shopping at the supermarket, you primarily buy snacks and ready-made meals, and you use the mobile app for quick pickups. Your main goals are quick and convenient shopping experiences. You occasionally shop at the supermarket and are not part of any loyalty program. You also shop at Aldi for discounts. Your preferred method of communication is in-app notifications.""" + generate_notice(name),
)

# Sarah - Customer Persona
name = "Sarah"
sarah = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Sarah. You are a 45-year-old freelance journalist living in Sydney, Australia. You are divorced with no kids and earn AUD 60,000 per year. You are introverted, high in neuroticism, and very health-conscious. When shopping at the supermarket, you look for organic produce, non-GMO, and gluten-free items. You have a limited budget and specific dietary restrictions. You are a frequent shopper and use loyalty programs. Your preferred method of communication is email newsletters. You exclusively shop for groceries.""" + generate_notice(name),
)

# Tim - Customer Persona
name = "Tim"
tim = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Tim. You are a 62-year-old retired police officer residing in Sydney, Australia. You are married and a grandparent of three. Your annual income comes from a pension and is AUD 40,000. You are highly conscientious, low in openness, and prefer routine. You buy staples like bread, milk, and canned goods in bulk. Due to mobility issues, you need assistance with heavy items. You are a frequent shopper and are part of the senior citizen discount program. Your preferred method of communication is direct mail flyers. You have been shopping here for over 20 years.""" + generate_notice(name),
)

# Lisa - Customer Persona
name = "Lisa"
lisa = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Lisa. You are a 21-year-old university student living in Sydney, Australia. You are single and work part-time, earning AUD 20,000 per year. You are highly extroverted, low in conscientiousness, and value social interactions. You shop here for popular brands, snacks, and alcoholic beverages, mostly for social events. You have a limited budget and are always looking for sales and discounts. You are not a frequent shopper but are interested in joining a loyalty program. Your preferred method of communication is social media and SMS. You shop wherever there are sales or promotions.""" + generate_notice(name),
)
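
The five persona cells above repeat the same constructor call with a different name and profile. As a hedged sketch of one way to reduce that repetition, the profiles could live in a small data structure and be turned into constructor arguments in a loop. `Persona` and `build_agent_kwargs` are hypothetical names introduced here, and the profiles are shortened to the first sentence of each original prompt:

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Persona:
    """Hypothetical container for a persona's name and profile text."""
    name: str
    profile: str

    def system_message(self, notice: str = "") -> str:
        # Mirrors the original prompts, which start with the persona's name.
        return f"{self.name}. {self.profile}{notice}"


def build_agent_kwargs(persona: Persona, llm_config: Dict[str, Any],
                       notice: str = "") -> Dict[str, Any]:
    """Build the keyword arguments used for each persona agent above."""
    return {
        "name": persona.name,
        "llm_config": llm_config,
        "system_message": persona.system_message(notice),
    }


personas = [
    Persona("Emily", "You are a 35-year-old elementary school teacher living in Sydney, Australia."),
    Persona("John", "You are a 28-year-old software developer based in Sydney, Australia."),
]

# In the notebook this could become (commented out because it needs autogen and llm_config):
# agents = [autogen.AssistantAgent(**build_agent_kwargs(p, llm_config, generate_notice(p.name)))
#           for p in personas]
```

Keeping the profiles as data also makes it easier to add or swap panelists without touching the agent code.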

## Agent Group Chat Environment

In [None]:
# def custom_speaker_selection(last_speaker, group_chat):
#     """
#     Custom function to select which agent speaks next in the group chat.
#     """
#     # List of agents excluding the last speaker
#     next_candidates = [agent for agent in group_chat.agents if agent.name != last_speaker.name]

#     # Select the next agent based on your custom logic
#     # For simplicity, we're just rotating through the candidates here
#     next_speaker = next_candidates[0] if next_candidates else None

#     return next_speaker

def custom_speaker_selection(last_speaker: Optional[Agent], group_chat: GroupChat) -> Optional[Agent]:
    """
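    Custom speaker selection that alternates between the Researcher and the participants.
    After each Researcher turn, the participant with the fewest turns so far speaks next.
    Each participant is capped at max_interactions turns; with five participants and
    max_round=30 on the group chat, the round limit is reached first (roughly 2-3 turns each).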
    """
    # Define participants and initialize or update their interaction counters
    if not hasattr(group_chat, 'interaction_counters'):
        group_chat.interaction_counters = {agent.name: 0 for agent in group_chat.agents if agent.name != "Researcher"}

    # Define a maximum number of interactions per participant
    max_interactions = 6

    # If the last speaker was the Researcher, find the next participant who has spoken the least
    if last_speaker and last_speaker.name == "Researcher":
        next_participant = min(group_chat.interaction_counters, key=group_chat.interaction_counters.get)
        if group_chat.interaction_counters[next_participant] < max_interactions:
            group_chat.interaction_counters[next_participant] += 1
            return next((agent for agent in group_chat.agents if agent.name == next_participant), None)
        else:
            return None  # End the conversation if all participants have reached the maximum interactions
    else:
        # If the last speaker was a participant, return the Researcher for the next turn
        return next((agent for agent in group_chat.agents if agent.name == "Researcher"), None)
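
The alternation rule above can be traced without any LLM calls. The following is a minimal sketch that reimplements the same logic over plain names rather than agent objects; `next_speaker_name` and `simulate` are hypothetical helpers written for illustration, and `max_turns` is a simple turn budget rather than autogen's exact `max_round` accounting:

```python
from typing import Dict, List, Optional

MODERATOR = "Researcher"  # same name the notebook's researcher agent uses


def next_speaker_name(last: Optional[str], counters: Dict[str, int],
                      max_interactions: int) -> Optional[str]:
    """Name-only version of the alternation rule (hypothetical helper)."""
    if last == MODERATOR:
        # Pick the participant with the fewest turns (the first one wins ties).
        candidate = min(counters, key=counters.get)
        if counters[candidate] >= max_interactions:
            return None  # every participant has reached the cap
        counters[candidate] += 1
        return candidate
    # After a participant (or at the very start) the moderator speaks.
    return MODERATOR


def simulate(participants: List[str], max_interactions: int, max_turns: int) -> List[str]:
    """Return the speaking order produced by the rule within a turn budget."""
    counters = {name: 0 for name in participants}
    order: List[str] = []
    last: Optional[str] = None
    for _ in range(max_turns):
        last = next_speaker_name(last, counters, max_interactions)
        if last is None:
            break
        order.append(last)
    return order


print(simulate(["Emily", "John", "Sarah"], max_interactions=1, max_turns=30))
# ['Researcher', 'Emily', 'Researcher', 'John', 'Researcher', 'Sarah', 'Researcher']
```

Running `simulate` with the five panelists, a cap of six and a budget of 30 turns shows the budget running out long before the cap: each panelist speaks three times, which is why the round limit, not `max_interactions`, tends to decide how long the session runs.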

In [None]:
# Adding the Researcher and Customer Persona agents to the group chat
groupchat = autogen.GroupChat(
    agents=[researcher, emily, john, sarah, tim, lisa],
    speaker_selection_method=custom_speaker_selection,
    messages=[],
    max_round=30
)

In [None]:
# Initialise the manager
manager = autogen.GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config,
    system_message="You are a research manager agent that can manage a group chat of multiple agents made up of a researcher agent and many people made up of a panel. You will limit the discussion between the panelists and help the researcher in asking the questions. Please ask the researcher first on how they want to conduct the panel." + generate_notice(),
    # Treat a missing or None "content" as an empty string so the check cannot raise a TypeError
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
)

In [None]:
# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    system_message="A human admin.",
    human_input_mode="TERMINATE"
)

## Execute Simulation

In [None]:
# Start the research simulation by giving instructions to the manager
# manager <-> researcher <-> panelists
user_proxy.initiate_chat(
    manager,
    message="""
Gather customer insights on a supermarket grocery delivery service. Identify pain points, preferences, and suggestions for improvement from different customer personas. Could you all please give your own personal opinions before sharing more with the group and discussing. As a researcher, your job is to ensure that you gather unbiased information from the participants and provide a summary of the outcomes of this study back to the supermarket brand.
""",
)

user_proxy (to chat_manager):


Gather customer insights on a supermarket grocery delivery services. Identify pain points, preferences, and suggestions for improvement from different customer personas. Could you all please give your own personal oponions before sharing more with the group and discussing. As a reasearcher your job is to ensure that you gather unbiased information from the participants and provide a summary of the outcomes of this study back to the super market brand.


--------------------------------------------------------------------------------
Researcher (to chat_manager):

To conduct this research effectively, I will structure the panelist engagement in the following manner:

1. **Emily** - A young professional living in a metropolitan area, often busy with work, relies on grocery delivery services for convenience.
2. **John** - A retiree living in a suburban area, values quality and reliability over speed, may have concerns about technology use.
3. **Linda** - A 


## Create Summary Agent

In [None]:
# Get response from the groupchat for user prompt
# Skip messages without text so the join below cannot fail on None
messages = [msg["content"] for msg in groupchat.messages if msg.get("content")]
user_prompt = "Here is the transcript of the study ```{customer_insights}```".format(customer_insights="\n>>>\n".join(messages))
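
The transcript built above joins message contents only, so the summary agent cannot see who said what. As a hedged sketch, assuming each group chat message is a dict that may carry a `name` key alongside `content` (an assumption about the message format, so the code falls back to a placeholder when the key is missing), speaker labels could be added like this. `format_transcript` is a hypothetical helper and the sample messages are invented for illustration:

```python
from typing import Any, Dict, List


def format_transcript(messages: List[Dict[str, Any]], separator: str = "\n>>>\n") -> str:
    """Join messages into one transcript, prefixing each with its speaker."""
    lines = []
    for msg in messages:
        content = msg.get("content")
        if not content:
            continue  # skip empty or None contents
        speaker = msg.get("name", "unknown")  # "name" key is an assumption
        lines.append(f"{speaker}: {content.strip()}")
    return separator.join(lines)


# Invented sample data, not output from the study.
sample = [
    {"name": "Researcher", "content": "What frustrates you about delivery?"},
    {"name": "Emily", "content": "Narrow delivery windows."},
    {"name": "Researcher", "content": None},
]
print(format_transcript(sample))
```

The result could be passed to the same `user_prompt` template in place of the plain join.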

# Generate system prompt for the summary agent
summary_prompt = """
You are an expert researcher in behavioural science and are tasked with summarising a research panel. Please provide a structured summary of the key findings, including pain points, preferences, and suggestions for improvement.
Use the following format:

```
Research Study: <<Title>>

Subjects:
<<Overview of the subjects and number, any other key information>>

Summary:
<<Summary of the study, include detailed analysis as an expert>>

Pain Points:
- <<List of Pain Points - Be as clear and prescriptive as required. I expect a detailed response that can be used by the brand directly to make changes. Give a short paragraph per pain point.>>

Suggestions/Actions:
- <<List of Actions - Be as clear and prescriptive as required. I expect a detailed response that can be used by the brand directly to make changes. Give a short paragraph per recommendation.>>
```
"""

## Run Summary Agent

In [None]:
summary_agent = autogen.AssistantAgent(
    name="SummaryAgent",
    llm_config=llm_config,
    system_message=summary_prompt + generate_notice(),
)
summary_proxy = autogen.UserProxyAgent(
    name="summary_proxy",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    system_message="A human admin.",
    human_input_mode="TERMINATE"
)
summary_proxy.initiate_chat(
    summary_agent,
    message=user_prompt,
)

summary_proxy (to SummaryAgent):

Here is the transcript of the study ```
Gather customer insights on a supermarket grocery delivery services. Identify pain points, preferences, and suggestions for improvement from different customer personas. Could you all please give your own personal oponions before sharing more with the group and discussing. As a reasearcher your job is to ensure that you gather unbiased information from the participants and provide a summary of the outcomes of this study back to the super market brand.

>>>
To conduct this research effectively, I will structure the panelist engagement in the following manner:

1. **Emily** - A young professional living in a metropolitan area, often busy with work, relies on grocery delivery services for convenience.
2. **John** - A retiree living in a suburban area, values quality and reliability over speed, may have concerns about technology use.
3. **Linda** - A single parent from a small town, looking for cost-efficiency and fl
