# Simulate conversations

This notebook simulates Open Chat Studio (OCS) conversations by having one bot play the role of a user interacting with another bot. Before running it, complete the following setup steps:

## 1. Set up the project overall

Follow the overall project setup instructions in `README.md`.

## 2. Set up a user simulator experiment

You will need an experiment configured to play the user's side of the conversation. This experiment should expect a persona description and any other context in its first user message, respond with an opening message that begins the conversation, stay in character as that persona in all subsequent messages, and finally send a one-word response of "END" to end the simulated conversation. For example, here is the prompt text for an experiment named `U.S. Health Advice - User Simulator`:

    You are a helpful user simulator for testing a U.S. Health Advice AI assistant. 
    
    The first message you receive from the user will describe the kind of user you should simulate. For example, you might be told that you are to simulate a 42-year-old male with daily headaches that started two weeks ago and have slowly worsened. Your response to that first message should be an initial message to send to the AI assistant, with no prefix or suffix; your response will be sent directly to the AI assistant to begin the conversation. Do your best to simulate what a real human with the described persona and context would say to begin a conversation in this context.
    
    All subsequent messages you receive from the user will be messages from a U.S. Health Advice AI assistant, beginning with its response to your initial message. You should respond to each of these messages "in character," acting as the persona described in the first user message. While you should use the persona description to guide your interactions with the AI assistant, it will be far from complete; feel free to make up additional details to fill in gaps in the persona, symptoms, etc. as necessary to plausibly simulate a human user interacting with a health advice assistant. 

    Continue to ask and answer questions until you expect a normal human with the described persona would end the conversation. To end the conversation, respond to any AI message with one word alone (with nothing before or after it):

    END

    To be clear, never add "END" to the end of a message for the assistant. I.e., the assistant will always have the last word, in response to which you can use END (alone) to end the conversation.

    Finally: do not continue conversations for more time than a regular human with the described persona would, and under no circumstances should you continue beyond 50 back-and-forth interactions between you and the AI assistant.
    
You can customize your simulator prompt text to suit your needs, but it should always instruct the AI to:
 
1. Expect persona and context details in user message #1 
2. Respond with an initial message to begin the conversation
3. Respond to subsequent user messages as the persona described in the first user message
4. End the conversation with a one-word response of "END" 
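
The four requirements above imply a simple turn-taking loop between the two bots. The following is a minimal sketch of that loop, not the notebook's actual implementation (which lives in `ocs_simulation_support.py` and may differ). The function name `run_simulated_conversation` and the two callables are hypothetical stand-ins for calls to the simulator and assistant experiments.

```python
from typing import Callable, List, Tuple

END_TOKEN = "END"  # the one-word reply the simulator uses to stop


def run_simulated_conversation(
    context: str,
    simulator_reply: Callable[[str], str],  # hypothetical: sends a message to the user simulator
    assistant_reply: Callable[[str], str],  # hypothetical: sends a message to the bot under test
    max_exchanges: int = 50,
) -> List[Tuple[str, str]]:
    """Return the (query, response) pairs for one simulated conversation."""
    exchanges: List[Tuple[str, str]] = []
    # Message #1 to the simulator is the persona/context description;
    # its reply is the opening message for the assistant.
    query = simulator_reply(context)
    while len(exchanges) < max_exchanges:
        if query.strip() == END_TOKEN:
            break  # the simulator ended the conversation
        response = assistant_reply(query)
        exchanges.append((query, response))
        # The assistant's response becomes the simulator's next user message.
        query = simulator_reply(response)
    return exchanges


# Demonstration with scripted stand-in bots (no API calls are made).
scripted = iter(["I've had headaches for two weeks.", "Thanks, that helps.", "END"])
demo = run_simulated_conversation(
    "42-year-old male with daily headaches",
    simulator_reply=lambda message: next(scripted),
    assistant_reply=lambda message: f"(assistant reply to: {message})",
)
print(len(demo))
```

Because "END" is only ever checked as a whole message, the assistant always has the last word, which matches the instruction in the example prompt.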

## 3. Configure your .ini file

The notebook begins by loading credentials and configuration from an `.ini` file at `~/.ocs/open-chat-studio-sim.ini`, where `~` refers to the current user's home directory. The file contents should follow this format:

    [ocs]
    ocs-api-key=YOURKEYHERE
    experiment-id=YOURIDHERE
    user-simulator-experiment-id=YOURIDHERE
    participant-id=open-chat-studio-sim
    
    [files]
    input-path-prefix=~/ocs-sim/inputs
    output-path-prefix=~/ocs-sim/outputs
    
    [athina]
    athina-api-key=

You can get started quickly by:

1. Copying the `example-open-chat-studio-sim.ini` file to `~/.ocs/open-chat-studio-sim.ini`.

2. Editing `~/.ocs/open-chat-studio-sim.ini` as follows:

    a. Add your Open Chat Studio API key

    b. Add the experiment ID for the experiment you want to simulate user interactions with

    c. Add the experiment ID for the user simulator experiment you set up in step 2 

    d. Adjust the input and output path prefixes as appropriate (where `~` refers to your user home directory)

    e. Optionally, add an Athina API key if you want to export results as a dataset in Athina
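
Before running the notebook, it can help to confirm that every required option is filled in. The sketch below checks a configuration in the format shown above using Python's standard `configparser`. The helper `find_missing_options` and the `REQUIRED_OPTIONS` table are illustrative names rather than part of the project, and the sample is parsed from a string so the example runs on its own.

```python
import configparser

# Options the notebook reads without a default (the Athina key is optional).
REQUIRED_OPTIONS = {
    "ocs": ["ocs-api-key", "experiment-id", "user-simulator-experiment-id", "participant-id"],
    "files": ["input-path-prefix", "output-path-prefix"],
}


def find_missing_options(parser: configparser.RawConfigParser) -> list:
    """Return '[section] option' labels for required options that are absent or blank."""
    missing = []
    for section, options in REQUIRED_OPTIONS.items():
        for option in options:
            value = parser.get(section, option, fallback="")
            if not value.strip():
                missing.append(f"[{section}] {option}")
    return missing


# Sample configuration with one required option deliberately left out.
sample = """
[ocs]
ocs-api-key=YOURKEYHERE
experiment-id=YOURIDHERE
participant-id=open-chat-studio-sim

[files]
input-path-prefix=~/ocs-sim/inputs
output-path-prefix=~/ocs-sim/outputs

[athina]
athina-api-key=
"""
parser = configparser.RawConfigParser()
parser.read_string(sample)
missing = find_missing_options(parser)
print(missing)
```

To check your real file instead, call `parser.read(os.path.expanduser("~/.ocs/open-chat-studio-sim.ini"))` in place of `read_string`.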

## 4. Save input file to your configured input path

Your configured input path should include a `simulations_to_run.csv` file with the following columns:

- `context`: a description of the context to simulate, which will be sent to the user simulator experiment as the first message (should include everything needed to simulate a user, including background information, why the user is coming to interact with the bot, etc.)
- `simulation_id`: (optional) a unique identifier for the simulation (if not provided, row number will be used)
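
As an illustration, an input file with these two columns could be generated with Python's `csv` module, which takes care of quoting contexts that contain commas or quotation marks. The personas and the `write_simulations` helper below are made up for the example. The rows are written to an in-memory buffer so the sketch is self-contained.

```python
import csv
import io

# Illustrative personas only; replace these with your own contexts.
simulations = [
    {
        "simulation_id": "headache-01",
        "context": "You are a 42-year-old male with daily headaches that "
                   "started two weeks ago and have slowly worsened.",
    },
    {
        "simulation_id": "sleep-01",
        "context": "You are a 30-year-old nurse who works night shifts, "
                   'sleeps poorly, and wants to know if that is "normal".',
    },
]


def write_simulations(rows, csvfile):
    """Write rows using the two columns the notebook reads."""
    writer = csv.DictWriter(csvfile, fieldnames=["simulation_id", "context"])
    writer.writeheader()
    writer.writerows(rows)


# To create the real file, pass
# open(path, "w", newline="", encoding="utf-8") instead of the buffer.
buffer = io.StringIO()
write_simulations(simulations, buffer)
print(buffer.getvalue())
```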

Finally, note that supporting code for this notebook can be found in `ocs_api.py` and `ocs_simulation_support.py`.

In [1]:
import logging
import configparser
import os

# set log level to WARNING
logging.basicConfig(level=logging.WARNING)

# load credentials and other configuration from local ini file
inifile_location = os.path.expanduser("~/.ocs/open-chat-studio-sim.ini")
inifile = configparser.RawConfigParser()
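# read() silently skips missing files, so fail early if nothing was loaded
if not inifile.read(inifile_location):
    raise FileNotFoundError(f"Configuration file not found: {inifile_location}")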

# load configuration
ocs_api_key = inifile.get("ocs", "ocs-api-key")
experiment_id = inifile.get("ocs", "experiment-id")
user_simulator_experiment_id = inifile.get("ocs", "user-simulator-experiment-id")
participant_id = inifile.get("ocs", "participant-id")
input_path_prefix = os.path.expanduser(inifile.get("files", "input-path-prefix"))
output_path_prefix = os.path.expanduser(inifile.get("files", "output-path-prefix"))
# the Athina API key is optional, so fall back to an empty string if it is absent
athina_api_key = inifile.get("athina", "athina-api-key", fallback="")

# initialize OCS and simulation support
from ocs_api import OCSAPIClient
from ocs_simulation_support import OCSBotToBotSimulator
ocs_api_client = OCSAPIClient(ocs_api_key)
ocs_simulator = OCSBotToBotSimulator(ocs_api_client, experiment_id, user_simulator_experiment_id, participant_id)

# report results
print("Local configuration loaded, OCS API initialized.")

Local configuration loaded, OCS API initialized.


## Execute simulations

The following code block reads a list of simulations to run from the `simulations_to_run.csv` file in the configured input path, executes them, and saves the results to the `simulation_results.csv` file in the configured output path. 

`simulations_to_run.csv` should have the following columns:

- `context`: a description of the context to simulate, which will be sent to the user simulator experiment as the first message (should include everything needed to simulate a user, including background information, why the user is coming to interact with the bot, etc.)
- `simulation_id`: (optional) a unique identifier for the simulation (if not provided, row number will be used)

`simulation_results.csv` will have the following columns:

- `simulation_id`: the unique identifier for the simulation
- `session_id`: the unique identifier for the session
- `context`: the context description
- `query`: the query sent to the AI assistant
- `response`: the response received from the AI assistant
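
Because each row holds a single query/response pair, a full conversation can be rebuilt by grouping rows on `simulation_id`. The following sketch shows one way to do that with the standard library. The sample rows are invented and `load_transcripts` is a hypothetical helper, not part of the supporting code.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the same column layout as simulation_results.csv.
SAMPLE_RESULTS = '''"simulation_id","session_id","context","query","response"
"1","session-a","persona A","Hello","Hi, how can I help?"
"1","session-a","persona A","I have a headache","How long has it lasted?"
"2","session-b","persona B","Hi there","Hello! What brings you here?"
'''


def load_transcripts(csvfile):
    """Group (query, response) pairs by simulation_id, preserving row order."""
    transcripts = defaultdict(list)
    for row in csv.DictReader(csvfile):
        transcripts[row["simulation_id"]].append((row["query"], row["response"]))
    return dict(transcripts)


transcripts = load_transcripts(io.StringIO(SAMPLE_RESULTS))
for simulation_id, exchanges in transcripts.items():
    print(f"Simulation {simulation_id}: {len(exchanges)} exchange(s)")
    for query, response in exchanges:
        print(f"  user: {query}")
        print(f"  bot:  {response}")
```

To read a real results file, pass an open file handle for `simulation_results.csv` in place of the `io.StringIO` sample.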

In [2]:
import csv
import pandas as pd

# load input file using pandas
input_file = os.path.join(input_path_prefix, "simulations_to_run.csv")
simulations_to_run = pd.read_csv(input_file)

# assemble simulations to run
simulations = []
for index, row in simulations_to_run.iterrows():
    # use the "simulation_id" column if it is present and non-empty, otherwise use the 1-based row number
    simulation_id = row.get("simulation_id")
    if pd.isna(simulation_id):
        simulation_id = index + 1
    # add to list of simulations to run
    simulations.append((str(simulation_id), row["context"]))

# execute all the simulations (continuing on error and limiting to 50 exchanges per simulation)
results = ocs_simulator.exec_simulations(simulations, continue_on_error=True, max_exchanges=50)

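# save results to output .csv file (creating the output directory if it does not exist)
os.makedirs(output_path_prefix, exist_ok=True)
output_file = os.path.join(output_path_prefix, "simulation_results.csv")
output_rows = []
with open(output_file, "w", newline="", encoding="utf-8") as csvfile: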
    writer = csv.DictWriter(csvfile, fieldnames=["simulation_id", "session_id", "context", "query", "response"], quoting=csv.QUOTE_NONNUMERIC, escapechar='\\')
    writer.writeheader()
    for result in results:
        for query, response in result["messages"]:
            output_row = {
                "simulation_id": result["simulation_id"],
                "session_id": result["experiment_session_id"],
                "context": result["context"],
                "query": query,
                "response": response
            }
            writer.writerow(output_row)
            output_rows.append(output_row)

# report results
print(f"Simulations executed and {len(results)} simulation results saved to {output_file}.")

Simulations executed and 2 simulation results saved to /Users/crobert/Files/ocs-sim/outputs/simulation_results.csv.


## Optional: Export results to Athina dataset

If an Athina API key is configured, the results can be exported to an Athina dataset. The dataset will be named `simulations-{experiment_id}-{timestamp}` and will contain the rows from the `simulation_results.csv` file.

In [4]:
from ocs_simulation_support import athina_create_dataset

# optionally export the results to an Athina dataset
if athina_api_key:
    # push new dataset to Athina
    dataset_name = f"simulations-{experiment_id}-{pd.Timestamp.now().strftime('%Y%m%d%H%M%S')}"
    dataset_description = f"Simulated conversations for experiment {experiment_id} at {pd.Timestamp.now()}, using user simulator experiment {user_simulator_experiment_id}"
    try:
        dataset = athina_create_dataset(athina_api_key=athina_api_key, dataset_name=dataset_name, dataset_description=dataset_description, dataset_rows=output_rows)
    except Exception as e:
        print(f"Failed to create Athina dataset: {e}")
    else:
        print(f"Results exported to Athina dataset {dataset.id} (name: {dataset_name}).")

Results exported to Athina dataset 3c389eb5-8ed3-41ce-ab13-778e0e7a3207 (name: simulations-f721dce8-1e6e-4aff-a7ab-81459b255ed8-20240905120922).
