# Cheatsheet: Scenarios
This notebook provides quick examples of methods for using `Scenario` objects to add data or other content to your [EDSL](https://docs.expectedparrot.com/) survey questions. Scenarios let you administer many versions of a question at once. This is useful for experiments and for labeling or exploration tasks where you ask the same questions about many different things, such as every record in a dataset or each item in a collection of texts or other content.

Below we show how to do each of the following:

* Inspect an example scenario
* Use a scenario in a question
* Create scenarios
* Combine scenarios
* Replicate scenarios
* Rename scenario keys
* Sample scenarios
* Select and drop scenarios
* Slice/chunk text as scenarios
* Turn PDFs into scenarios
* Turn images into scenarios
* Add metadata to scenarios

<i>[EDSL](https://github.com/expectedparrot/edsl) is an open-source Python library for simulating surveys, experiments and other research with AI agents and large language models. Please see our [documentation page](https://docs.expectedparrot.com/) for information and tutorials on getting started, and more details on [methods for working with scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html) that are shown here.</i>

## Importing the tools
We start by importing the relevant tools (see [installation instructions](https://docs.expectedparrot.com/en/latest/installation.html)):

In [1]:
# ! pip install edsl

In [2]:
from edsl import Scenario, ScenarioList

## Inspecting an example
A `Scenario` contains a dictionary of keys and values representing data or content to be added to (inserted in) the `question_text` field of a `Question` object (see [examples of all question types](https://docs.expectedparrot.com/en/latest/questions.html)). We can call the `example()` method to inspect an example scenario:

In [3]:
example_scenario = Scenario.example()
example_scenario

We can also see an example `ScenarioList`, which is a dictionary containing a list of scenarios:

In [4]:
example_scenariolist = ScenarioList.example()
example_scenariolist

## Using a Scenario
To use a scenario, we create a `Question` with a `{{ placeholder }}` in the `question_text` matching the scenario key. Then we call the `by()` method to add the scenario to the individual question or `Survey` (a [collection of questions](https://docs.expectedparrot.com/en/latest/surveys.html)) when we run it:

In [5]:
# Import question types
from edsl.questions import QuestionFreeText, QuestionList
from edsl import Survey

# Create questions in the relevant templates with placeholders
q1 = QuestionFreeText(
    question_name = "background",
    question_text = "Draft a sample bio for this researcher: {{ persona }}"
)
q2 = QuestionList(
    question_name = "interests",
    question_text = "Identify some potential interests of this researcher: {{ persona }}"
)

# Combine questions into a survey to administer them together
survey = Survey(questions = [q1, q2])

# Run the survey with the scenarios to generate a dataset of results
results = survey.by(example_scenario).run()

In [6]:
# Print a table of selected components of the results
results.select("persona", "background", "interests").print(format="rich")

Note that the `by()` method can take an individual `Scenario` or a list of scenarios (examples below). Learn more about how to [construct surveys](https://docs.expectedparrot.com/en/latest/surveys.html) and [analyze results](https://docs.expectedparrot.com/en/latest/results.html).
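To make the placeholder mechanism concrete, here is a minimal plain-Python sketch of the idea: each `{{ key }}` in a question text is replaced by the value stored under that key in a scenario. The function name `render` and the regular expression are illustrative assumptions, not EDSL's implementation, and plain dictionaries stand in for `Scenario` objects.

```python
import re

def render(question_text, scenario):
    """Replace each {{ key }} placeholder with the matching scenario value."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown placeholders untouched so missing keys are easy to spot
        return str(scenario.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, question_text)

# Plain dictionaries stand in for Scenario objects in this sketch
scenarios = [{"weather": "sunny"}, {"weather": "rainy"}]
template = "How do you feel about {{ weather }} days?"

# One rendered question per scenario
rendered = [render(template, s) for s in scenarios]
print(rendered)
```

Each scenario yields its own version of the question, which is why a single question combined with several scenarios produces several results.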

## Creating a Scenario
We create a scenario by passing a dictionary to a `Scenario` object:

In [7]:
weather_scenario = Scenario({"weather":"sunny"})
weather_scenario

## Creating a ScenarioList
It can be useful to create a set of scenarios all at once. This can be done by constructing a list of `Scenario` objects or a `ScenarioList`. Compare a list of `Scenario` objects:

In [8]:
weather_scenarios = [Scenario({"weather":w}) for w in ["sunny", "cloudy", "rainy", "snowy"]]
weather_scenarios

[Scenario({'weather': 'sunny'}),
 Scenario({'weather': 'cloudy'}),
 Scenario({'weather': 'rainy'}),
 Scenario({'weather': 'snowy'})]

Alternatively, we can create a `ScenarioList` which has a key `scenarios` and a list of scenarios as the values:

In [9]:
example_scenariolist = ScenarioList.example()
example_scenariolist

In [10]:
weather_scenariolist = ScenarioList([Scenario({"weather":w}) for w in ["sunny", "cloudy", "rainy", "snowy"]])
weather_scenariolist

## Combining scenarios
We can add scenarios together to create a single new scenario with an extended dictionary:

In [11]:
scenario1 = Scenario({"food": "apple"})
scenario2 = Scenario({"drink": "juice"})

snack_scenario = scenario1 + scenario2
snack_scenario
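
Conceptually, adding two scenarios resembles merging two dictionaries. The sketch below uses plain dictionaries rather than `Scenario` objects. How EDSL resolves a key that appears in both scenarios is not covered in this notebook, so the "right-hand value wins" behavior shown here is a property of Python dictionary unpacking only.

```python
# Plain dictionaries stand in for Scenario objects in this sketch
food = {"food": "apple"}
drink = {"drink": "juice"}

# Merging produces one dictionary with the keys of both
snack = {**food, **drink}
print(snack)

# With overlapping keys, dictionary unpacking keeps the right-hand value
# (an observation about Python dicts, not a claim about EDSL)
overlap = {**{"food": "apple"}, **{"food": "pear"}}
print(overlap)
```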

## Replicating scenarios
We can replicate a scenario to create a `ScenarioList`:

In [12]:
personas_scenariolist = Scenario.example().replicate(n=3)
personas_scenariolist

## Renaming scenarios
We can call the `rename()` method to rename the fields (keys) of a `Scenario`:

In [13]:
role_scenario = Scenario.example().rename({"persona": "role"})
role_scenario

The method can also be called on a `ScenarioList`:

In [14]:
scenariolist = ScenarioList([Scenario({"name": "Apostolos"}), Scenario({"name": "John"}),  Scenario({"name": "Robin"})])

renamed_scenariolist = scenariolist.rename({"name": "first_name"})
renamed_scenariolist

## Sampling
We can call the `sample()` method to take a sample from a `ScenarioList`:

In [15]:
weather_scenariolist = ScenarioList([Scenario({"weather":w}) for w in ["sunny", "cloudy", "rainy", "snowy"]])

sample = weather_scenariolist.sample(n=2)
sample

## Selecting and dropping scenarios
We can call the `select()` and `drop()` methods on a `ScenarioList` to include and exclude specified fields from the scenarios:

In [16]:
snacks_scenariolist = ScenarioList([Scenario({"food": "apple", "drink": "water"}), Scenario({"food": "banana", "drink": "milk"})])

food_scenariolist = snacks_scenariolist.select("food")
food_scenariolist

In [17]:
drink_scenariolist = snacks_scenariolist.drop("food")
drink_scenariolist

## Adding metadata to scenarios
Note that we can create fields in scenarios without including them in the `question_text`. This will cause the fields to be present in the `Results` dataset, which can be useful for adding metadata to your questions and results. [See more examples here](https://docs.expectedparrot.com/en/latest/notebooks/adding_metadata.html).

Example usage:

In [18]:
songs = [
    ["1999", "Prince", "pop"],
    ["1979", "The Smashing Pumpkins", "alt"],
    ["1901", "Phoenix", "indie"]
]
metadata_scenarios = [Scenario({"title":t, "musician":m, "genre":g}) for [t,m,g] in songs]
metadata_scenarios

[Scenario({'title': '1999', 'musician': 'Prince', 'genre': 'pop'}),
 Scenario({'title': '1979', 'musician': 'The Smashing Pumpkins', 'genre': 'alt'}),
 Scenario({'title': '1901', 'musician': 'Phoenix', 'genre': 'indie'})]

In [19]:
q = QuestionFreeText(
    question_name = "song",
    question_text = "What is this song about: {{ title }}" # optionally omitting other fields in the scenarios
)

results = q.by(metadata_scenarios).run()
results.select("scenario.*", "song").print(format="rich") # all scenario fields will be present

Note that it does not matter if we use a list of `Scenario` objects or a `ScenarioList` with the same data--the scenarios are added to the survey in the same way when it is run:

In [20]:
songs = [
    ["1999", "Prince", "pop"],
    ["1979", "The Smashing Pumpkins", "alt"],
    ["1901", "Phoenix", "indie"]
]
metadata_scenarios = ScenarioList([Scenario({"title":t, "musician":m, "genre":g}) for [t,m,g] in songs])
metadata_scenarios

In [21]:
q = QuestionFreeText(
    question_name = "song",
    question_text = "What is this song about: {{ title }}" # optionally omitting other fields in the scenarios
)

results = q.by(metadata_scenarios).run()
results.select("scenario.*", "song").print(format="rich") # all scenario fields will be present

## Chunking text
We can use the `chunk()` method to turn a `Scenario` into a `ScenarioList` with specified slice/chunk sizes based on `num_words` or `num_lines`. Note that the field `_chunk` is created automatically, and `_original` is added if the optional parameter `include_original` is used:

In [22]:
my_haiku = """
This is a long text. 
Pages and pages, oh my!
I need to chunk it.
"""

text_scenario = Scenario({"my_text": my_haiku})

word_chunks_scenariolist = text_scenario.chunk("my_text", 
                                               num_words = 5, # use num_words or num_lines but not both
                                               include_original = True, # optional 
                                               hash_original = True # optional
)
word_chunks_scenariolist

In [23]:
line_chunks_scenariolist = text_scenario.chunk("my_text", 
                                               num_lines = 1
)
line_chunks_scenariolist
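
For intuition about what word-based and line-based chunking do to a text, here is a plain-Python sketch. The helper names `chunk_words` and `chunk_lines` are hypothetical, and the way whitespace and blank lines are handled here is an assumption that may differ from EDSL's `chunk()` method.

```python
def chunk_words(text, num_words):
    """Split text into consecutive chunks of at most num_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + num_words])
        for i in range(0, len(words), num_words)
    ]

def chunk_lines(text, num_lines):
    """Split text into consecutive chunks of at most num_lines lines."""
    # Strip first so leading/trailing blank lines do not become chunks
    lines = text.strip().splitlines()
    return [
        "\n".join(lines[i:i + num_lines])
        for i in range(0, len(lines), num_lines)
    ]

haiku = "This is a long text.\nPages and pages, oh my!\nI need to chunk it."

word_chunks = chunk_words(haiku, num_words=5)
line_chunks = chunk_lines(haiku, num_lines=2)
print(word_chunks)
print(line_chunks)
```

Because each line of the haiku has exactly five words, chunking by five words happens to reproduce the three lines. The final chunk is shorter whenever the text length is not a multiple of the chunk size.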

## Tallying scenario values
We can call the `tally()` method on a `ScenarioList` to tally numeric values for a specified key. It returns a dictionary with keys representing the number of each `Scenario` in the `ScenarioList` and values representing the tally of the key that was specified:

In [24]:
numeric_scenariolist = ScenarioList([Scenario({"a": 1, "b": 1}), Scenario({"a": 1, "b": 2})])

tallied_scenariolist = numeric_scenariolist.tally("b")
tallied_scenariolist

{1: 1, 2: 1}

## Expanding scenarios
We can call the `expand()` method on a `ScenarioList` to expand it by a specified field. For example, if the values of a scenario key are a list we can pass that key to the method to generate a `Scenario` for each item in the list:
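
A plain-Python sketch of that behavior, using dictionaries in place of `Scenario` objects. The function `expand_rows` is a hypothetical helper written for illustration, not the library's code.

```python
def expand_rows(rows, field):
    """Return one row per item in rows[i][field], copying the other keys."""
    expanded = []
    for row in rows:
        values = row[field]
        # Treat a non-list value as a single item (an assumption of this sketch)
        if not isinstance(values, (list, tuple)):
            values = [values]
        for value in values:
            expanded.append({**row, field: value})
    return expanded

rows = [{"a": 1, "b": [1, 2, 3]}]
expanded = expand_rows(rows, "b")
print(expanded)
```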

## Mutating scenarios
We can call the `mutate()` method on a `ScenarioList` to add a key/value to each `Scenario` based on an expression over its existing keys:

In [25]:
scenariolist = ScenarioList([Scenario({"a": 1, "b": 1}), Scenario({"a": 1, "b": 2})])

mutated_scenariolist = scenariolist.mutate("c = a + b")
mutated_scenariolist

## Ordering scenarios
We can call the `order_by()` method on a `ScenarioList` to order the scenarios by a field:

In [26]:
unordered_scenariolist = ScenarioList([Scenario({"a": 1, "b": 1}), Scenario({"a": 1, "b": 2})])

ordered_scenariolist = unordered_scenariolist.order_by("b")
ordered_scenariolist

## Filtering scenarios
We can call the `filter()` method on a `ScenarioList` to filter scenarios based on a conditional expression:

In [27]:
unfiltered_scenariolist = ScenarioList([Scenario({"a": 1, "b": 1}), Scenario({"a": 1, "b": 2})])

filtered_scenariolist = unfiltered_scenariolist.filter("b == 2")
filtered_scenariolist
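
The `mutate()`, `order_by()` and `filter()` operations above have familiar plain-Python counterparts. The sketch below applies them to a list of dictionaries to show the expected shape of each result. It is an analogy only and does not use or reproduce EDSL's expression parsing.

```python
# Plain dictionaries stand in for Scenario objects in this sketch
rows = [{"a": 1, "b": 2}, {"a": 1, "b": 1}]

# Analogue of mutate("c = a + b"): add a derived key to every row
mutated = [{**row, "c": row["a"] + row["b"]} for row in rows]

# Analogue of order_by("b"): sort rows by one field
ordered = sorted(mutated, key=lambda row: row["b"])

# Analogue of filter("b == 2"): keep rows that satisfy a condition
filtered = [row for row in ordered if row["b"] == 2]

print(mutated)
print(ordered)
print(filtered)
```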

## Create scenarios from a list
We can call the `from_list()` method to create a `ScenarioList` from a list of values and a specified key:

In [28]:
my_list = ["Apostolos", "John", "Robin"]

scenariolist = ScenarioList.from_list("name", my_list)
scenariolist

## Adding a list of values to individual scenarios
We can call the `add_list()` method to add values to individual scenarios in a `ScenarioList`:

In [29]:
scenariolist = ScenarioList([Scenario({"weather": "sunny"}), Scenario({"weather": "rainy"})])

added_scenariolist = scenariolist.add_list("preference", ["high", "low"])
added_scenariolist

## Adding values to all scenarios
We can call the `add_value()` method to add a value to all scenarios in a `ScenarioList`:

In [30]:
scenariolist = ScenarioList([Scenario({"name": "Apostolos"}), Scenario({"name": "John"}),  Scenario({"name": "Robin"})])

added_scenariolist = scenariolist.add_value("company", "Expected Parrot")
added_scenariolist
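
The difference between `add_list()` and `add_value()` is that the former pairs one value with each scenario, while the latter gives every scenario the same value. A plain-Python sketch of the two patterns follows. The helper names are hypothetical, and the length check is an assumption of this sketch rather than documented EDSL behavior.

```python
def add_list_to_rows(rows, key, values):
    """Pair the i-th value with the i-th row."""
    # Assumption of this sketch: mismatched lengths are treated as an error
    if len(rows) != len(values):
        raise ValueError("need exactly one value per row")
    return [{**row, key: value} for row, value in zip(rows, values)]

def add_value_to_rows(rows, key, value):
    """Give every row the same value."""
    return [{**row, key: value} for row in rows]

rows = [{"weather": "sunny"}, {"weather": "rainy"}]

with_preference = add_list_to_rows(rows, "preference", ["high", "low"])
with_source = add_value_to_rows(rows, "source", "forecast")
print(with_preference)
print(with_source)
```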

## Creating scenarios from a pandas DataFrame
We can call the `from_pandas()` method to create a `ScenarioList` from a pandas DataFrame:

In [31]:
import pandas as pd

df = pd.DataFrame({"name": ["Apostolos", "John", "Robin"], "location": ["New York", "Cambridge", "Cambridge"]})

scenariolist = ScenarioList.from_pandas(df)
scenariolist

## Creating scenarios from a CSV
We can call the `from_csv()` method to create a `ScenarioList` from a CSV:

In [32]:
scenariolist = ScenarioList.from_csv("example.csv")
scenariolist
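
The file `example.csv` is not included in this notebook. As a hedged illustration of the kind of input such a call expects, the sketch below writes a small CSV with a header row to a temporary directory and reads it back with the standard library, yielding one dictionary per row keyed by column name. The column names and values are borrowed from the pandas example above. That `from_csv()` maps rows to scenarios in this same way is an assumption.

```python
import csv
import os
import tempfile

rows = [
    {"name": "Apostolos", "location": "New York"},
    {"name": "John", "location": "Cambridge"},
    {"name": "Robin", "location": "Cambridge"},
]

# Write a small CSV with a header row to a temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "location"])
    writer.writeheader()
    writer.writerows(rows)

# Read it back: each row becomes a dictionary keyed by the header names
with open(csv_path, newline="") as f:
    loaded = [dict(row) for row in csv.DictReader(f)]

print(loaded)
```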

## Turn a `ScenarioList` into a dictionary
We can call the `to_dict()` method to turn a `ScenarioList` into a dictionary:

In [33]:
scenariolist = ScenarioList([Scenario({"name": "Apostolos"}), Scenario({"name": "John"}),  Scenario({"name": "Robin"})])

dict_scenariolist = scenariolist.to_dict()
dict_scenariolist

{'scenarios': [{'name': 'Apostolos',
   'edsl_version': '0.1.25',
   'edsl_class_name': 'Scenario'},
  {'name': 'John', 'edsl_version': '0.1.25', 'edsl_class_name': 'Scenario'},
  {'name': 'Robin', 'edsl_version': '0.1.25', 'edsl_class_name': 'Scenario'}],
 'edsl_version': '0.1.25',
 'edsl_class_name': 'ScenarioList'}

## Create a `ScenarioList` from a dictionary
We can call the `from_dict()` method to create a `ScenarioList` from a dictionary. Note that the dictionary must contain a key "scenarios":

In [34]:
my_dict = {
    "scenarios": [
        {
            "name": "Apostolos",
            "location": "New York"
        },
        {
            "name": "John",
            "location": "Cambridge"
        },
        {
            "name": "Robin",
            "location": "Cambridge"
        }
    ]
}

scenariolist = ScenarioList.from_dict(my_dict)
scenariolist

## Turning PDF pages into scenarios
We can call the `from_pdf()` method to turn the pages of a PDF or doc into a `ScenarioList`. Here we use it for John's paper <i>"Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?"</i> ([link to paper](https://arxiv.org/pdf/2301.07543)). Note that the keys `filename`, `page` and `text` are automatically specified, so the `question_text` placeholder that we use for the scenarios must be `{{ text }}`:

In [35]:
pdf_pages_scenariolist = ScenarioList.from_pdf("homo_silicus.pdf")
pdf_pages_scenariolist[0:2] # inspecting the first couple pages as scenarios

In [36]:
homo_silicus_scenariolist = ScenarioList.from_pdf("homo_silicus.pdf")

Here we inspect a couple pages:

In [37]:
homo_silicus_scenariolist["scenarios"][0:2]

[{'filename': 'homo_silicus.pdf',
  'page': 1,
  'text': 'Large Language Models as Simulated Economic Agents:\nWhat Can We Learn from Homo Silicus?∗\nJohn J. Horton\nMIT & NBER\nJanuary 19, 2023\nAbstract\nNewly-developed large language models (LLM)—because of how they are trained and\ndesigned—are implicit computational models of humans—a homo silicus. LLMs can be\nused like economists use homo economicus: they can be given endowments, information,\npreferences, and so on, and then their behavior can be explored in scenarios via simulation.\nExperiments using this approach, derived from Charness and Rabin (2002), Kahneman,\nKnetsch and Thaler (1986), and Samuelson and Zeckhauser (1988) show qualitatively\nsimilar results to the original, but it is also easy to try variations for fresh insights. LLMs\ncould allow researchers to pilot studies via simulation ﬁrst, searching for novel social sci-\nence insights to test in the real world.\n∗Thanks to the MIT Center for Collective Intellige

Example usage--note that we can [sort results](https://docs.expectedparrot.com/en/latest/results.html#sorting-results) by any component, [filter results](https://docs.expectedparrot.com/en/latest/notebooks/docs_questions.html#Filtering-results) using conditional expressions, and also [limit how many results to display](https://docs.expectedparrot.com/en/latest/results.html#limiting-results):

In [38]:
q = QuestionFreeText(
    question_name = "summarize",
    question_text = "Summarize this page: {{ text }}" 
)
results = q.by(homo_silicus_scenariolist).run()

In [39]:
(results
 .sort_by("page")
 .filter("page > 1")
 .select("page", "summarize")
 .print(format="rich", max_rows = 3)
)

## Using images as scenarios
We can call the `from_image()` method to create a scenario for an image. Here we use it for Figure 1 in the <i>Homo Silicus</i> paper.

Note that this method must be used with a vision model (e.g., GPT-4o) and does not require the use of a `{{ placeholder }}` in the question text. The scenario keys `file_path` and `encoded_image` are generated automatically:

In [40]:
from edsl import Model

model = Model("gpt-4o")

In [41]:
image_scenario = Scenario.from_image("homo_silicus_figure1.png")

In [42]:
image_scenario.keys()

['file_path', 'encoded_image']

Example usage:

In [43]:
q = QuestionFreeText(
    question_name = "figure",
    question_text = "Explain the graphic on this page." # no scenario placeholder
)

results = q.by(image_scenario).by(model).run()
results.select("figure").print(format="rich")

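The following cell demonstrates the `expand()` method on a `ScenarioList` in which the key `b` holds a list:

In [44]: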
scenariolist = ScenarioList([Scenario({"a":1, "b":[1,2,3]})])

expanded_scenarios = scenariolist.expand("b")
expanded_scenarios

## Generating code for scenarios
We can call the `code()` method to generate the code for producing scenarios:

In [45]:
scenariolist = ScenarioList.example()

scenariolist_code = scenariolist.code()
scenariolist_code

['from edsl.scenarios.Scenario import Scenario\nfrom edsl.scenarios.ScenarioList import ScenarioList',
 "scenario_0 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
 "scenario_1 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})",
 'scenarios = ScenarioList([scenario_0, scenario_1])']

In [46]:
from edsl.scenarios.Scenario import Scenario
from edsl.scenarios.ScenarioList import ScenarioList

scenario_0 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})
scenario_1 = Scenario({'persona': 'A reseacher studying whether LLMs can be used to generate surveys.'})
scenarios = ScenarioList([scenario_0, scenario_1])

## Converting a `ScenarioList` into an `AgentList`
We can call the `to_agent_list()` method to convert a `ScenarioList` into an `AgentList`. Note that agent `traits` cannot include a "name" key as `agent_name` is a separate optional field of `Agent` objects:

In [47]:
from edsl import AgentList

scenariolist = ScenarioList([Scenario({"first_name": "Apostolos", "location": "New York"}), 
                             Scenario({"first_name": "John", "location": "Cambridge"}), 
                             Scenario({"first_name": "Robin", "location": "Cambridge"})])

agentlist = scenariolist.to_agent_list()
agentlist

(Note that scenarios function similarly to `traits` dictionaries that we pass to AI `Agents` that we can use to answer survey questions. [Learn more about designing AI agents](https://docs.expectedparrot.com/en/latest/agents.html) for simulating surveys and experiments.)