# Starter Tutorial
This tutorial provides example code for basic features of [EDSL, an open-source Python library](https://pypi.org/project/edsl/) for simulating surveys, experiments and other research using AI agents and large language models.

In the steps below we show how to construct and run a simple question in EDSL, and then how to design more complex surveys with AI agents and different language models.
We also demonstrate methods for applying logic and rules to surveys, piping answers and adding data to questions, and analyzing survey results as datasets.

## Technical setup
Before running the code below, please ensure that you have completed technical steps for using EDSL:

* **Download the EDSL package.** See [installation](https://docs.expectedparrot.com/en/latest/installation.html) instructions and options.
* *Optional:* **Create a Coop account.** [Coop](https://docs.expectedparrot.com/en/latest/coop.html) is a (free) platform for creating, storing and sharing LLM-based research using EDSL. [Creating an account](https://www.expectedparrot.com/login) allows you to easily share your EDSL surveys, results, notebooks and other work, and access special features.
* **Choose how to manage API keys for language models.** Decide whether you want to [manage your own API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for language models or activate [remote inference](https://docs.expectedparrot.com/en/latest/remote_inference.html) to use your Expected Parrot API key to securely access all available models at once.

If you encounter any issues or have questions, please email us at info@expectedparrot.com or post a question at our [Discord channel](https://discord.com/invite/mxAYkjfy9m).

## Example: Running a simple question
EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) that we can choose from based on the form of the response that we want to get back from a model.
To see a list of all question types:

In [1]:
from edsl import Question

Question.available()

['checkbox',
 'extract',
 'free_text',
 'functional',
 'likert_five',
 'linear_scale',
 'list',
 'multiple_choice',
 'numerical',
 'rank',
 'top_k',
 'yes_no']

We can see the components of a particular question type by importing the question type class and calling the `example` method on it:

In [2]:
from edsl import (
    # QuestionCheckBox,
    # QuestionExtract,
    # QuestionFreeText,
    # QuestionFunctional,
    # QuestionLikertFive,
    # QuestionLinearScale,
    # QuestionList,
    QuestionMultipleChoice,
    # QuestionNumerical,
    # QuestionRank,
    # QuestionTopK,
    # QuestionYesNo
)

q = QuestionMultipleChoice.example() # substitute any question type class name
q

Here we create a simple multiple choice question:

In [3]:
from edsl import QuestionMultipleChoice

q = QuestionMultipleChoice(
    question_name = "smallest_prime",
    question_text = "Which is the smallest prime number?",
    question_options = [0, 1, 2, 3]
)

We can administer it to a language model by calling the `run` method:

In [4]:
results = q.run()

This generates a dataset of `Results` that we can readily access with [built-in methods for analysis](https://docs.expectedparrot.com/en/latest/results.html). 
Here we inspect the response, together with the model that was used and the model's "comment" about its response (a field that is automatically added to all question types other than free text):

In [5]:
results.select("model", "smallest_prime", "smallest_prime_comment").print(format="rich")

The `Results` also include information about the question, model parameters, prompts, generated tokens and raw responses. 
To see a list of all the components:

In [6]:
results.columns

['agent.agent_instruction',
 'agent.agent_name',
 'answer.smallest_prime',
 'comment.smallest_prime_comment',
 'generated_tokens.smallest_prime_generated_tokens',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.smallest_prime_system_prompt',
 'prompt.smallest_prime_user_prompt',
 'question_options.smallest_prime_question_options',
 'question_text.smallest_prime_question_text',
 'question_type.smallest_prime_question_type',
 'raw_model_response.smallest_prime_cost',
 'raw_model_response.smallest_prime_one_usd_buys',
 'raw_model_response.smallest_prime_raw_model_response']
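
The column names above follow a `prefix.field` pattern, where the prefix identifies the kind of data (agent, answer, model, prompt, etc.). As a rough illustration in plain Python (this does not use EDSL, and the helper name `group_by_prefix` is our own), we can group a few of the listed names by prefix to see what each category contains:

```python
from collections import defaultdict

# A few of the column names from the list above, copied as plain strings.
columns = [
    "agent.agent_name",
    "answer.smallest_prime",
    "comment.smallest_prime_comment",
    "model.model",
    "model.temperature",
    "prompt.smallest_prime_user_prompt",
]

def group_by_prefix(names):
    """Group 'prefix.field' names into {prefix: [field, ...]} (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        prefix, field = name.split(".", 1)  # split on the first dot only
        groups[prefix].append(field)
    return dict(groups)

grouped = group_by_prefix(columns)
for prefix, fields in grouped.items():
    print(prefix, fields)
```

Note that `select` was called above with the bare field names (`"model"`, `"smallest_prime"`), i.e., without the prefix.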

## Example: Conducting a survey with agents and models
In the next example we construct a more complex survey consisting of multiple questions, and design personas for AI agents to answer it.
Then we select specific language models to generate the answers.

We start by creating questions of different types:

In [7]:
from edsl import QuestionLinearScale, QuestionFreeText

q_enjoy = QuestionLinearScale(
    question_name = "enjoy",
    question_text = "On a scale from 1 to 5, how much do you enjoy reading?",
    question_options = [1, 2, 3, 4, 5],
    option_labels = {1:"Not at all", 5:"Very much"}
)

q_favorite_place = QuestionFreeText(
    question_name = "favorite_place",
    question_text = "Describe your favorite place for reading."
)

We construct a `Survey` by passing a list of questions:

In [8]:
from edsl import Survey

survey = Survey(questions = [q_enjoy, q_favorite_place])

### Agents
An important feature of EDSL is the ability to create AI agents to answer questions.
This is done by passing dictionaries of relevant "traits" to `Agent` objects that are used by language models to generate responses.
Learn more about [designing agents](https://docs.expectedparrot.com/en/latest/agents.html).

Here we construct several simple agent personas to use with our survey:

In [9]:
from edsl import AgentList, Agent

agents = AgentList(
    Agent(traits = {"persona":p}) for p in ["artist", "mechanic", "sailor"]
)

### Language models
EDSL works with many popular large language models that we can select to use with a survey.
This makes it easy to compare responses among models in the results that are generated.

To see a current list of available models:

In [10]:
from edsl import Model

# Model.available() # uncomment this code and run it to see the full list of currently available models

To check the default model that will be used if no models are specified for a survey (e.g., as in the first example above):

In [11]:
Model()

(Note that the output may be different if the default model has changed since this page was last updated.)

Here we select some models to use with our survey:

In [12]:
from edsl import ModelList, Model

models = ModelList(
    Model(m) for m in ["gpt-4o", "gemini-pro"]
)

### Running a survey
We add agents and models to a survey using the `by` method.
Then we administer a survey the same way that we do an individual question, by calling the `run` method on it:

In [13]:
results = survey.by(agents).by(models).run()

In [14]:
(
    results
    .sort_by("persona", "model")
    .select("model", "persona", "enjoy", "favorite_place")
    .print(format="rich")
)
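
To get a feel for how many results to expect, it can help to enumerate the agent and model combinations by hand. The sketch below uses only the standard library, not EDSL, and assumes that the survey is answered once for every pairing of an agent with a model:

```python
from itertools import product

personas = ["artist", "mechanic", "sailor"]  # agent personas used above
model_names = ["gpt-4o", "gemini-pro"]       # models used above

# One (persona, model) pair per expected row of results (assumption),
# ordered by persona and then model, like the sort_by call above.
combinations = sorted(product(personas, model_names))

for persona, model in combinations:
    print(f"{persona:10s} {model}")

print(len(combinations), "combinations")
```

With three agents and two models this gives six combinations; adding another agent or model grows the count multiplicatively.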

## Example: Adding context to questions
EDSL provides a variety of ways to add data or content to survey questions. 
These methods include:

* [Piping](https://docs.expectedparrot.com/en/latest/surveys.html#id2) answers to questions into follow-on questions
* [Adding "memory"](https://docs.expectedparrot.com/en/latest/surveys.html#question-memory) of prior questions and answers in a survey when presenting other questions to a model
* [Parameterizing questions with data](https://docs.expectedparrot.com/en/latest/scenarios.html), e.g., content from PDFs, CSVs, docs, images or other sources that you want to add to questions

### Piping question answers
Here we demonstrate how to pipe the answer to a question into the text of another question.
This is done by using a placeholder `{{ <question_name>.answer }}` in the text of the follow-on question where the answer to the prior question is to be inserted when the survey is run.
This causes the questions to be administered in the required order (survey questions are administered asynchronously by default).
Learn more about [piping question answers](https://docs.expectedparrot.com/en/latest/surveys.html#id2).

Here we insert the answer to a numerical question into the text of a follow-on yes/no question:

In [15]:
from edsl import QuestionNumerical, QuestionYesNo, Survey

q1 = QuestionNumerical(
    question_name = "random_number",
    question_text = "Pick a random number between 1 and 1,000."
)

q2 = QuestionYesNo(
    question_name = "prime",
    question_text = "Is this a prime number: {{ random_number.answer }}"
)

survey = Survey([q1, q2])

results = survey.run()

We can check the `user_prompt` for the `prime` question to verify that the answer to the `random_number` question was piped into it:

In [16]:
results.select("random_number", "prime_user_prompt", "prime", "prime_comment").print(format="rich")
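
Conceptually, piping is a template substitution that can only happen once the earlier answer exists. The following plain-Python sketch is not EDSL's implementation; the helper names and the answer `487` are made up for illustration. It shows both the substitution and how a placeholder reveals which question must be answered first:

```python
import re

# Matches placeholders of the form {{ question_name.answer }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.answer\s*\}\}")

def piped_from(question_text):
    """Return the names of the questions whose answers this text depends on."""
    return PLACEHOLDER.findall(question_text)

def pipe_answers(question_text, answers):
    """Replace each {{ name.answer }} with the stored answer for `name`."""
    return PLACEHOLDER.sub(lambda m: str(answers[m.group(1)]), question_text)

text = "Is this a prime number: {{ random_number.answer }}"
answers = {"random_number": 487}  # hypothetical answer, for illustration only

print(piped_from(text))             # the question that has to run first
print(pipe_answers(text, answers))  # the text the model would be shown
```

Because `prime` depends on `random_number`, the two questions cannot be administered at the same time, which is why piping imposes an order.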

### Adding "memory" of questions and answers
Here we instead add a "memory" of the first question and answer to the context of the second question.
This is done by calling a memory rule and identifying the question(s) to add.
Instead of just the answer, information about the full question and answer is presented with the follow-on question text, and no placeholder is used.
Learn more about [question memory rules](https://docs.expectedparrot.com/en/latest/surveys.html#survey-rules-logic).

Here we demonstrate the `add_targeted_memory` method (we could also use `set_full_memory_mode` or other memory rules):

In [17]:
from edsl import QuestionNumerical, QuestionYesNo, Survey

q1 = QuestionNumerical(
    question_name = "random_number",
    question_text = "Pick a random number between 1 and 1,000."
)

q2 = QuestionYesNo(
    question_name = "prime",
    question_text = "Is the number you picked a prime number?"
)

survey = Survey([q1, q2]).add_targeted_memory(q2, q1)

results = survey.run()

We can again use the `user_prompt` to verify the context that was added to the follow-on question:

In [18]:
results.select("random_number", "prime_user_prompt", "prime", "prime_comment").print(format="rich")
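
The difference from piping can be sketched in plain Python: rather than substituting a value into the question text, the earlier question and answer are placed in front of it. The wording and helper below are hypothetical and the answer `487` is invented; the actual context that EDSL adds is what appears in the `prime_user_prompt` column.

```python
def with_memory(question_text, prior_exchanges):
    """Prefix a question with earlier (question, answer) pairs (hypothetical format)."""
    lines = ["Before this question, you were asked:"]
    for prior_question, prior_answer in prior_exchanges:
        lines.append(f"Question: {prior_question}")
        lines.append(f"Answer: {prior_answer}")
    lines.append("")  # blank line before the follow-on question
    lines.append(question_text)
    return "\n".join(lines)

prompt = with_memory(
    "Is the number you picked a prime number?",
    [("Pick a random number between 1 and 1,000.", 487)],  # invented answer
)
print(prompt)
```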

*Related topic: Learn more about exploring and simulating "randomness" with AI agents and LLMs in [this notebook](https://docs.expectedparrot.com/en/latest/notebooks/random_numbers.html).*

## Scenarios
We can also add external data or content to survey questions.
This can be useful when you want to efficiently create and administer multiple versions of questions at once, e.g., for conducting data labeling tasks.
This is done by creating `Scenario` dictionaries for the data or content to be used with a survey, where the keys match `{{ placeholder }}` names used in question texts (or question options) and the values are the content to be added.
Scenarios can also be used to [add metadata to survey results](https://docs.expectedparrot.com/en/latest/notebooks/adding_metadata.html), e.g., data sources or other information that you may want to include in the results for reference but not necessarily include in question texts.

In the next example we revise the prior survey questions about reading to take a parameter for other activities that we may want to add to the questions, and create simple scenarios for some activities.
EDSL provides methods for automatically generating scenarios from a variety of data sources, including PDFs, CSVs, docs, images, tables and dicts. 
We use the `from_list` method to convert a list of activities into scenarios.

Then we demonstrate how to use scenarios to create multiple versions of our questions either (i) when constructing a survey or (ii) when running it:

* When constructing a survey, the `loop` method is used to create a list of versions of a question with the scenarios already added to it. When the questions are passed to a survey and it is run, the results include a column for each individual question, no `scenario` column, and a single row for each agent's answers to all the questions.
* When running a survey, the `by` method is used to add scenarios to a survey of questions with placeholders (the same way that agents and models are added to a survey). This adds a `scenario` column to the results with a row for each answer to each question for each scenario.

Learn more about [using scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html).
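
The practical difference between `by` and `loop` is the shape of the results. The sketch below models that shape with plain Python lists and dicts rather than EDSL objects, assuming one row per combination of scenario, agent and model for `by`, and one row per combination of agent and model for `loop`:

```python
from itertools import product

activities = ["reading", "running", "relaxing"]
personas = ["artist", "mechanic", "sailor"]
model_names = ["gpt-4o", "gemini-pro"]
question_names = ["enjoy", "favorite_place"]

# Scenarios added with `by`: the scenario is part of each row,
# and every row has the same two question columns.
by_rows = [
    {"activity": a, "persona": p, "model": m}
    for a, p, m in product(activities, personas, model_names)
]
by_columns = question_names

# Scenarios added with `loop`: no scenario in the row,
# but a separate question column for each scenario.
loop_rows = [{"persona": p, "model": m} for p, m in product(personas, model_names)]
loop_columns = [f"{q}_{a}" for q in question_names for a in activities]

print(len(by_rows), "rows x", len(by_columns), "question columns")
print(len(loop_rows), "rows x", len(loop_columns), "question columns")
```

Both layouts hold the same number of answers; `by` gives a long table that is easy to filter by scenario, while `loop` gives a wide table with one agent's answers side by side.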

Here we create scenarios for a simple list of activities:

In [19]:
from edsl import ScenarioList, Scenario

scenarios = ScenarioList.from_list("activity", ["reading", "running", "relaxing"])
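
As a mental model, the scenarios created above behave like a list of single-key dictionaries that all share the key `activity`. The helper below is a plain-Python stand-in with a name of our own choosing; it produces ordinary dicts, not EDSL `Scenario` objects:

```python
def scenarios_from_list(key, values):
    """Build one single-key dict per value, all sharing the same key (illustrative)."""
    return [{key: value} for value in values]

scenario_dicts = scenarios_from_list("activity", ["reading", "running", "relaxing"])
print(scenario_dicts)
```

The key is what has to match the `{{ activity }}` placeholder in the question texts.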

### Adding scenarios using the `by` method
Here we add the scenarios to the survey when we run it, together with any desired agents and models:

In [20]:
from edsl import QuestionLinearScale, QuestionFreeText, Survey

q_enjoy = QuestionLinearScale(
    question_name = "enjoy",
    question_text = "On a scale from 1 to 5, how much do you enjoy {{ activity }}?",
    question_options = [1, 2, 3, 4, 5],
    option_labels = {1:"Not at all", 5:"Very much"}
)

q_favorite_place = QuestionFreeText(
    question_name = "favorite_place",
    question_text = "In a brief sentence, describe your favorite place for {{ activity }}."
)

survey = Survey([q_enjoy, q_favorite_place])

In [21]:
results = survey.by(scenarios).by(agents).by(models).run()

In [22]:
(
    results
    .filter("model.model == 'gpt-4o'")
    .sort_by("activity", "persona")
    .select("activity", "persona", "enjoy", "favorite_place")
    .print(format="rich")
)

### Adding scenarios using the `loop` method
Here we add scenarios to questions when constructing a survey, as opposed to when running it.
When we run the survey the results will include columns for each question and no `scenario` field. 
Note that we can also optionally use the scenario key in the question names (they are otherwise incremented by default):

In [23]:
from edsl import QuestionLinearScale, QuestionFreeText

q_enjoy = QuestionLinearScale(
    question_name = "enjoy_{{ activity }}", # optional use of scenario key
    question_text = "On a scale from 1 to 5, how much do you enjoy {{ activity }}?",
    question_options = [1, 2, 3, 4, 5],
    option_labels = {1:"Not at all", 5:"Very much"}
)

q_favorite_place = QuestionFreeText(
    question_name = "favorite_place_{{ activity }}", # optional use of scenario key
    question_text = "In a brief sentence, describe your favorite place for {{ activity }}."
)

Looping the scenarios to create lists of questions:

In [24]:
enjoy_questions = q_enjoy.loop(scenarios)
enjoy_questions

[Question('linear_scale', question_name = """enjoy_reading""", question_text = """On a scale from 1 to 5, how much do you enjoy reading?""", question_options = [1, 2, 3, 4, 5], option_labels = {1: 'Not at all', 5: 'Very much'}),
 Question('linear_scale', question_name = """enjoy_running""", question_text = """On a scale from 1 to 5, how much do you enjoy running?""", question_options = [1, 2, 3, 4, 5], option_labels = {1: 'Not at all', 5: 'Very much'}),
 Question('linear_scale', question_name = """enjoy_relaxing""", question_text = """On a scale from 1 to 5, how much do you enjoy relaxing?""", question_options = [1, 2, 3, 4, 5], option_labels = {1: 'Not at all', 5: 'Very much'})]

In [25]:
favorite_place_questions = q_favorite_place.loop(scenarios)
favorite_place_questions

[Question('free_text', question_name = """favorite_place_reading""", question_text = """In a brief sentence, describe your favorite place for reading."""),
 Question('free_text', question_name = """favorite_place_running""", question_text = """In a brief sentence, describe your favorite place for running."""),
 Question('free_text', question_name = """favorite_place_relaxing""", question_text = """In a brief sentence, describe your favorite place for relaxing.""")]
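
The effect of looping can be imitated with a small template helper: each scenario is substituted into both the question name and the question text, giving one question per scenario. This is a plain-Python sketch with hypothetical helper names, using dicts in place of EDSL question objects:

```python
import re

# Matches placeholders of the form {{ key }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def fill(template, scenario):
    """Replace each {{ key }} in the template with the scenario's value for key."""
    return PLACEHOLDER.sub(lambda m: str(scenario[m.group(1)]), template)

def loop_question(name_template, text_template, scenarios):
    """Create one question (as a dict) per scenario."""
    return [
        {
            "question_name": fill(name_template, s),
            "question_text": fill(text_template, s),
        }
        for s in scenarios
    ]

activity_scenarios = [{"activity": a} for a in ["reading", "running", "relaxing"]]

looped = loop_question(
    "enjoy_{{ activity }}",
    "On a scale from 1 to 5, how much do you enjoy {{ activity }}?",
    activity_scenarios,
)
for question in looped:
    print(question["question_name"], "->", question["question_text"])
```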

Combining the questions in a survey:

In [26]:
survey = Survey(questions = enjoy_questions + favorite_place_questions)

In [27]:
results = survey.by(agents).by(models).run()

In [28]:
# results.columns # see that there are additional question fields and no scenario field

In [29]:
(
    results
    .filter("model.model == 'gpt-4o'")
    .sort_by("persona")
    .select("persona", "enjoy_reading", "enjoy_running", "enjoy_relaxing", "favorite_place_reading", "favorite_place_running", "favorite_place_relaxing")
    .print(format="rich")
)

## Exploring `Results`
EDSL comes with [built-in methods for analyzing and visualizing survey results](https://docs.expectedparrot.com/en/latest/results.html).
For example, you can call the `to_pandas` method to convert results into a dataframe:

In [30]:
df = results.to_pandas(remove_prefix=True)
df

Unnamed: 0,enjoy_relaxing,favorite_place_relaxing,favorite_place_reading,favorite_place_running,enjoy_running,enjoy_reading,agent_instruction,agent_name,persona,temperature,...,favorite_place_reading_comment,enjoy_running_comment,enjoy_relaxing_comment,enjoy_reading_comment,favorite_place_reading_generated_tokens,enjoy_relaxing_generated_tokens,favorite_place_relaxing_generated_tokens,favorite_place_running_generated_tokens,enjoy_reading_generated_tokens,enjoy_running_generated_tokens
0,4,My favorite place for relaxing is a cozy nook ...,My favorite place for reading is a cozy nook b...,My favorite place for running is a serene fore...,1,4,You are answering questions as if you were a h...,Agent_1,artist,0.5,...,,Running isn't really my thing; I'd much rather...,"As an artist, I love finding moments of relaxa...",Reading can be incredibly inspiring and a grea...,My favorite place for reading is a cozy nook b...,"4 \nAs an artist, I love finding moments of r...",My favorite place for relaxing is a cozy nook ...,My favorite place for running is a serene fore...,4 \nReading can be incredibly inspiring and a...,1 \nRunning isn't really my thing; I'd much r...
1,5,My favorite place for relaxing is in my art st...,"My heart finds solace in the hushed, sun-drenc...",My favorite place for running is the trail tha...,3,5,You are answering questions as if you were a h...,Agent_1,artist,0.5,...,,Running is a great way to clear my head and ge...,Relaxing is essential for my creativity. It al...,Reading is a great way to relax and escape int...,"My heart finds solace in the hushed, sun-drenc...",5\n\nRelaxing is essential for my creativity. ...,My favorite place for relaxing is in my art st...,My favorite place for running is the trail tha...,5\n\nReading is a great way to relax and escap...,3\n\nRunning is a great way to clear my head a...
2,3,"My favorite place for relaxing is my garage, w...","My favorite place for reading is in my garage,...",My favorite place for running is a scenic trai...,1,3,You are answering questions as if you were a h...,Agent_2,mechanic,0.5,...,,I prefer working with my hands and fixing engi...,"I enjoy relaxing when I can, but I often find ...",I enjoy reading technical manuals and guides r...,"My favorite place for reading is in my garage,...","3 \nI enjoy relaxing when I can, but I often ...","My favorite place for relaxing is my garage, w...",My favorite place for running is a scenic trai...,3 \nI enjoy reading technical manuals and gui...,1\n\nI prefer working with my hands and fixing...
3,5,"My favorite place to relax is in my garage, su...","My favorite place to read is in my garage, sur...",My favorite place for running is the park near...,2,4,You are answering questions as if you were a h...,Agent_2,mechanic,0.5,...,,I'm not much of a runner. I prefer to work wit...,Relaxing is essential for me to recharge and c...,I enjoy reading a lot because it allows me to ...,"My favorite place to read is in my garage, sur...",5\n\nRelaxing is essential for me to recharge ...,"My favorite place to relax is in my garage, su...",My favorite place for running is the park near...,4\n\nI enjoy reading a lot because it allows m...,2\n\nI'm not much of a runner. I prefer to wor...
4,3,My favorite place for relaxing is on the deck ...,My favorite place for reading is the ship's de...,My favorite place for running is along the rug...,3,4,You are answering questions as if you were a h...,Agent_3,sailor,0.5,...,,"Running is alright, but I much prefer the feel...","As a sailor, I enjoy relaxing when I get the c...",Reading is a great way to pass the time on lon...,My favorite place for reading is the ship's de...,"3 \nAs a sailor, I enjoy relaxing when I get ...",My favorite place for relaxing is on the deck ...,My favorite place for running is along the rug...,4 \nReading is a great way to pass the time o...,"3 \nRunning is alright, but I much prefer the..."
5,5,"After a long day on the open seas, there's no ...",My favorite place for reading is the bow of th...,"Avast there, matey! Me favorite place for runn...",5,5,You are answering questions as if you were a h...,Agent_3,sailor,0.5,...,,I love the feeling of the wind in my hair and ...,I love relaxing. It's one of my favorite thing...,I love reading! It's one of my favorite ways t...,My favorite place for reading is the bow of th...,5\n\nI love relaxing. It's one of my favorite ...,"After a long day on the open seas, there's no ...","Avast there, matey! Me favorite place for runn...",5\n\nI love reading! It's one of my favorite w...,5\n\nI love the feeling of the wind in my hair...
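
The dataframe columns above appear without the `agent.`, `answer.`, `model.` and other prefixes seen in `results.columns`. A plain-Python sketch of that renaming follows; it is our own illustration rather than EDSL's code, and the name `answer.enjoy_reading` is assumed by analogy with `answer.smallest_prime` in the first example:

```python
def remove_prefix(column):
    """Drop everything up to and including the first '.' in a column name."""
    return column.split(".", 1)[-1]

prefixed = [
    "agent.agent_instruction",
    "agent.agent_name",
    "model.temperature",
    "answer.enjoy_reading",  # assumed name, by analogy with the first example
]

print([remove_prefix(c) for c in prefixed])
```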


The `Results` object also supports SQL-like queries with the `sql` method:

In [31]:
results.sql("""
select model, persona, enjoy_reading, favorite_place_reading
from self
order by 1,2,3
""", shape="wide")

| | model | persona | enjoy_reading | favorite_place_reading |
|---|---|---|---|---|
| 0 | gemini-pro | artist | 5 | My heart finds solace in the hushed, sun-drenc... |
| 1 | gemini-pro | mechanic | 4 | My favorite place to read is in my garage, sur... |
| 2 | gemini-pro | sailor | 5 | My favorite place for reading is the bow of th... |
| 3 | gpt-4o | artist | 4 | My favorite place for reading is a cozy nook b... |
| 4 | gpt-4o | mechanic | 3 | My favorite place for reading is in my garage,... |
| 5 | gpt-4o | sailor | 4 | My favorite place for reading is the ship's de... |
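
In the query, `order by 1,2,3` sorts by the first, second and third selected columns (model, persona, then the rating). To try this ordering without running a survey, the sketch below loads the values shown in the output into an in-memory SQLite table named `self`, using Python's standard `sqlite3` module. It leaves out the free-text column and does not call EDSL; how the `sql` method is implemented internally is not something this tutorial states.

```python
import sqlite3

# Values taken from the query output (free-text column omitted),
# deliberately inserted out of order.
rows = [
    ("gpt-4o", "sailor", 4),
    ("gemini-pro", "mechanic", 4),
    ("gpt-4o", "artist", 4),
    ("gemini-pro", "sailor", 5),
    ("gpt-4o", "mechanic", 3),
    ("gemini-pro", "artist", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table self (model text, persona text, enjoy_reading integer)")
conn.executemany("insert into self values (?, ?, ?)", rows)

# Positional ordering: 1 = model, 2 = persona, 3 = enjoy_reading.
ordered = conn.execute(
    "select model, persona, enjoy_reading from self order by 1, 2, 3"
).fetchall()
conn.close()

for row in ordered:
    print(row)
```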


## Posting to the Coop
The [Coop](https://www.expectedparrot.com/explore) is a platform for creating, storing and sharing LLM-based research.
It is fully integrated with EDSL and accessible from your workspace or Coop account page.
Learn more about [creating an account](https://www.expectedparrot.com/login) and [using the Coop](https://docs.expectedparrot.com/en/latest/coop.html).

We can post any EDSL object to the Coop by calling the `push` method on it, optionally passing a `description` and `visibility` status:

In [32]:
results.push(description = "Starter tutorial sample survey results", visibility="public")

{'description': 'Starter tutorial sample survey results',
 'object_type': 'results',
 'url': 'https://www.expectedparrot.com/content/c7001765-a312-4db4-9838-8e783a376039',
 'uuid': 'c7001765-a312-4db4-9838-8e783a376039',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

We can also post this notebook:

In [33]:
from edsl import Notebook

notebook = Notebook(path="starter_tutorial.ipynb")

notebook.push(description="Starter Tutorial", visibility="public")

{'description': 'Starter Tutorial',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/2d0c7905-933c-441a-8203-741d9dd942c9',
 'uuid': '2d0c7905-933c-441a-8203-741d9dd942c9',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

To update an object:

In [35]:
notebook = Notebook(path="starter_tutorial.ipynb") # resave

notebook.patch(uuid = "2d0c7905-933c-441a-8203-741d9dd942c9", value = notebook)

{'status': 'success'}