# Google Form -> EDSL
This notebook provides example code for converting a non-EDSL survey (here, a Google Form) into [EDSL](https://docs.expectedparrot.com), an open-source Python package for simulating surveys, experiments and other research using AI agents and large language models.

## Designing the task as an EDSL survey
We design the task as an EDSL survey *about* the survey to be converted: a series of questions where we use a language model to extract and reformat the contents of a given survey. The results are components of a new EDSL survey that can be used to gather responses from AI agents and/or human audiences, and posted to the [Coop: a platform for creating, storing and sharing LLM-based research](https://www.expectedparrot.com/explore).

## Creating a meta-survey
We start by selecting appropriate question types for extracting and formatting the contents of the given survey.
[EDSL comes with many common question types](https://docs.expectedparrot.com/en/latest/questions.html) that we can choose from based on the form of the response that we want to get back from the model (multiple choice, checkbox, free text, linear scale, etc.). 

Here we use the question type `QuestionList` to prompt a model to return information about all the questions in the survey at once (we may not know in advance how many questions there are). We use this question type several times, creating a series of questions, so that the model can focus on one task at a time instead of performing them all at once. This may improve overall performance and also lets us pinpoint necessary modifications to the instructions (some models will perform better than others). Note that we use a `{{ placeholder }}` for the text of the survey in the initial question, which allows us to reuse the question with other content:

In [1]:
from edsl import QuestionList

In [2]:
q1 = QuestionList(
    question_name="q_text",
    question_text="""
    You are being asked to extract questions from the text of a survey.
    Read the text and then return a list of all the questions in the 
    order that you find them. Return only the list of questions.
    Survey: {{ text }}
    """
)
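
To illustrate what the `{{ text }}` placeholder does, here is a minimal sketch in plain Python. EDSL performs this substitution itself when a scenario is added to the survey; the `render` helper below is hypothetical and only mimics the idea of reusing one question template with different contents.

```python
import re

def render(template: str, values: dict) -> str:
    """Replace each {{ key }} in the template with values[key] (hypothetical helper)."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown placeholders untouched rather than failing
        return str(values.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "Return a list of all the questions. Survey: {{ text }}"

# The same template is reused with two different survey texts
prompt_a = render(template, {"text": "1. How satisfied are you?"})
prompt_b = render(template, {"text": "1. What is your age?"})
print(prompt_a)
print(prompt_b)
```

The template is written once, and only the value supplied for `text` changes between runs.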

In [3]:
q2 = QuestionList(
    question_name="q_type",
    question_text="""
    Now create a dictionary for each question, using keys 'question_text' and 'question_type'.
    The value for 'question_text' is the question text you already identified.
    The value for 'question_type' should be the most appropriate of the following types:
    'multiple_choice', 'checkbox', 'linear_scale' or 'free_text'.
    Return only the list of dictionaries you have created, with the 2 key/value pairs for each question.
    """
)

In [4]:
q3 = QuestionList(
    question_name="q_options",
    question_text="""
    Now add a key 'question_options' to each dictionary for all questions that are not free text,
    with a value that is a list of the answer options for the question.
    Preserve any integer options as integers, not strings.
    If there are labels for linear scale answer options then add another key 'option_labels'
    with a value that is a dictionary: the keys are the relevant integers and the values are the labels.
    Return only the list of dictionaries you have created with all relevant key/value pairs for each question.
    """
)

In [5]:
q4 = QuestionList(
    question_name="q_name",
    question_text="""
    Now add a key 'question_name' to each dictionary.
    The value should be a unique short pythonic string.
    Return only the list of dictionaries that you have created, 
    with all the key/value pairs for each question.
    """
)

Next we combine the questions into a `Survey` in order to administer them to a model together. 
We add a "memory" of each prior question in the survey so that the model has the context and its answers to the prior steps on hand when answering each successive question:

In [6]:
from edsl import Survey

In [7]:
survey = Survey(questions = [q1, q2, q3, q4]).set_full_memory_mode()
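
The idea behind full memory can be sketched in plain Python: each prompt is prefixed with every earlier question and answer. This is only an illustration under that assumption, not EDSL's actual prompt format; `build_prompts` and `fake_model` are hypothetical names.

```python
def build_prompts(question_texts, answer_fn):
    """Answer questions in order, prefixing each prompt with all prior Q/A pairs."""
    history = []  # (question, answer) pairs seen so far
    prompts = []
    for text in question_texts:
        # Context grows by one Q/A pair per step
        context = "".join(f"Q: {q}\nA: {a}\n" for q, a in history)
        prompt = context + f"Q: {text}"
        prompts.append(prompt)
        history.append((text, answer_fn(prompt)))
    return prompts

def fake_model(prompt: str) -> str:
    """Stand-in for a model call: reports the length of the prompt it received."""
    return f"answer to a {len(prompt)}-character prompt"

prompts = build_prompts(
    ["Extract the questions.", "Add question types.", "Add answer options."],
    fake_model,
)
print(prompts[-1])
```

The last prompt contains the two earlier questions and their answers, which is why later steps such as `q_type` can refer to "the question text you already identified".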

## Adding content to questions
Next we create a `Scenario` object for the contents of the survey, to be added to the questions about it.
This allows us to reuse the questions with any number of different contents. [Learn more about using scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html) to scale data labeling and other tasks.

Here we create a scenario for a [Google Form](https://forms.gle/GufjVVs5PfxUeyoj7) (a customer feedback survey):

In [8]:
from edsl import ScenarioList, Scenario

In [9]:
s = Scenario.from_pdf("customer_feedback_survey.pdf")

In [10]:
s

## Selecting language models
EDSL works with many popular language models that we can select to use in generating survey responses. 
You can provide your own API keys for models or activate remote inference to run surveys at the Expected Parrot server with any available models. 
[Learn more about working with language models](https://docs.expectedparrot.com/en/latest/language_models.html) and using [remote inference](https://docs.expectedparrot.com/en/latest/remote_inference.html).

In [11]:
from edsl import ModelList, Model

To see a list of all available models:

In [12]:
# Model.available()

Here we select several models to compare their responses:

In [13]:
models = ModelList(
    Model(m) for m in ["gemini-pro", "gpt-4o", "claude-3-5-sonnet-20240620"]
)

## Running a survey
Next we add the scenario and models to the survey and run it. 
This generates a dataset of `Results` that we can access with built-in methods for analysis. 
[Learn more about working with results](https://docs.expectedparrot.com/en/latest/results.html).

In [14]:
results = survey.by(s).by(models).run()

To see a list of all the components of the results that have been generated:

In [15]:
# results.columns

We can filter, sort, select and print components in a table:

In [16]:
(
    results.sort_by("model")
    .select("model", "q_name") #"q_text", "q_type", "q_options", "q_name")
    .print(format="rich")
)

## Creating a new EDSL survey
Now we can construct a new EDSL survey with the reformatted components of the original survey.
This is done by creating `Question` objects with the question components, passing them to a new `Survey`, and then optionally designing and assigning AI agents to answer the survey.

Here we select one of the models' responses to use:

In [17]:
from edsl import Question

In [18]:
questions_list = results.filter("model.model == 'gpt-4o'").select("q_name").to_list()[0]
questions_list

[{'question_text': 'Email',
  'question_type': 'free_text',
  'question_name': 'email'},
 {'question_text': 'How did you first hear about our company?',
  'question_type': 'multiple_choice',
  'question_options': ['Social media',
   'Online search',
   'Friend/family recommendation',
   'Advertisement',
   'Other'],
  'question_name': 'first_hear'},
 {'question_text': 'Which of the following services have you used?',
  'question_type': 'checkbox',
  'question_options': ['Product support',
   'Online ordering',
   'In-store shopping',
   'Delivery services',
   'Loyalty program'],
  'question_name': 'services_used'},
 {'question_text': 'On a scale from 1 to 5, how satisfied are you with our customer service?',
  'question_type': 'linear_scale',
  'question_options': [1, 2, 3, 4, 5],
  'option_labels': {'1': 'Not at all satisfied', '5': 'Very satisfied'},
  'question_name': 'satisfaction'},
 {'question_text': 'How many times have you purchased from us in the past year?',
  'question_type
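
Because these dictionaries are generated by a model, it may be worth checking them before constructing questions. The sketch below is one possible approach in plain Python: `normalize_question` and `ALLOWED_TYPES` are hypothetical names, the required keys are the ones requested in the prompts above, and the conversion of `option_labels` keys to integers is an assumption based on the string keys visible in the output (the library may already handle this).

```python
ALLOWED_TYPES = {"multiple_choice", "checkbox", "linear_scale", "free_text"}
REQUIRED_KEYS = {"question_name", "question_text", "question_type"}

def normalize_question(q: dict) -> dict:
    """Return a checked copy of a model-generated question dictionary."""
    q = dict(q)  # work on a copy so the original results are untouched
    missing = REQUIRED_KEYS - q.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if q["question_type"] not in ALLOWED_TYPES:
        raise ValueError(f"unexpected question type: {q['question_type']!r}")
    if not q["question_name"].isidentifier():
        raise ValueError(f"question name is not pythonic: {q['question_name']!r}")
    if q["question_type"] != "free_text" and not q.get("question_options"):
        raise ValueError(f"no options for question {q['question_name']!r}")
    if "option_labels" in q:
        # Make label keys match the integer answer options
        q["option_labels"] = {int(k): v for k, v in q["option_labels"].items()}
    return q

example = {
    "question_text": "On a scale from 1 to 5, how satisfied are you with our customer service?",
    "question_type": "linear_scale",
    "question_options": [1, 2, 3, 4, 5],
    "option_labels": {"1": "Not at all satisfied", "5": "Very satisfied"},
    "question_name": "satisfaction",
}
cleaned = normalize_question(example)
print(cleaned["option_labels"])
```

A list such as `questions_list` could be passed through this helper with a list comprehension, so that a malformed entry fails early with a readable message.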

In [19]:
edsl_questions = [Question(**q) for q in questions_list]
edsl_questions

[Question('free_text', question_name = """email""", question_text = """Email"""),
 Question('multiple_choice', question_name = """first_hear""", question_text = """How did you first hear about our company?""", question_options = ['Social media', 'Online search', 'Friend/family recommendation', 'Advertisement', 'Other']),
 Question('checkbox', question_name = """services_used""", question_text = """Which of the following services have you used?""", min_selections = None, max_selections = None, question_options = ['Product support', 'Online ordering', 'In-store shopping', 'Delivery services', 'Loyalty program']),
 Question('linear_scale', question_name = """satisfaction""", question_text = """On a scale from 1 to 5, how satisfied are you with our customer service?""", question_options = [1, 2, 3, 4, 5], option_labels = {1: 'Not at all satisfied', 5: 'Very satisfied'}),
 Question('free_text', question_name = """purchase_frequency""", question_text = """How many times have you purchased fr

In [20]:
new_survey = Survey(edsl_questions)

We can inspect the survey that has been created, e.g.:

In [21]:
new_survey.question_names

['email',
 'first_hear',
 'services_used',
 'satisfaction',
 'purchase_frequency',
 'additional_comments']

## Designing AI agents
EDSL comes with methods for designing AI agent personas for language models to use in answering questions.
An `Agent` is created by passing a dictionary of relevant `traits`, and then assigned to a survey with the `by()` method when it is run (as we do with scenarios and models).

We can import existing data to create agents representing audiences of interest, or use EDSL to generate personas:

In [22]:
q_personas = QuestionList(
    question_name="personas",
    question_text="Draft 10 diverse personas for customers of a landscape business in New England capable of answering a feedback survey."
)

If we do not specify a model to use in running the question, the default model GPT 4 preview is used:

In [23]:
personas = q_personas.run().select("personas").to_list()[0]
personas

['Retired couple with a passion for gardening',
 'Young professional living in a suburban home',
 'Eco-conscious family focused on sustainability',
 'Small business owner needing commercial landscaping',
 'New homeowner looking to revamp their yard',
 'Luxury property owner seeking high-end design',
 'Busy parents wanting low-maintenance solutions',
 'Local government official managing public spaces',
 'Real estate agent preparing homes for sale',
 'DIY enthusiast seeking expert advice']

Note that the personas can be (much) longer and include key/value pairs for any desired traits; we keep it simple here for demonstration purposes.
Here we pass the personas to a list of agents and have them answer the survey:

In [24]:
from edsl import AgentList, Agent

In [25]:
agents = AgentList(
    Agent(
        traits = {"persona":p},
        instruction = """
        You are answering a customer feedback survey for a landscaping business that you have engaged in the past.
        Your answers are completely confidential.
        """
    )
    for p in personas
)

In [26]:
new_results = new_survey.by(agents).by(models).run()
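
As a rough way to reason about the size of the job, the sketch below counts agent/model pairs, assuming the survey is run once for every combination of agent and model. The `agent_labels` are placeholders standing in for the 10 personas.

```python
from itertools import product

# Placeholders for the 10 personas generated earlier
agent_labels = [f"persona_{i}" for i in range(1, 11)]
model_names = ["gemini-pro", "gpt-4o", "claude-3-5-sonnet-20240620"]

# Every agent is paired with every model
combinations = list(product(agent_labels, model_names))
print(len(combinations))  # 30
```

Under that assumption, each of the 30 combinations answers all six questions of the new survey, which is useful to keep in mind when estimating cost and run time.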

In [27]:
(
    new_results.sort_by("model", "persona")
    .select("model", "persona", "first_hear", "services_used", "satisfaction") #, "purchase_frequency", "additional_comments")
    .print(format="rich")
)

In [28]:
(
    new_results.sort_by("model", "persona")
    .select("model", "persona", "purchase_frequency", "additional_comments")
    .print(format="rich")
)

## Posting to the Coop
The [Coop](https://www.expectedparrot.com/explore) is a platform for creating, storing and sharing LLM-based research.
It is fully integrated with EDSL and accessible from your workspace or Coop account page.
Learn more about [creating an account](https://www.expectedparrot.com/login) and [using the Coop](https://docs.expectedparrot.com/en/latest/coop.html).

Here we demonstrate how to post this notebook:

In [30]:
from edsl import Notebook

In [31]:
n = Notebook(path = "google_form_to_edsl.ipynb")

In [32]:
n.push(description = "Example code for using EDSL to convert a non-EDSL survey into EDSL", visibility = "public")

{'description': 'Example code for using EDSL to convert a non-EDSL survey into EDSL',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/358f4ff8-3996-4a49-a47f-d169e533f975',
 'uuid': '358f4ff8-3996-4a49-a47f-d169e533f975',
 'version': '0.1.33.dev1',
 'visibility': 'public'}