# Google Form -> EDSL
This notebook provides example EDSL code for converting a non-EDSL survey into an EDSL survey. This can be useful for accessing EDSL's built-in methods for analyzing survey data, and for extending that data with responses simulated by AI agents and diverse large language models.

[EDSL is an open-source library](https://github.com/expectedparrot/edsl) for simulating surveys, experiments and other research with AI agents and large language models. 
Before running the code below, please ensure that you have [installed the EDSL library](https://docs.expectedparrot.com/en/latest/installation.html) and either [activated remote inference](https://docs.expectedparrot.com/en/latest/remote_inference.html) from your [Coop account](https://docs.expectedparrot.com/en/latest/coop.html) or [stored API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use with EDSL. Please also see our [documentation page](https://docs.expectedparrot.com/) for tips and tutorials on getting started using EDSL.

## Designing the task as an EDSL survey
We design the task as an EDSL survey *about* the survey to be converted: a series of questions prompting a language model to read and reformat the contents of a given survey. The formatted responses of the language model are readily usable components of a new EDSL survey that can be administered to AI agents and/or human audiences.

## Creating a meta-survey
We start by selecting appropriate question types for reformatting the contents of a given survey.
[EDSL comes with many common question types](https://docs.expectedparrot.com/en/latest/questions.html) that we can choose from based on the form of the response that we want to get back from a model: multiple choice, checkbox, free text, linear scale, etc.

Here we use `QuestionList` to return information about all the questions in the survey at once, as a list. We create a sequence of questions, using the response to one question as an input to the next question. This step-wise approach can improve performance by allowing a model to focus on distinct tasks, and it also allows us to pinpoint modifications to instructions as needed (some models will perform better and need fewer instructions than others). Note that the initial question uses a `{{ placeholder }}` for the text of the survey to be reformatted, which allows us to reuse the question with other content (e.g., another survey):

In [1]:
from edsl import QuestionList

First we ask the model to return just the questions from the survey:

In [2]:
q1 = QuestionList(
    question_name="q_text",
    question_text="""
    You are being asked to extract questions from the text of a survey.
    Read the text and then return a list of all the questions in the 
    order that you find them. Return only the list of questions.
    Survey: {{ text }}
    """
)
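
To make the role of the `{{ text }}` placeholder concrete, here is a minimal plain-Python sketch of filling one template with different survey texts. The `render` helper and the sample survey snippets are hypothetical and written only for illustration; this is not how EDSL renders prompts internally.

```python
import re

def render(template, values):
    """Replace each {{ name }} placeholder in template with values[name]."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"No value supplied for placeholder '{key}'")
        return str(values[key])
    # Match {{ name }} with optional whitespace inside the braces.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "Return only the list of questions.\nSurvey: {{ text }}"

# Hypothetical snippets standing in for the contents of two different surveys.
survey_texts = [
    "Q1. How did you hear about us?",
    "Q1. Would you recommend us?",
]
for survey_text in survey_texts:
    print(render(template, {"text": survey_text}))
    print("---")
```

The same question text is reused unchanged; only the value supplied for `text` differs between runs.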

Next we ask the model to format the questions as dictionaries, and specify the question text and type:

In [3]:
q2 = QuestionList(
    question_name="q_type",
    question_text="""
    Now create a dictionary for each question, using keys 'question_text' and 'question_type'.
    The value for 'question_text' is the question text you already identified.
    The value for 'question_type' should be the most appropriate of the following types:
    'multiple_choice', 'checkbox', 'linear_scale' or 'free_text'.
    Return only the list of dictionaries you have created, with the 2 key/value pairs for each question.
    """
)

Next we ask the model to add the question options (if any):

In [4]:
q3 = QuestionList(
    question_name="q_options",
    question_text="""
    Now add a key 'question_options' to each dictionary for all questions that are not free text,
    with a value that is a list of the answer options for the question.
    Preserve any integer options as integers, not strings.
    If there are labels for linear scale answer options then add another key 'option_labels'
    with a value that is a dictionary: the keys are the relevant integers and the values are the labels.
    Return only the list of dictionaries you have created with all relevant key/value pairs for each question.
    """
)

Finally, we ask the model to give each question a name:

In [5]:
q4 = QuestionList(
    question_name="q_name",
    question_text="""
    Now add a key 'question_name' to each dictionary.
    The value should be a unique short pythonic string.
    Return only the list of dictionaries that you have created, 
    with all the key/value pairs for each question.
    """
)

Next we combine the questions into a `Survey` in order to administer them together. 
We add a "memory" of each prior question in the survey so that the model will have the context and its answers on hand when answering each successive question:

In [6]:
from edsl import Survey

In [7]:
survey = Survey(questions = [q1, q2, q3, q4]).set_full_memory_mode()
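
As a rough mental model of what full memory means for the four-step sequence above, the sketch below builds a prompt for each step that is prefixed with every earlier question and answer. The `build_prompts` function, the wording of the history lines, and the placeholder strings are all assumptions made for illustration; EDSL's actual prompt construction may differ.

```python
def build_prompts(question_texts, answers):
    """Return one prompt per question, each prefixed with all earlier Q&A pairs."""
    prompts = []
    for i, text in enumerate(question_texts):
        # Collect every question/answer pair that came before question i.
        history = [
            f"Previously asked: {q}\nYou answered: {a}"
            for q, a in zip(question_texts[:i], answers[:i])
        ]
        prompts.append("\n\n".join(history + [text]))
    return prompts

# Placeholder strings stand in for the real prompts and model answers.
steps = ["Extract the questions.", "Add a type to each.", "Add the options."]
replies = ["[question list]", "[typed list]", "[list with options]"]

prompts = build_prompts(steps, replies)
print(prompts[2])
```

The third prompt contains the first two steps and their answers, which is why each later question can refer to "the question text you already identified" or "each dictionary" without restating them.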

## Adding content to questions
Next we create a `Scenario` object for the contents of a (non-EDSL) survey to be inserted in the first question. 
This allows us to reuse the questions with other content. [Learn more about using scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html) to scale data labeling and other tasks.

Here we create a scenario for a [Google Form](https://forms.gle/GufjVVs5PfxUeyoj7) (a customer feedback survey) that we have stored as a publicly-accessible PDF at the [Coop](https://www.expectedparrot.com/explore).

Code for posting a PDF to the Coop (uncomment and run with your own file):

In [8]:
# from edsl.scenarios.FileStore import PDFFileStore
# fs = PDFFileStore("customer_feedback_survey.pdf")
# info = fs.push()
# print(info)

Retrieving a file (replace with UUID of desired object):

In [9]:
from edsl.scenarios.FileStore import PDFFileStore

In [10]:
pdf_file = PDFFileStore.pull("0059a1a8-e5ff-4f19-a2bf-7cf1b98e4baf", expected_parrot_url="https://www.expectedparrot.com")

Creating a scenario for the content:

In [11]:
from edsl import Scenario

In [12]:
s = Scenario.from_pdf(pdf_file.to_tempfile())

Alternative code for creating a scenario from a local file:

In [13]:
# s = Scenario.from_pdf("customer_feedback_survey.pdf") # replace with your own local file

Inspecting the scenario that has been created:

In [14]:
s

## Selecting language models
EDSL works with many popular language models that we can select to use in generating survey responses. 
You can provide your own API keys for models or activate remote inference to run surveys at the Expected Parrot server with any available models. 
[Learn more about working with language models](https://docs.expectedparrot.com/en/latest/language_models.html) and using [remote inference](https://docs.expectedparrot.com/en/latest/remote_inference.html).

In [15]:
from edsl import ModelList, Model

To see a list of all available models:

In [16]:
# Model.available()

Here we select several models to compare their responses:

In [17]:
models = ModelList(
    Model(m) for m in ["gemini-pro", "gpt-4o", "claude-3-5-sonnet-20240620"]
)

## Running a survey
Next we add the scenario and models to the survey and run it. 
This generates a dataset of `Results` that we can access with built-in methods for analysis. 
[Learn more about working with results](https://docs.expectedparrot.com/en/latest/results.html).

In [18]:
results = survey.by(s).by(models).run()

To see a list of all the components of the results that have been generated:

In [19]:
# results.columns

We can filter, sort, select and print components in a table:

In [20]:
(
    results.sort_by("model")
    .select("model", "q_name") #"q_text", "q_type", "q_options", "q_name")
    .print(format="rich")
)

## Creating a new EDSL survey
Now we can construct a new EDSL survey with the reformatted components of the original survey.
This is done by creating `Question` objects with the question components, passing them to a new `Survey`, and then optionally designing and assigning AI agents to answer the survey.

Here we select one model's responses to use:

In [21]:
from edsl import Question

In [22]:
questions_list = results.filter("model.model == 'gpt-4o'").select("q_name").to_list()[0]
questions_list

[{'question_text': 'How did you first hear about our company?',
  'question_type': 'multiple_choice',
  'question_options': ['Social media',
   'Online search',
   'Friend/family recommendation',
   'Advertisement',
   'Other'],
  'question_name': 'hear_about_us'},
 {'question_text': 'Which of the following services have you used?',
  'question_type': 'checkbox',
  'question_options': ['Product support',
   'Online ordering',
   'In-store shopping',
   'Delivery services',
   'Loyalty program'],
  'question_name': 'services_used'},
 {'question_text': 'On a scale from 1 to 5, how satisfied are you with our customer service?',
  'question_type': 'linear_scale',
  'question_options': [1, 2, 3, 4, 5],
  'option_labels': {'1': 'Not at all satisfied', '5': 'Very satisfied'},
  'question_name': 'satisfaction_scale'},
 {'question_text': 'How many times have you purchased from us in the past year?',
  'question_type': 'free_text',
  'question_name': 'purchase_frequency'},
 {'question_text': 'Pl
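
Because these dictionaries are generated by a model, it can be worth checking them before constructing questions from them. The sketch below is one possible plain-Python check; the helper names `find_problems` and `normalize_labels` are hypothetical and not part of EDSL. It looks for the keys requested in the prompts above and converts `option_labels` keys such as `'1'` to integers so that they line up with the integer `question_options`.

```python
REQUIRED_KEYS = {"question_name", "question_text", "question_type"}
TYPES_WITH_OPTIONS = {"multiple_choice", "checkbox", "linear_scale"}

def find_problems(q):
    """Return a list of human-readable problems found in one question dict."""
    problems = [f"missing key '{k}'" for k in sorted(REQUIRED_KEYS - set(q))]
    if q.get("question_type") in TYPES_WITH_OPTIONS and not q.get("question_options"):
        problems.append("missing or empty 'question_options'")
    return problems

def normalize_labels(q):
    """Return a copy of q whose 'option_labels' keys are integers."""
    cleaned = dict(q)
    if "option_labels" in cleaned:
        cleaned["option_labels"] = {
            int(k): v for k, v in cleaned["option_labels"].items()
        }
    return cleaned

# The linear scale question from the output above.
scale_question = {
    "question_text": "On a scale from 1 to 5, how satisfied are you with our customer service?",
    "question_type": "linear_scale",
    "question_options": [1, 2, 3, 4, 5],
    "option_labels": {"1": "Not at all satisfied", "5": "Very satisfied"},
    "question_name": "satisfaction_scale",
}

print(find_problems(scale_question))
print(normalize_labels(scale_question)["option_labels"])
```

A check like this could be applied to every dictionary in `questions_list`, re-running the meta-survey or editing the dictionaries by hand if any problems are reported.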

In [23]:
edsl_questions = [Question(**q) for q in questions_list]
edsl_questions

[Question('multiple_choice', question_name = """hear_about_us""", question_text = """How did you first hear about our company?""", question_options = ['Social media', 'Online search', 'Friend/family recommendation', 'Advertisement', 'Other']),
 Question('checkbox', question_name = """services_used""", question_text = """Which of the following services have you used?""", min_selections = None, max_selections = None, question_options = ['Product support', 'Online ordering', 'In-store shopping', 'Delivery services', 'Loyalty program']),
 Question('linear_scale', question_name = """satisfaction_scale""", question_text = """On a scale from 1 to 5, how satisfied are you with our customer service?""", question_options = [1, 2, 3, 4, 5], option_labels = {1: 'Not at all satisfied', 5: 'Very satisfied'}),
 Question('free_text', question_name = """purchase_frequency""", question_text = """How many times have you purchased from us in the past year?"""),
 Question('free_text', question_name = """ad

In [24]:
new_survey = Survey(edsl_questions)

We can inspect the survey that has been created, e.g.:

In [25]:
new_survey.question_names

['hear_about_us',
 'services_used',
 'satisfaction_scale',
 'purchase_frequency',
 'additional_comments']
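
The naming prompt asked for unique, short, pythonic strings. A quick way to confirm that the generated names meet that request is sketched below; `check_names` is a hypothetical helper, not an EDSL method.

```python
from collections import Counter
import keyword

def check_names(names):
    """Return (duplicates, invalid) for a list of proposed question names."""
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    invalid = sorted(
        n for n in set(names) if not n.isidentifier() or keyword.iskeyword(n)
    )
    return duplicates, invalid

# The names shown in the output above.
names = [
    "hear_about_us",
    "services_used",
    "satisfaction_scale",
    "purchase_frequency",
    "additional_comments",
]
print(check_names(names))  # → ([], [])
```

Both lists are empty here, so every name is unique and usable as a Python identifier.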

## Designing AI agents
EDSL comes with methods for designing AI agent personas for language models to use in answering questions.
An `Agent` is created by passing a dictionary of relevant `traits`. It can then be assigned to a survey using the `by()` method when the survey is run (the same as we do with scenarios and models).

We can import existing data to create agents representing audiences of interest, or use EDSL to generate personas:

In [26]:
q_personas = QuestionList(
    question_name="personas",
    question_text="Draft 5 diverse personas for customers of a landscape business in New England capable of answering a feedback survey."
)

If we do not specify a model when running the question, the default model (GPT 4 preview) is used:

In [27]:
personas = q_personas.run().select("personas").to_list()[0]
personas

['Retired Couple in Suburban Massachusetts',
 'Young Professional in Urban Connecticut',
 'Middle-Aged Homeowner in Rural Vermont',
 'Small Business Owner in Coastal Maine',
 'Eco-Conscious Family in New Hampshire']

Note that the personas can be (much) longer and include key/value pairs for any desired traits; we keep it simple here for demonstration purposes.
Here we pass the personas to a list of agents and have them answer the survey:

In [28]:
from edsl import AgentList, Agent

In [29]:
agents = AgentList(
    Agent(
        traits = {"persona":p},
        instruction = """
        You are answering a customer feedback survey for a landscaping business that you have engaged in the past.
        Your answers are completely confidential.
        """
    )
    for p in personas
)

In [30]:
new_results = new_survey.by(agents).by(models).run()
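
To get a sense of how many sets of responses to expect, the sketch below enumerates agent and model pairings in plain Python. It assumes that the survey is administered once for each combination of agent and model; that is an assumption about how the run is expanded, and the sketch does not call EDSL.

```python
from itertools import product

# Personas and model names as they appear earlier in this notebook.
personas = [
    "Retired Couple in Suburban Massachusetts",
    "Young Professional in Urban Connecticut",
    "Middle-Aged Homeowner in Rural Vermont",
    "Small Business Owner in Coastal Maine",
    "Eco-Conscious Family in New Hampshire",
]
model_names = ["gemini-pro", "gpt-4o", "claude-3-5-sonnet-20240620"]

# One pairing per (persona, model) combination.
combinations = list(product(personas, model_names))
print(len(combinations))  # → 15
print(combinations[0])
```

Under that assumption, 5 personas and 3 models give 15 sets of answers to the 5 survey questions.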

In [31]:
(
    new_results.sort_by("model", "persona")
    .select("model", "persona", "answer.*")
    .print(format="rich")
)

## Posting to the Coop
The [Coop](https://www.expectedparrot.com/explore) is a platform for creating, storing and sharing LLM-based research.
It is fully integrated with EDSL and accessible from your workspace or Coop account page.
Learn more about [creating an account](https://www.expectedparrot.com/login) and [using the Coop](https://docs.expectedparrot.com/en/latest/coop.html).

Here we demonstrate how to post this notebook:

In [32]:
from edsl import Notebook

In [34]:
n = Notebook(path = "google_form_to_edsl.ipynb")

In [35]:
n.push(description = "Example code for using EDSL to convert a non-EDSL survey into EDSL", visibility = "public")

{'description': 'Example code for using EDSL to convert a non-EDSL survey into EDSL',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/eae8ca14-da74-49cb-a220-59f52fea0faa',
 'uuid': 'eae8ca14-da74-49cb-a220-59f52fea0faa',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

Updating an object at the Coop:

In [35]:
n = Notebook(path = "google_form_to_edsl.ipynb") # resave

In [36]:
n.patch(uuid = "eae8ca14-da74-49cb-a220-59f52fea0faa", value = n)

{'status': 'success'}