# Exploring Your Results
This notebook shows examples of built-in methods for accessing and working with `Results` objects, including print options, dataframes, SQL and visualization methods, as well as options for exporting results.

## Generating Results
We start by creating a `Results` object to work with. This is done automatically by constructing a survey and administering it to an LLM. For purposes of demonstration, we go through each of the steps to do this: creating questions, compiling them in a survey, optionally designing personas for agents that will respond to the survey and selecting LLMs, and administering the survey.

You can also skip these steps and instead call the `Results.example()` method to generate an example object to work with. (The `.example()` method is also available for `Question` types, `Agent` and `Survey` objects.)

In [1]:
# EDSL should be automatically installed when you run this notebook. If not, run the following command:
# ! pip install edsl

In [2]:
from edsl.results import Results

example_results = Results.example()

### Creating `Question` types
We construct a survey by creating questions in the form of `Question` objects, and then compiling them into a `Survey` object.

In [3]:
from edsl.questions import (
    QuestionMultipleChoice,
    QuestionCheckBox,
    QuestionLinearScale,
    QuestionYesNo,
    QuestionBudget,
    QuestionFreeText,
    QuestionList,
    QuestionNumerical,
)

In [4]:
q_mc = QuestionMultipleChoice(
    question_name = "q_mc",
    question_text = "How often do you shop for {{item}}?",
    question_options = [
        "Rarely or never",
        "Annually",
        "Seasonally",
        "Monthly",
        "Daily"
    ]
)

In [5]:
q_cb = QuestionCheckBox(
    question_name = "q_cb",
    question_text = "Which of the following factors are important to you in making decisions about {{item}} shopping? Select all that apply.",
    question_options = [
        "Price",
        "Quality",
        "Brand Reputation",
        "Style and Design",
        "Fit and Comfort",
        "Customer Reviews and Recommendations",
        "Ethical and Sustainable Practices",
        "Return Policy",
        "Convenience",
        "Other"
    ]
)

In [6]:
q_ls = QuestionLinearScale(
    question_name = "q_ls",
    question_text = "On a scale of 0-10, how much do you typically enjoy {{item}} shopping? (0 = Not at all, 10 = Very much)",
    question_options = [0,1,2,3,4,5,6,7,8,9,10]
)

In [7]:
q_yn = QuestionYesNo(
    question_name = "q_yn",
    question_text = "Have you ever felt excluded or frustrated by the standard sizes of {{item}} in the fashion industry?", 
)

In [8]:
q_bg = QuestionBudget(
    question_name = "q_bg",
    question_text = "Estimate the percentage of your total time spent shopping for {{item}} in each of the following modes.",
    question_options=[
        "Online",
        "Malls",
        "Freestanding stores",
        "Mail order catalogs",
        "Other"
    ],
    budget_sum = 100,
)
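
A budget question asks the respondent to split `budget_sum` across the listed options. As a rough, library-independent illustration (this is not EDSL's own validation code, and the helper `is_valid_allocation` is hypothetical), an allocation could be checked as follows. The list-of-single-key-dicts shape mirrors how `q_bg` answers appear in the results; the amounts after the first two are made-up illustrative values:

```python
# Hypothetical helper (not part of EDSL): check that a budget answer covers
# every option and that the allocated amounts add up to the budget.
def is_valid_allocation(answer, options, budget_sum=100):
    allocated = {}
    for entry in answer:  # each entry is a single-key dict, e.g. {"Online": 40}
        allocated.update(entry)
    return set(allocated) == set(options) and sum(allocated.values()) == budget_sum


options = ["Online", "Malls", "Freestanding stores", "Mail order catalogs", "Other"]

# Illustrative answer; only the first two amounts are taken from the notebook output.
answer = [
    {"Online": 40},
    {"Malls": 30},
    {"Freestanding stores": 20},
    {"Mail order catalogs": 5},
    {"Other": 5},
]

print(is_valid_allocation(answer, options))
```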

In [9]:
q_ft = QuestionFreeText(
    question_name = "q_ft",
    question_text = "What improvements would you like to see in {{item}} shopping options?",
    allow_nonresponse = False,
)

In [10]:
q_li = QuestionList(
    question_name = "q_li",
    question_text = "What improvements would you like to see in {{item}} shopping options?"
)

In [11]:
q_nu = QuestionNumerical(
    question_name = "q_nu",
    question_text = "Estimate the amount of money that you spent on {{item}} in the past year (in $USD)."
)

### Compiling questions into a `Survey`
A survey takes a list of questions:

In [12]:
from edsl import Survey

survey = Survey(
    questions = [q_mc, q_cb, q_ls, q_yn, q_bg, q_ft, q_li, q_nu]
)

### Adding `Scenario` parameters to the questions
Scenarios are inputs for any questions that we have parameterized. When we administer the survey, each version of the question will be delivered to the LLMs:

In [13]:
from edsl import Scenario

items = ["clothes", "shoes"]

scenarios = [Scenario({"item": item}) for item in items]
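
To make "each version of the question" concrete, here is a plain-Python sketch of the substitution. It does not use EDSL, which handles the templating internally; the `render` helper is a hypothetical stand-in that only understands simple `{{name}}` placeholders:

```python
import re


def render(template, values):
    """Replace each {{name}} placeholder with values[name] (hypothetical helper)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(values[match.group(1)]),
        template,
    )


question_text = "How often do you shop for {{item}}?"
scenario_values = [{"item": item} for item in ["clothes", "shoes"]]

# One rendered question text per scenario.
rendered = [render(question_text, values) for values in scenario_values]
for text in rendered:
    print(text)
```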

### Creating personas for agents that will respond to the survey
We can optionally specify personas and individual traits for any agents that we want to respond to the survey. For this example, we will use randomly generated ages and heights. 

In [14]:
import random

ages = [random.randint(20, 80) for _ in range(5)] 
heights = [random.randint(58, 74) for _ in range(5)] 

In [15]:
agent_traits = [{
    "persona": "You are an adult woman living in the US.",
    "age": f"You are {age} years old.",
    "age_numerical": age,
    "height": f"You are {height} inches tall.",
    "height_numerical": height
} for age in ages for height in heights]
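
Note that the nested comprehension pairs every age with every height, so 5 ages and 5 heights yield 25 trait dictionaries rather than 5. A stand-alone sketch of the same construction using `itertools.product` (the fixed seed is an assumption added only so that reruns give the same draw):

```python
import random
from itertools import product

random.seed(0)  # assumption: fixed seed, only to make the example repeatable
ages = [random.randint(20, 80) for _ in range(5)]
heights = [random.randint(58, 74) for _ in range(5)]

# product(ages, heights) yields the same (age, height) pairs, in the same
# order, as "for age in ages for height in heights".
combos = list(product(ages, heights))
nested = [(age, height) for age in ages for height in heights]

print(len(combos))
```

If one agent per (age, height) pair at the same position is wanted instead, `zip(ages, heights)` would give 5 pairs.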

### Constructing the `AgentList`
We construct `Agent` objects by assigning the traits. Note that we include some `_numerical` copies of traits in order to easily access those values later on, as well as optional agent names for quick reference when we work with survey results data. Default agent names are assigned when results are simulated if they are not specified.

In [16]:
from edsl import Agent, AgentList

agent_list = AgentList([Agent(traits = traits, name = f"Agent_{index}") for index, traits in enumerate(agent_traits)])

In [17]:
agent_list.print()

### Selecting a `Model`
We can specify the LLMs that we want to use for the survey by creating `Model` objects. If none are specified, the default model is used:

In [18]:
from edsl import Model 

Model

Available models: ['claude-3-haiku-20240307', 'claude-3-opus-20240229', 'claude-3-sonnet-20240229', 'dbrx-instruct', 'gemini_pro', 'gpt-3.5-turbo', 'gpt-4-1106-preview', 'llama-2-13b-chat-hf', 'llama-2-70b-chat-hf', 'mixtral-8x7B-instruct-v0.1']

To create an instance, you can do: 
>>> m = Model('gpt-4-1106-preview', temperature=0.5, ...)

To get the default model, you can leave out the model name. 
To see the available models, you can do:
>>> Model.available()

In [19]:
models = [Model(m) for m in ("gpt-3.5-turbo", "gpt-4-1106-preview")]

### Administering the survey to the agents with the specified LLM
We administer a survey by attaching any scenarios, agents and models to it with the `.by()` method and then calling the `.run()` method:

In [20]:
results = survey.by(scenarios).by(agent_list).by(models).run()

Exceptions were raised in the following interviews: [17]

>>> results.task_history.show_exceptions()

If you want to plot by-task completion times, you can use 

>>> results.task_history.plot_completion_times()

If you want to plot by-task status over time, you can use

>>> results.task_history.plot()
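
Chaining `.by()` calls combines the components. Assuming one result per combination of scenario, agent and model (which is consistent with the 100 results, ids 0 to 99, visible in the outputs later in this notebook), the expected count can be sketched without EDSL:

```python
from itertools import product

scenario_items = ["clothes", "shoes"]
agent_names = [f"Agent_{i}" for i in range(25)]  # 5 ages x 5 heights
model_names = ["gpt-3.5-turbo", "gpt-4-1106-preview"]

# One interview per (scenario, agent, model) combination (assumption, see above).
interviews = list(product(scenario_items, agent_names, model_names))
print(len(interviews))
```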




### Inspecting the `Results`:

In [21]:
# results

# our results are long, we'll just show 1 here:
results[:1]

Result 0
                                                      Result                                                       
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Attribute              ┃ Value                                                                                  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ agent                  │                                    Agent Attributes                                    │
│                        │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│                        │ ┃ Attribute               ┃ Value                                                    ┃ │
│                        │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│                        │ │ _name                   │ 'Agent_2

## Print options
We can use the `.print()` method to view results in a table, with columns for each of the components: questions, responses, agent traits and LLM parameters (default or specified). 

We first use the `.columns` attribute to identify the columns that we want to `.select()` and print:

In [22]:
results.columns

['agent.age',
 'agent.age_numerical',
 'agent.agent_name',
 'agent.height',
 'agent.height_numerical',
 'agent.persona',
 'answer.q_bg',
 'answer.q_bg_comment',
 'answer.q_cb',
 'answer.q_cb_comment',
 'answer.q_ft',
 'answer.q_li',
 'answer.q_li_comment',
 'answer.q_ls',
 'answer.q_ls_comment',
 'answer.q_mc',
 'answer.q_mc_comment',
 'answer.q_nu',
 'answer.q_nu_comment',
 'answer.q_yn',
 'answer.q_yn_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.q_bg_system_prompt',
 'prompt.q_bg_user_prompt',
 'prompt.q_cb_system_prompt',
 'prompt.q_cb_user_prompt',
 'prompt.q_ft_system_prompt',
 'prompt.q_ft_user_prompt',
 'prompt.q_li_system_prompt',
 'prompt.q_li_user_prompt',
 'prompt.q_ls_system_prompt',
 'prompt.q_ls_user_prompt',
 'prompt.q_mc_system_prompt',
 'prompt.q_mc_user_prompt',
 'prompt.q_nu_system_prompt',
 'prompt.

In [23]:
results.filter("item == 'shoes'").select("agent.age_numerical", "agent.height_numerical", "answer.q_nu").print()

We can relabel the table columns by passing a `pretty_labels` dictionary to `.print()`:

In [24]:
(results
 .filter("item == 'shoes'")
 .select("agent.age_numerical", "agent.height_numerical", "answer.q_nu")
 .print(pretty_labels={
     "agent.age_numerical":"Age (years)", 
     "agent.height_numerical":"Height (inches)",
     "answer.q_nu":q_nu.question_text})
)
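
Conceptually, the chain above filters rows, keeps a subset of columns and renames the headers. A minimal plain-Python sketch of those three steps on a list of dictionaries (this is not how EDSL implements them; the two rows reuse values from the dataframe output shown later, and the label for `answer.q_nu` is shortened for illustration):

```python
rows = [
    {"scenario.item": "clothes", "agent.age_numerical": 55,
     "agent.height_numerical": 72, "answer.q_nu": 1200},
    {"scenario.item": "shoes", "agent.age_numerical": 55,
     "agent.height_numerical": 72, "answer.q_nu": 500},
]

selected = ["agent.age_numerical", "agent.height_numerical", "answer.q_nu"]
labels = {
    "agent.age_numerical": "Age (years)",
    "agent.height_numerical": "Height (inches)",
    "answer.q_nu": "Annual spend (USD)",  # illustrative label
}

# 1. filter rows, 2. select columns, 3. apply the display labels
shoes = [row for row in rows if row["scenario.item"] == "shoes"]
table = [{labels.get(col, col): row[col] for col in selected} for row in shoes]
print(table)
```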

We can also use the `.print_long()` method to show a vertical view of all the fields and data in our results:

In [25]:
# results.print_long()
results[:1].print_long()

## Dataframes
We can turn `Results` into a dataframe with the `.to_pandas()` method, then inspect its `.columns`:

In [26]:
results.to_pandas().columns

Index(['agent.age', 'agent.age_numerical', 'agent.agent_name', 'agent.height',
       'agent.height_numerical', 'agent.persona', 'answer.q_bg',
       'answer.q_bg_comment', 'answer.q_cb', 'answer.q_cb_comment',
       'answer.q_ft', 'answer.q_li', 'answer.q_li_comment', 'answer.q_ls',
       'answer.q_ls_comment', 'answer.q_mc', 'answer.q_mc_comment',
       'answer.q_nu', 'answer.q_nu_comment', 'answer.q_yn',
       'answer.q_yn_comment', 'iteration.iteration', 'model.frequency_penalty',
       'model.logprobs', 'model.max_tokens', 'model.model',
       'model.presence_penalty', 'model.temperature', 'model.top_logprobs',
       'model.top_p', 'prompt.q_bg_system_prompt', 'prompt.q_bg_user_prompt',
       'prompt.q_cb_system_prompt', 'prompt.q_cb_user_prompt',
       'prompt.q_ft_system_prompt', 'prompt.q_ft_user_prompt',
       'prompt.q_li_system_prompt', 'prompt.q_li_user_prompt',
       'prompt.q_ls_system_prompt', 'prompt.q_ls_user_prompt',
       'prompt.q_mc_system_prompt', 'pr

In [27]:
results.to_pandas()[["agent.age_numerical", "agent.height_numerical", "scenario.item", "answer.q_nu"]]

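| | agent.age_numerical | agent.height_numerical | scenario.item | answer.q_nu |
|---|---|---|---|---|
| 0 | 55 | 72 | clothes | 1200 |
| 1 | 55 | 72 | shoes | 500 |
| 2 | 55 | 70 | clothes | 1200 |
| 3 | 55 | 72 | clothes | 500 |
| 4 | 55 | 72 | shoes | 500 |
| ... | ... | ... | ... | ... |
| 95 | 28 | 72 | shoes | 200 |
| 96 | 47 | 72 | clothes | 500 |
| 97 | 49 | 72 | shoes | 200 |
| 98 | 49 | 72 | clothes | 500 |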


## SQL
We can use the `.sql()` method to access `Results` as a queryable SQL data table. This method takes the following arguments: <br>
<i>Required:</i> 
A SQL command of the form `"select * from self"` <br>
<i>Note: use `self` as the table name, <u>not</u> the name you've given your `Results`</i> <br>
<i>Required:</i>  
A `shape` selection: `shape="wide"` to maintain data fields horizontally or `shape="long"` to view the data vertically by data type (agent, model, prompt, answer), key (individual agent traits, questions, etc.) and corresponding values. <br>
<i>Optional:</i> 
An indicator `remove_prefix=True` if you want to reference column names in the SQL command without the prefixes `agent.`, `model.`, etc., so that they do not need to be quoted (i.e., in order to do `"select temperature from self"` instead of `'select "model.temperature" from self'`) <br><br>
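
The difference between the two shapes, and why dotted column names need quoting, can be sketched with the standard-library `sqlite3` module. This is an illustration on two hand-made records, not a call to `results.sql()`; the table names `wide_table` and `long_table` are hypothetical, and the values are taken from the dataframe output above:

```python
import sqlite3

records = [
    {"agent.age_numerical": 55, "scenario.item": "clothes", "answer.q_nu": 1200},
    {"agent.age_numerical": 55, "scenario.item": "shoes", "answer.q_nu": 500},
]

conn = sqlite3.connect(":memory:")

# Wide shape: one row per result, one column per field.
# Dotted names must be double-quoted to be treated as identifiers.
conn.execute(
    'CREATE TABLE wide_table ("agent.age_numerical", "scenario.item", "answer.q_nu")'
)
conn.executemany(
    "INSERT INTO wide_table VALUES (?, ?, ?)",
    [tuple(record.values()) for record in records],
)

# Long shape: one row per (result id, data type, key, value).
conn.execute(
    "CREATE TABLE long_table (id INTEGER, data_type TEXT, key TEXT, value TEXT)"
)
for result_id, record in enumerate(records):
    for name, value in record.items():
        data_type, key = name.split(".", 1)  # "answer.q_nu" -> ("answer", "q_nu")
        conn.execute(
            "INSERT INTO long_table VALUES (?, ?, ?, ?)",
            (result_id, data_type, key, str(value)),
        )

wide_rows = conn.execute(
    'SELECT "scenario.item", "answer.q_nu" FROM wide_table ORDER BY rowid'
).fetchall()
long_rows = conn.execute(
    "SELECT id, value FROM long_table WHERE key = 'q_nu' ORDER BY id"
).fetchall()
print(wide_rows)
print(long_rows)
```

In the wide table each field is selected by column name; in the long table the same field is reached by filtering on `key`, and every value is stored as text.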

The "wide" view lets you select individual fields as columns:

In [28]:
results.sql("""
    select 
        age_numerical as age, 
        height_numerical as height,
        item,
        q_mc as freq, 
        q_nu as annual_spend,
        count(*) as ct
    from self 
    group by 1,2,3,4,5
    limit 10
    """, 
    shape="wide", 
    remove_prefix=True
)

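| | age | height | item | freq | annual_spend | ct |
|---|---|---|---|---|---|---|
| 0 | 24 | 62 | clothes | Seasonally | 500 | 1 |
| 1 | 24 | 62 | clothes | Seasonally | 1200 | 1 |
| 2 | 24 | 62 | shoes | Seasonally | 300 | 1 |
| 3 | 24 | 62 | shoes | Seasonally | 500 | 1 |
| 4 | 24 | 70 | clothes | Seasonally | 500 | 1 |
| 5 | 24 | 70 | clothes | Seasonally | 1200 | 1 |
| 6 | 24 | 70 | shoes | Seasonally | 200 | 1 |
| 7 | 24 | 70 | shoes | Seasonally | 500 | 1 |
| 8 | 24 | 72 | clothes | Seasonally | 500 | 2 |
| 9 | 24 | 72 | clothes | Seasonally | 800 | 1 |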


The "long" view lets you arrange agent responses as you like:

In [29]:
(results
 .filter("item == 'clothes'")
 .sql("""
    select 
        a0.key,
        a0.value as agent0,
        a1.value as agent1,
        a2.value as agent2
    from self as a0
    inner join (
        select key, value
        from self 
        where id = 1
    ) as a1 on a1.key = a0.key
    inner join (
        select key, value
        from self 
        where id = 2
    ) as a2 on a2.key = a0.key
    where 1=1 
        and a0.id = 0
        and (a0.key like 'q_%' or a0.key = 'age')
        and a0.key not like '%comment'
        and a0.key not like '%prompt'
    """, 
    shape="long", 
    remove_prefix=True)
)

| | key | agent0 | agent1 | agent2 |
|---|---|---|---|---|
| 0 | age | You are 55 years old. | You are 55 years old. | You are 55 years old. |
| 1 | q_cb | ['Quality', 'Style and Design', 'Fit and Comfo... | ['Price', 'Quality', 'Style and Design', 'Fit ... | ['Quality', 'Style and Design', 'Fit and Comfo... |
| 2 | q_ls | 3 | 3 | 7 |
| 3 | q_ft | I would like to see more inclusive sizing opti... | I would like to see more size inclusivity in c... | I'd like to see more inclusive sizing options ... |
| 4 | q_mc | Seasonally | Seasonally | Seasonally |
| 5 | q_yn | Yes | Yes | Yes |
| 6 | q_bg | [{'Online': 40}, {'Malls': 30}, {'Freestanding... | [{'Online': 40}, {'Malls': 30}, {'Freestanding... | [{'Online': 40}, {'Malls': 30}, {'Freestanding... |
| 7 | q_li | ['more inclusive sizing options', 'better qual... | ['more sustainable options', 'better size incl... | ['more inclusive sizing', 'virtual fitting too... |
| 8 | q_nu | 1200 | 1200 | 500 |
| 9 | q_mc_raw_model_response | {'id': 'chatcmpl-9CaVxwABJ102o79Rm4GDUMQQQpDdX... | {'id': 'chatcmpl-9CaVwPO85WKJMJm1GaO9UyY3tRUri... | {'id': 'chatcmpl-9CaVxnNlJgxk4XmSeJXRQ4I45DT99... |
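
The query above needs one extra self-join per result being compared. A common alternative for pivoting a long table is conditional aggregation. The sketch below shows the pattern with the standard-library `sqlite3` module on a tiny hand-made long table (it is not run through `results.sql()`, and whether that method accepts this query unchanged is an assumption); the `q_ls` and `q_nu` values are copied from the output above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (id INTEGER, data_type TEXT, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO self VALUES (?, 'answer', ?, ?)",
    [
        (0, "q_ls", "3"), (1, "q_ls", "3"), (2, "q_ls", "7"),
        (0, "q_nu", "1200"), (1, "q_nu", "1200"), (2, "q_nu", "500"),
    ],
)

# Each CASE keeps the value for one id and yields NULL otherwise;
# max() ignores the NULLs, leaving one value per key and id.
pivot = conn.execute("""
    select
        key,
        max(case when id = 0 then value end) as agent0,
        max(case when id = 1 then value end) as agent1,
        max(case when id = 2 then value end) as agent2
    from self
    group by key
    order by key
""").fetchall()
print(pivot)
```

Adding another result to the comparison then takes one more `max(case ...)` line instead of another join.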


The "long" view can also be useful for exporting data:

In [30]:
results.sql("select * from self", shape="long")

| | id | data_type | key | value |
|---|---|---|---|---|
| 0 | 0 | agent | persona | You are an adult woman living in the US. |
| 1 | 0 | agent | age | You are 55 years old. |
| 2 | 0 | agent | age_numerical | 55 |
| 3 | 0 | agent | height | You are 72 inches tall. |
| 4 | 0 | agent | height_numerical | 72 |
| ... | ... | ... | ... | ... |
| 6294 | 99 | question_text | q_mc_question_text | How often do you shop for {{item}}? |
| 6295 | 99 | question_text | q_li_question_text | What improvements would you like to see in {{i... |
| 6296 | 99 | question_text | q_bg_question_text | Estimate the percentage of your total time spe... |
| 6297 | 99 | question_text | q_cb_question_text | Which of the following factors are important t... |


### Show the table schema:

You can use the method `.show_schema()` to see the table schema. This method also requires a `shape` argument.

Note that using `shape="wide"` in the `.sql()` method will display all columns horizontally, so each of the column names is listed when we do `.show_schema(shape="wide")`. 

In [31]:
results.show_schema(shape="long")

Type: table, Name: self, SQL: CREATE TABLE self (
                id INTEGER,
                data_type TEXT,
                key TEXT, 
                value TEXT
            )



In [32]:
results.show_schema(shape="wide", remove_prefix=True)

    cid                     name     type  notnull dflt_value  pk
0     0                      age     TEXT        0       None   0
1     1            age_numerical   BIGINT        0       None   0
2     2               agent_name     TEXT        0       None   0
3     3        frequency_penalty   BIGINT        0       None   0
4     4                   height     TEXT        0       None   0
5     5         height_numerical   BIGINT        0       None   0
6     6                     item     TEXT        0       None   0
7     7                iteration   BIGINT        0       None   0
8     8                 logprobs  BOOLEAN        0       None   0
9     9               max_tokens   BIGINT        0       None   0
10   10                    model     TEXT        0       None   0
11   11                  persona     TEXT        0       None   0
12   12         presence_penalty   BIGINT        0       None   0
13   13                     q_bg     TEXT        0       None   0
14   14   
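
The two schema outputs above have the same shape as SQLite's catalog entry and its `PRAGMA table_info` listing. Assuming the tables are SQLite tables (an inference from that output, not something this notebook states), the same information can be reproduced on a stand-alone table with the standard-library `sqlite3` module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (id INTEGER, data_type TEXT, key TEXT, value TEXT)")

# The stored CREATE statement lives in the sqlite_master catalog table.
create_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'self'"
).fetchone()[0]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(self)").fetchall()

print(create_sql)
for column in columns:
    print(column)
```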

## Visualization methods
Available visualization methods include: <br>
`.word_cloud_plot()` <br>
`.bar_chart()` <br>
`.faceted_bar_chart()` <br>
`.histogram_plot()` <br>
and an interactive html method: <br>
`.select(...).print(html=True, pretty_labels = {...}, interactive = True)` <br>

For questions where `Scenario` parameters should be isolated, we can first apply the `.filter()` method to the results.

### Word cloud:

In [33]:
results.word_cloud_plot("q_ft")

### Bar chart:

In [34]:
results.filter("item == 'shoes'").bar_chart("q_mc", title = "How often do you shop for shoes?")

### Faceted bar chart:

In [35]:
results.faceted_bar_chart("q_mc", "item", title = "Shopping frequency")

### Histogram: 

In [36]:
results.filter("item == 'shoes'").histogram_plot("q_nu", title = "Estimated annual spend on shoes")

## Exporting data

We can export `Results` to a CSV file by converting them to a dataframe and calling its `.to_csv()` method:

In [37]:
results.to_pandas().to_csv("results.csv")

We can also export `Results` using SQL:

In [38]:
csv_string = results.sql("select * from self", shape="long", csv=True)

In [39]:
with open("example.csv", "w") as f:
    f.write(csv_string)

In [40]:
! head "example.csv"

id,data_type,key,value
0,agent,persona,You are an adult woman living in the US.
0,agent,age,You are 55 years old.
0,agent,age_numerical,55
0,agent,height,You are 72 inches tall.
0,agent,height_numerical,72
0,agent,agent_name,Agent_20
0,scenario,item,clothes
0,model,temperature,0.5
0,model,max_tokens,1000
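
The `! head` cell relies on a shell command that may not be available everywhere (for example, on a stock Windows setup). A portable sketch of the same write-then-preview round trip using only the standard library is shown below; the three rows are copied from the output above, and the temporary file path is an assumption made to keep the example self-contained:

```python
import csv
import os
import tempfile

long_rows = [
    (0, "agent", "persona", "You are an adult woman living in the US."),
    (0, "agent", "age_numerical", "55"),
    (0, "scenario", "item", "clothes"),
]

# Write the long-format rows with a header, as in example.csv.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "data_type", "key", "value"])
    writer.writerows(long_rows)

# Portable replacement for `! head`: read back at most the first 10 rows.
with open(path, newline="") as f:
    head = [row for _, row in zip(range(10), csv.reader(f))]

for row in head:
    print(row)
```

Note that values read back from a CSV file are strings, so numeric fields such as `id` need to be converted again before use.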


---
<p style="font-size: 14px;">Copyright © 2024 Go Emeritus, Inc. All rights reserved.   <a href="www.goemeritus.com" style="color:#130061">www.goemeritus.com</a></p>