# Analyzing course evaluations
This notebook provides sample EDSL code for using a language model to analyze a set of course evaluations. The analysis is designed as a survey of questions about the evaluations that we prompt an AI agent to answer, using a language model to generate the responses as a dataset.

[EDSL](https://pypi.org/project/edsl/) is an open-source Python package for simulating surveys and experiments with AI agents and language models. Please [see our docs](https://docs.expectedparrot.com/en/latest/index.html#) for tips on getting started.

## Technical setup
Before running the code below, please see instructions for [installing EDSL](https://docs.expectedparrot.com/en/latest/installation.html) and [storing API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use. 

## Create questions
We start by creating questions about the evaluations for an agent to answer. EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) (multiple choice, free text, etc.) that we can choose from based on the desired format of the response (e.g., a selection from a list of options, unstructured text, etc.). We can use a `{{ placeholder }}` in each question text in order to parameterize it with each evaluation. This allows us to create different "scenarios" of the questions that we can administer together.
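To make the parameterization concrete, here is a minimal standard-library sketch of how a `{{ placeholder }}` could be filled in once per evaluation. The `render` helper is hypothetical and only illustrates the idea; it is not how EDSL renders prompts internally.

```python
import re


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} in the template with values[name] (hypothetical helper)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(values[match.group(1)]),
        template,
    )


template = "What is the overall sentiment of this evaluation: {{ evaluation }}"
samples = ["Excellent introductory course!", "This class was a struggle for me."]

# One rendered question text per evaluation, i.e. one "scenario" of the question each.
prompts = [render(template, {"evaluation": text}) for text in samples]
for prompt in prompts:
    print(prompt)
```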

Here we select some question types:

In [1]:
from edsl.questions import QuestionList, QuestionMultipleChoice

Here we compose some questions in the relevant question type templates (see [examples of all types](https://docs.expectedparrot.com/en/latest/questions.html#question-type-classes) in the docs):

In [2]:
q_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="What is the overall sentiment of this evaluation: {{ evaluation }}",
    question_options=["Positive", "Neutral", "Negative"],
)

q_themes = QuestionList(
    question_name="themes",
    question_text="Summarize the key points of this evaluation: {{ evaluation }}",
    max_list_items=3,  # Optional
)

q_improvements = QuestionList(
    question_name="improvements",
    question_text="Identify areas for improvement based on this evaluation: {{ evaluation }}",
    max_list_items=3,
)

## Construct a survey
Next we combine our questions into a survey. This allows us to administer the questions asynchronously (by default), or according to any desired [survey logic or rules](https://docs.expectedparrot.com/en/latest/surveys.html) that we want to add, such as skip/stop rules or giving an agent "memories" of other questions in the survey. Here we create a simple asynchronous survey by passing the list of questions to a `Survey` object:

In [3]:
from edsl import Survey

survey = Survey(questions=[q_sentiment, q_themes, q_improvements])

## Select data for review
Next we identify the data to be analyzed. Here we use some mock evaluations for an Econ 101 course stored as a list of texts:

In [4]:
evaluations = [
    "I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",
    "This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest.",
    "Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.",
    "As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",
    "I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes.",
    "The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.",
    "This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies.",
    "I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics.",
    "The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles.",
    "This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material.",
]

## Add data to the questions
Next we create a `Scenario` for each evaluation that we will add to the questions when we run the survey:

In [5]:
from edsl import ScenarioList

scenarios = ScenarioList.from_list("evaluation", evaluations)

## Design AI agents
Next we design agents with relevant traits and personas for the model to use in answering the questions. This can be useful if we want to compare responses among different audiences. We do this by passing a dictionary of `traits` to an `Agent` object. We can also choose whether to give an agent additional instructions for answering the survey (independent of individual question texts). Here we create a persona for the professor of the course and pass it some special instructions:

In [6]:
from edsl import Agent

persona = (
    "You are a professor reviewing student evaluations for your recent Econ 101 course."
)
instruction = "Be very specific and constructive in providing feedback and suggestions."

agent = Agent(traits={"persona": persona}, instruction=instruction)

## Select language models
EDSL works with many popular language models that we can use to generate responses for our survey. We can see a current list of all available models:

In [7]:
from edsl import Model
# To see available models, run: Model.available()

We select models to use with a survey by creating `Model` objects for them. The default model is GPT 4 Preview, meaning that EDSL will use it to run our survey if we do not specify a different model (with API keys stored). For purposes of demonstration, we create a `Model` object for the default model and pass it to the survey explicitly, the same way that we would for any other model:

In [8]:
model = Model()

Learn more about available [language models and methods](https://docs.expectedparrot.com/en/latest/language_models.html).

## Run the survey
Now we add the scenarios and agent to the survey, and then run it with the specified model. This will generate a dataset of responses that we can store and begin analyzing:

In [9]:
results = survey.by(scenarios).by(agent).by(model).run()
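The chained `.by()` calls combine the survey with every scenario, agent, and model. Assuming one set of answers is generated per combination (an assumption about how the job is laid out, consistent with the ten sentiment answers tallied later in this notebook), the size of the job can be estimated with a short standard-library sketch. The counts mirror this notebook: ten evaluations, one agent, one model, and three questions.

```python
from itertools import product

n_scenarios, n_agents, n_models, n_questions = 10, 1, 1, 3

# Every (scenario, agent, model) combination gets its own pass through the survey.
combinations = list(product(range(n_scenarios), range(n_agents), range(n_models)))

print(len(combinations))                # number of result rows
print(len(combinations) * n_questions)  # number of individual answers
```

Adding a second agent or model would multiply the number of rows accordingly, which is worth keeping in mind when estimating API usage.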

## Inspect the responses
EDSL comes with [built-in methods for analyzing results](https://docs.expectedparrot.com/en/latest/results.html) in data tables, dataframes, SQL queries and other formats. We can print a list of all the components that can be accessed. Here we will just look at the first 5:

In [10]:
results.columns[:5]

['agent.agent_instruction',
 'agent.agent_name',
 'agent.persona',
 'answer.improvements',
 'answer.sentiment']

For example, we can transform the results into a dataframe:

In [11]:
df = results.to_pandas()
df.head()

Unnamed: 0,answer.sentiment,answer.improvements,answer.themes,scenario.evaluation,agent.persona,agent.agent_name,agent.agent_instruction,model.presence_penalty,model.frequency_penalty,model.logprobs,...,question_text.sentiment_question_text,question_options.improvements_question_options,question_options.themes_question_options,question_options.sentiment_question_options,question_type.themes_question_type,question_type.sentiment_question_type,question_type.improvements_question_type,comment.sentiment_comment,comment.themes_comment,comment.improvements_comment
0,Positive,"['Adjust the course pacing', 'Provide addition...","['Course was engaging and informative', 'Profe...",I found the course very engaging and informati...,You are a professor reviewing student evaluati...,Agent_0,Be very specific and constructive in providing...,0,0,False,...,What is the overall sentiment of this evaluati...,,,"['Positive', 'Neutral', 'Negative']",list,multiple_choice,list,"The evaluation is generally positive, highligh...",The feedback highlights the positive aspects o...,The student's feedback indicates that while th...
1,Negative,['Incorporate more real-world examples into le...,['Student found the material dry and unengagin...,This class was a struggle for me. The material...,You are a professor reviewing student evaluati...,Agent_0,Be very specific and constructive in providing...,0,0,False,...,What is the overall sentiment of this evaluati...,,,"['Positive', 'Neutral', 'Negative']",list,multiple_choice,list,The evaluation suggests that the student had d...,The student's evaluation indicates a need for ...,The feedback suggests that the student found t...
2,Positive,"['Incorporate more real-world case studies', '...","['Enthusiastic teaching style', 'Availability ...",Excellent introductory course! The professor w...,You are a professor reviewing student evaluati...,Agent_0,Be very specific and constructive in providing...,0,0,False,...,What is the overall sentiment of this evaluati...,,,"['Positive', 'Neutral', 'Negative']",list,multiple_choice,list,The evaluation is clearly appreciative of the ...,The feedback highlights the positive teaching ...,"The feedback is generally positive, suggesting..."
3,Neutral,['Incorporate more real-world examples into le...,"['Appreciated analytical rigor', 'Desire for m...","As someone with a strong background in math, I...",You are a professor reviewing student evaluati...,Agent_0,Be very specific and constructive in providing...,0,0,False,...,What is the overall sentiment of this evaluati...,,,"['Positive', 'Neutral', 'Negative']",list,multiple_choice,list,The evaluation suggests satisfaction with the ...,The student values the analytical aspect of th...,The feedback suggests that while the analytica...
4,Positive,['Provide more detailed feedback on assignment...,['Student appreciated group projects and pract...,"I enjoyed the course, especially the group pro...",You are a professor reviewing student evaluati...,Agent_0,Be very specific and constructive in providing...,0,0,False,...,What is the overall sentiment of this evaluati...,,,"['Positive', 'Neutral', 'Negative']",list,multiple_choice,list,The evaluation expresses enjoyment and appreci...,The feedback is constructive and highlights th...,The student appreciates the practical applicat...


Here we select just the responses to the questions and display them in a table:

In [12]:
results.select("sentiment", "themes", "improvements").print(format="rich")

We can do a quick tally of the sentiments:

In [13]:
results.select("sentiment").tally().print(format = "rich")

We can also use pandas methods by first converting the results to a dataframe:

In [14]:
df_sentiment = results.to_pandas()["answer.sentiment"]
df_sentiment.value_counts()

answer.sentiment
Positive    4
Neutral     4
Negative    2
Name: count, dtype: int64
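The same kind of tally can be done without pandas. As a small sketch, the list below hard-codes only the five sentiment answers visible in the dataframe preview above (not the full set of ten) and counts them with `collections.Counter`.

```python
from collections import Counter

# Sentiments for the first five evaluations, copied from the dataframe preview.
first_five = ["Positive", "Negative", "Positive", "Neutral", "Positive"]

counts = Counter(first_five)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```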

## Use responses to construct new questions
We can use the responses to our initial questions to construct more questions about the texts. For example, we can prompt a model to condense the individual lists of themes and areas for improvement into short lists, and then use the new lists to quantify the topics across the set of evaluations.

Here we take the lists of themes in each evaluation, flatten them into a (duplicative) list, and then create a new question prompting a model to condense it for us:

In [15]:
themes = results.select("themes").to_list(flatten=True)
themes

['Course was engaging and informative',
 'Professor effectively simplified complex concepts',
 'Course pace was fast, making it hard to keep up with readings',
 'Student found the material dry and unengaging',
 'Difficulty in relating the material to real-world applications',
 'Suggestion to include more current event examples',
 'Enthusiastic teaching style',
 'Availability for extra help',
 'Effective use of interactive lectures and practical assignments',
 'Appreciated analytical rigor',
 'Desire for more real-world applications',
 'Felt theories were isolated from practical realities',
 'Student appreciated group projects and practical application of economic concepts',
 'Student desires more detailed feedback on assignments',
 'Student enjoyed the course overall',
 'Course content organization is good',
 'Lectures are monotonous and hard to follow',
 'Incorporate more visual aids and guest lectures',
 'Effective integration of theory and practical case studies',
 'Fairness of exam
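Flattening here means concatenating the per-evaluation lists into a single list, keeping order and duplicates. The sketch below reproduces that operation in plain Python for the first two evaluations, using the themes shown in the output above; it illustrates the result of `flatten=True`, not how EDSL implements it.

```python
from itertools import chain

# Themes for the first two evaluations, copied from the output above.
themes_per_evaluation = [
    [
        "Course was engaging and informative",
        "Professor effectively simplified complex concepts",
        "Course pace was fast, making it hard to keep up with readings",
    ],
    [
        "Student found the material dry and unengaging",
        "Difficulty in relating the material to real-world applications",
        "Suggestion to include more current event examples",
    ],
]

# Concatenate the inner lists, preserving order and any duplicates.
flat_themes = list(chain.from_iterable(themes_per_evaluation))
print(len(flat_themes))
```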

Next we construct a question to condense the list into a new list:

In [16]:
q_condensed_themes = QuestionList(
    question_name="condensed_themes",
    question_text="""Combine the following list of themes extracted from the evaluations 
    into a consolidated, non-redundant list: """
    + ", ".join(themes),
    max_list_items=10,
)

Now we run the question and select the new list. Note that the agent is optional here: to run the question without it, we simply do not add the agent to the question when we run it:

In [17]:
condensed_themes = q_condensed_themes.run().select("condensed_themes").to_list()[0]


Now we can create a question to identify all the themes in the list that appear in each evaluation (our new list becomes the list of answer options):

In [18]:
from edsl.questions import QuestionCheckBox

q_themes_list = QuestionCheckBox(
    question_name="themes_list",
    question_text="Select all of the themes that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_themes,
)

Here we run the question and show a table listing all the themes for each evaluation in the results:

In [19]:
themes_lists = q_themes_list.by(scenarios).by(agent).run()
themes_lists.select("evaluation", "themes_list").print(format="rich")

Next we expand the results so that each row pairs an evaluation with a single theme:

In [20]:
wide_evaluation_themes = themes_lists.select("evaluation", "themes_list").to_scenario_list().expand("themes_list").rename({"themes_list": "theme"})
wide_evaluation_themes.print(max_rows = 10)

evaluation,theme
"I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.","Course was engaging, informative, and provided a good foundational understanding of economics"
"I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",Professor simplified complex concepts and was passionate about the subject
"I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.","Need for better pacing, more accessible reading materials, and diverse assessment methods"
"This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest.",Material sometimes found dry and challenging to relate to real-world applications
"This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest.","Demand for increased real-world examples, interactive lectures, and practical assignments"
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.,"Course was engaging, informative, and provided a good foundational understanding of economics"
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.,Professor simplified complex concepts and was passionate about the subject
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.,"Demand for increased real-world examples, interactive lectures, and practical assignments"
Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.,"Appreciation for analytical rigor, enthusiastic teaching style, and availability for extra help"
"As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",Material sometimes found dry and challenging to relate to real-world applications
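Conceptually, the expand step turns one row per evaluation (holding a list of themes) into one row per evaluation-theme pair, which makes counting straightforward. The plain-Python sketch below mimics that reshaping and a subsequent tally for the first three evaluations in the table above. Short labels such as `"evaluation 1"` stand in for the full evaluation texts, and the constant names are illustrative only.

```python
from collections import Counter

# Theme labels copied from the table above.
ENGAGING = "Course was engaging, informative, and provided a good foundational understanding of economics"
PROFESSOR = "Professor simplified complex concepts and was passionate about the subject"
PACING = "Need for better pacing, more accessible reading materials, and diverse assessment methods"
DRY = "Material sometimes found dry and challenging to relate to real-world applications"
DEMAND = "Demand for increased real-world examples, interactive lectures, and practical assignments"
RIGOR = "Appreciation for analytical rigor, enthusiastic teaching style, and availability for extra help"

# Selected themes per evaluation (first three evaluations only).
selected = {
    "evaluation 1": [ENGAGING, PROFESSOR, PACING],
    "evaluation 2": [DRY, DEMAND],
    "evaluation 3": [ENGAGING, PROFESSOR, DEMAND, RIGOR],
}

# "Expand": emit one (evaluation, theme) row per selected theme.
rows = [(evaluation, theme) for evaluation, themes in selected.items() for theme in themes]

# "Tally": count how many evaluations mention each theme.
theme_counts = Counter(theme for _, theme in rows)
print(len(rows))
```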


Now we can count the number of evaluations that mention each of the themes:

In [21]:
wide_evaluation_themes.tally("theme").print(format="rich")

We can do the same thing with the areas of improvement:

In [22]:
improvements = results.select("improvements").to_list(flatten=True)
improvements

['Adjust the course pacing',
 'Provide additional support for weekly readings',
 'Consider supplemental review sessions',
 'Incorporate more real-world examples into lectures',
 'Use current events to illustrate economic principles',
 'Develop interactive class activities to enhance engagement',
 'Incorporate more real-world case studies',
 'Enhance the use of multimedia to supplement lectures',
 'Offer additional resources for advanced study',
 'Incorporate more real-world examples into lectures',
 'Facilitate class discussions on current economic events',
 'Assign case studies or articles that relate theory to practice',
 'Provide more detailed feedback on assignments',
 'Clarify understanding of economic concepts through feedback',
 'Offer additional resources or sessions to review common mistakes',
 'Enhance lecture engagement through varied teaching methods',
 'Incorporate more visual aids to support learning',
 'Invite guest speakers to provide practical insights',
 'Explore addi

In [23]:
q_condensed_improvements = QuestionList(
    question_name="condensed_improvements",
    question_text="""Combine the following list of areas for improvement from the evaluations 
    into a consolidated, non-redundant list: """
    + ", ".join(improvements),
    max_list_items=10,
)
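Because the flattened list can contain exact repeats (for example, 'Incorporate more real-world examples into lectures' appears twice in the output above), one optional step, which is not part of the original workflow, is to drop exact duplicates before joining the list into the prompt. A minimal sketch using a few of the items above:

```python
# A few items copied from the flattened improvements list above.
sample_improvements = [
    "Adjust the course pacing",
    "Incorporate more real-world examples into lectures",
    "Incorporate more real-world case studies",
    "Incorporate more real-world examples into lectures",
]

# dict.fromkeys keeps the first occurrence of each item and preserves order.
unique_improvements = list(dict.fromkeys(sample_improvements))

prompt_suffix = ", ".join(unique_improvements)
print(prompt_suffix)
```

This only removes verbatim repeats; near-duplicates with different wording are still left for the model to consolidate.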

In [24]:
condensed_improvements = (
    q_condensed_improvements.run().select("condensed_improvements").to_list()[0]
)
condensed_improvements

['Adjust course pacing',
 'Provide additional support for weekly readings and resources for advanced study',
 'Incorporate more real-world examples and case studies into lectures and assignments',
 'Enhance engagement through interactive activities, varied teaching methods, and guest speakers',
 'Use multimedia, visual aids, and diverse assessment methods to support learning',
 'Facilitate class discussions on current economic events',
 'Offer additional review sessions and resources to clarify concepts and address common mistakes',
 'Enhance online resources and support',
 'Reevaluate textbook choice and supplement with simpler materials and a glossary',
 'Explore alternative teaching strategies to accommodate different learning styles']

In [26]:
q_improvements_list = QuestionCheckBox(
    question_name="improvements_list",
    question_text="Select all of the improvements that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_improvements,
)

In [27]:
improvements_lists = q_improvements_list.by(scenarios).by(agent).run()
improvements_lists.select("evaluation", "improvements_list").print(format="rich")

In [29]:
wide_themes = (improvements_lists
               .select("evaluation", "improvements_list")
               .to_scenario_list()
               .expand("improvements_list")
               .rename({"improvements_list": "theme"})
)

In [30]:
wide_themes.tally("theme").print(format="rich")

In [31]:
improvements_summary = wide_themes.tally("theme")

In [32]:
summary_string = improvements_summary.print(format = "markdown", return_string = True)

## Summarize the review
Here we create another question prompting the agent to summarize the analysis that was done, using the results of the prior steps:

In [33]:
from edsl.questions import QuestionFreeText

q_summary = QuestionFreeText(
    question_name="summary",
    # Separate the instruction, labels, and tables with newlines so they do not run together.
    question_text="Consider the following analyses of the evaluations and draft a paragraph summarizing them.\n"
    + "Evaluation counts by theme:\n"
    + wide_evaluation_themes.tally("theme").print(format="markdown", return_string=True)
    + "\nEvaluation counts by area of improvement:\n"
    + summary_string,
)

summary = q_summary.by(agent).run()
summary.select("summary").print(format="rich")

## Other examples
Please check out the [EDSL Docs](https://docs.expectedparrot.com/en/latest/index.html) for examples of other methods and templates for use cases, and [join our Discord channel](https://discord.com/invite/mxAYkjfy9m) to ask questions and connect with other users!