# Analyzing course evaluations
This notebook provides sample EDSL code for using a language model to analyze a set of course evaluations. The analysis is structured as a series of questions about the evaluations: we prompt an AI agent to answer them, and a language model generates the responses as a dataset.

[EDSL](https://pypi.org/project/edsl/) is an open-source Python package for simulating surveys and experiments with language models. Please [see our docs](https://docs.expectedparrot.com/en/latest/index.html#) to learn more about using it.

## Technical setup
Before running the code below, please see instructions for [installing EDSL](https://docs.expectedparrot.com/en/latest/installation.html) and [storing API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use.

In [1]:
# pip install edsl

## Creating questions
We start by creating questions about the evaluations for the agent to answer. EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) (multiple choice, free text, etc.) that we can choose from based on the desired format of the response. We can use a `{{ placeholder }}` in each question in order to parameterize it with each evaluation.

In [2]:
from edsl.questions import QuestionList, QuestionMultipleChoice

In [3]:
q_themes = QuestionList(
    question_name = "themes",
    question_text = """Consider the following evaluation, and then provide a sentence summarizing
    each of the key points in it: {{ evaluation }}""",
    max_list_items = 3
)

q_sentiment = QuestionMultipleChoice(
    question_name = "sentiment",
    question_text = "What is the overall sentiment of the following evaluation: {{ evaluation }}",
    question_options = ["Positive", "Neutral", "Negative"]
)

q_improvement = QuestionList(
    question_name = "improvement",
    question_text = """Based on the following evaluation, what are some ways you could improve 
    your course to receive more positive evaluations: {{ evaluation }}"""
)
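
To make the `{{ evaluation }}` placeholder concrete, here is a minimal standard-library sketch of the substitution it implies. It is only an illustration: the `render` helper is hypothetical and is not how EDSL fills templates internally.

```python
import re

def render(template, values):
    """Replace each {{ name }} in `template` with values[name] (hypothetical helper)."""
    def substitute(match):
        key = match.group(1)
        # Leave unknown placeholders untouched rather than raising.
        return str(values.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

question_text = "What is the overall sentiment of the following evaluation: {{ evaluation }}"
sample = {"evaluation": "The lectures were clear but rushed."}

prompt = render(question_text, sample)
print(prompt)
```

The same question text can then be reused for every evaluation by swapping the value bound to `evaluation`.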

## Construct a survey
Next we combine our questions into a survey. This allows us to administer them asynchronously (by default), or according to any desired [survey logic or rules](https://docs.expectedparrot.com/en/latest/surveys.html), such as skip/stop rules or giving the agent "memories" of other questions in the survey. Here we create a simple asynchronous survey by passing the list of questions:

In [4]:
from edsl import Survey

survey = Survey([q_themes, q_sentiment, q_improvement])

## Selecting data for review
Next we identify the data to be analyzed. Here we use some mock evaluations for an Econ 101 course stored as a list of texts:

In [5]:
evaluations = [
    "The professor explained economic principles clearly and made the subject accessible. The course was well-structured, aligning textbook readings with practical exercises. However, the lectures sometimes felt a bit rushed. Office hours were extremely helpful, especially before exams. Overall, this was a very informative course.",
    "I appreciated the real-world applications included in the course, which made complex theories relatable. The class discussions were insightful but could be better if more structured. The workload was reasonable, though the project deadlines were tight. Feedback on assignments was constructive and timely. I'd recommend this course to those interested in understanding economics fundamentals.",
    "The course was challenging, particularly the sections on microeconomics. The professor was knowledgeable but sometimes skipped over foundational concepts too quickly. Visual aids used in lectures enhanced learning, though more interactive components would be beneficial. Exams were fair but demanding. Overall, it was a rigorous introduction to economics.",
    "This Econ 101 course was engaging, thanks to the professor's enthusiasm and clear explanations. Group projects facilitated a deeper understanding of the material, although coordinating times was sometimes difficult. The course materials were comprehensive and well-organized. I found the textbook particularly useful. This class solidified my interest in further economic studies.",
    "The course effectively covered a broad range of economic theories and models. However, the pace was sometimes too fast, making it hard to absorb all the information. The use of current events helped illustrate how economic principles apply in real life. I wish there were more opportunities for revising core concepts before exams. The professor was approachable and knowledgeable.",
    "The professor used a variety of teaching methods, which catered to different learning styles. The PowerPoint presentations were detailed and useful for revising. However, the reliance on multiple-choice questions for assessments didn't fully test our understanding of the material. Class participation was encouraged, which made the course interactive. More case studies could improve the learning experience.",
    "As a beginner to economics, I found the course to be a solid foundation. The mathematical aspects were well explained, but the economic theories could be abstract and challenging. The assignments were helpful for reinforcing the lectures. Feedback was sometimes slow but always helpful. This course demands attention and study but is very rewarding.",
    "This was a well-taught course with a clear emphasis on practical applications of economics. The examples used in class were current and relevant, which kept the content interesting. The exams were comprehensive and a good measure of our understanding. However, the reading material was extensive and sometimes overwhelming. More supplemental resources would be helpful.",
    "The professor was passionate about economics, which made the class more engaging. The lectures were packed with information, and the examples provided were relevant. The grading was strict but fair, and it pushed us to work harder. The final project was a great way to apply what we learned in a practical scenario. I would recommend this course for anyone looking to start a career in economics.",
    "This Econ 101 course was informative and well-paced. The professor was always prepared and encouraged questions, making the material accessible. However, the lecture hall was often overcrowded, which made it hard to interact during class. The tests required a deep understanding of the material, which was good for learning. More visual aids and examples during lectures would make this course perfect.",
]

## Add the data to the questions
Next we add the data as "scenarios" of the questions in order to run each question for each evaluation:

In [6]:
from edsl import Scenario

scenarios = [Scenario({"evaluation":e}) for e in evaluations]

## Design AI agents to answer the questions
Next we can design agents with relevant traits and personas to answer the questions. Here we create a persona for the professor for the course. (We could also try some third parties with coaching or other expertise for comparison!)

In [7]:
from edsl import Agent

persona = "You are a professor reviewing student evaluations for your recent Econ 101 course."

agent = Agent(traits = {"persona": persona})

## Run the survey
To generate responses, we add the scenarios and agent to the survey, and then run it:

In [8]:
results = survey.by(scenarios).by(agent).run()
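
As a rough mental model of what this call fans out over, the sketch below enumerates (scenario, agent) pairs with `itertools.product`. It assumes one result per pair, each holding answers to all of the survey questions; the dictionaries are plain stand-ins, not EDSL objects.

```python
from itertools import product

# Stand-ins for the notebook's objects (plain data, not EDSL classes).
question_names = ["themes", "sentiment", "improvement"]
scenario_stubs = [{"evaluation": f"evaluation text {i}"} for i in range(10)]
agent_stubs = [{"persona": "professor"}]

# One job per (scenario, agent) pair; each job covers every question.
jobs = list(product(scenario_stubs, agent_stubs))
total_answers = len(jobs) * len(question_names)

print(len(jobs), "pairs,", total_answers, "answers")
```

Adding a second agent (for example, a teaching coach persona) would double the number of pairs under this assumption.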

## Inspecting the responses
EDSL comes with built-in methods for analyzing results in data tables, dataframes, SQL queries and other formats. We can print a list of all the components that can be accessed:

In [9]:
results.columns

['agent.agent_name',
 'agent.persona',
 'answer.improvement',
 'answer.sentiment',
 'answer.themes',
 'comment.improvement_comment',
 'comment.sentiment_comment',
 'comment.themes_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.improvement_system_prompt',
 'prompt.improvement_user_prompt',
 'prompt.sentiment_system_prompt',
 'prompt.sentiment_user_prompt',
 'prompt.themes_system_prompt',
 'prompt.themes_user_prompt',
 'question_options.improvement_question_options',
 'question_options.sentiment_question_options',
 'question_options.themes_question_options',
 'question_text.improvement_question_text',
 'question_text.sentiment_question_text',
 'question_text.themes_question_text',
 'question_type.improvement_question_type',
 'question_type.sentiment_question_type',
 'question_type.themes_question_type',
 'raw_model_respons

Here we select just the responses to the questions and display them in a table:

In [10]:
results.select("themes", "sentiment", "improvement").print(format="rich")

We can tally the sentiments:

In [11]:
sentiment_counts = results.to_pandas()['answer.sentiment'].value_counts()
sentiment_counts

answer.sentiment
Positive    9
Neutral     1
Name: count, dtype: int64
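
If pandas is not needed elsewhere, the same tally can be sketched with `collections.Counter`. The `sentiments` list below is a hypothetical stand-in that mirrors the tally above; in the notebook it would presumably come from something like `results.select("sentiment").to_list()`.

```python
from collections import Counter

# Hypothetical stand-in for the ten sentiment answers.
sentiments = ["Positive"] * 9 + ["Neutral"]

sentiment_counts = Counter(sentiments)

# Report every answer option, including ones that were never chosen.
for option in ["Positive", "Neutral", "Negative"]:
    share = sentiment_counts[option] / len(sentiments)
    print(f"{option:<9}{sentiment_counts[option]:>3}  {share:.0%}")
```

Unlike `value_counts()`, which omits options that never occur, a `Counter` returns 0 for a missing key, so "Negative" still appears in the report.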

## Using responses to construct new questions
We can use the responses to our initial questions to construct new questions about the texts. For example, we could condense the individual lists of themes and improvements into short lists, and then use them to quantify themes and areas for improvement across the set of evaluations.

Here we take the lists of themes in each evaluation and flatten them into a list that we will prompt an agent to condense for us:

In [12]:
themes = results.select("themes").to_list(flatten=True)
themes

['Clear explanation of principles',
 'Well-structured course',
 'Rushed lectures',
 'real-world applications appreciated',
 'desire for more structured discussions',
 'constructive and timely feedback',
 'Challenging microeconomics',
 'Knowledgeable, fast-paced teaching',
 'Visual aids helpful, interactive elements needed',
 'Engaging and clear instruction',
 'Group projects deepened understanding, with some coordination challenges',
 'Comprehensive and well-organized materials',
 'Comprehensive coverage',
 'Fast pace',
 'Real-life application',
 'Diverse teaching methods',
 'Effective PowerPoint presentations',
 'Interactive class participation',
 'solid foundation',
 'abstract theories',
 'rewarding but demanding',
 'Clear practical emphasis',
 'Relevant examples',
 'Comprehensive exams',
 'Passionate teaching',
 'Information-rich lectures with relevant examples',
 'Strict but fair grading and practical final project',
 'Informative and well-paced',
 'Encouraged questions and accessi

Next we construct the question prompting the agent to condense the list into a new list:

In [13]:
q_condensed_themes = QuestionList(
    question_name = "condensed_themes",
    question_text = """Combine the following list of themes extracted from the evaluations 
    into a consolidated, non-redundant list: """ + ", ".join(themes),
    max_list_items = 10
)
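
The extracted themes vary in capitalization and may repeat across evaluations, so it can help to normalize them before building the prompt. This is an optional sketch, not part of the original workflow; `dedupe_preserving_order` is a hypothetical helper and the sample list is made up.

```python
def dedupe_preserving_order(items):
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    unique = []
    for item in items:
        key = item.strip().casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(item.strip())
    return unique

# Made-up sample with repeated and differently cased entries.
sample_themes = ["Fast pace", "fast pace", "Relevant examples", " Fast pace ", "Rushed lectures"]
unique_themes = dedupe_preserving_order(sample_themes)

prompt_suffix = ", ".join(unique_themes)
print(prompt_suffix)
```

Only exact repeats are removed here; near-duplicates such as "Fast pace" and "Rushed lectures" are still left for the model to merge.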

Now we run the question and select the new list:

In [14]:
condensed_themes = q_condensed_themes.run().select("condensed_themes").to_list()[0]
condensed_themes

['Well-structured and comprehensive course',
 'Clear and engaging instruction with real-world applications',
 'Fast-paced and information-rich lectures',
 'Interactive and diverse teaching methods with visual aids',
 'Constructive feedback with timely responses',
 'Demanding curriculum with challenging content',
 'Structured discussions and group projects with coordination emphasis',
 'Knowledgeable and passionate teaching',
 'Effective use of PowerPoint and practical emphasis',
 'Facilities and resource management']

Now we can create a new question prompting the agent to identify all the themes in the list that appear in each evaluation (our new list becomes the list of answer options):

In [15]:
from edsl.questions import QuestionCheckBox

q_themes_list = QuestionCheckBox(
    question_name = "themes_list",
    question_text = "Select all of the themes that are mentioned in this evaluation: {{ evaluation }}",
    question_options = condensed_themes
)

Here we run the question and show a table listing all the themes for each evaluation:

In [16]:
themes_lists = q_themes_list.by(scenarios).run()
themes_lists.select("evaluation", "themes_list").print(format="rich")

Now we can count the number of evaluations that mention each of the themes:

In [18]:
import pandas as pd
from collections import Counter

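# One list of selected themes per evaluation
themes_lists = themes_lists.select("themes_list").to_list()

# Pair each theme with its evaluation index; set() drops repeats within an evaluation
flat_list = [(theme, idx) for idx, themes in enumerate(themes_lists) for theme in themes]
count = Counter(theme for theme, idx in set(flat_list))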

df = pd.DataFrame(list(count.items()), columns=['Theme', 'Evaluations'])
print(df.sort_values(by='Evaluations', ascending=False))

                                               Theme  Evaluations
2  Clear and engaging instruction with real-world...            7
0              Knowledgeable and passionate teaching            6
1           Well-structured and comprehensive course            6
4      Demanding curriculum with challenging content            6
5           Fast-paced and information-rich lectures            4
3  Interactive and diverse teaching methods with ...            2
6        Constructive feedback with timely responses            2
9  Effective use of PowerPoint and practical emph...            2
7                 Facilities and resource management            1
8  Structured discussions and group projects with...            1
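
The `(theme, idx)` pairs in the cell above are passed through `set` so that a theme is counted at most once per evaluation, even if a response lists it twice. A toy example with made-up selections shows the effect:

```python
from collections import Counter

# Made-up checkbox answers for three evaluations; the first repeats a theme.
toy_lists = [
    ["Fast pace", "Fast pace", "Clear instruction"],
    ["Clear instruction"],
    ["Fast pace"],
]

pairs = [(theme, idx) for idx, themes in enumerate(toy_lists) for theme in themes]

naive = Counter(theme for theme, _ in pairs)          # counts every mention
per_eval = Counter(theme for theme, _ in set(pairs))  # counts each evaluation once

print(naive["Fast pace"], per_eval["Fast pace"])
```

Here the naive count for "Fast pace" is 3, while the per-evaluation count is 2, which is the figure the "Evaluations" column is meant to report.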


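We can repeat the same steps for the areas of improvement: flatten the lists, condense them into a short list, and count how many evaluations mention each item: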

In [27]:
improvements = results.select("improvement").to_list(flatten=True)
improvements

['Pace lectures better',
 'Allow more time for complex topics',
 'Include more interactive elements during lectures',
 'Provide additional review sessions before exams',
 'More structured class discussions',
 'Adjust project deadlines',
 'Slow down',
 'Review foundational concepts',
 'Increase interactivity',
 'Maintain visual aids',
 'Consider exam difficulty',
 'Scheduling tools',
 'Doodle polls',
 'Group project guidelines',
 'Time management strategies',
 'Adjust pacing',
 'Review sessions',
 'Supplemental materials',
 'Interactive learning activities',
 'Diversify assessments',
 'Incorporate case studies',
 'Reduce reliance on multiple-choice questions',
 'Include written assignments',
 'Implement project-based evaluations',
 'Clarify economic theories',
 'Use practical examples',
 'Improve feedback timeliness',
 'Reduce reading load',
 'Curate reading material',
 'Provide supplemental resources',
 'Offer optional reading lists',
 'Maintain passion and engagement',
 'Continue usin

In [28]:
q_condensed_improvements = QuestionList(
    question_name = "condensed_improvements",
    question_text = """Combine the following list of areas for improvement from the evaluations 
    into a consolidated, non-redundant list: """ + ", ".join(improvements),
    max_list_items = 10
)

In [29]:
condensed_improvements = q_condensed_improvements.run().select("condensed_improvements").to_list()[0]
condensed_improvements

['Improve pacing and allocate more time for complex topics',
 'Enhance interactivity with more interactive elements and discussions',
 'Provide additional support such as review sessions and supplemental materials',
 'Optimize assessments by diversifying types and adjusting difficulty',
 'Clarify and apply economic concepts using practical and real-life examples',
 'Improve project management with clearer guidelines and adjusted deadlines',
 'Enhance resources with curated reading materials and optional lists',
 'Maintain high engagement with passion and relevant examples',
 'Ensure fair grading and timely feedback',
 'Adjust class structure by reducing size and increasing interaction opportunities']

In [30]:
q_improvements_list = QuestionCheckBox(
    question_name = "improvements_list",
    question_text = "Select all of the improvements that are mentioned in this evaluation: {{ evaluation }}",
    question_options = condensed_improvements
)

In [31]:
improvements_lists = q_improvements_list.by(scenarios).run()
improvements_lists.select("evaluation", "improvements_list").print(format="rich")

In [32]:
import pandas as pd
from collections import Counter

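# One list of selected improvements per evaluation
improvements_lists = improvements_lists.select("improvements_list").to_list()

# Pair each improvement with its evaluation index; set() drops repeats within an evaluation
flat_list = [(item, idx) for idx, items in enumerate(improvements_lists) for item in items]
count = Counter(item for item, idx in set(flat_list))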

df = pd.DataFrame(list(count.items()), columns=['Improvement', 'Evaluations'])
print(df.sort_values(by='Evaluations', ascending=False))

                                         Improvement  Evaluations
1            Ensure fair grading and timely feedback            3
4  Improve pacing and allocate more time for comp...            3
6  Provide additional support such as review sess...            3
2  Maintain high engagement with passion and rele...            2
3  Enhance interactivity with more interactive el...            2
7  Clarify and apply economic concepts using prac...            2
0  Adjust class structure by reducing size and in...            1
5  Enhance resources with curated reading materia...            1
8  Improve project management with clearer guidel...            1
9  Optimize assessments by diversifying types and...            1
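
Since the theme and improvement tallies follow the same steps, they could be factored into one function. The sketch below is a pandas-free variant; `count_mentions` is a hypothetical name, and the input is assumed to be one list of selected labels per evaluation, like the `to_list()` output used above.

```python
from collections import Counter

def count_mentions(selections):
    """Return (label, evaluations, share) rows, most frequent first.

    `selections` holds one list of chosen labels per evaluation.
    """
    # set() ensures a label counts once per evaluation.
    counts = Counter(label for chosen in selections for label in set(chosen))
    total = len(selections)
    return [
        (label, n, n / total if total else 0.0)
        for label, n in counts.most_common()
    ]

# Made-up selections for four evaluations.
toy = [["Pacing", "Feedback"], ["Pacing"], [], ["Feedback", "Pacing"]]
rows = count_mentions(toy)
for label, n, share in rows:
    print(f"{label:<10}{n:>3}  {share:.0%}")
```

Reporting a share alongside the raw count makes results comparable if the number of evaluations changes between runs.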
