# Analyzing course evaluations
This notebook provides sample EDSL code for using a language model to analyze a set of course evaluations. The analysis is structured as a survey: an AI agent answers questions about each evaluation, and the language model's responses are returned as a dataset.

[EDSL](https://pypi.org/project/edsl/) is an open-source Python package for simulating surveys and experiments with language models. Please [see our docs](https://docs.expectedparrot.com/en/latest/index.html#) to learn more about using it.

## Technical setup
Before running the code below, please see instructions for [installing EDSL](https://docs.expectedparrot.com/en/latest/installation.html) and [storing API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use.

In [1]:
# pip install edsl

## Creating questions
We start by creating questions about the evaluations for the agent to answer. EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) (multiple choice, free text, etc.) that we can choose from based on the desired format of the response. We can use a `{{ placeholder }}` in each question in order to parameterize it with each evaluation.

In [2]:
from edsl.questions import QuestionList, QuestionMultipleChoice

In [3]:
q_themes = QuestionList(
    question_name = "themes",
    question_text = """Consider the following evaluation, and then provide a sentence summarizing
    each of the key points in it: {{ evaluation }}""",
    max_list_items = 3
)

q_sentiment = QuestionMultipleChoice(
    question_name = "sentiment",
    question_text = "What is the overall sentiment of the following evaluation: {{ evaluation }}",
    question_options = ["Positive", "Neutral", "Negative"]
)

q_improvement = QuestionList(
    question_name = "improvement",
    question_text = """Based on the following evaluation, what are some ways you could improve 
    your course to receive more positive evaluations: {{ evaluation }}"""
)
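To make the `{{ evaluation }}` placeholder concrete, here is a minimal sketch of the substitution idea using only the standard library. The `render` helper is hypothetical and written for illustration; it is not EDSL's rendering code, which fills in the placeholders itself when scenarios are added later.

```python
import re

def render(template: str, values: dict) -> str:
    """Fill {{ name }} placeholders from `values` (illustration only, not EDSL's renderer)."""
    def substitute(match):
        # group(1) is the placeholder name between the braces
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

question_text = "What is the overall sentiment of the following evaluation: {{ evaluation }}"
prompt = render(question_text, {"evaluation": "The exams were fair but challenging."})
print(prompt)
```

Each evaluation produces its own version of the question text in this way, so one question definition can be reused across the whole set.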

## Construct a survey
Next we combine our questions into a survey. This allows us to administer the questions asynchronously (by default), or according to any desired [survey logic or rules](https://docs.expectedparrot.com/en/latest/surveys.html), such as skip/stop rules or giving the agent "memories" of other questions in the survey. Here we create a simple asynchronous survey by passing the list of questions to a survey object:

In [4]:
from edsl import Survey

survey = Survey([q_themes, q_sentiment, q_improvement])

## Selecting data for review
Next we identify the data to be analyzed. Here we use some mock evaluations for an Econ 101 course stored as a list of texts:

In [5]:
evaluations = [
    "The course was well-organized with clear objectives for each lecture. The professor was knowledgeable, but I wish there had been more real-world application of concepts. The exams were fair but challenging. Overall, a solid introduction to economics.",
    "Econ 101 was tougher than I anticipated. The lectures moved too quickly through complex theories, making it hard to follow without additional self-study. I appreciated the comprehensive reading materials provided, though they could be quite dense at times.",
    "I thoroughly enjoyed this course! The professor used a lot of current events to illustrate economic principles, which made the class engaging and relevant. Group projects were a highlight, fostering a practical understanding of the material.",
    "The course content was interesting, but the teaching style wasn’t for me. Lectures were mostly theoretical with few interactive elements, which made it hard to stay engaged. A more hands-on approach would have been appreciated.",
    "Excellent course with an enthusiastic instructor who made complex topics accessible and enjoyable. The PowerPoint slides were always clear and helpful for reviewing. Tests were fair, and I always felt well-prepared thanks to the thorough lectures.",
    "I found the professor to be disorganized, often straying from the topic. Office hours, however, were incredibly helpful and allowed for one-on-one engagement with the material. More consistency in lecture themes would improve the course.",
    "This was a great introductory course to economics with a lot of emphasis on mathematical models. For someone without a strong math background, this was a bit intimidating. More preliminary resources or a review session on math skills would be helpful.",
    "The professor was passionate about economics, which made learning exciting. I loved the real-life examples used to explain economic theories. However, the grading seemed tough, and feedback on assignments was sometimes vague.",
    "The class was well-structured with clear expectations set from the start. However, the professor's lectures were somewhat monotone, which made it difficult to maintain focus. More engaging presentations or guest speakers could liven up the content.",
    "As a visual learner, I appreciated the use of charts and graphs to explain economic concepts. The homework was directly related to lecture material, which reinforced learning. I would have liked more group discussions to hear different perspectives.",
]

## Add the data to the questions
Next we add the data as "scenarios" of the questions in order to run each question for each evaluation:

In [6]:
from edsl import Scenario

scenarios = [Scenario({"evaluation":e}) for e in evaluations]

## Design AI agents to answer the questions
Next we can design agents with relevant traits and personas to answer the questions. Here we create a persona for the professor for the course. (We could also try some third parties with coaching or other expertise to compare their responses!)

In [7]:
from edsl import Agent

persona = "You are a professor reviewing student evaluations for your recent Econ 101 course."

agent = Agent(traits = {"persona": persona})

## Run the survey
Next we add the scenarios and agent to the survey, and then run it. This will generate a dataset of responses that we can store and begin analyzing:

In [8]:
results = survey.by(scenarios).by(agent).run()
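As a rough way to reason about the size of this job, the sketch below enumerates the combinations involved. It assumes one result row per scenario and agent pair, with one answer per question in each row, which is consistent with the ten sentiment answers tallied later in this notebook. The `agent_labels` list is a hypothetical stand-in for the single agent defined above.

```python
from itertools import product

question_names = ["themes", "sentiment", "improvement"]
num_evaluations = 10            # length of the `evaluations` list
agent_labels = ["professor"]    # hypothetical label for the one agent

# One row per (evaluation, agent) combination
rows = list(product(range(num_evaluations), agent_labels))
total_answers = len(rows) * len(question_names)

print(len(rows), "result rows,", total_answers, "answers in total")
```

Adding a second agent (for example, a third-party coach persona) would double both numbers under the same assumption.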

## Inspecting the responses
EDSL comes with built-in methods for analyzing results in data tables, dataframes, SQL queries and other formats. We can print a list of all the components that can be accessed:

In [9]:
results.columns

['agent.agent_name',
 'agent.persona',
 'answer.improvement',
 'answer.sentiment',
 'answer.themes',
 'comment.improvement_comment',
 'comment.sentiment_comment',
 'comment.themes_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.improvement_system_prompt',
 'prompt.improvement_user_prompt',
 'prompt.sentiment_system_prompt',
 'prompt.sentiment_user_prompt',
 'prompt.themes_system_prompt',
 'prompt.themes_user_prompt',
 'question_options.improvement_question_options',
 'question_options.sentiment_question_options',
 'question_options.themes_question_options',
 'question_text.improvement_question_text',
 'question_text.sentiment_question_text',
 'question_text.themes_question_text',
 'question_type.improvement_question_type',
 'question_type.sentiment_question_type',
 'question_type.themes_question_type',
 'raw_model_respons

Here we select just the responses to the questions and display them in a table:

In [10]:
results.select("themes", "sentiment", "improvement").print(format="rich")

We can do a quick tally of the sentiments:

In [11]:
df = results.to_pandas()['answer.sentiment'].value_counts()
df

answer.sentiment
Positive    5
Neutral     3
Negative    2
Name: count, dtype: int64
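With only ten evaluations the raw counts are easy to read, but shares are often more convenient when comparing runs of different sizes. A minimal sketch, using the counts printed above as a plain dictionary:

```python
# Counts copied from the tally above
sentiment_counts = {"Positive": 5, "Neutral": 3, "Negative": 2}

total = sum(sentiment_counts.values())
shares = {label: count / total for label, count in sentiment_counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

Working directly with the pandas Series, `value_counts(normalize=True)` returns the same proportions.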

## Using responses to construct new questions
We can use the responses to our initial questions to construct more questions about the texts. For example, we can prompt the agent to condense the individual lists of themes and areas for improvement into short lists, and then use them to quantify those topics across the set of evaluations.

Here we take the lists of themes in each evaluation and flatten them into a list that we will prompt an agent to condense for us:

In [12]:
themes = results.select("themes").to_list(flatten=True)
themes

['Well-organized course',
 'Desire for more real-world applications',
 'Fair but challenging exams',
 'Course difficulty',
 'Fast-paced lectures',
 'Comprehensive but dense readings',
 'enjoyed course',
 'current events examples',
 'group projects beneficial',
 'Interesting content',
 'Teaching style mismatch',
 'Desire for interactivity',
 'Enthusiastic teaching',
 'Effective PowerPoint slides',
 'Fair and well-prepared tests',
 'disorganized lectures',
 'helpful office hours',
 'needs consistent themes',
 'great introductory course',
 'emphasis on mathematical models',
 'needs more math support',
 'Passionate teaching',
 'Real-life examples',
 'Tough grading and vague feedback',
 'well-structured class',
 'monotone lectures',
 'needs more engaging presentations',
 'visual aids effective',
 'homework aligned with lectures',
 'desire for more group discussions']

Next we construct a question prompting the agent to condense the list into a new list:

In [13]:
q_condensed_themes = QuestionList(
    question_name = "condensed_themes",
    question_text = """Combine the following list of themes extracted from the evaluations 
    into a consolidated, non-redundant list: """ + ", ".join(themes),
    max_list_items = 10
)
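To see what the model actually receives, here is a small sketch of the same string construction using just the first three themes from the list above. The `themes_sample` name is introduced only for this illustration.

```python
# First three themes from the flattened list above
themes_sample = [
    "Well-organized course",
    "Desire for more real-world applications",
    "Fair but challenging exams",
]

# Same pattern as the question above: instruction text followed by the joined list
question_text = (
    "Combine the following list of themes extracted from the evaluations "
    "into a consolidated, non-redundant list: " + ", ".join(themes_sample)
)
print(question_text)
```

Because the themes are joined before the question is created, this question has no `{{ placeholder }}` and does not need scenarios.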

Now we run the question and select the new list:

In [14]:
condensed_themes = q_condensed_themes.run().select("condensed_themes").to_list()[0]
condensed_themes

['Well-organized and structured course',
 'Desire for practical applications and real-life examples',
 'Challenging and fair assessments',
 'Teaching style and interactivity',
 'Pace and clarity of lectures',
 'Quality of course materials',
 'Support for complex concepts',
 'Engagement and discussion',
 'Teaching enthusiasm and effectiveness',
 'Feedback and grading transparency']

Now we can create a question prompting the agent to identify all the themes in the list that appear in each evaluation (our new list becomes the list of answer options):

In [15]:
from edsl.questions import QuestionCheckBox

q_themes_list = QuestionCheckBox(
    question_name = "themes_list",
    question_text = "Select all of the themes that are mentioned in this evaluation: {{ evaluation }}",
    question_options = condensed_themes
)

Here we run the question and show a table listing all the themes for each evaluation:

In [16]:
themes_lists = q_themes_list.by(scenarios).run()
themes_lists.select("evaluation", "themes_list").print(format="rich")

Now we can count the number of evaluations that mention each of the themes:

In [17]:
import pandas as pd
from collections import Counter

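# Each element is the list of themes selected for one evaluation
themes_lists = themes_lists.select("themes_list").to_list()

# Pair each theme with its evaluation index; set() drops duplicate pairs so an
# evaluation counts at most once per theme
flat_list = [(theme, idx) for idx, themes in enumerate(themes_lists) for theme in themes]
count = Counter(theme for theme, idx in set(flat_list))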

df = pd.DataFrame(list(count.items()), columns=['Theme', 'Evaluations'])
print(df.sort_values(by='Evaluations', ascending=False))

                                               Theme  Evaluations
7                   Teaching style and interactivity            6
0                       Pace and clarity of lectures            5
3                          Engagement and discussion            5
2                       Support for complex concepts            4
4  Desire for practical applications and real-lif...            4
1                        Quality of course materials            3
5              Teaching enthusiasm and effectiveness            3
6               Well-organized and structured course            3
8                   Challenging and fair assessments            2
9                  Feedback and grading transparency            1
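The counting step is easier to follow on a toy input. The sketch below wraps the same idea in a hypothetical helper, `count_mentions`, and runs it on made-up selections (not model output) in which one evaluation repeats a theme and another selects nothing.

```python
from collections import Counter

def count_mentions(selections):
    """Count how many evaluations mention each option at least once."""
    counts = Counter()
    for selected in selections:
        # set() ignores repeats within a single evaluation
        counts.update(set(selected))
    return counts

# Made-up selections for three evaluations
toy_selections = [
    ["Pace and clarity of lectures", "Quality of course materials"],
    ["Pace and clarity of lectures", "Pace and clarity of lectures"],
    [],
]

counts = count_mentions(toy_selections)
print(counts.most_common())
```

The repeated theme in the second evaluation is counted once, so "Pace and clarity of lectures" ends up with 2 and "Quality of course materials" with 1. A helper like this could be applied to any list of checkbox answers, not only themes.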


We can do the same thing with the areas of improvement:

In [18]:
improvements = results.select("improvement").to_list(flatten=True)
improvements

['Incorporate case studies',
 'Real-world examples',
 'Guest speakers from industry',
 'Interactive projects',
 'Current events discussions',
 'Pace lectures appropriately',
 'Simplify complex theories',
 'Provide summaries for readings',
 'Incorporate more in-class examples',
 'Offer supplemental instruction sessions',
 'Incorporate more current events',
 'Increase group project frequency',
 'Ensure relevance to real-world applications',
 'Incorporate interactive elements',
 'Include hands-on activities',
 'Engage students with practical examples',
 'Use multimedia resources',
 'Encourage class participation',
 'Implement group work',
 'Connect theory to real-world scenarios',
 'Incorporate more interactive elements',
 'Include real-world applications',
 'Provide additional resources for complex topics',
 'Offer more office hours',
 'Enhance student engagement',
 'Introduce guest speakers',
 'Update course materials regularly',
 'Solicit ongoing feedback',
 'improve organization',
 'm

In [19]:
q_condensed_improvements = QuestionList(
    question_name = "condensed_improvements",
    question_text = """Combine the following list of areas for improvement from the evaluations 
    into a consolidated, non-redundant list: """ + ", ".join(improvements),
    max_list_items = 10
)

In [20]:
condensed_improvements = q_condensed_improvements.run().select("condensed_improvements").to_list()[0]
condensed_improvements

['Incorporate real-world examples and case studies',
 'Enhance interactive learning with projects and activities',
 'Introduce guest speakers from industry',
 'Utilize multimedia resources and encourage class participation',
 'Align curriculum with current events and real-world applications',
 'Provide support for complex topics through summaries, resources, and office hours',
 'Foster collaborative learning with group projects and discussions',
 'Adapt teaching methods to cater to diverse learning styles',
 'Maintain clarity and organization in lectures and grading',
 'Solicit ongoing feedback and regularly update course materials']

In [21]:
q_improvements_list = QuestionCheckBox(
    question_name = "improvements_list",
    question_text = "Select all of the improvements that are mentioned in this evaluation: {{ evaluation }}",
    question_options = condensed_improvements
)

In [22]:
improvements_lists = q_improvements_list.by(scenarios).run()
improvements_lists.select("evaluation", "improvements_list").print(format="rich")

In [23]:
import pandas as pd
from collections import Counter

improvements_lists = improvements_lists.select("improvements_list").to_list()

flat_list = [(theme, idx) for idx, themes in enumerate(improvements_lists) for theme in themes]
count = Counter(theme for theme, idx in set(flat_list))

df = pd.DataFrame(list(count.items()), columns=['Improvement', 'Evaluations'])
print(df.sort_values(by='Evaluations', ascending=False))

                                         Improvement  Evaluations
4  Maintain clarity and organization in lectures ...            4
0  Foster collaborative learning with group proje...            3
1  Utilize multimedia resources and encourage cla...            2
2  Align curriculum with current events and real-...            2
3   Incorporate real-world examples and case studies            2
5  Enhance interactive learning with projects and...            2
6  Provide support for complex topics through sum...            2
7  Adapt teaching methods to cater to diverse lea...            1
8             Introduce guest speakers from industry            1


## Summarizing the review
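Here we create another question prompting the agent to summarize the analysis. Note that `df` was reassigned in the previous step, so at this point it holds the table of improvement counts, and that is what the summary covers: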

In [24]:
from edsl.questions import QuestionFreeText

q_summary = QuestionFreeText(
    question_name = "summary",
    question_text = """Consider the following analysis of the evaluations and draft a paragraph
    summarizing it: """ + df.to_string()
)

summary = q_summary.by(agent).run()
summary.select("summary").print(format="rich")
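Since `df` is reused for both tables, the prompt above contains only the most recent one (the improvement counts). If the summary should cover the themes as well, one option is to build the prompt text from both sets of counts. The sketch below is an assumption about how that could look, using the top rows from the two tables printed earlier; `format_table` is a hypothetical helper.

```python
# Top rows copied from the two tables printed earlier in this notebook
theme_counts = {
    "Teaching style and interactivity": 6,
    "Pace and clarity of lectures": 5,
    "Engagement and discussion": 5,
}
improvement_counts = {
    "Maintain clarity and organization in lectures and grading": 4,
    "Foster collaborative learning with group projects and discussions": 3,
}

def format_table(title, counts):
    """Render a titled list of 'name: n evaluations' lines."""
    lines = [title] + [f"- {name}: {n} evaluations" for name, n in counts.items()]
    return "\n".join(lines)

analysis_text = "\n\n".join([
    format_table("Themes", theme_counts),
    format_table("Suggested improvements", improvement_counts),
])

question_text = (
    "Consider the following analysis of the evaluations and draft a paragraph "
    "summarizing it:\n" + analysis_text
)
print(question_text)
```

The resulting `question_text` could be passed to `QuestionFreeText` in place of the string built from `df.to_string()`.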